[Modular] mellon doc etc #13051
Open: yiyixuxu wants to merge 16 commits into main from more-mellon-related
+767
−581
Commits (all by yiyixuxu):
- 2890dd8 add metadata field to input/output param
- d2bee6a refactor mellonparam: move the template outside, add metaclass, defin…
- c5c732b add from_custom_block
- ffc5708 style
- 5ad8390 up up fix
- 29c5741 add mellon guide
- 26f59f1 add to toctree
- a71d86b style
- 3393ef0 Merge branch 'main' into more-mellon-related
- 48160f6 add mellon_types
- 3fe2711 style
- 5c7273f Merge branch 'more-mellon-related' of github.com:huggingface/diffuser…
- d4f2a89 mellon_type -> inpnt_types + output_types
- 46a713a update doc
- 8c5b119 add quant info to components manager
- 3985c43 fix more
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,233 @@ | ||
| <!--Copyright 2025 The HuggingFace Team. All rights reserved. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | ||
| the License. You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | ||
| an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | ||
| specific language governing permissions and limitations under the License. | ||
| --> | ||
|
|
||
|
|
||
| ## Using Custom Blocks with Mellon | ||
|
|
||
| [Mellon](https://github.com/cubiq/Mellon) is a visual workflow interface (similar to ComfyUI) that integrates with Modular Diffusers. This guide shows how to add Mellon support to your custom blocks so they can be used in the Mellon UI. | ||
|
|
||
| ## Overview | ||
|
|
||
| To use a custom block in Mellon, you need a `mellon_pipeline_config.json` file that defines how your block's parameters map to Mellon UI components. Here's how to create one: | ||
|
|
||
| 1. **Add a "Mellon type" to your block's parameters** - Each `InputParam`/`OutputParam` needs a type that tells Mellon what UI component to render (e.g., `"textbox"`, `"dropdown"`, `"image"`). You can specify types via metadata in your block definitions, or pass them when generating the config. | ||
| 2. **Generate `mellon_pipeline_config.json`** - Use the `MellonPipelineConfig` utility to generate a default template and push it to your Hub repository. | ||
| 3. **(Optional) Manually adjust the template** - Fine-tune the generated config for your specific needs. | ||
|
|
||
| ## Step 1: Specify Mellon Types for Parameters | ||
|
|
||
| Mellon types determine how each parameter renders in the UI. If you don't specify a type for a parameter, it will default to `"custom"`, which renders as a simple connection dot. You can always adjust this later in the generated config. | ||
|
|
||
| ### Supported Mellon Types | ||
|
|
||
| | Type | Input/Output | Description | | ||
| |------|--------------|-------------| | ||
| | `image` | Both | Image (PIL Image) | | ||
| | `video` | Both | Video | | ||
| | `text` | Both | Text display | | ||
| | `textbox` | Input | Text input | | ||
| | `dropdown` | Input | Dropdown selection menu | | ||
| | `slider` | Input | Slider for numeric values | | ||
| | `number` | Input | Numeric input | | ||
| | `checkbox` | Input | Boolean toggle | | ||
|
|
||
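As a quick sanity check before generating a config, the table above can be encoded as a lookup. This is only an illustrative sketch: `SUPPORTED_MELLON_TYPES` and `is_supported` are hypothetical names, not part of Diffusers or Mellon, and `"custom"` is included only because the guide describes it as the fallback for untyped parameters.

```python
# Hypothetical lookup built from the "Supported Mellon Types" table.
# Not a Diffusers or Mellon API; names are for illustration only.
SUPPORTED_MELLON_TYPES = {
    "image": {"input", "output"},
    "video": {"input", "output"},
    "text": {"input", "output"},
    "textbox": {"input"},
    "dropdown": {"input"},
    "slider": {"input"},
    "number": {"input"},
    "checkbox": {"input"},
    # Fallback described in the guide for parameters without a type.
    "custom": {"input", "output"},
}


def is_supported(mellon_type: str, direction: str) -> bool:
    """Return True if `mellon_type` is listed for `direction` ("input" or "output")."""
    return direction in SUPPORTED_MELLON_TYPES.get(mellon_type, set())


if __name__ == "__main__":
    print(is_supported("textbox", "input"))   # listed as Input
    print(is_supported("textbox", "output"))  # not listed as Output
```

A check like this can catch a typo such as `"textarea"` (a `display` value in the generated config, not a Mellon type) before the config is generated.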
| ### Method 1: Using `metadata` in Block Definitions | ||
|
|
||
| If you're defining a custom block from scratch, you can add `metadata={"mellon": "<type>"}` directly to your `InputParam` and `OutputParam` definitions: | ||
| ```python | ||
| from typing import List | ||
|
|
||
||
| from diffusers.modular_pipelines import InputParam, ModularPipelineBlocks, OutputParam | ||
|
|
||
||
| class GeminiPromptExpander(ModularPipelineBlocks): | ||
|
|
||
||
|     @property | ||
|     def inputs(self) -> List[InputParam]: | ||
|         return [ | ||
|             InputParam( | ||
|                 "prompt", | ||
|                 type_hint=str, | ||
|                 required=True, | ||
|                 description="Prompt to use", | ||
|                 metadata={"mellon": "textbox"},  # Text input | ||
|             ) | ||
|         ] | ||
|
|
||
||
|     @property | ||
|     def intermediate_outputs(self) -> List[OutputParam]: | ||
|         return [ | ||
|             OutputParam( | ||
|                 "prompt", | ||
|                 type_hint=str, | ||
|                 description="Expanded prompt by the LLM", | ||
|                 metadata={"mellon": "text"},  # Text output | ||
|             ), | ||
|             OutputParam( | ||
|                 "old_prompt", | ||
|                 type_hint=str, | ||
|                 description="Old prompt provided by the user", | ||
|                 # No metadata - we don't want to render this in UI | ||
|             ), | ||
|         ] | ||
| ``` | ||
|
|
||
| ### Method 2: Using `input_types` and `output_types` When Generating Config | ||
|
|
||
| If you're working with an existing pipeline or prefer to keep your block definitions clean, you can specify types when generating the config using the `input_types` and `output_types` arguments: | ||
| ```python | ||
| from diffusers.modular_pipelines.mellon_node_utils import MellonPipelineConfig | ||
|
|
||
| # `blocks` is your loaded custom block (Step 2 shows how to load it) | ||
| mellon_config = MellonPipelineConfig.from_custom_block( | ||
|     blocks, | ||
|     input_types={"prompt": "textbox"}, | ||
|     output_types={"prompt": "text"}, | ||
| ) | ||
| ``` | ||
|
|
||
| > [!NOTE] | ||
| > If you specify both `metadata` and `input_types`/`output_types`, the arguments take precedence, allowing you to override metadata when needed. | ||
|
|
||
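The precedence rule in the note can be pictured with a small stand-alone function. This is a sketch of the described behavior, not the actual logic inside `MellonPipelineConfig.from_custom_block`; `resolve_mellon_type` and its parameters are hypothetical names.

```python
from typing import Optional

# Per the guide, parameters without a Mellon type fall back to "custom".
DEFAULT_MELLON_TYPE = "custom"


def resolve_mellon_type(
    name: str,
    metadata: Optional[dict] = None,
    overrides: Optional[dict] = None,
) -> str:
    """Hypothetical helper: pick the Mellon type for one parameter.

    `metadata` stands in for the param's `metadata` dict, and `overrides`
    for the `input_types` or `output_types` argument.
    """
    # 1. An explicit argument wins over metadata.
    if overrides and name in overrides:
        return overrides[name]
    # 2. Otherwise use the type declared in the block definition.
    if metadata and "mellon" in metadata:
        return metadata["mellon"]
    # 3. Otherwise fall back to the default.
    return DEFAULT_MELLON_TYPE
```

For example, a `prompt` input declared with `metadata={"mellon": "textbox"}` would resolve to `"dropdown"` if `input_types={"prompt": "dropdown"}` were also passed, and `old_prompt`, which has neither, would resolve to `"custom"`.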
| ## Step 2: Generate and Push the Mellon Config | ||
|
|
||
| After adding metadata to your block, generate the default Mellon configuration template and push it to the Hub: | ||
|
|
||
| ```python | ||
| from diffusers import ModularPipelineBlocks | ||
| from diffusers.modular_pipelines.mellon_node_utils import MellonPipelineConfig | ||
|
|
||
| # Load your custom blocks from a local directory | ||
| blocks = ModularPipelineBlocks.from_pretrained("/path/local/folder", trust_remote_code=True) | ||
|
|
||
| # Generate the default config template | ||
| mellon_config = MellonPipelineConfig.from_custom_block(blocks) | ||
|
||
| # Save the config to the same local folder first, then push it to the Hub. | ||
| # `repo_id` is your Hub repository ID (e.g., "YiYiXu/gemini-prompt-expander"). | ||
| mellon_config.save( | ||
|     local_dir="/path/local/folder", | ||
|     repo_id=repo_id, | ||
|     push_to_hub=True, | ||
| ) | ||
| ``` | ||
|
|
||
| This creates a `mellon_pipeline_config.json` file in your repository. | ||
|
|
||
| ## Step 3: Review and Adjust the Config (Optional) | ||
|
|
||
| The generated template is a starting point - you may want to adjust it for your needs. Let's walk through the generated config for the Gemini Prompt Expander: | ||
|
|
||
| ```json | ||
| { | ||
|   "label": "Gemini Prompt Expander", | ||
|   "default_repo": "", | ||
|   "default_dtype": "", | ||
|   "node_params": { | ||
|     "custom": { | ||
|       "params": { | ||
|         "prompt": { | ||
|           "label": "Prompt", | ||
|           "type": "string", | ||
|           "display": "textarea", | ||
|           "default": "" | ||
|         }, | ||
|         "out_prompt": { | ||
|           "label": "Prompt", | ||
|           "type": "string", | ||
|           "display": "output" | ||
|         }, | ||
|         "old_prompt": { | ||
|           "label": "Old Prompt", | ||
|           "type": "custom", | ||
|           "display": "output" | ||
|         }, | ||
|         "doc": { | ||
|           "label": "Doc", | ||
|           "type": "string", | ||
|           "display": "output" | ||
|         } | ||
|       }, | ||
|       "input_names": ["prompt"], | ||
|       "model_input_names": [], | ||
|       "output_names": ["out_prompt", "old_prompt", "doc"], | ||
|       "block_name": "custom", | ||
|       "node_type": "custom" | ||
|     } | ||
|   } | ||
| } | ||
| ``` | ||
|
|
||
| ### Understanding the Structure | ||
|
|
||
| The `params` dict defines how each UI element renders. The `input_names`, `model_input_names`, and `output_names` lists map these UI elements to the underlying [`ModularPipelineBlocks`]'s I/O interface: | ||
|
|
||
| | Mellon Config | ModularPipelineBlocks | | ||
| |---------------|----------------------| | ||
| | `input_names` | `inputs` property | | ||
| | `model_input_names` | `expected_components` property | | ||
| | `output_names` | `intermediate_outputs` property | | ||
|
|
||
| In this example: `prompt` is the only input, there are no model components, and outputs include `out_prompt`, `old_prompt`, and `doc`. | ||
|
|
||
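When editing the config by hand, it is easy to leave a name in one of these lists without a matching entry in `params`. The sketch below checks for that. It assumes, based only on the example config in this guide, that every listed name should have a `params` entry; `find_unmapped_names` is a hypothetical helper, not a Diffusers or Mellon function.

```python
from typing import List, Tuple

NAME_LISTS = ("input_names", "model_input_names", "output_names")


def find_unmapped_names(node: dict) -> List[Tuple[str, str]]:
    """Return (list_name, param_name) pairs that have no entry in node["params"]."""
    params = node.get("params", {})
    missing = []
    for list_name in NAME_LISTS:
        for param_name in node.get(list_name, []):
            if param_name not in params:
                missing.append((list_name, param_name))
    return missing


# A trimmed version of the node from the example config.
example_node = {
    "params": {
        "prompt": {"label": "Prompt", "type": "string", "display": "textarea", "default": ""},
        "out_prompt": {"label": "Prompt", "type": "string", "display": "output"},
        "doc": {"label": "Doc", "type": "string", "display": "output"},
    },
    "input_names": ["prompt"],
    "model_input_names": [],
    # "old_prompt" is listed here but was deleted from "params".
    "output_names": ["out_prompt", "old_prompt", "doc"],
}

if __name__ == "__main__":
    print(find_unmapped_names(example_node))
```

An empty result means every name in the three lists maps to a UI element definition.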
| Now let's look at the `params` dict: | ||
|
|
||
| **`prompt`** is an input parameter. It has `display: "textarea"` which renders as a text input box, `label: "Prompt"` shown in the UI, and `default: ""` so it starts empty. The `type: "string"` field is important in Mellon because it determines which nodes can connect together - only matching types can be linked with "noodles". | ||
|
|
||
| **`out_prompt`** is the expanded prompt output. The `out_` prefix was automatically added because the input and output share the same name (`prompt`), avoiding naming conflicts in the config. It has `display: "output"` which renders as an output socket. | ||
|
|
||
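The renaming described for `out_prompt` can be sketched as follows. This only illustrates the behavior stated above (an output that shares a name with an input gets an `out_` prefix); it is not the library's implementation, and `output_param_names` is a hypothetical name.

```python
from typing import List


def output_param_names(input_names: List[str], output_names: List[str]) -> List[str]:
    """Prefix an output with "out_" when an input already uses the same name."""
    taken = set(input_names)
    return [f"out_{name}" if name in taken else name for name in output_names]


if __name__ == "__main__":
    # Mirrors the Gemini Prompt Expander: "prompt" is both an input and an output.
    print(output_param_names(["prompt"], ["prompt", "old_prompt"]))
```

Outputs whose names do not collide with an input, such as `old_prompt`, keep their original names.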
| **`old_prompt`** has `type: "custom"` because we didn't specify metadata. This renders as a simple dot in the UI. Since we don't actually want to expose this in the UI, we can remove it. | ||
|
|
||
| **`doc`** is the documentation output, automatically added to all custom blocks. | ||
|
|
||
| ### Making Adjustments | ||
|
|
||
| For the Gemini Prompt Expander, we don't need `old_prompt` in the UI. Remove it from both `params` and `output_names`: | ||
|
|
||
| ```json | ||
| { | ||
|   "label": "Gemini Prompt Expander", | ||
|   "default_repo": "", | ||
|   "default_dtype": "", | ||
|   "node_params": { | ||
|     "custom": { | ||
|       "params": { | ||
|         "prompt": { | ||
|           "label": "Prompt", | ||
|           "type": "string", | ||
|           "display": "textarea", | ||
|           "default": "" | ||
|         }, | ||
|         "out_prompt": { | ||
|           "label": "Prompt", | ||
|           "type": "string", | ||
|           "display": "output" | ||
|         }, | ||
|         "doc": { | ||
|           "label": "Doc", | ||
|           "type": "string", | ||
|           "display": "output" | ||
|         } | ||
|       }, | ||
|       "input_names": ["prompt"], | ||
|       "model_input_names": [], | ||
|       "output_names": ["out_prompt", "doc"], | ||
|       "block_name": "custom", | ||
|       "node_type": "custom" | ||
|     } | ||
|   } | ||
| } | ||
| ``` | ||
|
|
||
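If you would rather script this adjustment than edit the JSON by hand, a small helper along these lines could work. It is a sketch that relies only on the config layout shown in this guide; `drop_param` is a hypothetical function and not part of `MellonPipelineConfig`.

```python
import copy


def drop_param(config: dict, name: str, block_name: str = "custom") -> dict:
    """Return a copy of `config` with `name` removed from `params` and all name lists."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    node = updated["node_params"][block_name]
    node["params"].pop(name, None)
    for list_name in ("input_names", "model_input_names", "output_names"):
        node[list_name] = [n for n in node.get(list_name, []) if n != name]
    return updated


# Minimal in-memory config with the same layout as the generated template.
generated = {
    "label": "Gemini Prompt Expander",
    "node_params": {
        "custom": {
            "params": {
                "prompt": {"type": "string", "display": "textarea"},
                "out_prompt": {"type": "string", "display": "output"},
                "old_prompt": {"type": "custom", "display": "output"},
                "doc": {"type": "string", "display": "output"},
            },
            "input_names": ["prompt"],
            "model_input_names": [],
            "output_names": ["out_prompt", "old_prompt", "doc"],
        }
    },
}

adjusted = drop_param(generated, "old_prompt")
```

To apply this to a real file, load `mellon_pipeline_config.json` with `json.load`, pass the resulting dict to `drop_param`, and write it back with `json.dump(..., indent=2)`.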
| See the final config at [YiYiXu/gemini-prompt-expander](https://huggingface.co/YiYiXu/gemini-prompt-expander). | ||
|
|
||
| ## Use in Mellon | ||
|
|
||
| 1. Start Mellon (see [Mellon installation guide](https://github.com/cubiq/Mellon)) | ||
|
|
||
| 2. In Mellon: | ||
| - Drag a **Dynamic Block Node** from the ModularDiffusers section | ||
| - Enter your `repo_id` (e.g., `YiYiXu/gemini-prompt-expander`) | ||
| - Click **Load Custom Block** | ||
| - The node will transform to show your block's inputs and outputs | ||
|
Member (review comment): Maybe show a screenshot of how it should render?
Collaborator, Author (reply): I will make a short video clip maybe :)
Not strongly opinionated but WDYT of including the important guides (I think this is one) as a list in the overview page?