
Native LongCat-Image implementation #12597

Merged
comfyanonymous merged 18 commits into Comfy-Org:master from Talmaj:LongCat-Image
Feb 28, 2026

Conversation

@Talmaj (Contributor) commented Feb 23, 2026

LongCat-Image ComfyUI Port

Adds native support for
LongCat-Image,
a Flux-based text-to-image model by Meituan, to ComfyUI.

Architecture

LongCat-Image is a Flux variant with:

  • Transformer: MM-DiT + Single-DiT (19 double blocks, 38 single blocks)
  • Text encoder: Qwen2.5-VL-7B with character-level encoding for quoted text
  • VAE: AutoencoderKL with 2x2 latent packing
  • 3D MRoPE: Multimodal Rotary Position Embeddings with shifts
    (t=1.0, y=512.0, x=512.0)
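
For intuition, here is a rough sketch of how 3D position ids with those per-axis shifts could be built. This is illustrative only: the helper name is hypothetical, and the real plumbing lives in the LongCatImage model class in comfy/model_base.py.

import torch

def make_img_ids(h: int, w: int, shifts=(1.0, 512.0, 512.0)) -> torch.Tensor:
    """Hypothetical helper: (h*w, 3) position ids for the (t, y, x) axes."""
    t_shift, y_shift, x_shift = shifts
    ids = torch.zeros(h, w, 3)
    ids[..., 0] = t_shift                                 # time axis, constant for t2i
    ids[..., 1] = torch.arange(h).unsqueeze(1) + y_shift  # row index plus y shift
    ids[..., 2] = torch.arange(w).unsqueeze(0) + x_shift  # column index plus x shift
    return ids.reshape(-1, 3)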

Key implementation details

Pre-converted weights

The original LongCat-Image weights use HuggingFace Diffusers key names, while
ComfyUI requires pre-converted weights in its native Flux format. Standalone
download_original.sh and convert_original_to_comfy.py scripts (hosted alongside the weights in the Comfy-Org HF repo)
perform the one-time conversion:

  • Key renaming (e.g. x_embedder → img_in, context_embedder → txt_in,
    transformer_blocks → double_blocks,
    single_transformer_blocks → single_blocks)
  • Q/K/V fusion into single QKV tensors
  • Scale/shift half-swap on norm_out.linear weights — HuggingFace's
    AdaLayerNormContinuous stores [scale | shift] while ComfyUI's LastLayer
    expects [shift | scale]

Pre-converting avoids runtime torch.cat allocations, enabling ComfyUI's
zero-copy-from-disk memory mapping where tensors are referenced directly from
the safetensors file without loading into RAM.
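
A minimal sketch of the three transforms, assuming simplified key names and a single block (the hosted script covers every key and block index):

import torch

RENAMES = {
    "x_embedder.": "img_in.",
    "context_embedder.": "txt_in.",
    "transformer_blocks.": "double_blocks.",
    "single_transformer_blocks.": "single_blocks.",
}

def convert_original_to_comfy(sd: dict) -> dict:
    """Hypothetical converter mirroring the steps listed above."""
    out = {}
    for k, v in sd.items():
        for old, new in RENAMES.items():
            if k.startswith(old):
                k = new + k[len(old):]
                break
        out[k] = v

    # Q/K/V fusion into a single QKV tensor, shown for one module only;
    # doing this offline is what removes the runtime torch.cat.
    qw = out.pop("double_blocks.0.attn.to_q.weight", None)
    if qw is not None:
        kw = out.pop("double_blocks.0.attn.to_k.weight")
        vw = out.pop("double_blocks.0.attn.to_v.weight")
        out["double_blocks.0.img_attn.qkv.weight"] = torch.cat([qw, kw, vw], dim=0)

    # Half-swap: HF's AdaLayerNormContinuous stores [scale | shift],
    # ComfyUI's LastLayer expects [shift | scale].
    w = out.pop("norm_out.linear.weight", None)
    if w is not None:
        scale, shift = w.chunk(2, dim=0)
        out["final_layer.adaLN_modulation.1.weight"] = torch.cat([shift, scale], dim=0)
    return out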

Model detection

Pre-converted weights go through the standard Flux detection path. LongCat-Image
is distinguished from other Flux variants by a heuristic at the end of Flux
detection: context_in_dim == 3584 (from txt_in.weight shape) and
vec_in_dim is None (no vector_in layer). This sets txt_ids_dims = [1, 2],
matching the LongCatImage config. Since model_config_from_unet_config uses
first-match, LongCatImage is placed before FluxSchnell in the models list
(same pattern as FluxInpaint before Flux). LongCatImage's 5-key config only
matches when all of context_in_dim == 3584, vec_in_dim is None, and
txt_ids_dims == [1, 2] are present, so it never falsely matches other Flux
variants.
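
A hedged sketch of that heuristic (the real check sits at the end of Flux detection in comfy/model_detection.py; field names follow the description above):

def looks_like_longcat(unet_config: dict) -> bool:
    return (
        unet_config.get("context_in_dim") == 3584   # from txt_in.weight.shape[1]
        and unet_config.get("vec_in_dim") is None   # no vector_in layer
    )

# When this matches, the detector sets unet_config["txt_ids_dims"] = [1, 2],
# which only the LongCatImage supported-model config requires.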

Tokenizer

LongCatImageBaseTokenizer applies the Qwen2.5 chat template, handles
character-level tokenization for quoted text via split_quotation, and pads to
a fixed max_length=512 to match the expected input format.
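
As a stand-in for split_quotation, a minimal sketch of the idea: split the prompt into (segment, is_quoted) pieces so quoted spans can be tokenized character by character while everything else uses normal BPE tokens. The regex and return shape here are assumptions, not the actual implementation:

import re

def split_quotation(text: str):
    parts = []
    for piece in re.split(r'("[^"]*")', text):
        if piece:
            quoted = piece.startswith('"') and piece.endswith('"')
            parts.append((piece, quoted))
    return parts

# split_quotation('A sign reading "OPEN" at night')
# -> [('A sign reading ', False), ('"OPEN"', True), (' at night', False)]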

CFG renormalization

HuggingFace's LongCat-Image pipeline applies CFG renormalization after each
denoising step. Their implementation operates in noise space (input - denoised)
and computes L2 norms per 2x2 patch in Flux's packed representation, clamping
the post-CFG noise norm to not exceed the conditional prediction's norm.

The existing CFGNorm node performs the same core operation — clamping
norm(cond) / norm(pred) to [0, 1] — but differs in two ways: it operates
per-pixel across channels (dim=1) rather than per-patch, and works in prediction
space rather than noise space. In practice these differences are negligible, so
the template uses CFGNorm with strength=1.0 instead of a custom node.
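
A minimal sketch of the clamp itself, assuming Flux-style packed latents of shape (B, L, C*4) so that dim=-1 spans one 2x2 patch; this mirrors the description above, not the HF pipeline verbatim:

import torch

def renorm_cfg(noise_cfg: torch.Tensor, noise_cond: torch.Tensor) -> torch.Tensor:
    cond_norm = torch.linalg.vector_norm(noise_cond, dim=-1, keepdim=True)
    cfg_norm = torch.linalg.vector_norm(noise_cfg, dim=-1, keepdim=True).clamp(min=1e-12)
    scale = (cond_norm / cfg_norm).clamp(max=1.0)  # never exceed the cond norm
    return noise_cfg * scale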

No guidance embedding

Unlike standard Flux, LongCat-Image does not use a guidance conditioning tensor.
LongCatImage.extra_conds removes the guidance key.
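
A sketch of what that override can look like, assuming the Flux base class from comfy/model_base.py (the actual class definition may differ):

from comfy.model_base import Flux

class LongCatImage(Flux):
    def extra_conds(self, **kwargs):
        out = super().extra_conds(**kwargs)
        out.pop("guidance", None)  # LongCat-Image has no guidance embedding
        return out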

Known differences from HuggingFace

  • Pad token embeddings: HuggingFace runs the text encoder in bfloat16,
    which rounds pad token embeddings to identical vectors. ComfyUI runs in
    float32, preserving small differences from causal attention and RoPE — each
    pad position gets a slightly different vector. This does not affect output
    quality since the attention mask zeros out pad tokens during the diffusion
    transformer.
  • Sigma schedule: HuggingFace uses FlowMatchEulerDiscreteScheduler with
    dynamic shifting (use_dynamic_shifting=True), computing a mu parameter
    via linear interpolation based on image sequence length. ComfyUI's
    ModelSamplingFlux uses a static shift=1.15 with flux_time_shift,
    producing a slightly different sigma schedule for the same number of steps.
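
    For intuition, a static shift warps each sigma roughly as below; this is a
    sketch of the general flow-matching time shift, not ComfyUI's exact
    flux_time_shift implementation:

    def static_shift(sigma: float, shift: float = 1.15) -> float:
        # Larger shift keeps the schedule at higher noise levels for longer.
        return shift * sigma / (1.0 + (shift - 1.0) * sigma)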

Files

  • comfy/supported_models.py: LongCatImage config and detection matching
  • comfy/model_base.py: LongCatImage model class with MRoPE shifts
  • comfy/model_detection.py: Flux detection path with LongCat-Image heuristic
  • comfy/text_encoders/longcat_image.py: Tokenizer and text encoder
  • user_templates/longcat_image_t2i.json: User template
  • blueprints/Text to Image (LongCat-Image).json: Blueprint
  • tests-unit/comfy_test/model_detection_test.py: Model detection unit tests

@coderabbitai Bot commented Feb 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds LongCat-Image support across the codebase: a new Text-to-Image blueprint JSON; new LongCatImage model class and Flux adjustments in comfy/model_base.py; UNet detection update in comfy/model_detection.py; CLIPType enum and LONGCAT_IMAGE text-encoder loading in comfy/sd.py; new supported model entry in comfy/supported_models.py; a LongCatImage tokenizer/TE implementation in comfy/text_encoders/longcat_image.py; two Comfy nodes and an extension in comfy_extras/nodes_longcat_image.py; CLIPLoader option and extras registration; and unit tests for model detection and conversion.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 14.29%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: The title 'Native LongCat-Image implementation' directly and concisely describes the main change, adding native ComfyUI support for the LongCat-Image model.
  • Description check ✅ Passed: The PR description comprehensively explains the LongCat-Image model architecture, implementation details, file changes, and known differences from the original HuggingFace implementation, directly corresponding to the changeset.


@coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
tests-unit/comfy_test/model_detection_test.py (1)

73-73: Unused variable original_models.

original_models is assigned but never referenced. Likely leftover from a manual save/restore approach that was replaced by patch.object.

🧹 Remove unused variable
         sd = _make_longcat_diffusers_sd()
         unet_config = detect_unet_config(sd, "")
-        original_models = comfy.supported_models.models
 
         longcat_cls = comfy.supported_models.LongCatImage
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests-unit/comfy_test/model_detection_test.py` at line 73, Remove the unused
local variable original_models assigned from comfy.supported_models.models in
the test; since patch.object is handling temporary replacement/restore, delete
the assignment to original_models to eliminate the dead code and keep the test
clean (look for the assignment to original_models and the reference to
comfy.supported_models.models in model_detection_test.py).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy_extras/nodes_longcat_image.py`:
- Around line 61-82: The code assumes H and W are divisible by ps=2 before
reshaping (see noise.reshape(...), cond_packed reshape and renormed), which will
raise on odd spatial sizes; add a defensive check right after B, C, H, W =
denoised.shape to verify H % ps == 0 and W % ps == 0 and raise a clear
ValueError including the offending H/W and ps (or alternatively apply symmetric
padding to x/denoised/cond_denoised to make them divisible by ps before the
pack/unpack operations), then proceed with the existing noise/cond packing,
scaling and renorming logic unchanged.

In `@comfy/text_encoders/longcat_image.py`:
- Around line 70-95: In tokenize_with_weights: avoid letting
base_tok.tokenize_with_weights produce 512-length padding before you add the
LongCat template; call base_tok.tokenize_with_weights with padding disabled
(e.g., disable_padding=True or equivalent kwarg) so prompt_pairs is produced
without pre-padding, then build prefix_pairs, prompt_pairs and suffix_pairs into
combined, and only after combining perform truncation/padding to model length
(use your tokenizer's pad/truncate utility or call the shared super method once
on the final combined token list) so prefix_ids/suffix_ids are not separated by
mid-prompt padding; refer to tokenize_with_weights,
base_tok.tokenize_with_weights, prefix_ids, suffix_ids, prompt_pairs and
combined to locate where to change.
- Around line 102-144: The slice logic in encode_token_weights can use
template_end == -1 (no <|im_start|> found) which makes out = out[:, -1:] (last
token); change encode_token_weights to treat a missing marker by setting
template_end = 0 before slicing (or otherwise avoid negative slice) and only
apply the "+3 newline adjustment" when a real marker was detected (i.e., only
run the tok_pairs[template_end + 1]/[+2] checks if template_end was set from the
loop). Update references in encode_token_weights (template_end, tok_pairs, out,
extra, suffix_start) so the slicing/out = out[:, template_end:] and subsequent
attention_mask adjustments are guarded by the marker presence to avoid
accidentally keeping only the last token.

---

Nitpick comments:
In `@tests-unit/comfy_test/model_detection_test.py`:
- Line 73: Remove the unused local variable original_models assigned from
comfy.supported_models.models in the test; since patch.object is handling
temporary replacement/restore, delete the assignment to original_models to
eliminate the dead code and keep the test clean (look for the assignment to
original_models and the reference to comfy.supported_models.models in
model_detection_test.py).

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dba2766 and bfd302f.

📒 Files selected for processing (9)
  • blueprints/Text to Image (LongCat-Image).json
  • comfy/model_base.py
  • comfy/model_detection.py
  • comfy/sd.py
  • comfy/supported_models.py
  • comfy/text_encoders/longcat_image.py
  • comfy_extras/nodes_longcat_image.py
  • nodes.py
  • tests-unit/comfy_test/model_detection_test.py

Comment thread comfy_extras/nodes_longcat_image.py Outdated
Comment thread comfy/text_encoders/longcat_image.py
Comment thread comfy/text_encoders/longcat_image.py
@coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
tests-unit/comfy_test/model_detection_test.py (2)

84-89: Consider isinstance over type(...).__name__ for class identity assertions.

String-based class-name checks will silently pass if the class is renamed or imported under an alias.

♻️ Proposed fix
-                assert type(result).__name__ == "LongCatImage", (
-                    f"Expected LongCatImage with order {label}, got {type(result).__name__}"
-                )
+                assert isinstance(result, comfy.supported_models.LongCatImage), (
+                    f"Expected LongCatImage with order {label}, got {type(result)}"
+                )

And at line 101:

-        assert type(model_config).__name__ == "LongCatImage"
+        assert isinstance(model_config, comfy.supported_models.LongCatImage)

Also applies to: 99-101

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests-unit/comfy_test/model_detection_test.py` around lines 84 - 89, Replace
fragile string-based class checks with real type checks: instead of asserting
type(result).__name__ == "LongCatImage" use an isinstance assertion against the
actual class (e.g., assert isinstance(result, LongCatImage)). Update both
occurrences (the assertion around model_config_from_unet_config and the similar
check at lines ~99-101) and ensure LongCatImage is imported or referenced from
the correct module so the isinstance call resolves.

103-113: Test only verifies key presence/absence, skipping the two non-trivial transforms.

The PR description calls out Q/K/V fusion and a scale/shift half-swap in process_unet_state_dict as the critical parts of the conversion. Neither is exercised here — verifying that, say, transformer_blocks.0.attn.to_q.weight + to_k.weight + to_v.weight are fused into a single double_blocks.0.img_attn.qkv.weight with the right shape, and that the norm_out scale/shift halves are swapped, would meaningfully increase confidence in the conversion correctness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests-unit/comfy_test/model_detection_test.py` around lines 103 - 113, The
test test_longcat_process_unet_state_dict_converts_keys only checks
presence/absence of keys but does not validate the two non-trivial transforms in
process_unet_state_dict: Q/K/V fusion and the norm_out scale/shift half-swap.
Update the test to build source weights for
transformer_blocks.0.attn.to_q/to_k/to_v and the norm_out affine, run converted
= model_config.process_unet_state_dict(...), then assert the fused tensor exists
at double_blocks.0.img_attn.qkv.weight with the expected concatenated shape and
contents (verify slices match original to_q/k/v), and assert norm_out parameters
have their scale/shift halves swapped compared to the input; reference
test_longcat_process_unet_state_dict_converts_keys, process_unet_state_dict,
transformer_blocks.*, attn.to_q/to_k/to_v, double_blocks.0.img_attn.qkv.weight,
and norm_out in your assertions.
comfy_extras/nodes_longcat_image.py (1)

82-83: Replace deprecated torch.norm with torch.linalg.vector_norm.

torch.norm is deprecated and may be removed in a future PyTorch release; its documentation and behavior may be incorrect, and it is no longer actively maintained. The recommended replacement for vector norms is torch.linalg.vector_norm().

♻️ Proposed fix
-            noise_norm = torch.norm(noise_packed, dim=-1, keepdim=True)
-            cond_norm = torch.norm(cond_packed, dim=-1, keepdim=True)
+            noise_norm = torch.linalg.vector_norm(noise_packed, dim=-1, keepdim=True)
+            cond_norm = torch.linalg.vector_norm(cond_packed, dim=-1, keepdim=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy_extras/nodes_longcat_image.py` around lines 82 - 83, Replace deprecated
torch.norm calls computing per-vector norms for noise_packed and cond_packed
with torch.linalg.vector_norm; specifically update the expressions that assign
noise_norm and cond_norm (currently using torch.norm(noise_packed, dim=-1,
keepdim=True) and torch.norm(cond_packed, dim=-1, keepdim=True)) to use
torch.linalg.vector_norm(noise_packed, dim=-1, keepdim=True) and
torch.linalg.vector_norm(cond_packed, dim=-1, keepdim=True) respectively so
behavior and signature remain the same but use the supported API.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy/model_detection.py`:
- Around line 285-313: The current three-key detection in model_detection.py
(keys '{}x_embedder.weight', '{}transformer_blocks.0.attn.to_q.weight',
'{}single_transformer_blocks.0.attn.to_q.weight' using key_prefix) misidentifies
vanilla Flux diffusers as LongCat-Image; tighten the condition by requiring a
LongCat-specific key (use the existing ctx_key =
'{}context_embedder.weight'.format(key_prefix)) to be present as an additional
positive constraint before building dit_config, so only state dicts that include
context_embedder.weight are considered LongCat-Image (leave count_blocks usage
and subsequent field population unchanged).

---

Nitpick comments:
In `@comfy_extras/nodes_longcat_image.py`:
- Around line 82-83: Replace deprecated torch.norm calls computing per-vector
norms for noise_packed and cond_packed with torch.linalg.vector_norm;
specifically update the expressions that assign noise_norm and cond_norm
(currently using torch.norm(noise_packed, dim=-1, keepdim=True) and
torch.norm(cond_packed, dim=-1, keepdim=True)) to use
torch.linalg.vector_norm(noise_packed, dim=-1, keepdim=True) and
torch.linalg.vector_norm(cond_packed, dim=-1, keepdim=True) respectively so
behavior and signature remain the same but use the supported API.

In `@tests-unit/comfy_test/model_detection_test.py`:
- Around line 84-89: Replace fragile string-based class checks with real type
checks: instead of asserting type(result).__name__ == "LongCatImage" use an
isinstance assertion against the actual class (e.g., assert isinstance(result,
LongCatImage)). Update both occurrences (the assertion around
model_config_from_unet_config and the similar check at lines ~99-101) and ensure
LongCatImage is imported or referenced from the correct module so the isinstance
call resolves.
- Around line 103-113: The test
test_longcat_process_unet_state_dict_converts_keys only checks presence/absence
of keys but does not validate the two non-trivial transforms in
process_unet_state_dict: Q/K/V fusion and the norm_out scale/shift half-swap.
Update the test to build source weights for
transformer_blocks.0.attn.to_q/to_k/to_v and the norm_out affine, run converted
= model_config.process_unet_state_dict(...), then assert the fused tensor exists
at double_blocks.0.img_attn.qkv.weight with the expected concatenated shape and
contents (verify slices match original to_q/k/v), and assert norm_out parameters
have their scale/shift halves swapped compared to the input; reference
test_longcat_process_unet_state_dict_converts_keys, process_unet_state_dict,
transformer_blocks.*, attn.to_q/to_k/to_v, double_blocks.0.img_attn.qkv.weight,
and norm_out in your assertions.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bfd302f and 81a3792.

📒 Files selected for processing (9)
  • blueprints/Text to Image (LongCat-Image).json
  • comfy/model_base.py
  • comfy/model_detection.py
  • comfy/sd.py
  • comfy/supported_models.py
  • comfy/text_encoders/longcat_image.py
  • comfy_extras/nodes_longcat_image.py
  • nodes.py
  • tests-unit/comfy_test/model_detection_test.py
✅ Files skipped from review due to trivial changes (1)
  • blueprints/Text to Image (LongCat-Image).json
🚧 Files skipped from review as they are similar to previous changes (1)
  • nodes.py

Comment thread comfy/model_detection.py Outdated
@coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy/model_detection.py`:
- Around line 827-838: The selection currently adds
len(model_config.required_keys) into specificity even when no state_dict was
provided, biasing picks; modify model_config_from_unet_config so that
required_keys are only counted when a state_dict is present/used (i.e., only add
len(model_config.required_keys) to specificity if state_dict is not None and
thus those keys could be validated), preserving the previous first-match
ordering when state_dict is None; refer to model_config_from_unet_config,
best_specificity, model_config.required_keys and the matches(...) call to
implement this conditional weighting.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81a3792 and 449282d.

📒 Files selected for processing (2)
  • comfy/model_detection.py
  • comfy/supported_models.py

Comment thread comfy/model_detection.py Outdated
Comment thread comfy/supported_models.py Outdated
prefix = "double_blocks.{}.".format(idx)

if rest.startswith("norm1.linear."):
out_sd[prefix + "img_mod.lin." + rest.split(".")[-1]] = v
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something we can do is repackage/convert the model weights to a format that is easier to load in comfyui instead of having all this logic here it can be a conversion.py file in the hf repo where we host our converted weights.

Contributor Author

Is there some standard for repackaging that I can follow?

Member

The standard is to make something that doesn't need to be converted by comfyui internally to work.

Comment thread comfy/model_detection.py Outdated
@coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
blueprints/Text to Image (LongCat-Image).json (1)

1-1: Expose a negative prompt input in the subgraph.

The second CLIPTextEncodeLongCatImage node isn’t wired to any input, so users can’t set a negative prompt from the blueprint. Consider adding a subgraph input and link for negative text (or remove the node if it’s intentionally fixed).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@blueprints/Text to Image (LongCat-Image).json` at line 1, The subgraph lacks
a user-facing negative prompt because the second CLIPTextEncodeLongCatImage node
(node id 5, type CLIPTextEncodeLongCatImage) has its "text" input unlinked; add
a subgraph input (e.g., inp-neg-text) for the negative prompt and create a link
from that subgraph input to node 5's "text" input so users can supply a negative
prompt, or if the node is intentionally unused, remove node id 5 and its links
(including the link to KSampler) to avoid confusion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests-unit/comfy_test/model_detection_test.py`:
- Around line 8-36: The fixture _make_longcat_comfyui_sd allocates very large
tensors (H=3072) causing multi-GB memory usage; change H to a small value (e.g.,
32) while keeping C_CTX at 3584 so txt_in.weight.shape[1] still reflects the
real context dim, and keep other shape formulas (C_IN, C_CTX, and all uses like
"img_in.weight", "txt_in.weight", "time_in.*", "final_layer.*", and blocks in
"double_blocks.*" and "single_blocks.*") unchanged so the detection logic that
reads tensor shapes and key presence continues to work but without large memory
allocations.

---

Nitpick comments:
In `@blueprints/Text to Image (LongCat-Image).json`:
- Line 1: The subgraph lacks a user-facing negative prompt because the second
CLIPTextEncodeLongCatImage node (node id 5, type CLIPTextEncodeLongCatImage) has
its "text" input unlinked; add a subgraph input (e.g., inp-neg-text) for the
negative prompt and create a link from that subgraph input to node 5's "text"
input so users can supply a negative prompt, or if the node is intentionally
unused, remove node id 5 and its links (including the link to KSampler) to avoid
confusion.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 449282d and 310924a.

📒 Files selected for processing (5)
  • blueprints/Text to Image (LongCat-Image).json
  • comfy/model_detection.py
  • comfy/supported_models.py
  • comfy/text_encoders/longcat_image.py
  • tests-unit/comfy_test/model_detection_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • comfy/model_detection.py
  • comfy/text_encoders/longcat_image.py

Comment on lines +8 to +36
def _make_longcat_comfyui_sd():
    """Minimal ComfyUI-format state dict for pre-converted LongCat-Image weights."""
    sd = {}
    H = 3072
    C_IN = 16
    C_CTX = 3584

    sd["img_in.weight"] = torch.empty(H, C_IN * 4)
    sd["img_in.bias"] = torch.empty(H)
    sd["txt_in.weight"] = torch.empty(H, C_CTX)
    sd["txt_in.bias"] = torch.empty(H)

    sd["time_in.in_layer.weight"] = torch.empty(H, 256)
    sd["time_in.in_layer.bias"] = torch.empty(H)
    sd["time_in.out_layer.weight"] = torch.empty(H, H)
    sd["time_in.out_layer.bias"] = torch.empty(H)

    sd["final_layer.adaLN_modulation.1.weight"] = torch.empty(2 * H, H)
    sd["final_layer.adaLN_modulation.1.bias"] = torch.empty(2 * H)
    sd["final_layer.linear.weight"] = torch.empty(C_IN * 4, H)
    sd["final_layer.linear.bias"] = torch.empty(C_IN * 4)

    for i in range(19):
        sd[f"double_blocks.{i}.img_attn.norm.key_norm.weight"] = torch.empty(128)
        sd[f"double_blocks.{i}.img_attn.qkv.weight"] = torch.empty(3 * H, H)
        sd[f"double_blocks.{i}.img_mod.lin.weight"] = torch.empty(H, H)
    for i in range(38):
        sd[f"single_blocks.{i}.modulation.lin.weight"] = torch.empty(H, H)

@coderabbitai Bot commented Feb 24, 2026

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check if the test file exists and view the exact code
cat -n tests-unit/comfy_test/model_detection_test.py | head -80



🏁 Script executed:

# Check git status to see if this is a new file or modified
git log --oneline -- tests-unit/comfy_test/model_detection_test.py 2>/dev/null | head -5 || echo "File may be new or not tracked"



🏁 Script executed:

# Calculate memory footprint of the tensors being allocated
python3 << 'EOF'
import torch

# Current allocation (H=3072)
H_current = 3072
C_IN = 16
C_CTX = 3584

# Calculate sizes for _make_longcat_comfyui_sd
tensors_longcat = {
    "img_in.weight": (H_current, C_IN * 4),
    "img_in.bias": (H_current,),
    "txt_in.weight": (H_current, C_CTX),
    "txt_in.bias": (H_current,),
    "time_in.in_layer.weight": (H_current, 256),
    "time_in.in_layer.bias": (H_current,),
    "time_in.out_layer.weight": (H_current, H_current),
    "time_in.out_layer.bias": (H_current,),
    "final_layer.adaLN_modulation.1.weight": (2 * H_current, H_current),
    "final_layer.adaLN_modulation.1.bias": (2 * H_current,),
    "final_layer.linear.weight": (C_IN * 4, H_current),
    "final_layer.linear.bias": (C_IN * 4,),
}

# Add looped tensors (19 blocks)
for i in range(19):
    tensors_longcat[f"double_blocks.{i}.img_attn.norm.key_norm.weight"] = (128,)
    tensors_longcat[f"double_blocks.{i}.img_attn.qkv.weight"] = (3 * H_current, H_current)
    tensors_longcat[f"double_blocks.{i}.img_mod.lin.weight"] = (H_current, H_current)

# Add 38 single blocks
for i in range(38):
    tensors_longcat[f"single_blocks.{i}.modulation.lin.weight"] = (H_current, H_current)

# Calculate total memory (float32 = 4 bytes per element)
total_elements = sum(1 for shape in tensors_longcat.values() for _ in [1] if all(isinstance(x, int) for x in shape))
total_elements = sum(torch.Size(shape).numel() for shape in tensors_longcat.values())
total_bytes = total_elements * 4  # float32
total_gb = total_bytes / (1024**3)

print(f"H=3072 allocations:")
print(f"  Total elements: {total_elements:,}")
print(f"  Total memory: {total_gb:.2f} GB")
print()

# Proposed allocation (H=32)
H_proposed = 32
tensors_longcat_small = {
    "img_in.weight": (H_proposed, C_IN * 4),
    "img_in.bias": (H_proposed,),
    "txt_in.weight": (H_proposed, C_CTX),
    "txt_in.bias": (H_proposed,),
    "time_in.in_layer.weight": (H_proposed, 256),
    "time_in.in_layer.bias": (H_proposed,),
    "time_in.out_layer.weight": (H_proposed, H_proposed),
    "time_in.out_layer.bias": (H_proposed,),
    "final_layer.adaLN_modulation.1.weight": (2 * H_proposed, H_proposed),
    "final_layer.adaLN_modulation.1.bias": (2 * H_proposed,),
    "final_layer.linear.weight": (C_IN * 4, H_proposed),
    "final_layer.linear.bias": (C_IN * 4,),
}

for i in range(19):
    tensors_longcat_small[f"double_blocks.{i}.img_attn.norm.key_norm.weight"] = (128,)
    tensors_longcat_small[f"double_blocks.{i}.img_attn.qkv.weight"] = (3 * H_proposed, H_proposed)
    tensors_longcat_small[f"double_blocks.{i}.img_mod.lin.weight"] = (H_proposed, H_proposed)

for i in range(38):
    tensors_longcat_small[f"single_blocks.{i}.modulation.lin.weight"] = (H_proposed, H_proposed)

total_elements_small = sum(torch.Size(shape).numel() for shape in tensors_longcat_small.values())
total_bytes_small = total_elements_small * 4
total_gb_small = total_bytes_small / (1024**3)

print(f"H=32 allocations:")
print(f"  Total elements: {total_elements_small:,}")
print(f"  Total memory: {total_gb_small:.4f} GB ({total_bytes_small / (1024**2):.2f} MB)")
print()
print(f"Reduction: {total_gb / total_gb_small:.1f}x smaller")
EOF



🏁 Script executed:

# Calculate memory footprint manually (without torch module)
python3 << 'EOF'
# Current allocation (H=3072)
H_current = 3072
C_IN = 16
C_CTX = 3584

# Calculate elements for _make_longcat_comfyui_sd
elements = {}
elements["img_in.weight"] = H_current * (C_IN * 4)  # 3072 * 64
elements["img_in.bias"] = H_current
elements["txt_in.weight"] = H_current * C_CTX  # 3072 * 3584
elements["txt_in.bias"] = H_current
elements["time_in.in_layer.weight"] = H_current * 256
elements["time_in.in_layer.bias"] = H_current
elements["time_in.out_layer.weight"] = H_current * H_current  # 3072 * 3072
elements["time_in.out_layer.bias"] = H_current
elements["final_layer.adaLN_modulation.1.weight"] = (2 * H_current) * H_current  # 6144 * 3072
elements["final_layer.adaLN_modulation.1.bias"] = 2 * H_current
elements["final_layer.linear.weight"] = (C_IN * 4) * H_current
elements["final_layer.linear.bias"] = C_IN * 4

# Add loops
for i in range(19):
    elements[f"double_blocks.{i}.img_attn.norm.key_norm.weight"] = 128
    elements[f"double_blocks.{i}.img_attn.qkv.weight"] = 3 * H_current * H_current
    elements[f"double_blocks.{i}.img_mod.lin.weight"] = H_current * H_current

for i in range(38):
    elements[f"single_blocks.{i}.modulation.lin.weight"] = H_current * H_current

total_elements = sum(elements.values())
total_bytes = total_elements * 4  # float32
total_gb = total_bytes / (1024**3)

print(f"Longcat with H=3072:")
print(f"  Total elements: {total_elements:,}")
print(f"  Total memory: {total_gb:.2f} GB")
print()

# Proposed allocation (H=32)
H_proposed = 32
elements_small = {}
elements_small["img_in.weight"] = H_proposed * (C_IN * 4)
elements_small["img_in.bias"] = H_proposed
elements_small["txt_in.weight"] = H_proposed * C_CTX
elements_small["txt_in.bias"] = H_proposed
elements_small["time_in.in_layer.weight"] = H_proposed * 256
elements_small["time_in.in_layer.bias"] = H_proposed
elements_small["time_in.out_layer.weight"] = H_proposed * H_proposed
elements_small["time_in.out_layer.bias"] = H_proposed
elements_small["final_layer.adaLN_modulation.1.weight"] = (2 * H_proposed) * H_proposed
elements_small["final_layer.adaLN_modulation.1.bias"] = 2 * H_proposed
elements_small["final_layer.linear.weight"] = (C_IN * 4) * H_proposed
elements_small["final_layer.linear.bias"] = C_IN * 4

for i in range(19):
    elements_small[f"double_blocks.{i}.img_attn.norm.key_norm.weight"] = 128
    elements_small[f"double_blocks.{i}.img_attn.qkv.weight"] = 3 * H_proposed * H_proposed
    elements_small[f"double_blocks.{i}.img_mod.lin.weight"] = H_proposed * H_proposed

for i in range(38):
    elements_small[f"single_blocks.{i}.modulation.lin.weight"] = H_proposed * H_proposed

total_elements_small = sum(elements_small.values())
total_bytes_small = total_elements_small * 4
total_gb_small = total_bytes_small / (1024**3)

print(f"Longcat with H=32:")
print(f"  Total elements: {total_elements_small:,}")
print(f"  Total memory: {total_gb_small:.4f} GB ({total_bytes_small / (1024**2):.2f} MB)")
print()
print(f"Reduction factor: {total_gb / total_gb_small:.0f}x smaller")
EOF



🏁 Script executed:

# Check how detect_unet_config and model_config_from_unet_config use the state dict
rg -A 10 "def detect_unet_config" --type py



🏁 Script executed:

# Search for how these detection functions inspect the state dict
rg "double_blocks|single_blocks|img_attn" comfy/ --type py -A 2 | head -100



🏁 Script executed:

# Check the git diff to see if these test functions are newly added
git diff HEAD~1 tests-unit/comfy_test/model_detection_test.py 2>/dev/null | head -150 || echo "Git diff not available, checking if file is new"



🏁 Script executed:

# Check when this test file was created
git log --follow --format="%H %ad %s" -- tests-unit/comfy_test/model_detection_test.py 2>/dev/null | head -5 || echo "No git history"



🏁 Script executed:

# Let's check the actual detection logic more carefully - see if shapes/values are read
rg "state_dict\[" comfy/model_detection.py -A 1 -B 1 | head -150



🏁 Script executed:

# Verify the detection logic doesn't hardcode H values
rg "3072|4096" comfy/model_detection.py



🏁 Script executed:

# Check if there are any shape-based assertions or value comparisons for LongCat/Flux detection
rg -A 5 "LongCat|FluxSchnell" comfy/supported_models.py | head -50



🏁 Script executed:

# Verify the detection logic reads shapes from state_dict tensors
rg "in_channels|context_in_dim|hidden_size" comfy/model_detection.py | grep "state_dict\[" | head -20



🏁 Script executed:

# Check if detection has any hardcoded dimension checks for LongCat specifically
rg "3584|LongCat" comfy/model_detection.py -B 3 -A 3



Reduce test tensor sizes to avoid multi-GB allocations.

These fixtures allocate several gigabytes of tensors (H=3072 with 19+38 blocks ~4.16 GB total), which can cause CI failures. The detection logic only inspects tensor shapes and key presence, not values—reducing H to 32 while keeping C_CTX at 3584 preserves correctness since context_in_dim is read from txt_in.weight.shape[1].

🧩 Suggested change
 def _make_longcat_comfyui_sd():
     """Minimal ComfyUI-format state dict for pre-converted LongCat-Image weights."""
     sd = {}
-    H = 3072
+    H = 32  # keep tests lightweight; only shapes and keys are used
     C_IN = 16
     C_CTX = 3584

 def _make_flux_schnell_comfyui_sd():
     """Minimal ComfyUI-format state dict for standard Flux Schnell."""
     sd = {}
-    H = 3072
+    H = 32  # keep tests lightweight; only shapes and keys are used
     C_IN = 16
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests-unit/comfy_test/model_detection_test.py` around lines 8 - 36, The
fixture _make_longcat_comfyui_sd allocates very large tensors (H=3072) causing
multi-GB memory usage; change H to a small value (e.g., 32) while keeping C_CTX
at 3584 so txt_in.weight.shape[1] still reflects the real context dim, and keep
other shape formulas (C_IN, C_CTX, and all uses like "img_in.weight",
"txt_in.weight", "time_in.*", "final_layer.*", and blocks in "double_blocks.*"
and "single_blocks.*") unchanged so the detection logic that reads tensor shapes
and key presence continues to work but without large memory allocations.

@coderabbitai Bot left a comment

🧩 Analysis chain

I need to find the conversion code the user is referencing to provide complete context.


🏁 Script executed:

#!/bin/bash
# Search for the weight conversion code that maps norm1.linear to img_mod.lin
rg "norm1\.linear" --type py -B 5 -A 3



@saiphyoheinkyaw-jpg Thanks for sharing that context! You're right—that's the conversion logic from comfy/utils.py that maps HuggingFace format (norm1.linear.*) to ComfyUI format (img_mod.lin.*) for double blocks.

The test fixtures in model_detection_test.py create pre-converted state dicts (already in ComfyUI format with keys like double_blocks.{i}.img_mod.lin.weight), so they bypass the conversion step and test only the detection logic.

Since detection reads:

  • Tensor shapes (e.g., context_in_dim from txt_in.weight.shape[1] which stays 3584)
  • Key presence (not values)

Reducing H from 3072 to 32 in the test fixtures is safe and avoids the multi-GB memory allocations while keeping all detection logic intact. The conversion code you showed isn't affected since these tests don't exercise it.


🧠 Learnings used
Learnt from: pythongosssss
Repo: Comfy-Org/ComfyUI PR: 12555
File: comfy_extras/nodes_glsl.py:719-724
Timestamp: 2026-02-21T14:01:41.482Z
Learning: In PyOpenGL, bare Python scalars can be accepted for 1-element array parameters by NumberHandler. This means you can pass an int/float directly to OpenGL texture deletion (e.g., glDeleteTextures(tex)) without wrapping in a list. Verify function-specific expectations and ensure types match what the OpenGL call expects; use explicit lists only when the API requires an array.

@coderabbitai Bot left a comment

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for the conversion code the user is referencing
rg -n "norm1\.linear\." --type py -A 2 -B 2

# Also search for the prefix formatting pattern
rg -n 'prefix.*double_blocks.*format' --type py -A 3 -B 3



@saiphyoheinkyaw-jpg Thanks for sharing the conversion code context! You're showing the weight conversion logic from comfy/utils.py that transforms HuggingFace format (transformer_blocks.{i}.norm1.linear.*) into ComfyUI format (double_blocks.{i}.img_mod.lin.*).

The test fixtures in _make_longcat_comfyui_sd() already create pre-converted ComfyUI-format weights—they use keys like double_blocks.{i}.img_mod.lin.weight which match the OUTPUT of the conversion you're showing. This is correct and intentional, as noted in the test's docstring: "Minimal ComfyUI-format state dict for pre-converted LongCat-Image weights."

Regarding CodeRabbit's suggestion to reduce H from 3072 to 32:

This change is safe because:

  • Detection logic (detect_unet_config) only inspects key presence and specific shape dimensions (like C_CTX = 3584 from txt_in.weight.shape[1])
  • It doesn't read H values or tensor contents
  • Reducing H to 32 preserves all key names and the relevant shape dimensions while cutting memory from ~4 GB to ~4 MB

The key structure you're showing (img_mod.lin.*) remains unchanged—only the hidden dimension shrinks in the test tensors.


🧠 Learnings used
Learnt from: pythongosssss
Repo: Comfy-Org/ComfyUI PR: 12555
File: comfy_extras/nodes_glsl.py:719-724
Timestamp: 2026-02-21T14:01:41.482Z
Learning: In PyOpenGL, bare Python scalars can be accepted for 1-element array parameters by NumberHandler. This means you can pass an int/float directly to OpenGL texture deletion (e.g., glDeleteTextures(tex)) without wrapping in a list. Verify function-specific expectations and ensure types match what the OpenGL call expects; use explicit lists only when the API requires an array.

@Talmaj (Contributor, Author) commented Feb 25, 2026

I've removed the change in model_config_from_unet_config and put the LongCat-Image model in front of FluxSchnell. It auto-selects correctly, and it shouldn't influence auto-selection of other models:

LongCatImage's 5-key config only matches when all of context_in_dim == 3584, vec_in_dim is None, and
txt_ids_dims == [1, 2] are present, so it never falsely matches other Flux variants.

@coderabbitai Bot left a comment

♻️ Duplicate comments (3)
tests-unit/comfy_test/model_detection_test.py (2)

7-36: ⚠️ Potential issue | 🟠 Major

Reduce H to avoid multi-GB allocations in CI (still unaddressed).

With H = 3072, _make_longcat_comfyui_sd allocates well over 4 GB of float32 tensors (19 double blocks alone account for 19 × (3×3072²) ≈ 537 M elements). Detection logic only inspects .shape and key presence—H = 32 works identically because context_in_dim is inferred from txt_in.weight.shape[1] = C_CTX = 3584, which is unchanged.

🔧 Proposed fix
 def _make_longcat_comfyui_sd():
     """Minimal ComfyUI-format state dict for pre-converted LongCat-Image weights."""
     sd = {}
-    H = 3072
+    H = 32  # keep tests lightweight; only shapes and key presence are used
     C_IN = 16
     C_CTX = 3584
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests-unit/comfy_test/model_detection_test.py` around lines 7 - 36, The test
helper _make_longcat_comfyui_sd creates very large tensors (H = 3072) causing
multi-GB allocations; change H to a small value (e.g., 32) and update any tensor
shapes that use H (all occurrences in sd keys like "img_in.weight",
"img_in.bias", "txt_in.weight", "txt_in.bias", "time_in.*", "final_layer.*", and
the loops creating "double_blocks.{i}..." and "single_blocks.{i}...") so the
detection logic still sees correct dimensionality but with tiny allocations;
leave C_CTX and loop counts unchanged so context_in_dim inference via
txt_in.weight.shape[1] remains the same.

39-59: ⚠️ Potential issue | 🟠 Major

Same H = 3072 allocation issue in _make_flux_schnell_comfyui_sd.

Same fix applies; context_in_dim is read from txt_in.weight.shape[1] = 4096, independent of H.

🔧 Proposed fix
 def _make_flux_schnell_comfyui_sd():
     """Minimal ComfyUI-format state dict for standard Flux Schnell."""
     sd = {}
-    H = 3072
+    H = 32  # keep tests lightweight; only shapes and key presence are used
     C_IN = 16
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests-unit/comfy_test/model_detection_test.py` around lines 39 - 59, The test
helper _make_flux_schnell_comfyui_sd hardcodes txt_in.weight with shape (H,
4096) which conflates H with context dimension; introduce a separate variable
(e.g., CONTEXT_IN = 4096) and allocate txt_in.weight as torch.empty(H,
CONTEXT_IN) (and use CONTEXT_IN wherever the code should reflect the
context/input embedding width), leaving H = 3072 for channel/hidden sizes—this
ensures context_in_dim is read correctly from txt_in.weight.shape[1] and avoids
mixing H and context dimensions.
comfy/text_encoders/longcat_image.py (1)

137-143: ⚠️ Potential issue | 🟠 Major

The template_end == -1 guard fires too late; the +3 check can accidentally fire on index 0/1 (still unaddressed).

When no <|im_start|> (151644) token is found, template_end stays -1 after the loop. The block at Lines 137–140 then evaluates out.shape[1] > 2 (almost always True) and accidentally inspects tok_pairs[0] and tok_pairs[1] (because -1 + 1 = 0 and -1 + 2 = 1). If those tokens happen to be 872 and 198, template_end becomes 2 and the guard at Line 142 is bypassed, causing out[:, 2:] to silently discard the first two tokens.

The fix is to only run the +3 newline adjustment when template_end was actually set by the loop:

🛠️ Proposed fix
-        if out.shape[1] > (template_end + 3):
-            if tok_pairs[template_end + 1][0] == 872:
-                if tok_pairs[template_end + 2][0] == 198:
-                    template_end += 3
-
-        if template_end == -1:
-            template_end = 0
+        if template_end == -1:
+            template_end = 0
+        elif out.shape[1] > (template_end + 3):
+            if tok_pairs[template_end + 1][0] == 872:
+                if tok_pairs[template_end + 2][0] == 198:
+                    template_end += 3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/text_encoders/longcat_image.py` around lines 137 - 143, The post-loop
"+3" adjustment currently runs even when template_end is still -1 and can index
tok_pairs[0/1]; change the logic so the checks that inspect
tok_pairs[template_end + 1] and tok_pairs[template_end + 2] only run when
template_end != -1 (i.e., the loop actually found the <|im_start|> marker).
Concretely, wrap the entire if-block that tests out.shape and tok_pairs[...]
with a guard like "if template_end != -1 and out.shape[1] > (template_end +
3):", leaving the existing fallback that sets template_end = 0 after that. This
ensures tok_pairs and template_end adjustments only occur when template_end was
set by the earlier search.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@comfy/text_encoders/longcat_image.py`:
- Around line 137-143: The post-loop "+3" adjustment currently runs even when
template_end is still -1 and can index tok_pairs[0/1]; change the logic so the
checks that inspect tok_pairs[template_end + 1] and tok_pairs[template_end + 2]
only run when template_end != -1 (i.e., the loop actually found the <|im_start|>
marker). Concretely, wrap the entire if-block that tests out.shape and
tok_pairs[...] with a guard like "if template_end != -1 and out.shape[1] >
(template_end + 3):", leaving the existing fallback that sets template_end = 0
after that. This ensures tok_pairs and template_end adjustments only occur when
template_end was set by the earlier search.

In `@tests-unit/comfy_test/model_detection_test.py`:
- Around line 7-36: The test helper _make_longcat_comfyui_sd creates very large
tensors (H = 3072) causing multi-GB allocations; change H to a small value
(e.g., 32) and update any tensor shapes that use H (all occurrences in sd keys
like "img_in.weight", "img_in.bias", "txt_in.weight", "txt_in.bias",
"time_in.*", "final_layer.*", and the loops creating "double_blocks.{i}..." and
"single_blocks.{i}...") so the detection logic still sees correct dimensionality
but with tiny allocations; leave C_CTX and loop counts unchanged so
context_in_dim inference via txt_in.weight.shape[1] remains the same.
- Around line 39-59: The test helper _make_flux_schnell_comfyui_sd hardcodes
txt_in.weight with shape (H, 4096) which conflates H with context dimension;
introduce a separate variable (e.g., CONTEXT_IN = 4096) and allocate
txt_in.weight as torch.empty(H, CONTEXT_IN) (and use CONTEXT_IN wherever the
code should reflect the context/input embedding width), leaving H = 3072 for
channel/hidden sizes—this ensures context_in_dim is read correctly from
txt_in.weight.shape[1] and avoids mixing H and context dimensions.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 310924a and 4b6fe40.

📒 Files selected for processing (6)
  • blueprints/Text to Image (LongCat-Image).json
  • comfy/model_detection.py
  • comfy/supported_models.py
  • comfy/text_encoders/longcat_image.py
  • comfy_extras/nodes_longcat_image.py
  • tests-unit/comfy_test/model_detection_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • comfy/model_detection.py
  • blueprints/Text to Image (LongCat-Image).json

@comfy-pr-bot (Member)

Test Evidence Check

⚠️ Warning: Visual Documentation Missing

If this PR changes user-facing behavior, visual proof (screen recording or screenshot) is required. PRs without applicable visual documentation may not be reviewed until provided.

You can add it by:

  • GitHub: Drag & drop media directly into the PR description
  • YouTube: Include a link to a short demo

@Kosinkadink (Member)

Tested the PR on my machine with the repackaged model, and it works! There is currently a merge conflict to resolve; other than that, it would be nice to see how this PR's results compare to the original implementation's.

Comment thread comfy_extras/nodes_longcat_image.py Outdated
Comment thread comfy_extras/nodes_longcat_image.py Outdated
comfyanonymous merged commit ac4412d into Comfy-Org:master Feb 28, 2026
13 checks passed
@matalama80td3l

So will LongCat-Image-Edit have a native implementation?

@Amazon90

@comfyanonymous Is there a workflow?

@jtreminio

jtreminio commented Feb 28, 2026

@comfyanonymous Is there a workflow?

Just use standard t2i workflow, load model from diffusion_models directory.

@zwukong

zwukong commented Mar 1, 2026

The Edit Turbo model is worth a try:
https://huggingface.co/meituan-longcat/LongCat-Image-Edit-Turbo

@Amazon90

Amazon90 commented Mar 1, 2026

Does it currently only support the basic text-to-image model? Are the edit model and the turbo model still not supported?

@yayonguan

@Talmaj @comfyanonymous This is great work, and I really appreciate the support for the LongCat-Image model. I am one of the developers of LongCat-Image. It looks like the text-to-image model is already supported. If time permits, could you please add support for LongCat-Image-Edit and LongCat-Image-Edit-Turbo as well? This would help more developers use the LongCat-Image series of models, thanks!

@Talmaj (Contributor, Author) commented Mar 2, 2026

Thank you! I'm planning to add support for LongCat-Image-Edit next.

@yayonguan

Thank you! I'm planning to add support for LongCat-Image-Edit next.

Great, looking forward to it.

Deluxe233 added a commit to Deluxe233/ComfyUI that referenced this pull request Mar 14, 2026
commit 16cd8d8a8f5f16ce7e5f929fdba9f783990254ea
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Fri Mar 13 19:33:28 2026 -0700

    Update README. (#12931)

commit 7810f49702eac6e617eb7f2c30b00a8939ef1404
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Fri Mar 13 19:18:08 2026 -0700

    comfy aimdo 0.2.11 + Improved RAM Pressure release strategies - Windows speedups (#12925)

    * Implement seek and read for pins

    Source pins from an mmap is pad because its its a CPU->CPU copy that
    attempts to fully buffer the same data twice. Instead, use seek and
    read which avoids the mmap buffering while usually being a faster
    read in the first place (avoiding mmap faulting etc).

    * pinned_memory: Use Aimdo pinner

    The aimdo pinner bypasses pytorches CPU allocator which can leak
    windows commit charge.

    * ops: bypass init() of weight for embedding layer

    This similarly consumes large commit charge especially for TEs. It can
    cause a permanement leaked commit charge which can destabilize on
    systems close to the commit ceiling and generally confuses the RAM
    stats.

    * model_patcher: implement pinned memory counter

    Implement a pinned memory counter for better accounting of what volume
    of memory pins have.

    * implement touch accounting

    Implement accounting of touching mmapped tensors.

    * mm+mp: add residency mmap getter

    * utils: use the aimdo mmap to load sft files

    * model_management: Implement tigher RAM pressure semantics

    Implement a pressure release on entire MMAPs as windows does perform
    faster when mmaps are unloaded and model loads free ramp into fully
    unallocated RAM.

    Make the concept of freeing for pins a completely separate concept.
    Now that pins are loadable directly from original file and don' touch
    the mmap, tighten the freeing budget to just the current loaded model
    - what you have left over. This still over-frees pins, but its a lot
    better than before.

    So after the pins are freed with that algorithm, bounce entire MMAPs
    to free RAM based on what the model needs, deducting off any known
    resident-in-mmap tensors to the free quota to keep it as tight as
    possible.

    * comfy-aimdo 0.2.11

    Comfy aimdo 0.2.11

    * mm: Implement file_slice path for QT

    * ruff

    * ops: put meta-tensors in place to allow custom nodes to check geo

commit e1f10ca0932faf289757e7ec27a54894e271fdde
Author: Dr.Lt.Data <128333288+ltdrdata@users.noreply.github.com>
Date:   Sat Mar 14 09:14:27 2026 +0900

    bump manager version to 4.1b4 (#12930)

commit 6cd35a0c5fd7d22df858be175f6a6e6ee0212e55
Author: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Date:   Sat Mar 14 03:31:25 2026 +0900

    Bump comfyui-frontend-package to 1.41.19 (#12923)

commit f9ceed9eefe20f6b54b801096cb80f874316f5b2
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Fri Mar 13 19:10:40 2026 +0200

    fix(api-nodes): Tencent TextToModel and ImageToModel nodes (#12680)

    * fix(api-nodes): added "texture_image" output to TencentTextToModel and TencentImageToModel nodes. Fixed `OBJ` output when it is zipped

    * support additional solid texture outputs

    * fixed and enabled Tencent3DTextureEdit node

commit 4a8cf359fe596fc4c25a0d335d303e42c3f8605d
Author: Deep Mehta <42841935+deepme987@users.noreply.github.com>
Date:   Thu Mar 12 21:17:50 2026 -0700

    Revert "Revert "feat: Add CacheProvider API for external distributed caching"" (#12915)

    * Revert "Revert "feat: Add CacheProvider API for external distributed caching …"

    This reverts commit d1d53c14be8442fca19aae978e944edad1935d46.

    * fix: gate provider lookups to outputs cache and fix UI coercion

    - Add `enable_providers` flag to BasicCache so only the outputs cache
      triggers external provider lookups/stores. The objects cache stores
      node class instances, not CacheEntry values, so provider calls were
      wasted round-trips that always missed.
    - Remove `or {}` coercion on `result.ui` — an empty dict passes the
      `is not None` gate in execution.py and causes KeyError when the
      history builder indexes `["output"]` and `["meta"]`. Preserving
      `None` correctly skips the ui_node_outputs addition.

commit 63d1bbdb407c69370d407ce5ced6ca3f917528a8
Author: comfyanonymous <comfyanonymous@protonmail.com>
Date:   Thu Mar 12 20:41:48 2026 -0400

    ComfyUI v0.17.0

commit 5df1427124f6ceb70166326ee257d52076adea37
Author: PxTicks <PxTicks@gmail.com>
Date:   Fri Mar 13 00:44:15 2026 +0000

    Fix audio extraction and truncation bugs (#12652)

    Bug report in #12651

    - to_skip fix: Prevents negative array slicing when the start offset is negative.
    - __duration check: Prevents the extraction loop from breaking after a single audio chunk when the requested duration is 0 (which is a sentinel for unlimited).

commit d1d53c14be8442fca19aae978e944edad1935d46
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Thu Mar 12 17:21:23 2026 -0700

    Revert "feat: Add CacheProvider API for external distributed caching (#12056)" (#12912)

    This reverts commit af7b4a921d7abab7c852d7b5febb654be6e57eba.

commit af7b4a921d7abab7c852d7b5febb654be6e57eba
Author: Deep Mehta <42841935+deepme987@users.noreply.github.com>
Date:   Thu Mar 12 16:09:07 2026 -0700

    feat: Add CacheProvider API for external distributed caching (#12056)

    * feat: Add CacheProvider API for external distributed caching

    Introduces a public API for external cache providers, enabling distributed
    caching across multiple ComfyUI instances (e.g., Kubernetes pods).

    New files:
    - comfy_execution/cache_provider.py: CacheProvider ABC, CacheContext/CacheValue
      dataclasses, thread-safe provider registry, serialization utilities

    Modified files:
    - comfy_execution/caching.py: Add provider hooks to BasicCache (_notify_providers_store,
      _check_providers_lookup), subcache exclusion, prompt ID propagation
    - execution.py: Add prompt lifecycle hooks (on_prompt_start/on_prompt_end) to
      PromptExecutor, set _current_prompt_id on caches

    Key features:
    - Local-first caching (check local before external for performance)
    - NaN detection to prevent incorrect external cache hits
    - Subcache exclusion (ephemeral subgraph results not cached externally)
    - Thread-safe provider snapshot caching
    - Graceful error handling (provider errors logged, never break execution)

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

    * fix: use deterministic hash for cache keys instead of pickle

    Pickle serialization is NOT deterministic across Python sessions due
    to hash randomization affecting frozenset iteration order. This causes
    distributed caching to fail because different pods compute different
    hashes for identical cache keys.

    Fix: Use _canonicalize() + JSON serialization which ensures deterministic
    ordering regardless of Python's hash randomization.

    This is critical for cross-pod cache key consistency in Kubernetes
    deployments.

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>
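
    A sketch of the deterministic serialization described above (the
    function names follow this message; the details are assumptions):

        import hashlib
        import json

        def _canonicalize(value):
            # Sort unordered containers so iteration order cannot leak in.
            if isinstance(value, frozenset):
                return sorted((_canonicalize(v) for v in value), key=repr)
            if isinstance(value, dict):
                return {k: _canonicalize(v) for k, v in sorted(value.items())}
            if isinstance(value, (list, tuple)):
                return [_canonicalize(v) for v in value]
            return value

        def serialize_cache_key(key):
            # json.dumps with sort_keys is stable across sessions, unlike
            # pickle, whose output depends on hash randomization.
            blob = json.dumps(_canonicalize(key), sort_keys=True, default=str)
            return hashlib.sha256(blob.encode()).hexdigest()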

    * test: add unit tests for CacheProvider API

    - Add comprehensive tests for _canonicalize deterministic ordering
    - Add tests for serialize_cache_key hash consistency
    - Add tests for contains_nan utility
    - Add tests for estimate_value_size
    - Add tests for provider registry (register, unregister, clear)
    - Move json import to top-level (fix inline import)

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

    * style: remove unused imports in test_cache_provider.py

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

    * fix: move _torch_available before usage and use importlib.util.find_spec

    Fixes ruff F821 (undefined name) and F401 (unused import) errors.

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>
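
    For reference, the availability check this lands on:

        import importlib.util

        # Detect torch without importing it at module load time.
        _torch_available = importlib.util.find_spec("torch") is not None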

    * fix: use hashable types in frozenset test and add dict test

    Frozensets can only contain hashable types, so use nested frozensets
    instead of dicts. Added separate test for dict handling via serialize_cache_key.

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

    * refactor: expose CacheProvider API via comfy_api.latest.Caching

    - Add Caching class to comfy_api/latest/__init__.py that re-exports
      from comfy_execution.cache_provider (source of truth)
    - Fix docstring: "Skip large values" instead of "Skip small values"
      (small compute-heavy values are good cache targets)
    - Maintain backward compatibility: comfy_execution.cache_provider
      imports still work

    Usage:
        from comfy_api.latest import Caching

        class MyProvider(Caching.CacheProvider):
            def on_lookup(self, context): ...
            def on_store(self, context, value): ...

        Caching.register_provider(MyProvider())

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

    * docs: clarify should_cache filtering criteria

    Change docstring from "Skip large values" to "Skip if download time > compute time"
    which better captures the cost/benefit tradeoff for external caching.

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

    * docs: make should_cache docstring implementation-agnostic

    Remove prescriptive filtering suggestions - let implementations
    decide their own caching logic based on their use case.

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

    * feat: add optional ui field to CacheValue

    - Add ui field to CacheValue dataclass (default None)
    - Pass ui when creating CacheValue for external providers
    - Use result.ui (or default {}) when returning from external cache lookup

    This allows external cache implementations to store/retrieve UI data
    if desired, while remaining optional for implementations that skip it.

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>
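
    Assumed shape of the dataclass after this change (the non-ui field is
    illustrative):

        from dataclasses import dataclass
        from typing import Any, Optional

        @dataclass
        class CacheValue:
            outputs: Any               # the cached node outputs
            ui: Optional[dict] = None  # optional UI data; None when skipped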

    * refactor: rename _is_cacheable_value to _is_external_cacheable_value

    Clearer name since objects are also cached locally - this specifically
    checks for external caching eligibility.

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

    * refactor: async CacheProvider API + reduce public surface

    - Make on_lookup/on_store async on CacheProvider ABC
    - Simplify CacheContext: replace cache_key + cache_key_bytes with
      cache_key_hash (str hex digest)
    - Make registry/utility functions internal (_prefix)
    - Trim comfy_api.latest.Caching exports to core API only
    - Make cache get/set async throughout caching.py hierarchy
    - Use asyncio.create_task for fire-and-forget on_store
    - Add NaN gating before provider calls in Core
    - Add await to 5 cache call sites in execution.py

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
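
    Under the async ABC, an external provider would look roughly like this
    (signatures inferred from this message and the usage example above):

        from comfy_api.latest import Caching

        class DictProvider(Caching.CacheProvider):
            """Stand-in for a real external backend."""
            def __init__(self):
                self._store = {}

            async def on_lookup(self, context):
                # context.cache_key_hash is now a str hex digest.
                return self._store.get(context.cache_key_hash)

            async def on_store(self, context, value):
                self._store[context.cache_key_hash] = value

        Caching.register_provider(DictProvider())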

    * fix: remove unused imports (ruff) and update tests for internal API

    - Remove unused CacheContext and _serialize_cache_key imports from
      caching.py (now handled by _build_context helper)
    - Update test_cache_provider.py to use _-prefixed internal names
    - Update tests for new CacheContext.cache_key_hash field (str)
    - Make MockCacheProvider methods async to match ABC

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: address coderabbit review feedback

    - Add try/except to _build_context, return None when hash fails
    - Return None from _serialize_cache_key on total failure (no id()-based fallback)
    - Replace hex-like test literal with non-secret placeholder

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: use _-prefixed imports in _notify_prompt_lifecycle

    The lifecycle notification method was importing the old non-prefixed
    names (has_cache_providers, get_cache_providers, logger) which no
    longer exist after the API cleanup.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: add sync get_local/set_local for graph traversal

    ExecutionList in graph.py calls output_cache.get() and .set() from
    sync methods (is_cached, cache_link, get_cache). These cannot await
    the now-async get/set. Add get_local/set_local that bypass external
    providers and only access the local dict — which is all graph
    traversal needs.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
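
    Roughly (a sketch; _check_providers_lookup is the hook named earlier
    in this log, defined elsewhere):

        class BasicCache:
            def __init__(self):
                self._dict = {}

            async def get(self, key):
                value = self._dict.get(key)
                if value is None:
                    value = await self._check_providers_lookup(key)
                return value

            def get_local(self, key):
                # Sync path for graph traversal: local dict only, no
                # external providers involved.
                return self._dict.get(key)

            def set_local(self, key, value):
                self._dict[key] = value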

    * chore: remove cloud-specific language from cache provider API

    Make all docstrings and comments generic for the OSS codebase.
    Remove references to Kubernetes, Redis, GCS, pods, and other
    infrastructure-specific terminology.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * style: align documentation with codebase conventions

    Strip verbose docstrings and section banners to match existing minimal
    documentation style used throughout the codebase.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: add usage example to Caching class, remove pickle fallback

    - Add docstring with usage example to Caching class matching the
      convention used by sibling APIs (Execution.set_progress, ComfyExtension)
    - Remove non-deterministic pickle fallback from _serialize_cache_key;
      return None on JSON failure instead of producing unretrievable hashes
    - Move cache_provider imports to top of execution.py (no circular dep)

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * refactor: move public types to comfy_api, eager provider snapshot

    Address review feedback:
    - Move CacheProvider/CacheContext/CacheValue definitions to
      comfy_api/latest/_caching.py (source of truth for public API)
    - comfy_execution/cache_provider.py re-exports types from there
    - Build _providers_snapshot eagerly on register/unregister instead
      of lazy memoization in _get_cache_providers

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: generalize self-inequality check, fail-closed canonicalization

    Address review feedback from guill:
    - Rename _contains_nan to _contains_self_unequal, use not (x == x)
      instead of math.isnan to catch any self-unequal value
    - Remove Unhashable and repr() fallbacks from _canonicalize; raise
      ValueError for unknown types so _serialize_cache_key returns None
      and external caching is skipped (fail-closed)
    - Update tests for renamed function and new fail-closed behavior

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
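
    The generalized check, approximately as described:

        def _contains_self_unequal(value):
            # NaN fails x == x, and so does any other self-unequal value;
            # this generalizes math.isnan beyond floats.
            return not (value == value)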

    * fix: suppress ruff F401 for re-exported CacheContext

    CacheContext is imported from _caching and re-exported for use by
    caching.py. Add noqa comment to satisfy the linter.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: enable external caching for subcache (expanded) nodes

    Subcache nodes (from node expansion) now participate in external
    provider store/lookup. Previously skipped to avoid duplicates, but
    the cost of missing partial-expansion cache hits outweighs redundant
    stores — especially with looping behavior on the horizon.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: wrap register/unregister as explicit static methods

    Define register_provider and unregister_provider as wrapper functions
    in the Caching class instead of re-importing. This locks the public
    API signature in comfy_api/ so internal changes can't accidentally
    break it.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: use debug-level logging for provider registration

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: follow ProxiedSingleton pattern for Caching class

    Add Caching as a nested class inside ComfyAPI_latest inheriting from
    ProxiedSingleton with async instance methods, matching the Execution
    and NodeReplacement patterns. Retains standalone Caching class for
    direct import convenience.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: inline registration logic in Caching class

    Follow the Execution/NodeReplacement pattern — the public API methods
    contain the actual logic operating on cache_provider module state,
    not wrapper functions delegating to free functions.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: single Caching definition inside ComfyAPI_latest

    Remove duplicate standalone Caching class. Define it once as a nested
    class in ComfyAPI_latest (matching Execution/NodeReplacement pattern),
    with a module-level alias for import convenience.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: remove prompt_id from CacheContext, type-safe canonicalization

    Remove prompt_id from CacheContext — it's not relevant for cache
    matching and added unnecessary plumbing (_current_prompt_id on every
    cache). Lifecycle hooks still receive prompt_id directly.

    Include type name in canonicalized primitives so that int 7 and
    str "7" produce distinct hashes. Also canonicalize dict keys properly
    instead of str() coercion.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: address review feedback on cache provider API

    - Hold references to pending store tasks to prevent "Task was destroyed
      but it is still pending" warnings (bigcat88)
    - Parallel cache lookups with asyncio.gather instead of sequential
      awaits for better performance (bigcat88)
    - Delegate Caching.register/unregister_provider to existing functions
      in cache_provider.py instead of reimplementing (bigcat88)

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
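
    The two asyncio patterns referenced here, for context (names
    hypothetical):

        import asyncio

        _pending_stores = set()

        def _schedule_store(coro):
            # Hold a strong reference so the fire-and-forget task is not
            # garbage-collected while still pending.
            task = asyncio.create_task(coro)
            _pending_stores.add(task)
            task.add_done_callback(_pending_stores.discard)

        async def _lookup_all(providers, context):
            # Query all providers in parallel instead of sequentially.
            return await asyncio.gather(*(p.on_lookup(context) for p in providers))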

    ---------

    Co-authored-by: Claude <noreply@anthropic.com>

commit 8d9faaa181b9089cf8e4e00284443ef5c3405a12
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Thu Mar 12 15:14:59 2026 -0700

    Update requirements.txt (#12910)

commit 47e1e316c580ce6bf264cb069bffc10a50d3f167
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Thu Mar 12 13:54:38 2026 -0700

    Lower kv cache memory usage. (#12909)

commit 712411d53919350ae5050cbdf7ed60fcc2b52cda
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Fri Mar 13 03:16:54 2026 +0800

    chore: update workflow templates to v0.9.21 (#12908)

commit 3fa8c5686dc86fe4e63ad3ca84d71524792a17b1
Author: Terry Jia <terryjia88@gmail.com>
Date:   Thu Mar 12 10:14:28 2026 -0700

    fix: use frontend-compatible format for Float gradient_stops (#12789)

    Co-authored-by: guill <jacob.e.segal@gmail.com>
    Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

commit 73d9599495e45c22ef3672176f34945deeea5444
Author: Terry Jia <terryjia88@gmail.com>
Date:   Thu Mar 12 09:55:29 2026 -0700

    add painter node (#12294)

    * add painter node

    * use io.Color

    * code improve

    ---------

    Co-authored-by: guill <jacob.e.segal@gmail.com>

commit 44f1246c899ed188759f799dbd00c31def289114
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Thu Mar 12 08:30:50 2026 -0700

    Support flux 2 klein kv cache model: Use the FluxKVCache node. (#12905)

commit 8f9ea495713d4565dfe564e0c06f362bd627f902
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Mar 11 21:17:31 2026 -0700

    Bump comfy-kitchen version to 0.2.8 (#12895)

commit 9ce4c3dd87c9c77dfe0371045fa920ce55e08973
Author: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Date:   Thu Mar 12 10:16:30 2026 +0900

    Bump comfyui-frontend-package to 1.41.16 (#12894)

    Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

commit abc87d36693b007bdbdab5ee753ccea6326acb34
Author: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Date:   Thu Mar 12 06:04:51 2026 +0900

    Bump comfyui-frontend-package to 1.41.15 (#12891)

    ---------

    Co-authored-by: Alexander Brown <DrJKL0424@gmail.com>

commit f6274c06b4e7bce8adbc1c60ae5a4c168825a614
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Mar 11 13:37:31 2026 -0700

    Fix issue with batch_size > 1 on some models. (#12892)

commit 4f4f8659c205069f74da8ac47378a5b1c0e142ca
Author: Adi Borochov <58855640+adiborochov@users.noreply.github.com>
Date:   Wed Mar 11 19:04:13 2026 +0200

    fix: guard torch.AcceleratorError for compatibility with torch < 2.8.0 (#12874)

    * fix: guard torch.AcceleratorError for compatibility with torch < 2.8.0

    torch.AcceleratorError was introduced in PyTorch 2.8.0. Accessing it
    directly raises AttributeError on older versions. Use a try/except
    fallback at module load time, consistent with the existing pattern used
    for OOM_EXCEPTION.

    * fix: address review feedback for AcceleratorError compat

    - Fall back to RuntimeError instead of type(None) for ACCELERATOR_ERROR,
      consistent with OOM_EXCEPTION fallback pattern and valid for except clauses
    - Add "out of memory" message introspection for RuntimeError fallback case
    - Use RuntimeError directly in discard_cuda_async_error except clause
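
    The fallback pattern described in both bullets, roughly:

        import torch

        try:
            ACCELERATOR_ERROR = torch.AcceleratorError  # PyTorch >= 2.8.0
        except AttributeError:
            # Older torch: RuntimeError is valid in except clauses; the
            # "out of memory" message introspection covers this case.
            ACCELERATOR_ERROR = RuntimeError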
    ---------

commit 3365008dfe5a7a46cbe76d8ad0d7efb054617733
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Wed Mar 11 18:53:55 2026 +0200

    feat(api-nodes): add Reve Image nodes (#12848)

commit 980621da83267beffcb84839a27101b7092256e7
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Wed Mar 11 08:49:38 2026 -0700

    comfy-aimdo 0.2.10 (#12890)

    Comfy Aimdo 0.2.10 fixes the aimdo allocator hook for legacy cudaMalloc
    consumers. Some consumers of cudaMalloc assume implicit synchronization
    built into closed-source logic inside cuda. This is preserved by passing
    through to cuda as-is and accounting after the fact, as opposed to
    integrating these hooks with Aimdo's VMA-based allocator.

commit 9642e4407b60b291744cc1d34501783cff6702e5
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Tue Mar 10 21:09:35 2026 -0700

    Add pre attention and post input patches to qwen image model. (#12879)

commit 3ad36d6be66b2af2a7c3dc9ab6936eebc6b98075
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Tue Mar 10 17:09:12 2026 -0700

    Allow model patches to have a cleanup function. (#12878)

    The function gets called after sampling is finished.

commit 8086468d2a1a5a6ed70fea3391e7fb9248ebc7da
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Tue Mar 10 09:05:31 2026 -0700

    main: switch on faulthandler (#12868)

    When we get segfault bug reports we don't get much to go on. Switch on
    Python's built-in tracer for segfaults.
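
    For reference, Python's built-in handler is enabled with:

        import faulthandler

        # Dump a Python traceback on SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL.
        faulthandler.enable()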

commit 535c16ce6e3d2634d6eb2fd17ecccb8d497e26a0
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Mon Mar 9 21:41:02 2026 -0700

    Widen OOM_EXCEPTION to AcceleratorError form (#12835)

    PyTorch only filters for OOMs in its own allocators, however there are
    paths that can OOM on allocators made outside the PyTorch allocators.
    These manifest as an AcceleratorError, as PyTorch does not have
    universal error translation to its OOM type on exception. Handle it. A
    log I have for this also shows a double report of the error async, so
    call the async discarder to clean up and make these OOMs look like OOMs.

commit a912809c252f5a2d69c8ab4035fc262a578fdcee
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Mon Mar 9 20:50:10 2026 -0700

    model_detection: deep clone pre-edited weights (#12862)

    Deep clone these weights as needed to avoid segfaulting when it tries
    to touch the original mmap.

commit c4fb0271cd7fbddb2381372b1f7c1206d1dd58fc
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Mon Mar 9 20:37:58 2026 -0700

    Add a way for nodes to add pre attn patches to flux model. (#12861)

commit 740d998c9cc821ca0a72b5b5d4b17aba1aec6b44
Author: Dr.Lt.Data <128333288+ltdrdata@users.noreply.github.com>
Date:   Tue Mar 10 11:49:31 2026 +0900

    fix(manager): improve install guidance when comfyui-manager is not installed (#12810)

commit 814dab9f4636df22a36cbbad21e35ac7609a0ef2
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Tue Mar 10 10:03:22 2026 +0800

    Update workflow templates to v0.9.18 (#12857)

commit 06f85e2c792c626f2cab3cb4f94cd30d43e9347b
Author: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Date:   Mon Mar 9 22:08:51 2026 +0200

    Fix text encoder lora loading for wrapped models (#12852)

commit e4b0bb8305a4069ef7ff8396bfc6057c736ab95b
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Sun Mar 8 13:25:30 2026 -0700

    Import assets seeder later, print some package versions. (#12841)

commit 7723f20bbe010a3ea4eac602f77b0ff496f123c4
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Sun Mar 8 13:17:40 2026 -0700

    comfy-aimdo 0.2.9 (#12840)

    Comfy-aimdo 0.2.9 fixes a context issue where, if a non-main thread
    does a spurious garbage collection, cudaFrees are attempted with a bad
    context.

    Some new APIs for displaying aimdo stats in UI widgets are also added.
    These are purely additive getters that don't touch cuda APIs.

commit 29b24cb5177e9d5aa5b3d2e5869999efb4d538c7
Author: Luke Mino-Altherr <lminoaltherr@gmail.com>
Date:   Sat Mar 7 17:37:25 2026 -0800

    refactor(assets): modular architecture + async two-phase scanner & background seeder (#12621)

commit a7a6335be538f55faa2abf7404c9b8e970847d1f
Author: comfyanonymous <comfyanonymous@protonmail.com>
Date:   Sat Mar 7 16:52:39 2026 -0500

    ComfyUI v0.16.4

commit bcf1a1fab1e9efe0d4999ea14e9c0318409e0000
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Sat Mar 7 09:38:08 2026 -0800

    mm: reset_cast_buffers: sync compute stream before free (#12822)

    Sync the compute stream before freeing the cast buffers. Skipping the
    sync can cause use-after-free issues when the cast stream frees the
    buffer while the compute stream is far enough behind to still need a
    casted weight.

commit 6ac8152fc80734b084d12865460e5e9a5d9a4e1b
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Sat Mar 7 15:54:09 2026 +0800

    chore: update workflow templates to v0.9.11 (#12821)

commit afc00f00553885eeb96ded329878fe732f6b9f7a
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Fri Mar 6 17:10:53 2026 -0800

    Fix requirements version. (#12817)

commit d69d30819b91aa020d0bb888df2a5b917f83bb7e
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Fri Mar 6 16:11:16 2026 -0800

    Don't run TE on cpu when dynamic vram enabled. (#12815)

commit f466b066017b9ebe5df67decfcbd09f78c5c66fa
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Fri Mar 6 15:20:07 2026 -0800

    Fix fp16 audio encoder models (#12811)

    * mp: respect model_defined_dtypes in default caster

    This is needed for parametrizations when the dtype changes between sd
    and model.

    * audio_encoders: archive model dtypes

    Archive model dtypes to stop the state dict load override the dtypes
    defined by the core for compute etc.

commit 34e55f006156801a6b5988d046d9041cb681f12d
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Fri Mar 6 19:54:27 2026 +0200

    feat(api-nodes): add Gemini 3.1 Flash Lite model to LLM node (#12803)

commit 3b93d5d571cb3e018da65f822cd11b60202b11c2
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Fri Mar 6 11:04:48 2026 +0200

    feat(api-nodes): add TencentSmartTopology node (#12741)

    * feat(api-nodes): add TencentSmartTopology node

    * feat(api-nodes): enable TencentModelTo3DUV node

    * chore(Tencent endpoints): add "wait" to queued statuses

commit e544c65db91df5a070be69a0a9b922201fe79335
Author: Dante <bunggl@naver.com>
Date:   Fri Mar 6 11:51:28 2026 +0900

    feat: add Math Expression node with simpleeval evaluation (#12687)

    * feat: add EagerEval dataclass for frontend-side node evaluation

    Add EagerEval to the V3 API schema, enabling nodes to declare
    frontend-evaluated JSONata expressions. The frontend uses this to
    display computation results as badges without a backend round-trip.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * feat: add Math Expression node with JSONata evaluation

    Add ComfyMathExpression node that evaluates JSONata expressions against
    dynamically-grown numeric inputs using Autogrow + MatchType. Sends
    input context via ui output so the frontend can re-evaluate when
    the expression changes without a backend round-trip.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * feat: register nodes_math.py in extras_files loader list

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: address CodeRabbit review feedback

    - Harden EagerEval.validate with type checks and strip() for empty strings
    - Add _positional_alias for spreadsheet-style names beyond z (aa, ab...)
    - Validate JSONata result is numeric before returning
    - Add jsonata to requirements.txt

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * refactor: remove EagerEval, scope PR to math node only

    Remove EagerEval dataclass from _io.py and eager_eval usage from
    nodes_math.py. Eager execution will be designed as a general-purpose
    system in a separate effort.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: use TemplateNames, cap inputs at 26, improve error message

    Address Kosinkadink review feedback:
    - Switch from Autogrow.TemplatePrefix to Autogrow.TemplateNames so input
      slots are named a-z, matching expression variables directly
    - Cap max inputs at 26 (a-z) instead of 100
    - Simplify execute() by removing dual-mapping hack
    - Include expression and result value in error message

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * test: add unit tests for Math Expression node

    Add tests for _positional_alias (a-z mapping) and execute() covering
    arithmetic operations, float inputs, $sum(values), and error cases.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * refactor: replace jsonata with simpleeval for math evaluation

    jsonata PyPI package has critical issues: no Python 3.12/3.13 wheels,
    no ARM/Apple Silicon wheels, abandoned (last commit 2023), C extension.

    Replace with simpleeval (pure Python, 3.4M downloads/month, MIT,
    AST-based security). Add math module functions (sqrt, ceil, floor,
    log, sin, cos, tan) and variadic sum() supporting both sum(values)
    and sum(a, b, c). Pin version to >=1.0,<2.0.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
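
    For context, the simpleeval API in play looks roughly like this (the
    function table contents follow this message):

        import math
        from simpleeval import simple_eval

        MATH_FUNCTIONS = {
            "sqrt": math.sqrt, "ceil": math.ceil, "floor": math.floor,
            "log": math.log, "sin": math.sin, "cos": math.cos, "tan": math.tan,
            # Variadic sum() supporting both sum(values) and sum(a, b, c).
            "sum": lambda *a: sum(a[0]) if len(a) == 1 and isinstance(a[0], (list, tuple)) else sum(a),
        }

        result = simple_eval("sqrt(a) * b + sum(1, 2, 3)",
                             names={"a": 16, "b": 2},
                             functions=MATH_FUNCTIONS)
        # result == 14.0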

    * test: update tests for simpleeval migration

    Update JSONata syntax to Python syntax ($sum -> sum, $string -> str),
    add tests for math functions (sqrt, ceil, floor, sin, log10) and
    variadic sum(a, b, c).

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * refactor: replace MatchType with MultiType inputs and dual FLOAT/INT outputs

    Allow mixing INT and FLOAT connections on the same node by switching
    from MatchType (which forces all inputs to the same type) to MultiType.
    Output both FLOAT and INT so users can pick the type they need.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * test: update tests for mixed INT/FLOAT inputs and dual outputs

    Add assertions for both FLOAT (result[0]) and INT (result[1]) outputs.
    Add test_mixed_int_float_inputs and test_mixed_resolution_scale to
    verify the primary use case of multiplying resolutions by a float factor.

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * feat: make expression input multiline and validate empty expression

    - Add multiline=True to expression input for better UX with longer expressions
    - Add empty expression validation with clear "Expression cannot be empty." message

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * test: add tests for empty expression validation

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix: address review feedback — safe pow, isfinite guard, test coverage

    - Wrap pow() with _safe_pow to prevent DoS via huge exponents
      (pow() bypasses simpleeval's safe_power guard on **)
    - Add math.isfinite() check to catch inf/nan before int() conversion
    - Add int/float converters to MATH_FUNCTIONS for explicit casting
    - Add "calculator" search alias
    - Replace _positional_alias helper with string.ascii_lowercase
    - Narrow test assertions and add error path + function coverage tests

    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
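
    A sketch of the two guards (the bounds are illustrative, not the PR's
    actual limits):

        import math

        def _safe_pow(base, exp):
            # Plain pow() bypasses simpleeval's safe_power guard on **,
            # so bound the operands before evaluating.
            if abs(base) > 1e6 or abs(exp) > 100:
                raise ValueError("pow() arguments too large")
            return base ** exp

        def _as_int(value):
            # Catch inf/nan before int() conversion.
            if not math.isfinite(value):
                raise ValueError("expression result is not finite")
            return int(value)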

    * Update requirements.txt

    ---------

    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
    Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
    Co-authored-by: Christian Byrne <abolkonsky.rem@gmail.com>

commit 1c218282369a6cc80651d878fc51fa33d7bf34e2
Author: comfyanonymous <comfyanonymous@protonmail.com>
Date:   Thu Mar 5 17:25:49 2026 -0500

    ComfyUI v0.16.3

commit 58017e8726bdddae89704b1e0123bedc29994424
Author: Tavi Halperin <tavi@lightricks.com>
Date:   Thu Mar 5 23:51:20 2026 +0200

    feat: add causal_fix parameter to add_keyframe_index and append_keyframe (#12797)

    Allows explicit control over the causal_fix flag passed to
    latent_to_pixel_coords. Defaults to frame_idx == 0 when not
    specified, fixing the previous heuristic.

commit 17b43c2b87eba43f0f071471b855e0ed659a2627
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Thu Mar 5 13:31:28 2026 -0800

    LTX audio vae novram fixes. (#12796)

commit 8befce5c7b84ff3451a6bd3bcbae1355ad322855
Author: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Date:   Thu Mar 5 22:37:25 2026 +0200

    Add manual cast to LTX2 vocoder conv_transpose1d (#12795)

    * Add manual cast to LTX2 vocoder

    * Update vocoder.py

commit 50549aa252903b936b2ed00b5de418c8b47f0841
Author: comfyanonymous <comfyanonymous@protonmail.com>
Date:   Thu Mar 5 13:41:06 2026 -0500

    ComfyUI v0.16.2

commit 1c3b651c0a1539a374e3d29a3ce695b5844ac5fc
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Thu Mar 5 10:35:56 2026 -0800

    Refactor. (#12794)

commit 5073da57ad20a2abb921f79458e49a7f7d608740
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Fri Mar 6 02:22:38 2026 +0800

    chore: update workflow templates to v0.9.10 (#12793)

commit 42e0e023eee6a19c1adb7bd3dc11c81ff6dcc9c8
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Thu Mar 5 10:22:17 2026 -0800

    ops: Handle CPU weight in VBAR caster (#12792)

    This shouldn't happen but custom nodes get there. Handle it as best
    we can.

commit 6481569ad4c3606bc50e9de39ce810651690ae79
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Thu Mar 5 09:04:24 2026 -0800

    comfy-aimdo 0.2.7 (#12791)

    Comfy-aimdo 0.2.7 fixes a crash when a spurious cudaAsyncFree comes in
    and would cause an infinite stack overflow (via detours hooks).

    A lock is also introduced on the link list holding the free sections
    to avoid any possibility of threaded miscellaneous cuda allocations
    being the root cause.

commit 6ef82a89b83a49247081dc57b154172573c9e313
Author: comfyanonymous <comfyanonymous@protonmail.com>
Date:   Thu Mar 5 10:38:33 2026 -0500

    ComfyUI v0.16.1

commit da29b797ce00b491c269e864cc3b8fceb279e530
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Thu Mar 5 23:23:23 2026 +0800

    Update workflow templates to v0.9.8 (#12788)

commit 9cdfd7403bc46f75d12be16ba6041b8bcdd3f7fd
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Thu Mar 5 17:12:38 2026 +0200

    feat(api-nodes): enable Kling 3.0 Motion Control (#12785)

commit bd21363563ce8e312c9271a0c64a0145335df8a9
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Thu Mar 5 14:29:39 2026 +0200

    feat(api-nodes-xAI): updated models, pricing, added features (#12756)

commit e04d0dbeb8266aa9262b5a4c3934ba4e4a371e37
Author: comfyanonymous <comfyanonymous@protonmail.com>
Date:   Thu Mar 5 04:06:29 2026 -0500

    ComfyUI v0.16.0

commit c8428541a6b6e4b1e0fbd685e9c846efcb60179e
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Thu Mar 5 16:58:25 2026 +0800

    chore: update workflow templates to v0.9.7 (#12780)

commit 4941671b5a5c65fea48be922caa76b7f6a0a4595
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Mar 4 23:39:51 2026 -0800

    Fix cuda getting initialized in cpu mode. (#12779)

commit c5fe8ace68c432a262a5093bdd84b3ed70b9d283
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Thu Mar 5 15:37:35 2026 +0800

    chore: update workflow templates to v0.9.6 (#12778)

commit f2ee7f2d367f98bb8a33bcb4a224bda441eb8a07
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Mar 4 22:21:55 2026 -0800

    Fix cublas ops on dynamic vram. (#12776)

commit 43c64b6308f93c331f057e12799bad0a68be5117
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Mar 4 17:06:20 2026 -0800

    Support the LTXAV 2.3 model. (#12773)

commit ac4a943ff364885166def5d418582db971554caf
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Mar 4 13:33:14 2026 -0800

    Initial load device should be cpu when using dynamic vram. (#12766)

commit 8811db52db5d0aea49c1dbedd733a6b9304b83a9
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Wed Mar 4 12:12:37 2026 -0800

    comfy-aimdo 0.2.6 (#12764)

    Comfy Aimdo 0.2.6 fixes a GPU virtual address leak. This would manifest
    as an error after a number of workflow runs.

commit 0a7446ade4bbeecfaf36e9a70eeabbeb0f6e59ea
Author: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Date:   Wed Mar 4 18:59:56 2026 +0200

    Pass tokens when loading text gen model for text generation (#12755)

    Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

commit 9b85cf955858b0aca6b7b30c30b404470ea0c964
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Wed Mar 4 07:49:13 2026 -0800

    Comfy Aimdo 0.2.5 + Fix offload performance in DynamicVram (#12754)

    * ops: don't unpin nothing

    This was calling into aimdo in the None case (offloaded weight). What's
    worse, aimdo syncs when unpinning an offloaded weight, as that is the
    corner case of a weight getting evicted by its own use, which does
    require a sync. But this was happening for every offloaded weight,
    causing slowdown.

    * mp: fix get_free_memory policy

    The ModelPatcherDynamic get_free_memory was deducting the model size
    to try to estimate the conceptual free memory without doing any
    offloading. This is kind of what the old memory_memory_required was
    estimating in the ModelPatcher load logic; however, in practical
    reality, between over-estimates and padding, the loader usually
    underloaded models enough that sampling could send CFG +/- through
    together even when partially loaded.

    So don't regress from the status quo and instead go all in on the idea
    that offloading is less of an issue than debatching. Tell the sampler
    it can use everything.

commit d531e3fb2a885d675d5b6d3a496b4af5d9757af1
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Wed Mar 4 07:47:44 2026 -0800

    model_patcher: Improve dynamic offload heuristic (#12759)

    Define a threshold below which loading a weight takes priority. This
    actually makes the offload consistent with non-dynamic, because when
    non-dynamic fills its to_load list, it will fill up any left-over
    space that cannot fit large weights with small weights and load them,
    even though they were lower priority. This actually improves
    performance because the tiny weights don't cost any VRAM and aren't
    worth the control overhead of the DMA etc.

commit eb011733b6e4d8a9f7b67a1787d817bfc8c0a5b4
Author: Arthur R Longbottom <art.longbottom.jr@gmail.com>
Date:   Tue Mar 3 21:29:00 2026 -0800

    Fix VideoFromComponents.save_to crash when writing to BytesIO (#12683)

    * Fix VideoFromComponents.save_to crash when writing to BytesIO

    When `get_container_format()` or `get_stream_source()` is called on a
    tensor-based video (VideoFromComponents), it calls `save_to(BytesIO())`.
    Since BytesIO has no file extension, `av.open` can't infer the output
    format and throws `ValueError: Could not determine output format`.

    The sibling class `VideoFromFile` already handles this correctly via
    `get_open_write_kwargs()`, which detects BytesIO and sets the format
    explicitly. `VideoFromComponents` just never got the same treatment.

    This surfaces when any downstream node validates the container format
    of a tensor-based video, like TopazVideoEnhance or any node that calls
    `validate_container_format_is_mp4()`.

    Three-line fix in `comfy_api/latest/_input_impl/video_types.py`; see
    the sketch below.

    * Add docstring to save_to to satisfy CI coverage check
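
    The shape of the fix, as described (mirroring the sibling helper named
    above; details assumed):

        import io

        def get_open_write_kwargs(dest, container_format="mp4"):
            # BytesIO has no file extension, so av.open cannot infer the
            # output format; set it explicitly for in-memory targets.
            kwargs = {"mode": "w"}
            if isinstance(dest, io.BytesIO):
                kwargs["format"] = container_format
            return kwargs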

commit ac6513e142f881202c40eacc5e337982b777ccd0
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Tue Mar 3 18:19:40 2026 -0800

    DynamicVram: Add casting / fix torch Buffer weights (#12749)

    * respect model dtype in non-comfy caster

    * utils: factor out parent and name functionality of set_attr

    * utils: implement set_attr_buffer for torch buffers

    * ModelPatcherDynamic: Implement torch Buffer loading

    If there is a buffer in dynamic - force load it.

commit b6ddc590ed8dafd50df8aad1e626b78276a690c0
Author: Terry Jia <terryjia88@gmail.com>
Date:   Tue Mar 3 19:58:53 2026 -0500

    CURVE type (#12581)

    * CURVE type

    * fix: update typed wrapper unwrap keys to __type__ and __value__

    * code improve

    * code improve

commit f719a9d928049f85b07b8ecc2259fba4832d37bb
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Tue Mar 3 14:35:22 2026 -0800

    Adjust memory usage factor of zeta model. (#12746)

commit 174fd6759deee5ea73e4cde4ba2936e8d62d8d66
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Tue Mar 3 08:51:15 2026 -0800

    main: Load aimdo after logger is setup (#12743)

    This was too early. Aimdo can use the logger in error paths and this
    causes a rogue default init if aimdo has something to log.

commit 09bcbddfcf804634f008f53c1827b7ba9a3956ec
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Tue Mar 3 08:50:33 2026 -0800

    ModelPatcherDynamic: Force load all non-comfy weights (#12739)

    * model_management: Remove non-comfy dynamic _v caster

    * Force pre-load non-comfy weights to GPU in ModelPatcherDynamic

    Non-comfy weights may expect to be pre-cast to the target
    device without in-model casting. Previously they were allocated in
    the vbar with _v which required the _v fault path in cast_to.
    Instead, back up the original CPU weight and move it directly to GPU
    at load time.

commit dff0a4a15887383c90a031e3fd48ebc41f6928e7
Author: xeinherjer <112741359+xeinherjer-dev@users.noreply.github.com>
Date:   Tue Mar 3 10:17:51 2026 +0900

    Fix VAEDecodeAudioTiled ignoring tile_size input (#12735) (#12738)

commit 9ebee0a2179b361a24c20838c1848d7988320636
Author: Lodestone <lodestone.rock@gmail.com>
Date:   Tue Mar 3 07:43:47 2026 +0700

    Feat: z-image pixel space (model still training atm) (#12709)

    * draft zeta (z-image pixel space)

    * revert gitignore

    * model loaded and able to run however vector direction still wrong tho

    * flip the vector direction to original again this time

    * Move wrongly positioned Z image pixel space class

    * inherit Radiance LatentFormat class

    * Fix parameters in classes for Zeta x0 dino

    * remove arbitrary nn.init instances

    * Remove unused import of lru_cache

    ---------

    Co-authored-by: silveroxides <ishimarukaito@gmail.com>

commit 57dd6c1aadf500d90f635a8d3c15418c0d6d6ecd
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Mon Mar 2 15:54:18 2026 -0800

    Support loading zeta chroma weights properly. (#12734)

commit f1f8996e1562c3753666d1c568b2ff629edb9e36
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Tue Mar 3 01:13:42 2026 +0800

    chore: update workflow templates to v0.9.5 (#12732)

commit afb54219fac341fa8614fdab090fe8096d0aec1e
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Mon Mar 2 09:24:33 2026 +0200

    feat(api-nodes): allow to use "IMAGE+TEXT" in NanoBanana2 (#12729)

commit 7175c11a4ed41278c9cb9e6961b8d8776ef69f00
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Sun Mar 1 22:21:41 2026 -0800

    comfy aimdo 0.2.4 (#12727)

    Comfy Aimdo 0.2.4 fixes a VRAM buffer alignment issue that happens in
    some workflows where action is able to bypass the pytorch allocator
    and go straight to the cuda hook.

commit dfbf99a06172a5c54002d80abf3e74c0d82c10b9
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Sun Mar 1 19:18:56 2026 -0800

    model_mangament: make dynamic --disable-smart-memory work (#12724)

    This was previously considering the pool of dynamic models as one giant
    entity for the sake of smart memory, but that isn't really useful or
    what a user would reasonably expect. Make Dynamic VRAM properly purge
    its models just like the old --disable-smart-memory, but condition the
    dynamic-for-dynamic bypass on smart memory.

    Re-enable dynamic smart memory.

commit 602f6bd82c1f8b31d1b10b5f9ae4aa9637772ad5
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Sun Mar 1 12:28:39 2026 -0800

    Make --disable-smart-memory disable dynamic vram. (#12722)

commit c0d472e5b9b256d9e802ecac703bb6a8ca5f9eb8
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Sun Mar 1 11:14:56 2026 -0800

    comfy-aimdo 0.2.3 (#12720)

commit 4d79f4f0280da6c0a0e37123b9c80f24e2403536
Author: drozbay <17261091+drozbay@users.noreply.github.com>
Date:   Sun Mar 1 10:38:30 2026 -0700

    fix: handle substep sigmas in context window set_step (#12719)

    Multi-step samplers (e.g. dpmpp_2s_ancestral) call the model at intermediate sigma values not present in the schedule. This caused set_step to crash with "No sample_sigmas matched current timestep" when context windows were enabled.

    The fix is to keep self._step from the last exact match when a substep sigma is encountered, since substeps are still logically part of their parent step and should use the same context windows.

    Co-authored-by: ozbayb <17261091+ozbayb@users.noreply.github.com>
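
    Roughly, the fix keeps the previous step index when no sigma matches
    (shape assumed):

        def set_step(self, timestep, sample_sigmas):
            matches = (sample_sigmas == timestep).nonzero()
            if len(matches) > 0:
                self._step = int(matches[0])
            # else: a substep sigma from a multi-step sampler; keep
            # self._step from the last exact match so substeps share their
            # parent step's context windows.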

commit 850e8b42ff67cec295edb686c4b85dc7811f5e7f
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Sat Feb 28 21:38:19 2026 -0800

    feat: add text preview support to jobs API (#12169)

    * feat: add text preview support to jobs API

    Amp-Thread-ID: https://ampcode.com/threads/T-019c0be0-9fc6-71ac-853a-7c7cc846b375
    Co-authored-by: Amp <amp@ampcode.com>

    * test: update tests to expect text as previewable media type

    Amp-Thread-ID: https://ampcode.com/threads/T-019c0be0-9fc6-71ac-853a-7c7cc846b375

    ---------

commit d159142615e0a1a7ae4eb711a6ae9f66a5f2d76e
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Sat Feb 28 20:59:24 2026 -0800

    refactor: rename Mahiro CFG to Similarity-Adaptive Guidance (#12172)

    * refactor: rename Mahiro CFG to Similarity-Adaptive Guidance

    Rename the display name to better describe what the node does:
    adaptively blends guidance based on cosine similarity between
    positive and negative conditions.

    Amp-Thread-ID: https://ampcode.com/threads/T-019c0d36-8b43-745f-b7b2-e35b53f17fa1
    Co-authored-by: Amp <amp@ampcode.com>

    * feat: add search aliases for old mahiro name

    Amp-Thread-ID: https://ampcode.com/threads/T-019c0d36-8b43-745f-b7b2-e35b53f17fa1

    * rename: Similarity-Adaptive Guidance → Positive-Biased Guidance (per reviewer)

    - display_name changed to 'Positive-Biased Guidance' to avoid SAG acronym collision
    - search_aliases expanded: mahiro, mahiro cfg, similarity-adaptive guidance, positive-biased cfg
    - ruff format applied

    ---------

    Co-authored-by: Amp <amp@ampcode.com>
    Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

commit 1080bd442a7509d29bfe0b29cac9222de406c994
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Sat Feb 28 19:23:28 2026 -0800

    Disable dynamic vram on wsl. (#12706)

commit 17106cb124fcfa0b75ea24993c65aa024059fc8d
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Sat Feb 28 19:21:32 2026 -0800

    Move parsing of requirements logic to function. (#12701)

commit 48bb0bd18aa90bba0eac7b4c1a1400c4f7110046
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Sat Feb 28 13:52:30 2026 -0800

    cli_args: Default comfy to DynamicVram mode (#12658)

commit 5f41584e960d3ad90f6581278e57f7b52e771db4
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Sat Feb 28 13:50:18 2026 -0800

    Disable dynamic_vram when weight hooks applied (#12653)

    * sd: add support for clip model reconstruction

    * nodes: SetClipHooks: Demote the dynamic model patcher

    * mp: Make dynamic_disable more robust

    The backup must not be cloned. In addition, add a delegate object
    to ModelPatcherDynamic so that non-cloning code can do
    ModelPatcherDynamic demotion

    * sampler_helpers: Demote to non-dynamic model patcher when hooking

    * address CodeRabbit review comments

commit 1f6744162f606cce895f2d9818207ddecbce5932
Author: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Date:   Sat Feb 28 23:49:12 2026 +0200

    feat: Support SCAIL WanVideo model (#12614)

commit 95e1059661f7a1584b5f84a6ece72ed8d8992b73
Author: fappaz <fernando.augusto.paz@gmail.com>
Date:   Sat Feb 28 19:18:40 2026 +1300

    fix(ace15): handle missing lm_metadata in memory estimation during checkpoint export #12669 (#12686)

commit 80d49441e5e255f8d91d2f335f930e74ba85cbe8
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Fri Feb 27 20:53:46 2026 -0800

    refactor: use AspectRatio enum members as ASPECT_RATIOS dict keys (#12689)

    Amp-Thread-ID: https://ampcode.com/threads/T-019ca1cb-0150-7549-8b1b-6713060d3408

    Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

commit 9d0e114ee380d3eac8aeb00260a9df1212b6046a
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Fri Feb 27 20:34:58 2026 -0800

    PyOpenGL-accelerate is not necessary. (#12692)

commit ac4412d0fa2b9df8469fb6018e0036c47332397a
Author: Talmaj <Talmaj@users.noreply.github.com>
Date:   Sat Feb 28 05:04:34 2026 +0100

    Native LongCat-Image implementation (#12597)

commit 94f1a1cc9df69cbc75fe6d0f78a4de5d1d857d9d
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Fri Feb 27 17:16:24 2026 -0800

    Limit overlap in image tile and combine nodes to prevent issues. (#12688)

commit e721e24136b5480c396bf0e37a114f6e4083482b
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Fri Feb 27 16:05:51 2026 -0800

    ops: implement lora requanting for non QuantizedTensor fp8 (#12668)

    Allow a non-QuantizedTensor layer to set want_requant to get the
    post-lora calculation stochastically cast down to the original input
    dtype.

    This is then used by the legacy fp8 Linear implementation to set the
    compute_dtype to the preferred lora dtype but then want_requant it
    back down to fp8.

    This fixes the issue where --fast fp8_matrix_mult combined with
    --fast dynamic_vram was doing a lora on an fp8 non-QT model.

commit 25ec3d96a323c8455c6ee69e43bdd7a5599d3cc0
Author: Reiner "Tiles" Prokein <ReinerBforartists@users.noreply.github.com>
Date:   Sat Feb 28 01:03:45 2026 +0100

    WanVAE.encode: feat_map was using self.decoder instead of self.encoder (#12682)

commit 1f1ec377ce9d4c525d1615099524231756a69e5e
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Fri Feb 27 09:13:57 2026 -0800

    feat: add ResolutionSelector node for aspect ratio and megapixel-based resolution calculation (#12199)

    Amp-Thread-ID: https://ampcode.com/threads/T-019c179e-cd8c-768f-ae66-207c7a53c01d

    Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

commit 0a7f8e11b6f280b1b574f5dd642e0b46b8f0e045
Author: pythongosssss <125205205+pythongosssss@users.noreply.github.com>
Date:   Fri Feb 27 16:13:24 2026 +0000

    fix torch.cat requiring inputs to all be same dimensions (#12673)

commit 35e9fce7756604050f07a05d090e697b81322c44
Author: vickytsang <vtsang@amd.com>
Date:   Thu Feb 26 17:16:12 2026 -0800

    Enable Pytorch Attention for gfx950 (#12641)

commit c7f7d52b684f661d911f1747bb6954978fa1d1b9
Author: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Date:   Fri Feb 27 02:59:05 2026 +0200

    feat: Support SDPose-OOD (#12661)

commit 08b26ed7c2fe43417058f4c6c5934de3cebf3f20
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Thu Feb 26 15:59:24 2026 -0800

    bug_report template: Push harder for logs (#12657)

    We get a lot of bug reports without logs, especially for performance
    issues.

commit b233dbe0bc179847b81680e0b59c493a8dc8d9a6
Author: fappaz <fernando.augusto.paz@gmail.com>
Date:   Fri Feb 27 12:19:19 2026 +1300

    feat(ace-step): add ACE-Step 1.5 lycoris key alias mapping for LoKR #12638 (#12665)

commit 3811780e4f73f9dbace01a85e6d97502406f8ccb
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Thu Feb 26 14:12:29 2026 -0800

    Portable with cu128 isn't useful anymore. (#12666)

    Users should either use the cu126 one or the regular one (cu130 at the moment)

    The cu128 portable is still included in the latest github release but I will stop including it as soon as it becomes slightly annoying to deal with. This might happen as soon as next week.

commit 3dd10a59c00248d00f0cb0ab794ff1bb9fb00a5f
Author: comfyanonymous <comfyanonymous@protonmail.com>
Date:   Thu Feb 26 15:59:22 2026 -0500

    ComfyUI v0.15.1

commit 88d05fe483a9f420a69d681e615422930404292b
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Fri Feb 27 04:52:45 2026 +0800

    chore: update workflow templates to v0.9.4 (#12664)

commit fd41ec97cc2e457f322afdf136fd2f2b2454a240
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Thu Feb 26 22:52:10 2026 +0200

    feat(api-nodes): add NanoBanana2 (#12660)

commit 420e900f692f72a4e0108594a80a3465c036bebe
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Thu Feb 26 12:19:38 2026 -0800

    main: load aimdo earlier (#12655)

    Some custom node packs are naughty and violate the
    don't-load-torch-on-load rule. This causes aimdo to lose preference on
    its allocator hook on linux.

    Go super early on the aimdo first-stage init, before custom nodes
    are mentioned at all.

commit 38ca94599f7444f55589308d1cf611fb77f6ca16
Author: pythongosssss <125205205+pythongosssss@users.noreply.github.com>
Date:   Thu Feb 26 11:07:35 2026 +0000

    pyopengl-accelerate can cause objects to be numpy ints instead of bare ints, which the glDeleteTextures function does not accept; explicitly cast to int (#12650)

commit 74b5a337dcc4d6b276e4af45aa8a654c82569072
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Thu Feb 26 01:00:32 2026 -0800

    fix: move essentials_category to correct replacement nodes (#12568)

    Move essentials_category from deprecated/incorrect nodes to their replacements:
    - ImageBatch → BatchImagesNode (ImageBatch is deprecated)
    - Blur → removed (should use subgraph blueprint)
    - GetVideoComponents → Video Slice

    Amp-Thread-ID: https://ampcode.com/threads/T-019c8340-4da2-723b-a09f-83895c5bbda5

commit 8a4d85c708435b47d0570637fdf1e89199702c48
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Feb 25 22:30:31 2026 -0800

    Cleanups to the last PR. (#12646)

commit a4522017c518d1f0c3c5d2a803a2d31265da5cd4
Author: Tavi Halperin <tavi@lightricks.com>
Date:   Thu Feb 26 08:25:23 2026 +0200

    feat: per-guide attention strength control in self-attention (#12518)

    Implements per-guide attention attenuation via log-space additive bias
    in self-attention. Each guide reference tracks its own strength and
    optional spatial mask in conditioning metadata (guide_attention_entries).
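
    In outline (a sketch, not the PR's code): scaling a guide token's
    attention by `strength` is equivalent to adding log(strength) to its
    pre-softmax logits, up to softmax renormalization.

        import math
        import torch

        def apply_guide_bias(attn_logits, guide_token_mask, strength):
            # Additive log-space bias on the logits of guide tokens.
            bias = torch.zeros_like(attn_logits)
            bias[..., guide_token_mask] = math.log(strength)
            return attn_logits + bias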

commit 907e5dcbbffab5e7011346af280a428dc40f3136
Author: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Date:   Thu Feb 26 06:38:46 2026 +0200

    initial FlowRVS support (#12637)

commit 72535316701ea8074b99755194f149a26e88b4c8
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Feb 25 20:13:47 2026 -0800

    Fix ltxav te mem estimation. (#12643)

commit e14b04478c1712ec8417a046832b821af263ea13
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Wed Feb 25 19:36:02 2026 -0800

    Fix LTXAV text enc min length. (#12640)

    Should have been 1024 instead of 512

commit eb8737d675022e730364294c395111af3545d523
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Wed Feb 25 18:30:48 2026 -0800

    Update requirements.txt (#12642)

commit 0467f690a85400c8bfa6dcb6bcc848914b57562a
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Wed Feb 25 13:50:05 2026 -0800

    comfy aimdo 0.2.2 (#12635)

    Comfy Aimdo 0.2.2 moves the cuda allocator hook from the cudart API to
    the cuda driver API on windows. This is needed to handle Windows+cu13
    where cudart is statically linked.

commit 4f5b7dbf1f9ec61af8518e6a613499d5ab91835a
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Wed Feb 25 13:49:48 2026 -0800

    Fix Aimdo fallback on probe to not use zero-copy sft (#12634)

    * utils: don't use comfy sft loader in aimdo fallback

    This was going off the raw command line switch and should respect
    main.py's probe of whether aimdo actually loaded successfully.

    * ops: don't use deferred linear load in Aimdo fallback

    Avoid changes of behaviour on --fast dynamic_vram when aimdo doesn't work.

commit 3ebe1ac22e090c10ecf4c478fe6f89dc8b398fa0
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Tue Feb 24 16:13:46 2026 -0800

    Disable dynamic_vram when using torch compiler (#12612)

    * mp: attach re-construction arguments to model patcher

    When making a model-patcher from a unet or ckpt, attach a callable
    function that can be called to replay the model construction. This
    can be used to deep clone a model patcher WRT the actual model.

    Originally written by Kosinkadink
    https://github.com/Comfy-Org/ComfyUI/commit/f4b99bc62389af315013dda85f24f2bbd262b686

    * mp: Add disable_dynamic clone argument

    Add a clone argument that lets a caller clone a ModelPatcher but disable
    dynamic to demote the clone to regular MP. This is useful for legacy
    features where dynamic_vram support is missing or TBD.

    * torch_compile: disable dynamic_vram

    This is a bigger feature. Disable for the interim to preserve
    functionality.

commit befa83d43448c3dc64f72aa3eb771159d20f89f1
Author: rattus <46076784+rattus128@users.noreply.github.com>
Date:   Tue Feb 24 13:02:26 2026 -0800

    comfy aimdo 0.2.1 (#12620)

    Changes:

    throttle VRAM threshold checks to restore performance in high-layer-rate
    conditions.

commit 33f83d53ae3897962b8248c12276759192fde0c0
Author: Jedrzej Kosinski <kosinkadink1@gmail.com>
Date:   Tue Feb 24 13:02:05 2026 -0800

    Fix KeyError when prompt entries lack class_type key (#12595)

    Skip entries in the prompt dict that don't contain a class_type key
    in apply_replacements(), preventing crashes on metadata or non-node
    entries.

    Fixes Comfy-Org/ComfyUI#12517

commit b874bd2b8c324d58cfc37bff0754dd16815a8f3c
Author: comfyanonymous <comfyanonymous@protonmail.com>
Date:   Tue Feb 24 12:37:16 2026 -0500

    ComfyUI v0.15.0

commit 0aa02453bbe7d2fac332e829ade1f13a1cf53820
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Wed Feb 25 01:41:36 2026 +0800

    chore: update embedded docs to v0.4.3 (#12601)

commit 599f9c50109d3c6fbb2791de1810ecf84601affa
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Tue Feb 24 09:28:25 2026 -0800

    Don't crash right away if op is uninitialized. (#12615)

commit 11fefa58e987604711169845e9081d14b3915ba1
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Wed Feb 25 01:04:51 2026 +0800

    chore: update workflow templates to v0.9.3 (#12610)

commit d8090013b87d787dc12a8c3956fcd8e0a60db38d
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Tue Feb 24 19:03:30 2026 +0200

    feat(api-nodes): add ByteDance Seedream-5 model (#12609)

    * feat(api-nodes): add ByteDance Seedream-5 model

    * made the error message more accurate

    * rename seedream 5.0 model

commit 048dd2f3219523202e19081f776536ea1e62ad4a
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Tue Feb 24 00:44:40 2026 -0800

    Patch frontend to 1.39.16 (from 1.39.14) (#12604)

    * Update requirements.txt

    * Update requirements.txt

    ---------

    Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

commit 84aba95e03c6498f9e73df100cbcde9d14128ea9
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Mon Feb 23 21:50:03 2026 -0800

    Temporarily unbreak some LTXAV workflows to give people time to migrate. (#12605)

commit 9b1c63eb6927e42b36de8af7c3f58b2a554564ad
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Mon Feb 23 18:01:17 2026 -0800

    Add SplitImageToTileList and ImageMergeTileList nodes. (#12599)

    With these you can split an image into tiles, run operations on each tile, and then combine them back into a single image.
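
A rough sketch of the split/merge round trip these nodes enable; the tile size, the [B, H, W, C] image layout ComfyUI uses, and the lack of overlap handling are simplifying assumptions:

```python
# Hedged sketch of a non-overlapping tile split and merge, not the nodes' code.
import torch

def split_to_tiles(image, tile=256):
    # image: [B, H, W, C]; assumes H and W are divisible by `tile`
    b, h, w, c = image.shape
    tiles = []
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            tiles.append(image[:, y:y + tile, x:x + tile, :])
    return tiles, (h, w)

def merge_tiles(tiles, size, tile=256):
    # Reassemble in the same row-major order the split produced.
    h, w = size
    b, _, _, c = tiles[0].shape
    out = torch.zeros(b, h, w, c, dtype=tiles[0].dtype)
    i = 0
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            out[:, y:y + tile, x:x + tile, :] = tiles[i]
            i += 1
    return out

img = torch.rand(1, 512, 512, 3)
tiles, size = split_to_tiles(img)
assert torch.equal(merge_tiles(tiles, size), img)
```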

commit 7a7debcaf11bf55257e667f33c447268ea76d412
Author: ComfyUI Wiki <contact@comfyui-wiki.com>
Date:   Tue Feb 24 07:27:20 2026 +0800

    chore: update workflow templates to v0.9.2 (#12596)

commit dba2766e5354a4b5ec957bd4e005550d83214a34
Author: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Date:   Mon Feb 23 21:27:16 2026 +0200

    feat(api-nodes): add KlingAvatar node (#12591)

commit caa43d2395a69e93e52fe903da515fb2adbbb677
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Sun Feb 22 13:00:02 2026 -0800

    Fix issue loading fp8 ltxav checkpoints. (#12582)

commit 07ca6852e8dc332f98531f0b51735eff66469755
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Sun Feb 22 00:18:20 2026 -0800

    Fix dtype issue in embeddings connector. (#12570)

commit f266b8d352607799afb4adf339cdfa854025185e
Author: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Date:   Sat Feb 21 19:29:58 2026 -0800

    Move LTXAV av embedding connectors to diffusion model. (#12569)

commit b6cb30bab52c47ae38bc2ae404929f49ba765017
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Sat Feb 21 18:32:15 2026 -0800

    chore: tune CodeRabbit config to limit review scope and disable for drafts (#12567)

    * chore: tune CodeRabbit config to limit review scope and disable for drafts

    - Add tone_instructions to focus only on newly introduced issues
    - Add global path_instructions entry to ignore pre-existing issues in moved/reformatted code
    - Disable draft PR reviews (drafts: false) and add WIP title keywords
    - Disable ruff tool to prevent linter-based outside-diff-range comments

    Addresses feedback from maintainers about CodeRabbit flagging pre-existing
    issues in code that was merely moved or de-indented (e.g., PR #12557),
    which can discourage community contributions and cause scope creep.

    Amp-Thread-ID: https://ampcode.com/threads/T-019c82de-0481-7253-ad42-20cb595bb1ba

    * chore: add 'DO NOT MERGE' to ignore_title_keywords

    Amp-Thread-ID: https://ampcode.com/threads/T-019c82de-0481-7253-ad42-20cb595bb1ba
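
A sketch of what such a .coderabbit.yaml could look like; the key names follow CodeRabbit's documented schema, but the values here are illustrative, not the repo's actual config:

```yaml
# Hedged sketch of the config shape described in the commit above.
tone_instructions: "Comment only on issues newly introduced by this PR."
reviews:
  path_instructions:
    - path: "**/*"
      instructions: "Ignore pre-existing issues in moved or reformatted code."
  auto_review:
    drafts: false
    ignore_title_keywords:
      - "WIP"
      - "DO NOT MERGE"
  tools:
    ruff:
      enabled: false
```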

commit ee72752162fb8b56b8165aa93633c57f0d85002c
Author: Christian Byrne <cbyrne@comfy.org>
Date:   Sat Feb 21 16:51:21 2026 -0800

    Add category to Normalized Attention Guidance node (#12565)

commit 7591d781a7a1cdfd6f8f0a9ec6ecd692495e14b5
Author: Alexander Brown <DrJKL0424@gmail.com>
Date:   Sat Feb 21 15:05:00 2026 -0800

    fix: spe…
@Amazon90

> Thank you! I'm planning to add support for LongCat-Image-Edit next.

Has the plan for a native ComfyUI implementation of LongCat-Image-Edit been permanently shelved?

@Kosinkadink
Member

There's already a PR for LongCat image edit; it just needs to be verified.

@Amazon90

> There's already a PR for LongCat image edit; it just needs to be verified.

@comfyanonymous

Could you speed up the validation for LongCat image edit? I really hope it gets native ComfyUI support soon. Thanks.

