fastgen DMD2: make the Qwen-Image example self-contained on stock nemo_automodel by jingyu-ml · Pull Request #1688 · NVIDIA/Model-Optimizer

jingyu-ml · 2026-06-11T20:35:02Z

What does this PR do?

Type of change: new example / refactor (example self-containment)

The published examples/diffusers/fastgen DMD2 Qwen-Image distillation example previously relied on
local, unpublished modifications to the sibling nemo_automodel package (data path, collate, a
partial-load checkpointer, and Qwen-Image preprocessing). An external user running the published
Model-Optimizer example against official stock nemo_automodel would hit import/attribute errors.

This PR makes the example self-contained on stock nemo_automodel>=0.4.0 — with the changes kept
as small as possible: the team's actual AutoModel delta is only ~430 lines, so wherever the
upstream module is importable, the delta is expressed as a thin subclass/wrapper rather than a
full-file copy.

fastgen_data/ — the DMD2 data path:
- collate_fns.py — thin wrappers over stock: import the unpatched upstream collate +
  bucket sampler and re-add only the DMD2 delta (text_embeddings_mask, stacked from the
  per-item mask the stock collate omits; and an optional broadcast negative_text_embeddings
  for CFG). The builder loads an optional negative_prompt_embedding_path.
- text_to_image_dataset.py — a faithful vendored copy of the upstream reader (its
  prompt_embeds_mask emission is interleaved with cache loading, so wrapping it would force a
  redundant per-item torch.load; carried verbatim instead).
fastgen_checkpoint.py — PartialLoadCheckpointer(Checkpointer) that overrides only
load_optimizer (FSDP2 DefaultLoadPlanner(allow_partial_load=True)) so optimizer resume works
without patching upstream. Injected via an in-place re-bless of self.checkpointer in the
recipe's load_checkpoint (model-state load stays strict).
preprocess/ — Qwen-Image preprocessing (preprocessing_multiprocess.py + processors/),
trimmed to the image path (drops the flux/wan/hunyuan processors and the video base class).
It lives in AutoModel's top-level tools/ tree, which is not shipped in the pip package, so
it cannot be wrapped and is vendored; MultiTierBucketCalculator is imported from stock upstream.
make_negative_prompt_embedding.py — generates the optional CFG negative-prompt embedding.
All configs/*.yaml target fastgen_data.build_* (a test enumerates every config).
Licensing/provenance per CONTRIBUTING.md (source link + commit, original Apache-2.0 header, then
NVIDIA SPDX header on verbatim-vendored files; a NeMo-AutoModel note in LICENSE; insert-license
excludes for the verbatim files; nemo_automodel[diffusion] version bound in requirements.txt).

The DMD2 math in modelopt/torch/fastgen/ is unchanged — only example/training-time glue moved.

Usage

# Install example deps (stock nemo_automodel) from a source checkout
pip install -r examples/diffusers/fastgen/requirements.txt

# Build the training cache from raw images (Qwen-Image VAE latents + text embeddings)
python examples/diffusers/fastgen/preprocess_qwen_image.py image \
    --image_dir <raw images> --output_dir <cache dir> --processor qwen_image \
    --caption_format meta_json

# Generate the CFG negative-prompt embedding once
python examples/diffusers/fastgen/make_negative_prompt_embedding.py \
    --output <cache dir>/negative_prompt_embedding.pt

# Point the config's data.dataloader.cache_dir + negative_prompt_embedding_path at the cache, then train.

See examples/diffusers/fastgen/README.md → "Requirements & self-contained data path".

Testing

New tests/examples/diffusers/fastgen/test_vendored_migration.py: environment-independent
invariants (every config targets a vendored builder; no tools.* imports; each former AutoModel
patch is vendored / wrapped / a documented exclusion; provenance + SPDX headers present) plus
dependency-guarded structural tests (the collate wrapper emits the batch contract + broadcasts the
negative embedding; the builder accepts negative_prompt_embedding_path; the checkpointer
overrides only load_optimizer; the Qwen-Image processor self-registers).
Validated 9/9 against a pure stock nemo_automodel 0.4.0 worktree (none of the local patches
present) via SLURM — re-run after this slim-down.
The migrated code path is exercised by a live multi-GPU DMD2 run that resumed from a checkpoint
through the vendored PartialLoadCheckpointer.
ruff check + ruff format --check clean on all changed files.

Before your PR is "Ready for review"

Is this change backward compatible?: ✅ (additive; bundled configs target the vendored builders)
If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ (NeMo-AutoModel @ e42584e3, Apache-2.0 — provenance + original license retained, NVIDIA SPDX header added after; LICENSE third-party note; nemo_automodel was already a dependency)
Did you write any new necessary tests?: ✅
Did you update Changelog?: N/A (example-only change)
Did you get Claude approval on this PR?: ❌ (pending — opened as draft)

Additional Information

Opened as a draft pending: OSRB review of the vendored NeMo-AutoModel (Apache-2.0) code, CI green,
and (optional) a multi-GPU smoke-train + resume on stock upstream. Vendored from
NVIDIA-NeMo/Automodel at commit e42584e3.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Vendored Qwen‑Image preprocessing and processors; multi‑process dataset builder; fastgen_data dataloader package, TextToImageDataset, collate/batching with negative‑prompt support; negative‑prompt embedding generator; checkpoint loader tolerant of partial FSDP2 optimizer shards; CLI entrypoints and updated training script/configs (removed smoke config).
Documentation
- README updated for real‑data training, install instructions, config references, and troubleshooting.
Tests
- New tests validating vendored migration, dataloader/collate behavior, processor registry, and checkpoint upgrade.
Chores
- License attribution added, example requirements pinned, mypy override for vendored code, pre‑commit exclusion and CODEOWNERS updated.

…n-time self-contained) Make the published DMD2 Qwen-Image example run against STOCK upstream nemo_automodel by removing its dependence on local AutoModel modifications (staged patches #1-5): - Vendor the patched data files into examples/diffusers/fastgen/fastgen_data/ (collate_fns, text_to_image_dataset, mock_dataloader), importing the UNPATCHED upstream helpers (sampler, base_dataset, text_to_video_dataset) from the stock nemo_automodel package. Provenance headers per CONTRIBUTING (source + commit e42584e3, Apache-2.0). - Repoint the published configs' dataloader _target_ to fastgen_data.build_* and add an explicit sys.path seam + startup resolution logging in dmd2_finetune.py (source-checkout). - Add fastgen_checkpoint.PartialLoadCheckpointer (overrides only load_optimizer with allow_partial_load=True) and inject it in dmd2_recipe.load_checkpoint, so both the student and fake-score optimizer restores tolerate FSDP2 partial shards without a patched AutoModel. No change to modelopt/torch/fastgen/ (DMD2 math). Preprocessing vendoring + tests + the removability audit follow in subsequent commits. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

Bound nemo_automodel[diffusion] to the tested range (>=0.4.0,<1.0; 0.4.0 == the validated commit e42584e3) and add a runtime soft-guard in fastgen_data/__init__.py that converts a missing unpatched-upstream-helper ImportError into an actionable message naming the supported range. Supports AC-7 (bounded version) and AC-1 (clear failure on drift). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…ined, no tools/ tree) Vendor the AutoModel preprocessing stack needed to build the VAE+text-embed cache from raw images, trimmed to the Qwen-Image path: - preprocess/processors/{base,base_video,registry,caption_loaders,qwen_image}.py + a trimmed processors/__init__.py that drops flux/wan/hunyuan (and their deps, e.g. nemo_automodel.shared.transformers_patches). The Qwen-Image processor self-registers. - preprocess/preprocessing_multiprocess.py (driver): the `from tools.diffusion.processors` import is rewritten to the vendored `.processors` package; MultiTierBucketCalculator is imported from the stock nemo_automodel package (it IS packaged). Image subcommand --processor defaults to qwen_image. - preprocess_qwen_image.py launcher mirroring dmd2_finetune.py's sys.path seam. Removes the example's dependence on AutoModel staged patches #7-8 and on the un-packaged AutoModel tools/ tree. Provenance headers per CONTRIBUTING. No tools.* imports remain. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…per CONTRIBUTING) The 10 vendored files (fastgen_data/{collate_fns,text_to_image_dataset,mock_dataloader}.py and preprocess/{preprocessing_multiprocess,processors/*}.py) carry the original NeMo-AutoModel Apache-2.0 header plus a "Vendored from ... @ e42584e3" provenance note. Add them to the insert-license-py exclude so the hook does not prepend a second NVIDIA header on top. LICENSE Third-Party Software Notices is intentionally NOT modified: every vendored file is NVIDIA-copyright Apache-2.0 (same entity and license as this repo, from the sibling NVIDIA-NeMo/Automodel repo), so there is no external third-party copyright holder to list; the per-file provenance headers are the applicable record. OSRB sign-off remains a maintainer release gate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

Add examples/diffusers/fastgen/tests/test_vendored_migration.py with two tiers: - Environment-independent (run anywhere): published configs repointed to fastgen_data.*, no tools.* imports in vendored code, the AC-8 removability coverage audit (all nine staged AutoModel files vendored-or-excluded), and provenance headers on the 10 vendored files. - Dependency-guarded (importorskip nemo_automodel/torch, run in the container): data builders accept negative_prompt_embedding_path, collate emits the batch contract + broadcasts the negative embed, PartialLoadCheckpointer overrides only load_optimizer (model load stays strict) + the recipe injects it, and the qwen_image processor self-registers. The four environment-independent tests pass on CPU; the GPU/container end-to-end positive checks (smoke train, FSDP2 resume, preprocessing on real images) run via SLURM. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

Add a "Requirements & self-contained data path" section: runs on stock nemo_automodel (>=0.4.0,<1.0) from a source checkout; data loading (fastgen_data/) and preprocessing (preprocess/) are vendored so no nemo_automodel modifications are needed; how to build the cache via preprocess_qwen_image.py. Completes the train-time + preprocessing migration. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…style Round 1 closes the login-node-actionable AC-7 gaps Codex flagged: - LICENSE: record the NeMo-AutoModel vendoring under Third-Party Software Notices (Apache-2.0). - Vendored files: add the NVIDIA SPDX header after the original AutoModel license, per CONTRIBUTING "Copying code from other sources" (matches modelopt/.../calib_utils.py) — all 10. - Runtime guard: fastgen_data/__init__.py soft-warns when installed nemo_automodel is outside the tested >=0.4.0,<1.0 range; preprocess/__init__.py guards the multi_tier_bucketing import with an actionable message. - Tests: drop plan workflow markers (AC-/Phase/Step) from comments per the plan's code-style note. Not changed (deliberate): the collate prompt_embeds_mask -> text_embeddings_mask mapping is byte-identical to the patched AutoModel (faithful vendor; changing it would alter current training behavior). GPU/SLURM end-to-end validation (smoke/resume/preprocessing/revert-all-nine) remains deferred to the container environment. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

… tests Address the Round 1 review's actionable gaps: - Add make_negative_prompt_embedding.py: generates the CFG negative_prompt_embedding.pt the canonical config requires, reusing the vendored QwenImageProcessor's text encoder (default negative prompt ""), saved in the dataloader's payload format {"embed", "mask"}. Closes the AC-5/AC-8 self-containment gap (preprocessing previously emitted only per-image caches, so a from-scratch user had no negative-prompt embedding to point the config at). Documented in README. - Move migration tests to tests/examples/diffusers/fastgen/ (the repo's example CI lane) and fix the example-dir resolution from the new location. - Strengthen the provenance test to assert the NVIDIA SPDX header presence + ordering (after the original license), not just the "Vendored from" note. - Update stale "python -m tools.diffusion..." invocation strings in the vendored driver docs to the preprocess_qwen_image.py launcher. GPU/SLURM end-to-end execution remains environment-deferred. No modelopt/torch/fastgen changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…ig test Address the Round 2 review: - Every real-data config on disk now targets fastgen_data.build_text_to_image_multiresolution_dataloader. The 2 committed configs (dmd2_qwen_image.yaml, smoke) were already repointed; the 6 local-only experiment configs (1M_shift3, df0.7, 320k, 320k_shift3, 8steps, 8steps_320k — git-excluded, with hardcoded /lustre data paths) are repointed on disk so the user's own runs survive deleting the AutoModel patches. (They remain excluded from the published diff.) - The config test now enumerates configs/*.yaml and fails on ANY upstream dataloader target, so a new config cannot silently reintroduce the upstream dependence. Robust to whatever set ships (2 in a fresh clone, 8 on the user's disk). - Assert the builder's negative_prompt_embedding_path defaults to None (CFG-less is optional). GPU/SLURM end-to-end execution remains environment-deferred. No modelopt/torch/fastgen changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…ntract The stock-upstream SLURM validation (8/9 pass) surfaced a test-fixture bug, not a migration bug: test_collate_emits_contract_keys_and_broadcasts_negative_prompt built a fake sample with only {latent, prompt_embeds, prompt_embeds_mask}, but collate_fn_production also requires crop_resolution / original_resolution / crop_offset / prompt / image_path / bucket_id / aspect_ratio (the keys TextToImageDataset emits). Add them so the test exercises the real collate path. Vendored code unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

Apply the repo ruff config to the vendored data path and preprocessing files: ruff-format + autofixes (pyupgrade py310, isort) plus manual fixes for E402 (logger placement), E501 (provenance comment wrap), N806 (lowercase dims in the video-only 5D shape validator), SIM103, RUF022 (sorted __all__), C401, PERF401. No behavioral change: correctness-critical imports (recipe checkpointer injection, re-exports, # noqa: F401 side-effect imports, MultiTierBucketCalculator, DefaultLoadPlanner) and all provenance/SPDX headers preserved. ruff check + format --check clean; env-independent migration tests pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

copy-pr-bot · 2026-06-11T20:35:06Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-06-11T20:35:12Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 60318e73-5a45-4262-b995-bde11fbd270f

📥 Commits

Reviewing files that changed from the base of the PR and between 0d46f48 and 85ae40c.

📒 Files selected for processing (4)

examples/diffusers/fastgen/fastgen_data/collate_fns.py
examples/diffusers/fastgen/preprocess/processors/caption_loaders.py
examples/diffusers/fastgen/preprocess/processors/qwen_image.py
tests/examples/diffusers/fastgen/test_vendored_migration.py

🚧 Files skipped from review as they are similar to previous changes (4)

examples/diffusers/fastgen/fastgen_data/collate_fns.py
examples/diffusers/fastgen/preprocess/processors/qwen_image.py
tests/examples/diffusers/fastgen/test_vendored_migration.py
examples/diffusers/fastgen/preprocess/processors/caption_loaders.py

📝 Walkthrough

Walkthrough

Vendored NeMo-AutoModel preprocessing and data loaders into the fastgen example, rewired configs/scripts to use local builders, added FSDP2-tolerant optimizer restore, implemented preprocessing and negative-prompt tooling, and added tests validating the migration.

Changes

FastGen DMD2 NeMo-AutoModel Vendoring & Integration

Layer / File(s)	Summary
License, configuration, and documentation updates `.pre-commit-config.yaml`, `LICENSE`, `examples/diffusers/fastgen/README.md`, `examples/diffusers/fastgen/requirements.txt`, `pyproject.toml`, `.github/CODEOWNERS`	Updated Apache-2 attribution for vendored NVIDIA code, excluded fastgen files from automatic license insertion, extended README with vendored preprocessing instructions, pinned `nemo_automodel[diffusion]` to `>=0.4.0,<1.0`, added MyPy overrides for vendored preprocess code, and added CODEOWNERS entry.
Processor contracts, registry, and Qwen-Image processor `examples/diffusers/fastgen/preprocess/processors/{base,caption_loaders,qwen_image,registry}.py`, `examples/diffusers/fastgen/preprocess/processors/__init__.py`	Adds `BaseModelProcessor` abstraction, caption loader implementations (sidecar/meta/jsonl), `ProcessorRegistry`, and `QwenImageProcessor` implementing model loading, image/text encoding, latent verification, and cache payload construction.
Multiprocess preprocessing orchestration and launcher `examples/diffusers/fastgen/preprocess/preprocessing_multiprocess.py`, `examples/diffusers/fastgen/preprocess/__init__.py`, `examples/diffusers/fastgen/preprocess_qwen_image.py`	Implements multiprocess image/video preprocessing with bucketing, sharded metadata output, CLI dispatch, and a dedicated Qwen-image launcher entrypoint; includes import guards.
Vendored data package, dataset, collate, and CLI tools `examples/diffusers/fastgen/fastgen_data/{__init__,collate_fns,text_to_image_dataset}.py`, `examples/diffusers/fastgen/make_negative_prompt_embedding.py`, `examples/diffusers/fastgen/configs/dmd2_qwen_image.yaml`, `examples/diffusers/fastgen/dmd2_finetune.py`	Adds `fastgen_data` package with import guards and upstream-version warnings, `TextToImageDataset` for cached .pt samples, `collate_fn_text_to_image` with negative-prompt broadcasting and mask reconstruction, dataloader builder, negative-prompt embedding generator script, and rewires config/script defaults to use vendored builders.
FSDP2-tolerant checkpoint optimizer loading `examples/diffusers/fastgen/fastgen_checkpoint.py`, `examples/diffusers/fastgen/dmd2_recipe.py`	Introduced `PartialLoadCheckpointer` that loads optimizer state via torch.distributed.checkpoint with `DefaultLoadPlanner(allow_partial_load=True)`, added `make_optimizer_partial_load_tolerant()` to upgrade instances in-place, and injected the upgrade call into recipe checkpoint restoration.
Validation tests `tests/examples/diffusers/fastgen/test_vendored_migration.py`	Adds tests validating YAMLs target vendored builders, staged AutoModel file dispositions, provenance headers, dataloader/collate contracts (including negative-prompt broadcasting), checkpointer subclassing and idempotent upgrade, recipe injection, and processor registration.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

Ignore

❌ Failed checks (1 error)

Check name	Status	Explanation	Resolution
Security Anti-Patterns	❌ Error	Found torch.load(..., weights_only=False) without any inline safety comment in examples/diffusers/fastgen/dmd2_recipe.py and inference_dmd2_qwen_image.py (ema/state/discriminator checkpoints).	Change these loads to weights_only=True where possible, or add an explicit inline comment justifying why the checkpoint files are internally-generated/trusted.

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main objective: making the fastgen DMD2 Qwen-Image example self-contained and compatible with stock nemo_automodel.
Docstring Coverage	✅ Passed	Docstring coverage is 93.42% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch jingyux/fastgen-vendor-automodel-patches

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-06-11T20:44:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.57%. Comparing base (d3acf45) to head (85ae40c).
⚠️ Report is 19 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1688       +/-   ##
===========================================
+ Coverage   56.59%   75.57%   +18.97%     
===========================================
  Files         507      511        +4     
  Lines       55794    57337     +1543     
===========================================
+ Hits        31579    43333    +11754     
+ Misses      24215    14004    -10211

Flag	Coverage Δ
examples	`40.87% <ø> (+22.31%)`	⬆️
gpu	`57.85% <ø> (+37.31%)`	⬆️
regression	`14.69% <ø> (-0.16%)`	⬇️
unit	`54.33% <ø> (-0.09%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

…collate) Reduce the migration's footprint per review: (1) delete the vendored mock_dataloader.py and the mock-data smoke config + its README/recipe references (the published quick-start was broken on stock anyway; real training does not use it); (2) delete the vendored base_video.py (video code, dead in an image-only example) and make the --list_processors display image-only; (3) rewrite collate_fns.py as thin wrappers over stock nemo_automodel (import the unpatched collate + sampler; re-add only the DMD2 delta: text_embeddings_mask + broadcast negative-prompt embedding) instead of a full ~440-line vendored copy, also dropping the dead video collate/builder. text_to_image_dataset.py stays a faithful vendored copy (its prompt_embeds_mask change is interleaved with cache loading). Default config + tests/licensing updated; ruff + env-independent checks pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…quality) The preprocess/ tree is vendored verbatim from NVIDIA-NeMo/Automodel (Apache-2.0) and uses the upstream worker-global pattern (module-level 'X | None' set lazily in _init_worker), which trips mypy's strict union-attr/arg-type under examples.*. Add a scoped mypy override (ignore_errors, like tests.*) for examples.diffusers.fastgen.preprocess.* so the faithful copy is not annotated/diverged from upstream. The authored fastgen_data/ wrapper is unaffected (already mypy-clean). 36 errors in 2 files -> 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

Assign modelopt/torch/fastgen (DMD2 library) and /examples/diffusers/fastgen (this example) to @NVIDIA/modelopt-torch-fastgen-codeowners. Both are more-specific entries placed after their parent rules (modelopt/torch and /examples/diffusers), so GitHub's last-match-wins routes fastgen review to the fastgen team, matching the repo convention (e.g. modelopt/torch/quantization carving out of modelopt/torch). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

examples/diffusers/fastgen/dmd2_recipe.py (1)

671-671: ⚠️ Potential issue | 🔴 Critical

Fix unsafe checkpoint deserialization (weights_only=False) in diffusers fastgen restore paths

examples/diffusers/fastgen/dmd2_recipe.py restores sidecar checkpoint payloads using torch.load(..., weights_only=False) at lines 671, 676, 689, and 698 with no inline justification. Update these loads to weights_only=True (or add an inline comment explaining why weights_only=False is safe).

Suggested patch

-            ema_state = torch.load(ema_path, map_location="cpu", weights_only=False)
+            ema_state = torch.load(ema_path, map_location="cpu", weights_only=True)
...
-            state = torch.load(state_path, map_location="cpu", weights_only=False)
+            state = torch.load(state_path, map_location="cpu", weights_only=True)
...
-                disc_state = torch.load(disc_path, map_location="cpu", weights_only=False)
+                disc_state = torch.load(disc_path, map_location="cpu", weights_only=True)
...
-                disc_opt_state = torch.load(disc_opt_path, map_location="cpu", weights_only=False)
+                disc_opt_state = torch.load(disc_opt_path, map_location="cpu", weights_only=True)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/diffusers/fastgen/dmd2_recipe.py` at line 671, Summary: Unsafe
checkpoint deserialization uses torch.load(..., weights_only=False); change to
weights_only=True or justify why full deserialization is required. Locate the
torch.load calls that load sidecar checkpoints (e.g., the ema_state load
referenced by the symbol ema_state and variable ema_path in dmd2_recipe.py) and
update the argument weights_only=False to weights_only=True for all similar
restore paths (the other torch.load calls flagged in the review). If full object
deserialization truly is required, instead leave weights_only=False but add an
inline comment explaining why it is safe and unavoidable; otherwise prefer
weights_only=True to avoid executing arbitrary pickle payloads.

Source: Coding guidelines

🧹 Nitpick comments (1)

examples/diffusers/fastgen/preprocess/processors/base.py (1)
187-190: ⚡ Quick win

Move the NumPy import to module scope (or document why it must be deferred).

Line 187 imports numpy inside preprocess_image without an explicit circular/optional/heavy-import justification, so import errors surface at runtime instead of module load.

As per coding guidelines, imports should stay at file top unless there is a documented exception.
Proposed change
 from abc import ABC, abstractmethod
 from typing import Any
 
+import numpy as np
 import torch
 from PIL import Image
@@
     def preprocess_image(self, image: Image.Image) -> torch.Tensor:
@@
-        import numpy as np
-
         image_tensor = torch.from_numpy(np.array(image)).float() / 255.0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/diffusers/fastgen/preprocess/processors/base.py` around lines 187 -
190, The numpy import is done inside the preprocessing code (where image_tensor
is created in preprocess_image), which causes import errors to appear at
runtime; move "import numpy as np" to module scope near the other imports at the
top of examples/diffusers/fastgen/preprocess/processors/base.py (or add a short
comment explaining a deliberate deferred import) so that preprocess_image (and
the image_tensor creation) uses a top-level np import; update any references in
preprocess_image and related functions to use the module-scoped np.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/diffusers/fastgen/fastgen_data/collate_fns.py`:
- Line 103: The code uses torch.load(path, map_location="cpu",
weights_only=False) to load a user-configurable negative_prompt_embedding_path,
which allows unsafe pickle loading; change the load to a safe mode and validate
the input: ensure the loader uses weights_only=True (i.e., torch.load(path,
map_location="cpu", weights_only=True)) or otherwise deserialize with a safe
format (e.g., torch.jit.load or a JSON/npz reader) and validate
negative_prompt_embedding_path exists and is from a trusted source before
loading; update the line creating payload (payload = torch.load(...)) and any
code that assumes pickled objects accordingly.

In `@examples/diffusers/fastgen/preprocess/processors/base.py`:
- Around line 208-210: The code currently dereferences
models["vae"].config.scaling_factor without ensuring the attribute exists;
update the getter so that after confirming "vae" is in models and models["vae"]
has a config, you also check that the config has a scaling_factor (e.g.,
hasattr(models["vae"].config, "scaling_factor") or getattr with default) and
only return it when present, otherwise return the fallback 0.18215; reference
the models dict, the "vae" entry, its config attribute, and the scaling_factor
property when implementing this guard.

In `@examples/diffusers/fastgen/preprocess/processors/caption_loaders.py`:
- Line 425: The import "from collections import defaultdict" is placed inside a
function in caption_loaders.py—move this import to the module top-level and
remove the in-function import; search for the in-function occurrence of "from
collections import defaultdict" (and any use of defaultdict) and add a single
top-of-file import statement instead so import errors surface at module load and
the function body no longer contains the import.

In `@examples/diffusers/fastgen/preprocess/processors/qwen_image.py`:
- Around line 184-191: The second value returned by pipeline.encode_prompt (the
prompt_embeds_mask) is being ignored in encode_prompt calls and therefore not
serialized; update the encode_prompt usage in qwen_image.py to capture both
outputs (e.g., prompt_embeds, prompt_embeds_mask = pipeline.encode_prompt(...))
and include prompt_embeds_mask in the returned/persisted dict alongside
prompt_embeds (apply the same .detach().cpu().to(torch.bfloat16) or appropriate
dtype/shape handling as done for prompt_embeds) so downstream TextToImageDataset
receives the real mask instead of falling back to an all-ones mask.
- Line 217: The autocast call inside verify_latent is using an invalid dtype for
CUDA (torch.float32); update the autocast usage in verify_latent (qwen_image.py)
to use a valid reduced-precision dtype such as torch.float16 or torch.bfloat16,
or remove the dtype argument so autocast picks the default, ensuring the with
torch.no_grad(), autocast(...) context no longer raises and causes false
verification failures.

In `@LICENSE`:
- Around line 226-228: Add a top-of-file vendored header to each Python file
listed (dmd2_finetune.py, dmd2_recipe.py, export_diffusers_qwen_image.py,
fastgen_checkpoint.py, fastgen_data/__init__.py, fastgen_data/collate_fns.py,
inference_dmd2_qwen_image.py, make_negative_prompt_embedding.py,
preprocess_qwen_image.py) that matches the LICENSE expectation: a single-line or
block comment beginning with "Vendored from" and including the original
NVIDIA-NeMo/Automodel source path and commit SHA (e.g., "Vendored from
NVIDIA-NeMo/Automodel: <original/path> @ <commit>"); place it at the very top of
each file before imports and include any short attribution or license snippet
required by the original file so each file contains the "Vendored from" string
referenced in LICENSE.

In `@tests/examples/diffusers/fastgen/test_vendored_migration.py`:
- Around line 157-163: Move the local import of inspect out of the test function
to a top-level import; update the module imports to include "import inspect" at
the module scope and remove the "import inspect" line from inside
test_data_builders_importable_and_accept_negative_prompt_path so the test uses
the module-level inspect symbol.

---

Outside diff comments:
In `@examples/diffusers/fastgen/dmd2_recipe.py`:
- Line 671: Summary: Unsafe checkpoint deserialization uses torch.load(...,
weights_only=False); change to weights_only=True or justify why full
deserialization is required. Locate the torch.load calls that load sidecar
checkpoints (e.g., the ema_state load referenced by the symbol ema_state and
variable ema_path in dmd2_recipe.py) and update the argument weights_only=False
to weights_only=True for all similar restore paths (the other torch.load calls
flagged in the review). If full object deserialization truly is required,
instead leave weights_only=False but add an inline comment explaining why it is
safe and unavoidable; otherwise prefer weights_only=True to avoid executing
arbitrary pickle payloads.

---

Nitpick comments:
In `@examples/diffusers/fastgen/preprocess/processors/base.py`:
- Around line 187-190: The numpy import is done inside the preprocessing code
(where image_tensor is created in preprocess_image), which causes import errors
to appear at runtime; move "import numpy as np" to module scope near the other
imports at the top of examples/diffusers/fastgen/preprocess/processors/base.py
(or add a short comment explaining a deliberate deferred import) so that
preprocess_image (and the image_tensor creation) uses a top-level np import;
update any references in preprocess_image and related functions to use the
module-scoped np.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5a5a2507-bada-45a1-9617-9f3d3b0b7710

📥 Commits

Reviewing files that changed from the base of the PR and between dd49a46 and c7fc9bf.

📒 Files selected for processing (23)

.pre-commit-config.yaml
LICENSE
examples/diffusers/fastgen/README.md
examples/diffusers/fastgen/configs/dmd2_qwen_image.yaml
examples/diffusers/fastgen/configs/dmd2_qwen_image_smoke.yaml
examples/diffusers/fastgen/dmd2_finetune.py
examples/diffusers/fastgen/dmd2_recipe.py
examples/diffusers/fastgen/fastgen_checkpoint.py
examples/diffusers/fastgen/fastgen_data/__init__.py
examples/diffusers/fastgen/fastgen_data/collate_fns.py
examples/diffusers/fastgen/fastgen_data/text_to_image_dataset.py
examples/diffusers/fastgen/make_negative_prompt_embedding.py
examples/diffusers/fastgen/preprocess/__init__.py
examples/diffusers/fastgen/preprocess/preprocessing_multiprocess.py
examples/diffusers/fastgen/preprocess/processors/__init__.py
examples/diffusers/fastgen/preprocess/processors/base.py
examples/diffusers/fastgen/preprocess/processors/caption_loaders.py
examples/diffusers/fastgen/preprocess/processors/qwen_image.py
examples/diffusers/fastgen/preprocess/processors/registry.py
examples/diffusers/fastgen/preprocess_qwen_image.py
examples/diffusers/fastgen/requirements.txt
pyproject.toml
tests/examples/diffusers/fastgen/test_vendored_migration.py

💤 Files with no reviewable changes (1)

examples/diffusers/fastgen/configs/dmd2_qwen_image_smoke.yaml

coderabbitai · 2026-06-11T23:06:35Z

+        if "vae" in models and hasattr(models["vae"], "config"):
+            return models["vae"].config.scaling_factor
+        return 0.18215  # Default for most models


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard scaling_factor existence before dereference.

Line 209 assumes models["vae"].config.scaling_factor always exists once config exists. If scaling_factor is missing, this raises instead of using the default fallback.

Proposed change

def get_vae_scaling_factor(self, models: dict[str, Any]) -> float: @@ - if "vae" in models and hasattr(models["vae"], "config"): - return models["vae"].config.scaling_factor + if "vae" in models and hasattr(models["vae"], "config"): + scaling_factor = getattr(models["vae"].config, "scaling_factor", None) + if scaling_factor is not None: + return scaling_factor return 0.18215 # Default for most models

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/diffusers/fastgen/preprocess/processors/base.py` around lines 208 - 210, The code currently dereferences models["vae"].config.scaling_factor without ensuring the attribute exists; update the getter so that after confirming "vae" is in models and models["vae"] has a config, you also check that the config has a scaling_factor (e.g., hasattr(models["vae"].config, "scaling_factor") or getattr with default) and only return it when present, otherwise return the fallback 0.18215; reference the models dict, the "vae" entry, its config attribute, and the scaling_factor property when implementing this guard.

coderabbitai · 2026-06-11T23:06:35Z

+            prompt_embeds, _ = pipeline.encode_prompt(
+                prompt=prompt,
+                device=device,
+            )
+
+        return {
+            "prompt_embeds": prompt_embeds.detach().cpu().to(torch.bfloat16),
+        }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

prompt_embeds_mask is dropped before cache serialization.

Line 184 receives a second output from encode_prompt but discards it, and Lines 261-266 persist only prompt_embeds. Downstream, TextToImageDataset falls back to an all-ones mask when the mask key is absent, so the true sequence mask contract is lost.

Proposed change

with torch.no_grad(): - prompt_embeds, _ = pipeline.encode_prompt( + prompt_embeds, prompt_embeds_mask = pipeline.encode_prompt( prompt=prompt, device=device, ) - return { + result = { "prompt_embeds": prompt_embeds.detach().cpu().to(torch.bfloat16), } + if prompt_embeds_mask is not None: + result["prompt_embeds_mask"] = prompt_embeds_mask.detach().cpu().to(torch.long) + return result @@ return { # Image latent "latent": latent, # Text embeddings "prompt_embeds": text_encodings["prompt_embeds"], + "prompt_embeds_mask": text_encodings.get("prompt_embeds_mask"), # Metadata "original_resolution": metadata["original_resolution"],

Also applies to: 261-266

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/diffusers/fastgen/preprocess/processors/qwen_image.py` around lines 184 - 191, The second value returned by pipeline.encode_prompt (the prompt_embeds_mask) is being ignored in encode_prompt calls and therefore not serialized; update the encode_prompt usage in qwen_image.py to capture both outputs (e.g., prompt_embeds, prompt_embeds_mask = pipeline.encode_prompt(...)) and include prompt_embeds_mask in the returned/persisted dict alongside prompt_embeds (apply the same .detach().cpu().to(torch.bfloat16) or appropriate dtype/shape handling as done for prompt_embeds) so downstream TextToImageDataset receives the real mask instead of falling back to an all-ones mask.

coderabbitai · 2026-06-11T23:06:35Z

+  Copyright (c) 2025 NVIDIA CORPORATION — portions of examples/diffusers/fastgen/ are
+    vendored from NVIDIA-NeMo/Automodel (https://github.com/NVIDIA-NeMo/Automodel); see the
+    per-file "Vendored from" headers for the exact source paths and commit.


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check that vendored files have "Vendored from" headers. # Expected: Each file starts with a comment containing "Vendored from" + source URL. vendored_files=( "examples/diffusers/fastgen/fastgen_data/text_to_image_dataset.py" "examples/diffusers/fastgen/preprocess/preprocessing_multiprocess.py" "examples/diffusers/fastgen/preprocess/processors/__init__.py" "examples/diffusers/fastgen/preprocess/processors/base.py" "examples/diffusers/fastgen/preprocess/processors/registry.py" "examples/diffusers/fastgen/preprocess/processors/caption_loaders.py" "examples/diffusers/fastgen/preprocess/processors/qwen_image.py" ) echo "Checking for 'Vendored from' headers in vendored files..." for file in "${vendored_files[@]}"; do if [ -f "$file" ]; then if head -20 "$file" | grep -q "Vendored from"; then echo "✓ $file: has Vendored from header" else echo "✗ $file: MISSING Vendored from header" fi else echo "⚠ $file: file not found" fi done

Repository: NVIDIA/Model-Optimizer

Length of output: 763

🏁 Script executed:

#!/bin/bash set -euo pipefail dir="examples/diffusers/fastgen" echo "Scanning for Python files under: $dir" mapfile -t files < <(find "$dir" -type f -name "*.py" | sort) echo "Found ${`#files`[@]} .py files." missing=() for f in "${files[@]}"; do if head -30 "$f" | rg -q "Vendored from"; then echo "✓ $f" else echo "✗ $f (missing 'Vendored from' in first 30 lines)" missing+=("$f") fi done if [ ${`#missing`[@]} -gt 0 ]; then echo echo "Missing count: ${`#missing`[@]}" printf '%s\n' "${missing[@]}" exit 1 else echo echo "All scanned Python files have 'Vendored from' headers." fi

Repository: NVIDIA/Model-Optimizer

Length of output: 201

🏁 Script executed:

#!/bin/bash set -euo pipefail dir="examples/diffusers/fastgen" echo "Scanning for Python files under: $dir" tmp="$(mktemp)" find "$dir" -type f -name "*.py" | sort > "$tmp" mapfile -t files < "$tmp" rm -f "$tmp" echo "Found ${`#files`[@]} .py files." missing=() for f in "${files[@]}"; do if head -40 "$f" | rg -q "Vendored from"; then echo "✓ $f" else echo "✗ $f (missing 'Vendored from' in first 40 lines)" missing+=("$f") fi done if [ ${`#missing`[@]} -gt 0 ]; then echo echo "Missing count: ${`#missing`[@]}" printf '%s\n' "${missing[@]}" exit 1 else echo echo "All scanned Python files have 'Vendored from' headers." fi

Repository: NVIDIA/Model-Optimizer

Length of output: 2025

🏁 Script executed:

#!/bin/bash set -euo pipefail dir="examples/diffusers/fastgen" tmp="$(mktemp)" find "$dir" -type f -name "*.py" | sort > "$tmp" mapfile -t files < "$tmp" rm -f "$tmp" echo "Checking where 'Vendored from' appears (first 200 lines + anywhere in file):" for f in "${files[@]}"; do first200_has=0 any_has=0 if head -200 "$f" | rg -q "Vendored from"; then first200_has=1 fi if rg -q "Vendored from" "$f"; then any_has=1 fi if [ "$any_has" -eq 1 ]; then if [ "$first200_has" -eq 1 ]; then echo "✓ $f (Vendored from in first 200 lines)" else echo "⚠ $f (Vendored from exists but after first 200 lines)" fi else echo "✗ $f (no 'Vendored from' anywhere)" fi done

Repository: NVIDIA/Model-Optimizer

Length of output: 1689

Add missing # Vendored from ... headers to vendored fastgen Python files

LICENSE points to per-file “Vendored from” headers, but these vendored files under examples/diffusers/fastgen/ contain no “Vendored from” string anywhere:

examples/diffusers/fastgen/dmd2_finetune.py

examples/diffusers/fastgen/dmd2_recipe.py

examples/diffusers/fastgen/export_diffusers_qwen_image.py

examples/diffusers/fastgen/fastgen_checkpoint.py

examples/diffusers/fastgen/fastgen_data/__init__.py

examples/diffusers/fastgen/fastgen_data/collate_fns.py

examples/diffusers/fastgen/inference_dmd2_qwen_image.py

examples/diffusers/fastgen/make_negative_prompt_embedding.py

examples/diffusers/fastgen/preprocess_qwen_image.py

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@LICENSE` around lines 226 - 228, Add a top-of-file vendored header to each Python file listed (dmd2_finetune.py, dmd2_recipe.py, export_diffusers_qwen_image.py, fastgen_checkpoint.py, fastgen_data/__init__.py, fastgen_data/collate_fns.py, inference_dmd2_qwen_image.py, make_negative_prompt_embedding.py, preprocess_qwen_image.py) that matches the LICENSE expectation: a single-line or block comment beginning with "Vendored from" and including the original NVIDIA-NeMo/Automodel source path and commit SHA (e.g., "Vendored from NVIDIA-NeMo/Automodel: <original/path> @ <commit>"); place it at the very top of each file before imports and include any short attribution or license snippet required by the original file so each file contains the "Vendored from" string referenced in LICENSE.

…e diffusers team Drop the /examples/diffusers/fastgen carve-out; @NVIDIA/modelopt-torch-fastgen-codeowners owns modelopt/torch/fastgen (the DMD2 library). The example stays under /examples/diffusers (@NVIDIA/modelopt-examples-diffusers-codeowners). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

Apply the safe, sensible CodeRabbit findings: (1) collate_fns.py loads the negative-prompt embedding with weights_only=True (it is a dict of plain tensors saved by make_negative_prompt_embedding.py, so it loads cleanly) — removes the pickle/RCE vector per SECURITY.md; (2) move 'from collections import defaultdict' (caption_loaders) and 'import inspect' (test) to module top; (3) drop the no-op autocast(dtype=torch.float32) in QwenImageProcessor.verify_latent — the latent is already .float(), so this is behavior-identical and removes the invalid-CUDA-autocast-dtype warning. Preprocessing/data only; no change to DMD2 training math. Intentionally not changed (would alter validated behavior or are false positives): not persisting prompt_embeds_mask in the cache (would switch the text-attention mask from the all-ones fallback the existing caches + running jobs use to a real mask — changes training math); the speculative scaling_factor guard (Qwen-Image VAE always defines it; vendored verbatim); 'Vendored from' headers on authored files (those correctly carry the standard NVIDIA SPDX header — LICENSE states only 'portions' are vendored). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

jingyu-ml · 2026-06-12T04:25:35Z

/ok to test 85ae40c

kevalmorabia97 · 2026-06-12T05:52:36Z

These are also nvidia authored and apache 2 license files so wr dont need to exclude in pre commit or have duplicate nvidia license. Let license hook auto add correct license everywhere

jingyu-ml and others added 11 commits June 10, 2026 17:41

jingyu-ml added the cherry-pick-0.45.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Jun 11, 2026

jingyu-ml self-assigned this Jun 11, 2026

jingyu-ml marked this pull request as ready for review June 11, 2026 22:48

jingyu-ml requested review from a team as code owners June 11, 2026 22:48

jingyu-ml requested a review from kevalmorabia97 June 11, 2026 22:48

coderabbitai Bot reviewed Jun 11, 2026

View reviewed changes

jingyu-ml and others added 2 commits June 11, 2026 16:09

kevalmorabia97 reviewed Jun 12, 2026

View reviewed changes

kevalmorabia97 requested changes Jun 12, 2026

View reviewed changes

Conversation

jingyu-ml commented Jun 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Usage

Testing

Before your PR is "Ready for review"

Additional Information

Summary by CodeRabbit

Uh oh!

copy-pr-bot Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Pre-merge checks failed

❌ Failed checks (1 error)

Uh oh!

codecov Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jingyu-ml commented Jun 12, 2026

Uh oh!

kevalmorabia97 Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jingyu-ml commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading

codecov Bot commented Jun 11, 2026 •

edited

Loading