What are the major differences between finetuning QwenImage and QwenImageLayer? #1274

@garychan22

Hi, I am using my own scripts, implemented with diffusers and accelerate, to finetune QwenImageLayer. I found that convergence is slow, and the model may not be converging at all: the loss stays large, at around 0.84.

My modifications relative to QwenImage finetuning:

  1. RGBA-VAE encoding
  2. 3D RoPE is constructed in the order [[reconstructed_image, layer00, layer01, ..., cond_image]]
  3. Resolution of 640
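
To make item 2 concrete, here is a minimal sketch of the ordering I mean, in plain Python. The helper names `build_sequence_order` and `build_rope_positions` are hypothetical and only for illustration; they are not taken from diffusers or DiffSynth-Studio. The sketch assumes each image simply gets the next frame index and that row and column coordinates start at 0. The released code may assign indices differently (for example for the condition image), so this should be checked against it rather than treated as the reference behavior.

```python
from typing import List, Tuple


def build_sequence_order(num_layers: int) -> List[str]:
    """Names of the images in the order they are concatenated (hypothetical helper)."""
    names = ["reconstructed_image"]
    names += [f"layer{i:02d}" for i in range(num_layers)]
    names.append("cond_image")
    return names


def build_rope_positions(
    num_layers: int, latent_h: int, latent_w: int
) -> List[Tuple[int, int, int]]:
    """Return one (frame, row, col) triple per token, frame-major.

    Assumption: frame index == position of the image in the sequence order,
    and row/col start at 0. The actual model may use other offsets.
    """
    positions = []
    for frame, _name in enumerate(build_sequence_order(num_layers)):
        for row in range(latent_h):
            for col in range(latent_w):
                positions.append((frame, row, col))
    return positions


if __name__ == "__main__":
    # Tiny grid so the output is easy to inspect by eye.
    print(build_sequence_order(2))
    print(build_rope_positions(2, 2, 3)[:4])
```

If the frame indices in my script differ from what the released model expects, that mismatch alone might explain the slow convergence, which is part of why I am asking.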

I found a released training script here: https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/qwen_image/model_training/full/Qwen-Image-Layered.sh. However, I am new to DiffSynth-Studio.

Could someone share the major differences between finetuning QwenImage and QwenImageLayer? Thanks!
