Commit ae023a3

Merge pull request #6 from Ednaordinary/main
sync
2 parents 72748ec + 5e48f46 commit ae023a3

286 files changed

Lines changed: 21754 additions & 3767 deletions


docs/source/en/_toctree.yml

Lines changed: 20 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -359,10 +359,14 @@
359359
title: HunyuanDiT2DModel
360360
- local: api/models/hunyuanimage_transformer_2d
361361
title: HunyuanImageTransformer2DModel
362+
- local: api/models/hunyuan_video15_transformer_3d
363+
title: HunyuanVideo15Transformer3DModel
362364
- local: api/models/hunyuan_video_transformer_3d
363365
title: HunyuanVideoTransformer3DModel
364366
- local: api/models/latte_transformer3d
365367
title: LatteTransformer3DModel
368+
- local: api/models/longcat_image_transformer2d
369+
title: LongCatImageTransformer2DModel
366370
- local: api/models/ltx_video_transformer3d
367371
title: LTXVideoTransformer3DModel
368372
- local: api/models/lumina2_transformer2d
@@ -373,6 +377,8 @@
373377
title: MochiTransformer3DModel
374378
- local: api/models/omnigen_transformer
375379
title: OmniGenTransformer2DModel
380+
- local: api/models/ovisimage_transformer2d
381+
title: OvisImageTransformer2DModel
376382
- local: api/models/pixart_transformer2d
377383
title: PixArtTransformer2DModel
378384
- local: api/models/prior_transformer
@@ -397,6 +403,8 @@
397403
title: WanAnimateTransformer3DModel
398404
- local: api/models/wan_transformer_3d
399405
title: WanTransformer3DModel
406+
- local: api/models/z_image_transformer2d
407+
title: ZImageTransformer2DModel
400408
title: Transformers
401409
- sections:
402410
- local: api/models/stable_cascade_unet
@@ -433,6 +441,8 @@
433441
title: AutoencoderKLHunyuanImageRefiner
434442
- local: api/models/autoencoder_kl_hunyuan_video
435443
title: AutoencoderKLHunyuanVideo
444+
- local: api/models/autoencoder_kl_hunyuan_video15
445+
title: AutoencoderKLHunyuanVideo15
436446
- local: api/models/autoencoderkl_ltx_video
437447
title: AutoencoderKLLTXVideo
438448
- local: api/models/autoencoderkl_magvit
@@ -545,6 +555,8 @@
545555
title: Kandinsky 2.2
546556
- local: api/pipelines/kandinsky3
547557
title: Kandinsky 3
558+
- local: api/pipelines/kandinsky5_image
559+
title: Kandinsky 5.0 Image
548560
- local: api/pipelines/kolors
549561
title: Kolors
550562
- local: api/pipelines/latent_consistency_models
@@ -553,6 +565,8 @@
553565
title: Latent Diffusion
554566
- local: api/pipelines/ledits_pp
555567
title: LEDITS++
568+
- local: api/pipelines/longcat_image
569+
title: LongCat-Image
556570
- local: api/pipelines/lumina2
557571
title: Lumina 2.0
558572
- local: api/pipelines/lumina
@@ -563,6 +577,8 @@
563577
title: MultiDiffusion
564578
- local: api/pipelines/omnigen
565579
title: OmniGen
580+
- local: api/pipelines/ovis_image
581+
title: Ovis-Image
566582
- local: api/pipelines/pag
567583
title: PAG
568584
- local: api/pipelines/paint_by_example
@@ -638,6 +654,8 @@
638654
title: VisualCloze
639655
- local: api/pipelines/wuerstchen
640656
title: Wuerstchen
657+
- local: api/pipelines/z_image
658+
title: Z-Image
641659
title: Image
642660
- sections:
643661
- local: api/pipelines/allegro
@@ -652,6 +670,8 @@
652670
title: Framepack
653671
- local: api/pipelines/hunyuan_video
654672
title: HunyuanVideo
673+
- local: api/pipelines/hunyuan_video15
674+
title: HunyuanVideo1.5
655675
- local: api/pipelines/i2vgenxl
656676
title: I2VGen-XL
657677
- local: api/pipelines/kandinsky5_video

docs/source/en/api/cache.md

Lines changed: 6 additions & 0 deletions
@@ -34,3 +34,9 @@ Cache methods speedup diffusion transformers by storing and reusing intermediate
3434
[[autodoc]] FirstBlockCacheConfig
3535

3636
[[autodoc]] apply_first_block_cache
37+
38+
### TaylorSeerCacheConfig
39+
40+
[[autodoc]] TaylorSeerCacheConfig
41+
42+
[[autodoc]] apply_taylorseer_cache
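
The diff only registers `TaylorSeerCacheConfig` and `apply_taylorseer_cache` for autodoc, so the following is a rough, library-independent sketch of the idea the name suggests: instead of reusing a cached feature unchanged, forecast it for a skipped step with a truncated Taylor expansion built from previously cached values. The function names, the use of backward finite differences, and the scalar "feature" are assumptions made for illustration; this is not the diffusers implementation.

```python
from math import factorial


def backward_differences(history, order):
    """Return the newest backward differences of orders 0..order.

    `history` holds equally spaced cached values, oldest first.
    """
    levels = [list(history)]
    for _ in range(order):
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    return [level[-1] for level in levels]


def taylor_forecast(history, interval, steps_ahead, order=2):
    """Forecast a value `steps_ahead` steps past the newest cached sample.

    Derivatives are approximated as (k-th backward difference) / interval**k,
    which is exact for linear signals and only approximate otherwise.
    """
    if len(history) < order + 1:
        raise ValueError("need at least order + 1 cached samples")
    deltas = backward_differences(history, order)
    return sum(
        delta / interval**k * steps_ahead**k / factorial(k)
        for k, delta in enumerate(deltas)
    )


# Hypothetical cached feature values sampled every 2 steps from f(t) = 2t + 1.
cached = [1, 5, 9]
predicted = taylor_forecast(cached, interval=2, steps_ahead=1)
```

A higher `order` needs more cached samples, so a real cache would trade memory for forecast accuracy.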

docs/source/en/api/loaders/lora.md

Lines changed: 5 additions & 0 deletions
@@ -31,6 +31,7 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
3131
- [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
3232
- [`HiDreamImageLoraLoaderMixin`] provides similar functions for [HiDream Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hidream)
3333
- [`QwenImageLoraLoaderMixin`] provides similar functions for [Qwen Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/qwen).
34+
- [`ZImageLoraLoaderMixin`] provides similar functions for [Z-Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/z_image).
3435
- [`Flux2LoraLoaderMixin`] provides similar functions for [Flux2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux2).
3536
- [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.
3637

@@ -112,6 +113,10 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
112113

113114
[[autodoc]] loaders.lora_pipeline.QwenImageLoraLoaderMixin
114115

116+
## ZImageLoraLoaderMixin
117+
118+
[[autodoc]] loaders.lora_pipeline.ZImageLoraLoaderMixin
119+
115120
## KandinskyLoraLoaderMixin
116121
[[autodoc]] loaders.lora_pipeline.KandinskyLoraLoaderMixin
117122

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License. -->
11+
12+
# AutoencoderKLHunyuanVideo15
13+
14+
The 3D variational autoencoder (VAE) model with KL loss used in [HunyuanVideo1.5](https://github.com/Tencent/HunyuanVideo1-1.5) by Tencent.
15+
16+
The model can be loaded with the following code snippet.
17+
18+
```python
19+
import torch
from diffusers import AutoencoderKLHunyuanVideo15
20+
21+
vae = AutoencoderKLHunyuanVideo15.from_pretrained("hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-480p_t2v", subfolder="vae", torch_dtype=torch.float32)
22+
23+
# make sure to enable tiling to avoid OOM
24+
vae.enable_tiling()
25+
```
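
To illustrate why tiling helps with memory, here is a minimal sketch of how one spatial dimension can be split into overlapping tiles so that only one tile is processed at a time. The helper name `tile_starts` and the tile and stride values are hypothetical; the actual tiling logic and defaults inside `AutoencoderKLHunyuanVideo15` may differ.

```python
def tile_starts(length, tile, stride):
    """Return start offsets of tiles of size `tile` that cover `length`.

    Consecutive tiles overlap by `tile - stride` elements; the last tile is
    shifted back so that it ends exactly at `length`.
    """
    if tile <= 0 or stride <= 0 or stride > tile:
        raise ValueError("require 0 < stride <= tile")
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)
    return starts


# Hypothetical 480x848 frame processed in 256x256 tiles with a stride of 192.
rows = tile_starts(480, tile=256, stride=192)
cols = tile_starts(848, tile=256, stride=192)

# Peak activation size scales with one tile instead of the whole frame.
full_area = 480 * 848
tile_area = 256 * 256
```

In this sketch the overlap exists so that neighboring tiles can be blended at their seams; a smaller stride means more tiles and more compute for smoother seams.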
26+
27+
## AutoencoderKLHunyuanVideo15
28+
29+
[[autodoc]] AutoencoderKLHunyuanVideo15
30+
- decode
31+
- encode
32+
- all
33+
34+
## DecoderOutput
35+
36+
[[autodoc]] models.autoencoders.vae.DecoderOutput

docs/source/en/api/models/controlnet.md

Lines changed: 15 additions & 0 deletions
@@ -33,6 +33,21 @@ url = "https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/blob/m
3333
pipe = StableDiffusionControlNetPipeline.from_single_file(url, controlnet=controlnet)
3434
```
3535

36+
## Loading from Control LoRA
37+
38+
Control-LoRA was introduced by Stability AI in [stabilityai/control-lora](https://huggingface.co/stabilityai/control-lora). It applies low-rank, parameter-efficient fine-tuning to ControlNet, which yields smaller control models that fit on a wider variety of consumer GPUs.
39+
40+
```py
41+
import torch
from diffusers import ControlNetModel, UNet2DConditionModel
42+
43+
lora_id = "stabilityai/control-lora"
44+
lora_filename = "control-LoRAs-rank128/control-lora-canny-rank128.safetensors"
45+
46+
unet = UNet2DConditionModel.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet", torch_dtype=torch.bfloat16).to("cuda")
47+
controlnet = ControlNetModel.from_unet(unet).to(device="cuda", dtype=torch.bfloat16)
48+
controlnet.load_lora_adapter(lora_id, weight_name=lora_filename, prefix=None, controlnet_config=controlnet.config)
49+
```
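
As a rough illustration of why a low-rank adapter is more compact, the sketch below compares parameter counts for a full weight update and a rank-128 update, and applies a tiny low-rank update `W + B @ A` with plain lists. The layer size of 1280 and all helper names are assumptions chosen for illustration; only the rank of 128 comes from the checkpoint filename above.

```python
def full_params(d_in, d_out):
    """Parameters in a dense weight update of shape (d_out, d_in)."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Parameters in a low-rank update B (d_out x rank) @ A (rank x d_in)."""
    return rank * (d_in + d_out)


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) without modifying the inputs."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical 1280x1280 projection layer adapted at rank 128.
dense = full_params(1280, 1280)
low_rank = lora_params(1280, 1280, rank=128)
ratio = low_rank / dense

# Tiny rank-1 example: a 2x2 identity weight plus B (2x1) @ A (1x2).
updated = apply_lora([[1, 0], [0, 1]], [[1], [2]], [[3, 4]])
```

Under these assumed sizes the low-rank update stores one fifth of the parameters of the dense update.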
50+
3651
## ControlNetModel
3752

3853
[[autodoc]] ControlNetModel
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License. -->
11+
12+
# HunyuanVideo15Transformer3DModel
13+
14+
A Diffusion Transformer model for 3D video-like data used in [HunyuanVideo1.5](https://github.com/Tencent/HunyuanVideo1-1.5).
15+
16+
The model can be loaded with the following code snippet.
17+
18+
```python
19+
import torch
from diffusers import HunyuanVideo15Transformer3DModel
20+
21+
transformer = HunyuanVideo15Transformer3DModel.from_pretrained("hunyuanvideo-community/HunyuanVideo-1.5-Diffusers-480p_t2v", subfolder="transformer", torch_dtype=torch.bfloat16)
22+
```
23+
24+
## HunyuanVideo15Transformer3DModel
25+
26+
[[autodoc]] HunyuanVideo15Transformer3DModel
27+
28+
## Transformer2DModelOutput
29+
30+
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
1+
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License.
11+
-->
12+
13+
# LongCatImageTransformer2DModel
14+
15+
The model can be loaded with the following code snippet.
16+
17+
```python
18+
import torch
from diffusers import LongCatImageTransformer2DModel
19+
20+
transformer = LongCatImageTransformer2DModel.from_pretrained("meituan-longcat/LongCat-Image", subfolder="transformer", torch_dtype=torch.bfloat16)
21+
```
22+
23+
## LongCatImageTransformer2DModel
24+
25+
[[autodoc]] LongCatImageTransformer2DModel
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License. -->
11+
12+
# OvisImageTransformer2DModel
13+
14+
The model can be loaded with the following code snippet.
15+
16+
```python
17+
import torch
from diffusers import OvisImageTransformer2DModel
18+
19+
transformer = OvisImageTransformer2DModel.from_pretrained("AIDC-AI/Ovis-Image-7B", subfolder="transformer", torch_dtype=torch.bfloat16)
20+
```
21+
22+
## OvisImageTransformer2DModel
23+
24+
[[autodoc]] OvisImageTransformer2DModel
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License.
11+
-->
12+
13+
# ZImageTransformer2DModel
14+
15+
A Transformer model for image-like data from [Z-Image](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo).
16+
17+
## ZImageTransformer2DModel
18+
19+
[[autodoc]] ZImageTransformer2DModel

docs/source/en/api/pipelines/bria_fibo.md

Lines changed: 6 additions & 6 deletions
@@ -21,9 +21,10 @@ With only 8 billion parameters, FIBO provides a new level of image quality, prom
2121
FIBO is trained exclusively on a structured prompt and will not work with freeform text prompts.
2222
You can use the [FIBO-VLM-prompt-to-JSON](https://huggingface.co/briaai/FIBO-VLM-prompt-to-JSON) model or the [FIBO-gemini-prompt-to-JSON](https://huggingface.co/briaai/FIBO-gemini-prompt-to-JSON) model to convert your freeform text prompt to a structured JSON prompt.
2323

24-
its not recommended to use freeform text prompts directly with FIBO, as it will not produce the best results.
24+
> [!NOTE]
25+
> Avoid using freeform text prompts directly with FIBO because they do not produce the best results.
2526
26-
you can learn more about FIBO in [Bria Fibo Hugging Face page](https://huggingface.co/briaai/FIBO).
27+
Refer to the Bria Fibo Hugging Face [page](https://huggingface.co/briaai/FIBO) to learn more.
2728

2829

2930
## Usage
@@ -37,9 +38,8 @@ hf auth login
3738
```
3839

3940

40-
## BriaPipeline
41+
## BriaFiboPipeline
4142

42-
[[autodoc]] BriaPipeline
43+
[[autodoc]] BriaFiboPipeline
4344
- all
44-
- __call__
45-
45+
- __call__
