LoRA

LoRA is a fast and lightweight training method that inserts and trains a significantly smaller number of parameters instead of all the model parameters. This produces a smaller file (~100 MB) and makes it easier to quickly train a model to learn a new concept. LoRA weights are typically loaded into the denoiser, the text encoder, or both. The denoiser usually corresponds to a UNet ([UNet2DConditionModel], for example) or a Transformer ([SD3Transformer2DModel], for example). There are several classes for loading LoRA weights:

  • [StableDiffusionLoraLoaderMixin] provides functions for loading and unloading, fusing and unfusing, enabling and disabling, and other utilities for managing LoRA weights. This class can be used with any model.
  • [StableDiffusionXLLoraLoaderMixin] is a Stable Diffusion XL (SDXL) version of the [StableDiffusionLoraLoaderMixin] class for loading and saving LoRA weights. It can only be used with the SDXL model.
  • [SD3LoraLoaderMixin] provides similar functions for Stable Diffusion 3.
  • [FluxLoraLoaderMixin] provides similar functions for Flux.
  • [CogVideoXLoraLoaderMixin] provides similar functions for CogVideoX.
  • [Mochi1LoraLoaderMixin] provides similar functions for Mochi.
  • [AuraFlowLoraLoaderMixin] provides similar functions for AuraFlow.
  • [LTXVideoLoraLoaderMixin] provides similar functions for LTX-Video.
  • [SanaLoraLoaderMixin] provides similar functions for Sana.
  • [HunyuanVideoLoraLoaderMixin] provides similar functions for HunyuanVideo.
  • [Lumina2LoraLoaderMixin] provides similar functions for Lumina2.
  • [WanLoraLoaderMixin] provides similar functions for Wan.
  • [SkyReelsV2LoraLoaderMixin] provides similar functions for SkyReels-V2.
  • [CogView4LoraLoaderMixin] provides similar functions for CogView4.
  • [AmusedLoraLoaderMixin] is for the [AmusedPipeline].
  • [HiDreamImageLoraLoaderMixin] provides similar functions for HiDream Image.
  • [QwenImageLoraLoaderMixin] provides similar functions for Qwen Image.
  • [Flux2LoraLoaderMixin] provides similar functions for Flux2.
  • [KandinskyLoraLoaderMixin] provides similar functions for Kandinsky.
  • [LoraBaseMixin] provides a base class with several utility methods to fuse, unfuse, and unload LoRAs, and more; see the sketches below.
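As a quick illustration of the loading, fusing, and unfusing utilities listed above, here is a minimal sketch using an SDXL pipeline. The LoRA repository ID and weight filename are placeholders for your own checkpoint.

```python
import torch
from diffusers import DiffusionPipeline

# Load a base SDXL pipeline; its LoRA methods come from StableDiffusionXLLoraLoaderMixin.
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load LoRA weights into the denoiser (and text encoder, if the checkpoint contains them).
# The repository ID and weight_name below are placeholders.
pipeline.load_lora_weights(
    "your-username/your-sdxl-lora", weight_name="pytorch_lora_weights.safetensors"
)

# Optionally fuse the LoRA into the base weights for slightly faster inference,
# then unfuse to restore the original parameters.
pipeline.fuse_lora()
image = pipeline("a prompt in the style the LoRA was trained on").images[0]
pipeline.unfuse_lora()
```

The same load_lora_weights, fuse_lora, and unfuse_lora calls are available on the other pipelines through their respective loader mixins.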

Tip

To learn more about how to load LoRA weights, see the LoRA loading guide.
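The enabling, disabling, and unloading utilities from [LoraBaseMixin] can also be combined to manage several adapters at once. Here is a hedged sketch, again with an SDXL pipeline; the LoRA repositories and adapter names are placeholders.

```python
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Register two LoRAs under explicit adapter names (placeholder repositories).
pipeline.load_lora_weights("your-username/style-lora", adapter_name="style")
pipeline.load_lora_weights("your-username/detail-lora", adapter_name="detail")

# Blend both adapters with per-adapter scales.
pipeline.set_adapters(["style", "detail"], adapter_weights=[0.8, 0.5])

# Temporarily switch the LoRA layers off and back on.
pipeline.disable_lora()
pipeline.enable_lora()

# Remove all LoRA weights and restore the base model.
pipeline.unload_lora_weights()
```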

LoraBaseMixin

autodoc loaders.lora_base.LoraBaseMixin

StableDiffusionLoraLoaderMixin

autodoc loaders.lora_pipeline.StableDiffusionLoraLoaderMixin

StableDiffusionXLLoraLoaderMixin

autodoc loaders.lora_pipeline.StableDiffusionXLLoraLoaderMixin

SD3LoraLoaderMixin

autodoc loaders.lora_pipeline.SD3LoraLoaderMixin

FluxLoraLoaderMixin

autodoc loaders.lora_pipeline.FluxLoraLoaderMixin

Flux2LoraLoaderMixin

autodoc loaders.lora_pipeline.Flux2LoraLoaderMixin

CogVideoXLoraLoaderMixin

autodoc loaders.lora_pipeline.CogVideoXLoraLoaderMixin

Mochi1LoraLoaderMixin

autodoc loaders.lora_pipeline.Mochi1LoraLoaderMixin

AuraFlowLoraLoaderMixin

autodoc loaders.lora_pipeline.AuraFlowLoraLoaderMixin

LTXVideoLoraLoaderMixin

autodoc loaders.lora_pipeline.LTXVideoLoraLoaderMixin

SanaLoraLoaderMixin

autodoc loaders.lora_pipeline.SanaLoraLoaderMixin

HunyuanVideoLoraLoaderMixin

autodoc loaders.lora_pipeline.HunyuanVideoLoraLoaderMixin

Lumina2LoraLoaderMixin

autodoc loaders.lora_pipeline.Lumina2LoraLoaderMixin

CogView4LoraLoaderMixin

autodoc loaders.lora_pipeline.CogView4LoraLoaderMixin

WanLoraLoaderMixin

autodoc loaders.lora_pipeline.WanLoraLoaderMixin

SkyReelsV2LoraLoaderMixin

autodoc loaders.lora_pipeline.SkyReelsV2LoraLoaderMixin

AmusedLoraLoaderMixin

autodoc loaders.lora_pipeline.AmusedLoraLoaderMixin

HiDreamImageLoraLoaderMixin

autodoc loaders.lora_pipeline.HiDreamImageLoraLoaderMixin

QwenImageLoraLoaderMixin

autodoc loaders.lora_pipeline.QwenImageLoraLoaderMixin

KandinskyLoraLoaderMixin

autodoc loaders.lora_pipeline.KandinskyLoraLoaderMixin
