mirror of
https://github.com/huggingface/diffusers.git
synced 2026-01-27 17:22:53 +03:00
let's go Flux2 🚀 (#12711)
* add vae * Initial commit for Flux 2 Transformer implementation * add pipeline part * small edits to the pipeline and conversion * update conversion script * fix * up up * finish pipeline * Remove Flux IP Adapter logic for now * Remove deprecated 3D id logic * Remove ControlNet logic for now * Add link to ViT-22B paper as reference for parallel transformer blocks such as the Flux 2 single stream block * update pipeline * Don't use biases for input projs and output AdaNorm * up * Remove bias for double stream block text QKV projections * Add script to convert Flux 2 transformer to diffusers * make style and make quality * fix a few things. * allow sft files to go. * fix image processor * fix batch * style a bit * Fix some bugs in Flux 2 transformer implementation * Fix dummy input preparation and fix some test bugs * fix dtype casting in timestep guidance module. * resolve conflicts., * remove ip adapter stuff. * Fix Flux 2 transformer consistency test * Fix bug in Flux2TransformerBlock (double stream block) * Get remaining Flux 2 transformer tests passing * make style; make quality; make fix-copies * remove stuff. * fix type annotaton. * remove unneeded stuff from tests * tests * up * up * add sf support * Remove unused IP Adapter and ControlNet logic from transformer (#9) * copied from * Apply suggestions from code review Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: apolinário <joaopaulo.passos@gmail.com> * up * up * up * up * up * Refactor Flux2Attention into separate classes for double stream and single stream attention * Add _supports_qkv_fusion to AttentionModuleMixin to allow subclasses to disable QKV fusion * Have Flux2ParallelSelfAttention inherit from AttentionModuleMixin with _supports_qkv_fusion=False * Log debug message when calling fuse_projections on a AttentionModuleMixin subclass that does not support QKV fusion * Address review comments * Update src/diffusers/pipelines/flux2/pipeline_flux2.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * up * Remove maybe_allow_in_graph decorators for Flux 2 transformer blocks (#12) * up * support ostris loras. (#13) * up * update schdule * up * up (#17) * add training scripts (#16) * add training scripts Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com> * model cpu offload in validation. * add flux.2 readme * add img2img and tests * cpu offload in log validation * Apply suggestions from code review * fix * up * fixes * remove i2i training tests for now. --------- Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com> Co-authored-by: linoytsaban <linoy@huggingface.co> * up --------- Co-authored-by: yiyixuxu <yixu310@gmail.com> Co-authored-by: Daniel Gu <dgu8957@gmail.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-10-53-87-203.ec2.internal> Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: apolinário <joaopaulo.passos@gmail.com> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal> Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com> Co-authored-by: linoytsaban <linoy@huggingface.co>
This commit is contained in:
@@ -349,6 +349,8 @@
|
||||
title: DiTTransformer2DModel
|
||||
- local: api/models/easyanimate_transformer3d
|
||||
title: EasyAnimateTransformer3DModel
|
||||
- local: api/models/flux2_transformer
|
||||
title: Flux2Transformer2DModel
|
||||
- local: api/models/flux_transformer
|
||||
title: FluxTransformer2DModel
|
||||
- local: api/models/hidream_image_transformer
|
||||
@@ -525,6 +527,8 @@
|
||||
title: EasyAnimate
|
||||
- local: api/pipelines/flux
|
||||
title: Flux
|
||||
- local: api/pipelines/flux2
|
||||
title: Flux2
|
||||
- local: api/pipelines/control_flux_inpaint
|
||||
title: FluxControlInpaint
|
||||
- local: api/pipelines/hidream
|
||||
|
||||
@@ -30,7 +30,8 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
|
||||
- [`CogView4LoraLoaderMixin`] provides similar functions for [CogView4](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogview4).
|
||||
- [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
|
||||
- [`HiDreamImageLoraLoaderMixin`] provides similar functions for [HiDream Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hidream)
|
||||
- [`QwenImageLoraLoaderMixin`] provides similar functions for [Qwen Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/qwen)
|
||||
- [`QwenImageLoraLoaderMixin`] provides similar functions for [Qwen Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/qwen).
|
||||
- [`Flux2LoraLoaderMixin`] provides similar functions for [Flux2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux2).
|
||||
- [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.
|
||||
|
||||
> [!TIP]
|
||||
@@ -56,6 +57,10 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
|
||||
|
||||
[[autodoc]] loaders.lora_pipeline.FluxLoraLoaderMixin
|
||||
|
||||
## Flux2LoraLoaderMixin
|
||||
|
||||
[[autodoc]] loaders.lora_pipeline.Flux2LoraLoaderMixin
|
||||
|
||||
## CogVideoXLoraLoaderMixin
|
||||
|
||||
[[autodoc]] loaders.lora_pipeline.CogVideoXLoraLoaderMixin
|
||||
|
||||
19
docs/source/en/api/models/flux2_transformer.md
Normal file
19
docs/source/en/api/models/flux2_transformer.md
Normal file
@@ -0,0 +1,19 @@
|
||||
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Flux2Transformer2DModel
|
||||
|
||||
A Transformer model for image-like data from [Flux2](https://hf.co/black-forest-labs/FLUX.2-dev).
|
||||
|
||||
## Flux2Transformer2DModel
|
||||
|
||||
[[autodoc]] Flux2Transformer2DModel
|
||||
33
docs/source/en/api/pipelines/flux2.md
Normal file
33
docs/source/en/api/pipelines/flux2.md
Normal file
@@ -0,0 +1,33 @@
|
||||
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||
the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations under the License.
|
||||
-->
|
||||
|
||||
# Flux2
|
||||
|
||||
<div class="flex flex-wrap space-x-1">
|
||||
<img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
|
||||
<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
|
||||
</div>
|
||||
|
||||
Flux.2 is the recent series of image generation models from Black Forest Labs, preceded by the [Flux.1](./flux.md) series. It is an entirely new model with a new architecture and pre-training done from scratch!
|
||||
|
||||
Original model checkpoints for Flux can be found [here](https://huggingface.co/black-forest-labs). Original inference code can be found [here](https://github.com/black-forest-labs/flux2-dev).
|
||||
|
||||
> [!TIP]
|
||||
> Flux2 can be quite expensive to run on consumer hardware devices. However, you can perform a suite of optimizations to run it faster and in a more memory-friendly manner. Check out [this section](https://huggingface.co/blog/sd3#memory-optimizations-for-sd3) for more details. Additionally, Flux can benefit from quantization for memory efficiency with a trade-off in inference latency. Refer to [this blog post](https://huggingface.co/blog/quanto-diffusers) to learn more.
|
||||
>
|
||||
> [Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.
|
||||
|
||||
## Flux2Pipeline
|
||||
|
||||
[[autodoc]] Flux2Pipeline
|
||||
- all
|
||||
- __call__
|
||||
Reference in New Issue
Block a user