1
0
mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
Commit Graph

912 Commits

Author SHA1 Message Date
Dhruv Nair
e24941b2a7 [Single File] Add GGUF support (#9964)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/gguf/utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/gguf.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-17 16:09:37 +05:30
Aryan
f9d5a9324d [docs] Clarify dtypes for Sana (#10248)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-17 13:43:24 +05:30
Sayak Paul
6fb94d51cb [chore] add contribution note for lawrence. (#10253)
add contribution note for lawrence.
2024-12-17 09:17:40 +05:30
Steven Liu
7667cfcb41 [docs] Add missing AttnProcessors (#10246)
* attnprocessors

* lora

* make style

* fix

* fix

* sana

* typo
2024-12-16 15:36:26 -08:00
Aryan
9f00c617a0 [core] TorchAO Quantizer (#10009)
* torchao quantizer


---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-16 13:35:40 -10:00
Sayak Paul
ea893a9ae7 [Docs] add rest of the lora loader mixins to the docs. (#10230)
add rest of the lora loader mixins to the docs.
2024-12-16 08:50:27 -08:00
Aryan
aace1f412b [core] Hunyuan Video (#10136)
* copy transformer

* copy vae

* copy pipeline

* make fix-copies

* refactor; make original code work with diffusers; test latents for comparison generated with this commit

* move rope into pipeline; remove flash attention; refactor

* begin conversion script

* make style

* refactor attention

* refactor

* refactor final layer

* their mlp -> our feedforward

* make style

* add docs

* refactor layer names

* refactor modulation

* cleanup

* refactor norms

* refactor activations

* refactor single blocks attention

* refactor attention processor

* make style

* cleanup a bit

* refactor double transformer block attention

* update mochi attn proc

* use diffusers attention implementation in all modules; checkpoint for all values matching original

* remove helper functions in vae

* refactor upsample

* refactor causal conv

* refactor resnet

* refactor

* refactor

* refactor

* grad checkpointing

* autoencoder test

* fix scaling factor

* refactor clip

* refactor llama text encoding

* add coauthor

Co-Authored-By: "Gregory D. Hunkins" <greg@ollano.com>

* refactor rope; diff: 0.14990234375; reason and fix: create rope grid on cpu and move to device

Note: The following line diverges from original behaviour. We create the grid on the device, whereas
original implementation creates it on CPU and then moves it to device. This results in numerical
differences in layerwise debugging outputs, but visually it is the same.

* use diffusers timesteps embedding; diff: 0.10205078125

* rename

* convert

* update

* add tests for transformer

* add pipeline tests; text encoder 2 is not optional

* fix attention implementation for torch

* add example

* update docs

* update docs

* apply suggestions from review

* refactor vae

* update

* Apply suggestions from code review

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/hunyuan_video/pipeline_hunyuan_video.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/hunyuan_video/pipeline_hunyuan_video.py

Co-authored-by: hlky <hlky@hlky.ac>

* make fix-copies

* update

---------

Co-authored-by: "Gregory D. Hunkins" <greg@ollano.com>
Co-authored-by: hlky <hlky@hlky.ac>
2024-12-16 13:56:18 +05:30
Sayak Paul
e68092a471 [docs] minor stuff to ltx video docs. (#10229)
minor stuff to ltx video docs.
2024-12-16 12:24:14 +05:30
Sayak Paul
3bf5400a64 Update sana.md with minor corrections (#10232) 2024-12-16 10:26:06 +05:30
Junsong Chen
5a196e3d46 [Sana] Add Sana, including SanaPipeline, SanaPAGPipeline, LinearAttentionProcessor, Flow-based DPM-sovler and so on. (#9982)
* first add a script for DC-AE;

* DC-AE init

* replace triton with custom implementation

* 1. rename file and remove un-used codes;

* no longer rely on omegaconf and dataclass

* replace custom activation with diffuers activation

* remove dc_ae attention in attention_processor.py

* iinherit from ModelMixin

* inherit from ConfigMixin

* dc-ae reduce to one file

* update downsample and upsample

* clean code

* support DecoderOutput

* remove get_same_padding and val2tuple

* remove autocast and some assert

* update ResBlock

* remove contents within super().__init__

* Update src/diffusers/models/autoencoders/dc_ae.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove opsequential

* update other blocks to support the removal of build_norm

* remove build encoder/decoder project in/out

* remove inheritance of RMSNorm2d from LayerNorm

* remove reset_parameters for RMSNorm2d

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove device and dtype in RMSNorm2d __init__

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove op_list & build_block

* remove build_stage_main

* change file name to autoencoder_dc

* move LiteMLA to attention.py

* align with other vae decode output;

* add DC-AE into init files;

* update

* make quality && make style;

* quick push before dgx disappears again

* update

* make style

* update

* update

* fix

* refactor

* refactor

* refactor

* update

* possibly change to nn.Linear

* refactor

* make fix-copies

* replace vae with ae

* replace get_block_from_block_type to get_block

* replace downsample_block_type from Conv to conv for consistency

* add scaling factors

* incorporate changes for all checkpoints

* make style

* move mla to attention processor file; split qkv conv to linears

* refactor

* add tests

* from original file loader

* add docs

* add standard autoencoder methods

* combine attention processor

* fix tests

* update

* minor fix

* minor fix

* minor fix & in/out shortcut rename

* minor fix

* make style

* fix paper link

* update docs

* update single file loading

* make style

* remove single file loading support; todo for DN6

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add abstract

* 1. add DCAE into diffusers;
2. make style and make quality;

* add DCAE_HF into diffusers;

* bug fixed;

* add SanaPipeline, SanaTransformer2D into diffusers;

* add sanaLinearAttnProcessor2_0;

* first update for SanaTransformer;

* first update for SanaPipeline;

* first success run SanaPipeline;

* model output finally match with original model with the same intput;

* code update;

* code update;

* add a flow dpm-solver scripts

* 🎉[important update]
1. Integrate flow-dpm-sovler into diffusers;
2. finally run successfully on both `FlowMatchEulerDiscreteScheduler` and `FlowDPMSolverMultistepScheduler`;

* 🎉🔧[important update & fix huge bugs!!]
1. add SanaPAGPipeline & several related Sana linear attention operators;
2. `SanaTransformer2DModel` not supports multi-resolution input;
2. fix the multi-scale HW bugs in SanaPipeline and SanaPAGPipeline;
3. fix the flow-dpm-solver set_timestep() init `model_output` and `lower_order_nums` bugs;

* remove prints;

* add convert sana official checkpoint to diffusers format Safetensor.

* Update src/diffusers/models/transformers/sana_transformer_2d.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/pag/pipeline_pag_sana.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/sana_transformer_2d.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/sana/pipeline_sana.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/sana/pipeline_sana.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update Sana for DC-AE's recent commit;

* make style && make quality

* Add StableDiffusion3PAGImg2Img Pipeline + Fix SD3 Unconditional PAG (#9932)

* fix progress bar updates in SD 1.5 PAG Img2Img pipeline

---------

Co-authored-by: Vinh H. Pham <phamvinh257@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* make the vae can be None in `__init__` of `SanaPipeline`

* Update src/diffusers/models/transformers/sana_transformer_2d.py

Co-authored-by: hlky <hlky@hlky.ac>

* change the ae related code due to the latest update of DCAE branch;

* change the ae related code due to the latest update of DCAE branch;

* 1. change code based on AutoencoderDC;
2. fix the bug of new GLUMBConv;
3. run success;

* update for solving conversation.

* 1. fix bugs and run convert script success;
2. Downloading ckpt from hub automatically;

* make style && make quality;

* 1. remove un-unsed parameters in init;
2. code update;

* remove test file

* refactor; add docs; add tests; update conversion script

* make style

* make fix-copies

* refactor

* udpate pipelines

* pag tests and refactor

* remove sana pag conversion script

* handle weight casting in conversion script

* update conversion script

* add a processor

* 1. add bf16 pth file path;
2. add complex human instruct in pipeline;

* fix fast \tests

* change gemma-2-2b-it ckpt to a non-gated repo;

* fix the pth path bug in conversion script;

* change grad ckpt to original; make style

* fix the complex_human_instruct bug and typo;

* remove dpmsolver flow scheduler

* apply review suggestions

* change the `FlowMatchEulerDiscreteScheduler` to default `DPMSolverMultistepScheduler` with flow matching scheduler.

* fix the tokenizer.padding_side='right' bug;

* update docs

* make fix-copies

* fix imports

* fix docs

* add integration test

* update docs

* update examples

* fix convert_model_output in schedulers

* fix failing tests

---------

Co-authored-by: Junyu Chen <chenjydl2003@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: chenjy2003 <70215701+chenjy2003@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2024-12-16 02:16:56 +05:30
Aryan
96c376a5ff [core] LTX Video (#10021)
* transformer

* make style & make fix-copies

* transformer

* add transformer tests

* 80% vae

* make style

* make fix-copies

* fix

* undo cogvideox changes

* update

* update

* match vae

* add docs

* t2v pipeline working; scheduler needs to be checked

* docs

* add pipeline test

* update

* update

* make fix-copies

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update

* copy t2v to i2v pipeline

* update

* apply review suggestions

* update

* make style

* remove framewise encoding/decoding

* pack/unpack latents

* image2video

* update

* make fix-copies

* update

* update

* rope scale fix

* debug layerwise code

* remove debug

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* propagate precision changes to i2v pipeline

* remove downcast

* address review comments

* fix comment

* address review comments

* [Single File] LTX support for loading original weights (#10135)

* from original file mixin for ltx

* undo config mapping fn changes

* update

* add single file to pipelines

* update docs

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py

* rename classes based on ltx review

* point to original repository for inference

* make style

* resolve conflicts correctly

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-12-12 16:21:28 +05:30
hlky
914a585be8 Add ControlNetUnion (#10131)
* ControlNetUnion model
2024-12-11 07:07:50 -10:00
Dhruv Nair
ad40e26515 [Single File] Add single file support for AutoencoderDC (#10183)
* update

* update

* update
2024-12-11 16:57:36 +05:30
Darshil Jariwala
65b98b5da4 Add PAG Support for Stable Diffusion Inpaint Pipeline (#9386)
* using sd inpaint pipeline and sdxl pag inpaint pipeline to add changes

* using sd inpaint pipeline and sdxl pag inpaint pipeline to add changes

* finished the call function

* added auto pipeline

* merging diffusers

* ready to test

* ready to test

* added copied from and removed unnecessary tests

* make style changes

* doc changes

* updating example doc string

* style fix

* init

* adding imports

* quality

* Update src/diffusers/pipelines/pag/pipeline_pag_sd_inpaint.py

* make

* Update tests/pipelines/pag/test_pag_sd_inpaint.py

* slice and size

* slice

---------

Co-authored-by: Darshil Jariwala <darshiljariwala@Darshils-MacBook-Air.local>
Co-authored-by: Darshil Jariwala <jariwala.darshil2002@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
2024-12-10 21:06:31 +00:00
Aryan
49a9143479 Flux Control LoRA (#9999)
* update


---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-10 09:08:13 -10:00
Junsong Chen
cd892041e2 [DC-AE] Add the official Deep Compression Autoencoder code(32x,64x,128x compression ratio); (#9708)
* first add a script for DC-AE;

* DC-AE init

* replace triton with custom implementation

* 1. rename file and remove un-used codes;

* no longer rely on omegaconf and dataclass

* replace custom activation with diffuers activation

* remove dc_ae attention in attention_processor.py

* iinherit from ModelMixin

* inherit from ConfigMixin

* dc-ae reduce to one file

* update downsample and upsample

* clean code

* support DecoderOutput

* remove get_same_padding and val2tuple

* remove autocast and some assert

* update ResBlock

* remove contents within super().__init__

* Update src/diffusers/models/autoencoders/dc_ae.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove opsequential

* update other blocks to support the removal of build_norm

* remove build encoder/decoder project in/out

* remove inheritance of RMSNorm2d from LayerNorm

* remove reset_parameters for RMSNorm2d

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove device and dtype in RMSNorm2d __init__

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/dc_ae.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove op_list & build_block

* remove build_stage_main

* change file name to autoencoder_dc

* move LiteMLA to attention.py

* align with other vae decode output;

* add DC-AE into init files;

* update

* make quality && make style;

* quick push before dgx disappears again

* update

* make style

* update

* update

* fix

* refactor

* refactor

* refactor

* update

* possibly change to nn.Linear

* refactor

* make fix-copies

* replace vae with ae

* replace get_block_from_block_type to get_block

* replace downsample_block_type from Conv to conv for consistency

* add scaling factors

* incorporate changes for all checkpoints

* make style

* move mla to attention processor file; split qkv conv to linears

* refactor

* add tests

* from original file loader

* add docs

* add standard autoencoder methods

* combine attention processor

* fix tests

* update

* minor fix

* minor fix

* minor fix & in/out shortcut rename

* minor fix

* make style

* fix paper link

* update docs

* update single file loading

* make style

* remove single file loading support; todo for DN6

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add abstract

---------

Co-authored-by: Junyu Chen <chenjydl2003@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: chenjy2003 <70215701+chenjy2003@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-07 01:01:51 +05:30
Aritra Roy Gosthipaty
bf64b32652 [Guide] Quantize your Diffusion Models with bnb (#10012)
* chore: initial draft

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* chore: link in place

* chore: review suggestions

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* chore: review suggestions

* Update docs/source/en/quantization/bitsandbytes.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* review suggestions

* chore: review suggestions

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* adding same changes to 4 bit section

* review suggestions

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-05 13:54:03 -08:00
Steven Liu
0d11ab26c4 [docs] load_lora_adapter (#10119)
* load_lora_adapter

* save

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-05 08:00:03 +05:30
Benjamin Paine
63b631f383 Add StableDiffusion3PAGImg2Img Pipeline + Fix SD3 Unconditional PAG (#9932)
* fix progress bar updates in SD 1.5 PAG Img2Img pipeline



---------

Co-authored-by: Vinh H. Pham <phamvinh257@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-02 21:39:47 -10:00
Lucain
0763a7edf4 Let server decide default repo visibility (#10047) 2024-12-02 17:15:46 -10:00
ChG
c44fba8899 fix link in the docs (#10058)
* fix link in the docs

* fix same issue for ko
2024-12-02 11:45:12 -08:00
Parag Ekbote
922c5f5c3c Fixed Nits in Evaluation Docs (#10063)
Minor fixes and script improvement in evaluation
docs.
2024-12-02 10:50:00 -08:00
Álvaro Somoza
fdec8bd675 Change image_gen_aux repository URL (#10048)
change image_gen_aux repo url
2024-11-28 12:57:55 -05:00
Aryan
ad5ecd1251 [docs] Fix CogVideoX table (#10008)
* fix

* fix
2024-11-26 09:14:14 -08:00
Sayak Paul
047bf49291 [Docs] add: missing pipelines from the spec. (#10005)
add: missing pipelines from the spec.
2024-11-25 00:27:59 -10:00
Aryan
7ac6e286ee Flux Fill, Canny, Depth, Redux (#9985)
* update

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-11-23 01:41:25 -10:00
Yuxuan.Zhang
3b2830618d CogVideoX 1.5 (#9877)
* CogVideoX1_1PatchEmbed test

* 1360 * 768

* refactor

* make style

* update docs

* add modeling tests for cogvideox 1.5

* update

* make fix-copies

* add ofs embed(for convert)

* add ofs embed(for convert)

* more resolution for cogvideox1.5-5b-i2v

* use even number of latent frames only

* update pipeline implementations

* make style

* set patch_size_t as None by default

* #skip frames 0

* refactor

* make style

* update docs

* fix ofs_embed

* update docs

* invert_scale_latents

* update

* fix

* Update docs/source/en/api/pipelines/cogvideox.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/transformers/cogvideox_transformer_3d.py

* update conversion script

* remove copied from

* fix test

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

* Update docs/source/en/api/pipelines/cogvideox.md

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-19 00:56:34 +05:30
Grant Sherrick
c3c94fe71b Add server example (#9918)
* Add server example.

* Minor updates to README.

* Add fixes after local testing.

* Apply suggestions from code review

Updates to README from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* More doc updates.

* Maybe this will work to build the docs correctly?

* Fix style issues.

* Fix toc.

* Minor reformatting.

* Move docs to proper loc.

* Fix missing tick.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Sync docs changes back to README.

* Very minor update to docs to add space.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-11-18 09:26:13 -08:00
Pakkapon Phongthawee
40ab1c03f3 add depth controlnet sd3 pre-trained checkpoints to docs (#9937) 2024-11-16 18:36:01 +05:30
Sayak Paul
ded3db164b [Core] introduce controlnet module (#8768)
* move vae flax module.

* controlnet module.

* prepare for PR.

* revert a commit

* gracefully deprecate controlnet deps.

* fix

* fix doc path

* fix-copies

* fix path

* style

* style

* conflicts

* fix

* fix-copies

* sparsectrl.

* updates

* fix

* updates

* updates

* updates

* fix

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-11-06 22:08:55 -04:00
Vahid Askari
a03bf4a531 Fix: Remove duplicated comma in distributed_inference.md (#9868)
Fix: Remove duplicated comma

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-11-05 23:37:11 +01:00
Aryan
3f329a426a [core] Mochi T2V (#9769)
* update

* udpate

* update transformer

* make style

* fix

* add conversion script

* update

* fix

* update

* fix

* update

* fixes

* make style

* update

* update

* update

* init

* update

* update

* add

* up

* up

* up

* update

* mochi transformer

* remove original implementation

* make style

* update inits

* update conversion script

* docs

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fix docs

* pipeline fixes

* make style

* invert sigmas in scheduler; fix pipeline

* fix pipeline num_frames

* flip proj and gate in swiglu

* make style

* fix

* make style

* fix tests

* latent mean and std fix

* update

* cherry-pick 1069d210e1

* remove additional sigma already handled by flow match scheduler

* fix

* remove hardcoded value

* replace conv1x1 with linear

* Update src/diffusers/pipelines/mochi/pipeline_mochi.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* framewise decoding and conv_cache

* make style

* Apply suggestions from code review

* mochi vae encoder changes

* rebase correctly

* Update scripts/convert_mochi_to_diffusers.py

* fix tests

* fixes

* make style

* update

* make style

* update

* add framewise and tiled encoding

* make style

* make original vae implementation behaviour the default; note: framewise encoding does not work

* remove framewise encoding implementation due to presence of attn layers

* fight test 1

* fight test 2

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-11-05 20:33:41 +05:30
YiYi Xu
d2e5cb3c10 Revert "[LoRA] fix: lora loading when using with a device_mapped mode… (#9823)
Revert "[LoRA] fix: lora loading when using with a device_mapped model. (#9449)"

This reverts commit 41e4779d98.
2024-10-31 08:19:32 -10:00
Sayak Paul
41e4779d98 [LoRA] fix: lora loading when using with a device_mapped model. (#9449)
* fix: lora loading when using with a device_mapped model.

* better attibutung

* empty

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* minors

* better error messages.

* fix-copies

* add: tests, docs.

* add hardware note.

* quality

* Update docs/source/en/training/distributed_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fixes

* skip properly.

* fixes

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-31 21:17:41 +05:30
Aryan
0d1d267b12 [core] Allegro T2V (#9736)
* update

* refactor transformer part 1

* refactor part 2

* refactor part 3

* make style

* refactor part 4; modeling tests

* make style

* refactor part 5

* refactor part 6

* gradient checkpointing

* pipeline tests (broken atm)

* update

* add coauthor

Co-Authored-By: Huan Yang <hyang@fastmail.com>

* refactor part 7

* add docs

* make style

* add coauthor

Co-Authored-By: YiYi Xu <yixu310@gmail.com>

* make fix-copies

* undo unrelated change

* revert changes to embeddings, normalization, transformer

* refactor part 8

* make style

* refactor part 9

* make style

* fix

* apply suggestions from review

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update example

* remove attention mask for self-attention

* update

* copied from

* update

* update

---------

Co-authored-by: Huan Yang <hyang@fastmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-29 13:14:36 +05:30
Jingya HUANG
52d4449810 Add a doc for AWS Neuron in Diffusers (#9766)
* start draft

* add doc

* Update docs/source/en/optimization/neuron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/neuron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/neuron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/neuron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/neuron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/neuron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/neuron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* bref intro of ON

* Update docs/source/en/optimization/neuron.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-25 08:24:58 -07:00
Dhruv Nair
0d9d98fe5f Fix typos (#9739)
* update

* update

* update

* update

* update

* update
2024-10-22 16:12:28 +05:30
Sayak Paul
60ffa84253 [bitsandbbytes] follow-ups (#9730)
* bnb follow ups.

* add a warning when dtypes mismatch.

* fx-copies

* clear cache.

* check_if_quantized_param

* add a check on shape.

* updates

* docs

* improve readability.

* resources.

* fix
2024-10-22 16:00:05 +05:30
YiYi Xu
e2d037bbf1 minor doc/test update (#9734)
* update some docs and tests!

---------

Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
2024-10-21 13:06:13 -10:00
Sayak Paul
d27ecc5960 [Docs] docs to xlabs controlnets. (#9688)
* docs to xlabs controlnets.

Co-authored-by: Anzhella Pankratova <son0shad@gmail.com>

* Apply suggestions from code review

Co-authored-by: Anzhella Pankratova <54744846+Anghellia@users.noreply.github.com>

---------

Co-authored-by: Anzhella Pankratova <son0shad@gmail.com>
Co-authored-by: Anzhella Pankratova <54744846+Anghellia@users.noreply.github.com>
2024-10-21 09:38:22 -07:00
Sayak Paul
b821f006d0 [Quantization] Add quantization support for bitsandbytes (#9213)
* quantization config.

* fix-copies

* fix

* modules_to_not_convert

* add bitsandbytes utilities.

* make progress.

* fixes

* quality

* up

* up

rotary embedding refactor 2: update comments, fix dtype for use_real=False (#9312)

fix notes and dtype

up

up

* minor

* up

* up

* fix

* provide credits where due.

* make configurations work.

* fixes

* fix

* update_missing_keys

* fix

* fix

* make it work.

* fix

* provide credits to transformers.

* empty commit

* handle to() better.

* tests

* change to bnb from bitsandbytes

* fix tests

fix slow quality tests

SD3 remark

fix

complete int4 tests

add a readme to the test files.

add model cpu offload tests

warning test

* better safeguard.

* change merging status

* courtesy to transformers.

* move  upper.

* better

* make the unused kwargs warning friendlier.

* harmonize changes with https://github.com/huggingface/transformers/pull/33122

* style

* trainin tests

* feedback part i.

* Add Flux inpainting and Flux Img2Img (#9135)

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>

Update `UNet2DConditionModel`'s error messages (#9230)

* refactor

[CI] Update Single file Nightly Tests (#9357)

* update

* update

feedback.

improve README for flux dreambooth lora (#9290)

* improve readme

* improve readme

* improve readme

* improve readme

fix one uncaught deprecation warning for accessing vae_latent_channels in VaeImagePreprocessor (#9372)

deprecation warning vae_latent_channels

add mixed int8 tests and more tests to nf4.

[core] Freenoise memory improvements (#9262)

* update

* implement prompt interpolation

* make style

* resnet memory optimizations

* more memory optimizations; todo: refactor

* update

* update animatediff controlnet with latest changes

* refactor chunked inference changes

* remove print statements

* update

* chunk -> split

* remove changes from incorrect conflict resolution

* remove changes from incorrect conflict resolution

* add explanation of SplitInferenceModule

* update docs

* Revert "update docs"

This reverts commit c55a50a271.

* update docstring for freenoise split inference

* apply suggestions from review

* add tests

* apply suggestions from review

quantization docs.

docs.

* Revert "Add Flux inpainting and Flux Img2Img (#9135)"

This reverts commit 5799954dd4.

* tests

* don

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* contribution guide.

* changes

* empty

* fix tests

* harmonize with https://github.com/huggingface/transformers/pull/33546.

* numpy_cosine_distance

* config_dict modification.

* remove if config comment.

* note for load_state_dict changes.

* float8 check.

* quantizer.

* raise an error for non-True low_cpu_mem_usage values when using quant.

* low_cpu_mem_usage shenanigans when using fp32 modules.

* don't re-assign _pre_quantization_type.

* make comments clear.

* remove comments.

* handle mixed types better when moving to cpu.

* add tests to check if we're throwing warning rightly.

* better check.

* fix 8bit test_quality.

* handle dtype more robustly.

* better message when keep_in_fp32_modules.

* handle dtype casting.

* fix dtype checks in pipeline.

* fix warning message.

* Update src/diffusers/models/modeling_utils.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* mitigate the confusing cpu warning

---------

Co-authored-by: Vishnu V Jaddipal <95531133+Gothos@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-10-21 10:11:57 +05:30
Aryan
8cabd4a0db [pipeline] CogVideoX-Fun Control (#9671)
* cogvideox-fun control

* make style

* make fix-copies

* karras schedulers

* Update src/diffusers/pipelines/cogvideo/pipeline_cogvideox_fun_control.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions from review

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-10-16 16:21:09 +05:30
glide-the
0d935df67d Docs: CogVideoX (#9578)
* CogVideoX docs


---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-10-15 14:41:56 -10:00
Yuxuan.Zhang
8d81564b27 CogView3Plus DiT (#9570)
* merge 9588

* max_shard_size="5GB" for colab running

* conversion script updates; modeling test; refactor transformer

* make fix-copies

* Update convert_cogview3_to_diffusers.py

* initial pipeline draft

* make style

* fight bugs 🐛🪳

* add example

* add tests; refactor

* make style

* make fix-copies

* add co-author

YiYi Xu <yixu310@gmail.com>

* remove files

* add docs

* add co-author

Co-Authored-By: YiYi Xu <yixu310@gmail.com>

* fight docs

* address reviews

* make style

* make model work

* remove qkv fusion

* remove qkv fusion tets

* address review comments

* fix make fix-copies error

* remove None and TODO

* for FP16(draft)

* make style

* remove dynamic cfg

* remove pooled_projection_dim as a parameter

* fix tests

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-10-14 19:30:36 +05:30
Jinzhe Pan
6a5f06488c [docs] Fix xDiT doc image damage (#9655)
* docs: fix xDiT doc image damage

* doc: move xdit images to hf dataset

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-10-12 13:05:07 +05:30
SahilCarterr
af28ae2d5b add PAG support for SD Img2Img (#9463)
* added pag to sd img2img pipeline


---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-10-09 10:40:58 -10:00
Sayak Paul
31058cdaef [LoRA] allow loras to be loaded with low_cpu_mem_usage. (#9510)
* allow loras to be loaded with low_cpu_mem_usage.

* add flux support but note https://github.com/huggingface/diffusers/pull/9510\#issuecomment-2378316687

* low_cpu_mem_usage.

* fix-copies

* fix-copies again

* tests

* _LOW_CPU_MEM_USAGE_DEFAULT_LORA

* _peft_version default.

* version checks.

* version check.

* version check.

* version check.

* require peft 0.13.1.

* explicitly specify low_cpu_mem_usage=False.

* docs.

* transformers version 4.45.2.

* update

* fix

* empty

* better name initialize_dummy_state_dict.

* doc todos.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

* fix-copies

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-09 10:57:16 +05:30
Sayak Paul
63a5c8742a Update distributed_inference.md to include transformer.device_map (#9553)
* Update distributed_inference.md to include `transformer.device_map`

* Update docs/source/en/training/distributed_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-10-08 08:03:51 +05:30
JuanCarlosPi
33fafe3d14 Add PAG support to StableDiffusionControlNetPAGInpaintPipeline (#8875)
* Add pag to controlnet inpainting pipeline


---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-09-30 20:04:42 -10:00
hlky
c4a8979f30 Add beta sigmas to other schedulers and update docs (#9538) 2024-09-30 09:00:54 -10:00