1
0
mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

4446 Commits

Author SHA1 Message Date
Aryan
c9ff360966 Release: v0.30.3 v0.30.3 2024-09-17 06:31:37 +02:00
Yuxuan.Zhang
27b5786b14 CogVideoX-5b-I2V support (#9418)
* draft Init

* draft

* vae encode image

* make style

* image latents preparation

* remove image encoder from conversion script

* fix minor bugs

* make pipeline work

* make style

* remove debug prints

* fix imports

* update example

* make fix-copies

* add fast tests

* fix import

* update vae

* update docs

* update image link

* apply suggestions from review

* apply suggestions from review

* add slow test

* make use of learned positional embeddings

* apply suggestions from review

* doc change

* Update convert_cogvideox_to_diffusers.py

* make style

* final changes

* make style

* fix tests

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2024-09-17 06:29:52 +02:00
Aryan
c7407a3f3b [refactor] move positional embeddings to patch embed layer for CogVideoX (#9263)
* remove frame limit in cogvideox

* remove debug prints

* Update src/diffusers/models/transformers/cogvideox_transformer_3d.py

* revert pipeline; remove frame limitation

* revert transformer changes

* address review comments

* add error message

* apply suggestions from review
2024-09-17 06:29:46 +02:00
Dhruv Nair
7be261169a [CI] Quick fix for Cog Video Test (#9373)
update
2024-09-17 06:27:18 +02:00
Aryan
c980e4bed9 [core] CogVideoX memory optimizations in VAE encode (#9340)
fake context parallel cache, vae encode tiling

(cherry picked from commit bf890bca0e)
2024-09-17 06:27:04 +02:00
Aryan
6dd94b37e3 [core] Support VideoToVideo with CogVideoX (#9333)
* add vid2vid pipeline for cogvideox

* make fix-copies

* update docs

* fake context parallel cache, vae encode tiling

* add test for cog vid2vid

* use video link from HF docs repo

* add copied from comments; correctly rename test class
2024-09-17 06:26:50 +02:00
Álvaro Somoza
f63c12633f Release: v0.30.2 v0.30.2 2024-08-30 23:28:03 +00:00
YiYi Xu
be5995a815 update runway repo for single_file (#9323)
update to a place holder
2024-08-30 23:26:24 +00:00
Dhruv Nair
065978474b Fix Flux CLIP prompt embeds repeat for num_images_per_prompt > 1 (#9280)
update
2024-08-30 23:26:01 +00:00
Álvaro Somoza
cc1e589537 [IP Adapter] Fix cache_dir and local_files_only for image encoder (#9272)
initial fix
2024-08-30 23:22:40 +00:00
YiYi Xu
8b9bfaea80 Release v0.30.1 v0.30.1 2024-08-23 15:24:29 -10:00
Dhruv Nair
b12c7f8390 [Single File] Support loading Comfy UI Flux checkpoints (#9243)
update
2024-08-23 15:19:50 -10:00
zR
06f36713ae Cogvideox-5B Model adapter change (#9203)
* draft of embedding

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2024-08-23 15:17:20 -10:00
Aryan
19c5d7b376 [tests] fix broken xformers tests (#9206)
* fix xformers tests

* remove unnecessary modifications to cogvideox tests

* update
2024-08-23 15:16:58 -10:00
Sayak Paul
99a64aa63c [Flux LoRA] support parsing alpha from a flux lora state dict. (#9236)
* support parsing alpha from a flux lora state dict.

* conditional import.

* fix breaking changes.

* safeguard alpha.

* fix
2024-08-23 15:11:29 -10:00
Dhruv Nair
1bb419672d [Single File] Fix configuring scheduler via legacy kwargs (#9229)
update
2024-08-23 15:11:06 -10:00
Simo Ryu
a655574710 Add Learned PE selection for Auraflow (#9182)
* add pe

* Update src/diffusers/models/transformers/auraflow_transformer_2d.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/transformers/auraflow_transformer_2d.py

* beauty

* retrigger ci.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-23 15:10:13 -10:00
Aryan
67a80dfbd5 [refactor] CogVideoX followups + tiled decoding support (#9150)
* refactor context parallel cache; update torch compile time benchmark

* add tiling support

* make style

* remove num_frames % 8 == 0 requirement

* update default num_frames to original value

* add explanations + refactor

* update torch compile example

* update docs

* update

* clean up if-statements

* address review comments

* add test for vae tiling

* update docs

* update docs

* update docstrings

* add modeling test for cogvideox transformer

* make style
2024-08-23 15:09:38 -10:00
Dhruv Nair
1f77300d23 Update Video Loading/Export to use imageio (#9094)
* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-23 15:09:10 -10:00
sayakpaul
8a79d8ec39 Release: v0.30.0 v0.30.0 2024-08-07 13:00:43 +05:30
zR
2dad462d9b Add CogVideoX text-to-video generation model (#9082)
* add CogVideoX

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-08-06 21:23:57 -10:00
Dhruv Nair
e3568d14ba Freenoise change vae_batch_size to decode_chunk_size (#9110)
* update

* update
2024-08-07 12:47:18 +05:30
Aryan
f6df22447c [feat] allow sparsectrl to be loaded from single file (#9073)
* allow sparsectrl to be loaded with single file

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-08-07 11:12:30 +05:30
latentCall145
9b5180cb5f Flux fp16 inference fix (#9097)
* clipping for fp16

* fix typo

* added fp16 inference to docs

* fix docs typo

* include link for fp16 investigation

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-07 10:54:20 +05:30
Aryan
16a93f1a25 [core] FreeNoise (#8948)
* initial work draft for freenoise; needs massive cleanup

* fix freeinit bug

* add animatediff controlnet implementation

* revert attention changes

* add freenoise

* remove old helper functions

* add decode batch size param to all pipelines

* make style

* fix copied from comments

* make fix-copies

* make style

* copy animatediff controlnet implementation from #8972

* add experimental support for num_frames not perfectly fitting context length, ocntext stride

* make unet motion model lora work again based on #8995

* copy load video utils from #8972

* copied from AnimateDiff::prepare_latents

* address the case where last batch of frames does not match length of indices in prepare latents

* decode_batch_size->vae_batch_size; batch vae encode support in animatediff vid2vid

* revert sparsectrl and sdxl freenoise changes

* revert pia

* add freenoise tests

* make fix-copies

* improve docstrings

* add freenoise tests to animatediff controlnet

* update tests

* Update src/diffusers/models/unets/unet_motion_model.py

* add freenoise to animatediff pag

* address review comments

* make style

* update tests

* make fix-copies

* fix error message

* remove copied from comment

* fix imports in tests

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-08-07 10:35:18 +05:30
Sayak Paul
2d753b6fb5 fix train_dreambooth_lora_sd3.py loading hook (#9107) 2024-08-07 10:09:47 +05:30
Álvaro Somoza
39e1f7eaa4 [Kolors] Add PAG (#8934)
* txt2img pag added

* autopipe added, fixed case

* style

* apply suggestions

* added fast tests, added todo tests

* revert dummy objects for kolors

* fix pag dummies

* fix test imports

* update pag tests

* add kolor pag to docs

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-07 09:29:52 +05:30
Dhruv Nair
e1b603dc2e [Single File] Add single file support for Flux Transformer (#9083)
* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-07 08:49:57 +05:30
Marc Sun
e4325606db Fix loading sharded checkpoints when we have variants (#9061)
* Fix loading sharded checkpoint when we have variant

* add test

* remote print

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-06 13:38:44 -10:00
Ahn Donghoon (안동훈 / suno)
926daa30f9 add PAG support for Stable Diffusion 3 (#8861)
add pag sd3


---------

Co-authored-by: HyoungwonCho <jhw9811@korea.ac.kr>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: crepejung00 <jaewoojung00@naver.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2024-08-06 09:11:35 -10:00
Dhruv Nair
325a5de3a9 [Docs] Add community projects section to docs (#9013)
* update

* update

* update
2024-08-06 08:59:39 -07:00
Dhruv Nair
4c6152c2fb update 2024-08-06 12:00:14 +00:00
Vinh H. Pham
87e50a2f1d [Tests] Improve transformers model test suite coverage - Hunyuan DiT (#8916)
* add hunyuan model test

* apply suggestions

* reduce dims further

* reduce dims further

* run make style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-06 12:59:30 +05:30
Aryan
a57a7af45c [bug] remove unreachable norm_type=ada_norm_continuous from norm3 initialization conditions (#9006)
remove ada_norm_continuous from norm3 list

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-06 07:23:48 +05:30
Sayak Paul
52f1378e64 [Core] add QKV fusion to AuraFlow and PixArt Sigma (#8952)
* add fusion support to pixart

* add to auraflow.

* add tests

* apply review feedback.

* add back args and kwargs

* style
2024-08-05 14:09:37 -10:00
Tolga Cangöz
3dc97bd148 Update CLIPFeatureExtractor to CLIPImageProcessor and DPTFeatureExtractor to DPTImageProcessor (#9002)
* fix: update `CLIPFeatureExtractor` to `CLIPImageProcessor` in codebase

* `make style && make quality`

* Update `DPTFeatureExtractor` to `DPTImageProcessor` in codebase

* `make style`

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2024-08-05 09:20:29 -10:00
omahs
6d32b29239 Fix typos (#9077)
* fix typo
2024-08-05 09:00:08 -10:00
YiYi Xu
bc3c73ad0b add sentencepiece as a soft dependency (#9065)
* add sentencepiece as  soft dependency for kolors

* up

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-05 08:04:51 -10:00
Sayak Paul
5934873b8f [Docs] add stable cascade unet doc. (#9066)
* add stable cascade unet doc.

* fix path
2024-08-05 21:28:48 +05:30
Aryan
b7058d142c PAG variant for HunyuanDiT, PAG refactor (#8936)
* copy hunyuandit pipeline

* pag variant of hunyuan dit

* add tests

* update docs

* make style

* make fix-copies

* Update src/diffusers/pipelines/pag/pag_utils.py

* remove incorrect copied from

* remove pag hunyuan attn procs to resolve conflicts

* add pag attn procs again

* new implementation for pag_utils

* revert pag changes

* add pag refactor back; update pixart sigma

* update pixart pag tests

* apply suggestions from review

Co-Authored-By: yixu310@gmail.com

* make style

* update docs, fix tests

* fix tests

* fix test_components_function since list not accepted as valid __init__ param

* apply patch to fix broken tests

Co-Authored-By: Sayak Paul <spsayakpaul@gmail.com>

* make style

* fix hunyuan tests

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-05 17:56:09 +05:30
Vinh H. Pham
e1d508ae92 [Tests] Improve transformers model test suite coverage - Latte (#8919)
* add LatteTransformer3DModel model test

* change patch_size to 1

* reduce req len

* reduce channel dims

* increase num_layers

* reduce dims further

* run make style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2024-08-05 17:13:03 +05:30
Sayak Paul
fc6a91e383 [FLUX] support LoRA (#9057)
* feat: lora support for Flux.

add tests

fix imports

major fixes.

* fix

fixes

final fixes?

* fix

* remove is_peft_available.
2024-08-05 10:24:05 +05:30
Aryan
2b76099610 [refactor] apply qk norm in attention processors (#9071)
* apply qk norm in attention processors

* revert attention processor

* qk-norm in only attention proc 2.0 and fused variant
2024-08-04 05:42:46 -10:00
psychedelicious
4f0d01d387 type get_attention_scores as optional in get_attention_scores (#9075)
`None` is valid for `get_attention_scores`, should be typed as such
2024-08-04 17:19:05 +05:30
asfiyab-nvidia
3dc10a535f Update TensorRT txt2img and inpaint community pipelines (#9037)
* Update TensorRT txt2img and inpaint community pipelines

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* update tensorrt install instructions

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-08-04 16:00:40 +05:30
Sayak Paul
c370b90ff1 [Flux] minor documentation fixes for flux. (#9048)
* minor documentation fixes for flux.

* clipskip

* add gist
2024-08-04 15:53:01 +05:30
Philip Rideout
ebf3ab1477 Fix grammar mistake. (#9072) 2024-08-04 04:32:03 +05:30
Aryan
fbe29c6298 [refactor] create modeling blocks specific to AnimateDiff (#8979)
* animatediff specific transformer model

* make style

* make fix-copies

* move blocks to unet motion model

* make style

* remove dummy object

* fix incorrectly passed param causing test failures

* rename model and output class

* fix sparsectrl imports

* remove todo comments

* remove temporal double self attn param from controlnet sparsectrl

* add deprecated versions of blocks

* apply suggestions from review

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-08-03 13:03:39 +05:30
Tolga Cangöz
7071b7461b Errata: Fix typos & \s+$ (#9008)
* Fix typos

* chore: Fix typos

* chore: Update README.md for promptdiffusion example

* Trim trailing white spaces

* Fix a typo

* update number

* chore: update number

* Trim trailing white space

* Update README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-08-02 21:24:25 -07:00
Frank (Haofan) Wang
a054c78495 Update transformer_flux.py (#9060) 2024-08-03 08:58:32 +05:30