Tran Thanh Luan
6290fdfda4
[Feat] TaylorSeer Cache ( #12648 )
* init taylor_seer cache
* make compatible with any tuple size returned
* use logger for printing, add warmup feature
* still update in warmup steps
* refactor, add docs
* add configurable cache, skip compute module
* allow special cache ids only
* add stop_predicts (cooldown)
* update docs
* apply ruff
* update to handle multiple calls per timestep
* refactor to use state manager
* fix format & doc
* chores: naming, remove redundancy
* add docs
* quality & style
* fix taylor precision
* Apply style fixes
* add tests
* Apply style fixes
* Remove TaylorSeerCacheTesterMixin from flux2 tests
* rename identifiers, use more expressive taylor predict loop
* torch compile compatible
* Apply style fixes
* Update src/diffusers/hooks/taylorseer_cache.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* update docs
* make fix-copies
* fix example usage.
* remove tests on flux kontext
---------
Co-authored-by: toilaluan <toilaluan@github.com >
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2025-12-06 05:39:54 +05:30
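A minimal usage sketch for the entry above, assuming the new cache plugs into `CacheMixin.enable_cache` like the other cache techniques in this log; the `TaylorSeerCacheConfig` export name is an assumption:
```python
import torch
from diffusers import FluxPipeline, TaylorSeerCacheConfig  # config export name is an assumption

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")

# Enable TaylorSeer caching on the transformer with default settings; per the
# commit messages, warmup and stop_predicts (cooldown) windows are configurable.
pipe.transformer.enable_cache(TaylorSeerCacheConfig())

image = pipe("a photo of a cat", num_inference_steps=28).images[0]
```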
Yao Matrix
0e12ba7454
fix 3 xpu ut failures w/ latest pytorch ( #12408 )
fix xpu ut failures w/ latest pytorch
Signed-off-by: Yao, Matrix <matrix.yao@intel.com >
2025-09-30 14:07:48 +05:30
Dhruv Nair
7aa6af1138
[Refactor] Move testing utils out of src ( #12238 )
* update
* update
* update
* update
* update
* merge main
* Revert "merge main"
This reverts commit 65efbcead5 .
2025-08-28 19:53:02 +05:30
Aryan
18c8f10f20
[refactor] Flux/Chroma single file implementation + Attention Dispatcher ( #11916 )
* update
* update
* add coauthor
Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com >
* improve test
* handle ip adapter params correctly
* fix chroma qkv fusion test
* fix fastercache implementation
* fix more tests
* fight more tests
* add back set_attention_backend
* update
* update
* make style
* make fix-copies
* make ip adapter processor compatible with attention dispatcher
* refactor chroma as well
* remove rmsnorm assert
* minify and deprecate npu/xla processors
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
2025-07-17 17:30:39 +05:30
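A sketch of the `set_attention_backend` entry point restored in the PR above; the backend names ("flash", "native") are assumed from the dispatcher's documented options:
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")

# Route attention through the dispatcher's FlashAttention backend (requires
# flash-attn to be installed); "native" switches back to torch SDPA.
pipe.transformer.set_attention_backend("flash")
```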
Aryan
06fd427797
[tests] Improve Flux tests ( #11919 )
update
2025-07-15 10:47:41 +05:30
Aryan
0454fbb30b
First Block Cache ( #11180 )
* update
* modify flux single blocks to make compatible with cache techniques (without too much model-specific intrusion code)
* remove debug logs
* update
* cache context for different batches of data
* fix hs residual bug for single return outputs; support ltx
* fix controlnet flux
* support flux, ltx i2v, ltx condition
* update
* update
* Update docs/source/en/api/cache.md
* Update src/diffusers/hooks/hooks.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* address review comments pt. 1
* address review comments pt. 2
* cache context refactor; address review pt. 3
* address review comments
* metadata registration with decorators instead of centralized
* support cogvideox
* support mochi
* fix
* remove unused function
* remove central registry based on review
* update
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
2025-07-09 03:27:15 +05:30
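First Block Cache reuses the remaining blocks' work when the first transformer block's output residual barely changes between steps; a minimal sketch via `CacheMixin.enable_cache`:
```python
import torch
from diffusers import FluxPipeline
from diffusers.hooks import FirstBlockCacheConfig

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")

# Skip the remaining blocks whenever the first block's residual changes by
# less than the threshold; higher thresholds trade quality for speed.
pipe.transformer.enable_cache(FirstBlockCacheConfig(threshold=0.2))
```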
Dhruv Nair
cbc8ced20f
[CI] Fix big GPU test marker ( #11786 )
* update
* update
2025-07-08 22:09:09 +05:30
Vương Đình Minh
d6fa3298fa
update: FluxKontextInpaintPipeline support ( #11820 )
* update: FluxKontextInpaintPipeline support
* fix: Refactor code, remove mask_image_latents and ruff check
* feat: Add test case and fix with pytest
* Apply style fixes
* copies
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com >
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-01 23:34:27 -10:00
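A sketch of the pipeline added above, assuming the usual inpainting call signature (`image`, `mask_image`); the image URLs are placeholders:
```python
import torch
from diffusers import FluxKontextInpaintPipeline
from diffusers.utils import load_image

pipe = FluxKontextInpaintPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("https://example.com/input.png")  # placeholder URL
mask = load_image("https://example.com/mask.png")    # placeholder URL
out = pipe(prompt="replace the masked region with a red sofa", image=image, mask_image=mask).images[0]
```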
Aryan
eea76892e8
Flux Kontext ( #11812 )
* support flux kontext
* make fix-copies
* add example
* add tests
* update docs
* update
* add note on integrity checker
* make fix-copies issue
* add copied froms
* make style
* update repository ids
* more copied froms
2025-06-26 21:29:59 +05:30
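Flux Kontext edits an input image from a text instruction; a minimal sketch of the documented usage (the input URL is a placeholder):
```python
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("https://example.com/cat.png")  # placeholder URL
out = pipe(image=image, prompt="Add a wizard hat to the cat", guidance_scale=2.5).images[0]
```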
Yao Matrix
33e636cea5
enable torchao test cases on XPU and switch to device agnostic APIs for test cases ( #11654 )
* enable torchao cases on XPU
Signed-off-by: Matrix YAO <matrix.yao@intel.com >
* device agnostic APIs
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* more
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* enable test_torch_compile_recompilation_and_graph_break on XPU
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* resolve comments
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
---------
Signed-off-by: Matrix YAO <matrix.yao@intel.com >
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
2025-06-11 15:17:06 +05:30
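The device-agnostic helpers this PR switches to let one test body run on CUDA and XPU alike; a sketch using helpers from `diffusers.utils.testing_utils` (their home at the time of this commit; the test class name is illustrative):
```python
import gc
import unittest

from diffusers.utils.testing_utils import backend_empty_cache, torch_device

class TorchAoXPUTests(unittest.TestCase):  # illustrative test class
    def tearDown(self):
        # torch_device resolves to the available accelerator ("cuda", "xpu", ...),
        # and backend_empty_cache dispatches to that backend's empty_cache call.
        gc.collect()
        backend_empty_cache(torch_device)
```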
Dhruv Nair
edc154da09
Update Ruff to latest Version ( #10919 )
* update
* update
* update
* update
2025-04-09 16:51:34 +05:30
Yao Matrix
c36c745ceb
fix FluxReduxSlowTests::test_flux_redux_inference case failure on XPU ( #11245 )
* loosen test_float16_inference's tolerance from 5e-2 to 6e-2, so XPU can pass the UT
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
* fix test_pipeline_flux_redux fail on XPU
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
---------
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
2025-04-09 11:41:15 +01:00
Aryan
844221ae4e
[core] FasterCache ( #10163 )
* init
* update
* update
* update
* make style
* update
* fix
* make it work with guidance distilled models
* update
* make fix-copies
* add tests
* update
* apply_faster_cache -> apply_fastercache
* fix
* reorder
* update
* refactor
* update docs
* add fastercache to CacheMixin
* update tests
* Apply suggestions from code review
* make style
* try to fix partial import error
* Apply style fixes
* raise warning
* update
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-21 09:35:04 +05:30
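FasterCache skips attention computation at selected timesteps and can also skip the unconditional batch; a sketch mirroring the documented CogVideoX configuration:
```python
import torch
from diffusers import CogVideoXPipeline, FasterCacheConfig

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")

config = FasterCacheConfig(
    spatial_attention_block_skip_range=2,              # recompute attention every 2nd step
    spatial_attention_timestep_skip_range=(-1, 681),   # only skip within this timestep window
    current_timestep_callback=lambda: pipe.current_timestep,
    attention_weight_callback=lambda _: 0.3,           # weight for the approximated output
    tensor_format="BFCHW",
)
pipe.transformer.enable_cache(config)
```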
Fanli Lin
15ad97f782
[tests] make cuda only tests device-agnostic ( #11058 )
* enable bnb on xpu
* add 2 more cases
* add missing change
* add missing change
* add one more
* enable cuda only tests on xpu
* enable big gpu cases
2025-03-20 10:12:35 +00:00
Fanli Lin
7855ac597e
[tests] make tests device-agnostic (part 4) ( #10508 )
* initial comit
* fix empty cache
* fix one more
* fix style
* update device functions
* update
* update
* Update src/diffusers/utils/testing_utils.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update src/diffusers/utils/testing_utils.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update src/diffusers/utils/testing_utils.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update tests/pipelines/controlnet/test_controlnet.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update src/diffusers/utils/testing_utils.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update src/diffusers/utils/testing_utils.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update tests/pipelines/controlnet/test_controlnet.py
Co-authored-by: hlky <hlky@hlky.ac >
* with gc.collect
* update
* make style
* check_torch_dependencies
* add mps empty cache
* add changes
* bug fix
* enable on xpu
* update more cases
* revert
* revert back
* Update test_stable_diffusion_xl.py
* Update tests/pipelines/stable_diffusion/test_stable_diffusion.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update tests/pipelines/stable_diffusion/test_stable_diffusion.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py
Co-authored-by: hlky <hlky@hlky.ac >
* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py
Co-authored-by: hlky <hlky@hlky.ac >
* Apply suggestions from code review
Co-authored-by: hlky <hlky@hlky.ac >
* add test marker
---------
Co-authored-by: hlky <hlky@hlky.ac >
2025-03-04 08:26:06 +00:00
Sayak Paul
7513162b8b
[Tests] Remove more encode prompts tests ( #10942 )
* fix-copies went uncaught it seems.
* remove more unneeded encode_prompt() tests
* Revert "fix-copies went uncaught it seems."
This reverts commit eefb302791 .
* empty
2025-03-03 16:55:01 +05:30
hlky
694f9658c1
Support IPAdapter for more Flux pipelines ( #10708 )
* Support IPAdapter for more Flux pipelines
* -copied from
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2025-03-02 15:04:12 +00:00
Aryan
9a147b82f7
Module Group Offloading ( #10503 )
* update
* fix
* non_blocking; handle parameters and buffers
* update
* Group offloading with cuda stream prefetching (#10516 )
* cuda stream prefetch
* remove breakpoints
* update
* copy model hook implementation from pab
* update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite
* more workarounds to make it actually work
* cleanup
* rewrite
* update
* make sure to sync current stream before overwriting with pinned params
not doing so will lead to erroneous computations on the GPU and cause bad results
* better check
* update
* remove hook implementation to not deal with merge conflict
* re-add hook changes
* why use more memory when less memory do trick
* why still use slightly more memory when less memory do trick
* optimise
* add model tests
* add pipeline tests
* update docs
* add layernorm and groupnorm
* address review comments
* improve tests; add docs
* improve docs
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* apply suggestions from code review
* update tests
* apply suggestions from review
* enable_group_offloading -> enable_group_offload for naming consistency
* raise errors if multiple offloading strategies used; add relevant tests
* handle .to() when group offload applied
* refactor some repeated code
* remove unintentional change from merge conflict
* handle .cuda()
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-02-14 12:59:45 +05:30
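Group offloading moves groups of layers between an onload and an offload device, optionally prefetching with a CUDA stream; a minimal sketch using the final `enable_group_offload` name from this PR:
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)

# Keep transformer weights on CPU and stream them to the GPU per leaf module;
# use_stream=True overlaps transfers with compute via CUDA streams.
pipe.transformer.enable_group_offload(
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="leaf_level",
    use_stream=True,
)
```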
Aryan
658e24e86c
[core] Pyramid Attention Broadcast ( #9562 )
* start pyramid attention broadcast
* add coauthor
Co-Authored-By: Xuanlei Zhao <43881818+oahzxl@users.noreply.github.com >
* update
* make style
* update
* make style
* add docs
* add tests
* update
* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/api/pipelines/cogvideox.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Pyramid Attention Broadcast rewrite + introduce hooks (#9826 )
* rewrite implementation with hooks
* make style
* update
* merge pyramid-attention-rewrite-2
* make style
* remove changes from latte transformer
* revert docs changes
* better debug message
* add todos for future
* update tests
* make style
* cleanup
* fix
* improve log message; fix latte test
* refactor
* update
* update
* update
* revert changes to tests
* update docs
* update tests
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* update
* fix flux test
* reorder
* refactor
* make fix-copies
* update docs
* fixes
* more fixes
* make style
* update tests
* update code example
* make fix-copies
* refactor based on reviews
* use maybe_free_model_hooks
* CacheMixin
* make style
* update
* add current_timestep property; update docs
* make fix-copies
* update
* improve tests
* try circular import fix
* apply suggestions from review
* address review comments
* Apply suggestions from code review
* refactor hook implementation
* add test suite for hooks
* PAB Refactor (#10667 )
* update
* update
* update
---------
Co-authored-by: DN6 <dhruv.nair@gmail.com >
* update
* fix remove hook behaviour
---------
Co-authored-by: Xuanlei Zhao <43881818+oahzxl@users.noreply.github.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
Co-authored-by: DN6 <dhruv.nair@gmail.com >
2025-01-28 05:09:04 +05:30
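Pyramid Attention Broadcast reuses attention outputs across nearby timesteps within a configurable window; the documented usage via the `CacheMixin` introduced here:
```python
import torch
from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")

config = PyramidAttentionBroadcastConfig(
    spatial_attention_block_skip_range=2,              # recompute every 2nd step
    spatial_attention_timestep_skip_range=(100, 800),  # only broadcast in this window
    current_timestep_callback=lambda: pipe.current_timestep,
)
pipe.transformer.enable_cache(config)
```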
Aryan
beacaa5528
[core] Layerwise Upcasting ( #10347 )
* update
* update
* make style
* remove dynamo disable
* add coauthor
Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com >
* update
* update
* update
* update mixin
* add some basic tests
* update
* update
* non_blocking
* improvements
* update
* norm.* -> norm
* apply suggestions from review
* add example
* update hook implementation to the latest changes from pyramid attention broadcast
* deinitialize should raise an error
* update doc page
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* update docs
* update
* refactor
* fix _always_upcast_modules for asym ae and vq_model
* fix lumina embedding forward to not depend on weight dtype
* refactor tests
* add simple lora inference tests
* _always_upcast_modules -> _precision_sensitive_module_patterns
* remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case
* check layer dtypes in lora test
* fix UNet1DModelTests::test_layerwise_upcasting_inference
* _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback
* skip test in NCSNppModelTests
* skip tests for AutoencoderTinyTests
* skip tests for AutoencoderOobleckTests
* skip tests for UNet1DModelTests - unsupported pytorch operations
* layerwise_upcasting -> layerwise_casting
* skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support
* add layerwise fp8 pipeline test
* use xfail
* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)
* add note about memory consumption on tesla CI runner for failing test
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-01-22 19:49:37 +05:30
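Layerwise casting stores weights in a low-precision dtype and upcasts each layer just-in-time for its forward pass; a sketch with the final `enable_layerwise_casting` name from this PR:
```python
import torch
from diffusers import FluxTransformer2DModel

transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="transformer", torch_dtype=torch.bfloat16
)

# Weights live in fp8 (roughly halving memory) and are upcast to bf16 per layer
# during forward; precision-sensitive patterns such as norms are skipped.
transformer.enable_layerwise_casting(
    storage_dtype=torch.float8_e4m3fn, compute_dtype=torch.bfloat16
)
```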
Sayak Paul
edb8c1bce6
[Flux] Improve true cfg condition ( #10539 )
* improve flux true cfg condition
* add test
2025-01-12 18:33:34 +05:30
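True CFG runs a real negative-prompt forward pass on the guidance-distilled Flux models; after this fix it is gated on `true_cfg_scale > 1` together with a provided negative prompt. A sketch (prompts are placeholders):
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")

image = pipe(
    prompt="a tiny astronaut hatching from an egg on the moon",
    negative_prompt="blurry, low quality",
    true_cfg_scale=4.0,  # > 1 triggers a real unconditional forward pass
    guidance_scale=3.5,  # the distilled guidance embedding still applies
).images[0]
```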
Sayak Paul
a6f043a80f
[LoRA] allow big CUDA tests to run properly for LoRA (and others) ( #9845 )
* allow big lora tests to run on the CI.
* print
* print.
* print
* print
* print
* print
* more
* print
* remove print.
* remove print
* directly place on cuda.
* remove pipeline.
* remove
* fix
* fix
* spaces
* quality
* updates
* directly place flux controlnet pipeline on cuda.
* torch_device instead of cuda.
* style
* device placement.
* fixes
* add big gpu marker for mochi; rename test correctly
* address feedback
* fix
---------
Co-authored-by: Aryan <aryan@huggingface.co >
2025-01-10 12:50:24 +05:30
hlky
be2070991f
Support Flux IP Adapter ( #10261 )
* Flux IP-Adapter
* test cfg
* make style
* temp remove copied from
* fix test
* fix test
* v2
* fix
* make style
* temp remove copied from
* Apply suggestions from code review
Co-authored-by: YiYi Xu <yixu310@gmail.com >
* Move encoder_hid_proj to inside FluxTransformer2DModel
* merge
* separate encode_prompt, add copied from, image_encoder offload
* make
* fix test
* fix
* Update src/diffusers/pipelines/flux/pipeline_flux.py
* test_flux_prompt_embeds change not needed
* true_cfg -> true_cfg_scale
* fix merge conflict
* test_flux_ip_adapter_inference
* add fast test
* FluxIPAdapterMixin not test mixin
* Update pipeline_flux.py
Co-authored-by: YiYi Xu <yixu310@gmail.com >
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com >
2024-12-21 17:49:58 +00:00
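The Flux IP-Adapter loads through the new `FluxIPAdapterMixin`; a sketch following the documented XLabs adapter usage (the style-image URL is a placeholder):
```python
import torch
from diffusers import FluxPipeline
from diffusers.utils import load_image

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")
pipe.load_ip_adapter(
    "XLabs-AI/flux-ip-adapter",
    weight_name="ip_adapter.safetensors",
    image_encoder_pretrained_model_name_or_path="openai/clip-vit-large-patch14",
)
pipe.set_ip_adapter_scale(1.0)

style = load_image("https://example.com/style.png")  # placeholder URL
image = pipe(prompt="wearing sunglasses", ip_adapter_image=style).images[0]
```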
AndrΓ©s Romero
83709d5a06
Flux Control(Depth/Canny) + Inpaint ( #10192 )
* flux_control_inpaint - failing test_flux_different_prompts
* removing test_flux_different_prompts?
* fix style
* fix from PR comments
* fix style
* reducing guidance_scale in demo
* Update src/diffusers/pipelines/flux/pipeline_flux_control_inpaint.py
Co-authored-by: hlky <hlky@hlky.ac >
* make
* prepare_latents is not copied from
* update docs
* typos
---------
Co-authored-by: affromero <ubuntu@ip-172-31-17-146.ec2.internal >
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
Co-authored-by: hlky <hlky@hlky.ac >
2024-12-18 09:14:16 +00:00
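The new pipeline combines a Control (depth/canny) conditioning image with inpainting; a sketch assuming a pre-computed depth map and placeholder image URLs:
```python
import torch
from diffusers import FluxControlInpaintPipeline
from diffusers.utils import load_image

pipe = FluxControlInpaintPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Depth-dev", torch_dtype=torch.bfloat16
).to("cuda")

init = load_image("https://example.com/input.png")   # placeholder URL
mask = load_image("https://example.com/mask.png")    # placeholder URL
depth = load_image("https://example.com/depth.png")  # pre-computed depth map, placeholder
out = pipe(prompt="a red sofa", image=init, mask_image=mask, control_image=depth).images[0]
```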
Aryan
7ac6e286ee
Flux Fill, Canny, Depth, Redux ( #9985 )
* update
---------
Co-authored-by: yiyixuxu <yixu310@gmail.com >
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-11-23 01:41:25 -10:00
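Flux Fill is the dedicated inpainting/outpainting variant added here; a sketch following the documented usage (the high guidance scale is characteristic of the Fill checkpoint; URLs are placeholders):
```python
import torch
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

pipe = FluxFillPipeline.from_pretrained("black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16).to("cuda")

image = load_image("https://example.com/cup.png")      # placeholder URL
mask = load_image("https://example.com/cup_mask.png")  # placeholder URL
out = pipe(prompt="a white paper cup", image=image, mask_image=mask, guidance_scale=30.0).images[0]
```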
Dhruv Nair
f6f7afa1d7
Flux latents fix ( #9929 )
* update
* update
* update
* update
* update
* update
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-11-20 17:30:17 +05:30
YiYi Xu
d2e5cb3c10
Revert "[LoRA] fix: lora loading when using with a device_mapped mode⦠( #9823 )
...
Revert "[LoRA] fix: lora loading when using with a device_mapped model. (#9449 )"
This reverts commit 41e4779d98 .
2024-10-31 08:19:32 -10:00
Sayak Paul
41e4779d98
[LoRA] fix: lora loading when using with a device_mapped model. ( #9449 )
* fix: lora loading when using with a device_mapped model.
* better attributing
* empty
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* minors
* better error messages.
* fix-copies
* add: tests, docs.
* add hardware note.
* quality
* Update docs/source/en/training/distributed_inference.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* fixes
* skip properly.
* fixes
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2024-10-31 21:17:41 +05:30
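A sketch of the usage this fix targets: a pipeline sharded with `device_map` that then loads a LoRA on top (the LoRA repo id is hypothetical; note the entry above reverts this commit):
```python
import torch
from diffusers import FluxPipeline

# Shard pipeline components across the available GPUs, then load a LoRA on top.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16, device_map="balanced"
)
pipe.load_lora_weights("user/flux-lora")  # hypothetical repo id
```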
Sayak Paul
ff182ad669
[CI] add a big GPU marker to run memory-intensive tests separately on CI ( #9691 )
* add a marker for big gpu tests
* update
* trigger on PRs temporarily.
* onnx
* fix
* total memory
* fixes
* reduce memory threshold.
* bigger gpu
* empty
* g6e
* Apply suggestions from code review
* address comments.
* fix
* fix
* fix
* fix
* fix
* okay
* further reduce.
* updates
* remove
* updates
* updates
* updates
* updates
* fixes
* fixes
* updates.
* fix
* workflow fixes.
---------
Co-authored-by: Aryan <aryan@huggingface.co >
2024-10-31 18:44:34 +05:30
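The marker lets memory-intensive tests be selected separately (e.g. `pytest -m big_gpu_with_torch_cuda`); a sketch of how test classes adopt it, with the decorator name as used in diffusers' testing utils at the time:
```python
import unittest

import pytest

from diffusers.utils.testing_utils import require_big_gpu_with_torch_cuda

@pytest.mark.big_gpu_with_torch_cuda
@require_big_gpu_with_torch_cuda
class FluxPipelineSlowTests(unittest.TestCase):
    def test_flux_inference(self):
        ...
```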
Sayak Paul
adf1f911f0
[Tests] fix some fast gpu tests. ( #9379 )
fix some fast gpu tests.
2024-09-11 06:50:02 +05:30
Vishnu V Jaddipal
249a9e48e8
Add Flux inpainting and Flux Img2Img ( #9135 )
---------
Co-authored-by: yiyixuxu <yixu310@gmail.com >
2024-09-04 10:31:43 -10:00
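A minimal sketch of the img2img variant added here (the inpainting pipeline takes an extra `mask_image`); the input URL is a placeholder:
```python
import torch
from diffusers import FluxImg2ImgPipeline
from diffusers.utils import load_image

pipe = FluxImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
).to("cuda")

init = load_image("https://example.com/sketch.png")  # placeholder URL
out = pipe(
    prompt="cat wizard, detailed fantasy art",
    image=init,
    strength=0.95,        # how much of the input to repaint
    guidance_scale=0.0,   # schnell is guidance-distilled
    num_inference_steps=4,
).images[0]
```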
Dhruv Nair
007ad0e2aa
[CI] More fixes for Fast GPU Tests on main ( #9300 )
update
2024-09-02 17:51:48 +05:30
Sayak Paul
2d9ccf39b5
[Core] fuse_qkv_projection() to Flux ( #9185 )
* start fusing flux.
* test
* finish fusion
* fix-copies
2024-08-23 10:54:13 +05:30
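QKV fusion merges the three attention projections into a single matmul per attention layer; a sketch on the Flux transformer:
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")

# Fuse q/k/v projections into one linear per attention block;
# unfuse_qkv_projections() restores the original layout.
pipe.transformer.fuse_qkv_projections()
```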
Sayak Paul
0e460675e2
[Flux] allow tests to run ( #9050 )
* fix tests
* fix
* float64 skip
* remove sample_size.
* remove
* remove more
* default_sample_size.
* credit black forest for flux model.
* skip
* fix: tests
* remove OriginalModelMixin
* add transformer model test
* add: transformer model tests
2024-08-02 11:49:59 +05:30
Sayak Paul
27637a5402
Flux pipeline ( #9043 )
add flux!
Signed-off-by: Adrien <adrien@huggingface.co >
Co-authored-by: Adrien <adrien.69740@gmail.com >
Co-authored-by: Anatoly Belikov <abelikov@singularitynet.io >
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
Co-authored-by: yiyixuxu <yixu310@gmail.com >
2024-08-01 11:30:52 -10:00
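The launch example for the pipeline added here, per the FLUX.1-schnell model card and diffusers docs:
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # trade speed for lower VRAM usage

image = pipe(
    "A cat holding a sign that says hello world",
    guidance_scale=0.0,     # schnell is guidance-distilled
    num_inference_steps=4,
    max_sequence_length=256,
).images[0]
image.save("flux-schnell.png")
```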