AstraliteHeart
|
cb342b745a
|
Add AuraFlow GGUF support (#10463)
* Add support for loading AuraFlow models from GGUF
https://huggingface.co/city96/AuraFlow-v0.3-gguf
* Update AuraFlow documentation for GGUF, add GGUF tests and model detection.
* Address code review comments.
* Remove unused config.
---------
Co-authored-by: hlky <hlky@hlky.ac>
|
2025-01-08 13:23:12 +05:30 |
|
Aryan
|
cd991d1e1a
|
Fix TorchAO related bugs; revert device_map changes (#10371)
* Revert "Add support for sharded models when TorchAO quantization is enabled (#10256)"
This reverts commit 41ba8c0bf6.
* update tests
* update
* update
* update
* update device map tests
* apply review suggestions
* update
* make style
* fix
* update docs
* update tests
* update workflow
* update
* improve tests
* allclose tolerance
* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update tests/quantization/torchao/test_torchao.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* improve tests
* fix
* update correct slices
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
|
2024-12-25 15:37:49 +05:30 |
|
Aryan
|
02c777c065
|
[tests] Refactor TorchAO serialization fast tests (#10271)
refactor
|
2024-12-23 11:04:57 +05:30 |
|
Aryan
|
ffc0eaab6d
|
Bump minimum TorchAO version to 0.7.0 (#10293)
* bump min torchao version to 0.7.0
* update
|
2024-12-23 11:03:04 +05:30 |
|
Aryan
|
41ba8c0bf6
|
Add support for sharded models when TorchAO quantization is enabled (#10256)
* add sharded + device_map check
|
2024-12-19 15:42:20 -10:00 |
|
Aryan
|
1524781b88
|
[tests] Remove/rename unsupported quantization torchao type (#10263)
update
|
2024-12-17 21:43:15 +05:30 |
|
Dhruv Nair
|
e24941b2a7
|
[Single File] Add GGUF support (#9964)
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* Update src/diffusers/quantizers/gguf/utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* Update docs/source/en/quantization/gguf.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update
* update
* update
* update
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
|
2024-12-17 16:09:37 +05:30 |
|
Aryan
|
9f00c617a0
|
[core] TorchAO Quantizer (#10009)
* torchao quantizer
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
|
2024-12-16 13:35:40 -10:00 |
|
Sayak Paul
|
e8da75dff5
|
[bitsandbytes] allow directly CUDA placements of pipelines loaded with bnb components (#9840)
* allow device placement when using bnb quantization.
* warning.
* tests
* fixes
* docs.
* require accelerate version.
* remove print.
* revert to()
* tests
* fixes
* fix: missing AutoencoderKL lora adapter (#9807)
* fix: missing AutoencoderKL lora adapter
* fix
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* fixes
* fix condition test
* updates
* updates
* remove is_offloaded.
* fixes
* better
* empty
---------
Co-authored-by: Emmanuel Benazera <emmanuel.benazera@jolibrain.com>
|
2024-12-04 22:27:43 +05:30 |
|
Sayak Paul
|
827b6c25f9
|
[CI] Add quantization (#9832)
* add quantization to nightly CI.
* prep.
* fix lib name.
* remove deps that are not needed.
* fix slice.
|
2024-12-02 14:53:43 +05:30 |
|
Sayak Paul
|
60ffa84253
|
[bitsandbytes] follow-ups (#9730)
* bnb follow ups.
* add a warning when dtypes mismatch.
* fix-copies
* clear cache.
* check_if_quantized_param
* add a check on shape.
* updates
* docs
* improve readability.
* resources.
* fix
|
2024-10-22 16:00:05 +05:30 |
|
Sayak Paul
|
b821f006d0
|
[Quantization] Add quantization support for bitsandbytes (#9213)
* quantization config.
* fix-copies
* fix
* modules_to_not_convert
* add bitsandbytes utilities.
* make progress.
* fixes
* quality
* up
* up
rotary embedding refactor 2: update comments, fix dtype for use_real=False (#9312)
fix notes and dtype
up
up
* minor
* up
* up
* fix
* provide credits where due.
* make configurations work.
* fixes
* fix
* update_missing_keys
* fix
* fix
* make it work.
* fix
* provide credits to transformers.
* empty commit
* handle to() better.
* tests
* change to bnb from bitsandbytes
* fix tests
fix slow quality tests
SD3 remark
fix
complete int4 tests
add a readme to the test files.
add model cpu offload tests
warning test
* better safeguard.
* change merging status
* courtesy to transformers.
* move upper.
* better
* make the unused kwargs warning friendlier.
* harmonize changes with https://github.com/huggingface/transformers/pull/33122
* style
* training tests
* feedback part i.
* Add Flux inpainting and Flux Img2Img (#9135)
---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Update `UNet2DConditionModel`'s error messages (#9230)
* refactor
[CI] Update Single file Nightly Tests (#9357)
* update
* update
feedback.
improve README for flux dreambooth lora (#9290)
* improve readme
* improve readme
* improve readme
* improve readme
fix one uncaught deprecation warning for accessing vae_latent_channels in VaeImagePreprocessor (#9372)
deprecation warning vae_latent_channels
add mixed int8 tests and more tests to nf4.
[core] Freenoise memory improvements (#9262)
* update
* implement prompt interpolation
* make style
* resnet memory optimizations
* more memory optimizations; todo: refactor
* update
* update animatediff controlnet with latest changes
* refactor chunked inference changes
* remove print statements
* update
* chunk -> split
* remove changes from incorrect conflict resolution
* remove changes from incorrect conflict resolution
* add explanation of SplitInferenceModule
* update docs
* Revert "update docs"
This reverts commit c55a50a271.
* update docstring for freenoise split inference
* apply suggestions from review
* add tests
* apply suggestions from review
quantization docs.
docs.
* Revert "Add Flux inpainting and Flux Img2Img (#9135)"
This reverts commit 5799954dd4.
* tests
* don
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* contribution guide.
* changes
* empty
* fix tests
* harmonize with https://github.com/huggingface/transformers/pull/33546.
* numpy_cosine_distance
* config_dict modification.
* remove if config comment.
* note for load_state_dict changes.
* float8 check.
* quantizer.
* raise an error for non-True low_cpu_mem_usage values when using quant.
* low_cpu_mem_usage shenanigans when using fp32 modules.
* don't re-assign _pre_quantization_type.
* make comments clear.
* remove comments.
* handle mixed types better when moving to cpu.
* add tests to check if we're throwing warning rightly.
* better check.
* fix 8bit test_quality.
* handle dtype more robustly.
* better message when keep_in_fp32_modules.
* handle dtype casting.
* fix dtype checks in pipeline.
* fix warning message.
* Update src/diffusers/models/modeling_utils.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>
* mitigate the confusing cpu warning
---------
Co-authored-by: Vishnu V Jaddipal <95531133+Gothos@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
|
2024-10-21 10:11:57 +05:30 |
|