Sayak Paul
|
cefa28f449
|
[docs] Promote AutoModel usage (#11300)
* docs: promote the usage of automodel.
* bitsandbytes
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
|
2025-04-15 09:25:40 +05:30 |
|
Dhruv Nair
|
9add071592
|
[Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6 (#11018)
* update
|
2025-03-11 10:52:01 +05:30 |
|
Dhruv Nair
|
f5edaa7894
|
[Quantization] Add Quanto backend (#10756)
* update
* Update docs/source/en/quantization/quanto.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* update
* Update src/diffusers/quantizers/quanto/utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* update
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
|
2025-03-10 08:33:05 +05:30 |
|
Aryan
|
cd991d1e1a
|
Fix TorchAO related bugs; revert device_map changes (#10371)
* Revert "Add support for sharded models when TorchAO quantization is enabled (#10256)"
This reverts commit 41ba8c0bf6.
* update tests
* update
* update device map tests
* apply review suggestions
* update
* make style
* fix
* update docs
* update tests
* update workflow
* update
* improve tests
* allclose tolerance
* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update tests/quantization/torchao/test_torchao.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* improve tests
* fix
* update correct slices
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
|
2024-12-25 15:37:49 +05:30 |
|
Sayak Paul
|
6a970a45c5
|
[docs] fix: torchao example. (#10278)
fix: torchao example.
|
2024-12-23 11:03:50 +05:30 |
|
Sayak Paul
|
d41388145e
|
[Docs] Update gguf.md to remove generator from the pipeline from_pretrained (#10299)
Update gguf.md to remove generator from the pipeline from_pretrained
|
2024-12-21 07:15:03 +05:30 |
|
Steven Liu
|
7d4db57037
|
[docs] Fix quantization links (#10323)
Update overview.md
|
2024-12-20 08:30:21 -08:00 |
|
Dhruv Nair
|
b389f339ec
|
Fix Doc links in GGUF and Quantization overview docs (#10279)
* update
* Update docs/source/en/quantization/gguf.md
Co-authored-by: Aryan <aryan@huggingface.co>
---------
Co-authored-by: Aryan <aryan@huggingface.co>
|
2024-12-18 18:32:36 +05:30 |
|
Dhruv Nair
|
e24941b2a7
|
[Single File] Add GGUF support (#9964)
* update
* Update src/diffusers/quantizers/gguf/utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* update
* Update docs/source/en/quantization/gguf.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
|
2024-12-17 16:09:37 +05:30 |
|
Aryan
|
9f00c617a0
|
[core] TorchAO Quantizer (#10009)
* torchao quantizer
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
|
2024-12-16 13:35:40 -10:00 |
|
Aritra Roy Gosthipaty
|
bf64b32652
|
[Guide] Quantize your Diffusion Models with bnb (#10012)
* chore: initial draft
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* chore: link in place
* chore: review suggestions
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* chore: review suggestions
* Update docs/source/en/quantization/bitsandbytes.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* review suggestions
* chore: review suggestions
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* adding same changes to 4 bit section
* review suggestions
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
|
2024-12-05 13:54:03 -08:00 |
|
Sayak Paul
|
60ffa84253
|
[bitsandbytes] follow-ups (#9730)
* bnb follow ups.
* add a warning when dtypes mismatch.
* fix-copies
* clear cache.
* check_if_quantized_param
* add a check on shape.
* updates
* docs
* improve readability.
* resources.
* fix
|
2024-10-22 16:00:05 +05:30 |
|
Sayak Paul
|
b821f006d0
|
[Quantization] Add quantization support for bitsandbytes (#9213)
* quantization config.
* fix-copies
* fix
* modules_to_not_convert
* add bitsandbytes utilities.
* make progress.
* fixes
* quality
* up
* up
rotary embedding refactor 2: update comments, fix dtype for use_real=False (#9312)
fix notes and dtype
up
up
* minor
* up
* up
* fix
* provide credits where due.
* make configurations work.
* fixes
* fix
* update_missing_keys
* fix
* fix
* make it work.
* fix
* provide credits to transformers.
* empty commit
* handle to() better.
* tests
* change to bnb from bitsandbytes
* fix tests
fix slow quality tests
SD3 remark
fix
complete int4 tests
add a readme to the test files.
add model cpu offload tests
warning test
* better safeguard.
* change merging status
* courtesy to transformers.
* move upper.
* better
* make the unused kwargs warning friendlier.
* harmonize changes with https://github.com/huggingface/transformers/pull/33122
* style
* training tests
* feedback part i.
* Add Flux inpainting and Flux Img2Img (#9135)
---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Update `UNet2DConditionModel`'s error messages (#9230)
* refactor
[CI] Update Single file Nightly Tests (#9357)
* update
* update
feedback.
improve README for flux dreambooth lora (#9290)
* improve readme
* improve readme
* improve readme
* improve readme
fix one uncaught deprecation warning for accessing vae_latent_channels in VaeImagePreprocessor (#9372)
deprecation warning vae_latent_channels
add mixed int8 tests and more tests to nf4.
[core] Freenoise memory improvements (#9262)
* update
* implement prompt interpolation
* make style
* resnet memory optimizations
* more memory optimizations; todo: refactor
* update
* update animatediff controlnet with latest changes
* refactor chunked inference changes
* remove print statements
* update
* chunk -> split
* remove changes from incorrect conflict resolution
* remove changes from incorrect conflict resolution
* add explanation of SplitInferenceModule
* update docs
* Revert "update docs"
This reverts commit c55a50a271.
* update docstring for freenoise split inference
* apply suggestions from review
* add tests
* apply suggestions from review
quantization docs.
docs.
* Revert "Add Flux inpainting and Flux Img2Img (#9135)"
This reverts commit 5799954dd4.
* tests
* don
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* contribution guide.
* changes
* empty
* fix tests
* harmonize with https://github.com/huggingface/transformers/pull/33546.
* numpy_cosine_distance
* config_dict modification.
* remove if config comment.
* note for load_state_dict changes.
* float8 check.
* quantizer.
* raise an error for non-True low_cpu_mem_usage values when using quant.
* low_cpu_mem_usage shenanigans when using fp32 modules.
* don't re-assign _pre_quantization_type.
* make comments clear.
* remove comments.
* handle mixed types better when moving to cpu.
* add tests to check if we're throwing warning rightly.
* better check.
* fix 8bit test_quality.
* handle dtype more robustly.
* better message when keep_in_fp32_modules.
* handle dtype casting.
* fix dtype checks in pipeline.
* fix warning message.
* Update src/diffusers/models/modeling_utils.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>
* mitigate the confusing cpu warning
---------
Co-authored-by: Vishnu V Jaddipal <95531133+Gothos@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
|
2024-10-21 10:11:57 +05:30 |
|