a120092009
|
11d8e3ce2c
|
[Quantization] support pass MappingType for TorchAoConfig (#10927)
* [Quantization] support pass MappingType for TorchAoConfig
* Apply style fixes
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
|
2025-03-04 16:40:50 +05:30 |
|
Marc Sun
|
f5929e0306
|
[FEAT] Model loading refactor (#10604)
* first draft model loading refactor
* revert name change
* fix bnb
* revert name
* fix dduf
* fix hunyuan
* style
* Update src/diffusers/models/model_loading_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* suggestions from reviews
* Update src/diffusers/models/modeling_utils.py
Co-authored-by: YiYi Xu <yixu310@gmail.com>
* remove safetensors check
* fix default value
* more fix from suggestions
* revert logic for single file
* style
* typing + fix couple of issues
* improve speed
* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Aryan <aryan@huggingface.co>
* fp8 dtype
* add tests
* rename resolved_archive_file to resolved_model_file
* format
* map_location default cpu
* add utility function
* switch to smaller model + test inference
* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* rm comment
* add log
* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* add decorator
* cosine sim instead
* fix use_keep_in_fp32_modules
* comm
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
|
2025-02-19 17:34:53 +05:30 |
|
Aryan
|
aa79d7da46
|
Test sequential cpu offload for torchao quantization (#10506)
test sequential cpu offload
|
2025-01-14 09:54:06 +05:30 |
|
Aryan
|
cd991d1e1a
|
Fix TorchAO related bugs; revert device_map changes (#10371)
* Revert "Add support for sharded models when TorchAO quantization is enabled (#10256)"
This reverts commit 41ba8c0bf6.
* update tests
* update
* update
* update
* update device map tests
* apply review suggestions
* update
* make style
* fix
* update docs
* update tests
* update workflow
* update
* improve tests
* allclose tolerance
* Update src/diffusers/models/modeling_utils.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update tests/quantization/torchao/test_torchao.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* improve tests
* fix
* update correct slices
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
|
2024-12-25 15:37:49 +05:30 |
|
Aryan
|
02c777c065
|
[tests] Refactor TorchAO serialization fast tests (#10271)
refactor
|
2024-12-23 11:04:57 +05:30 |
|
Aryan
|
ffc0eaab6d
|
Bump minimum TorchAO version to 0.7.0 (#10293)
* bump min torchao version to 0.7.0
* update
|
2024-12-23 11:03:04 +05:30 |
|
Aryan
|
41ba8c0bf6
|
Add support for sharded models when TorchAO quantization is enabled (#10256)
* add sharded + device_map check
|
2024-12-19 15:42:20 -10:00 |
|
Aryan
|
1524781b88
|
[tests] Remove/rename unsupported quantization torchao type (#10263)
update
|
2024-12-17 21:43:15 +05:30 |
|
Aryan
|
9f00c617a0
|
[core] TorchAO Quantizer (#10009)
* torchao quantizer
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
|
2024-12-16 13:35:40 -10:00 |
|