mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

9 Commits

Author SHA1 Message Date
Patrick von Platen
78744b6a8f No more use_auth_token=True (#733)
* up

* uP

* uP

* make style

* Apply suggestions from code review

* up

* finish
2022-10-05 17:16:15 +02:00
Yuta Hayashibe
7e92c5bc73 Fix typos (#718)
* Fix typos

* Update examples/dreambooth/train_dreambooth.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2022-10-04 15:22:14 +02:00
Nouamane Tazi
daa22050c7 [docs] fix table in fp16.mdx (#683) 2022-09-30 15:15:22 +02:00
Nouamane Tazi
9ebaea545f Optimize Stable Diffusion (#371)
* initial commit

* make UNet stream capturable

* try to fix noise_pred value

* remove cuda graph and keep NB

* non blocking unet with PNDMScheduler

* make timesteps np arrays for pndm scheduler
because lists don't get formatted to tensors in `self.set_format`

* make max async in pndm

* use channel last format in unet

* avoid moving timesteps device in each unet call

* avoid memcpy op in `get_timestep_embedding`

* add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`

* update TODO

* replace `channels_last` kwarg with `memory_format` for more generality

* revert the channels_last changes to leave it for another PR

* remove non_blocking when moving input ids to device

* remove blocking from all .to() operations at beginning of pipeline

* fix merging

* fix merging

* model can run in other precisions without autocast

* attn refactoring

* Revert "attn refactoring"

This reverts commit 0c70c0e189.

* remove restriction to run conv_norm in fp32

* use `baddbmm` instead of `matmul` in attention for better perf

* removing all reshapes to test perf

* Revert "removing all reshapes to test perf"

This reverts commit 006ccb8a8c.

* add shapes comments

* hardcode what's needed for jitting

* Revert "hardcode what's needed for jitting"

This reverts commit 2fa9c698ea.

* Revert "remove restriction to run conv_norm in fp32"

This reverts commit cec592890c.

* revert using `baddbmm` in attention's forward

* cleanup comment

* remove restriction to run conv_norm in fp32. no quality loss was noticed

This reverts commit cc9bc1339c.

* add more optimization techniques to docs

* Revert "add shapes comments"

This reverts commit 31c58eadb8.

* apply suggestions

* make quality

* apply suggestions

* styling

* `scheduler.timesteps` are now arrays so we don't need .to()

* remove useless .type()

* use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`

* move scheduler timesteps to correct device if tensors

* add device to `set_timesteps` in LMSD scheduler

* `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it

* quick fix

* styling

* remove kwargs from schedulers `set_timesteps`

* revert to using max in K-LMS inpaint pipeline test

* Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"

This reverts commit 00d5a51e5c.

* move timesteps to correct device before loop in SD pipeline

* apply previous fix to other SD pipelines

* UNet now accepts tensor timesteps even on the wrong device, to avoid errors
- it shouldn't affect performance if timesteps are already on the correct device
- it does slow down performance if they're on the wrong device

* fix pipeline when timesteps are arrays with strides
2022-09-30 09:49:13 +02:00
Pedro Cuenca
1a79969d23 Initial ONNX doc (TODO: Installation) (#426) 2022-09-08 16:46:24 +02:00
Patrick von Platen
98f346835a [Docs] Minor fixes in optimization section (#420)
* uP

* more
2022-09-08 13:13:46 +02:00
Pedro Cuenca
c29d81c3e3 Docs: fp16 page (#404)
* Initial version of `fp16` page.

* Fix typo in README.

* Change titles of fp16 section in toctree.

* PR suggestion

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* PR suggestion

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Clarify attention slicing is useful even for batches of 1

Explained by @patrickvonplaten after a suggestion by @keturn.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Do not talk about `batches` in `enable_attention_slicing`.

* Use Tip (just for fun), add link to method.

* Comment about fp16 results looking the same as float32 in practice.

* Style: docstring line wrapping.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-09-08 09:17:51 +02:00
Pedro Cuenca
492f5c9a6c Docs: optimization / special hardware (#390)
Add mps documentation.
2022-09-07 16:27:14 +02:00
Patrick von Platen
5a38033de4 [Docs] Let's go (#385) 2022-09-07 11:31:13 +02:00