diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
Patrick von Platen	4deb16e830	[Docs] Advertise fp16 instead of autocast (#740) up	2022-10-05 22:20:53 +02:00
Patrick von Platen	78744b6a8f	No more use_auth_token=True (#733) * up * uP * uP * make style * Apply suggestions from code review * up * finish	2022-10-05 17:16:15 +02:00
Yuta Hayashibe	7e92c5bc73	Fix typos (#718) * Fix typos * Update examples/dreambooth/train_dreambooth.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>	2022-10-04 15:22:14 +02:00
Nouamane Tazi	daa22050c7	[docs] fix table in fp16.mdx (#683)	2022-09-30 15:15:22 +02:00
Nouamane Tazi	9ebaea545f	Optimize Stable Diffusion (#371) * initial commit * make UNet stream capturable * try to fix noise_pred value * remove cuda graph and keep NB * non blocking unet with PNDMScheduler * make timesteps np arrays for pndm scheduler because lists don't get formatted to tensors in `self.set_format` * make max async in pndm * use channel last format in unet * avoid moving timesteps device in each unet call * avoid memcpy op in `get_timestep_embedding` * add `channels_last` kwarg to `DiffusionPipeline.from_pretrained` * update TODO * replace `channels_last` kwarg with `memory_format` for more generality * revert the channels_last changes to leave it for another PR * remove non_blocking when moving input ids to device * remove blocking from all .to() operations at beginning of pipeline * fix merging * fix merging * model can run in other precisions without autocast * attn refactoring * Revert "attn refactoring" This reverts commit `0c70c0e189`. * remove restriction to run conv_norm in fp32 * use `baddbmm` instead of `matmul`for better in attention for better perf * removing all reshapes to test perf * Revert "removing all reshapes to test perf" This reverts commit `006ccb8a8c`. * add shapes comments * hardcore whats needed for jitting * Revert "hardcore whats needed for jitting" This reverts commit `2fa9c698ea`. * Revert "remove restriction to run conv_norm in fp32" This reverts commit `cec592890c`. * revert using baddmm in attention's forward * cleanup comment * remove restriction to run conv_norm in fp32. no quality loss was noticed This reverts commit `cc9bc1339c`. * add more optimizations techniques to docs * Revert "add shapes comments" This reverts commit `31c58eadb8`. * apply suggestions * make quality * apply suggestions * styling * `scheduler.timesteps` are now arrays so we dont need .to() * remove useless .type() * use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms` * move scheduler timestamps to correct device if tensors * add device to `set_timesteps` in LMSD scheduler * `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it * quick fix * styling * remove kwargs from schedulers `set_timesteps` * revert to using max in K-LMS inpaint pipeline test * Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it" This reverts commit `00d5a51e5c`. * move timesteps to correct device before loop in SD pipeline * apply previous fix to other SD pipelines * UNet now accepts tensor timesteps even on wrong device, to avoid errors - it shouldnt affect performance if timesteps are alrdy on correct device - it does slow down performance if they're on the wrong device * fix pipeline when timesteps are arrays with strides	2022-09-30 09:49:13 +02:00
Pedro Cuenca	1a79969d23	Initial ONNX doc (TODO: Installation) (#426)	2022-09-08 16:46:24 +02:00
Patrick von Platen	98f346835a	[Docs] Minor fixes in optimization section (#420) * uP * more	2022-09-08 13:13:46 +02:00
Pedro Cuenca	c29d81c3e3	Docs: fp16 page (#404) * Initial version of `fp16` page. * Fix typo in README. * Change titles of fp16 section in toctree. * PR suggestion Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * PR suggestion Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Clarify attention slicing is useful even for batches of 1 Explained by @patrickvonplaten after a suggestion by @keturn. Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Do not talk about `batches` in `enable_attention_slicing`. * Use Tip (just for fun), add link to method. * Comment about fp16 results looking the same as float32 in practice. * Style: docstring line wrapping. Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-09-08 09:17:51 +02:00
Pedro Cuenca	492f5c9a6c	Docs: optimization / special hardware (#390) Add mps documentation.	2022-09-07 16:27:14 +02:00
Patrick von Platen	5a38033de4	[Docs] Let's go (#385)	2022-09-07 11:31:13 +02:00

10 Commits