* Timestep bias for fine-tuning SDXL
* Adjust parameter choices to include "range" and reword the help statements
* Condition our use of weighted timesteps on the value of timestep_bias_strategy
* style
---------
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
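
A minimal sketch of what biased timestep sampling could look like, assuming a multiplier applied to a slice of otherwise uniform weights. The helper names (`timestep_weights`, `sample_timesteps`), the `multiplier` and `portion` parameters, and the meaning given to "earlier"/"later" here are illustrative assumptions, not the training script's actual code. Only the "range" choice and the idea of skipping weighted sampling depending on `timestep_bias_strategy` come from the notes above.

```python
import random


def timestep_weights(num_timesteps, strategy, multiplier=2.0,
                     portion=0.25, range_begin=0, range_end=None):
    """Return normalized sampling weights over timesteps (hypothetical helper)."""
    biased = int(portion * num_timesteps)
    if strategy == "earlier":      # assumption: bias the lowest timestep indices
        lo, hi = 0, biased
    elif strategy == "later":      # assumption: bias the highest timestep indices
        lo, hi = num_timesteps - biased, num_timesteps
    elif strategy == "range":      # bias an explicit half-open range [begin, end)
        lo = range_begin
        hi = num_timesteps if range_end is None else range_end
    elif strategy == "none":
        lo, hi = 0, 0
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    weights = [1.0] * num_timesteps
    for t in range(lo, hi):
        weights[t] *= multiplier
    total = sum(weights)
    return [w / total for w in weights]


def sample_timesteps(batch_size, num_timesteps, strategy="none", rng=None, **kwargs):
    """Use weighted sampling only when a bias strategy is selected."""
    rng = rng or random.Random()
    if strategy == "none":
        # Plain uniform sampling; no weights are built at all.
        return [rng.randrange(num_timesteps) for _ in range(batch_size)]
    weights = timestep_weights(num_timesteps, strategy, **kwargs)
    return rng.choices(range(num_timesteps), weights=weights, k=batch_size)
```

With `strategy="range"`, `range_begin=2`, `range_end=5`, and `multiplier=3.0`, timesteps 2 to 4 are three times as likely to be drawn as any other timestep.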
* Fix FullAdapterXL.total_downscale_factor.
* Fix incorrect error message in T2IAdapter.__init__(...).
* Move IP-Adapter test_total_downscale_factor(...) to pipeline test file (requested in code review).
* Add more info to error message about an unsupported T2I-Adapter adapter_type.
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Make sure the repo_id is valid before sending it to huggingface_hub to get a more understandable error message.
Re #5110
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* SDXL microconditioning documentation should indicate the correct default order of parameters, so that developers know
* empty
---------
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* support transformer_layers_per_block in flax UNet
* add support for text_time additional embeddings to Flax UNet
* rename attention layers for VAE
* add shape asserts when renaming attention layers
* transpose VAE attention layers
* add pipeline flax SDXL code [WIP]
* continue adding pipeline flax SDXL code [WIP]
* cleanup
* Working on JIT support
Fixed prompt embedding shapes so they work in parallel mode. Assuming we
always have both text encoders for now, for simplicity.
* Fixing embeddings (untested)
* Remove spurious line
* Shard guidance_scale when jitting.
* Decode images
* Fix sharding
* style
* Refiner UNet can be loaded.
* Refiner / img2img pipeline
* Allow latent outputs from base and latent inputs in refiner
This makes it possible to chain base + refiner without using the VAE
decoder in the base model or the VAE encoder in the refiner, which skips
conversions to/from PIL and avoids TPU <-> CPU memory copies.
* Adapt to FlaxCLIPTextModelOutput
* Update Flax XL pipeline to FlaxCLIPTextModelOutput
* make fix-copies
* make style
* add euler scheduler
* Fix import
* Fix copies, comment unused code.
* Fix SDXL Flax imports
* Fix euler discrete begin
* improve init import
* finish
* put discrete euler in init
* fix flax euler
* Fix more
* make style
* correct init
* correct init
* Temporarily remove FlaxStableDiffusionXLImg2ImgPipeline
* correct pipelines
* finish
---------
Co-authored-by: Martin Müller <martin.muller.me@gmail.com>
Co-authored-by: patil-suraj <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* min-SNR gamma for Dreambooth training
* Align the mse_loss_weights style with SDXL training example
---------
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Resolve v_prediction issue for min-SNR gamma weighted loss function
* Combine MSE loss calculation of epsilon and velocity, with a note about the application of the epsilon code to sample prediction
* style
---------
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
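
For reference, a small sketch of min-SNR gamma loss weighting for the two prediction types mentioned above. It assumes the commonly used formulation, `min(SNR, gamma) / SNR` for epsilon prediction and `min(SNR, gamma) / (SNR + 1)` for velocity prediction; the training script's own code may differ. The function names and the default `gamma=5.0` are illustrative assumptions, and plain Python lists stand in for tensors.

```python
def min_snr_loss_weight(snr, gamma=5.0, prediction_type="epsilon"):
    """Per-sample loss weight under min-SNR gamma (hypothetical helper)."""
    clipped = min(snr, gamma)
    if prediction_type == "epsilon":
        return clipped / snr
    if prediction_type == "v_prediction":
        # Assumed formulation: velocity targets divide by SNR + 1.
        return clipped / (snr + 1.0)
    raise ValueError(f"unsupported prediction_type: {prediction_type!r}")


def weighted_mse(preds, targets, snrs, gamma=5.0, prediction_type="epsilon"):
    """Mean over the batch of per-sample MSE scaled by its min-SNR weight."""
    losses = []
    for pred, target, snr in zip(preds, targets, snrs):
        mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
        losses.append(mse * min_snr_loss_weight(snr, gamma, prediction_type))
    return sum(losses) / len(losses)
```

Both prediction types share the same MSE computation; only the weight differs, which is why a single code path can serve both.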
* fix test
* initial commit
* change test
* updates:
* fix tests
* test fix
* test fix
* fix tests
* make test faster
* clean up
* fix precision in test
* fix precision
* Fix tests
* Fix logging test
* fix test
* fix test
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [SDXL] Make sure multi batch prompt embeds works
* improve more
* improve more
* Apply suggestions from code review
Fixed `get_word_inds` mistake/typo in P2P community pipeline
The function `get_word_inds` takes a string of text and either a word (str) or a word index (int), and returns the indices of the token(s) that the word is encoded to.
However, there was a typo: the second `if` branch checked whether the word was a `str` **again** instead of an `int`, which caused the [example code from the docs](https://github.com/huggingface/diffusers/tree/main/examples/community#prompt2prompt-pipeline) to raise an error.
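
A simplified sketch of the intended type dispatch, for illustration only. `word_positions` is a hypothetical stand-in that covers just the word-lookup step; the real `get_word_inds` also maps word positions to token indices with a tokenizer, which is omitted here.

```python
def word_positions(text, word_place):
    """Resolve a word (str) or a word index (int) to word positions in text."""
    words = text.split(" ")
    if isinstance(word_place, str):
        # A word was given: collect every position where it occurs.
        return [i for i, word in enumerate(words) if word == word_place]
    elif isinstance(word_place, int):
        # An index was given: this branch must test for int, not str again.
        return [word_place]
    raise TypeError("word_place must be a str or an int")
```

If the second branch tested for `str` a second time, an integer argument would never match either branch.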
* add support for clip skip
* fix condition
* fix
* add clip_output_layer_to_default
* expose
* remove the previous functions.
* correct condition.
* apply final layer norm
* address feedback
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* refactor clip_skip.
* port to the other pipelines.
* fix copies one more time
---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>