* Move files to research_projects.
* docs: add IP Adapter training instructions
* Delete venv
* Update examples/ip_adapter/tutorial_train_sdxl.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Cherry-picked commits and re-moved files to research_projects.
* make style.
* Update toctree and delete ip_adapter.
* Nit Fix
* Fix nit.
* Fix nit.
* Create training script for single GPU and set model format to .safetensors
* Add sample inference script and restore _toctree
* Restore toctree.yaml
* fix spacing.
* Update toctree.yaml
---------
Co-authored-by: AMohamedAakhil <a.aakhilmohamed@gmail.com>
Co-authored-by: BootesVoid <78485654+AMohamedAakhil@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Add server example.
* Minor updates to README.
* Add fixes after local testing.
* Apply suggestions from code review
Updates to README from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* More doc updates.
* Maybe this will work to build the docs correctly?
* Fix style issues.
* Fix toc.
* Minor reformatting.
* Move docs to proper loc.
* Fix missing tick.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Sync docs changes back to README.
* Very minor update to docs to add space.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add new community pipeline for 'Adaptive Mask Inpainting', introduced in [ECCV2024] Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models
Update train_controlnet_flux.py
Fix the size mismatch between image and validation_image, which caused np.stack to raise an error.
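The failure mode can be reproduced in isolation. This is a minimal sketch, assuming plain NumPy arrays stand in for the two images; the sizes and the `resize_nearest` helper are hypothetical and are not taken from train_controlnet_flux.py:

```python
import numpy as np


def resize_nearest(img, height, width):
    """Hypothetical nearest-neighbour resize, used only for this illustration."""
    rows = np.arange(height) * img.shape[0] // height
    cols = np.arange(width) * img.shape[1] // width
    return img[rows][:, cols]


# Assumed sizes: the two images differ in height.
image = np.zeros((512, 512, 3), dtype=np.uint8)
validation_image = np.zeros((768, 512, 3), dtype=np.uint8)

try:
    np.stack([image, validation_image])
    stack_failed = False
except ValueError:
    # np.stack requires every input array to have the same shape.
    stack_failed = True

# Bringing validation_image to the size of image lets the stack succeed.
validation_image = resize_nearest(validation_image, *image.shape[:2])
stacked = np.stack([image, validation_image])
```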
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* modelcard generation edit
* add missed tag
* fix param name
* fix var
* change str to dict
* add use_dora check
* use correct tags for lora
* make style && make quality
---------
Co-authored-by: Aryan <aryan@huggingface.co>
* make lora target modules configurable and change the default
* style
* make lora target modules configurable and change the default
* fix bug when using prodigy and training te
* fix mixed precision training as proposed in https://github.com/huggingface/diffusers/pull/9565 for full dreambooth as well
* add test and notes
* style
* address Sayak's comments
* style
* fix test
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* fix: removed setting of text encoder lr for T5 as it's not being tuned
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
* add ostris trainer to README & add cache latents of vae
* style
* readme
* add test for latent caching
* add ostris noise scheduler
9ee1ef2a0a/toolkit/samplers/custom_flowmatch_sampler.py (L95)
* style
* fix import
* style
* fix tests
* style
* --change upcasting of transformer?
* update readme according to main
* add pivotal tuning for CLIP
* fix imports, encode_prompt call, add TextualInversionLoaderMixin to FluxPipeline for inference
* TextualInversionLoaderMixin support for FluxPipeline for inference
* move changes to advanced flux script, revert canonical
* add latent caching to canonical script
* revert changes to canonical script to keep it separate from https://github.com/huggingface/diffusers/pull/9160
* style
* remove redundant line and change code block placement to align with logic
* add initializer_token arg
* add transformer frac for range support from pure textual inversion to the orig pivotal tuning
* support pure textual inversion - wip
* adjustments to support pure textual inversion and transformer optimization in only part of the epochs
* fix logic when using initializer token
* fix pure_textual_inversion_condition
* fix ti/pivotal loading of last validation run
* remove embeddings loading for ti in final training run (to avoid adding huggingface hub dependency)
* support pivotal for t5
* adapt pivotal for T5 encoder
* adapt pivotal for T5 encoder and support in flux pipeline
* t5 pivotal support + support for pivotal for clip only or both
* fix param chaining
* fix param chaining
* README first draft
* readme
* readme
* readme
* style
* fix import
* style
* add fix from https://github.com/huggingface/diffusers/pull/9419
* add to readme, change function names
* te lr changes
* readme
* change concept tokens logic
* fix indices
* change arg name
* style
* dummy test
* revert dummy test
* reorder pivoting
* add warning in case the token abstraction is not the instance prompt
* experimental - wip - specific block training
* fix documentation and token abstraction processing
* remove transformer block specification feature (for now)
* style
* fix copies
* fix indexing issue when --initializer_concept has different amounts
* add if TextualInversionLoaderMixin to all flux pipelines
* style
* fix import
* fix imports
* address review comments - remove unnecessary prints & comments, use pin_memory=True, use free_memory utils, unify warning and prints
* style
* logger info fix
* make lora target modules configurable and change the default
* style
* make lora target modules configurable and change the default, add notes to readme
* style
* add tests
* style
* fix repo id
* add updated requirements for advanced flux
* fix indices of t5 pivotal tuning embeddings
* fix path in test
* remove `pin_memory`
* fix filename of embedding
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
* Improve performance and make it suitable for NPU
* Improve performance and make it suitable for NPU computing
---------
Co-authored-by: 蒋硕 <jiangshuo9@h-partners.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Removed int8 to float32 conversion (`* 2.0 - 1.0`) from `train_transforms` as it caused image overexposure.
Added `_resize_for_rectangle_crop` function to enable video cropping functionality. The cropping mode can be configured via `video_reshape_mode`, supporting options: ['center', 'random', 'none'].
* The number 127.5 may experience precision loss during division operations.
* wandb requires PIL image type
* Resizing bug
* del jupyter
* make style
* Update examples/cogvideo/README.md
* make style
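The three `video_reshape_mode` options listed above can be pictured as different ways of placing a crop window inside a resized frame. This is a rough sketch only: `crop_offsets` is a hypothetical helper, not the script's `_resize_for_rectangle_crop`, and treating 'none' as "skip cropping" is an assumption.

```python
import random


def crop_offsets(height, width, target_height, target_width, mode="center", rng=random):
    """Return (top, left) of a target-sized window inside a frame, or None.

    Hypothetical helper for illustration; the names and the handling of
    "none" are assumptions, not the training script's behaviour.
    """
    if target_height > height or target_width > width:
        raise ValueError("frame is smaller than the crop window")
    delta_h = height - target_height
    delta_w = width - target_width
    if mode == "center":
        # Split the surplus evenly on both sides.
        return delta_h // 2, delta_w // 2
    if mode == "random":
        # randint is inclusive on both ends, so every valid offset is possible.
        return rng.randint(0, delta_h), rng.randint(0, delta_w)
    if mode == "none":
        # Assumption: the caller leaves the frame uncropped.
        return None
    raise ValueError(f"unknown mode: {mode!r}")


center = crop_offsets(480, 720, 480, 640, mode="center")
```

Passing a seeded `random.Random` instance as `rng` keeps the 'random' mode reproducible across runs.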
---------
Co-authored-by: --unset <--unset>
Co-authored-by: Aryan <aryan@huggingface.co>