diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Author	SHA1	Message	Date
scxue	784db0eaab	Add cross attention type for Sana-Sprint training in diffusers. (#11514 ) * test permission * Add cross attention type for Sana-Sprint. * Add Sana-Sprint training script in diffusers. * make style && make quality; * modify the attention processor with `set_attn_processor` and change `SanaAttnProcessor3_0` to `SanaVanillaAttnProcessor` * Add import for SanaVanillaAttnProcessor * Add README file. * Apply suggestions from code review * style * Update examples/research_projects/sana/README.md --------- Co-authored-by: lawrence-cj <cjs1020440147@icloud.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-05-08 18:55:29 +05:30
Linoy Tsaban	66e50d4e24	[LoRA] make lora alpha and dropout configurable (#11467 ) * add lora_alpha and lora_dropout * Apply style fixes * add lora_alpha and lora_dropout * Apply style fixes * revert lora_alpha until #11324 is merged * Apply style fixes * empty commit --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-05-08 11:54:50 +03:00
RogerSinghChugh	ed4efbd63d	Update training script for txt to img sdxl with lora supp with new interpolation. (#11496 ) * Update training script for txt to img sdxl with lora supp with new interpolation. * ran make style and make quality.	2025-05-05 12:33:28 -04:00
Yijun Lee	9c29e938d7	Set LANCZOS as the default interpolation method for image resizing. (#11492 ) * Set LANCZOS as the default interpolation method for image resizing. * style: run make style and quality checks	2025-05-05 12:18:40 -04:00
Sayak Paul	071807c853	[training] feat: enable quantization for hidream lora training. (#11494 ) * feat: enable quantization for hidream lora training. * better handle compute dtype. * finalize. * fix dtype. --------- Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>	2025-05-05 20:44:35 +05:30
Evan Han	ee1516e5c7	[train_dreambooth_lora_lumina2] Add LANCZOS as the default interpolation mode for image resizing (#11491 ) [ADD] interpolation	2025-05-05 10:41:33 -04:00
MinJu-Ha	ec9323996b	[train_dreambooth_lora_sdxl] Add --image_interpolation_mode option for image resizing (default to lanczos) (#11490 ) feat(train_dreambooth_lora_sdxl): support --image_interpolation_mode with default to lanczos	2025-05-05 10:19:30 -04:00
Parag Ekbote	fc5e906689	[train_text_to_image_sdxl] Add LANCZOS as default interpolation mode for image resizing (#11455 ) * Add LANCZOS as default interpolation mode. * update script * Update as per code review. * make style.	2025-05-05 09:52:19 -04:00
Yash	ec3d58286d	[train_dreambooth_lora_flux_advanced] Add LANCZOS as the default interpolation mode for image resizing (#11472 ) * [train_controlnet_sdxl] Add LANCZOS as the default interpolation mode for image resizing * [train_dreambooth_lora_flux_advanced] Add LANCZOS as the default interpolation mode for image resizing	2025-05-02 18:14:41 -04:00
Yuanzhou	ed6cf52572	[train_dreambooth_lora_sdxl_advanced] Add LANCZOS as the default interpolation mode for image resizing (#11471 )	2025-05-02 16:46:01 -04:00
co63oc	86294d3c7f	Fix typos in docs and comments (#11416 ) * Fix typos in docs and comments * Apply style fixes --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-04-30 20:30:53 -10:00
Vaibhav Kumawat	daf0a23958	Add LANCZOS as default interpolation mode. (#11463 ) * Add LANCZOS as default interpolation mode. * LANCZOS as default interpolation * LANCZOS as default interpolation mode * Added LANCZOS as default interpolation mode	2025-04-30 14:22:38 -04:00
captainzz	8cd7426e56	Add StableDiffusion3InstructPix2PixPipeline (#11378 ) * upload StableDiffusion3InstructPix2PixPipeline * Move to community * Add readme * Fix images * remove images * Change image url * fix * Apply style fixes	2025-04-30 06:13:12 -04:00
Youlun Peng	58431f102c	Set LANCZOS as the default interpolation for image resizing in ControlNet training (#11449 ) Set LANCZOS as the default interpolation for image resizing	2025-04-29 08:47:02 -04:00
Linoy Tsaban	0ac1d5b482	[Hi-Dream LoRA] fix bug in validation (#11439 ) remove unnecessary pipeline moving to cpu in validation Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-04-28 06:22:32 -10:00
tongyu	3da98e7ee3	[train_text_to_image_lora] Better image interpolation in training scripts follow up (#11427 ) * Update train_text_to_image_lora.py * update_train_text_to_image_lora	2025-04-28 11:23:24 -04:00
tongyu	b3b04fefde	[train_text_to_image] Better image interpolation in training scripts follow up (#11426 ) * Update train_text_to_image.py * update	2025-04-28 10:50:33 -04:00
Mert Erbak	bd96a084d3	[train_dreambooth_lora.py] Set LANCZOS as default interpolation mode for resizing (#11421 ) * Set LANCZOS as default interpolation mode for resizing * [train_dreambooth_lora.py] Set LANCZOS as default interpolation mode for resizing	2025-04-26 01:58:41 -04:00
co63oc	f00a995753	Fix typos in strings and comments (#11407 )	2025-04-24 08:53:47 -10:00
Linoy Tsaban	edd7880418	[HiDream LoRA] optimizations + small updates (#11381 ) * 1. add pre-computation of prompt embeddings when custom prompts are used as well 2. save model card even if model is not pushed to hub 3. remove scheduler initialization from code example - not necessary anymore (it's now if the base model's config) 4. add skip_final_inference - to allow to run with validation, but skip the final loading of the pipeline with the lora weights to reduce memory reqs * pre encode validation prompt as well * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * pre encode validation prompt as well * Apply style fixes * empty commit * change default trained modules * empty commit * address comments + change encoding of validation prompt (before it was only pre-encoded if custom prompts are provided, but should be pre-encoded either way) * Apply style fixes * empty commit * fix validation_embeddings definition * fix final inference condition * fix pipeline deletion in last inference * Apply style fixes * empty commit * layers * remove readme remarks on only pre-computing when instance prompt is provided and change example to 3d icons * smol fix * empty commit --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-04-24 07:48:19 +03:00
Teriks	b4be42282d	Kolors additional pipelines, community contrib (#11372 ) * Kolors additional pipelines, community contrib --------- Co-authored-by: Teriks <Teriks@users.noreply.github.com> Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>	2025-04-23 11:07:27 -10:00
Ishan Dutta	4b60f4b602	[train_dreambooth_flux] Add LANCZOS as the default interpolation mode for image resizing (#11395 )	2025-04-23 10:47:05 -04:00
Ameer Azam	026507c06c	Update README_hidream.md (#11386 ) Small change requirements_sana.txt to requirements_hidream.txt	2025-04-22 20:08:26 -04:00
Linoy Tsaban	e30d3bf544	[LoRA] add LoRA support to HiDream and fine-tuning script (#11281 ) * initial commit * initial commit * initial commit * initial commit * initial commit * initial commit * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com> * move prompt embeds, pooled embeds outside * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by: hlky <hlky@hlky.ac> * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by: hlky <hlky@hlky.ac> * fix import * fix import and tokenizer 4, text encoder 4 loading * te * prompt embeds * fix naming * shapes * initial commit to add HiDreamImageLoraLoaderMixin * fix init * add tests * loader * fix model input * add code example to readme * fix default max length of text encoders * prints * nullify training cond in unpatchify for temp fix to incompatible shaping of transformer output during training * smol fix * unpatchify * unpatchify * fix validation * flip pred and loss * fix shift!!! 
* revert unpatchify changes (for now) * smol fix * Apply style fixes * workaround moe training * workaround moe training * remove prints * to reduce some memory, keep vae in `weight_dtype` same as we have for flux (as it's the same vae) `bbd0c161b5/examples/dreambooth/train_dreambooth_lora_flux.py (L1207)` * refactor to align with HiDream refactor * refactor to align with HiDream refactor * refactor to align with HiDream refactor * add support for cpu offloading of text encoders * Apply style fixes * adjust lr and rank for train example * fix copies * Apply style fixes * update README * update README * update README * fix license * keep prompt2,3,4 as None in validation * remove reverse ode comment * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Update examples/dreambooth/train_dreambooth_lora_hidream.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * vae offload change * fix text encoder offloading * Apply style fixes * cleaner to_kwargs * fix module name in copied from * add requirements * fix offloading * fix offloading * fix offloading * update transformers version in reqs * try AutoTokenizer * try AutoTokenizer * Apply style fixes * empty commit * Delete tests/lora/test_lora_layers_hidream.py * change tokenizer_4 to load with AutoTokenizer as well * make text_encoder_four and tokenizer_four configurable * save model card * save model card * revert T5 * fix test * remove non diffusers lumina2 conversion --------- Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com> Co-authored-by: hlky <hlky@hlky.ac> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-04-22 11:44:02 +03:00
PromeAI	7a4a126db8	fix issue that training flux controlnet was unstable and validation r… (#11373 ) * fix issue that training flux controlnet was unstable and validation results were unstable * del unused code pieces, fix grammar --------- Co-authored-by: Your Name <you@example.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-04-21 08:16:05 -10:00
Kenneth Gerald Hamilton	0dec414d5b	[train_dreambooth_lora_sdxl.py] Fix the LR Schedulers when num_train_epochs is passed in a distributed training env (#11240 ) Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>	2025-04-21 12:51:03 +05:30
Linoy Tsaban	44eeba07b2	[Flux LoRAs] fix lr scheduler bug in distributed scenarios (#11242 ) * add fix * add fix * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-04-21 10:08:45 +03:00
Kazuki Yoda	ef47726e2d	Fix: `StableDiffusionXLControlNetAdapterInpaintPipeline` incorrectly inherited `StableDiffusionLoraLoaderMixin` (#11357 ) Fix: Inherit `StableDiffusionXLLoraLoaderMixin` `StableDiffusionXLControlNetAdapterInpaintPipeline` used to incorrectly inherit `StableDiffusionLoraLoaderMixin` instead of `StableDiffusionXLLoraLoaderMixin`	2025-04-18 12:46:06 -10:00
Sayak Paul	4b868f14c1	post release 0.33.0 (#11255 ) * post release * update * fix deprecations * remaining * update --------- Co-authored-by: YiYi Xu <yixu310@gmail.com>	2025-04-15 06:50:08 -10:00
Dhruv Nair	edc154da09	Update Ruff to latest Version (#10919 ) * update * update * update * update	2025-04-09 16:51:34 +05:30
Sayak Paul	fd02aad402	fix: SD3 ControlNet validation so that it runs on an A100. (#11238 ) * fix: SD3 ControlNet validation so that it runs on an A100. * use backend-agnostic cache and pass device.	2025-04-09 12:12:53 +05:30
Linoy Tsaban	71f34fc5a4	[Flux LoRA] fix issues in flux lora scripts (#11111 ) * remove custom scheduler * update requirements.txt * log_validation with mixed precision * add intermediate embeddings saving when checkpointing is enabled * remove comment * fix validation * add unwrap_model for accelerator, torch.no_grad context for validation, fix accelerator.accumulate call in advanced script * revert unwrap_model change temp * add .module to address distributed training bug + replace accelerator.unwrap_model with unwrap model * changes to align advanced script with canonical script * make changes for distributed training + unify unwrap_model calls in advanced script * add module.dtype fix to dreambooth script * unify unwrap_model calls in dreambooth script * fix condition in validation run * mixed precision * Update examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced.py Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * smol style change * change autocast * Apply style fixes --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-04-08 17:40:30 +03:00
Álvaro Somoza	723dbdd363	[Training] Better image interpolation in training scripts (#11206 ) * initial * Update examples/dreambooth/train_dreambooth_lora_sdxl.py Co-authored-by: hlky <hlky@hlky.ac> * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: hlky <hlky@hlky.ac>	2025-04-08 12:26:07 +05:30
Bhavay Malhotra	fbf61f465b	[train_controlnet.py] Fix the LR schedulers when num_train_epochs is passed in a distributed training env (#8461 ) * Create diffusers.yml * fix num_train_epochs * Delete diffusers.yml * Fixed Changes --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com>	2025-04-08 12:10:09 +05:30
Edna	41afb6690c	Add Wan with STG as a community pipeline (#11184 ) * Add stg wan to community pipelines * remove debug prints * remove unused comment * Update doc * Add credit + fix typo * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-04-05 04:00:40 +02:00
Kenneth Gerald Hamilton	f10775b1b5	Fixed requests.get function call by adding timeout parameter. (#11156 ) * Fixed requests.get function call by adding timeout parameter. * declare DIFFUSERS_REQUEST_TIMEOUT in constants and import when needed * remove unneeded os import * Apply style fixes --------- Co-authored-by: Sai-Suraj-27 <sai.suraj.27.729@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-04-04 07:23:14 +01:00
Abhipsha Das	d9023a671a	[Model Card] standardize advanced diffusion training sdxl lora (#7615 ) * model card gen code * push modelcard creation * remove optional from params * add import * add use_dora check * correct lora var use in tags * make style && make quality --------- Co-authored-by: Aryan <aryan@huggingface.co> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-04-03 07:43:01 +05:30
Eliseu Silva	c4646a3931	feat: [Community Pipeline] - FaithDiff Stable Diffusion XL Pipeline (#11188 ) * feat: [Community Pipeline] - FaithDiff Stable Diffusion XL Pipeline for Image SR. * added pipeline	2025-04-02 11:33:19 -10:00
hlky	d6f4774c1c	Add `latents_mean` and `latents_std` to `SDXLLongPromptWeightingPipeline` (#11034 )	2025-03-31 11:32:29 -10:00
Tolga Cangöz	0213179ba8	Update README and example code for AnyText usage (#11028 ) * [Documentation] Update README and example code with additional usage instructions for AnyText * [Documentation] Update README for AnyTextPipeline and improve logging in code * Remove wget command for font file from example docstring in anytext.py	2025-03-23 21:15:57 +05:30
Parag Ekbote	f424b1b062	Notebooks for Community Scripts-8 (#11128 ) Add 4 Notebooks and update the missing links for the example README.	2025-03-20 12:24:46 -07:00
Yuqian Hong	fc28791fc8	[BUG] Fix Autoencoderkl train script (#11113 ) * add disc_optimizer step (not fix) * support syncbatchnorm in discriminator	2025-03-19 16:49:02 +05:30
Juan Acevedo	27916822b2	update readme instructions. (#11096 ) Co-authored-by: Juan Acevedo <jfacevedo@google.com>	2025-03-17 20:07:48 -10:00
Yuxuan Zhang	82188cef04	CogView4 Control Block (#10809 ) * cogview4 control training --------- Co-authored-by: OleehyO <leehy0357@gmail.com> Co-authored-by: yiyixuxu <yixu310@gmail.com>	2025-03-15 07:15:56 -10:00
Juan Acevedo	6b9a3334db	reverts accidental change that removes attn_mask in attn. Improves fl… (#11065 ) reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop. Co-authored-by: Juan Acevedo <jfacevedo@google.com>	2025-03-14 12:47:01 -10:00
Andreas Jörg	8ead643bb7	[examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051 ) Fix: dtype mismatch of prompt embeddings in sd3 controlnet training Co-authored-by: Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2025-03-14 17:33:15 +05:30
Yaniv Galron	5e48cd27d4	making ```formatted_images``` initialization compact (#10801 ) compact writing Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com>	2025-03-13 09:27:14 -10:00
wonderfan	36d0553af2	chore: fix help messages in advanced diffusion examples (#10923 )	2025-03-11 07:33:55 -10:00
Eliseu Silva	4e3ddd5afa	fix: mixture tiling sdxl pipeline - adjust generating time_ids & embeddings (#11012 ) small fix on generating time_ids & embeddings	2025-03-11 04:20:18 -03:00
Tolga Cangöz	b88fef4785	[`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998 ) * Add initial template * Second template * feat: Add TextEmbeddingModule to AnyTextPipeline * feat: Add AuxiliaryLatentModule template to AnyTextPipeline * Add bert tokenizer from the anytext repo for now * feat: Update AnyTextPipeline's modify_prompt method This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe. * Fill in the `forward` pass of `AuxiliaryLatentModule` * `make style && make quality` * `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library` * Update error handling to raise and logging * Add `create_glyph_lines` function into `TextEmbeddingModule` * make style * Up * Up * Up * Up * Remove several comments * refactor: Remove ControlNetConditioningEmbedding and update code accordingly * Up * Up * up * refactor: Update AnyTextPipeline to include new optional parameters * up * feat: Add OCR model and its components * chore: Update `TextEmbeddingModule` to include OCR model components and dependencies * chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task * `make style` * refactor: Update `AnyTextPipeline`'s docstring * Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once * simplify * `make style` * Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function * Simplify for now * `make style` * Up * feat: Add scripts to convert AnyText controlnet to diffusers * `make style` * Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule` * make style * Up * Simplify * Up * feat: Add safetensors module for loading model file * Fix device issues * Up * 
Up * refactor: Simplify * refactor: Simplify code for loading models and handling data types * `make style` * refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule * refactor: Update dtype in embedding_manager.py to match proj.weight * Up * Add attribution and adaptation information to pipeline_anytext.py * Update usage example * Will refactor `controlnet_cond_embedding` initialization * Add `AnyTextControlNetConditioningEmbedding` template * Refactor organization * style * style * Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding` * Follow one-file policy * style * [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel * [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py * [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py * Refactor AnyTextControlNet to use configurable conditioning embedding channels * Complete control net conditioning embedding in AnyTextControlNetModel * up * [FIX] Ensure embeddings use correct device in AnyTextControlNetModel * up * up * style * [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline * [UPDATE] Update example code in anytext.py to use correct font file and improve clarity * down * [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing * update pillow * [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity * [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file * [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency * 🆙 * style * [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py * style * Update examples/research_projects/anytext/README.md Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Remove commented-out 
image preparation code in AnyTextPipeline * Remove unnecessary blank line in README.md	2025-03-11 01:49:37 +05:30


1188 Commits