mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Files

Name	Last commit	Date
advanced_diffusion_training	[flux lora training] fix t5 training bug (#10845)	2025-03-05 13:47:01 +02:00
amused	[chore] change licensing to 2025 from 2024. (#10615)	2025-01-20 16:57:27 -10:00
cogvideo	Fix incorrect seed initialization when args.seed is 0 (#10964)	2025-03-04 10:09:52 -10:00
community	Add STG to community pipelines (#10960)	2025-03-08 00:28:24 +05:30
consistency_distillation	[chore] change licensing to 2025 from 2024. (#10615)	2025-01-20 16:57:27 -10:00
controlnet	typo fix (#10802)	2025-02-16 20:56:54 +05:30
custom_diffusion	Fix incorrect seed initialization when args.seed is 0 (#10964)	2025-03-04 10:09:52 -10:00
dreambooth	[train_dreambooth_lora.py] Fix the LR Schedulers when num_train_epochs is passed in a distributed training env (#10973)	2025-03-06 10:06:24 +05:30
flux-control	[chore] change licensing to 2025 from 2024. (#10615)	2025-01-20 16:57:27 -10:00
inference	Refactor Pipelines / Community pipelines and add better explanations. (#257)	2022-08-30 18:43:42 +02:00
instruct_pix2pix	Fix inconsistent random transform in instruct pix2pix (#10698)	2025-01-31 08:29:29 -10:00
kandinsky2_2/text_to_image	[chore] change licensing to 2025 from 2024. (#10615)	2025-01-20 16:57:27 -10:00
model_search	[Community] Enhanced Model Search (#10417)	2025-02-05 14:43:53 -10:00
reinforcement_learning	Add Diffusion Policy for Reinforcement Learning (#9824)	2024-11-02 09:18:44 +05:30
research_projects	[Research Project] Add AnyText: Multilingual Visual Text Generation And Editing (#8998)	2025-03-11 01:49:37 +05:30
server	Add server example (#9918)	2024-11-18 09:26:13 -08:00
t2i_adapter	[chore] change licensing to 2025 from 2024. (#10615)	2025-01-20 16:57:27 -10:00
text_to_image	Fix incorrect seed initialization when args.seed is 0 (#10964)	2025-03-04 10:09:52 -10:00
textual_inversion	[chore] change licensing to 2025 from 2024. (#10615)	2025-01-20 16:57:27 -10:00
unconditional_image_generation	[chore] post release 0.32.0 (#10361)	2024-12-23 10:03:34 -10:00
vqgan	[chore] change licensing to 2025 from 2024. (#10615)	2025-01-20 16:57:27 -10:00
conftest.py	change to 2024 in the license (#6902)	2024-02-08 08:19:31 -10:00
README.md	Notebooks for Community Scripts-7 (#10846)	2025-02-20 09:02:09 -08:00
test_examples_utils.py	change to 2024 in the license (#6902)	2024-02-08 08:19:31 -10:00

README.md

🧨 Diffusers Examples

Diffusers examples are a collection of scripts to demonstrate how to effectively use the diffusers library for a variety of use cases involving training or fine-tuning.

Note: If you are looking for official examples on how to use diffusers for inference, please have a look at src/diffusers/pipelines.

Our examples aspire to be self-contained, easy-to-tweak, beginner-friendly and one-purpose-only. More specifically, this means:

Self-contained: An example script shall only depend on "pip-install-able" Python packages that can be found in a requirements.txt file. Example scripts shall not depend on any local files. This means that one can simply download an example script, e.g. train_unconditional.py, install the required dependencies, e.g. requirements.txt, and execute the example script.
Easy-to-tweak: While we strive to present as many use cases as possible, the example scripts are just that: examples. It is expected that they won't work out of the box on your specific problem and that you will be required to change a few lines of code to adapt them to your needs. To help you with that, most of the examples fully expose the preprocessing of the data and the training loop to allow you to tweak and edit them as required.
Beginner-friendly: We do not aim to provide state-of-the-art training scripts for the newest models, but rather examples that can be used as a way to better understand diffusion models and how to use them with the diffusers library. We often purposefully leave out certain state-of-the-art methods if we consider them too complex for beginners.
One-purpose-only: Examples should show one task and one task only. Even if two tasks are very similar from a modeling point of view (e.g. image super-resolution and image modification tend to use the same model and training method), we want each example to showcase only one task to keep it as readable and easy to understand as possible.

We provide official examples that cover the most popular tasks of diffusion models. Official examples are actively maintained by the diffusers maintainers, and we try to rigorously follow our example philosophy as defined above. If you feel that another important example should exist, we are more than happy to welcome a Feature Request or, directly, a Pull Request from you!

Training examples show how to pretrain or fine-tune diffusion models for a variety of tasks. Currently we support:

Task	🤗 Accelerate	🤗 Datasets	Colab
Unconditional Image Generation	✅	✅
Text-to-Image fine-tuning	✅	✅
Textual Inversion	✅	-
Dreambooth	✅	-
ControlNet	✅	✅	Notebook
InstructPix2Pix	✅	✅	Notebook
Reinforcement Learning for Control	-	-	Notebook1, Notebook2

Community

In addition, we provide community examples, which are examples added and maintained by our community. Community examples can consist of either training examples or inference pipelines. For such examples, we are more lenient regarding the philosophy defined above and cannot guarantee maintenance for every issue. Examples that are useful for the community but are either not yet deemed popular or do not yet follow the philosophy above should go into the community examples folder. The community folder therefore includes training examples and inference pipelines. Note: Community examples can be a great first contribution to show the community how you like to use diffusers 🪄.

Research Projects

We also provide research_projects examples that are maintained by the community as defined in the respective research project folders. These examples offer extended capabilities that complement the official examples. You may refer to research_projects for details.

Important note

To make sure you can successfully run the latest versions of the example scripts, you have to install the library from source and install some example-specific requirements. To do this, execute the following steps in a new virtual environment:

git clone https://github.com/huggingface/diffusers
cd diffusers
pip install .

Then cd into the example folder of your choice and run:

pip install -r requirements.txt
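
The steps above can also be collected into a single shell function. The following is a minimal sketch, not something shipped with the repository: the names `run` and `setup_example`, the `DRY_RUN` switch, the `.venv` directory and the default folder `examples/text_to_image` are all assumptions made for illustration, and it assumes that `python` points to a Python 3 interpreter. With `DRY_RUN=1`, the function only prints the commands it would run.

```shell
# Sketch only: run, setup_example, DRY_RUN and .venv are illustrative names,
# not part of diffusers.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Install diffusers from source, then the requirements of one example folder.
# $1: example folder relative to the repository root
#     (assumed default: examples/text_to_image)
setup_example() {
  example_dir="${1:-examples/text_to_image}"

  run python -m venv .venv &&              # create a new virtual environment
  run . .venv/bin/activate &&              # activate it in the current shell
  run git clone https://github.com/huggingface/diffusers &&
  run cd diffusers &&
  run pip install . &&                     # install the library from source
  run cd "$example_dir" &&
  run pip install -r requirements.txt      # example-specific requirements
}
```

For instance, `DRY_RUN=1 setup_example examples/dreambooth` lists the seven commands without executing them, and `setup_example examples/dreambooth` runs them, stopping at the first command that fails.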