mirror of https://github.com/huggingface/diffusers.git synced 2026-01-29 07:22:12 +03:00

Steven Liu cc5b31ffc9 [docs] Migrate syntax (#12390 )

* change syntax

* make style

2025-09-30 10:11:19 -07:00

Kandinsky 3

Kandinsky 3 was created by Vladimir Arkhipkin, Anastasia Maltseva, Igor Pavlov, Andrei Filatov, Arseniy Shakhmatov, Andrey Kuznetsov, Denis Dimitrov, and Zein Shaheen.

The description from its GitHub page:

Kandinsky 3.0 is an open-source text-to-image diffusion model built upon the Kandinsky2-x model family. In comparison to its predecessors, enhancements have been made to the text understanding and visual quality of the model, achieved by increasing the size of the text encoder and Diffusion U-Net models, respectively.

Its architecture includes 3 main components:

1. FLAN-UL2, which is an encoder decoder model based on the T5 architecture.
2. New U-Net architecture featuring BigGAN-deep blocks doubles depth while maintaining the same number of parameters.
3. Sber-MoVQGAN is a decoder proven to have superior results in image restoration.

The original codebase can be found at ai-forever/Kandinsky-3.

Tip

Check out the Kandinsky Community organization on the Hub for the official model checkpoints for tasks like text-to-image, image-to-image, and inpainting.

Tip

Make sure to check out the schedulers guide to learn how to explore the tradeoff between scheduler speed and quality, and see the reuse components across pipelines section to learn how to efficiently load the same components into multiple pipelines.
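As a rough illustration of reusing components, the sketch below builds an image-to-image pipeline from a text-to-image pipeline that is already in memory, so the text encoder, U-Net, and MoVQ decoder are not loaded twice. The helper name `share_components` is hypothetical, and the sketch assumes both Kandinsky 3 pipelines accept the same set of components.

```python
def share_components(text2img_pipe):
    """Return an image-to-image pipeline that reuses the models in `text2img_pipe`.

    `share_components` is a hypothetical helper name used only for this sketch.
    """
    # Imported lazily so this snippet can be loaded without diffusers installed.
    from diffusers import Kandinsky3Img2ImgPipeline

    # `components` is a dict of the sub-models held by the pipeline; passing it
    # to another pipeline class shares the same objects instead of reloading
    # the weights. This assumes both pipelines expect the same component names.
    return Kandinsky3Img2ImgPipeline(**text2img_pipe.components)
```

Given an already loaded `Kandinsky3Pipeline` called `pipe`, `share_components(pipe)` would return a second pipeline backed by the same weights.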

Kandinsky3Pipeline

[[autodoc]] Kandinsky3Pipeline
	- all
	- __call__
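A minimal text-to-image sketch with `Kandinsky3Pipeline` might look like the following. The checkpoint name `kandinsky-community/kandinsky-3`, the `fp16` variant, and the default step count and seed are assumptions not stated on this page; `generate_image` is a hypothetical wrapper, and running it downloads several gigabytes of weights and realistically needs a GPU.

```python
# Assumed checkpoint name from the Kandinsky Community organization on the Hub.
MODEL_ID = "kandinsky-community/kandinsky-3"


def generate_image(prompt, steps=25, seed=0):
    """Generate one image from `prompt` and return it as a PIL image."""
    # Imported lazily so the snippet can be loaded without the heavy dependencies.
    import torch
    from diffusers import Kandinsky3Pipeline

    # Load half-precision weights to reduce memory use (assumes an fp16 variant exists).
    pipe = Kandinsky3Pipeline.from_pretrained(
        MODEL_ID, variant="fp16", torch_dtype=torch.float16
    )
    # Keep idle sub-models on the CPU to lower peak GPU memory (requires accelerate).
    pipe.enable_model_cpu_offload()

    # A seeded generator makes the result reproducible for a fixed setup.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    result = pipe(prompt, num_inference_steps=steps, generator=generator)
    return result.images[0]
```

Calling `generate_image("A photograph of the inside of a subway train")` would return an image that can be saved with its `save` method.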

Kandinsky3Img2ImgPipeline

[[autodoc]] Kandinsky3Img2ImgPipeline
	- all
	- __call__
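Image-to-image generation can be sketched in a similar way with `Kandinsky3Img2ImgPipeline`. The checkpoint name, the `fp16` variant, and the default values below are assumptions rather than details from this page, and `restyle_image` is a hypothetical helper. The `strength` argument controls how far the result may drift from the input image, so the sketch rejects values outside 0 to 1 before loading anything.

```python
# Assumed checkpoint name from the Kandinsky Community organization on the Hub.
MODEL_ID = "kandinsky-community/kandinsky-3"


def restyle_image(init_image, prompt, strength=0.75, steps=25, seed=0):
    """Transform `init_image` (a PIL image) according to `prompt`."""
    # Validate early, before any weights are downloaded.
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")

    # Imported lazily so the snippet can be loaded without the heavy dependencies.
    import torch
    from diffusers import Kandinsky3Img2ImgPipeline

    pipe = Kandinsky3Img2ImgPipeline.from_pretrained(
        MODEL_ID, variant="fp16", torch_dtype=torch.float16
    )
    # Keep idle sub-models on the CPU to lower peak GPU memory (requires accelerate).
    pipe.enable_model_cpu_offload()

    generator = torch.Generator(device="cpu").manual_seed(seed)
    result = pipe(
        prompt,
        image=init_image,
        strength=strength,  # higher values allow larger changes to the input
        num_inference_steps=steps,
        generator=generator,
    )
    return result.images[0]
```

Lower `strength` values keep the output closer to the input image, while values near 1 let the prompt dominate.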
