mirror of
https://github.com/huggingface/diffusers.git
synced 2026-01-27 17:22:53 +03:00
[docs] clarify the mapping between Transformer2DModel and finegrained variants. (#11947)
* clarify the mapping between Transformer2DModel and finegrained variants. * Update src/diffusers/pipelines/dit/pipeline_dit.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
This commit is contained in:
@@ -46,7 +46,9 @@ class DiTPipeline(DiffusionPipeline):
|
||||
|
||||
Parameters:
|
||||
transformer ([`DiTTransformer2DModel`]):
|
||||
A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents.
|
||||
A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents. Initially published as
|
||||
[`Transformer2DModel`](https://huggingface.co/facebook/DiT-XL-2-256/blob/main/transformer/config.json#L2)
|
||||
in the config, but the mismatch can be ignored.
|
||||
vae ([`AutoencoderKL`]):
|
||||
Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
|
||||
scheduler ([`DDIMScheduler`]):
|
||||
|
||||
@@ -256,7 +256,9 @@ class PixArtAlphaPipeline(DiffusionPipeline):
|
||||
Tokenizer of class
|
||||
[T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
|
||||
transformer ([`PixArtTransformer2DModel`]):
|
||||
A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents.
|
||||
A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
|
||||
[`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS/blob/main/transformer/config.json#L2)
|
||||
in the config, but the mismatch can be ignored.
|
||||
scheduler ([`SchedulerMixin`]):
|
||||
A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
|
||||
"""
|
||||
|
||||
@@ -185,6 +185,26 @@ def retrieve_timesteps(
|
||||
class PixArtSigmaPipeline(DiffusionPipeline):
|
||||
r"""
|
||||
Pipeline for text-to-image generation using PixArt-Sigma.
|
||||
|
||||
This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the
|
||||
library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.)
|
||||
|
||||
Args:
|
||||
vae ([`AutoencoderKL`]):
|
||||
Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations.
|
||||
text_encoder ([`T5EncoderModel`]):
|
||||
Frozen text-encoder. PixArt-Alpha uses
|
||||
[T5](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5EncoderModel), specifically the
|
||||
[t5-v1_1-xxl](https://huggingface.co/PixArt-alpha/PixArt-alpha/tree/main/t5-v1_1-xxl) variant.
|
||||
tokenizer (`T5Tokenizer`):
|
||||
Tokenizer of class
|
||||
[T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
|
||||
transformer ([`PixArtTransformer2DModel`]):
|
||||
A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
|
||||
[`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS/blob/main/transformer/config.json#L2)
|
||||
in the config, but the mismatch can be ignored.
|
||||
scheduler ([`SchedulerMixin`]):
|
||||
A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
|
||||
"""
|
||||
|
||||
bad_punct_regex = re.compile(
|
||||
|
||||
Reference in New Issue
Block a user