Mirror of https://github.com/huggingface/diffusers.git, synced 2026-01-27 17:22:53 +03:00
* Initial LTX 2.0 transformer implementation
* Add tests for LTX 2 transformer model
* Get LTX 2 transformer tests working
* Rename LTX 2 compile test class to have LTX2
* Remove RoPE debug print statements
* Get LTX 2 transformer compile tests passing
* Fix LTX 2 transformer shape errors
* Initial script to convert LTX 2 transformer to diffusers
* Add more LTX 2 transformer audio arguments
* Allow LTX 2 transformer to be loaded from local path for conversion
* Improve dummy inputs and add test for LTX 2 transformer consistency
* Fix LTX 2 transformer bugs so consistency test passes
* Initial implementation of LTX 2.0 video VAE
* Explicitly specify temporal and spatial VAE scale factors when converting
* Add initial LTX 2.0 video VAE tests
* Add initial LTX 2.0 video VAE tests (part 2)
* Get diffusers implementation on par with official LTX 2.0 video VAE implementation
* Initial LTX 2.0 vocoder implementation
* Use RMSNorm implementation closer to original for LTX 2.0 video VAE
* start audio decoder.
* init registration.
* up
* simplify and clean up
* up
* Initial LTX 2.0 text encoder implementation
* Rough initial LTX 2.0 pipeline implementation
* up
* up
* up
* up
* Add imports for LTX 2.0 Audio VAE
* Conversion script for LTX 2.0 Audio VAE Decoder
* Add Audio VAE logic to T2V pipeline
* Duplicate scheduler for audio latents
* Support num_videos_per_prompt for prompt embeddings
* LTX 2.0 scheduler and full pipeline conversion
* Add script to test full LTX2Pipeline T2V inference
* Fix pipeline return bugs
* Add LTX 2 text encoder and vocoder to ltx2 subdirectory __init__
* Fix more bugs in LTX2Pipeline.__call__
* Improve CPU offload support
* Fix pipeline audio VAE decoding dtype bug
* Fix video shape error in full pipeline test script
* Get LTX 2 T2V pipeline to produce reasonable outputs
* Make LTX 2.0 scheduler more consistent with original code
* Fix typo when applying scheduler fix in T2V inference script
* Refactor Audio VAE to be simpler and remove helpers (#7)
* remove resolve causality axes stuff.
* remove a bunch of helpers.
* remove adjust output shape helper.
* remove the use of audiolatentshape.
* move normalization and patchify out of pipeline.
* fix
* up
* up
* Remove unpatchify and patchify ops before audio latents denormalization (#9)
---------
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* Add support for I2V (#8)
* start i2v.
* up
* up
* up
* up
* up
* remove uniform strategy code.
* remove unneeded code.
* Denormalize audio latents in I2V pipeline (analogous to T2V change) (#11)
* test i2v.
* Move Video and Audio Text Encoder Connectors to Transformer (#12)
* Denormalize audio latents in I2V pipeline (analogous to T2V change)
* Initial refactor to put video and audio text encoder connectors in transformer
* Get LTX 2 transformer tests working after connector refactor
* precompute run_connectors,.
* fixes
* Address review comments
* Calculate RoPE double precisions freqs using torch instead of np
* Further simplify LTX 2 RoPE freq calc
* Make connectors a separate module (#18)
* remove text_encoder.py
* address yiyi's comments.
* up
* up
* up
* up
---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
* up (#19)
* address initial feedback from lightricks team (#16)
* cross_attn_timestep_scale_multiplier to 1000
* implement split rope type.
* up
* propagate rope_type to rope embed classes as well.
* up
* When using split RoPE, make sure that the output dtype is same as input dtype
* Fix apply split RoPE shape error when reshaping x to 4D
* Add export_utils file for exporting LTX 2.0 videos with audio
* Tests for T2V and I2V (#6)
* add ltx2 pipeline tests.
* up
* up
* up
* up
* remove content
* style
* Denormalize audio latents in I2V pipeline (analogous to T2V change)
* Initial refactor to put video and audio text encoder connectors in transformer
* Get LTX 2 transformer tests working after connector refactor
* up
* up
* i2v tests.
* up
* Address review comments
* Calculate RoPE double precisions freqs using torch instead of np
* Further simplify LTX 2 RoPE freq calc
* revert unneded changes.
* up
* up
* update to split style rope.
* up
---------
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
* up
* use export util funcs.
* Point original checkpoint to LTX 2.0 official checkpoint
* Allow the I2V pipeline to accept image URLs
* make style and make quality
* remove function map.
* remove args.
* update docs.
* update doc entries.
* disable ltx2_consistency test
* Simplify LTX 2 RoPE forward by removing coords is None logic
* make style and make quality
* Support LTX 2.0 audio VAE encoder
* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Remove print statement in audio VAE
* up
* Fix bug when calculating audio RoPE coords
* Ltx 2 latent upsample pipeline (#12922)
* Initial implementation of LTX 2.0 latent upsampling pipeline
* Add new LTX 2.0 spatial latent upsampler logic
* Add test script for LTX 2.0 latent upsampling
* Add option to enable VAE tiling in upsampling test script
* Get latent upsampler working with video latents
* Fix typo in BlurDownsample
* Add latent upsample pipeline docstring and example
* Remove deprecated pipeline VAE slicing/tiling methods
* make style and make quality
* When returning latents, return unpacked and denormalized latents for T2V and I2V
* Add model_cpu_offload_seq for latent upsampling pipeline
---------
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
* Fix latent upsampler filename in LTX 2 conversion script
* Add latent upsample pipeline to LTX 2 docs
* Add dummy objects for LTX 2 latent upsample pipeline
* Set default FPS to official LTX 2 ckpt default of 24.0
* Set default CFG scale to official LTX 2 ckpt default of 4.0
* Update LTX 2 pipeline example docstrings
* make style and make quality
* Remove LTX 2 test scripts
* Fix LTX 2 upsample pipeline example docstring
* Add logic to convert and save a LTX 2 upsampling pipeline
* Document LTX2VideoTransformer3DModel forward pass
---------
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
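The commit notes above outline the public surface that landed: an LTX2Pipeline for text-to-video with joint audio (video VAE, audio VAE, vocoder), an I2V variant, a latent upsample pipeline, export utilities, and official checkpoint defaults of 24.0 FPS and a CFG scale of 4.0. As a rough orientation, a minimal T2V sketch consistent with those notes might look like the following; the checkpoint id and the output field names are assumptions, not confirmed API.

```python
# Minimal T2V sketch based on the commit notes above. The model id and
# output attribute names are assumptions -- check the LTX-2 docs page
# (api/pipelines/ltx2) for the confirmed API.
import torch
from diffusers import LTX2Pipeline

pipe = LTX2Pipeline.from_pretrained(
    "Lightricks/LTX-2",          # placeholder checkpoint id
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # CPU offload support is mentioned in the notes

output = pipe(
    prompt="A red panda drumming on a tiny drum kit",
    guidance_scale=4.0,          # official LTX 2 default per the commit notes
)
video = output.frames[0]  # assumed field names; per the notes the pipeline also
audio = output.audio[0]   # returns audio decoded by the audio VAE and vocoder
```

The notes also mention an export_utils module for writing video with audio to disk and a latent upsample pipeline for upscaling generated latents; their exact entry points are not spelled out in the commit message, so they are left out of the sketch.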
796 lines · 27 KiB · YAML
- sections:
  - local: index
    title: Diffusers
  - local: installation
    title: Installation
  - local: quicktour
    title: Quickstart
  - local: stable_diffusion
    title: Basic performance
  title: Get started
- isExpanded: false
  sections:
  - local: using-diffusers/loading
    title: DiffusionPipeline
  - local: tutorials/autopipeline
    title: AutoPipeline
  - local: using-diffusers/custom_pipeline_overview
    title: Community pipelines and components
  - local: using-diffusers/callback
    title: Pipeline callbacks
  - local: using-diffusers/reusing_seeds
    title: Reproducibility
  - local: using-diffusers/schedulers
    title: Schedulers
  - local: using-diffusers/automodel
    title: AutoModel
  - local: using-diffusers/other-formats
    title: Model formats
  - local: using-diffusers/push_to_hub
    title: Sharing pipelines and models
  title: Pipelines
- isExpanded: false
  sections:
  - local: tutorials/using_peft_for_inference
    title: LoRA
  - local: using-diffusers/ip_adapter
    title: IP-Adapter
  - local: using-diffusers/controlnet
    title: ControlNet
  - local: using-diffusers/t2i_adapter
    title: T2I-Adapter
  - local: using-diffusers/dreambooth
    title: DreamBooth
  - local: using-diffusers/textual_inversion_inference
    title: Textual inversion
  title: Adapters
- isExpanded: false
  sections:
  - local: using-diffusers/weighted_prompts
    title: Prompting
  - local: using-diffusers/create_a_server
    title: Create a server
  - local: using-diffusers/batched_inference
    title: Batch inference
  - local: training/distributed_inference
    title: Distributed inference
  title: Inference
- isExpanded: false
  sections:
  - local: optimization/fp16
    title: Accelerate inference
  - local: optimization/cache
    title: Caching
  - local: optimization/attention_backends
    title: Attention backends
  - local: optimization/memory
    title: Reduce memory usage
  - local: optimization/speed-memory-optims
    title: Compiling and offloading quantized models
  - sections:
    - local: optimization/pruna
      title: Pruna
    - local: optimization/xformers
      title: xFormers
    - local: optimization/tome
      title: Token merging
    - local: optimization/deepcache
      title: DeepCache
    - local: optimization/cache_dit
      title: CacheDiT
    - local: optimization/tgate
      title: TGATE
    - local: optimization/xdit
      title: xDiT
    - local: optimization/para_attn
      title: ParaAttention
    - local: using-diffusers/image_quality
      title: FreeU
    title: Community optimizations
  title: Inference optimization
- isExpanded: false
  sections:
  - local: hybrid_inference/overview
    title: Overview
  - local: hybrid_inference/vae_decode
    title: VAE Decode
  - local: hybrid_inference/vae_encode
    title: VAE Encode
  - local: hybrid_inference/api_reference
    title: API Reference
  title: Hybrid Inference
- isExpanded: false
  sections:
  - local: modular_diffusers/overview
    title: Overview
  - local: modular_diffusers/quickstart
    title: Quickstart
  - local: modular_diffusers/modular_diffusers_states
    title: States
  - local: modular_diffusers/pipeline_block
    title: ModularPipelineBlocks
  - local: modular_diffusers/sequential_pipeline_blocks
    title: SequentialPipelineBlocks
  - local: modular_diffusers/loop_sequential_pipeline_blocks
    title: LoopSequentialPipelineBlocks
  - local: modular_diffusers/auto_pipeline_blocks
    title: AutoPipelineBlocks
  - local: modular_diffusers/modular_pipeline
    title: ModularPipeline
  - local: modular_diffusers/components_manager
    title: ComponentsManager
  - local: modular_diffusers/guiders
    title: Guiders
  - local: modular_diffusers/custom_blocks
    title: Building Custom Blocks
  title: Modular Diffusers
- isExpanded: false
  sections:
  - local: training/overview
    title: Overview
  - local: training/create_dataset
    title: Create a dataset for training
  - local: training/adapt_a_model
    title: Adapt a model to a new task
  - local: tutorials/basic_training
    title: Train a diffusion model
  - sections:
    - local: training/unconditional_training
      title: Unconditional image generation
    - local: training/text2image
      title: Text-to-image
    - local: training/sdxl
      title: Stable Diffusion XL
    - local: training/kandinsky
      title: Kandinsky 2.2
    - local: training/wuerstchen
      title: Wuerstchen
    - local: training/controlnet
      title: ControlNet
    - local: training/t2i_adapters
      title: T2I-Adapters
    - local: training/instructpix2pix
      title: InstructPix2Pix
    - local: training/cogvideox
      title: CogVideoX
    title: Models
  - sections:
    - local: training/text_inversion
      title: Textual Inversion
    - local: training/dreambooth
      title: DreamBooth
    - local: training/lora
      title: LoRA
    - local: training/custom_diffusion
      title: Custom Diffusion
    - local: training/lcm_distill
      title: Latent Consistency Distillation
    - local: training/ddpo
      title: Reinforcement learning training with DDPO
    title: Methods
  title: Training
- isExpanded: false
  sections:
  - local: quantization/overview
    title: Getting started
  - local: quantization/bitsandbytes
    title: bitsandbytes
  - local: quantization/gguf
    title: gguf
  - local: quantization/torchao
    title: torchao
  - local: quantization/quanto
    title: quanto
  - local: quantization/modelopt
    title: NVIDIA ModelOpt
  title: Quantization
- isExpanded: false
  sections:
  - local: optimization/onnx
    title: ONNX
  - local: optimization/open_vino
    title: OpenVINO
  - local: optimization/coreml
    title: Core ML
  - local: optimization/mps
    title: Metal Performance Shaders (MPS)
  - local: optimization/habana
    title: Intel Gaudi
  - local: optimization/neuron
    title: AWS Neuron
  title: Model accelerators and hardware
- isExpanded: false
  sections:
  - local: using-diffusers/consisid
    title: ConsisID
  - local: using-diffusers/sdxl
    title: Stable Diffusion XL
  - local: using-diffusers/sdxl_turbo
    title: SDXL Turbo
  - local: using-diffusers/kandinsky
    title: Kandinsky
  - local: using-diffusers/omnigen
    title: OmniGen
  - local: using-diffusers/pag
    title: PAG
  - local: using-diffusers/inference_with_lcm
    title: Latent Consistency Model
  - local: using-diffusers/shap-e
    title: Shap-E
  - local: using-diffusers/diffedit
    title: DiffEdit
  - local: using-diffusers/inference_with_tcd_lora
    title: Trajectory Consistency Distillation-LoRA
  - local: using-diffusers/svd
    title: Stable Video Diffusion
  - local: using-diffusers/marigold_usage
    title: Marigold Computer Vision
  title: Specific pipeline examples
- isExpanded: false
  sections:
  - sections:
    - local: using-diffusers/unconditional_image_generation
      title: Unconditional image generation
    - local: using-diffusers/conditional_image_generation
      title: Text-to-image
    - local: using-diffusers/img2img
      title: Image-to-image
    - local: using-diffusers/inpaint
      title: Inpainting
    - local: advanced_inference/outpaint
      title: Outpainting
    - local: using-diffusers/text-img2vid
      title: Video generation
    - local: using-diffusers/depth2img
      title: Depth-to-image
    title: Task recipes
  - local: using-diffusers/write_own_pipeline
    title: Understanding pipelines, models and schedulers
  - local: community_projects
    title: Projects built with Diffusers
  - local: conceptual/philosophy
    title: Philosophy
  - local: using-diffusers/controlling_generation
    title: Controlled generation
  - local: conceptual/contribution
    title: How to contribute?
  - local: conceptual/ethical_guidelines
    title: Diffusers' Ethical Guidelines
  - local: conceptual/evaluation
    title: Evaluating Diffusion Models
  title: Resources
- isExpanded: false
  sections:
  - sections:
    - local: api/configuration
      title: Configuration
    - local: api/logging
      title: Logging
    - local: api/outputs
      title: Outputs
    - local: api/quantization
      title: Quantization
    - local: api/parallel
      title: Parallel inference
    title: Main Classes
  - sections:
    - local: api/modular_diffusers/pipeline
      title: Pipeline
    - local: api/modular_diffusers/pipeline_blocks
      title: Blocks
    - local: api/modular_diffusers/pipeline_states
      title: States
    - local: api/modular_diffusers/pipeline_components
      title: Components and configs
    - local: api/modular_diffusers/guiders
      title: Guiders
    title: Modular
  - sections:
    - local: api/loaders/ip_adapter
      title: IP-Adapter
    - local: api/loaders/lora
      title: LoRA
    - local: api/loaders/single_file
      title: Single files
    - local: api/loaders/textual_inversion
      title: Textual Inversion
    - local: api/loaders/unet
      title: UNet
    - local: api/loaders/transformer_sd3
      title: SD3Transformer2D
    - local: api/loaders/peft
      title: PEFT
    title: Loaders
  - sections:
    - local: api/models/overview
      title: Overview
    - local: api/models/auto_model
      title: AutoModel
    - sections:
      - local: api/models/controlnet
        title: ControlNetModel
      - local: api/models/controlnet_union
        title: ControlNetUnionModel
      - local: api/models/controlnet_flux
        title: FluxControlNetModel
      - local: api/models/controlnet_hunyuandit
        title: HunyuanDiT2DControlNetModel
      - local: api/models/controlnet_sana
        title: SanaControlNetModel
      - local: api/models/controlnet_sd3
        title: SD3ControlNetModel
      - local: api/models/controlnet_sparsectrl
        title: SparseControlNetModel
      title: ControlNets
    - sections:
      - local: api/models/allegro_transformer3d
        title: AllegroTransformer3DModel
      - local: api/models/aura_flow_transformer2d
        title: AuraFlowTransformer2DModel
      - local: api/models/transformer_bria_fibo
        title: BriaFiboTransformer2DModel
      - local: api/models/bria_transformer
        title: BriaTransformer2DModel
      - local: api/models/chroma_transformer
        title: ChromaTransformer2DModel
      - local: api/models/chronoedit_transformer_3d
        title: ChronoEditTransformer3DModel
      - local: api/models/cogvideox_transformer3d
        title: CogVideoXTransformer3DModel
      - local: api/models/cogview3plus_transformer2d
        title: CogView3PlusTransformer2DModel
      - local: api/models/cogview4_transformer2d
        title: CogView4Transformer2DModel
      - local: api/models/consisid_transformer3d
        title: ConsisIDTransformer3DModel
      - local: api/models/cosmos_transformer3d
        title: CosmosTransformer3DModel
      - local: api/models/dit_transformer2d
        title: DiTTransformer2DModel
      - local: api/models/easyanimate_transformer3d
        title: EasyAnimateTransformer3DModel
      - local: api/models/flux2_transformer
        title: Flux2Transformer2DModel
      - local: api/models/flux_transformer
        title: FluxTransformer2DModel
      - local: api/models/hidream_image_transformer
        title: HiDreamImageTransformer2DModel
      - local: api/models/hunyuan_transformer2d
        title: HunyuanDiT2DModel
      - local: api/models/hunyuanimage_transformer_2d
        title: HunyuanImageTransformer2DModel
      - local: api/models/hunyuan_video15_transformer_3d
        title: HunyuanVideo15Transformer3DModel
      - local: api/models/hunyuan_video_transformer_3d
        title: HunyuanVideoTransformer3DModel
      - local: api/models/latte_transformer3d
        title: LatteTransformer3DModel
      - local: api/models/longcat_image_transformer2d
        title: LongCatImageTransformer2DModel
      - local: api/models/ltx2_video_transformer3d
        title: LTX2VideoTransformer3DModel
      - local: api/models/ltx_video_transformer3d
        title: LTXVideoTransformer3DModel
      - local: api/models/lumina2_transformer2d
        title: Lumina2Transformer2DModel
      - local: api/models/lumina_nextdit2d
        title: LuminaNextDiT2DModel
      - local: api/models/mochi_transformer3d
        title: MochiTransformer3DModel
      - local: api/models/omnigen_transformer
        title: OmniGenTransformer2DModel
      - local: api/models/ovisimage_transformer2d
        title: OvisImageTransformer2DModel
      - local: api/models/pixart_transformer2d
        title: PixArtTransformer2DModel
      - local: api/models/prior_transformer
        title: PriorTransformer
      - local: api/models/qwenimage_transformer2d
        title: QwenImageTransformer2DModel
      - local: api/models/sana_transformer2d
        title: SanaTransformer2DModel
      - local: api/models/sana_video_transformer3d
        title: SanaVideoTransformer3DModel
      - local: api/models/sd3_transformer2d
        title: SD3Transformer2DModel
      - local: api/models/skyreels_v2_transformer_3d
        title: SkyReelsV2Transformer3DModel
      - local: api/models/stable_audio_transformer
        title: StableAudioDiTModel
      - local: api/models/transformer2d
        title: Transformer2DModel
      - local: api/models/transformer_temporal
        title: TransformerTemporalModel
      - local: api/models/wan_animate_transformer_3d
        title: WanAnimateTransformer3DModel
      - local: api/models/wan_transformer_3d
        title: WanTransformer3DModel
      - local: api/models/z_image_transformer2d
        title: ZImageTransformer2DModel
      title: Transformers
    - sections:
      - local: api/models/stable_cascade_unet
        title: StableCascadeUNet
      - local: api/models/unet
        title: UNet1DModel
      - local: api/models/unet2d-cond
        title: UNet2DConditionModel
      - local: api/models/unet2d
        title: UNet2DModel
      - local: api/models/unet3d-cond
        title: UNet3DConditionModel
      - local: api/models/unet-motion
        title: UNetMotionModel
      - local: api/models/uvit2d
        title: UViT2DModel
      title: UNets
    - sections:
      - local: api/models/asymmetricautoencoderkl
        title: AsymmetricAutoencoderKL
      - local: api/models/autoencoder_dc
        title: AutoencoderDC
      - local: api/models/autoencoderkl
        title: AutoencoderKL
      - local: api/models/autoencoderkl_allegro
        title: AutoencoderKLAllegro
      - local: api/models/autoencoderkl_cogvideox
        title: AutoencoderKLCogVideoX
      - local: api/models/autoencoderkl_cosmos
        title: AutoencoderKLCosmos
      - local: api/models/autoencoder_kl_hunyuanimage
        title: AutoencoderKLHunyuanImage
      - local: api/models/autoencoder_kl_hunyuanimage_refiner
        title: AutoencoderKLHunyuanImageRefiner
      - local: api/models/autoencoder_kl_hunyuan_video
        title: AutoencoderKLHunyuanVideo
      - local: api/models/autoencoder_kl_hunyuan_video15
        title: AutoencoderKLHunyuanVideo15
      - local: api/models/autoencoderkl_audio_ltx_2
        title: AutoencoderKLLTX2Audio
      - local: api/models/autoencoderkl_ltx_2
        title: AutoencoderKLLTX2Video
      - local: api/models/autoencoderkl_ltx_video
        title: AutoencoderKLLTXVideo
      - local: api/models/autoencoderkl_magvit
        title: AutoencoderKLMagvit
      - local: api/models/autoencoderkl_mochi
        title: AutoencoderKLMochi
      - local: api/models/autoencoderkl_qwenimage
        title: AutoencoderKLQwenImage
      - local: api/models/autoencoder_kl_wan
        title: AutoencoderKLWan
      - local: api/models/consistency_decoder_vae
        title: ConsistencyDecoderVAE
      - local: api/models/autoencoder_oobleck
        title: Oobleck AutoEncoder
      - local: api/models/autoencoder_tiny
        title: Tiny AutoEncoder
      - local: api/models/vq
        title: VQModel
      title: VAEs
    title: Models
  - sections:
    - local: api/pipelines/overview
      title: Overview
    - local: api/pipelines/auto_pipeline
      title: AutoPipeline
    - sections:
      - local: api/pipelines/audioldm
        title: AudioLDM
      - local: api/pipelines/audioldm2
        title: AudioLDM 2
      - local: api/pipelines/dance_diffusion
        title: Dance Diffusion
      - local: api/pipelines/musicldm
        title: MusicLDM
      - local: api/pipelines/stable_audio
        title: Stable Audio
      title: Audio
    - sections:
      - local: api/pipelines/amused
        title: aMUSEd
      - local: api/pipelines/animatediff
        title: AnimateDiff
      - local: api/pipelines/attend_and_excite
        title: Attend-and-Excite
      - local: api/pipelines/aura_flow
        title: AuraFlow
      - local: api/pipelines/blip_diffusion
        title: BLIP-Diffusion
      - local: api/pipelines/bria_3_2
        title: Bria 3.2
      - local: api/pipelines/bria_fibo
        title: Bria Fibo
      - local: api/pipelines/chroma
        title: Chroma
      - local: api/pipelines/cogview3
        title: CogView3
      - local: api/pipelines/cogview4
        title: CogView4
      - local: api/pipelines/consistency_models
        title: Consistency Models
      - local: api/pipelines/controlnet
        title: ControlNet
      - local: api/pipelines/controlnet_flux
        title: ControlNet with Flux.1
      - local: api/pipelines/controlnet_hunyuandit
        title: ControlNet with Hunyuan-DiT
      - local: api/pipelines/controlnet_sd3
        title: ControlNet with Stable Diffusion 3
      - local: api/pipelines/controlnet_sdxl
        title: ControlNet with Stable Diffusion XL
      - local: api/pipelines/controlnet_sana
        title: ControlNet-Sana
      - local: api/pipelines/controlnetxs
        title: ControlNet-XS
      - local: api/pipelines/controlnetxs_sdxl
        title: ControlNet-XS with Stable Diffusion XL
      - local: api/pipelines/controlnet_union
        title: ControlNetUnion
      - local: api/pipelines/cosmos
        title: Cosmos
      - local: api/pipelines/ddim
        title: DDIM
      - local: api/pipelines/ddpm
        title: DDPM
      - local: api/pipelines/deepfloyd_if
        title: DeepFloyd IF
      - local: api/pipelines/diffedit
        title: DiffEdit
      - local: api/pipelines/dit
        title: DiT
      - local: api/pipelines/easyanimate
        title: EasyAnimate
      - local: api/pipelines/flux
        title: Flux
      - local: api/pipelines/flux2
        title: Flux2
      - local: api/pipelines/control_flux_inpaint
        title: FluxControlInpaint
      - local: api/pipelines/hidream
        title: HiDream-I1
      - local: api/pipelines/hunyuandit
        title: Hunyuan-DiT
      - local: api/pipelines/hunyuanimage21
        title: HunyuanImage2.1
      - local: api/pipelines/pix2pix
        title: InstructPix2Pix
      - local: api/pipelines/kandinsky
        title: Kandinsky 2.1
      - local: api/pipelines/kandinsky_v22
        title: Kandinsky 2.2
      - local: api/pipelines/kandinsky3
        title: Kandinsky 3
      - local: api/pipelines/kandinsky5_image
        title: Kandinsky 5.0 Image
      - local: api/pipelines/kolors
        title: Kolors
      - local: api/pipelines/latent_consistency_models
        title: Latent Consistency Models
      - local: api/pipelines/latent_diffusion
        title: Latent Diffusion
      - local: api/pipelines/ledits_pp
        title: LEDITS++
      - local: api/pipelines/longcat_image
        title: LongCat-Image
      - local: api/pipelines/lumina2
        title: Lumina 2.0
      - local: api/pipelines/lumina
        title: Lumina-T2X
      - local: api/pipelines/marigold
        title: Marigold
      - local: api/pipelines/panorama
        title: MultiDiffusion
      - local: api/pipelines/omnigen
        title: OmniGen
      - local: api/pipelines/ovis_image
        title: Ovis-Image
      - local: api/pipelines/pag
        title: PAG
      - local: api/pipelines/paint_by_example
        title: Paint by Example
      - local: api/pipelines/pixart
        title: PixArt-α
      - local: api/pipelines/pixart_sigma
        title: PixArt-Σ
      - local: api/pipelines/prx
        title: PRX
      - local: api/pipelines/qwenimage
        title: QwenImage
      - local: api/pipelines/sana
        title: Sana
      - local: api/pipelines/sana_sprint
        title: Sana Sprint
      - local: api/pipelines/sana_video
        title: Sana Video
      - local: api/pipelines/self_attention_guidance
        title: Self-Attention Guidance
      - local: api/pipelines/semantic_stable_diffusion
        title: Semantic Guidance
      - local: api/pipelines/shap_e
        title: Shap-E
      - local: api/pipelines/stable_cascade
        title: Stable Cascade
      - sections:
        - local: api/pipelines/stable_diffusion/overview
          title: Overview
        - local: api/pipelines/stable_diffusion/depth2img
          title: Depth-to-image
        - local: api/pipelines/stable_diffusion/gligen
          title: GLIGEN (Grounded Language-to-Image Generation)
        - local: api/pipelines/stable_diffusion/image_variation
          title: Image variation
        - local: api/pipelines/stable_diffusion/img2img
          title: Image-to-image
        - local: api/pipelines/stable_diffusion/inpaint
          title: Inpainting
        - local: api/pipelines/stable_diffusion/k_diffusion
          title: K-Diffusion
        - local: api/pipelines/stable_diffusion/latent_upscale
          title: Latent upscaler
        - local: api/pipelines/stable_diffusion/ldm3d_diffusion
          title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
        - local: api/pipelines/stable_diffusion/stable_diffusion_safe
          title: Safe Stable Diffusion
        - local: api/pipelines/stable_diffusion/sdxl_turbo
          title: SDXL Turbo
        - local: api/pipelines/stable_diffusion/stable_diffusion_2
          title: Stable Diffusion 2
        - local: api/pipelines/stable_diffusion/stable_diffusion_3
          title: Stable Diffusion 3
        - local: api/pipelines/stable_diffusion/stable_diffusion_xl
          title: Stable Diffusion XL
        - local: api/pipelines/stable_diffusion/upscale
          title: Super-resolution
        - local: api/pipelines/stable_diffusion/adapter
          title: T2I-Adapter
        - local: api/pipelines/stable_diffusion/text2img
          title: Text-to-image
        title: Stable Diffusion
      - local: api/pipelines/stable_unclip
        title: Stable unCLIP
      - local: api/pipelines/unclip
        title: unCLIP
      - local: api/pipelines/unidiffuser
        title: UniDiffuser
      - local: api/pipelines/value_guided_sampling
        title: Value-guided sampling
      - local: api/pipelines/visualcloze
        title: VisualCloze
      - local: api/pipelines/wuerstchen
        title: Wuerstchen
      - local: api/pipelines/z_image
        title: Z-Image
      title: Image
    - sections:
      - local: api/pipelines/allegro
        title: Allegro
      - local: api/pipelines/chronoedit
        title: ChronoEdit
      - local: api/pipelines/cogvideox
        title: CogVideoX
      - local: api/pipelines/consisid
        title: ConsisID
      - local: api/pipelines/framepack
        title: Framepack
      - local: api/pipelines/hunyuan_video
        title: HunyuanVideo
      - local: api/pipelines/hunyuan_video15
        title: HunyuanVideo1.5
      - local: api/pipelines/i2vgenxl
        title: I2VGen-XL
      - local: api/pipelines/kandinsky5_video
        title: Kandinsky 5.0 Video
      - local: api/pipelines/latte
        title: Latte
      - local: api/pipelines/ltx2
        title: LTX-2
      - local: api/pipelines/ltx_video
        title: LTXVideo
      - local: api/pipelines/mochi
        title: Mochi
      - local: api/pipelines/pia
        title: Personalized Image Animator (PIA)
      - local: api/pipelines/skyreels_v2
        title: SkyReels-V2
      - local: api/pipelines/stable_diffusion/svd
        title: Stable Video Diffusion
      - local: api/pipelines/text_to_video
        title: Text-to-video
      - local: api/pipelines/text_to_video_zero
        title: Text2Video-Zero
      - local: api/pipelines/wan
        title: Wan
      title: Video
    title: Pipelines
  - sections:
    - local: api/schedulers/overview
      title: Overview
    - local: api/schedulers/cm_stochastic_iterative
      title: CMStochasticIterativeScheduler
    - local: api/schedulers/ddim_cogvideox
      title: CogVideoXDDIMScheduler
    - local: api/schedulers/multistep_dpm_solver_cogvideox
      title: CogVideoXDPMScheduler
    - local: api/schedulers/consistency_decoder
      title: ConsistencyDecoderScheduler
    - local: api/schedulers/cosine_dpm
      title: CosineDPMSolverMultistepScheduler
    - local: api/schedulers/ddim_inverse
      title: DDIMInverseScheduler
    - local: api/schedulers/ddim
      title: DDIMScheduler
    - local: api/schedulers/ddpm
      title: DDPMScheduler
    - local: api/schedulers/deis
      title: DEISMultistepScheduler
    - local: api/schedulers/multistep_dpm_solver_inverse
      title: DPMSolverMultistepInverse
    - local: api/schedulers/multistep_dpm_solver
      title: DPMSolverMultistepScheduler
    - local: api/schedulers/dpm_sde
      title: DPMSolverSDEScheduler
    - local: api/schedulers/singlestep_dpm_solver
      title: DPMSolverSinglestepScheduler
    - local: api/schedulers/edm_multistep_dpm_solver
      title: EDMDPMSolverMultistepScheduler
    - local: api/schedulers/edm_euler
      title: EDMEulerScheduler
    - local: api/schedulers/euler_ancestral
      title: EulerAncestralDiscreteScheduler
    - local: api/schedulers/euler
      title: EulerDiscreteScheduler
    - local: api/schedulers/flow_match_euler_discrete
      title: FlowMatchEulerDiscreteScheduler
    - local: api/schedulers/flow_match_heun_discrete
      title: FlowMatchHeunDiscreteScheduler
    - local: api/schedulers/heun
      title: HeunDiscreteScheduler
    - local: api/schedulers/ipndm
      title: IPNDMScheduler
    - local: api/schedulers/stochastic_karras_ve
      title: KarrasVeScheduler
    - local: api/schedulers/dpm_discrete_ancestral
      title: KDPM2AncestralDiscreteScheduler
    - local: api/schedulers/dpm_discrete
      title: KDPM2DiscreteScheduler
    - local: api/schedulers/lcm
      title: LCMScheduler
    - local: api/schedulers/lms_discrete
      title: LMSDiscreteScheduler
    - local: api/schedulers/pndm
      title: PNDMScheduler
    - local: api/schedulers/repaint
      title: RePaintScheduler
    - local: api/schedulers/score_sde_ve
      title: ScoreSdeVeScheduler
    - local: api/schedulers/score_sde_vp
      title: ScoreSdeVpScheduler
    - local: api/schedulers/tcd
      title: TCDScheduler
    - local: api/schedulers/unipc
      title: UniPCMultistepScheduler
    - local: api/schedulers/vq_diffusion
      title: VQDiffusionScheduler
    title: Schedulers
  - sections:
    - local: api/internal_classes_overview
      title: Overview
    - local: api/attnprocessor
      title: Attention Processor
    - local: api/activations
      title: Custom activation functions
    - local: api/cache
      title: Caching methods
    - local: api/normalization
      title: Custom normalization layers
    - local: api/utilities
      title: Utilities
    - local: api/image_processor
      title: VAE Image Processor
    - local: api/video_processor
      title: Video Processor
    title: Internal classes
  title: API