# EasyAnimate
[EasyAnimate](https://github.com/aigc-apps/EasyAnimate) by Alibaba PAI.

The description from its GitHub page: *EasyAnimate is a pipeline based on the transformer architecture, designed for generating AI images and videos, and for training baseline models and LoRA models for Diffusion Transformer. We support direct prediction from pre-trained EasyAnimate models, allowing for the generation of videos at various resolutions, approximately 6 seconds in length, at 8 fps (EasyAnimateV5.1, 1 to 49 frames). Additionally, users can train their own baseline and LoRA models for specific style transformations.*

This pipeline was contributed by [bubbliiiing](https://github.com/bubbliiiing). The original codebase can be found [here](https://github.com/aigc-apps/EasyAnimate). The original weights can be found under [hf.co/alibaba-pai](https://huggingface.co/alibaba-pai).
There are two official EasyAnimate checkpoints for text-to-video and video-to-video.

| checkpoints | recommended inference dtype |
|---|---|
| `alibaba-pai/EasyAnimateV5.1-12b-zh` | torch.float16 |
| `alibaba-pai/EasyAnimateV5.1-12b-zh-InP` | torch.float16 |
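For reference, a minimal loading sketch using the recommended dtype is shown below. The checkpoint name and dtype come from the table above; the CUDA device placement is an assumption about your hardware.

```py
import torch
from diffusers import EasyAnimatePipeline

# Load the text-to-video checkpoint in the recommended half-precision dtype.
pipeline = EasyAnimatePipeline.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh", torch_dtype=torch.float16
)
pipeline.to("cuda")  # assumes a CUDA device is available
```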
There is one official EasyAnimate checkpoint available for image-to-video and video-to-video.

| checkpoints | recommended inference dtype |
|---|---|
| `alibaba-pai/EasyAnimateV5.1-12b-zh-InP` | torch.float16 |
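A minimal loading sketch, assuming the image-to-video and video-to-video use cases map to [`EasyAnimateInpaintPipeline`]; refer to that pipeline's API reference for the exact call signature.

```py
import torch
from diffusers import EasyAnimateInpaintPipeline

# The InP checkpoint backs image-to-video and video-to-video generation.
pipeline = EasyAnimateInpaintPipeline.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh-InP", torch_dtype=torch.float16
)
pipeline.to("cuda")  # assumes a CUDA device is available
```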
There are two official EasyAnimate checkpoints available for control-to-video.

| checkpoints | recommended inference dtype |
|---|---|
| `alibaba-pai/EasyAnimateV5.1-12b-zh-Control` | torch.float16 |
| `alibaba-pai/EasyAnimateV5.1-12b-zh-Control-Camera` | torch.float16 |
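Likewise, a loading sketch for the control checkpoints, assuming they map to [`EasyAnimateControlPipeline`]; the control video inputs are supplied at call time, and the exact argument names are documented in the pipeline's API reference.

```py
import torch
from diffusers import EasyAnimateControlPipeline

# Load the control-to-video checkpoint; the Control-Camera variant loads the same way.
pipeline = EasyAnimateControlPipeline.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh-Control", torch_dtype=torch.float16
)
pipeline.to("cuda")  # assumes a CUDA device is available
```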
For the EasyAnimateV5.1 series:
- Text-to-video (T2V) and image-to-video (I2V) work at multiple resolutions; the width and height can vary from 256 to 1024.
- Both T2V and I2V models support generation with 1 to 49 frames and work best within this range. Exporting videos at 8 FPS is recommended (see the sketch below).
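As a sketch of how these limits translate into call arguments, assuming the standard diffusers video-pipeline parameters `height`, `width`, and `num_frames`, and reusing the `pipeline` loaded in the text-to-video sketch above:

```py
from diffusers.utils import export_to_video

# Resolution and frame count chosen within the supported ranges above.
video = pipeline(
    prompt="A cat walks on the grass, realistic style.",
    height=512,     # width and height may vary from 256 to 1024
    width=512,
    num_frames=49,  # 1 to 49 frames are supported
).frames[0]
export_to_video(video, "output.mp4", fps=8)  # 8 FPS, as recommended
```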
## Quantization
Quantization helps reduce the memory requirements of very large models by storing model weights in a lower precision data type. However, quantization may have varying impact on video quality depending on the video model.
Refer to the Quantization overview to learn more about supported quantization backends and selecting a quantization backend that supports your use case. The example below demonstrates how to load a quantized [EasyAnimatePipeline] for inference with bitsandbytes.
```py
import torch
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig, EasyAnimateTransformer3DModel, EasyAnimatePipeline
from diffusers.utils import export_to_video

quant_config = DiffusersBitsAndBytesConfig(load_in_8bit=True)
transformer_8bit = EasyAnimateTransformer3DModel.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
)

pipeline = EasyAnimatePipeline.from_pretrained(
    "alibaba-pai/EasyAnimateV5.1-12b-zh",
    transformer=transformer_8bit,
    torch_dtype=torch.float16,
    device_map="balanced",
)

prompt = "A cat walks on the grass, realistic style."
negative_prompt = "bad detailed"
video = pipeline(prompt=prompt, negative_prompt=negative_prompt, num_frames=49, num_inference_steps=30).frames[0]
export_to_video(video, "cat.mp4", fps=8)
```
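Note that only the transformer, which dominates the memory footprint, is quantized to 8-bit here; the text encoder and VAE are still loaded in float16, and `device_map="balanced"` spreads the pipeline components across the available devices.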
## EasyAnimatePipeline

[[autodoc]] EasyAnimatePipeline
  - all
  - __call__
## EasyAnimatePipelineOutput

[[autodoc]] pipelines.easyanimate.pipeline_output.EasyAnimatePipelineOutput