mirror of
https://github.com/huggingface/diffusers.git
synced 2026-01-27 17:22:53 +03:00
[Docs] add an example use for StableUnCLIPPipeline in the pipeline docs (#2897)
* improve stable unclip doc.
* add: entry of StableUnCLIPPipeline to the docs
* Apply suggestions from code review

Co-authored-by: apolinario <joaopaulo.passos@gmail.com>
@@ -32,12 +32,50 @@ we do not add any additional noise to the image embeddings i.e. `noise_level = 0
  * [stabilityai/stable-diffusion-2-1-unclip](https://hf.co/stabilityai/stable-diffusion-2-1-unclip)
  * [stabilityai/stable-diffusion-2-1-unclip-small](https://hf.co/stabilityai/stable-diffusion-2-1-unclip-small)
* Text-to-image
  * Coming soon!
  * [stabilityai/stable-diffusion-2-1-unclip-small](https://hf.co/stabilityai/stable-diffusion-2-1-unclip-small)

### Text-to-Image Generation

Stable unCLIP can be used for text-to-image generation by pipelining it with the prior model of [Karlo](https://huggingface.co/kakaobrain/karlo-v1-alpha), KakaoBrain's open-source DALL-E 2 replication.

Coming soon!

```python
import torch
from diffusers import UnCLIPScheduler, DDPMScheduler, StableUnCLIPPipeline
from diffusers.models import PriorTransformer
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

# Load the prior from the Karlo checkpoint.
prior_model_id = "kakaobrain/karlo-v1-alpha"
data_type = torch.float16
prior = PriorTransformer.from_pretrained(prior_model_id, subfolder="prior", torch_dtype=data_type)

# Load the CLIP tokenizer and text encoder used with the prior.
prior_text_model_id = "openai/clip-vit-large-patch14"
prior_tokenizer = CLIPTokenizer.from_pretrained(prior_text_model_id)
prior_text_model = CLIPTextModelWithProjection.from_pretrained(prior_text_model_id, torch_dtype=data_type)

# Load Karlo's prior scheduler config and build a DDPMScheduler from it.
prior_scheduler = UnCLIPScheduler.from_pretrained(prior_model_id, subfolder="prior_scheduler")
prior_scheduler = DDPMScheduler.from_config(prior_scheduler.config)

stable_unclip_model_id = "stabilityai/stable-diffusion-2-1-unclip-small"

# Assemble the pipeline, passing the prior components explicitly.
pipe = StableUnCLIPPipeline.from_pretrained(
    stable_unclip_model_id,
    torch_dtype=data_type,
    variant="fp16",
    prior_tokenizer=prior_tokenizer,
    prior_text_encoder=prior_text_model,
    prior=prior,
    prior_scheduler=prior_scheduler,
)

pipe = pipe.to("cuda")
wave_prompt = "dramatic wave, the Oceans roar, Strong wave spiral across the oceans as the waves unfurl into roaring crests; perfect wave form; perfect wave shape; dramatic wave shape; wave shape unbelievable; wave; wave shape spectacular"

images = pipe(prompt=wave_prompt).images
images[0].save("waves.png")
```

<Tip warning={true}>

For text-to-image we use `stabilityai/stable-diffusion-2-1-unclip-small` because it was trained on CLIP ViT-L/14 embeddings, the same as the Karlo model prior. [stabilityai/stable-diffusion-2-1-unclip](https://hf.co/stabilityai/stable-diffusion-2-1-unclip) was trained on OpenCLIP ViT-H embeddings, so we don't recommend using it with this prior.

</Tip>
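
The constraint in the tip is about embedding spaces: the prior produces image embeddings in the space of one CLIP model, and the unCLIP checkpoint only makes sense of embeddings from the model it was trained on. As a rough illustration, the sketch below records which encoder each checkpoint is tied to and compares them. The `EMBEDDING_SPACE` and `EMBEDDING_DIM` tables and both helper functions are hypothetical names introduced for this example, not part of diffusers; the widths (768 for CLIP ViT-L/14, 1024 for OpenCLIP ViT-H) are the encoders' projection dimensions.

```python
# Hypothetical lookup tables for illustration only; diffusers does not ship these.
# Encoder whose image-embedding space each checkpoint works in (per the tip above).
EMBEDDING_SPACE = {
    "kakaobrain/karlo-v1-alpha": "CLIP ViT-L/14",
    "stabilityai/stable-diffusion-2-1-unclip-small": "CLIP ViT-L/14",
    "stabilityai/stable-diffusion-2-1-unclip": "OpenCLIP ViT-H",
}

# Projection (embedding) width of each encoder.
EMBEDDING_DIM = {"CLIP ViT-L/14": 768, "OpenCLIP ViT-H": 1024}


def is_compatible(prior_id: str, unclip_id: str) -> bool:
    """Return True when the prior and the unCLIP checkpoint share an embedding space."""
    return EMBEDDING_SPACE[prior_id] == EMBEDDING_SPACE[unclip_id]


def embedding_dim(model_id: str) -> int:
    """Return the width of the image embeddings the checkpoint works with."""
    return EMBEDDING_DIM[EMBEDDING_SPACE[model_id]]


karlo = "kakaobrain/karlo-v1-alpha"
for unclip_id in (
    "stabilityai/stable-diffusion-2-1-unclip-small",
    "stabilityai/stable-diffusion-2-1-unclip",
):
    print(unclip_id, embedding_dim(unclip_id), is_compatible(karlo, unclip_id))
```

Matching widths alone would not be enough; the check compares encoder identity because two different encoders can share a width without sharing an embedding space.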

### Text guided Image-to-Image Variation
