diff --git a/README.md b/README.md
index 931f854d03..304755d581 100644
--- a/README.md
+++ b/README.md
@@ -41,13 +41,66 @@ See the [model card](https://huggingface.co/CompVis/stable-diffusion) for more i
 
 You need to accept the model license before downloading or using the Stable Diffusion weights. Please, visit the [model card](https://huggingface.co/CompVis/stable-diffusion-v1-3), read the license and tick the checkbox if you agree. You have to be a registered user in 🤗 Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation.
 
+
 ### Text-to-Image generation with Stable Diffusion
 
 ```python
 # make sure you're logged in with `huggingface-cli login`
 from torch import autocast
-import torch
-from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
+from diffusers import StableDiffusionPipeline
+
+pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
+pipe = pipe.to("cuda")
+
+prompt = "a photo of an astronaut riding a horse on mars"
+with autocast("cuda"):
+    image = pipe(prompt)["sample"][0]
+```
+
+**Note**: If you don't want to use the token, you can also simply download the model weights
+(after having [accepted the license](https://huggingface.co/CompVis/stable-diffusion-v1-4)) and pass
+the path to the local folder to `StableDiffusionPipeline`.
+
+```
+git lfs install
+git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
+```
+
+Assuming the folder is stored locally under `./stable-diffusion-v1-4`, you can also run Stable Diffusion
+without requiring an authentication token:
+
+```python
+pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4")
+pipe = pipe.to("cuda")
+
+prompt = "a photo of an astronaut riding a horse on mars"
+with autocast("cuda"):
+    image = pipe(prompt)["sample"][0]
+```
+
+If you are limited by GPU memory, consider loading the model in `fp16`.
+
+```python
+import torch
+
+pipe = StableDiffusionPipeline.from_pretrained(
+    "CompVis/stable-diffusion-v1-4",
+    revision="fp16",
+    torch_dtype=torch.float16,
+    use_auth_token=True
+)
+pipe = pipe.to("cuda")
+
+prompt = "a photo of an astronaut riding a horse on mars"
+with autocast("cuda"):
+    image = pipe(prompt)["sample"][0]
+```
+
+Finally, if you wish to use a different scheduler, you can simply instantiate
+it before the pipeline and pass it to `from_pretrained`.
+
+```python
+from diffusers import LMSDiscreteScheduler
 
 lms = LMSDiscreteScheduler(
     beta_start=0.00085,
@@ -86,12 +137,18 @@ from diffusers import StableDiffusionImg2ImgPipeline
 
 # load the pipeline
 device = "cuda"
+model_id_or_path = "CompVis/stable-diffusion-v1-4"
 pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4",
+    model_id_or_path,
     revision="fp16",
     torch_dtype=torch.float16,
     use_auth_token=True
 )
+# or download via git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
+# and pass `model_id_or_path="./stable-diffusion-v1-4"` without having to use `use_auth_token=True`.
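+# For example, a hypothetical local-path sketch (assuming the clone above exists):
+#     model_id_or_path = "./stable-diffusion-v1-4"
+#     pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id_or_path)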
 pipe = pipe.to(device)
 
 # let's download an initial image
@@ -135,12 +189,15 @@ init_image = download_image(img_url).resize((512, 512))
 mask_image = download_image(mask_url).resize((512, 512))
 
 device = "cuda"
+model_id_or_path = "CompVis/stable-diffusion-v1-4"
 pipe = StableDiffusionInpaintPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4",
+    model_id_or_path,
     revision="fp16",
     torch_dtype=torch.float16,
     use_auth_token=True
 )
+# or download via git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
+# and pass `model_id_or_path="./stable-diffusion-v1-4"` without having to use `use_auth_token=True`.
 pipe = pipe.to(device)
 
 prompt = "a cat sitting on a bench"
diff --git a/src/diffusers/pipelines/stable_diffusion/README.md b/src/diffusers/pipelines/stable_diffusion/README.md
index 5be8fd9bb8..bb6aced080 100644
--- a/src/diffusers/pipelines/stable_diffusion/README.md
+++ b/src/diffusers/pipelines/stable_diffusion/README.md
@@ -12,6 +12,8 @@ The summary of the model is the following:
 
 - Stable Diffusion has the same architecture as [Latent Diffusion](https://arxiv.org/abs/2112.10752) but uses a frozen CLIP Text Encoder instead of training the text encoder jointly with the diffusion model.
 - An in-detail explanation of the Stable Diffusion model can be found under [Stable Diffusion with 🧨 Diffusers](https://huggingface.co/blog/stable_diffusion).
+- If you don't want to rely on the Hugging Face Hub or pass an authentication token, you can
+  download the weights with `git lfs install; git clone https://huggingface.co/CompVis/stable-diffusion-v1-4` and instead pass the local path to the cloned folder to `from_pretrained` as shown below.
 - Stable Diffusion can work with a variety of different samplers as is shown below.
 
 ## Available Pipelines:
@@ -24,6 +26,35 @@ The summary of the model is the following:
 
 ## Examples:
 
+### Using Stable Diffusion without being logged into the Hub
+
+If you want to download the model weights using a single Python line, you need to pass the token
+to `use_auth_token` or be logged in via `huggingface-cli login`.
+For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation.
+
+Assuming your token is stored in `YOUR_TOKEN`, you can download the Stable Diffusion pipeline as follows:
+
+```python
+from diffusers import DiffusionPipeline
+
+pipeline = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=YOUR_TOKEN)
+```
+
+This, however, can make it difficult to build applications on top of `diffusers`, as you will always have to pass the token around. A potential way to solve this issue is to download the weights to a local path `"./stable-diffusion-v1-4"`:
+
+```
+git lfs install
+git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
+```
+
+and then simply pass the local path to `from_pretrained`:
+
+```python
+from diffusers import StableDiffusionPipeline
+
+pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4")
+```
+
 ### Text-to-Image with default PLMS scheduler
 
 ```python
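Putting the pieces above together: a minimal end-to-end sketch that loads the local clone in half precision and swaps in the LMS scheduler. It assumes the repository was cloned to `./stable-diffusion-v1-4` as described above, that a CUDA GPU is available, and that the `beta_end` and `beta_schedule` values (elided in the truncated scheduler example above) are the ones commonly used with Stable Diffusion.

```python
# A minimal sketch combining the snippets above: local weights (no
# `use_auth_token` needed), half precision to save GPU memory, and the
# LMS scheduler. Paths and scheduler betas are assumptions; adjust them
# to match your checkout.
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler

# Instantiate the scheduler first, then hand it to `from_pretrained`.
lms = LMSDiscreteScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
)

# Load from the local clone; `torch_dtype` casts the weights to fp16
# after loading, since the clone contains the full-precision weights.
pipe = StableDiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-4",
    scheduler=lms,
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
    image = pipe(prompt)["sample"][0]  # a PIL image

image.save("astronaut_rides_horse.png")
```

Note that `revision="fp16"` selects a branch on the Hub and so does not apply to a local clone; casting via `torch_dtype=torch.float16` achieves the same memory savings here.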