From 4cc029960ad0cdc45ec4af61c87fbeb51ebc2ffc Mon Sep 17 00:00:00 2001
From: Patrick von Platen
Date: Thu, 2 Jun 2022 00:50:23 +0200
Subject: [PATCH 1/3] Update README.md

---
 README.md | 56 ++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/README.md b/README.md
index 6c76122215..f6e4ccd8e8 100644
--- a/README.md
+++ b/README.md
@@ -4,26 +4,33 @@
 
 ```
 ├── models
-│   ├── dalle2
-│   │   ├── modeling_dalle2.py
-│   │   ├── README.md
-│   │   └── run_dalle2.py
-│   ├── ddpm
-│   │   ├── modeling_ddpm.py
-│   │   ├── README.md
-│   │   └── run_ddpm.py
-│   ├── glide
-│   │   ├── modeling_glide.py
-│   │   ├── README.md
-│   │   └── run_dalle2.py
-│   ├── imagen
-│   │   ├── modeling_dalle2.py
-│   │   ├── README.md
-│   │   └── run_dalle2.py
-│   └── latent_diffusion
-│       ├── modeling_latent_diffusion.py
-│       ├── README.md
-│       └── run_latent_diffusion.py
+│   ├── audio
+│   │   └── fastdiff
+│   │       ├── modeling_fastdiff.py
+│   │       ├── README.md
+│   │       └── run_fastdiff.py
+│   └── vision
+│       ├── dalle2
+│       │   ├── modeling_dalle2.py
+│       │   ├── README.md
+│       │   └── run_dalle2.py
+│       ├── ddpm
+│       │   ├── modeling_ddpm.py
+│       │   ├── README.md
+│       │   └── run_ddpm.py
+│       ├── glide
+│       │   ├── modeling_glide.py
+│       │   ├── README.md
+│       │   └── run_dalle2.py
+│       ├── imagen
+│       │   ├── modeling_dalle2.py
+│       │   ├── README.md
+│       │   └── run_dalle2.py
+│       └── latent_diffusion
+│           ├── modeling_latent_diffusion.py
+│           ├── README.md
+│           └── run_latent_diffusion.py
+
 ├── src
 │   └── diffusers
 │       ├── configuration_utils.py
@@ -38,7 +45,14 @@
 │   └── test_modeling_utils.py
 ```
 
-## Dummy Example
+## 1. `diffusers` as a central modular diffusion and sampler library
+
+`diffusers` should be more modularized than `transformers` so that parts of it can be easily used in other libraries.
+It could become a central place for all kinds of models, samplers, training utils and processors required when using diffusion models in audio, vision, ...
+One should be able to save both models and samplers as well as load them from the Hub.
+
+Example:
+
 ```python
 from diffusers import UNetModel, GaussianDiffusion
 import torch

From 4032bedeb7847fb92cd56c8a20173b4835abee4d Mon Sep 17 00:00:00 2001
From: Patrick von Platen
Date: Thu, 2 Jun 2022 12:15:59 +0200
Subject: [PATCH 2/3] Update README.md

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index f6e4ccd8e8..45eaa8866c 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,9 @@
 # Diffusers
 
+## Definitions for diffusion models
+
+[diffusers.pdf](https://github.com/huggingface/diffusers/files/8822839/diffusers.pdf)
+
 ## Library structure:
 
 ```

From e6c4c72ed31c222430cc70b39e70aa5c90e04def Mon Sep 17 00:00:00 2001
From: Patrick von Platen
Date: Thu, 2 Jun 2022 12:27:01 +0200
Subject: [PATCH 3/3] Update README.md

---
 README.md | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 45eaa8866c..60c8b34d2b 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,22 @@
 # Diffusers
 
-## Definitions for diffusion models
+## Definitions
 
-[diffusers.pdf](https://github.com/huggingface/diffusers/files/8822839/diffusers.pdf)
+**Models**: A single neural network that models p_θ(x_{t-1}|x_t) and is trained to “denoise” an image
+*Examples: UNet, Conditioned UNet, 3D UNet, Transformer UNet*
+
+![model_diff_1_50](https://user-images.githubusercontent.com/23423619/171610307-dab0cd8b-75da-4d4e-9f5a-5922072e2bb5.png)
+
+**Samplers**: Algorithm used to *train* a **Model** and to *sample* from it. Defines the alpha and beta schedules, timesteps, etc.
+*Examples: Vanilla DDPM, DDIM, PLMS, DEIN*
+
+![sampling](https://user-images.githubusercontent.com/23423619/171608981-3ad05953-a684-4c82-89f8-62a459147a07.png)
+![training](https://user-images.githubusercontent.com/23423619/171608964-b3260cce-e6b4-4841-959d-7d8ba4b8d1b2.png)
+
+**Diffusion Pipeline**: End-to-end pipeline that includes multiple diffusion models and possibly text encoders or CLIP
+*Examples: GLIDE, CompVis/Latent-Diffusion, Imagen, DALL-E*
+
+![imagen](https://user-images.githubusercontent.com/23423619/171609001-c3f2c1c9-f597-4a16-9843-749bf3f9431c.png)
 
 ## Library structure: