mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00
diffusers/docs/source/ko/quicktour.md
Seongsu Park 0c775544dd [Docs] Korean translation update (#4684)
2023-09-01 09:23:45 -07:00

<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
[[open-in-colab]]
# Quicktour
Diffusion models are trained to denoise random Gaussian noise step by step in order to generate a sample of interest, such as an image or audio. This has sparked a great deal of interest in generative AI, and you have probably seen examples of diffusion-generated images on the internet. 🧨 Diffusers is a library that aims to make diffusion models widely accessible to everyone.
Whether you are a developer or an everyday user, this quicktour introduces 🧨 Diffusers and helps you start generating quickly! There are three main components of the library to know about:
* The [`DiffusionPipeline`] is a high-level end-to-end class designed to rapidly generate samples from pretrained diffusion models for inference.
* Popular pretrained [model](./api/models) architectures and modules that can be used as building blocks for creating diffusion systems.
* Many different [schedulers](./api/schedulers/overview): algorithms that control how noise is added for training and how denoised images are generated during inference.
The quicktour first shows how to use the [`DiffusionPipeline`] for inference, and then walks through how to combine a model and a scheduler to replicate what happens inside the [`DiffusionPipeline`].
<Tip>
The quicktour is a condensed version of the introductory 🧨 Diffusers [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/diffusers_intro.ipynb) and is meant to get you started quickly. To learn more about the goal of 🧨 Diffusers, its design philosophy, and further details of its core API, check out the notebook!
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```py
# Uncomment to install the necessary libraries in Colab.
#!pip install --upgrade diffusers accelerate transformers
```
- [🤗 Accelerate](https://huggingface.co/docs/accelerate/index) speeds up model loading for inference and training.
- [🤗 Transformers](https://huggingface.co/docs/transformers/index) is required to run the most popular diffusion models, such as [Stable Diffusion](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/overview).
## DiffusionPipeline
The [`DiffusionPipeline`] is the easiest way to use a pretrained diffusion system for inference. It is an end-to-end system containing the model and the scheduler. You can use the [`DiffusionPipeline`] out of the box for many tasks. The table below lists some of the supported tasks; for the complete list, see the [🧨 Diffusers Summary](./api/pipelines/overview#diffusers-summary) table.
| **Task** | **Description** | **Pipeline** |
|------------------------------|--------------------------------------------------------------------------------------------------------------|-----------------|
| Unconditional Image Generation | generate an image from Gaussian noise | [unconditional_image_generation](./using-diffusers/unconditional_image_generation) |
| Text-Guided Image Generation | generate an image given a text prompt | [conditional_image_generation](./using-diffusers/conditional_image_generation) |
| Text-Guided Image-to-Image Translation | adapt an image guided by a text prompt | [img2img](./using-diffusers/img2img) |
| Text-Guided Image-Inpainting | fill the masked part of an image given the image, the mask and a text prompt | [inpaint](./using-diffusers/inpaint) |
| Text-Guided Depth-to-Image Translation | adapt parts of an image guided by a text prompt while preserving structure via depth estimation | [depth2img](./using-diffusers/depth2img) |
Start by creating an instance of a [`DiffusionPipeline`] and specifying which pipeline checkpoint you would like to download.
You can use the [`DiffusionPipeline`] with any [checkpoint](https://huggingface.co/models?library=diffusers&sort=downloads) stored on the Hugging Face Hub.
In this quicktour, you'll load the [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) checkpoint for text-to-image generation.
<Tip warning={true}>
For [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion) models, please read the [license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) carefully before running the model. 🧨 Diffusers implements a [`safety_checker`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/safety_checker.py) to prevent offensive or harmful content, but the model's improved image generation capabilities can still produce potentially harmful content.
</Tip>
Load the model with the [`~DiffusionPipeline.from_pretrained`] method:
```python
>>> from diffusers import DiffusionPipeline
>>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
```
The [`DiffusionPipeline`] downloads and caches all modeling, tokenization, and scheduling components. You'll see that the Stable Diffusion pipeline is composed of, among other things, the [`UNet2DConditionModel`] and the [`PNDMScheduler`]:
```py
>>> pipeline
StableDiffusionPipeline {
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.13.1",
  ...,
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  ...,
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```
We strongly recommend running the pipeline on a GPU because the model consists of roughly 1.4 billion parameters.
You can move the pipeline object to a GPU, just like you would a PyTorch module:
```python
>>> pipeline.to("cuda")
```
Now you can pass a text prompt to the `pipeline` to generate an image, and then access the denoised image. By default, the image output is wrapped in a [`PIL.Image`](https://pillow.readthedocs.io/en/stable/reference/Image.html?highlight=image#the-image-class) object.
```python
>>> image = pipeline("An image of a squirrel in Picasso style").images[0]
>>> image
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/image_of_squirrel_painting.png"/>
</div>
Save the image by calling `save`:
```python
>>> image.save("image_of_squirrel_painting.png")
```
### Local pipeline
You can also use the pipeline locally. The only difference is that you need to download the weights first:
```bash
!git lfs install
!git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
```
Then load the saved weights into the pipeline:
```python
>>> pipeline = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5")
```
Now you can run the pipeline as you would in the section above.
### Swapping schedulers
Different schedulers come with different trade-offs between denoising speed and quality. The best way to find out which one works best for you is to try them out! One of the main features of 🧨 Diffusers is that it lets you switch between schedulers easily. For example, to replace the default [`PNDMScheduler`] with the [`EulerDiscreteScheduler`], load it with the [`~diffusers.ConfigMixin.from_config`] method:
```py
>>> from diffusers import EulerDiscreteScheduler
>>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
>>> pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
```
Try generating an image with the new scheduler and see if you notice a difference!
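The swap above relies on one scheduler being built from another scheduler's configuration. The following is a minimal sketch of that pattern in plain Python, assuming that a new scheduler keeps the configuration keys it understands and ignores the rest. The classes `ToySchedulerBase`, `ToyPNDM`, and `ToyEuler` and the numeric values are hypothetical stand-ins, not Diffusers classes, and the library's real logic may differ in detail.

```python
import inspect


class ToySchedulerBase:
    """Hypothetical stand-in for a scheduler base class (not a Diffusers class)."""

    @classmethod
    def from_config(cls, config):
        # Keep only the keys this class's __init__ accepts; ignore the others.
        accepted = inspect.signature(cls.__init__).parameters
        kwargs = {key: value for key, value in config.items() if key in accepted}
        return cls(**kwargs)


class ToyPNDM(ToySchedulerBase):
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001,
                 beta_end=0.02, skip_prk_steps=False):
        self.config = {
            "num_train_timesteps": num_train_timesteps,
            "beta_start": beta_start,
            "beta_end": beta_end,
            "skip_prk_steps": skip_prk_steps,  # option only this toy class knows
        }


class ToyEuler(ToySchedulerBase):
    def __init__(self, num_train_timesteps=1000, beta_start=0.0001, beta_end=0.02):
        self.config = {
            "num_train_timesteps": num_train_timesteps,
            "beta_start": beta_start,
            "beta_end": beta_end,
        }


# Illustrative values only; a real pipeline provides its own configuration.
old_scheduler = ToyPNDM(beta_start=0.00085, beta_end=0.012, skip_prk_steps=True)
new_scheduler = ToyEuler.from_config(old_scheduler.config)
print(new_scheduler.config)
```

The new scheduler inherits the shared noise-schedule settings, which is what lets it work with the same trained model, while the option it does not understand is dropped.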
In the next section, you'll take a closer look at the components that make up the [`DiffusionPipeline`], the model and the scheduler, and learn how to use them to generate an image of a cat.
## Models
Most models take a noisy sample and, at each timestep, predict the *noise residual*: the difference between a less noisy image and the input image. Other models learn to predict the previous sample directly, or to predict the velocity (see [`v-prediction`](https://github.com/huggingface/diffusers/blob/5e5ce13e2f89ac45a0066cb3f369462a3cf1d9ef/src/diffusers/schedulers/scheduling_ddim.py#L110)). You can mix and match models to create other diffusion systems.
Models are instantiated with the [`~ModelMixin.from_pretrained`] method, which also caches the model weights locally so that loading is faster the next time. For the quicktour, you'll load the [`UNet2DModel`], a basic unconditional image generation model with a checkpoint trained on cat images:
```py
>>> from diffusers import UNet2DModel
>>> repo_id = "google/ddpm-cat-256"
>>> model = UNet2DModel.from_pretrained(repo_id)
```
To inspect the model's configuration parameters, access `model.config`:
```py
>>> model.config
```
The model configuration is a 🧊 frozen 🧊 dictionary, which means its parameters cannot be changed after the model is created. This is intentional: the parameters used to define the model architecture at the start stay the same, while other parameters can still be adjusted during inference.
Some of the most important parameters are:
* `sample_size`: the height and width dimension of the input sample.
* `in_channels`: the number of input channels of the input sample.
* `down_block_types` and `up_block_types`: the types of down- and upsampling blocks used to create the UNet architecture.
* `block_out_channels`: the number of output channels of the downsampling blocks; also used in reverse order for the number of input channels of the upsampling blocks.
* `layers_per_block`: the number of ResNet blocks present in each UNet block.
To use the model for inference, create a tensor of random Gaussian noise with the shape of an image. It should have a `batch` axis because the model can receive multiple random noises, a `channel` axis corresponding to the number of input channels, and `sample_size` axes for the height and width of the image:
```py
>>> import torch
>>> torch.manual_seed(0)
>>> noisy_sample = torch.randn(1, model.config.in_channels, model.config.sample_size, model.config.sample_size)
>>> noisy_sample.shape
torch.Size([1, 3, 256, 256])
```
For inference, pass the noisy image and a `timestep` to the model. The `timestep` indicates how noisy the input image is, with more noise at the beginning and less at the end. This helps the model determine its position in the diffusion process, that is, whether it is closer to the start or to the end. Read the `sample` attribute of the returned output to get the model prediction:
```py
>>> with torch.no_grad():
...     noisy_residual = model(sample=noisy_sample, timestep=2).sample
```
To generate actual examples, though, you'll need a scheduler to guide the denoising process. In the next section, you'll learn how to couple a model with a scheduler.
## Schedulers
Schedulers manage going from a noisy sample to a less noisy sample given the model output, which in this case is the `noisy_residual`.
<Tip>
🧨 Diffusers is a toolbox for building diffusion systems. While the [`DiffusionPipeline`] is a convenient way to get started with a pre-built diffusion system, you can also choose your own model and scheduler components separately to build a custom diffusion system.
</Tip>
For the quicktour, you'll instantiate the [`DDPMScheduler`] with its [`~diffusers.SchedulerMixin.from_pretrained`] method, which reads the scheduler configuration stored in the repository:
```py
>>> from diffusers import DDPMScheduler
>>> scheduler = DDPMScheduler.from_pretrained(repo_id)
>>> scheduler
DDPMScheduler {
  "_class_name": "DDPMScheduler",
  "_diffusers_version": "0.13.1",
  "beta_end": 0.02,
  "beta_schedule": "linear",
  "beta_start": 0.0001,
  "clip_sample": true,
  "clip_sample_range": 1.0,
  "num_train_timesteps": 1000,
  "prediction_type": "epsilon",
  "trained_betas": null,
  "variance_type": "fixed_small"
}
```
<Tip>
💡 Notice how the scheduler is instantiated from a configuration. Unlike a model, a scheduler has no trainable weights and is parameter-free!
</Tip>
Some of the most important parameters are:
* `num_train_timesteps`: the length of the denoising process, in other words the number of timesteps required to turn random Gaussian noise into a data sample.
* `beta_schedule`: the type of noise schedule to use for inference and training.
* `beta_start` and `beta_end`: the start and end noise values of the noise schedule.
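To make these three parameters concrete, here is a minimal plain-Python sketch of a `"linear"` schedule built from the values in the configuration above. It assumes the common DDPM convention in which `beta_t` is the noise variance added at step `t` and the running product of `1 - beta_t` gives the share of the clean signal that remains; the variable names are illustrative and this is not the scheduler's own code.

```python
import math

num_train_timesteps = 1000
beta_start, beta_end = 0.0001, 0.02

# "linear" schedule: evenly spaced betas from beta_start to beta_end.
step = (beta_end - beta_start) / (num_train_timesteps - 1)
betas = [beta_start + i * step for i in range(num_train_timesteps)]

# Running product of (1 - beta). Its square root is the factor by which the
# clean sample is scaled after t noising steps; the rest is noise.
alphas_cumprod = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alphas_cumprod.append(running)

for t in (0, 499, 999):
    signal_scale = math.sqrt(alphas_cumprod[t])
    noise_scale = math.sqrt(1.0 - alphas_cumprod[t])
    print(f"t={t}: signal scale {signal_scale:.4f}, noise scale {noise_scale:.4f}")
```

At the first timestep the sample is almost entirely signal, and by the last one it is almost entirely noise, which is why inference can start from pure Gaussian noise.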
To predict a slightly less noisy image, pass the model output, the `timestep`, and the current `sample` to the scheduler's [`~diffusers.DDPMScheduler.step`] method.
```py
>>> less_noisy_sample = scheduler.step(model_output=noisy_residual, timestep=2, sample=noisy_sample).prev_sample
>>> less_noisy_sample.shape
```
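For intuition about what such a step computes, the sketch below works through the DDPM update on a single scalar value in plain Python. It assumes the `epsilon` prediction type and the linear schedule from the configuration above, and it uses the posterior mean from the DDPM paper. The helper names `predict_x0` and `posterior_mean` are hypothetical. A real scheduler step may also clip the predicted clean sample and add variance noise, both of which are left out here.

```python
import math

# Linear noise schedule with the values shown in the scheduler configuration.
n = 1000
betas = [0.0001 + i * (0.02 - 0.0001) / (n - 1) for i in range(n)]
alphas_cumprod, running = [], 1.0
for beta in betas:
    running *= 1.0 - beta
    alphas_cumprod.append(running)


def predict_x0(x_t, eps, t):
    """Invert x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps for x0."""
    abar_t = alphas_cumprod[t]
    return (x_t - math.sqrt(1.0 - abar_t) * eps) / math.sqrt(abar_t)


def posterior_mean(x_t, x0_pred, t):
    """Mean of q(x_{t-1} | x_t, x0): a weighted mix of x0_pred and x_t."""
    abar_t = alphas_cumprod[t]
    abar_prev = alphas_cumprod[t - 1] if t > 0 else 1.0
    coef_x0 = math.sqrt(abar_prev) * betas[t] / (1.0 - abar_t)
    coef_xt = math.sqrt(1.0 - betas[t]) * (1.0 - abar_prev) / (1.0 - abar_t)
    return coef_x0 * x0_pred + coef_xt * x_t


t = 2
x0, true_noise = 0.5, -1.2  # hypothetical scalar "pixel" and the noise added to it
x_t = math.sqrt(alphas_cumprod[t]) * x0 + math.sqrt(1.0 - alphas_cumprod[t]) * true_noise

x0_pred = predict_x0(x_t, true_noise, t)  # a perfect model would output true_noise
x_prev_mean = posterior_mean(x_t, x0_pred, t)
print(x_t, x0_pred, x_prev_mean)
```

With a perfect noise prediction the clean value is recovered exactly, and the mean of the next sample lies between the noisy input and that clean value.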
The `less_noisy_sample` can be passed to the next `timestep`, where it becomes even less noisy! Let's bring it all together now and visualize the entire denoising process.
First, create a function that postprocesses the denoised image and displays it as a `PIL.Image`:
```py
>>> import PIL.Image
>>> import numpy as np
>>> def display_sample(sample, i):
...     image_processed = sample.cpu().permute(0, 2, 3, 1)
...     image_processed = (image_processed + 1.0) * 127.5
...     image_processed = image_processed.numpy().astype(np.uint8)
...     image_pil = PIL.Image.fromarray(image_processed[0])
...     display(f"Image at step {i}")
...     display(image_pil)
```
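The arithmetic in this function maps model outputs from the range `[-1, 1]` to pixel values in `[0, 255]`. The plain-Python sketch below shows the same mapping for single values, with an added clamp that the function above does not perform. The helper `to_uint8` is hypothetical, and the clamp is an assumption about how you might want to treat values that fall slightly outside `[-1, 1]`.

```python
def to_uint8(value):
    """Map a float in [-1, 1] to an integer in [0, 255], clamping outliers."""
    scaled = (value + 1.0) * 127.5           # [-1, 1] -> [0, 255]
    clamped = min(max(scaled, 0.0), 255.0)   # keep out-of-range values valid
    return int(clamped)                      # truncate toward zero


print([to_uint8(v) for v in (-1.0, 0.0, 1.0, 1.3)])  # prints [0, 127, 255, 255]
```

Without the clamp, a value such as `1.3` would scale to `293.25`, which does not fit in an unsigned 8-bit integer.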
To speed up the denoising process, move the input and the model to a GPU:
```py
>>> model.to("cuda")
>>> noisy_sample = noisy_sample.to("cuda")
```
Now create a denoising loop that predicts the residual of the less noisy sample and then computes the less noisy sample with the scheduler:
```py
>>> import tqdm
>>> sample = noisy_sample
>>> for i, t in enumerate(tqdm.tqdm(scheduler.timesteps)):
...     # 1. predict noise residual
...     with torch.no_grad():
...         residual = model(sample, t).sample
...     # 2. compute less noisy image and set x_t -> x_t-1
...     sample = scheduler.step(residual, t, sample).prev_sample
...     # 3. optionally look at image
...     if (i + 1) % 50 == 0:
...         display_sample(sample, i + 1)
```
Sit back and watch as a cat is generated from nothing but noise! 😻
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/diffusion-quicktour.png"/>
</div>
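If you want to see the structure of the denoising loop without downloading a model, the toy sketch below runs the same predict-then-step cycle on a single number in plain Python. It assumes a hypothetical "dataset" whose only clean sample is `TARGET`, so the ideal noise prediction can be written in closed form as `oracle_model`. The `step` function is a simplified DDPM update written for this example and is not the scheduler's own code.

```python
import math
import random

random.seed(0)

# Linear noise schedule with the values from the scheduler configuration.
n = 1000
betas = [0.0001 + i * (0.02 - 0.0001) / (n - 1) for i in range(n)]
abar, running = [], 1.0
for beta in betas:
    running *= 1.0 - beta
    abar.append(running)

TARGET = 0.7  # hypothetical dataset: every clean sample equals this value


def oracle_model(x_t, t):
    """Exact noise residual when the only possible clean sample is TARGET."""
    return (x_t - math.sqrt(abar[t]) * TARGET) / math.sqrt(1.0 - abar[t])


def step(eps, t, x_t):
    """Simplified DDPM reverse step: posterior mean plus noise (none at t = 0)."""
    x0_pred = (x_t - math.sqrt(1.0 - abar[t]) * eps) / math.sqrt(abar[t])
    abar_prev = abar[t - 1] if t > 0 else 1.0
    coef_x0 = math.sqrt(abar_prev) * betas[t] / (1.0 - abar[t])
    coef_xt = math.sqrt(1.0 - betas[t]) * (1.0 - abar_prev) / (1.0 - abar[t])
    mean = coef_x0 * x0_pred + coef_xt * x_t
    if t == 0:
        return mean
    variance = betas[t] * (1.0 - abar_prev) / (1.0 - abar[t])
    return mean + math.sqrt(variance) * random.gauss(0.0, 1.0)


sample = random.gauss(0.0, 1.0)  # start from pure Gaussian noise
for t in reversed(range(n)):     # timesteps run from noisiest to cleanest
    residual = oracle_model(sample, t)      # 1. predict noise residual
    sample = step(residual, t, sample)      # 2. compute the less noisy sample

print(round(sample, 6))
```

Because the toy model is exact, the loop ends at `TARGET` regardless of the starting noise; a trained network only approximates this behavior.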
## Next steps
Hopefully you generated some cool images with 🧨 Diffusers in this quicktour! For your next steps, you can:
* Train or finetune a model to generate your own images in the [training](./tutorials/basic_training) tutorial.
* See the official and community [training or finetuning scripts](https://github.com/huggingface/diffusers/tree/main/examples#-diffusers-examples) for a variety of use cases.
* Learn more about loading, accessing, changing, and comparing schedulers in the [Using different Schedulers](./using-diffusers/schedulers) guide.
* Explore prompt engineering, speed and memory optimizations, and tips and tricks for generating higher-quality images in the [Stable Diffusion](./stable_diffusion) guide.
* Dive deeper into speeding up 🧨 Diffusers with the guide on [optimized PyTorch on a GPU](./optimization/fp16), and with the inference guides for running [Stable Diffusion on Apple Silicon (M1/M2)](./optimization/mps) and [ONNX Runtime](./optimization/onnx).