diffusers/docs/source/ko/training/controlnet.md
Seongsu Park, [Docs] Korean translation update (#4684), 2023-09-01

<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# ControlNet
[Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543) (ControlNet) is by Lvmin Zhang and Maneesh Agrawala.

This example is based on [the training example in the original ControlNet repository](https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md). It trains a ControlNet to fill circles using a [small synthetic dataset](https://huggingface.co/datasets/fusing/fill50k).
## ์˜์กด์„ฑ ์„ค์น˜ํ•˜๊ธฐ
์•„๋ž˜์˜ ์Šคํฌ๋ฆฝํŠธ๋ฅผ ์‹คํ–‰ํ•˜๊ธฐ ์ „์—, ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ์˜ ํ•™์Šต ์˜์กด์„ฑ์„ ์„ค์น˜ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
<Tip warning={true}>
๊ฐ€์žฅ ์ตœ์‹  ๋ฒ„์ „์˜ ์˜ˆ์‹œ ์Šคํฌ๋ฆฝํŠธ๋ฅผ ์„ฑ๊ณต์ ์œผ๋กœ ์‹คํ–‰ํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š”, ์†Œ์Šค์—์„œ ์„ค์น˜ํ•˜๊ณ  ์ตœ์‹  ๋ฒ„์ „์˜ ์„ค์น˜๋ฅผ ์œ ์ง€ํ•˜๋Š” ๊ฒƒ์„ ๊ฐ•๋ ฅํ•˜๊ฒŒ ์ถ”์ฒœํ•ฉ๋‹ˆ๋‹ค. ์šฐ๋ฆฌ๋Š” ์˜ˆ์‹œ ์Šคํฌ๋ฆฝํŠธ๋“ค์„ ์ž์ฃผ ์—…๋ฐ์ดํŠธํ•˜๊ณ  ์˜ˆ์‹œ์— ๋งž์ถ˜ ํŠน์ •ํ•œ ์š”๊ตฌ์‚ฌํ•ญ์„ ์„ค์น˜ํ•ฉ๋‹ˆ๋‹ค.
</Tip>
์œ„ ์‚ฌํ•ญ์„ ๋งŒ์กฑ์‹œํ‚ค๊ธฐ ์œ„ํ•ด์„œ, ์ƒˆ๋กœ์šด ๊ฐ€์ƒํ™˜๊ฒฝ์—์„œ ๋‹ค์Œ ์ผ๋ จ์˜ ์Šคํ…์„ ์‹คํ–‰ํ•˜์„ธ์š”:
```bash
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install -e .
```
๊ทธ ๋‹ค์Œ์—๋Š” [์˜ˆ์‹œ ํด๋”](https://github.com/huggingface/diffusers/tree/main/examples/controlnet)์œผ๋กœ ์ด๋™ํ•ฉ๋‹ˆ๋‹ค.
```bash
cd examples/controlnet
```
์ด์ œ ์‹คํ–‰ํ•˜์„ธ์š”:
```bash
pip install -r requirements.txt
```
[๐Ÿค—Accelerate](https://github.com/huggingface/accelerate/) ํ™˜๊ฒฝ์„ ์ดˆ๊ธฐํ™” ํ•ฉ๋‹ˆ๋‹ค:
```bash
accelerate config
```
ํ˜น์€ ์—ฌ๋Ÿฌ๋ถ„์˜ ํ™˜๊ฒฝ์ด ๋ฌด์—‡์ธ์ง€ ๋ชฐ๋ผ๋„ ๊ธฐ๋ณธ์ ์ธ ๐Ÿค—Accelerate ๊ตฌ์„ฑ์œผ๋กœ ์ดˆ๊ธฐํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:
```bash
accelerate config default
```
ํ˜น์€ ๋‹น์‹ ์˜ ํ™˜๊ฒฝ์ด ๋…ธํŠธ๋ถ ๊ฐ™์€ ์ƒํ˜ธ์ž‘์šฉํ•˜๋Š” ์‰˜์„ ์ง€์›ํ•˜์ง€ ์•Š๋Š”๋‹ค๋ฉด, ์•„๋ž˜์˜ ์ฝ”๋“œ๋กœ ์ดˆ๊ธฐํ™” ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:
```python
from accelerate.utils import write_basic_config
write_basic_config()
```
## ์›์„ ์ฑ„์šฐ๋Š” ๋ฐ์ดํ„ฐ์…‹
์›๋ณธ ๋ฐ์ดํ„ฐ์…‹์€ ControlNet [repo](https://huggingface.co/lllyasviel/ControlNet/blob/main/training/fill50k.zip)์— ์˜ฌ๋ผ์™€์žˆ์ง€๋งŒ, ์šฐ๋ฆฌ๋Š” [์—ฌ๊ธฐ](https://huggingface.co/datasets/fusing/fill50k)์— ์ƒˆ๋กญ๊ฒŒ ๋‹ค์‹œ ์˜ฌ๋ ค์„œ ๐Ÿค— Datasets ๊ณผ ํ˜ธํ™˜๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ ํ•™์Šต ์Šคํฌ๋ฆฝํŠธ ์ƒ์—์„œ ๋ฐ์ดํ„ฐ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ๋ฅผ ๋‹ค๋ฃฐ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
์šฐ๋ฆฌ์˜ ํ•™์Šต ์˜ˆ์‹œ๋Š” ์›๋ž˜ ControlNet์˜ ํ•™์Šต์— ์“ฐ์˜€๋˜ [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ ‡์ง€๋งŒ ControlNet์€ ๋Œ€์‘๋˜๋Š” ์–ด๋А Stable Diffusion ๋ชจ๋ธ([`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4)) ํ˜น์€ [`stabilityai/stable-diffusion-2-1`](https://huggingface.co/stabilityai/stable-diffusion-2-1)์˜ ์ฆ๊ฐ€๋ฅผ ์œ„ํ•ด ํ•™์Šต๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
์ž์ฒด ๋ฐ์ดํ„ฐ์…‹์„ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” [ํ•™์Šต์„ ์œ„ํ•œ ๋ฐ์ดํ„ฐ์…‹ ์ƒ์„ฑํ•˜๊ธฐ](create_dataset) ๊ฐ€์ด๋“œ๋ฅผ ํ™•์ธํ•˜์„ธ์š”.
## ํ•™์Šต
์ด ํ•™์Šต์— ์‚ฌ์šฉ๋  ๋‹ค์Œ ์ด๋ฏธ์ง€๋“ค์„ ๋‹ค์šด๋กœ๋“œํ•˜์„ธ์š”:
```sh
wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png
wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png
```
`MODEL_NAME` ํ™˜๊ฒฝ ๋ณ€์ˆ˜ (Hub ๋ชจ๋ธ ๋ฆฌํฌ์ง€ํ† ๋ฆฌ ์•„์ด๋”” ํ˜น์€ ๋ชจ๋ธ ๊ฐ€์ค‘์น˜๊ฐ€ ์žˆ๋Š” ๋””๋ ‰ํ† ๋ฆฌ๋กœ ๊ฐ€๋Š” ์ฃผ์†Œ)๋ฅผ ๋ช…์‹œํ•˜๊ณ  [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) ์ธ์ž๋กœ ํ™˜๊ฒฝ๋ณ€์ˆ˜๋ฅผ ๋ณด๋ƒ…๋‹ˆ๋‹ค.
ํ•™์Šต ์Šคํฌ๋ฆฝํŠธ๋Š” ๋‹น์‹ ์˜ ๋ฆฌํฌ์ง€ํ† ๋ฆฌ์— `diffusion_pytorch_model.bin` ํŒŒ์ผ์„ ์ƒ์„ฑํ•˜๊ณ  ์ €์žฅํ•ฉ๋‹ˆ๋‹ค.
```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"
accelerate launch train_controlnet.py \
--pretrained_model_name_or_path=$MODEL_DIR \
--output_dir=$OUTPUT_DIR \
--dataset_name=fusing/fill50k \
--resolution=512 \
--learning_rate=1e-5 \
--validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
--validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
--train_batch_size=4 \
--push_to_hub
```
์ด ๊ธฐ๋ณธ์ ์ธ ์„ค์ •์œผ๋กœ๋Š” ~38GB VRAM์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.
๊ธฐ๋ณธ์ ์œผ๋กœ ํ•™์Šต ์Šคํฌ๋ฆฝํŠธ๋Š” ๊ฒฐ๊ณผ๋ฅผ ํ…์„œ๋ณด๋“œ์— ๊ธฐ๋กํ•ฉ๋‹ˆ๋‹ค. ๊ฐ€์ค‘์น˜(weight)์™€ ํŽธํ–ฅ(bias)์„ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด `--report_to wandb` ๋ฅผ ์ „๋‹ฌํ•ฉ๋‹ˆ๋‹ค.
๋” ์ž‘์€ batch(๋ฐฐ์น˜) ํฌ๊ธฐ๋กœ gradient accumulation(๊ธฐ์šธ๊ธฐ ๋ˆ„์ )์„ ํ•˜๋ฉด ํ•™์Šต ์š”๊ตฌ์‚ฌํ•ญ์„ ~20 GB VRAM์œผ๋กœ ์ค„์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"
accelerate launch train_controlnet.py \
--pretrained_model_name_or_path=$MODEL_DIR \
--output_dir=$OUTPUT_DIR \
--dataset_name=fusing/fill50k \
--resolution=512 \
--learning_rate=1e-5 \
--validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
--validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
--train_batch_size=1 \
--gradient_accumulation_steps=4 \
--push_to_hub
```
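The trade in the low-memory command above is purely arithmetic: gradient accumulation sums gradients over several small forward/backward passes before each optimizer step, so the effective batch size is the per-step batch size times the number of accumulation steps. A minimal sketch of that bookkeeping:

```python
# Effective batch size = train_batch_size × gradient_accumulation_steps (× number of GPUs).
# Smaller per-step batches use less VRAM; accumulation restores the effective batch
# at the cost of wall-clock time.
def effective_batch_size(train_batch_size: int,
                         gradient_accumulation_steps: int = 1,
                         num_gpus: int = 1) -> int:
    return train_batch_size * gradient_accumulation_steps * num_gpus

print(effective_batch_size(4))     # default command: 4
print(effective_batch_size(1, 4))  # low-memory command: also 4
```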
## ์—ฌ๋Ÿฌ๊ฐœ GPU๋กœ ํ•™์Šตํ•˜๊ธฐ
`accelerate` ์€ seamless multi-GPU ํ•™์Šต์„ ๊ณ ๋ คํ•ฉ๋‹ˆ๋‹ค. `accelerate`๊ณผ ํ•จ๊ป˜ ๋ถ„์‚ฐ๋œ ํ•™์Šต์„ ์‹คํ–‰ํ•˜๊ธฐ ์œ„ํ•ด [์—ฌ๊ธฐ](https://huggingface.co/docs/accelerate/basic_tutorials/launch)
์˜ ์„ค๋ช…์„ ํ™•์ธํ•˜์„ธ์š”. ์•„๋ž˜๋Š” ์˜ˆ์‹œ ๋ช…๋ น์–ด์ž…๋‹ˆ๋‹ค:
```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"
accelerate launch --mixed_precision="fp16" --multi_gpu train_controlnet.py \
--pretrained_model_name_or_path=$MODEL_DIR \
--output_dir=$OUTPUT_DIR \
--dataset_name=fusing/fill50k \
--resolution=512 \
--learning_rate=1e-5 \
--validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
--validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
--train_batch_size=4 \
--mixed_precision="fp16" \
--tracker_project_name="controlnet-demo" \
--report_to=wandb \
--push_to_hub
```
## ์˜ˆ์‹œ ๊ฒฐ๊ณผ
#### ๋ฐฐ์น˜ ์‚ฌ์ด์ฆˆ 8๋กœ 300 ์Šคํ… ์ดํ›„:
| | |
|-------------------|:-------------------------:|
| | ํ‘ธ๋ฅธ ๋ฐฐ๊ฒฝ๊ณผ ๋นจ๊ฐ„ ์› |
![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png) | ![ํ‘ธ๋ฅธ ๋ฐฐ๊ฒฝ๊ณผ ๋นจ๊ฐ„ ์›](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/red_circle_with_blue_background_300_steps.png) |
| | ๊ฐˆ์ƒ‰ ๊ฝƒ ๋ฐฐ๊ฒฝ๊ณผ ์ฒญ๋ก์ƒ‰ ์› |
![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png) | ![๊ฐˆ์ƒ‰ ๊ฝƒ ๋ฐฐ๊ฒฝ๊ณผ ์ฒญ๋ก์ƒ‰ ์›](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/cyan_circle_with_brown_floral_background_300_steps.png) |
#### ๋ฐฐ์น˜ ์‚ฌ์ด์ฆˆ 8๋กœ 6000 ์Šคํ… ์ดํ›„:
| | |
|-------------------|:-------------------------:|
| | ํ‘ธ๋ฅธ ๋ฐฐ๊ฒฝ๊ณผ ๋นจ๊ฐ„ ์› |
![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png) | ![ํ‘ธ๋ฅธ ๋ฐฐ๊ฒฝ๊ณผ ๋นจ๊ฐ„ ์›](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/red_circle_with_blue_background_6000_steps.png) |
| | ๊ฐˆ์ƒ‰ ๊ฝƒ ๋ฐฐ๊ฒฝ๊ณผ ์ฒญ๋ก์ƒ‰ ์› |
![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png) | ![๊ฐˆ์ƒ‰ ๊ฝƒ ๋ฐฐ๊ฒฝ๊ณผ ์ฒญ๋ก์ƒ‰ ์›](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/cyan_circle_with_brown_floral_background_6000_steps.png) |
## 16GB GPU์—์„œ ํ•™์Šตํ•˜๊ธฐ
16GB GPU์—์„œ ํ•™์Šตํ•˜๊ธฐ ์œ„ํ•ด ๋‹ค์Œ์˜ ์ตœ์ ํ™”๋ฅผ ์ง„ํ–‰ํ•˜์„ธ์š”:
- ๊ธฐ์šธ๊ธฐ ์ฒดํฌํฌ์ธํŠธ ์ €์žฅํ•˜๊ธฐ
- bitsandbyte์˜ [8-bit optimizer](https://github.com/TimDettmers/bitsandbytes#requirements--installation)๊ฐ€ ์„ค์น˜๋˜์ง€ ์•Š์•˜๋‹ค๋ฉด ๋งํฌ์— ์—ฐ๊ฒฐ๋œ ์„ค๋ช…์„œ๋ฅผ ๋ณด์„ธ์š”.
์ด์ œ ํ•™์Šต ์Šคํฌ๋ฆฝํŠธ๋ฅผ ์‹œ์ž‘ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:
```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"
accelerate launch train_controlnet.py \
--pretrained_model_name_or_path=$MODEL_DIR \
--output_dir=$OUTPUT_DIR \
--dataset_name=fusing/fill50k \
--resolution=512 \
--learning_rate=1e-5 \
--validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
--validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
--train_batch_size=1 \
--gradient_accumulation_steps=4 \
--gradient_checkpointing \
--use_8bit_adam \
--push_to_hub
```
## 12GB GPU์—์„œ ํ•™์Šตํ•˜๊ธฐ
12GB GPU์—์„œ ์‹คํ–‰ํ•˜๊ธฐ ์œ„ํ•ด ๋‹ค์Œ์˜ ์ตœ์ ํ™”๋ฅผ ์ง„ํ–‰ํ•˜์„ธ์š”:
- ๊ธฐ์šธ๊ธฐ ์ฒดํฌํฌ์ธํŠธ ์ €์žฅํ•˜๊ธฐ
- bitsandbyte์˜ 8-bit [optimizer](https://github.com/TimDettmers/bitsandbytes#requirements--installation)(๊ฐ€ ์„ค์น˜๋˜์ง€ ์•Š์•˜๋‹ค๋ฉด ๋งํฌ์— ์—ฐ๊ฒฐ๋œ ์„ค๋ช…์„œ๋ฅผ ๋ณด์„ธ์š”)
- [xFormers](https://huggingface.co/docs/diffusers/training/optimization/xformers)(๊ฐ€ ์„ค์น˜๋˜์ง€ ์•Š์•˜๋‹ค๋ฉด ๋งํฌ์— ์—ฐ๊ฒฐ๋œ ์„ค๋ช…์„œ๋ฅผ ๋ณด์„ธ์š”)
- ๊ธฐ์šธ๊ธฐ๋ฅผ `None`์œผ๋กœ ์„ค์ •
```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"
accelerate launch train_controlnet.py \
--pretrained_model_name_or_path=$MODEL_DIR \
--output_dir=$OUTPUT_DIR \
--dataset_name=fusing/fill50k \
--resolution=512 \
--learning_rate=1e-5 \
--validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
--validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
--train_batch_size=1 \
--gradient_accumulation_steps=4 \
--gradient_checkpointing \
--use_8bit_adam \
--enable_xformers_memory_efficient_attention \
--set_grads_to_none \
--push_to_hub
```
Make sure `xformers` is installed with `pip install xformers` and that the `enable_xformers_memory_efficient_attention` option is used.
## 8GB GPU์—์„œ ํ•™์Šตํ•˜๊ธฐ
์šฐ๋ฆฌ๋Š” ControlNet์„ ์ง€์›ํ•˜๊ธฐ ์œ„ํ•œ DeepSpeed๋ฅผ ์ฒ ์ €ํ•˜๊ฒŒ ํ…Œ์ŠคํŠธํ•˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค. ํ™˜๊ฒฝ์„ค์ •์ด ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์ €์žฅํ•  ๋•Œ,
๊ทธ ํ™˜๊ฒฝ์ด ์„ฑ๊ณต์ ์œผ๋กœ ํ•™์Šตํ–ˆ๋Š”์ง€๋ฅผ ํ™•์ •ํ•˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค. ์„ฑ๊ณตํ•œ ํ•™์Šต ์‹คํ–‰์„ ์œ„ํ•ด ์„ค์ •์„ ๋ณ€๊ฒฝํ•ด์•ผ ํ•  ๊ฐ€๋Šฅ์„ฑ์ด ๋†’์Šต๋‹ˆ๋‹ค.
8GB GPU์—์„œ ์‹คํ–‰ํ•˜๊ธฐ ์œ„ํ•ด ๋‹ค์Œ์˜ ์ตœ์ ํ™”๋ฅผ ์ง„ํ–‰ํ•˜์„ธ์š”:
- ๊ธฐ์šธ๊ธฐ ์ฒดํฌํฌ์ธํŠธ ์ €์žฅํ•˜๊ธฐ
- bitsandbyte์˜ 8-bit [optimizer](https://github.com/TimDettmers/bitsandbytes#requirements--installation)(๊ฐ€ ์„ค์น˜๋˜์ง€ ์•Š์•˜๋‹ค๋ฉด ๋งํฌ์— ์—ฐ๊ฒฐ๋œ ์„ค๋ช…์„œ๋ฅผ ๋ณด์„ธ์š”)
- [xFormers](https://huggingface.co/docs/diffusers/training/optimization/xformers)(๊ฐ€ ์„ค์น˜๋˜์ง€ ์•Š์•˜๋‹ค๋ฉด ๋งํฌ์— ์—ฐ๊ฒฐ๋œ ์„ค๋ช…์„œ๋ฅผ ๋ณด์„ธ์š”)
- ๊ธฐ์šธ๊ธฐ๋ฅผ `None`์œผ๋กœ ์„ค์ •
- DeepSpeed stage 2 ๋ณ€์ˆ˜์™€ optimizer ์—†์—๊ธฐ
- fp16 ํ˜ผํ•ฉ ์ •๋ฐ€๋„(precision)
[DeepSpeed](https://www.deepspeed.ai/)๋Š” CPU ๋˜๋Š” NVME๋กœ ํ…์„œ๋ฅผ VRAM์—์„œ ์˜คํ”„๋กœ๋“œํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
์ด๋ฅผ ์œ„ํ•ด์„œ ํ›จ์”ฌ ๋” ๋งŽ์€ RAM(์•ฝ 25 GB)๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.
DeepSpeed stage 2๋ฅผ ํ™œ์„ฑํ™”ํ•˜๊ธฐ ์œ„ํ•ด์„œ `accelerate config`๋กœ ํ™˜๊ฒฝ์„ ๊ตฌ์„ฑํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค.
๊ตฌ์„ฑ(configuration) ํŒŒ์ผ์€ ์ด๋Ÿฐ ๋ชจ์Šต์ด์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค:
```yaml
compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 4
  offload_optimizer_device: cpu
  offload_param_device: cpu
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
```
<ํŒ>
[๋ฌธ์„œ](https://huggingface.co/docs/accelerate/usage_guides/deepspeed)๋ฅผ ๋” ๋งŽ์€ DeepSpeed ์„ค์ • ์˜ต์…˜์„ ์œ„ํ•ด ๋ณด์„ธ์š”.
<ํŒ>
๊ธฐ๋ณธ Adam optimizer๋ฅผ DeepSpeed'์˜ Adam
`deepspeed.ops.adam.DeepSpeedCPUAdam` ์œผ๋กœ ๋ฐ”๊พธ๋ฉด ์ƒ๋‹นํ•œ ์†๋„ ํ–ฅ์ƒ์„ ์ด๋ฃฐ์ˆ˜ ์žˆ์ง€๋งŒ,
Pytorch์™€ ๊ฐ™์€ ๋ฒ„์ „์˜ CUDA toolchain์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. 8-๋น„ํŠธ optimizer๋Š” ํ˜„์žฌ DeepSpeed์™€
ํ˜ธํ™˜๋˜์ง€ ์•Š๋Š” ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค.
```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"
accelerate launch train_controlnet.py \
--pretrained_model_name_or_path=$MODEL_DIR \
--output_dir=$OUTPUT_DIR \
--dataset_name=fusing/fill50k \
--resolution=512 \
--validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
--validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
--train_batch_size=1 \
--gradient_accumulation_steps=4 \
--gradient_checkpointing \
--enable_xformers_memory_efficient_attention \
--set_grads_to_none \
--mixed_precision fp16 \
--push_to_hub
```
## ์ถ”๋ก 
ํ•™์Šต๋œ ๋ชจ๋ธ์€ [`StableDiffusionControlNetPipeline`]๊ณผ ํ•จ๊ป˜ ์‹คํ–‰๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
`base_model_path`์™€ `controlnet_path` ์— ๊ฐ’์„ ์ง€์ •ํ•˜์„ธ์š” `--pretrained_model_name_or_path` ์™€
`--output_dir` ๋Š” ํ•™์Šต ์Šคํฌ๋ฆฝํŠธ์— ๊ฐœ๋ณ„์ ์œผ๋กœ ์ง€์ •๋ฉ๋‹ˆ๋‹ค.
```py
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image
import torch

base_model_path = "path to model"
controlnet_path = "path to controlnet"

controlnet = ControlNetModel.from_pretrained(controlnet_path, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    base_model_path, controlnet=controlnet, torch_dtype=torch.float16
)

# speed up the diffusion process with a faster scheduler and memory optimizations
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# remove the following line if xformers is not installed
pipe.enable_xformers_memory_efficient_attention()

pipe.enable_model_cpu_offload()

control_image = load_image("./conditioning_image_1.png")
prompt = "pale golden rod circle with old lace background"

# generate the image
generator = torch.manual_seed(0)
image = pipe(
    prompt, num_inference_steps=20, generator=generator, image=control_image
).images[0]

image.save("./output.png")
```