[Docs] Korean translation (optimization, training) (#3488)
* feat) optimization kr translation
* fix) typo, italic setting
* feat) dreambooth, text2image kr
* feat) lora kr
* fix) LoRA
* fix) fp16 fix
* fix) doc-builder style
* fix) fp16: correct some wording
* fix) fp16 style fix
* fix) opt, training docs update
* feat) toctree update
* feat) toctree update

Co-authored-by: Chanran Kim <seriousran@gmail.com>
@@ -3,191 +3,46 @@
    title: "🧨 Diffusers"
  - local: quicktour
    title: "Quicktour"
  - local: in_translation
    title: Stable Diffusion
  - local: installation
    title: "Installation"
  title: "Get started"

- sections:
  - sections:
  - local: in_translation
    title: "Loading Pipelines, Models, and Schedulers"
    title: Overview
  - local: in_translation
    title: "Using different Schedulers"
    title: Unconditional image generation
  - local: in_translation
    title: "Configuring Pipelines, Models, and Schedulers"
    title: Textual Inversion
  - local: training/dreambooth
    title: DreamBooth
  - local: training/text2image
    title: Text-to-image
  - local: training/lora
    title: Low-Rank Adaptation of Large Language Models (LoRA)
  - local: in_translation
    title: "Loading and Adding Custom Pipelines"
    title: "Loading & Hub (translation pending)"
- sections:
    title: ControlNet
  - local: in_translation
    title: "Unconditional Image Generation"
  - local: in_translation
    title: "Text-to-Image Generation"
  - local: in_translation
    title: "Text-Guided Image-to-Image"
  - local: in_translation
    title: "Text-Guided Image-Inpainting"
  - local: in_translation
    title: "Text-Guided Depth-to-Image"
  - local: in_translation
    title: "Reusing seeds for deterministic generation"
  - local: in_translation
    title: "Community Pipelines"
  - local: in_translation
    title: "How to contribute a Pipeline"
    title: "Pipelines for Inference (translation pending)"
- sections:
  - local: in_translation
    title: "Reinforcement Learning"
  - local: in_translation
    title: "Audio"
  - local: in_translation
    title: "Other Modalities"
    title: "Taking Diffusers Beyond Images"
    title: "Using Diffusers (translation pending)"
    title: InstructPix2Pix training
    title: Training
- sections:
  - local: in_translation
    title: "Memory and Speed"
    title: Overview
  - local: optimization/fp16
    title: Memory and Speed
  - local: in_translation
    title: "xFormers"
  - local: in_translation
    title: "ONNX"
  - local: in_translation
    title: "OpenVINO"
  - local: in_translation
    title: "MPS"
  - local: in_translation
    title: "Habana Gaudi"
    title: "Optimization/Special Hardware (translation pending)"
- sections:
  - local: in_translation
    title: "Overview"
  - local: in_translation
    title: "Unconditional Image Generation"
  - local: in_translation
    title: "Textual Inversion"
  - local: in_translation
    title: "Dreambooth"
  - local: in_translation
    title: "Text-to-image fine-tuning"
    title: "Training (translation pending)"
- sections:
  - local: in_translation
    title: "Stable Diffusion"
  - local: in_translation
    title: "Philosophy"
  - local: in_translation
    title: "How to contribute?"
    title: "Conceptual Guides (translation pending)"
- sections:
  - sections:
  - local: in_translation
    title: "Models"
  - local: in_translation
    title: "Diffusion Pipeline"
  - local: in_translation
    title: "Logging"
  - local: in_translation
    title: "Configuration"
  - local: in_translation
    title: "Outputs"
    title: "Main Classes"

- sections:
  - local: in_translation
    title: "Overview"
  - local: in_translation
    title: "AltDiffusion"
  - local: in_translation
    title: "Cycle Diffusion"
  - local: in_translation
    title: "DDIM"
  - local: in_translation
    title: "DDPM"
  - local: in_translation
    title: "Latent Diffusion"
  - local: in_translation
    title: "Unconditional Latent Diffusion"
  - local: in_translation
    title: "PaintByExample"
  - local: in_translation
    title: "PNDM"
  - local: in_translation
    title: "Score SDE VE"
  - sections:
  - local: in_translation
    title: "Overview"
  - local: in_translation
    title: "Text-to-Image"
  - local: in_translation
    title: "Image-to-Image"
  - local: in_translation
    title: "Inpaint"
  - local: in_translation
    title: "Depth-to-Image"
  - local: in_translation
    title: "Image-Variation"
  - local: in_translation
    title: "Super-Resolution"
    title: "Stable Diffusion"
  - local: in_translation
    title: "Stable Diffusion 2"
  - local: in_translation
    title: "Safe Stable Diffusion"
  - local: in_translation
    title: "Stochastic Karras VE"
  - local: in_translation
    title: "Dance Diffusion"
  - local: in_translation
    title: "UnCLIP"
  - local: in_translation
    title: "Versatile Diffusion"
  - local: in_translation
    title: "VQ Diffusion"
  - local: in_translation
    title: "RePaint"
  - local: in_translation
    title: "Audio Diffusion"
    title: "Pipelines (translation pending)"
- sections:
  - local: in_translation
    title: "Overview"
  - local: in_translation
    title: "DDIM"
  - local: in_translation
    title: "DDPM"
  - local: in_translation
    title: "Singlestep DPM-Solver"
  - local: in_translation
    title: "Multistep DPM-Solver"
  - local: in_translation
    title: "Heun Scheduler"
  - local: in_translation
    title: "DPM Discrete Scheduler"
  - local: in_translation
    title: "DPM Discrete Scheduler with ancestral sampling"
  - local: in_translation
    title: "Stochastic Kerras VE"
  - local: in_translation
    title: "Linear Multistep"
  - local: in_translation
    title: "PNDM"
  - local: in_translation
    title: "VE-SDE"
  - local: in_translation
    title: "IPNDM"
  - local: in_translation
    title: "VP-SDE"
  - local: in_translation
    title: "Euler scheduler"
  - local: in_translation
    title: "Euler Ancestral Scheduler"
  - local: in_translation
    title: "VQDiffusionScheduler"
  - local: in_translation
    title: "RePaint Scheduler"
    title: "Schedulers (translation pending)"
- sections:
  - local: in_translation
    title: "RL Planning"
    title: "Experimental Features"
    title: "API (translation pending)"
    title: Torch 2.0 support
  - local: optimization/xformers
    title: xFormers
  - local: optimization/onnx
    title: ONNX
  - local: optimization/open_vino
    title: OpenVINO
  - local: optimization/mps
    title: MPS
  - local: optimization/habana
    title: Habana Gaudi
    title: Optimization/Special Hardware
docs/source/ko/optimization/fp16.mdx (new file, 410 lines)
@@ -0,0 +1,410 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Memory and Speed

We present a few techniques and ideas to optimize 🤗 Diffusers *inference* for memory or speed.
As a general rule, we recommend [xFormers](https://github.com/facebookresearch/xformers) for memory-efficient attention, so please follow the recommended [installation instructions](xformers).

The following table describes how each setting affects latency and memory.

|                             | Latency | Speed-up |
| --------------------------- | ------- | -------- |
| no extra setting            | 9.50s   | x1       |
| cuDNN auto-tuner            | 9.37s   | x1.01    |
| fp16                        | 3.61s   | x2.63    |
| Channels Last memory format | 3.30s   | x2.88    |
| traced UNet                 | 3.21s   | x2.96    |
| memory-efficient attention  | 2.63s   | x3.61    |

<em>
Measured while generating a single 512x512 image from the prompt "a photo of an astronaut riding a horse on mars" with 50 DDIM steps on an NVIDIA TITAN RTX.
</em>

## Enable cuDNN auto-tuner

[NVIDIA cuDNN](https://developer.nvidia.com/cudnn) supports many algorithms to compute a convolution. The auto-tuner runs a short benchmark and selects the kernel with the best performance on the given hardware for the given input size.

Since we are using **convolutional networks** (other types are not currently supported), we can enable the cuDNN auto-tuner before launching inference with:

```python
import torch

torch.backends.cudnn.benchmark = True
```

### Use tf32 instead of fp32 (on Ampere and later CUDA devices)

On Ampere and later CUDA devices, matrix multiplications and convolutions can use the TensorFloat32 (TF32) mode, which is faster but slightly less accurate.
By default, PyTorch enables TF32 mode for convolutions but not for matrix multiplications.
Unless your network requires full float32 precision, we recommend enabling this setting for matrix multiplications as well.
It can significantly speed up computations with a typically negligible loss of numerical accuracy.
You can read more about it [here](https://huggingface.co/docs/transformers/v4.18.0/en/performance#tf32).
All you need to do is add the following before your inference:

```python
import torch

torch.backends.cuda.matmul.allow_tf32 = True
```

## Half-precision weights

To save more GPU memory and get more speed, you can load and run the model weights directly in half precision.
This involves loading the float16 version of the weights stored in a branch named `fp16`, and telling PyTorch to use the `float16` type when running them:

```Python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
```

<Tip warning={true}>
We strongly discourage using [`torch.autocast`](https://pytorch.org/docs/stable/amp.html#torch.autocast) in any of the pipelines, as it can lead to black images and is always slower than using pure float16 precision.
</Tip>

## Sliced attention for additional memory savings

For additional memory savings, you can use a sliced version of attention that performs the computation in steps instead of all at once.

<Tip>
Attention slicing is useful even with a batch size of just 1, as long as the model uses more than one attention head.
If there is more than one attention head, the *QK^T* attention matrix can be computed sequentially for each head, which can save a significant amount of memory.
</Tip>
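Before the actual API call shown next, here is a minimal, self-contained sketch of the idea behind the tip above. It is not the Diffusers implementation; the tensor shapes and `slice_size` are illustrative assumptions, and it only shows that computing attention slice by slice gives the same result while keeping only one small attention matrix in memory at a time.

```python
# Sketch only: softmax(Q K^T / sqrt(d)) V computed all at once vs. one slice at a time.
import torch


def full_attention(q, k, v):
    # q, k, v: (batch * heads, seq_len, head_dim), processed in a single batched matmul
    scores = (q @ k.transpose(1, 2)) * q.shape[-1] ** -0.5
    return scores.softmax(dim=-1) @ v


def sliced_attention(q, k, v, slice_size=1):
    # Same math, but only `slice_size` (batch * head) slices are materialized at once.
    out = torch.empty_like(q)
    for i in range(0, q.shape[0], slice_size):
        s = slice(i, i + slice_size)
        scores = (q[s] @ k[s].transpose(1, 2)) * q.shape[-1] ** -0.5
        out[s] = scores.softmax(dim=-1) @ v[s]
    return out


# Illustrative shapes: 8 heads x batch 2, sequence length 1024, head dim 64.
q = k = v = torch.randn(16, 1024, 64)
assert torch.allclose(full_attention(q, k, v), sliced_attention(q, k, v), atol=1e-4)
```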
To perform the attention computation sequentially over each head, you only have to call [`~StableDiffusionPipeline.enable_attention_slicing`] on your pipeline before inference, like here:

```Python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_attention_slicing()
image = pipe(prompt).images[0]
```

There is a small performance penalty of about 10% slower inference, but this method allows Stable Diffusion to run in as little as 3.2 GB of VRAM!


## Sliced VAE decode for larger batches

To decode large batches of images with limited VRAM, or to enable batches with 32 images or more, you can use sliced VAE decode, which decodes the batch's latent images one at a time.

You can combine this with [`~StableDiffusionPipeline.enable_attention_slicing`] or [`~StableDiffusionPipeline.enable_xformers_memory_efficient_attention`] to further minimize memory use.

To perform the VAE decode one image at a time, call [`~StableDiffusionPipeline.enable_vae_slicing`] on your pipeline before inference. For example:

```Python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_vae_slicing()
images = pipe([prompt] * 32).images
```

You may see a small performance boost in VAE decode on multi-image batches. There should be no performance impact on single-image batches.


<a name="sequential_offloading"></a>
## Offloading to CPU with accelerate for memory savings

For additional memory savings, you can offload the weights to CPU and only load them to GPU when performing the forward pass.

To perform CPU offloading, all you have to do is call [`~StableDiffusionPipeline.enable_sequential_cpu_offload`]:

```Python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_sequential_cpu_offload()
image = pipe(prompt).images[0]
```

This way, memory consumption can be brought down to less than 3 GB.

Note that this method works at the submodule level, not on whole models. This is the best way to minimize memory consumption, but inference is much slower due to the iterative nature of the process. The UNet component of the pipeline runs several times (as many as `num_inference_steps`); each time, the different submodules of the UNet are sequentially onloaded and offloaded as needed, resulting in a large number of memory transfers.

<Tip>
Consider using <a href="#model_offloading">model offloading</a> as another optimization point; it is much faster, though the memory savings are not as large.
</Tip>

It is also possible to chain it with attention slicing for minimal memory consumption (< 2 GB).


```Python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing(1)

image = pipe(prompt).images[0]
```

**Note**: when using `enable_sequential_cpu_offload()`, it is important to **not** move the pipeline to CUDA beforehand, or the gain in memory consumption will only be minimal. See [this issue](https://github.com/huggingface/diffusers/issues/1934) for more information.
<a name="model_offloading"></a>
## Model offloading for fast inference and memory savings

[Sequential CPU offloading](#sequential_offloading), as discussed in the previous section, preserves a lot of memory but makes inference slower, because submodules are moved to GPU as needed and are immediately returned to CPU when a new module runs.

Full-model offloading is an alternative that moves whole models to the GPU, instead of handling each model's constituent _modules_. This results in a negligible impact on inference time (compared with moving the pipeline to `cuda`), while still providing some memory savings.

In this scenario, only one of the main components of the pipeline (typically the text encoder, unet, and vae) will be on the GPU, while the others wait on the CPU.
Components like the UNet that run for multiple iterations stay on the GPU until they are no longer needed.

This feature can be enabled by calling `enable_model_cpu_offload()` on the pipeline, as shown below.

```Python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_model_cpu_offload()
image = pipe(prompt).images[0]
```

This is also compatible with attention slicing for additional memory savings.

```Python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_model_cpu_offload()
pipe.enable_attention_slicing(1)

image = pipe(prompt).images[0]
```

<Tip>
This feature requires `accelerate` version 0.17.0 or later.
</Tip>

## Using Channels Last memory format

Channels Last memory format is an alternative way of ordering NCHW tensors in memory that preserves the dimension ordering.
Channels Last tensors are ordered in such a way that the channels become the densest dimension (i.e. storing images pixel-per-pixel).
Since not all operators currently support the Channels Last format, it may result in worse performance, so it's better to try it and see if it works well for your model.

For example, to set the pipeline's UNet model to use the Channels Last format, you can use the following:

```python
print(pipe.unet.conv_out.state_dict()["weight"].stride())  # (2880, 9, 3, 1)
pipe.unet.to(memory_format=torch.channels_last)  # in-place operation
# having a stride of 1 for the 2nd dimension proves that it works: (2880, 1, 960, 320)
print(pipe.unet.conv_out.state_dict()["weight"].stride())
```
## Tracing

Tracing runs an example input tensor through your model and captures the operations that are invoked as that input makes its way through the model's layers, so that an executable or `ScriptFunction` is returned, which is then optimized with just-in-time compilation.

To trace the UNet model, you can use the following:

```python
import time
import torch
from diffusers import StableDiffusionPipeline
import functools

# disable torch gradients
torch.set_grad_enabled(False)

# set variables
n_experiments = 2
unet_runs_per_experiment = 50


# load inputs
def generate_inputs():
    sample = torch.randn(2, 4, 64, 64).half().cuda()
    timestep = torch.rand(1).half().cuda() * 999
    encoder_hidden_states = torch.randn(2, 77, 768).half().cuda()
    return sample, timestep, encoder_hidden_states


pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")
unet = pipe.unet
unet.eval()
unet.to(memory_format=torch.channels_last)  # use Channels Last memory format
unet.forward = functools.partial(unet.forward, return_dict=False)  # set return_dict=False as default

# warmup
for _ in range(3):
    with torch.inference_mode():
        inputs = generate_inputs()
        orig_output = unet(*inputs)

# trace
print("tracing..")
unet_traced = torch.jit.trace(unet, inputs)
unet_traced.eval()
print("done tracing")


# warmup and optimize graph
for _ in range(5):
    with torch.inference_mode():
        inputs = generate_inputs()
        orig_output = unet_traced(*inputs)


# benchmarking
with torch.inference_mode():
    for _ in range(n_experiments):
        torch.cuda.synchronize()
        start_time = time.time()
        for _ in range(unet_runs_per_experiment):
            orig_output = unet_traced(*inputs)
        torch.cuda.synchronize()
        print(f"unet traced inference took {time.time() - start_time:.2f} seconds")
    for _ in range(n_experiments):
        torch.cuda.synchronize()
        start_time = time.time()
        for _ in range(unet_runs_per_experiment):
            orig_output = unet(*inputs)
        torch.cuda.synchronize()
        print(f"unet inference took {time.time() - start_time:.2f} seconds")

# save the model
unet_traced.save("unet_traced.pt")
```

Then you can replace the pipeline's `unet` attribute with the traced model like the following:

```python
from diffusers import StableDiffusionPipeline
import torch
from dataclasses import dataclass


@dataclass
class UNet2DConditionOutput:
    sample: torch.FloatTensor


pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# use the jitted unet
unet_traced = torch.jit.load("unet_traced.pt")


# del pipe.unet
class TracedUNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.in_channels = pipe.unet.in_channels
        self.device = pipe.unet.device

    def forward(self, latent_model_input, t, encoder_hidden_states):
        sample = unet_traced(latent_model_input, t, encoder_hidden_states)[0]
        return UNet2DConditionOutput(sample=sample)


pipe.unet = TracedUNet()

prompt = "a photo of an astronaut riding a horse on mars"
with torch.inference_mode():
    image = pipe([prompt] * 1, num_inference_steps=50).images[0]
```


## Memory-efficient attention

Recent work on optimizing the bandwidth in the attention block has generated huge speed-ups and reductions in GPU memory usage.
The most recent type of memory-efficient attention is Flash Attention from @tridao: [code](https://github.com/HazyResearch/flash-attention), [paper](https://arxiv.org/pdf/2205.14135.pdf).

Here are the speed-ups obtained on a few Nvidia GPUs when running inference at 512x512 with a batch size of 1 (one prompt):

| GPU              | Base Attention FP16 | Memory-Efficient Attention FP16 |
|------------------|---------------------|---------------------------------|
| NVIDIA Tesla T4  | 3.5it/s             | 5.5it/s                         |
| NVIDIA 3060 RTX  | 4.6it/s             | 7.8it/s                         |
| NVIDIA A10G      | 8.88it/s            | 15.6it/s                        |
| NVIDIA RTX A6000 | 11.7it/s            | 21.09it/s                       |
| NVIDIA TITAN RTX | 12.51it/s           | 18.22it/s                       |
| A100-SXM4-40GB   | 18.6it/s            | 29.it/s                         |
| A100-SXM-80GB    | 18.7it/s            | 29.5it/s                        |

To leverage it, you should make sure the following are satisfied:
- PyTorch > 1.12
- CUDA available
- [xformers library installed](xformers)

```python
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

pipe.enable_xformers_memory_efficient_attention()

with torch.inference_mode():
    sample = pipe("a small cat")

# optional: you can disable it with
# pipe.disable_xformers_memory_efficient_attention()
```
docs/source/ko/optimization/habana.mdx (new file, 71 lines)
@@ -0,0 +1,71 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# How to use Stable Diffusion on Habana Gaudi

🤗 Diffusers is compatible with Habana Gaudi through 🤗 [Optimum Habana](https://huggingface.co/docs/optimum/habana/usage_guides/stable_diffusion).

## Requirements

- Optimum Habana 1.4 or later; here is [how to install it](https://huggingface.co/docs/optimum/habana/installation).
- SynapseAI 1.8.


## Inference pipeline

To generate images with Stable Diffusion 1 and 2 on Gaudi, you need to instantiate two instances:
- A pipeline with [`GaudiStableDiffusionPipeline`](https://huggingface.co/docs/optimum/habana/package_reference/stable_diffusion_pipeline). This pipeline supports *text-to-image generation*.
- A scheduler with [`GaudiDDIMScheduler`](https://huggingface.co/docs/optimum/habana/package_reference/stable_diffusion_pipeline#optimum.habana.diffusers.GaudiDDIMScheduler). This scheduler has been optimized for Habana Gaudi.

When initializing the pipeline, you have to specify `use_habana=True` to deploy it on HPUs.
Furthermore, to get the fastest possible generation, you should enable **HPU graphs** with `use_hpu_graphs=True`.
Finally, you will need to specify a [Gaudi configuration](https://huggingface.co/docs/optimum/habana/package_reference/gaudi_config), which can be downloaded from the [Hugging Face Hub](https://huggingface.co/Habana).

```python
from optimum.habana import GaudiConfig
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "stabilityai/stable-diffusion-2-base"
scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")
pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,
    use_hpu_graphs=True,
    gaudi_config="Habana/stable-diffusion",
)
```

You can then call the pipeline to generate images by batches from one or several prompts:

```python
outputs = pipeline(
    prompt=[
        "High quality photo of an astronaut riding a horse in space",
        "Face of a yellow cat, high resolution, sitting on a park bench",
    ],
    num_images_per_prompt=10,
    batch_size=4,
)
```

For more information, check out Optimum Habana's [documentation](https://huggingface.co/docs/optimum/habana/usage_guides/stable_diffusion) and the [example](https://github.com/huggingface/optimum-habana/tree/main/examples/stable-diffusion) provided in the official Github repository.


## Benchmark

Here are the latencies for Habana first-generation Gaudi and Gaudi2 with the [Habana/stable-diffusion](https://huggingface.co/Habana/stable-diffusion) Gaudi configuration (mixed precision bf16/fp32):

|                        | Latency (batch size = 1) | Throughput (batch size = 8) |
| ---------------------- |:------------------------:|:---------------------------:|
| first-generation Gaudi | 4.29s                    | 0.283 images/s              |
| Gaudi2                 | 1.54s                    | 0.904 images/s              |
docs/source/ko/optimization/mps.mdx (new file, 71 lines)
@@ -0,0 +1,71 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# How to use Stable Diffusion on Apple Silicon (M1/M2)

🤗 Diffusers is compatible with Apple silicon for Stable Diffusion inference using the PyTorch `mps` backend. These are the steps you need to follow to use your M1 or M2 computer with Stable Diffusion.

## Requirements

- Mac computer with Apple silicon (M1/M2) hardware.
- macOS 12.6 or later (13.0 or later recommended).
- arm64 version of Python.
- PyTorch 2.0 (recommended) or 1.13 (minimum version that supports `mps`). You can install it with `pip` or `conda` using the instructions at https://pytorch.org/get-started/locally/.


## Inference pipeline

The snippet below shows how to move the Stable Diffusion pipeline to your M1 or M2 device with the `mps` backend, using the familiar `to()` interface.


<Tip warning={true}>

**If you are using PyTorch 1.13**, we recommend "priming" the pipeline with an additional one-time pass through it. This is a temporary workaround for a strange issue we found: the first inference pass produces slightly different results than subsequent ones. You only need to do this pass once, and it's okay to use just one inference step and discard the result.

</Tip>

We recommend using PyTorch 2 or later, as it solves a number of problems, including the one described in the previous tip.


```python
# make sure you're logged in with `huggingface-cli login`
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("mps")

# Recommended if your computer has < 64 GB of RAM
pipe.enable_attention_slicing()

prompt = "a photo of an astronaut riding a horse on mars"

# First-time "warmup" pass (see the explanation above)
_ = pipe(prompt, num_inference_steps=1)

# Results match those from the CPU device after the warmup pass.
image = pipe(prompt).images[0]
```

## Performance recommendations

M1/M2 performance is very sensitive to memory pressure. The system will automatically swap if it needs to, but performance degrades significantly when it does.

We recommend using *attention slicing* to reduce memory pressure and prevent swapping during inference, particularly if your computer has less than 64 GB of system RAM, or if you generate images at non-standard resolutions larger than 512 × 512 pixels. Attention slicing performs the costly attention operation in multiple steps instead of all at once. It usually has a performance impact of ~20% on computers without universal memory, but we have observed *better performance* on most Apple Silicon computers, unless you have 64 GB or more.

```python
pipeline.enable_attention_slicing()
```

## Known Issues

- Generating multiple prompts in a batch [crashes or doesn't work reliably](https://github.com/huggingface/diffusers/issues/363). We believe this is related to the [`mps` backend in PyTorch](https://github.com/pytorch/pytorch/issues/84039). This is being resolved, but for now we recommend iterating instead of batching, as sketched below.
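The workaround above amounts to a simple loop that calls the pipeline once per prompt. The following sketch shows one way to do it; the prompt list and output file names are illustrative assumptions.

```python
# Run the pipeline once per prompt instead of passing the whole list as a single batch.
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("mps")
pipe.enable_attention_slicing()

prompts = [
    "a photo of an astronaut riding a horse on mars",
    "a watercolor painting of a lighthouse at sunset",
]

for i, prompt in enumerate(prompts):
    # batch size 1 per call avoids the multi-prompt batching issue on `mps`
    image = pipe(prompt).images[0]
    image.save(f"mps_output_{i}.png")
```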
docs/source/ko/optimization/onnx.mdx (new file, 65 lines)
@@ -0,0 +1,65 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->


# How to use ONNX Runtime for inference

🤗 Diffusers provides a Stable Diffusion pipeline compatible with ONNX Runtime. This allows you to run Stable Diffusion on any hardware that supports ONNX (including CPUs) where an accelerated version of PyTorch is not available.

## Installation

Install 🤗 Optimum with ONNX Runtime support using the following command:

```
pip install optimum["onnxruntime"]
```

## Stable Diffusion inference

The snippet below shows how to use ONNX Runtime. You need to use `ORTStableDiffusionPipeline` instead of `StableDiffusionPipeline`.
If you want to load a PyTorch model and convert it to the ONNX format on the fly, set `export=True`.

```python
from optimum.onnxruntime import ORTStableDiffusionPipeline

model_id = "runwayml/stable-diffusion-v1-5"
pipe = ORTStableDiffusionPipeline.from_pretrained(model_id, export=True)
prompt = "a photo of an astronaut riding a horse on mars"
images = pipe(prompt).images[0]
pipe.save_pretrained("./onnx-stable-diffusion-v1-5")
```

If you want to export the pipeline to the ONNX format offline and use it later for inference,
you can use the [`optimum-cli export`](https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli) command:

```bash
optimum-cli export onnx --model runwayml/stable-diffusion-v1-5 sd_v15_onnx/
```

Then perform inference:

```python
from optimum.onnxruntime import ORTStableDiffusionPipeline

model_id = "sd_v15_onnx"
pipe = ORTStableDiffusionPipeline.from_pretrained(model_id)
prompt = "a photo of an astronaut riding a horse on mars"
images = pipe(prompt).images[0]
```

Notice that we didn't have to specify `export=True` above.

You can find more examples in the [Optimum documentation](https://huggingface.co/docs/optimum/).

## Known Issues

- Generating multiple prompts in a batch seems to use too much memory. While we look into it, you may need to iterate instead of batching.
docs/source/ko/optimization/open_vino.mdx (new file, 39 lines)
@@ -0,0 +1,39 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# How to use OpenVINO for inference

🤗 [Optimum](https://github.com/huggingface/optimum-intel) provides a Stable Diffusion pipeline compatible with OpenVINO.
You can now easily perform inference with OpenVINO Runtime on a variety of Intel processors ([see](https://docs.openvino.ai/latest/openvino_docs_OV_UG_supported_plugins_Supported_Devices.html) the full list of supported devices).

## Installation

Install 🤗 Optimum with the following command:

```
pip install optimum["openvino"]
```

## Stable Diffusion inference

To load an OpenVINO model and run inference with OpenVINO Runtime, you need to replace `StableDiffusionPipeline` with `OVStableDiffusionPipeline`. If you want to load a PyTorch model and convert it to the OpenVINO format on the fly, set `export=True`.

```python
from optimum.intel.openvino import OVStableDiffusionPipeline

model_id = "runwayml/stable-diffusion-v1-5"
pipe = OVStableDiffusionPipeline.from_pretrained(model_id, export=True)
prompt = "a photo of an astronaut riding a horse on mars"
images = pipe(prompt).images[0]
```

You can find more examples (such as static reshaping and model compilation) in the [Optimum documentation](https://huggingface.co/docs/optimum/intel/inference#export-and-inference-of-stable-diffusion-models); a short sketch of those two steps follows below.
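As a rough illustration of what the Optimum documentation covers, here is a hedged sketch of statically reshaping the pipeline to fixed input shapes and pre-compiling it before the first call. The `reshape()`/`compile()` calls and their arguments are assumptions based on the Optimum Intel pipeline interface, so check them against the version you have installed; the shapes are illustrative.

```python
from optimum.intel.openvino import OVStableDiffusionPipeline

pipe = OVStableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", export=True)

# Fix the input shapes so OpenVINO can optimize for them (dynamic shapes are slower).
pipe.reshape(batch_size=1, height=512, width=512, num_images_per_prompt=1)

# Compile the model ahead of time instead of paying the cost on the first call.
pipe.compile()

image = pipe("a photo of an astronaut riding a horse on mars", height=512, width=512).images[0]
```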
docs/source/ko/optimization/xformers.mdx (new file, 36 lines)
@@ -0,0 +1,36 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Installing xFormers

We recommend [xFormers](https://github.com/facebookresearch/xformers) for both inference and training.
In our own tests, the optimizations performed in the attention blocks allow for both faster speed and reduced memory consumption.

Starting from xFormers version `0.0.16`, released in January 2023, installation can be easily performed using pre-built pip wheels:

```bash
pip install xformers
```

<Tip>

The xFormers PIP package requires the latest version of PyTorch (1.13.1 as of xFormers 0.0.16). If you need to use an older version of PyTorch, we recommend [installing xFormers from source](https://github.com/facebookresearch/xformers#installing-xformers) using the project instructions.

</Tip>

After xFormers is installed, you can use `enable_xformers_memory_efficient_attention()` for faster inference and reduced memory consumption, as described [here](fp16#memory-efficient-attention); a minimal usage sketch follows below.

<Tip warning={true}>

According to [this issue](https://github.com/huggingface/diffusers/issues/2234#issuecomment-1416931212), xFormers `v0.0.16` cannot be used for training (fine-tuning or DreamBooth) on some GPUs. If you run into that problem, please install a development version as indicated in that comment.

</Tip>
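For convenience, here is a minimal sketch of enabling it on a pipeline; it mirrors the snippet in the memory-efficient attention section of the fp16 guide, and the checkpoint name is illustrative.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Route attention through xFormers' memory-efficient kernels.
pipe.enable_xformers_memory_efficient_attention()

image = pipe("a photo of an astronaut riding a horse on mars").images[0]
```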
docs/source/ko/training/dreambooth.mdx (new file, 475 lines)
@@ -0,0 +1,475 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# DreamBooth

[DreamBooth](https://arxiv.org/abs/2208.12242) is a method to personalize text-to-image models like Stable Diffusion given just a few (3~5) images of a subject. It allows the model to generate contextualized images of the subject in different scenes, poses, and views.

![Dreambooth examples from the project's blog]()
<small>Dreambooth examples from the <a href="https://dreambooth.github.io">project's blog</a>.</small>


This guide shows you how to finetune DreamBooth with the [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) model for various GPU sizes and with Flax. All the DreamBooth training scripts used in this guide can be found [here](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth) if you're interested in digging deeper and seeing how things work.

Before running the scripts, make sure you install the library and the training dependencies. We also recommend installing 🧨 Diffusers from the `main` GitHub branch:

```bash
pip install git+https://github.com/huggingface/diffusers
pip install -U -r diffusers/examples/dreambooth/requirements.txt
```

xFormers is not part of the training requirements, but we recommend [installing it](../optimization/xformers) if you can, because it can make training faster and less memory intensive.

After all the dependencies are set up, initialize a [🤗 Accelerate](https://github.com/huggingface/accelerate/) environment with:

```bash
accelerate config
```

To set up a default 🤗 Accelerate environment without choosing any configurations, run:

```bash
accelerate config default
```

Or, if your environment doesn't support an interactive shell, like a notebook, you can use:

```py
from accelerate.utils import write_basic_config

write_basic_config()
```

## Finetuning

<Tip warning={true}>

DreamBooth finetuning is very sensitive to hyperparameters and easy to overfit. We recommend you take a look at our [in-depth analysis](https://huggingface.co/blog/dreambooth), which includes recommended settings to help you choose the appropriate hyperparameters.

</Tip>

<frameworkcontent>
<pt>
Let's try DreamBooth with a [few images of a dog](https://drive.google.com/drive/folders/1BO_dyz-p65qhBRRMRA4TbZ8qW4rB99JZ).
Download them, save them to a directory, and then set the `INSTANCE_DIR` environment variable to that path:


```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export OUTPUT_DIR="path_to_saved_model"
```

Then you can launch the training script (the full training script can be found [here](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py)) with the following command:

```bash
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400
```
</pt>
<jax>

If you have access to TPUs or want to train even faster, you can try out the [Flax training script](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_flax.py). The Flax training script doesn't support gradient checkpointing or gradient accumulation, so you'll need a GPU with at least 30GB of memory.

Before running the script, make sure you have the requirements installed:

```bash
pip install -U -r requirements.txt
```

Then you can launch the training script with the following command:

```bash
export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
export INSTANCE_DIR="path-to-instance-images"
export OUTPUT_DIR="path-to-save-model"

python train_dreambooth_flax.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5e-6 \
  --max_train_steps=400
```
</jax>
</frameworkcontent>
### Finetuning with prior-preserving loss

Prior preservation is used to avoid overfitting and language drift (check out the [paper](https://arxiv.org/abs/2208.12242) to learn more if you're interested). For prior preservation, other images of the same class are used as part of the training process. The nice thing is that you can generate those images with the Stable Diffusion model itself! The training script saves the generated images to a local path you specify.

According to the authors, it's good to generate `num_epochs * num_samples` images for prior preservation. In most cases, 200-300 images work well.

<frameworkcontent>
<pt>
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export CLASS_DIR="path_to_class_images"
export OUTPUT_DIR="path_to_saved_model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
</pt>
<jax>
```bash
export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

python train_dreambooth_flax.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5e-6 \
  --num_class_images=200 \
  --max_train_steps=800
```
</jax>
</frameworkcontent>
## Finetuning the text encoder and UNet

The script also allows you to finetune the `text_encoder` along with the `unet`. In our experiments (check out the [Training Stable Diffusion with DreamBooth using 🧨 Diffusers](https://huggingface.co/blog/dreambooth) post for more details), this yields much better results, especially when generating images of faces.

<Tip warning={true}>

Training the text encoder requires additional memory and won't fit on a 16GB GPU. You'll need at least 24GB VRAM to use this option.

</Tip>

Pass the `--train_text_encoder` argument to the training script to finetune the `text_encoder` and `unet`:

<frameworkcontent>
<pt>
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export CLASS_DIR="path_to_class_images"
export OUTPUT_DIR="path_to_saved_model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_text_encoder \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --use_8bit_adam \
  --gradient_checkpointing \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
</pt>
<jax>
```bash
export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

python train_dreambooth_flax.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_text_encoder \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=2e-6 \
  --num_class_images=200 \
  --max_train_steps=800
```
</jax>
</frameworkcontent>
## Finetuning with LoRA

You can also use Low-Rank Adaptation of Large Language Models (LoRA), a finetuning technique for accelerating the training of large models, with DreamBooth. For more details, take a look at the [LoRA training](training/lora#dreambooth) guide.

### Saving checkpoints while training

It's easy to overfit while training with DreamBooth, so sometimes it's useful to save regular checkpoints during training. One of the intermediate checkpoints might actually work better than the final model! To enable checkpointing, pass the following argument to the training script:

```bash
  --checkpointing_steps=500
```

This saves the full training state in subfolders of your `output_dir`. Subfolder names begin with the prefix `checkpoint-`, followed by the number of steps performed so far; for example, `checkpoint-1500` is a checkpoint saved after 1500 training steps.

#### Resuming training from a saved checkpoint

If you want to resume training from any of the saved checkpoints, pass the `--resume_from_checkpoint` argument and specify the name of the checkpoint you want to use. You can also use the special string `"latest"` to resume from the last saved checkpoint (i.e. the one with the largest number of steps). For example, the following resumes training from the checkpoint saved after 1500 steps:

```bash
  --resume_from_checkpoint="checkpoint-1500"
```

This is a good opportunity to tweak some of your hyperparameters if you wish.

#### Performing inference from a saved checkpoint

Saved checkpoints are stored in a format suitable for resuming training. They include not only the model weights, but also the state of the optimizer, data loaders, and learning rate.

If you have **`"accelerate>=0.16.0"`** installed, use the following code to run inference from an intermediate checkpoint.

```python
from diffusers import DiffusionPipeline, UNet2DConditionModel
from transformers import CLIPTextModel
import torch

# Load the pipeline with the same arguments (model, revision) that were used for training
model_id = "CompVis/stable-diffusion-v1-4"

unet = UNet2DConditionModel.from_pretrained("/sddata/dreambooth/daruma-v2-1/checkpoint-100/unet")

# if you trained with `args.train_text_encoder`, make sure to also load the text encoder
text_encoder = CLIPTextModel.from_pretrained("/sddata/dreambooth/daruma-v2-1/checkpoint-100/text_encoder")

pipeline = DiffusionPipeline.from_pretrained(model_id, unet=unet, text_encoder=text_encoder, dtype=torch.float16)
pipeline.to("cuda")

# Perform inference, save the pipeline, or push it to the hub
pipeline.save_pretrained("dreambooth-pipeline")
```

If you have **`"accelerate<0.16.0"`** installed, you need to convert it to an inference pipeline first:

```python
from accelerate import Accelerator
from diffusers import DiffusionPipeline

# Load the pipeline with the same arguments (model, revision) that were used for training
model_id = "CompVis/stable-diffusion-v1-4"
pipeline = DiffusionPipeline.from_pretrained(model_id)

accelerator = Accelerator()

# Use text_encoder if `--train_text_encoder` was used for the initial training
unet, text_encoder = accelerator.prepare(pipeline.unet, pipeline.text_encoder)

# Restore state from a checkpoint path. You have to use the absolute path here.
accelerator.load_state("/sddata/dreambooth/daruma-v2-1/checkpoint-100")

# Rebuild the pipeline with the unwrapped models (assignment to .unet and .text_encoder should work too)
pipeline = DiffusionPipeline.from_pretrained(
    model_id,
    unet=accelerator.unwrap_model(unet),
    text_encoder=accelerator.unwrap_model(text_encoder),
)

# Perform inference, save the pipeline, or push it to the hub
pipeline.save_pretrained("dreambooth-pipeline")
```
## Optimizations for different GPU sizes

Depending on your hardware, there are a few different ways to optimize DreamBooth on GPUs from 16GB all the way down to 8GB!

### xFormers

[xFormers](https://github.com/facebookresearch/xformers) is a toolbox for optimizing Transformers, and it includes the [memory-efficient attention](https://facebookresearch.github.io/xformers/components/ops.html#module-xformers.ops) mechanism used in 🧨 Diffusers. [Install xFormers](./optimization/xformers) and then add the following argument to your training script:

```bash
  --enable_xformers_memory_efficient_attention
```

xFormers is not available in Flax.

### Set gradients to none

Another way you can lower your memory footprint is to [set the gradients](https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html) to `None` instead of zero. However, this may change certain behaviors, so if you run into any issues, try removing this argument. Add the following argument to the training script to set the gradients to `None` (a minimal PyTorch sketch of what this flag does follows below):

```bash
  --set_grads_to_none
```
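For reference, outside of the training script this flag corresponds to the standard PyTorch call shown below; the throwaway model and optimizer are illustrative assumptions, not part of the DreamBooth script.

```python
import torch

# A tiny model/optimizer just to illustrate the effect of the flag.
model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-6)

loss = model(torch.randn(2, 4)).sum()
loss.backward()
optimizer.step()

# `--set_grads_to_none` makes the training loop do this instead of zeroing the tensors:
# freed gradient tensors mean less memory and slightly less work per step.
optimizer.zero_grad(set_to_none=True)
```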
### 16GB GPU

With the help of gradient checkpointing and the [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) 8-bit optimizer, it's possible to train DreamBooth on a 16GB GPU. Make sure bitsandbytes is installed:

```bash
pip install bitsandbytes
```

Then pass the `--use_8bit_adam` option to the training script:

```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path_to_training_images"
export CLASS_DIR="path_to_class_images"
export OUTPUT_DIR="path_to_saved_model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```

### 12GB GPU

To run DreamBooth on a 12GB GPU, you'll need to enable gradient checkpointing, the 8-bit optimizer, and xFormers, and set the gradients to `None`.

```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --use_8bit_adam \
  --enable_xformers_memory_efficient_attention \
  --set_grads_to_none \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```

### Training on an 8GB GPU

For 8GB GPUs, you can use [DeepSpeed](https://www.deepspeed.ai/) to offload some tensors from VRAM to either the CPU or NVME, enabling training with less GPU memory.

Run the following command to configure your 🤗 Accelerate environment:

```bash
accelerate config
```

During configuration, confirm that you want to use DeepSpeed.
Then, by combining DeepSpeed stage 2, fp16 mixed precision, and offloading both the model parameters and the optimizer state to the CPU, it's possible to train with under 8GB of VRAM.
The drawback is that this requires more system RAM (about 25 GB). See the [DeepSpeed documentation](https://huggingface.co/docs/accelerate/usage_guides/deepspeed) for more configuration options.

You should also change the default Adam optimizer to DeepSpeed's optimized version of Adam,
[`deepspeed.ops.adam.DeepSpeedCPUAdam`](https://deepspeed.readthedocs.io/en/latest/optimizers.html#adam-cpu), for a substantial speed-up.
Enabling `DeepSpeedCPUAdam` requires your system's CUDA toolchain version to be the same as the one installed with PyTorch.

8-bit optimizers don't seem to be compatible with DeepSpeed at the moment.

Launch training with the following command:
|
||||
|
||||
```bash
|
||||
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
|
||||
export INSTANCE_DIR="path_to_training_images"
|
||||
export CLASS_DIR="path_to_class_images"
|
||||
export OUTPUT_DIR="path_to_saved_model"
|
||||
|
||||
accelerate launch train_dreambooth.py \
|
||||
--pretrained_model_name_or_path=$MODEL_NAME \
|
||||
--instance_data_dir=$INSTANCE_DIR \
|
||||
--class_data_dir=$CLASS_DIR \
|
||||
--output_dir=$OUTPUT_DIR \
|
||||
--with_prior_preservation --prior_loss_weight=1.0 \
|
||||
--instance_prompt="a photo of sks dog" \
|
||||
--class_prompt="a photo of dog" \
|
||||
--resolution=512 \
|
||||
--train_batch_size=1 \
|
||||
--sample_batch_size=1 \
|
||||
--gradient_accumulation_steps=1 --gradient_checkpointing \
|
||||
--learning_rate=5e-6 \
|
||||
--lr_scheduler="constant" \
|
||||
--lr_warmup_steps=0 \
|
||||
--num_class_images=200 \
|
||||
--max_train_steps=800 \
|
||||
--mixed_precision=fp16
|
||||
```
|
||||
|
## Inference

Once you have trained a model, specify the path where it was saved and use it for inference with the [`StableDiffusionPipeline`]. Make sure your prompts include the special `identifier` used during training (`sks` in the previous examples).

If you have **`"accelerate>=0.16.0"`** installed, you can use the following code to run inference from an intermediate checkpoint:

```python
from diffusers import StableDiffusionPipeline
import torch

model_id = "path_to_saved_model"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

prompt = "A photo of sks dog in a bucket"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]

image.save("dog-bucket.png")
```

You can also run inference from a [saved training checkpoint](#inference-from-a-saved-checkpoint).
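If you saved intermediate checkpoints during training (the `--checkpointing_steps` argument of the script), you can also point the pipeline at one of the `checkpoint-<step>` folders instead of the final model. The following is a rough sketch only: the exact folder layout depends on your training arguments (with `accelerate>=0.16.0` the DreamBooth script saves a `unet` subfolder inside each checkpoint), and the path and step number below are placeholders.

```python
import torch
from diffusers import StableDiffusionPipeline, UNet2DConditionModel

# load the UNet weights from an intermediate checkpoint (placeholder path)
unet = UNet2DConditionModel.from_pretrained(
    "path_to_saved_model/checkpoint-400/unet", torch_dtype=torch.float16
)
# if you trained with --train_text_encoder, load the checkpoint's text encoder the same way

# plug the checkpointed UNet into a pipeline built from the original base model
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", unet=unet, torch_dtype=torch.float16
).to("cuda")

image = pipe("A photo of sks dog in a bucket", num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("dog-bucket-checkpoint.png")
```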
128 docs/source/ko/training/lora.mdx Normal file
@@ -0,0 +1,128 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Low-Rank Adaptation of Large Language Models (LoRA)

[[open-in-colab]]

<Tip warning={true}>

Currently, LoRA is only supported for the attention layers of the [`UNet2DConditionModel`].

</Tip>

[LoRA (Low-Rank Adaptation of Large Language Models)](https://arxiv.org/abs/2106.09685) is a training technique that accelerates the training of large models while using less memory. It adds pairs of rank-decomposition weight matrices (called **update matrices**) and trains **only** the newly added weights. This has a couple of advantages:

- The previously pretrained weights are kept frozen, so the model is not as prone to [catastrophic forgetting](https://www.pnas.org/doi/10.1073/pnas.1611835114).
- Rank-decomposition matrices have significantly fewer parameters than the original model, which means trained LoRA weights are easily portable.
- LoRA matrices are generally added to the attention layers of the original model. 🧨 Diffusers provides the [`~diffusers.loaders.UNet2DConditionLoadersMixin.load_attn_procs`] method to load the LoRA weights into a model's attention layers. A `scale` parameter controls how much the model is adapted toward the new training images.
- The greater memory efficiency allows you to run fine-tuning on consumer GPUs like a Tesla T4, an RTX 3080, or even an RTX 2080 Ti! GPUs like the T4 are free and easily accessible in Kaggle or Google Colab notebooks.

<Tip>

💡 LoRA is not limited to attention layers. The authors found that amending the attention layers of a language model is sufficient to obtain good downstream performance with great efficiency, which is why it's common to just add the LoRA weights to a model's attention layers. Check out the [Using LoRA for effective Stable Diffusion fine-tuning](https://huggingface.co/blog/lora) blog post for more details about how LoRA works!

</Tip>
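To make the update-matrix idea above concrete, here is a minimal, self-contained PyTorch sketch of a LoRA-style linear layer. This is only an illustration of the technique, not the implementation used by 🧨 Diffusers, and the class and argument names are made up for the example:

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (illustrative only)."""

    def __init__(self, base: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the pretrained weights stay frozen

        # rank-decomposition pair: the "update matrices" (down- and up-projection)
        self.lora_down = nn.Linear(base.in_features, rank, bias=False)
        self.lora_up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.normal_(self.lora_down.weight, std=1.0 / rank)
        nn.init.zeros_(self.lora_up.weight)  # the update starts as a no-op
        self.scale = scale  # how strongly the learned update is applied

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_up(self.lora_down(x))


# only the low-rank matrices receive gradients, so the trainable parameter
# count (and the saved weight file) stays tiny compared to the base layer
layer = LoRALinear(nn.Linear(768, 768), rank=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 768 * 4 = 6144, vs. 768 * 768 + 768 in the frozen base layer
```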
[cloneofsimo](https://github.com/cloneofsimo) was the first to try out LoRA training for Stable Diffusion in the popular [lora](https://github.com/cloneofsimo/lora) GitHub repository. 🧨 Diffusers supports LoRA for [text-to-image generation](https://github.com/huggingface/diffusers/tree/main/examples/text_to_image#training-with-lora) and [DreamBooth](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth#training-with-low-rank-adaptation-of-large-language-models-lora). This guide shows you how to do both.

If you'd like to store or share your model with the community, log in to your Hugging Face account (create [one](https://hf.co/join) if you don't have one already):

```bash
huggingface-cli login
```
## Text-to-image

Fine-tuning a model like Stable Diffusion, which has billions of parameters, can be slow and difficult. With LoRA, fine-tuning a diffusion model is much easier and faster. It can run on hardware with as little as 11GB of GPU RAM without resorting to tricks such as 8-bit optimizers.

### Training [[text-to-image-training]]

Let's fine-tune [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) on the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset to generate your own Pokémon.

To get started, make sure the `MODEL_NAME` and `DATASET_NAME` environment variables are set. The `OUTPUT_DIR` and `HUB_MODEL_ID` variables are optional and specify where to save the model on the Hub.

```bash
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="/sddata/finetune/lora/pokemon"
export HUB_MODEL_ID="pokemon-lora"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"
```

There are a few flags to be aware of before you start training:

* `--push_to_hub` stores the trained LoRA embeddings on the Hub.
* `--report_to=wandb` reports and logs the training results to your Weights & Biases dashboard (as an example, take a look at this [report](https://wandb.ai/pcuenq/text2image-fine-tune/run/b4k1w0tn?workspace=user-pcuenq)).
* `--learning_rate=1e-04`, you can afford to use a higher learning rate than you would normally with LoRA.

Now you're ready to launch the training (you can find the full training script [here](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_lora.py)):
```bash
accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=50 \
  --seed="0" \
  --push_to_hub
```
### Inference [[dreambooth-inference]]

Now you can use the model for inference by loading the base model in the [`StableDiffusionPipeline`]:

```py
>>> import torch
>>> from diffusers import StableDiffusionPipeline

>>> model_base = "runwayml/stable-diffusion-v1-5"

>>> pipe = StableDiffusionPipeline.from_pretrained(model_base, torch_dtype=torch.float16)
```

Load the LoRA weights from your fine-tuned DreamBooth model *on top of the base model weights*, and then move the pipeline to a GPU for faster inference. When you merge the LoRA weights with the frozen pretrained model weights, you can optionally adjust how much of the weights to merge with the `scale` parameter:

<Tip>

💡 A `scale` value of `0` is the same as not using the LoRA weights and only using the base model weights, and a `scale` value of `1` means only using the fully fine-tuned LoRA weights. Values between 0 and 1 interpolate between the two.

</Tip>
```py
# model_path points to where the trained LoRA weights were saved (a local directory or a Hub repository)
>>> pipe.unet.load_attn_procs(model_path)
>>> pipe.to("cuda")
# use half the weights from the LoRA finetuned model and half the weights from the base model

>>> image = pipe(
...     "A picture of a sks dog in a bucket.",
...     num_inference_steps=25,
...     guidance_scale=7.5,
...     cross_attention_kwargs={"scale": 0.5},
... ).images[0]
# use the weights from the fully finetuned LoRA model

>>> image = pipe("A picture of a sks dog in a bucket.", num_inference_steps=25, guidance_scale=7.5).images[0]
>>> image.save("bucket-dog.png")
```
224 docs/source/ko/training/text2image.mdx Normal file
@@ -0,0 +1,224 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Text-to-image

<Tip warning={true}>

The text-to-image fine-tuning script is experimental. It's easy to overfit and run into issues like catastrophic forgetting. We recommend exploring different hyperparameters to get the best results on your own dataset.

</Tip>

Text-to-image models like Stable Diffusion generate an image from a text prompt. This guide shows you how to fine-tune the [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) model on your own dataset with PyTorch and Flax. All the training scripts for text-to-image fine-tuning used in this guide can be found in this [repository](https://github.com/huggingface/diffusers/tree/main/examples/text_to_image) if you're interested in taking a closer look.

Before running the scripts, make sure to install the library's training dependencies:
```bash
pip install git+https://github.com/huggingface/diffusers.git
pip install -U -r requirements.txt
```

And initialize an [🤗 Accelerate](https://github.com/huggingface/accelerate/) environment with:

```bash
accelerate config
```

If you have already cloned the repository, you don't need to go through these steps. Instead, you can pass the path of your local checkout to the training script and it will be loaded from there.

### Hardware requirements

Using `gradient_checkpointing` and `mixed_precision`, it should be possible to fine-tune the model on a single 24GB GPU. For a higher `batch_size` and faster training, it's better to use GPUs with more than 30GB of GPU memory. You can also use JAX/Flax for fine-tuning on TPUs or GPUs; see [below](#flax-jax-finetuning) for details.

You can reduce memory usage even further by enabling memory-efficient attention with xFormers. Make sure [xFormers is installed](./optimization/xformers) and pass the `--enable_xformers_memory_efficient_attention` flag to the training script.

xFormers is not available for Flax.
## Upload the model to the Hub

Store your model on the Hub by adding the following argument to the training script:

```bash
--push_to_hub
```

## Save and load checkpoints

It is a good idea to regularly save checkpoints in case anything happens during training. To save a checkpoint, pass the following argument to the training script:

```bash
--checkpointing_steps=500
```

Every 500 steps, the full training state is saved in a subfolder of `output_dir`. The checkpoint has the format `checkpoint-` followed by the number of steps trained so far. For example, `checkpoint-1500` is a checkpoint saved after 1500 training steps.

To load a checkpoint and resume training, pass the `--resume_from_checkpoint` argument to the training script and specify the checkpoint you want to resume from. For example, the following argument resumes training from the checkpoint saved after 1500 training steps:

```bash
--resume_from_checkpoint="checkpoint-1500"
```

## Fine-tuning

<frameworkcontent>
<pt>
Launch the [PyTorch training script](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py) for a fine-tuning run on the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset like this:
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export dataset_name="lambdalabs/pokemon-blip-captions"

accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --mixed_precision="fp16" \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir="sd-pokemon-model"
```
To fine-tune on your own dataset, prepare the dataset according to the format required by 🤗 [Datasets](https://huggingface.co/docs/datasets/index). You can [upload your dataset to the Hub](https://huggingface.co/docs/datasets/image_dataset#upload-dataset-to-the-hub) or [prepare a local folder with your files](https://huggingface.co/docs/datasets/image_dataset#imagefolder).

Modify the script if you want to use custom loading logic; we left pointers at the appropriate places in the code to help you. 🤗 The example script below shows how to fine-tune on a local dataset in `TRAIN_DIR` and where to save the model with `OUTPUT_DIR`:
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export TRAIN_DIR="path_to_your_dataset"
export OUTPUT_DIR="path_to_save_model"

accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$TRAIN_DIR \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --mixed_precision="fp16" \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir=${OUTPUT_DIR}
```

</pt>
<jax>
With Flax, it's possible to train a Stable Diffusion model faster on TPUs and GPUs thanks to [@duongna211](https://github.com/duongna21). This is very efficient on TPU hardware but works great on GPUs too. The Flax training script doesn't yet support features like gradient checkpointing or gradient accumulation, so you'll need a GPU with at least 30GB of memory or a TPU v3.

Before running the script, make sure the requirements are installed:

```bash
pip install -U -r requirements_flax.txt
```

Then you can launch the [Flax training script](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_flax.py) like this:
```bash
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export dataset_name="lambdalabs/pokemon-blip-captions"

python train_text_to_image_flax.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --output_dir="sd-pokemon-model"
```

To fine-tune on your own dataset, prepare the dataset according to the format required by 🤗 [Datasets](https://huggingface.co/docs/datasets/index). You can [upload your dataset to the Hub](https://huggingface.co/docs/datasets/image_dataset#upload-dataset-to-the-hub) or [prepare a local folder with your files](https://huggingface.co/docs/datasets/image_dataset#imagefolder).

Modify the script if you want to use custom loading logic; we left pointers at the appropriate places in the code to help you. 🤗 The example script below shows how to fine-tune on a local dataset in `TRAIN_DIR`:
```bash
export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
export TRAIN_DIR="path_to_your_dataset"

python train_text_to_image_flax.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$TRAIN_DIR \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --mixed_precision="fp16" \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --output_dir="sd-pokemon-model"
```
</jax>
</frameworkcontent>

## LoRA

For fine-tuning text-to-image models, you can also use LoRA (Low-Rank Adaptation of Large Language Models), a fine-tuning technique for accelerating the training of large models. See the [LoRA training](lora#text-to-image) guide for details.

## Inference

Load your fine-tuned model for inference by passing the model path or the model name on the Hub to the [`StableDiffusionPipeline`]:
<frameworkcontent>
<pt>
```python
import torch
from diffusers import StableDiffusionPipeline

model_path = "path_to_saved_model"
pipe = StableDiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.float16)
pipe.to("cuda")

image = pipe(prompt="yoda").images[0]
image.save("yoda-pokemon.png")
```
</pt>
<jax>
```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard
from diffusers import FlaxStableDiffusionPipeline

model_path = "path_to_saved_model"
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(model_path, dtype=jax.numpy.bfloat16)

prompt = "yoda pokemon"
prng_seed = jax.random.PRNGKey(0)
num_inference_steps = 50

# generate one sample per available device
num_samples = jax.device_count()
prompt = num_samples * [prompt]
prompt_ids = pipeline.prepare_inputs(prompt)

# shard inputs and rng across the devices
params = replicate(params)
prng_seed = jax.random.split(prng_seed, jax.device_count())
prompt_ids = shard(prompt_ids)

images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
images[0].save("yoda-pokemon.png")
```
</jax>
</frameworkcontent>