mirror of
https://github.com/huggingface/diffusers.git
synced 2026-01-27 17:22:53 +03:00
Docs/bentoml integration (#4090)
* docs: first draft of BentoML integration
* Update the diffusers doc
* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add BentoML integration guide under Optimization section
* restyle codes

---------

Co-authored-by: Sherlock113 <sherlockxu07@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
@@ -117,6 +117,8 @@
    title: Habana Gaudi
  - local: optimization/tome
    title: Token Merging
  - local: optimization/bentoml
    title: BentoML Integration
  title: Optimization/Special Hardware
- sections:
  - local: conceptual/philosophy
200
docs/source/en/optimization/bentoml.mdx
Normal file
@@ -0,0 +1,200 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# BentoML Integration Guide

[[open-in-colab]]

[BentoML](https://github.com/bentoml/BentoML/) is an open-source framework for building,
shipping, and scaling AI applications. It lets you package and serve diffusion models
for production, and it includes operational tooling such as monitoring and tracing,
along with support for deploying to various cloud platforms. BentoML's distributed
architecture separates API server logic from model inference logic, which allows
deployments to scale efficiently even under budget constraints. This makes it a
practical option for serving Diffusers models in real-world deployments.

This tutorial demonstrates how to integrate BentoML with Diffusers.

## Prerequisites

- Install [Diffusers](https://huggingface.co/docs/diffusers/installation).
- Install BentoML by running `pip install bentoml`. For more information, see the [BentoML documentation](https://docs.bentoml.com).

## Import a diffusion model

First, prepare the model. BentoML manages models in its own [Model Store](https://docs.bentoml.com/en/latest/concepts/model.html).
Create a `download_model.py` file with the following code to import a diffusion model into BentoML's Model
Store:

```py
import bentoml

bentoml.diffusers.import_model(
    "sd2.1",  # Model tag in the BentoML Model Store
    "stabilityai/stable-diffusion-2-1",  # Hugging Face model identifier
)
```

This code snippet downloads the Stable Diffusion 2.1 model (using its repo id
`stabilityai/stable-diffusion-2-1`) from the Hugging Face Hub, or reuses the cached
files if the model has already been downloaded, and imports it into the BentoML Model
Store under the name `sd2.1`.

For models that are already fine-tuned and stored on disk, you can provide the path instead of
the repo id.

```py
import bentoml

bentoml.diffusers.import_model(
    "sd2.1-local",  # Model tag in the BentoML Model Store
    "./local_stable_diffusion_2.1/",  # Path to the model directory on disk
)
```

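Because the same argument accepts either a repo id or a local path, a mistyped path may end up being interpreted as a Hub repo id. The sketch below shows one way to catch that before importing. The helper `classify_model_source` is hypothetical and not part of BentoML or Diffusers, and it assumes that local paths are written with a leading `.`, `/`, or `~`.

```python
from pathlib import Path


def classify_model_source(name_or_path: str) -> str:
    """Return "local" for an existing directory, "hub" for anything else.

    Hypothetical helper: raises FileNotFoundError when the argument is
    written like a filesystem path but no such directory exists.
    """
    if Path(name_or_path).expanduser().is_dir():
        return "local"
    # Assumption: a leading ".", "/" or "~" signals an intended local path.
    if name_or_path.startswith((".", "/", "~")):
        raise FileNotFoundError(
            f"{name_or_path!r} looks like a local path but is not a directory"
        )
    return "hub"
```

Calling `classify_model_source("./local_stable_diffusion_2.1/")` before `bentoml.diffusers.import_model` would then fail early if the directory is missing, instead of falling through to a Hub lookup.
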
You can view the model in the Model Store:

```
bentoml models list

Tag                     Module             Size       Creation Time
sd2.1:ysrlmubascajwnry  bentoml.diffusers  33.85 GiB  2023-07-12 16:47:44
```

## Turn a diffusion model into a RESTful service with BentoML

Once the diffusion model is in BentoML's Model Store, you can implement a text-to-image
service with it. In addition to the required prompt, the Stable Diffusion model accepts
various arguments that guide the image generation process.
To validate these input arguments, use BentoML's [pydantic](https://github.com/pydantic/pydantic) integration.
Create an `sdargs.py` file with an example pydantic model:

```py
import typing as t

from pydantic import BaseModel


class SDArgs(BaseModel):
    prompt: str
    negative_prompt: t.Optional[str] = None
    height: t.Optional[int] = 512
    width: t.Optional[int] = 512

    class Config:
        extra = "allow"
```

This pydantic model requires a string field `prompt` and defines three optional fields, `height`, `width`, and `negative_prompt`,
each with its own type and default. The `extra = "allow"` line lets a request carry additional fields that are not defined in the `SDArgs` class.
In a real-world scenario, you may prefer to define every supported field explicitly and reject extra ones.

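To make the effect of a closed field list concrete, here is a minimal plain-Python stand-in. It deliberately does not use pydantic, and both the `ALLOWED_FIELDS` set and the `reject_unknown_fields` helper are hypothetical names introduced for illustration; the field names simply mirror the `SDArgs` model.

```python
from typing import Any, Dict

# Assumption: the allowed field names mirror the SDArgs model.
ALLOWED_FIELDS = {"prompt", "negative_prompt", "height", "width"}


def reject_unknown_fields(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the payload unchanged if it is acceptable, else raise ValueError."""
    if "prompt" not in payload:
        raise ValueError("missing required field: prompt")
    unknown = sorted(set(payload) - ALLOWED_FIELDS)
    if unknown:
        raise ValueError("unexpected fields: " + ", ".join(unknown))
    return payload
```

With pydantic itself, a comparable result is usually obtained by setting `extra = "forbid"` in the model's `Config`, so that undeclared fields fail validation.
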
Next, create a BentoML Service file that defines a Stable Diffusion service:

```py
import bentoml
from bentoml.io import Image, JSON

from sdargs import SDArgs

# Load the model from the Model Store and wrap it in a runner
bento_model = bentoml.diffusers.get("sd2.1:latest")
sd21_runner = bento_model.to_runner(name="sd21-runner")

svc = bentoml.Service("stable-diffusion-21", runners=[sd21_runner])


@svc.api(input=JSON(pydantic_model=SDArgs), output=Image())
async def txt2img(input_data):
    # Forward the validated fields to the pipeline as keyword arguments
    kwargs = input_data.dict()
    res = await sd21_runner.async_run(**kwargs)
    # The first element of the result holds the generated images
    images = res[0]
    return images[0]
```

Save the file as `service.py`, and spin up a BentoML Service endpoint using:

```
bentoml serve service:svc
```

An HTTP server with a `/txt2img` endpoint that accepts a JSON dictionary should now be running on
port 3000. Go to <http://127.0.0.1:3000> in your web browser to access the Swagger UI.

You can also test the text-to-image generation using `curl` and write the returned image to
`output.jpg`.

```
curl -X POST http://127.0.0.1:3000/txt2img \
    -H 'Content-Type: application/json' \
    -d "{\"prompt\":\"a black cat\", \"height\":768, \"width\":768}" \
    --output output.jpg
```

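The same request can be sent from Python. The sketch below uses only the standard library and assumes the service is reachable at `http://127.0.0.1:3000`, as in the `curl` call. The helper names `build_txt2img_request` and `save_image` are hypothetical, and the timeout value is an arbitrary choice.

```python
import json
import urllib.request


def build_txt2img_request(prompt, base_url="http://127.0.0.1:3000", **fields):
    """Build (but do not send) a JSON POST request for the /txt2img endpoint."""
    payload = {"prompt": prompt, **fields}
    return urllib.request.Request(
        f"{base_url}/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_image(request, path="output.jpg", timeout=600):
    """Send the request and write the returned image bytes to `path`."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        image_bytes = response.read()
    with open(path, "wb") as f:
        f.write(image_bytes)
    return path
```

With the service running, `save_image(build_txt2img_request("a black cat", height=768, width=768))` would write the generated image to `output.jpg`.
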
## Package a BentoML Service for cloud deployment

To deploy a BentoML Service, you need to package it into a BentoML
[Bento](https://docs.bentoml.com/en/latest/concepts/bento.html), a file archive with all the source code,
models, data files, and dependencies. This can be done by providing a `bentofile.yaml` file as follows:

```yaml
service: "service.py:svc"
include:
  - "service.py"
  - "sdargs.py"
python:
  packages:
    - torch
    - transformers
    - accelerate
    - diffusers
    - triton
    - xformers
    - pydantic
docker:
  distro: debian
  cuda_version: "11.6"
```

The `bentofile.yaml` file contains [Bento build
options](https://docs.bentoml.com/en/latest/concepts/bento.html#bento-build-options),
such as package dependencies and Docker options.

Then build a Bento using:

```
bentoml build
```

The output looks like:

```
Successfully built Bento(tag="stable-diffusion-21:crkuh7a7rw5bcasc").

Possible next steps:

 * Containerize your Bento with `bentoml containerize`:
    $ bentoml containerize stable-diffusion-21:crkuh7a7rw5bcasc

 * Push to BentoCloud with `bentoml push`:
    $ bentoml push stable-diffusion-21:crkuh7a7rw5bcasc
```

To create a Docker image from the Bento, run the following command. You can then deploy the image to a cloud provider.

```
bentoml containerize stable-diffusion-21:crkuh7a7rw5bcasc
```

If you want an end-to-end solution for deploying and managing models, you can push the Bento to [Yatai](https://github.com/bentoml/Yatai) or
[BentoCloud](https://bentoml.com/cloud) for a distributed deployment.

For more information about BentoML's integration with Diffusers, see the [BentoML Diffusers
Guide](https://docs.bentoml.com/en/latest/frameworks/diffusers.html).