Mirror of https://github.com/huggingface/diffusers.git, synced 2026-01-27 17:22:53 +03:00
chore: fix typos (#5386)
* chore: fix typos

* Update src/diffusers/pipelines/shap_e/renderer.py

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>

---------

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Committed by GitHub
Parent: 5495073faf
Commit: 0ea78f9707
@@ -40,7 +40,7 @@ In the following, we give an overview of different ways to contribute, ranked by
 As said before, **all contributions are valuable to the community**.
 In the following, we will explain each contribution a bit more in detail.
 
-For all contributions 4.-9. you will need to open a PR. It is explained in detail how to do so in [Opening a pull requst](#how-to-open-a-pr)
+For all contributions 4.-9. you will need to open a PR. It is explained in detail how to do so in [Opening a pull request](#how-to-open-a-pr)
 
 ### 1. Asking and answering questions on the Diffusers discussion forum or on the Diffusers Discord
@@ -63,7 +63,7 @@ In the same spirit, you are of immense help to the community by answering such q
 
 **Please** keep in mind that the more effort you put into asking or answering a question, the higher
 the quality of the publicly documented knowledge. In the same way, well-posed and well-answered questions create a high-quality knowledge database accessible to everybody, while badly posed questions or answers reduce the overall quality of the public knowledge database.
-In short, a high quality question or answer is *precise*, *concise*, *relevant*, *easy-to-understand*, *accesible*, and *well-formated/well-posed*. For more information, please have a look through the [How to write a good issue](#how-to-write-a-good-issue) section.
+In short, a high quality question or answer is *precise*, *concise*, *relevant*, *easy-to-understand*, *accessible*, and *well-formated/well-posed*. For more information, please have a look through the [How to write a good issue](#how-to-write-a-good-issue) section.
 
 **NOTE about channels**:
 [*The forum*](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63) is much better indexed by search engines, such as Google. Posts are ranked by popularity rather than chronologically. Hence, it's easier to look up questions and answers that we posted some time ago.
@@ -168,7 +168,7 @@ more precise, provide the link to a duplicated issue or redirect them to [the fo
 If you have verified that the issued bug report is correct and requires a correction in the source code,
 please have a look at the next sections.
 
-For all of the following contributions, you will need to open a PR. It is explained in detail how to do so in the [Opening a pull requst](#how-to-open-a-pr) section.
+For all of the following contributions, you will need to open a PR. It is explained in detail how to do so in the [Opening a pull request](#how-to-open-a-pr) section.
 
 ### 4. Fixing a "Good first issue"
@@ -34,7 +34,7 @@ this in the generated mask, you simply have to set the embeddings related to the
 `source_prompt` and "dog" to `target_prompt`.
 * When generating partially inverted latents using `invert`, assign a caption or text embedding describing the
 overall image to the `prompt` argument to help guide the inverse latent sampling process. In most cases, the
-source concept is sufficently descriptive to yield good results, but feel free to explore alternatives.
+source concept is sufficiently descriptive to yield good results, but feel free to explore alternatives.
 * When calling the pipeline to generate the final edited image, assign the source concept to `negative_prompt`
 and the target concept to `prompt`. Taking the above example, you simply have to set the embeddings related to
 the phrases including "cat" to `negative_prompt` and "dog" to `prompt`.
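For orientation, the three tips in this hunk map onto three calls of the DiffEdit pipeline. A minimal sketch, assuming the documented `StableDiffusionDiffEditPipeline` API; the model id is illustrative and the image URL is a placeholder for your own input:

```py
import torch
from diffusers import DDIMInverseScheduler, DDIMScheduler, StableDiffusionDiffEditPipeline
from diffusers.utils import load_image

pipe = StableDiffusionDiffEditPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe.inverse_scheduler = DDIMInverseScheduler.from_config(pipe.scheduler.config)

# placeholder URL: substitute the image you want to edit
init_image = load_image("https://example.com/cat.png").resize((768, 768))
source_prompt = "a cat sitting on a bench"
target_prompt = "a dog sitting on a bench"

# 1. mask from the difference between the source and target concepts
mask = pipe.generate_mask(image=init_image, source_prompt=source_prompt, target_prompt=target_prompt)
# 2. partially inverted latents, guided by a caption describing the overall image
inv_latents = pipe.invert(prompt=source_prompt, image=init_image).latents
# 3. final edit: source concept as negative_prompt, target concept as prompt
image = pipe(
    prompt=target_prompt,
    mask_image=mask,
    image_latents=inv_latents,
    negative_prompt=source_prompt,
).images[0]
```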
@@ -396,7 +396,7 @@ t2i_pipe.unet.set_attn_processor(AttnAddedKVProcessor())
 ```
 
 With PyTorch >= 2.0, you can also use Kandinsky with `torch.compile` which depending
-on your hardware can signficantly speed-up your inference time once the model is compiled.
+on your hardware can significantly speed-up your inference time once the model is compiled.
 To use Kandinsksy with `torch.compile`, you can do:
 
 ```py
@@ -263,7 +263,7 @@ t2i_pipe.unet.set_attn_processor(AttnAddedKVProcessor())
 ```
 
 With PyTorch >= 2.0, you can also use Kandinsky with `torch.compile` which depending
-on your hardware can signficantly speed-up your inference time once the model is compiled.
+on your hardware can significantly speed-up your inference time once the model is compiled.
 To use Kandinsksy with `torch.compile`, you can do:
 
 ```py
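A minimal sketch of the compile step mentioned in both hunks, assuming PyTorch >= 2.0, a CUDA GPU, and the `kandinsky-community/kandinsky-2-1` checkpoint (the prompt is illustrative):

```py
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-2-1", torch_dtype=torch.float16
).to("cuda")
# compile the UNet once; the first call pays the compilation cost
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
image = pipe("A robot reading a book, 4k photo", num_inference_steps=25).images[0]
```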
@@ -40,7 +40,7 @@ In the following, we give an overview of different ways to contribute, ranked by
 As said before, **all contributions are valuable to the community**.
 In the following, we will explain each contribution a bit more in detail.
 
-For all contributions 4.-9. you will need to open a PR. It is explained in detail how to do so in [Opening a pull requst](#how-to-open-a-pr)
+For all contributions 4.-9. you will need to open a PR. It is explained in detail how to do so in [Opening a pull request](#how-to-open-a-pr)
 
 ### 1. Asking and answering questions on the Diffusers discussion forum or on the Diffusers Discord
@@ -63,7 +63,7 @@ In the same spirit, you are of immense help to the community by answering such q
 
 **Please** keep in mind that the more effort you put into asking or answering a question, the higher
 the quality of the publicly documented knowledge. In the same way, well-posed and well-answered questions create a high-quality knowledge database accessible to everybody, while badly posed questions or answers reduce the overall quality of the public knowledge database.
-In short, a high quality question or answer is *precise*, *concise*, *relevant*, *easy-to-understand*, *accesible*, and *well-formated/well-posed*. For more information, please have a look through the [How to write a good issue](#how-to-write-a-good-issue) section.
+In short, a high quality question or answer is *precise*, *concise*, *relevant*, *easy-to-understand*, *accessible*, and *well-formated/well-posed*. For more information, please have a look through the [How to write a good issue](#how-to-write-a-good-issue) section.
 
 **NOTE about channels**:
 [*The forum*](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63) is much better indexed by search engines, such as Google. Posts are ranked by popularity rather than chronologically. Hence, it's easier to look up questions and answers that we posted some time ago.
@@ -168,7 +168,7 @@ more precise, provide the link to a duplicated issue or redirect them to [the fo
 If you have verified that the issued bug report is correct and requires a correction in the source code,
 please have a look at the next sections.
 
-For all of the following contributions, you will need to open a PR. It is explained in detail how to do so in the [Opening a pull requst](#how-to-open-a-pr) section.
+For all of the following contributions, you will need to open a PR. It is explained in detail how to do so in the [Opening a pull request](#how-to-open-a-pr) section.
 
 ### 4. Fixing a `Good first issue`
@@ -70,7 +70,7 @@ pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
 images = pipe(prompt, num_inference_steps=steps, num_images_per_prompt=batch_size).images[0]
 ```
 
-Depending on GPU type, `torch.compile` can provide an *addtional speed-up* of **5-300x** on top of SDPA! If you're using more recent GPU architectures such as Ampere (A100, 3090), Ada (4090), and Hopper (H100), `torch.compile` is able to squeeze even more performance out of these GPUs.
+Depending on GPU type, `torch.compile` can provide an *additional speed-up* of **5-300x** on top of SDPA! If you're using more recent GPU architectures such as Ampere (A100, 3090), Ada (4090), and Hopper (H100), `torch.compile` is able to squeeze even more performance out of these GPUs.
 
 Compilation requires some time to complete, so it is best suited for situations where you prepare your pipeline once and then perform the same type of inference operations multiple times. For example, calling the compiled pipeline on a different image size triggers compilation again which can be expensive.
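To make the recompilation caveat concrete, here is a minimal sketch (model id and prompts are illustrative): as long as the latent shape stays fixed, every call after the first reuses one compiled graph.

```py
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

# same height/width on every call -> compiled once, reused afterwards
for prompt in ["an astronaut riding a horse", "a lighthouse at dusk"]:
    image = pipe(prompt, height=512, width=512).images[0]

# a different size, e.g. pipe(prompt, height=768, width=768),
# would trigger another expensive compilation
```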
@@ -69,7 +69,7 @@ write_basic_config()
 
 Now let's get our dataset. Download dataset from [here](https://www.cs.cmu.edu/~custom-diffusion/assets/data.zip) and unzip it. To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide.
 
-We also collect 200 real images using `clip-retrieval` which are combined with the target images in the training dataset as a regularization. This prevents overfitting to the the given target image. The following flags enable the regularization `with_prior_preservation`, `real_prior` with `prior_loss_weight=1.`.
+We also collect 200 real images using `clip-retrieval` which are combined with the target images in the training dataset as a regularization. This prevents overfitting to the given target image. The following flags enable the regularization `with_prior_preservation`, `real_prior` with `prior_loss_weight=1.`.
 The `class_prompt` should be the category name same as target image. The collected real images are with text captions similar to the `class_prompt`. The retrieved image are saved in `class_data_dir`. You can disable `real_prior` to use generated images as regularization. To collect the real images use this command first before training.
 
 ```bash
@@ -106,7 +106,7 @@ accelerate launch train_custom_diffusion.py \
 
 **Use `--enable_xformers_memory_efficient_attention` for faster training with lower VRAM requirement (16GB per GPU). Follow [this guide](https://github.com/facebookresearch/xformers) for installation instructions.**
 
-To track your experiments using Weights and Biases (`wandb`) and to save intermediate results (whcih we HIGHLY recommend), follow these steps:
+To track your experiments using Weights and Biases (`wandb`) and to save intermediate results (which we HIGHLY recommend), follow these steps:
 
 * Install `wandb`: `pip install wandb`.
 * Authorize: `wandb login`.
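The authorization step can also be done from Python; a minimal sketch, storing the same credential the `wandb login` CLI call would:

```py
import wandb

# prompts for an API key on first use, then caches it for later runs
wandb.login()
```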
@@ -192,7 +192,7 @@ been added to the text encoder embedding matrix and consequently been trained.
 <Tip>
 
 💡 The community has created a large library of different textual inversion embedding vectors, called [sd-concepts-library](https://huggingface.co/sd-concepts-library).
-Instead of training textual inversion embeddings from scratch you can also see whether a fitting textual inversion embedding has already been added to the libary.
+Instead of training textual inversion embeddings from scratch you can also see whether a fitting textual inversion embedding has already been added to the library.
 
 </Tip>
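A minimal sketch of reusing such a community embedding instead of training one, via the `load_textual_inversion` loader; the base model is illustrative and `sd-concepts-library/cat-toy` is one of the published concepts:

```py
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# pulls the learned embedding and registers its placeholder token
pipe.load_textual_inversion("sd-concepts-library/cat-toy")
image = pipe("a <cat-toy> on a wooden table").images[0]
```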
@@ -434,7 +434,7 @@ high_threshold = 200
 
 canny_image = cv2.Canny(canny_image, low_threshold, high_threshold)
 
-# zero out middle columns of image where pose will be overlayed
+# zero out middle columns of image where pose will be overlaid
 zero_start = canny_image.shape[1] // 4
 zero_end = zero_start + canny_image.shape[1] // 2
 canny_image[:, zero_start:zero_end] = 0
@@ -62,7 +62,7 @@ export_to_gif(images[1], "cake_3d.gif")
 
 ## Image-to-3D
 
-To generate a 3D object from another image, use the [`ShapEImg2ImgPipeline`]. You can use an existing image or generate an entirely new one. Let's use the the [Kandinsky 2.1](../api/pipelines/kandinsky) model to generate a new image.
+To generate a 3D object from another image, use the [`ShapEImg2ImgPipeline`]. You can use an existing image or generate an entirely new one. Let's use the [Kandinsky 2.1](../api/pipelines/kandinsky) model to generate a new image.
 
 ```py
 from diffusers import DiffusionPipeline
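A minimal sketch of the image-to-3D call that follows this hunk, assuming the `openai/shap-e-img2img` checkpoint; the input URL is a placeholder for the image generated in the guide:

```py
import torch
from diffusers import ShapEImg2ImgPipeline
from diffusers.utils import export_to_gif, load_image

pipe = ShapEImg2ImgPipeline.from_pretrained(
    "openai/shap-e-img2img", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# placeholder URL: substitute the 2D image you want to lift into 3D
image = load_image("https://example.com/cheeseburger.png")
images = pipe(image, guidance_scale=3.0, num_inference_steps=64, frame_size=256).images
export_to_gif(images[0], "object_3d.gif")
```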
@@ -553,7 +553,7 @@ class OnnxStableDiffusionControlNetImg2ImgPipeline(DiffusionPipeline):
 instead.
 image (`torch.FloatTensor`, `PIL.Image.Image`, `np.ndarray`, `List[torch.FloatTensor]`, `List[PIL.Image.Image]`, `List[np.ndarray]`,:
 `List[List[torch.FloatTensor]]`, `List[List[np.ndarray]]` or `List[List[PIL.Image.Image]]`):
-The initial image will be used as the starting point for the image generation process. Can also accpet
+The initial image will be used as the starting point for the image generation process. Can also accept
 image latents as `image`, if passing latents directly, it will not be encoded again.
 control_image (`torch.FloatTensor`, `PIL.Image.Image`, `np.ndarray`, `List[torch.FloatTensor]`, `List[PIL.Image.Image]`, `List[np.ndarray]`,:
 `List[List[torch.FloatTensor]]`, `List[List[np.ndarray]]` or `List[List[PIL.Image.Image]]`):
@@ -657,7 +657,7 @@ class TensorRTStableDiffusionControlNetImg2ImgPipeline(DiffusionPipeline):
 instead.
 image (`torch.FloatTensor`, `PIL.Image.Image`, `np.ndarray`, `List[torch.FloatTensor]`, `List[PIL.Image.Image]`, `List[np.ndarray]`,:
 `List[List[torch.FloatTensor]]`, `List[List[np.ndarray]]` or `List[List[PIL.Image.Image]]`):
-The initial image will be used as the starting point for the image generation process. Can also accpet
+The initial image will be used as the starting point for the image generation process. Can also accept
 image latents as `image`, if passing latents directly, it will not be encoded again.
 control_image (`torch.FloatTensor`, `PIL.Image.Image`, `np.ndarray`, `List[torch.FloatTensor]`, `List[PIL.Image.Image]`, `List[np.ndarray]`,:
 `List[List[torch.FloatTensor]]`, `List[List[np.ndarray]]` or `List[List[PIL.Image.Image]]`):
@@ -48,7 +48,7 @@ write_basic_config()
 
 Now let's get our dataset. Download dataset from [here](https://www.cs.cmu.edu/~custom-diffusion/assets/data.zip) and unzip it.
 
-We also collect 200 real images using `clip-retrieval` which are combined with the target images in the training dataset as a regularization. This prevents overfitting to the the given target image. The following flags enable the regularization `with_prior_preservation`, `real_prior` with `prior_loss_weight=1.`.
+We also collect 200 real images using `clip-retrieval` which are combined with the target images in the training dataset as a regularization. This prevents overfitting to the given target image. The following flags enable the regularization `with_prior_preservation`, `real_prior` with `prior_loss_weight=1.`.
 The `class_prompt` should be the category name same as target image. The collected real images are with text captions similar to the `class_prompt`. The retrieved image are saved in `class_data_dir`. You can disable `real_prior` to use generated images as regularization. To collect the real images use this command first before training.
 
 ```bash
@@ -82,7 +82,7 @@ accelerate launch train_custom_diffusion.py \
 
 **Use `--enable_xformers_memory_efficient_attention` for faster training with lower VRAM requirement (16GB per GPU). Follow [this guide](https://github.com/facebookresearch/xformers) for installation instructions.**
 
-To track your experiments using Weights and Biases (`wandb`) and to save intermediate results (whcih we HIGHLY recommend), follow these steps:
+To track your experiments using Weights and Biases (`wandb`) and to save intermediate results (which we HIGHLY recommend), follow these steps:
 
 * Install `wandb`: `pip install wandb`.
 * Authorize: `wandb login`.
@@ -1119,7 +1119,7 @@ def main(args):
 unet, optimizer, train_dataloader, lr_scheduler
 )
 
-# For mixed precision training we cast all non-trainable weigths (vae, non-lora text_encoder and non-lora unet) to half-precision
+# For mixed precision training we cast all non-trainable weights (vae, non-lora text_encoder and non-lora unet) to half-precision
 # as these weights are only used for inference, keeping weights in full precision is not required.
 weight_dtype = torch.float32
 if accelerator.mixed_precision == "fp16":
@@ -794,7 +794,7 @@ def main(args):
 text_encoder.requires_grad_(False)
 unet.requires_grad_(False)
 
-# For mixed precision training we cast all non-trainable weigths (vae, non-lora text_encoder and non-lora unet) to half-precision
+# For mixed precision training we cast all non-trainable weights (vae, non-lora text_encoder and non-lora unet) to half-precision
 # as these weights are only used for inference, keeping weights in full precision is not required.
 weight_dtype = torch.float32
 if accelerator.mixed_precision == "fp16":
@@ -707,7 +707,7 @@ def main(args):
 text_encoder_two.requires_grad_(False)
 unet.requires_grad_(False)
 
-# For mixed precision training we cast all non-trainable weigths (vae, non-lora text_encoder and non-lora unet) to half-precision
+# For mixed precision training we cast all non-trainable weights (vae, non-lora text_encoder and non-lora unet) to half-precision
 # as these weights are only used for inference, keeping weights in full precision is not required.
 weight_dtype = torch.float32
 if accelerator.mixed_precision == "fp16":
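All three hunks above share the same casting pattern; a minimal sketch of it in isolation, assuming a CUDA machine so `fp16` mixed precision is available (the `Linear` layer stands in for the frozen vae / text encoder / non-LoRA unet):

```py
import torch
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="fp16")

weight_dtype = torch.float32
if accelerator.mixed_precision == "fp16":
    weight_dtype = torch.float16
elif accelerator.mixed_precision == "bf16":
    weight_dtype = torch.bfloat16

# frozen modules are only used for inference, so half precision is safe
frozen = torch.nn.Linear(4, 4).requires_grad_(False)
frozen.to(accelerator.device, dtype=weight_dtype)
```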
@@ -41,7 +41,7 @@ The `text` include the tag `Teyvat`, `Name`,`Element`, `Weapon`, `Region`, `Mode
 
 ## Training
 
-The arguement `placement` can be `cpu`, `auto`, `cuda`, with `cpu` the GPU RAM required can be minimized to 4GB but will deceleration, with `cuda` you can also reduce GPU memory by half but accelerated training, with `auto` a more balanced solution for speed and memory can be obtained。
+The argument `placement` can be `cpu`, `auto`, `cuda`, with `cpu` the GPU RAM required can be minimized to 4GB but will deceleration, with `cuda` you can also reduce GPU memory by half but accelerated training, with `auto` a more balanced solution for speed and memory can be obtained。
 
 **___Note: Change the `resolution` to 768 if you are using the [stable-diffusion-2](https://huggingface.co/stabilityai/stable-diffusion-2) 768x768 model.___**
@@ -323,7 +323,7 @@ accelerate launch train_dreambooth.py \
 
 ### Using DreamBooth for other pipelines than Stable Diffusion
 
-Altdiffusion also support dreambooth now, the runing comman is basically the same as abouve, all you need to do is replace the `MODEL_NAME` like this:
+Altdiffusion also support dreambooth now, the runing comman is basically the same as above, all you need to do is replace the `MODEL_NAME` like this:
 One can now simply change the `pretrained_model_name_or_path` to another architecture such as [`AltDiffusion`](https://huggingface.co/docs/diffusers/api/pipelines/alt_diffusion).
 
 ```
@@ -151,7 +151,7 @@ telling JAX which input arguments are static, that is, arguments that
 are known at compile time and won't change. In our case, it is num_inference_steps,
 height, width and return_latents.
 
-Once the function is compiled, these parameters are ommited from future calls and
+Once the function is compiled, these parameters are omitted from future calls and
 cannot be changed without modifying the code and recompiling.
 
 ```python
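A minimal, self-contained sketch of the static-argument idea (the function and its arguments are illustrative, not the guide's actual generation function):

```python
from functools import partial

import jax
import jax.numpy as jnp

# height and width are static: changing them recompiles, changing x does not
@partial(jax.jit, static_argnames=("height", "width"))
def make_grid(x, height, width):
    return jnp.broadcast_to(x, (height, width))

print(make_grid(jnp.float32(1.0), 8, 8).shape)  # compiled once for (8, 8)
```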
@@ -911,7 +911,7 @@ class ShapERenderer(ModelMixin, ConfigMixin):
 n_coarse_samples=64,
 n_fine_samples=128,
 ):
-# project the the paramters from the generated latents
+# project the parameters from the generated latents
 projected_params = self.params_proj(latents)
 
 # update the mlp layers of the renderer
@@ -955,7 +955,7 @@ class ShapERenderer(ModelMixin, ConfigMixin):
 query_batch_size: int = 4096,
 texture_channels: Tuple = ("R", "G", "B"),
 ):
-# 1. project the the paramters from the generated latents
+# 1. project the parameters from the generated latents
 projected_params = self.params_proj(latents)
 
 # 2. update the mlp layers of the renderer
@@ -20,7 +20,7 @@ class UniDiffuserTextDecoder(ModelMixin, ConfigMixin, ModuleUtilsMixin):
 prefix_length (`int`):
 Max number of prefix tokens that will be supplied to the model.
 prefix_inner_dim (`int`):
-The hidden size of the the incoming prefix embeddings. For UniDiffuser, this would be the hidden dim of the
+The hidden size of the incoming prefix embeddings. For UniDiffuser, this would be the hidden dim of the
 CLIP text encoder.
 prefix_hidden_dim (`int`, *optional*):
 Hidden dim of the MLP if we encode the prefix.
@@ -16,7 +16,7 @@
 Utility that sorts the imports in the custom inits of Diffusers. Diffusers uses init files that delay the
 import of an object to when it's actually needed. This is to avoid the main init importing all models, which would
 make the line `import transformers` very slow when the user has all optional dependencies installed. The inits with
-delayed imports have two halves: one definining a dictionary `_import_structure` which maps modules to the name of the
+delayed imports have two halves: one defining a dictionary `_import_structure` which maps modules to the name of the
 objects in each module, and one in `TYPE_CHECKING` which looks like a normal init for type-checkers. `isort` or `ruff`
 properly sort the second half which looks like traditionl imports, the goal of this script is to sort the first half.
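For reference, a minimal sketch of the two-half init layout this script sorts; module and object names are illustrative, and `_LazyModule` is the helper Diffusers uses for the delayed imports:

```py
# __init__.py of a package using delayed imports
from typing import TYPE_CHECKING

# first half: the dictionary sorted by this script
_import_structure = {
    "models": ["UNet2DModel"],
    "pipelines": ["DiffusionPipeline"],
}

# second half: a normal-looking init for type checkers, sorted by isort/ruff
if TYPE_CHECKING:
    from .models import UNet2DModel
    from .pipelines import DiffusionPipeline
else:
    import sys

    from .utils import _LazyModule

    sys.modules[__name__] = _LazyModule(__name__, globals()["__file__"], _import_structure)
```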
@@ -130,7 +130,7 @@ def pre_release_work(patch=False):
 
 
 def post_release_work():
-    """Do all the necesarry post-release steps."""
+    """Do all the necessary post-release steps."""
     # First let's get the current version
     current_version = get_version()
     dev_version = f"{current_version.major}.{current_version.minor + 1}.0.dev0"
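The dev-version bump in this hunk is easy to check in isolation; a minimal sketch using `packaging` (the version string is illustrative):

```py
from packaging.version import parse

current_version = parse("0.21.0")
dev_version = f"{current_version.major}.{current_version.minor + 1}.0.dev0"
print(dev_version)  # 0.22.0.dev0
```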