mirror of
https://github.com/huggingface/diffusers.git
synced 2026-01-29 07:22:12 +03:00
109 lines
2.8 KiB
Python
# These are canonical sets of parameters for different types of pipelines.
# They are set on subclasses of `PipelineTesterMixin` as `params` and
# `batch_params`.
#
# If a pipeline's set of arguments differs only slightly from one of the common
# sets of arguments, do not modify the existing common set; derive a new set
# from it instead.
# E.g. a text to image pipeline with non-configurable height and width arguments
# should set its attribute as `params = TEXT_TO_IMAGE_PARAMS - {'height', 'width'}`.

TEXT_TO_IMAGE_PARAMS = frozenset(
    [
        "prompt",
        "height",
        "width",
        "guidance_scale",
        "negative_prompt",
        "prompt_embeds",
        "negative_prompt_embeds",
        "cross_attention_kwargs",
    ]
)

TEXT_TO_IMAGE_BATCH_PARAMS = frozenset(["prompt", "negative_prompt"])

IMAGE_VARIATION_PARAMS = frozenset(
    [
        "image",
        "height",
        "width",
        "guidance_scale",
    ]
)

IMAGE_VARIATION_BATCH_PARAMS = frozenset(["image"])

TEXT_GUIDED_IMAGE_VARIATION_PARAMS = frozenset(
    [
        "prompt",
        "image",
        "height",
        "width",
        "guidance_scale",
        "negative_prompt",
        "prompt_embeds",
        "negative_prompt_embeds",
    ]
)

TEXT_GUIDED_IMAGE_VARIATION_BATCH_PARAMS = frozenset(["prompt", "image", "negative_prompt"])

TEXT_GUIDED_IMAGE_INPAINTING_PARAMS = frozenset(
    [
        # Text guided image variation with an image mask
        "prompt",
        "image",
        "mask_image",
        "height",
        "width",
        "guidance_scale",
        "negative_prompt",
        "prompt_embeds",
        "negative_prompt_embeds",
    ]
)

TEXT_GUIDED_IMAGE_INPAINTING_BATCH_PARAMS = frozenset(["prompt", "image", "mask_image", "negative_prompt"])

IMAGE_INPAINTING_PARAMS = frozenset(
    [
        # Image variation with an image mask
        "image",
        "mask_image",
        "height",
        "width",
        "guidance_scale",
    ]
)

IMAGE_INPAINTING_BATCH_PARAMS = frozenset(["image", "mask_image"])

IMAGE_GUIDED_IMAGE_INPAINTING_PARAMS = frozenset(
    [
        "example_image",
        "image",
        "mask_image",
        "height",
        "width",
        "guidance_scale",
    ]
)

IMAGE_GUIDED_IMAGE_INPAINTING_BATCH_PARAMS = frozenset(["example_image", "image", "mask_image"])

CLASS_CONDITIONED_IMAGE_GENERATION_PARAMS = frozenset(["class_labels"])

CLASS_CONDITIONED_IMAGE_GENERATION_BATCH_PARAMS = frozenset(["class_labels"])

UNCONDITIONAL_IMAGE_GENERATION_PARAMS = frozenset(["batch_size"])

UNCONDITIONAL_IMAGE_GENERATION_BATCH_PARAMS = frozenset([])

UNCONDITIONAL_AUDIO_GENERATION_PARAMS = frozenset(["batch_size"])

UNCONDITIONAL_AUDIO_GENERATION_BATCH_PARAMS = frozenset([])

TOKENS_TO_AUDIO_GENERATION_PARAMS = frozenset(["input_tokens"])

TOKENS_TO_AUDIO_GENERATION_BATCH_PARAMS = frozenset(["input_tokens"])
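
The header comment of this file asks that pipelines with slightly different arguments derive their own set rather than edit a shared one. The sketch below illustrates that pattern with plain `frozenset` operations. The two constants are copied from this file so the example runs on its own. The class name `FixedSizeTextToImageTests` and the extra argument `num_images_per_prompt` are hypothetical and used only for illustration; a real test class would presumably also inherit from `PipelineTesterMixin`, which is not reproduced here.

```python
# Copies of the shared sets defined in this file, repeated so the sketch is
# self-contained.
TEXT_TO_IMAGE_PARAMS = frozenset(
    [
        "prompt",
        "height",
        "width",
        "guidance_scale",
        "negative_prompt",
        "prompt_embeds",
        "negative_prompt_embeds",
        "cross_attention_kwargs",
    ]
)
TEXT_TO_IMAGE_BATCH_PARAMS = frozenset(["prompt", "negative_prompt"])

# Set difference returns a new frozenset; the shared constant is not modified.
fixed_size_params = TEXT_TO_IMAGE_PARAMS - {"height", "width"}

# Set union works the same way when a pipeline accepts an extra argument.
# "num_images_per_prompt" is a hypothetical extra argument for illustration.
extended_params = TEXT_TO_IMAGE_PARAMS | {"num_images_per_prompt"}


class FixedSizeTextToImageTests:
    """Hypothetical test class for a pipeline with fixed height and width."""

    params = fixed_size_params
    batch_params = TEXT_TO_IMAGE_BATCH_PARAMS


if __name__ == "__main__":
    print(sorted(FixedSizeTextToImageTests.params))
    print(sorted(FixedSizeTextToImageTests.batch_params))
```

Because the shared sets are `frozenset` objects, an accidental in-place change such as `TEXT_TO_IMAGE_PARAMS.add(...)` raises `AttributeError`, so deriving a new set is the only way to customize them.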