mirror of https://github.com/huggingface/diffusers.git, synced 2026-01-29 07:22:12 +03:00
* initial script
* formatting
* prior trainer wip
* add efficient_net_encoder
* add CLIPTextModel
* add prior ema support
* optimizer
* fix typo
* add dataloader
* prompt_embeds and image_embeds
* initial training loop
* fix output_dir
* fix add_noise
* accelerator check
* make effnet_transforms dynamic
* fix training loop
* add validation logging
* use loaded text_encoder
* use PreTrainedTokenizerFast
* load weight from pickle
* save_model_card
* remove unused file
* fix typos
* save prior pipeline in its own folder
* fix imports
* fix pipe_t2i
* scale image_embeds
* remove snr_gamma
* format
* initial lora prior training
* log_validation and save
* initial gradient working
* remove save/load hooks
* set set_attn_processor on prior_prior
* add lora script
* typos
* use LoraLoaderMixin for prior pipeline
* fix usage
* make fix-copies
* use repo_id
* write_lora_layers is a staticmethod
* use defaults
* fix defaults
* undo
* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_prior.py (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* Update src/diffusers/loaders.py (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* Update src/diffusers/loaders.py (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* Update src/diffusers/pipelines/wuerstchen/modeling_wuerstchen_prior.py
* Update src/diffusers/loaders.py (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* Update src/diffusers/loaders.py (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* add gradient checkpoint support to prior
* gradient_checkpointing
* formatting
* Update examples/wuerstchen/text_to_image/README.md (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update examples/wuerstchen/text_to_image/README.md (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update examples/wuerstchen/text_to_image/README.md (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update examples/wuerstchen/text_to_image/README.md (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update examples/wuerstchen/text_to_image/README.md (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update examples/wuerstchen/text_to_image/train_text_to_image_lora_prior.py (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update src/diffusers/loaders.py (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Update examples/wuerstchen/text_to_image/train_text_to_image_prior.py (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* use default unet and text_encoder
* fix test

---------
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
24 lines
881 B
Python
import torch.nn as nn
from torchvision.models import efficientnet_v2_l, efficientnet_v2_s

from diffusers.configuration_utils import ConfigMixin, register_to_config
from diffusers.models.modeling_utils import ModelMixin


class EfficientNetEncoder(ModelMixin, ConfigMixin):
    """Image encoder that produces the image embeddings used by the Würstchen prior."""

    @register_to_config
    def __init__(self, c_latent=16, c_cond=1280, effnet="efficientnet_v2_s"):
        super().__init__()

        # Pretrained EfficientNet-V2 feature extractor (small variant by default, large otherwise).
        if effnet == "efficientnet_v2_s":
            self.backbone = efficientnet_v2_s(weights="DEFAULT").features
        else:
            self.backbone = efficientnet_v2_l(weights="DEFAULT").features
        # Project the c_cond-channel backbone features down to c_latent channels,
        self.mapper = nn.Sequential(
            nn.Conv2d(c_cond, c_latent, kernel_size=1, bias=False),
            nn.BatchNorm2d(c_latent),  # then normalize them to have mean 0 and std 1
        )

    def forward(self, x):
        return self.mapper(self.backbone(x))
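A minimal usage sketch (not part of the file above): it assumes the torchvision EfficientNet-V2-S pretrained weights can be downloaded, and the dummy `images` tensor and its 384x384 resolution are illustrative stand-ins for the batches produced by the training script's own transforms.

import torch

# Default config: EfficientNet-V2-S backbone, 1280-channel features mapped to 16 latent channels.
encoder = EfficientNetEncoder()
encoder.eval()

# Dummy batch of 2 RGB images; real training code would supply preprocessed dataset images.
images = torch.randn(2, 3, 384, 384)

with torch.no_grad():
    image_embeds = encoder(images)

# EfficientNet-V2 features downsample by roughly 32x, so 384x384 inputs
# should yield 16-channel feature maps of about 12x12.
print(image_embeds.shape)  # expected: torch.Size([2, 16, 12, 12])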