mirror of https://github.com/kijai/ComfyUI-WanVideoWrapper.git synced 2026-01-26 23:41:35 +03:00
ComfyUI-WanVideoWrapper/Ovi/vae/autoencoder.py
kijai 139bdf827f Squashed commit of the following:
commit 73dd1a06d33953912f5dd684f168028b14e42a36
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Mon Oct 13 19:47:38 2025 +0300

    cleanup

commit 39bc2cecf493e2eb176b55e8841d933f0da1ec39
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Mon Oct 13 19:24:20 2025 +0300

    Allow scheduling ovi cfg

commit 2c153c5f324dbd59670ad9c51a7995459504a3cd
Merge: dba7667 32eb6b4
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Mon Oct 13 17:48:20 2025 +0300

    Merge branch 'main' into ovi

commit dba76674c71af7bf94c82834a0b0e40d94043c99
Merge: 0f11a43 5a0456e
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Sun Oct 12 22:45:43 2025 +0300

    Merge branch 'main' into ovi

commit 0f11a439622799ad8070f8a2b8cc8e6a041b761d
Merge: 0999f50 e2d8c9b
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Sat Oct 11 07:48:06 2025 +0300

    Merge branch 'main' into ovi

commit 0999f50cfe025290cd7ce88a8dd1acff0b38d9bd
Merge: d45df1f f1d1c83
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Fri Oct 10 22:16:09 2025 +0300

    Merge branch 'main' into ovi

commit d45df1fb5b7c629b15eabc197357d62bdc232aaf
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Thu Oct 9 20:21:37 2025 +0300

    Remove dependency for librosa

commit d8e7533fdf7eab1d2489c3e025a908c02d997444
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Thu Oct 9 19:57:28 2025 +0300

    Remove omegaconf dependency

commit f4e27ff018e98cb5b09655dceda399baea36b240
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Thu Oct 9 19:31:06 2025 +0300

    Fix VACE

commit 35d3df39294831e5e7568b6f7e16d2ecf2d790a0
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Thu Oct 9 00:26:40 2025 +0300

    small update

commit 96f8ea1d26869ab7e49e12a07f19d5d5a2023253
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 22:32:57 2025 +0300

    Create wanvideo_2_2_5B_ovi_testing.json

commit a2511be73b9da7019fd21aeb0b521af941c09150
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 22:32:54 2025 +0300

    Update nodes_sampler.py

commit d3688b8db71452ea1f7c9a2bc0216441d524e56c
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 21:43:02 2025 +0300

    Allow EasyCache to work with ovi

commit 586d9148a0306ef5d30e9a971a9c3be4cd3ecc97
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 19:09:06 2025 +0300

    Update model.py

commit 61eedd2839decdb7d4c2ddd5f1310fdaf49d36ad
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 19:09:02 2025 +0300

    I2V fix

commit a97fcb1b9ae9fb7bbfdf668c24816e014a1b58d1
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 17:57:28 2025 +0300

    Add nodes to set audio latent size

commit d41e42a697f3d561dabbc22566f633b5f1bbd952
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 16:42:04 2025 +0300

    Support loading mmaudio vae from .safetensors

commit 1b0e28ec41e3c97fe1f2f057fef9b9bbcb87bca7
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 16:19:53 2025 +0300

    Update nodes_sampler.py

commit fbd18f45fe85ede8edcb5aebaea7ceb5b6eab5a2
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 10:16:44 2025 +0300

    Fixes for other workflows

commit b06993b637198f7fad92208f3b3dc9a7d7f57c7f
Author: kijai <40791699+kijai@users.noreply.github.com>
Date:   Wed Oct 8 09:46:27 2025 +0300

    initial commit

    T2V works
2025-10-13 20:16:53 +03:00

55 lines
1.9 KiB
Python

from typing import Literal, Optional

import torch
import torch.nn as nn
from .vae import VAE, get_my_vae
from .distributions import DiagonalGaussianDistribution
from ..bigvgan import BigVGAN
from comfy.utils import load_torch_file


class AutoEncoderModule(nn.Module):
    """Inference-only wrapper around a VAE and a vocoder; all parameters are frozen."""

    def __init__(self,
                 *,
                 vae_ckpt_path,
                 vocoder_ckpt_path: Optional[str] = None,
                 mode: Literal['16k', '44k'],
                 need_vae_encoder: bool = True):
        super().__init__()
        self.vae: VAE = get_my_vae(mode).eval()
        # Previously loaded with:
        # vae_state_dict = torch.load(vae_ckpt_path, weights_only=True, map_location='cpu')
        vae_state_dict = load_torch_file(vae_ckpt_path)
        self.vae.load_state_dict(vae_state_dict)
        self.vae.remove_weight_norm()

        if mode == '16k':
            # The 16k vocoder is loaded from a local checkpoint, so a path is required.
            assert vocoder_ckpt_path is not None
            self.vocoder = BigVGAN(vocoder_ckpt_path).eval()
        elif mode == '44k':
            # BigVGANv2 is not imported in this module, so the original loading
            # code is kept below as a comment for reference only.
            # self.vocoder = BigVGANv2.from_pretrained('nvidia/bigvgan_v2_44khz_128band_512x',
            #                                          use_cuda_kernel=False)
            # self.vocoder.remove_weight_norm()
            raise NotImplementedError("44k mode requires BigVGANv2 which is not currently supported in this environment.")
        else:
            raise ValueError(f'Unknown mode: {mode}')

        # Freeze every parameter; this module is never trained.
        for param in self.parameters():
            param.requires_grad = False

        # Drop the encoder when only decoding is needed.
        if not need_vae_encoder:
            del self.vae.encoder

    @torch.inference_mode()
    def encode(self, x: torch.Tensor) -> DiagonalGaussianDistribution:
        return self.vae.encode(x)

    @torch.inference_mode()
    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.vae.decode(z)

    @torch.inference_mode()
    def vocode(self, spec: torch.Tensor) -> torch.Tensor:
        return self.vocoder(spec)
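
The constructor above freezes all parameters and optionally deletes the encoder. A minimal sketch of that pattern follows, assuming only PyTorch is installed. `TinyVAE` and `TinyAutoEncoder` are hypothetical stand-ins that are not part of this repository, and the linear layers are placeholders for the real networks; no checkpoint is loaded.

```python
import torch
import torch.nn as nn


class TinyVAE(nn.Module):
    """Hypothetical stand-in for the real VAE (not its actual architecture)."""

    def __init__(self) -> None:
        super().__init__()
        self.encoder = nn.Linear(8, 4)  # placeholder encoder
        self.decoder = nn.Linear(4, 8)  # placeholder decoder

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.decoder(z)


class TinyAutoEncoder(nn.Module):
    """Hypothetical wrapper mirroring the freeze-and-drop-encoder steps."""

    def __init__(self, *, need_vae_encoder: bool = True) -> None:
        super().__init__()
        self.vae = TinyVAE().eval()
        # Freeze every parameter so no gradients are tracked.
        for param in self.parameters():
            param.requires_grad = False
        # Deleting the submodule removes it (and its weights) from the module tree.
        if not need_vae_encoder:
            del self.vae.encoder

    @torch.inference_mode()
    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.vae.decode(z)


model = TinyAutoEncoder(need_vae_encoder=False)
out = model.decode(torch.zeros(2, 4))
has_encoder = hasattr(model.vae, "encoder")
n_trainable = sum(p.requires_grad for p in model.parameters())
```

With `need_vae_encoder=False`, the encoder weights no longer appear in `model.parameters()` or `model.state_dict()`, which is the likely reason a decode-only caller would set the flag.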