mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 17:22:53 +03:00

Aryan a4df8dbc40 Update more licenses to 2025 (#11746)

update

2025-06-19 07:46:01 +05:30

Reinforcement learning training with DDPO

You can fine-tune Stable Diffusion on a reward function via reinforcement learning with the 🤗 TRL library and 🤗 Diffusers. This uses the Denoising Diffusion Policy Optimization (DDPO) algorithm, introduced by Black et al. in Training Diffusion Models with Reinforcement Learning and implemented in 🤗 TRL as the [~trl.DDPOTrainer].
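To give a feel for what "fine-tuning on a reward function" means here, the following is a minimal sketch of a PPO-style clipped surrogate loss of the kind DDPO optimizes. It is not the TRL implementation: the function names `normalize_rewards` and `clipped_surrogate_loss`, the `clip_range` default, and the toy numbers are all assumptions made for illustration. In a real run, the log-probabilities would come from individual denoising steps of the diffusion model and the rewards from scoring the generated images.

```python
import math


def normalize_rewards(rewards, eps=1e-8):
    """Convert raw rewards into zero-mean, unit-variance advantages.

    Hypothetical helper: a real trainer may track statistics per prompt.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def clipped_surrogate_loss(new_log_probs, old_log_probs, advantages, clip_range=0.2):
    """Average clipped policy-gradient loss over a batch of denoising steps.

    new_log_probs: log-probabilities under the model being updated.
    old_log_probs: log-probabilities recorded when the samples were drawn.
    advantages:    one advantage per step (e.g. from normalize_rewards).
    """
    losses = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Importance ratio between the updated and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_range, 1 + clip_range].
        clipped_ratio = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
        unclipped = -adv * ratio
        clipped = -adv * clipped_ratio
        # Taking the larger loss is the pessimistic (PPO-style) choice.
        losses.append(max(unclipped, clipped))
    return sum(losses) / len(losses)


# Toy example with made-up rewards for four generated images.
advantages = normalize_rewards([0.2, 0.9, 0.5, 0.4])
old_log_probs = [-1.0, -1.0, -1.0, -1.0]
new_log_probs = [-1.1, -0.8, -1.0, -1.0]
loss = clipped_surrogate_loss(new_log_probs, old_log_probs, advantages)
```

Clipping the ratio limits how far a single update can move the model away from the policy that produced the samples, which is why the sampled log-probabilities have to be stored alongside the images.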

For more information, check out the [~trl.DDPOTrainer] API reference and the Finetune Stable Diffusion Models with DDPO via TRL blog post.
