From cc2205832443176fb4c1a9b02f21929b67846fbe Mon Sep 17 00:00:00 2001
From: Sayak Paul
Date: Tue, 4 Mar 2025 13:58:16 +0530
Subject: [PATCH] Update evaluation.md (#10938)

* Update evaluation.md

* Update docs/source/en/conceptual/evaluation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---
 docs/source/en/conceptual/evaluation.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/docs/source/en/conceptual/evaluation.md b/docs/source/en/conceptual/evaluation.md
index 90e072bbf2..131b888e7a 100644
--- a/docs/source/en/conceptual/evaluation.md
+++ b/docs/source/en/conceptual/evaluation.md
@@ -16,6 +16,11 @@ specific language governing permissions and limitations under the License.
 
 Open In Colab
 
+> [!TIP]
+> This document has now grown outdated given the emergence of existing evaluation frameworks for diffusion models for image generation. Please check
+> out works like [HEIM](https://crfm.stanford.edu/helm/heim/latest/), [T2I-Compbench](https://arxiv.org/abs/2307.06350),
+> [GenEval](https://arxiv.org/abs/2310.11513).
+
 Evaluation of generative models like [Stable Diffusion](https://huggingface.co/docs/diffusers/stable_diffusion) is subjective in nature. But as practitioners and researchers, we often have to make careful choices amongst many different possibilities. So, when working with different generative models (like GANs, Diffusion, etc.), how do we choose one over the other?
 
 Qualitative evaluation of such models can be error-prone and might incorrectly influence a decision.