Three Ways to Evaluate LLMs

Published: July 25, 2024
Channel: Snorkel AI

Most LLM evaluation falls into three buckets:

Open-source evaluations and metrics.
LLM as judge (a rough sketch follows this list).
Human annotation, whether internal or outsourced.
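The video itself does not include code. As a rough illustration of the LLM-as-judge pattern named above, here is a minimal Python sketch; the prompt wording, the 1–5 scoring scale, and the `call_model` stand-in are assumptions for illustration, not Snorkel AI's implementation.

```python
# Minimal LLM-as-judge sketch (hypothetical, for illustration only).
# `call_model` stands in for whichever LLM client you use: it takes a
# prompt string and returns the model's text reply.
from typing import Callable

JUDGE_PROMPT = """You are grading a model's answer.
Question: {question}
Reference answer: {reference}
Candidate answer: {candidate}
Reply with a single integer score from 1 (poor) to 5 (excellent)."""


def judge_answer(
    question: str,
    reference: str,
    candidate: str,
    call_model: Callable[[str], str],  # hypothetical: prompt in, text out
) -> int:
    """Ask a judge LLM to score a candidate answer against a reference."""
    reply = call_model(
        JUDGE_PROMPT.format(
            question=question, reference=reference, candidate=candidate
        )
    )
    # Pull the first digit out of the judge's reply; 0 means "unparseable".
    digits = [ch for ch in reply if ch.isdigit()]
    return int(digits[0]) if digits else 0
```

In practice you would run this judge over a held-out set of prompts and aggregate the scores, keeping in mind the drawbacks discussed in the video (judge bias, prompt sensitivity, and the cost of judge calls).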

Snorkel AI founding engineer Vincent Sunn Chen walks through the advantages and drawbacks of each of these approaches.

This video is an excerpt from a longer webinar. See the full event here: How to Evaluate LLM Performance for Domain...

#largelanguagemodels #evaluation #annotation