📜 Get repo access at https://trelis.com/ADVANCED-evals
Trelis Evals (hosted solution) - Waitlist: https://forms.gle/q2bHurzLYNLW5d1U7
Tip: If you subscribe here on YouTube, click the bell to be notified of new videos.
💡 Need Technical or Market Assistance?
Book a Consult Here: https://forms.gle/wJXVZXwioKMktjyVA
🤝 Are You a Top Developer?
Work for Trelis: https://trelis.com/jobs/
💸 Starting a New Project/Venture?
Apply for a Trelis Grant: https://trelis.com/trelis-ai-grants/
📧 Get Trelis AI Tutorials by Email
Subscribe on Substack: https://trelis.substack.com
Video Links:
ADVANCED Evals Part 1: LLM Evals - Part 1: Evaluating Perfor...
ADVANCED Evals Part 2: LLM Evals - Part 2: Improving Perform...
YourBench: https://github.com/huggingface/yourbench
LightEval: https://github.com/huggingface/lighteval
TIMESTAMPS:
0:00 Creating a custom benchmarking dataset
0:31 Video Overview and Scripts (https://trelis.com/ADVANCED-evals)
1:06 Quick-start with YourBench from HuggingFace
7:47 Running YourBench locally to create a benchmark
20:59 Advanced data generation notes (PDF conversion, estimating difficulty, citations, chunking, multi-hop, filtering)
29:23 Evaluating a custom dataset using LightEval
36:29 Evaluation and Data Inspection with Trelis ADVANCED-evals
46:01 Conclusion