What LLM should I use for my application?

Published: 15 April 2025
Channel: Trelis Research

📜 Get repo access at Trelis.com/ADVANCED-evals

Trelis Evals (hosted solution) - Waitlist: https://forms.gle/q2bHurzLYNLW5d1U7

Tip: If you subscribe here on YouTube, click the bell to be notified of new videos

💡 Need Technical or Market Assistance?
Book a Consult Here: https://forms.gle/wJXVZXwioKMktjyVA

🤝 Are You a Top Developer?
Work for Trelis: https://trelis.com/jobs/

💸 Starting a New Project/Venture?
Apply for a Trelis Grant: https://trelis.com/trelis-ai-grants/

📧 Get Trelis AI Tutorials by Email
Subscribe on Substack: https://trelis.substack.com

Video Links:
ADVANCED Evals Part 1: LLM Evals - Part 1: Evaluating Perfor...
ADVANCED Evals Part 2: LLM Evals - Part 2: Improving Perform...
YourBench: https://github.com/huggingface/yourbench
LightEval: https://github.com/huggingface/lighteval

TIMESTAMPS:
0:00 Creating a custom benchmarking dataset
0:31 Video Overview and Scripts (https://trelis.com/ADVANCED-evals)
1:06 Quick-start with YourBench from HuggingFace
7:47 Running YourBench locally to create a benchmark
20:59 Advanced data generation notes (PDF conversion, estimating difficulty, citations, chunking, multi-hop, filtering)
29:23 Evaluating a custom dataset using LightEval
36:29 Evaluation and Data Inspection with Trelis ADVANCED-evals
46:01 Conclusion