ORPO: The Latest LLM Fine-tuning Method | A Quick Tutorial using Hugging Face

Published: 30 September 2024
on the channel: Quick Tutorials

In this video, we give you a quick overview of the ORPO (Odds Ratio Preference Optimization) method for fine-tuning Large Language Models (LLMs). In particular, we review SFT (Supervised Fine-Tuning), DPO (Direct Preference Optimization), and RLHF (Reinforcement Learning from Human Feedback) alongside ORPO, the latest fine-tuning method for LLMs. We also show you how to fine-tune an LLM with ORPO using AutoTrain on Hugging Face.
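
If you prefer a script over the AutoTrain workflow shown in the video, below is a minimal sketch of ORPO fine-tuning with the TRL library's ORPOTrainer. The model and dataset names are placeholders, the dataset is assumed to already have prompt/chosen/rejected columns, and argument names may vary slightly between TRL versions.

# Minimal ORPO fine-tuning sketch with TRL (placeholders, not the exact setup from the video)
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "your-base-model"          # placeholder: any causal LM checkpoint
dataset_name = "your-preference-data"   # placeholder: needs prompt/chosen/rejected columns

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
train_dataset = load_dataset(dataset_name, split="train")

config = ORPOConfig(
    output_dir="orpo-finetuned-model",
    beta=0.1,                        # weight of the odds-ratio preference term
    max_length=1024,
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()                      # single loss combines SFT and preference terms, no separate reward model
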

#llm #orpo #sft #rlhf #dpo #largelanguagemodels #pretraining #ai #nlp