How to Build an Inference Service

Published: 19 November 2024
on the channel: Trelis Research

➡️ Lifetime access to ADVANCED-inference Repo (incl. future additions): https://trelis.com/ADVANCED-inference/
➡️ Runpod Affiliate Link: https://runpod.io?ref=jmfkcdio
➡️ One-click GPU templates: https://github.com/TrelisResearch/one...
➡️ Thumbnail made with this tutorial: Fine Tune Flux Diffusion Models with Your ...

OTHER TRELIS LINKS:
➡️ Trelis Newsletter: https://blog.Trelis.com
➡️ Trelis Resources and Support: https://Trelis.com/About

VIDEO LINKS:
Slides: https://docs.google.com/presentation/...

TIMESTAMPS:
00:00 - Introduction to AI Inference Scaling
00:38 - Video Agenda Overview
02:00 - Different Inference Approaches
05:13 - Understanding GPU Utilization
08:53 - Setting Up One-Click Templates
14:14 - Docker Image Configuration
24:19 - Building Auto-Scaling Service
29:19 - Model Configuration Settings
35:35 - Load Testing and Metrics
41:35 - Scaling Manager Implementation
56:15 - Setting Up API Endpoint
59:51 - Conclusion and Future Topics
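The timestamps above cover building a scaling manager for an auto-scaling inference service. The repo's actual implementation is not shown in this description, but the core idea can be sketched as mapping observed request load to a desired GPU worker count. All names and thresholds below (`desired_workers`, `requests_per_worker`, the min/max bounds) are hypothetical illustrations, not taken from the video or repo:

```python
# Hypothetical sketch: scale the number of GPU workers from observed
# queue depth. Thresholds and names are illustrative only and do not
# come from the Trelis ADVANCED-inference repo.

def desired_workers(queue_depth: int,
                    current_workers: int,
                    requests_per_worker: int = 8,
                    min_workers: int = 1,
                    max_workers: int = 4) -> int:
    """Return how many workers to run for the given pending request load."""
    if queue_depth <= 0:
        # No backlog: keep the current count, clamped to the allowed range.
        return max(min_workers, min(current_workers, max_workers))
    # Ceiling division: workers needed to keep each worker's queue bounded.
    needed = -(-queue_depth // requests_per_worker)
    return max(min_workers, min(needed, max_workers))

print(desired_workers(queue_depth=0, current_workers=2))   # idle: hold at 2
print(desired_workers(queue_depth=20, current_workers=1))  # burst: scale to 3
```

A real scaling manager would poll these metrics on a timer and add hysteresis or cooldown periods so it does not thrash between worker counts on brief load spikes.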