Large Language Models Can Help Make Better Image Classifiers. Here's How.

Published: 11 June 2024
Channel: Snorkel AI

During this research talk, you’ll see how follow-up differential descriptions (FuDD) can enhance zero-shot image classification by tailoring class descriptions for each dataset.

Reza Esfandiarpoor, a PhD student at Brown University, will discuss how FuDD identifies the ambiguous classes for each image and then uses a large language model (LLM) to produce new descriptions that better distinguish them.
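
As a rough illustration of the two steps described above, the minimal sketch below picks the classes most similar to an image and builds one LLM prompt per class pair. It is not the FuDD implementation: the top-k selection rule, the toy embeddings, the prompt wording, and the helper names `ambiguous_classes` and `differential_prompts` are all assumptions made for this example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ambiguous_classes(image_emb, class_embs, k=2):
    """Return the k class names whose embeddings are closest to the image.

    Assumption: "ambiguous" is approximated here as "top-k by similarity".
    """
    ranked = sorted(
        class_embs,
        key=lambda name: cosine(image_emb, class_embs[name]),
        reverse=True,
    )
    return ranked[:k]


def differential_prompts(classes):
    """Build one hypothetical LLM prompt for every pair of ambiguous classes."""
    return [
        f"What visual features distinguish a {a} from a {b}?"
        for a, b in combinations(classes, 2)
    ]


# Toy embeddings standing in for a vision-language model's outputs.
class_embs = {
    "sparrow": [0.9, 0.1, 0.0],
    "finch": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}
image_emb = [0.85, 0.15, 0.05]

top = ambiguous_classes(image_emb, class_embs, k=2)
prompts = differential_prompts(top)
print(top)
print(prompts)
```

In a real pipeline, the prompts would be sent to an LLM and the returned descriptions would be used to re-score the image against only the ambiguous classes.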

The talk will address:
How FuDD performs compared to few-shot adaptation methods.
Which challenges FuDD handles well.
How to use FuDD in your workflow.

Read a write-up of Reza's work here: https://snorkel.ai/improving-vision-l...

See more research talks here: AI Research Talks: Building and Disco...

Timestamps:

00:00 Introduction
01:00 Using LLMs to Improve Image Classification
02:01 Adapting Language Generation Process
02:20 Contrastive Vision Language Models
03:51 Class Description and Ambiguities
05:00 Follow-Up Differential Descriptions (FuDD)
06:20 Experiments and Pipeline Analysis
09:14 Analytical Experiments on Descriptions
10:07 Ambiguous Classes Importance
11:05 Fine-Tuning Publicly Available LLMs
13:54 Second Part Introduction
14:03 Understanding VLM Representations
15:15 Extract and Explore Methodology
17:09 Reinforcement Learning with Human Feedback
22:21 Analysis of Generated Descriptions
25:31 Role of Spurious and Non-Visual Descriptions
26:20 Fine-Grained Attribute Analysis
30:55 Use Case of EX2 in Bias Analysis
33:49 Summary of Findings

#imageclassification #largelanguagemodels #airesearch