Vision language models "see" images in non-visual ways. PhD researcher Reza Esfandiarpoor at Brown University studied what language VLMs associate with specific images and found that some models use non-visual concepts more than they would a simple visual description.
#foundationmodels #airesearch #brownuniversity