How to code BERT Word + Sentence Vectors (Embedding) w/ Transformers? Theory + Colab, Python

Published: 19 August 2022
on the channel: Discover AI

Before SBERT there was BERT: a stacked, bidirectional Transformer encoder. I show you in theory (2 min) and in code (Colab) how to build WORD embeddings (word vectors) from the hidden states of each of the 12 BERT encoder layers, and how to build a SENTENCE vector (a sentence embedding) from the encoder stack in a high-dimensional vector space.
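A minimal sketch of that idea with the Hugging Face Transformers library (not the exact Colab code; model name 'bert-base-uncased', summing the last 4 encoder layers per token for word vectors, and mean-pooling the last layer for the sentence vector, are assumptions for illustration):

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

sentence = "The bank raised interest rates."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.hidden_states is a tuple of 13 tensors (embedding layer + 12 encoders),
# each of shape [batch, tokens, 768]
hidden_states = torch.stack(outputs.hidden_states)        # [13, 1, tokens, 768]
token_states = hidden_states.squeeze(1).permute(1, 0, 2)  # [tokens, 13, 768]

# WORD vector: one common choice is to sum the last 4 encoder layers per token
word_vectors = [token[-4:].sum(dim=0) for token in token_states]

# SENTENCE vector: mean of the last encoder layer over all tokens
sentence_vector = outputs.last_hidden_state.squeeze(0).mean(dim=0)

print(len(word_vectors), word_vectors[0].shape, sentence_vector.shape)
```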

Part 2 of this video is called:
Python Code for BERT Paragraph Vector Embedding w/ Transformers (PyTorch, Colab)
and linked here:    • Python Code for BERT Paragraph Vector...  

Then we can apply UMAP for dimensionality reduction, preserving the relevant structure of the embeddings in a lower-dimensional vector space.
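A short sketch with the umap-learn package (the random placeholder array stands in for real BERT sentence vectors of shape [n_sentences, 768]; the parameter values are assumptions, not the video's settings):

```python
import numpy as np
import umap

# placeholder for a matrix of BERT sentence vectors
embeddings = np.random.rand(100, 768).astype("float32")

# reduce 768 dimensions to 2 for visualization / clustering
reducer = umap.UMAP(n_components=2, n_neighbors=15, metric="cosine", random_state=42)
embedding_2d = reducer.fit_transform(embeddings)

print(embedding_2d.shape)  # (100, 2)
```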

SBERT today is faster and more performant than BERT sentence vectors. But BERT's hidden states hold exceptional contextualized embeddings, which outperform static word embeddings like Word2Vec, if you know which hidden states of BERT to select for your vector representation.

Great instructions online:
https://mccormickml.com/2019/05/14/BE...
https://peltarion.com/knowledge-cente...

Difference between CLS hidden state (Embedding) and pooling w/ BERT:
https://github.com/huggingface/transf...
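To make the distinction concrete, here is a small hedged sketch (assuming 'bert-base-uncased') contrasting the raw [CLS] hidden state, the pooler_output (the [CLS] state passed through a trained Linear + tanh layer), and simple mean pooling over all token states:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("BERT sentence vectors, three ways.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

cls_hidden = out.last_hidden_state[:, 0, :]     # raw [CLS] token state, last layer
pooled     = out.pooler_output                  # [CLS] through Linear + tanh head
mean_pool  = out.last_hidden_state.mean(dim=1)  # average over all token states

print(cls_hidden.shape, pooled.shape, mean_pool.shape)  # each [1, 768]
```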

#datascience
#ai
#sbert