
๐‹๐‹๐Œ2๐•๐ž๐œ - ๐“๐ซ๐š๐ง๐ฌ๐Ÿ๐จ๐ซ๐ฆ ๐‹๐‹๐Œ๐ฌ ๐ข๐ง๐ญ๐จ ๐„๐ฆ๐›๐ž๐๐๐ข๐ง๐  ๐Œ๐จ๐๐ž๐ฅ๐ฌ

LLM2Vec paper 👉 https://mcgill-nlp.github.io/llm2vec/

Introduction

LLM2Vec is a simple unsupervised approach that can transform any decoder-only LLM into a strong text encoder.

LLM2Vec consists of three simple steps (a toy sketch of each step follows the list):

  1. enabling bidirectional attention,
  2. masked next token prediction, and
  3. unsupervised contrastive learning.
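Below is a minimal, self-contained PyTorch sketch of the three steps on a toy backbone, so it runs without a GPU. The names (`ToyBackbone`, `mntp_loss`, `simcse_loss`) are illustrative stand-ins, not the paper's code; the paper applies the same ideas to a pretrained decoder-only LLM such as Llama or Mistral.

```python
# Toy illustration of the three LLM2Vec steps: bidirectional attention,
# masked next token prediction (MNTP), and unsupervised contrastive learning (SimCSE).
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, MASK_ID = 1000, 64, 0

class ToyBackbone(nn.Module):
    """Stand-in for a decoder-only LLM whose causal mask has been removed (step 1)."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True, dropout=0.1)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(DIM, VOCAB)

    def forward(self, ids):
        # No causal mask: every token attends to every other token (bidirectional attention).
        return self.encoder(self.emb(ids))

def mntp_loss(model, ids, mask_prob=0.2):
    """Step 2: masked next token prediction. Random tokens are masked, but each masked
    token is predicted from the hidden state at the *previous* position, matching how a
    decoder-only LM normally predicts the next token."""
    labels = ids.clone()
    mask = torch.rand_like(ids, dtype=torch.float) < mask_prob
    mask[:, 0] = False                       # position 0 has no previous position to predict it
    masked_ids = ids.masked_fill(mask, MASK_ID)
    hidden = model(masked_ids)
    logits = model.lm_head(hidden[:, :-1])   # hidden state at i-1 predicts the token at i
    targets = labels[:, 1:].masked_fill(~mask[:, 1:], -100)
    return F.cross_entropy(logits.reshape(-1, VOCAB), targets.reshape(-1), ignore_index=-100)

def simcse_loss(model, ids, temperature=0.05):
    """Step 3: unsupervised contrastive learning (SimCSE). Encode the same batch twice;
    dropout yields two views, and each sentence's other view is its positive pair."""
    z1 = model(ids).mean(dim=1)              # mean-pool token states into sentence embeddings
    z2 = model(ids).mean(dim=1)              # second pass -> different dropout mask
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
    labels = torch.arange(ids.size(0))
    return F.cross_entropy(sim, labels)

ids = torch.randint(1, VOCAB, (8, 16))       # toy batch: 8 sequences of 16 tokens
model = ToyBackbone()
print(mntp_loss(model, ids).item(), simcse_loss(model, ids).item())
```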

LLM2Vec not only outperforms encoder-only models on word-level tasks but also achieves a new unsupervised state-of-the-art on the MTEB benchmark.

To summarize, LLM2Vec shows that, without expensive adaptation or synthetic GPT-4 data, LLMs can be transformed into embedding models (universal text encoders).
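For completeness, here is a sketch of how the released `llm2vec` package can be used to embed text. The checkpoint names and the `from_pretrained`/`encode` calls follow the project's README, so treat this as an illustration of that interface rather than tested code.

```python
import torch
from llm2vec import LLM2Vec

# Load a decoder-only LLM adapted with MNTP, plus the unsupervised SimCSE adapter.
l2v = LLM2Vec.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
    peft_model_name_or_path="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)

docs = [
    "LLM2Vec turns decoder-only LLMs into text encoders.",
    "Paris is the capital of France.",
]
embeddings = l2v.encode(docs)   # one embedding vector per input text
print(embeddings.shape)
```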
