
RAG or Fine-Tuning? Fine-tune Embedding Models for Retrieval Augmented Generation (RAG)

RAG or Fine-Tuning? A simple feature comparison to decide which technique you should use!

Besides RAG, the other main technique for customizing LLMs is fine-tuning.

๐—ฅ๐—”๐—š is akin to providing a textbook to the model, allowing it to retrieve information based on specific queries. This approach is suitable for scenarios where the model needs to address particular information retrieval tasks. However, RAG is not suitable for teaching the model to understand broad domains or learn new languages, formats, or styles.

๐—™๐—ถ๐—ป๐—ฒ-๐˜๐˜‚๐—ป๐—ถ๐—ป๐—ด is similar to enabling students to internalize knowledge through extensive learning. Fine-tuning can enhance the performance of non-fine-tuned models and make interactions more efficient. It is particularly suitable for emphasizing existing knowledge in the base model, modifying or customizing the modelโ€™s output, and providing complex directives to the model.

Sometimes choosing one approach over the other is not straightforward; this guide will help you decide which technique better fits your use case!

Fine-tuning or RAG?

RAG in Production: The importance of a Solid Data Strategy 💥

Retrieval-Augmented Generation (RAG) has become one of the hottest topics in Generative AI, providing powerful ways to enhance model responses with real-world data. But let's be honest, without a solid data strategy, you're setting yourself up for a meme-worthy fail. 😂

๐Ÿ“ˆ ๐—ช๐—ต๐˜† ๐—ฅ๐—”๐—š ๐—ก๐—ฒ๐—ฒ๐—ฑ๐˜€ ๐—ฎ ๐——๐—ฎ๐˜๐—ฎ ๐—ฆ๐˜๐—ฟ๐—ฎ๐˜๐—ฒ๐—ด๐˜†:

  1. ๐——๐—ฎ๐˜๐—ฎ ๐—ค๐˜‚๐—ฎ๐—น๐—ถ๐˜๐˜†: Garbage in, garbage out. Your model is only as good as the data it retrieves.
  2. ๐—ฅ๐—ฒ๐—น๐—ฒ๐˜ƒ๐—ฎ๐—ป๐—ฐ๐—ฒ: Ensure your data is relevant to your use case.
  3. ๐—ฆ๐—ฐ๐—ฎ๐—น๐—ฎ๐—ฏ๐—ถ๐—น๐—ถ๐˜๐˜†: Manage and scale your data efficiently to keep up with growing demands.

Remember, a well-thought-out data strategy is the backbone of any successful RAG implementation.

๐Ÿš€ ๐—–๐—ผ๐—ป๐—ฐ๐—น๐˜‚๐˜€๐—ถ๐—ผ๐—ป: Donโ€™t let your RAG use case fall flat. Invest in your data strategy and watch your AI soar! ๐ŸŒŸ

Fine-tuning can significantly boost retrieval. 👀

Embedding models are crucial for Retrieval-Augmented Generation (RAG) applications, but general-purpose models often fall short on domain-specific tasks.

Excited to share a new blog on how to fine-tune embedding models for financial RAG applications on NVIDIA's 2023 SEC Filing dataset, using the latest research, like Matryoshka Representation Learning:

  • 🚀 Fine-tuning boosts performance by 7.4% to 22.55% with just 6.3k samples
  • ✅ Baseline creation + evaluation during training
  • 🧬 Synthetic data generation used for fine-tuning
  • ⏱️ Training on ~10,000 samples takes only 5 minutes on consumer-grade GPUs
  • 🪆 Matryoshka keeps 99% performance at a 6x smaller size
  • 📈 Fine-tuned 128-dim model outperforms the baseline 768-dim model by 6.51%
  • 🆕 Uses the new Sentence Transformers v3
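For intuition on the 128-dim vs. 768-dim point: with a Matryoshka-trained model you shrink an embedding at query time by keeping only its first dimensions and re-normalizing, and cosine similarity still works at the smaller size. The vectors below are synthetic stand-ins and the `truncate` helper is a hypothetical sketch; the near-lossless quality only holds for a model actually trained with Matryoshka Representation Learning.

```python
import math

def truncate(embedding: list[float], dim: int = 128) -> list[float]:
    """Keep the first `dim` components and rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Stand-in for a 768-dim model embedding.
full = [math.sin(i) for i in range(768)]
small = truncate(full, dim=128)  # 6x smaller, still unit length
```

Because Matryoshka training front-loads information into the leading dimensions, this simple slice-and-renormalize step is all the "compression" a retrieval index needs, trading a 6x smaller vector store for a tiny quality loss.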

👉 Original Article : https://www.philschmid.de/fine-tune-embedding-model-for-rag

👉 Code : https://github.com/philschmid/deep-learning-pytorch-huggingface/blob/main/training/fine-tune-embedding-model-for-rag.ipynb

Go build! 🤗


This post is licensed under CC BY 4.0 by the author.