An Introduction to LangChain and Agents

Posted Jul 30, 2024 Updated Aug 23, 2024

By Fodev JEO 2 min read

LLMs Course Github 👉 https://github.com/peremartra/Large-Language-Model-Notebooks-Course/tree/main/3-LangChain

🔷 The RAG system from the second part is modified to use LangChain.

🔷 A moderation system is created by linking two models, with the second one responsible for moderating the first model's output and delivering the response to the user. There are three examples: one with OpenAI models, and two with open-source models, Llama and GPT-J.
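The two-model pattern can be sketched without calling any LLM. In the sketch below both models are replaced by plain Python functions: `draft_answer`, `moderate`, `chain`, and the `BLOCKLIST` rule set are all hypothetical stand-ins invented for illustration, not LangChain or OpenAI APIs, and the notebooks wire real models together instead.

```python
from typing import Callable

# Hypothetical stand-in for the first model: it produces a raw, unmoderated reply.
def draft_answer(user_message: str) -> str:
    return f"That is a stupid question, but here it goes: {user_message.upper()}"

# Toy rule set, purely illustrative; a real moderator would be a second LLM.
BLOCKLIST = {"stupid", "idiot"}

# Hypothetical stand-in for the second model: it rewrites the draft and
# returns the text that the user actually sees.
def moderate(draft: str) -> str:
    words = [
        "[removed]" if word.strip(".,:!?").lower() in BLOCKLIST else word
        for word in draft.split()
    ]
    return " ".join(words)

def chain(*steps: Callable[[str], str]) -> Callable[[str], str]:
    """Compose steps so that each one receives the previous step's output."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run

moderated_bot = chain(draft_answer, moderate)
print(moderated_bot("what is langchain?"))
```

The point of the design is that the user never sees the first model's output directly: whatever the first step produces always passes through the moderating step.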

Article and notebook using OpenAI models:

🔷 The course continues by creating a Data Analyst using an agent from the langchain_experimental library, capable of interpreting tabular data loaded from a .csv file.

🔷 Finally, a chatbot is created to serve as a medical assistant using LangChain and ChromaDB with a medical data dataset.

Notebook: https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/3-LangChain/3_4_Medical_Assistant_Agent.ipynb

As you can see, this part of the course uses a few examples to introduce you to LangChain and reinforce the knowledge you gained in the second lesson on vector databases.

I just finished reviewing the Large Language Models Evaluation section of the free course available on GitHub.

🔷 It starts with a brief introduction to n-grams and classic evaluation metrics like BLEU for translations and ROUGE for summaries.

▪️Evaluating translations with BLEU.

Notebook: https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/4-Evaluating%20LLMs/4_1_bleu_evaluation.ipynb
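To make the link between n-grams and BLEU concrete, here is a from-scratch sketch of a sentence-level score. It is not the notebook's code: it assumes a single reference, whitespace tokenization, n-grams up to length 2, and no smoothing, and the function names are made up for this example.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with counts clipped
    so that repeating a matching word cannot inflate the score."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    if not cand_counts:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

def bleu(candidate, reference, max_n=2):
    """Geometric mean of the n-gram precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Candidates shorter than the reference are penalized.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_mean)

reference = "the cat is on the mat".split()
candidate = "the cat sat on the mat".split()
print(round(bleu(candidate, reference), 4))  # 0.7071
```

Here 5 of 6 unigrams and 3 of 5 bigrams match, so the score is the square root of 5/6 × 3/5. Production libraries add smoothing and multi-reference support, which this sketch leaves out.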

▪️Evaluating Summarisations with ROUGE.
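Where BLEU is precision-oriented, ROUGE is usually read through recall: how much of the reference summary the generated one covers. The sketch below is a simplified illustration, assuming whitespace tokenization and a single reference; `rouge_n` and `lcs_length` are names invented for this example rather than a library API.

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """Precision, recall, and F1 over overlapping n-grams (ROUGE-N)."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    overlap = sum((cand & ref).values())  # multiset intersection
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}

def lcs_length(a, b):
    """Length of the longest common subsequence, the basis of ROUGE-L."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

reference = "the cat sat on the mat".split()
summary = "the cat sat".split()

scores = rouge_n(summary, reference, n=1)
print(scores["recall"], scores["precision"])          # 0.5 1.0
print(lcs_length(summary, reference) / len(reference))  # ROUGE-L recall: 0.5
```

The short summary is fully precise but covers only half of the reference, which is why recall and precision are worth reporting together.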

🔷 Once introduced to the world of metrics, the course moves on to using a tool like LangSmith, first to monitor the internal calls of an agent created with LangChain, and then to measure the quality of summaries using the distance between embeddings.

This second example introduces LangSmith's evaluators and shows how to use them to measure more than one metric at a time and to detect harmful content in summaries.

▪️Evaluating the quality of summaries using Embedding distance with LangSmith.
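The idea behind an embedding-distance metric fits in a few lines. The sketch below assumes cosine distance, one common choice, and uses tiny made-up vectors in place of real embeddings; it does not call LangSmith or any embedding model.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity: 0 means same direction, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

# Made-up 3-dimensional "embeddings"; real ones have hundreds of dimensions.
reference_vec = [1.0, 0.0, 1.0]
close_summary_vec = [2.0, 0.0, 2.0]  # same direction as the reference
far_summary_vec = [0.0, 1.0, 0.0]    # orthogonal to the reference

print(round(cosine_distance(reference_vec, close_summary_vec), 6))  # 0.0
print(round(cosine_distance(reference_vec, far_summary_vec), 6))    # 1.0
```

A lower distance means the generated summary sits closer in meaning to the reference, which lets the metric reward paraphrases that n-gram overlap would miss.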

🔷 Finally, a very powerful tool called Giskard is presented, which serves, among other things, to evaluate RAG solutions. Like LangSmith, Giskard uses Large Language Models to evaluate other Large Language Models.

This is one of the evaluation trends that seems to be gaining the most traction.

▪️Evaluating a RAG solution with Giskard.

Notebook: https://github.com/peremartra/Large-Language-Model-Notebooks-Course/blob/main/4-Evaluating%20LLMs/4_3_evaluating_rag_giskard.ipynb

The evaluation of tools built with Language Models is one of the fastest-evolving fields. Because it is so hard to decide whether a result is correct, evaluation often ends up relying on one of the most advanced Large Language Models to judge the results of other, more specialized ones.

In these examples, you see everything from the most classic metrics to the latest tools that not only evaluate the quality of the text produced by the Large Language Model but also all the layers that are part of a RAG solution.

This is just an introduction because several books could be written about this field. But if you go through all the examples, you will have a fairly broad overview and will have learned about different tools.


This post is licensed under CC BY 4.0 by the author.