
An introduction to LangChain and agents.

LLMs Course GitHub 👉 https://github.com/peremartra/Large-Language-Model-Notebooks-Course/tree/main/3-LangChain

Introduction to LangChain and Agents: Building Intelligent Applications

Curiosity: How can we build sophisticated LLM applications that go beyond simple prompts? What patterns and frameworks enable us to create intelligent agents that can reason, retrieve information, and take actions?

LangChain is a powerful framework for building LLM applications with chains, agents, and memory. This course section introduces LangChain through practical examples, from RAG systems to intelligent agents capable of data analysis and specialized assistance.

Course Structure Overview

graph TB
    A[LangChain Course] --> B[🔷 RAG with LangChain]
    A --> C[🔷 Moderation Systems]
    A --> D[🔷 Data Analyst Agent]
    A --> E[🔷 Medical Assistant]
    
    B --> B1[DataFrame Querying]
    C --> C1[OpenAI Models]
    C --> C2[Open Source Models]
    D --> D1[CSV Analysis]
    E --> E1[ChromaDB Integration]
    
    style A fill:#e1f5ff
    style B fill:#fff3cd
    style C fill:#d4edda
    style D fill:#f8d7da
    style E fill:#e7d4f8

🔷 Part 1: RAG System with LangChain

Retrieve: Learn how to build RAG systems using LangChain's powerful abstractions for document loading, vector stores, and retrieval.

Key Concepts:

  • Document loaders and text splitters
  • Vector store integration
  • Retrieval chains
  • Query processing
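
To make these concepts concrete, here is a minimal, library-free sketch of the retrieval step. It is not how LangChain implements it: the word-count "embedding", the helper names (`split_text`, `embed`, `retrieve`), and the sample text are all assumptions chosen for illustration, standing in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 8) -> list[str]:
    """Toy text splitter: chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Illustrative document, not taken from the course notebooks.
document = (
    "LangChain offers document loaders for many file types. "
    "Vector stores such as Chroma index embeddings for similarity search. "
    "Retrieval chains pass the best matching chunks to the model."
)
question = "Which vector stores index embeddings?"
chunks = split_text(document)
top = retrieve(question, chunks, k=1)

# Query processing: the retrieved chunk is placed in the prompt sent to the LLM.
prompt = f"Answer using this context:\n{top[0]}\n\nQuestion: {question}"
```

In a real system the splitter, embedding model, and vector store would each be replaced by the corresponding LangChain component; only the overall shape (split, embed, rank, build the prompt) carries over.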

Example Implementation:

from langchain.document_loaders import DataFrameLoader
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

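# Load the DataFrame (df is assumed to be an existing pandas DataFrame
# whose "text" column holds the content to index)
loader = DataFrameLoader(df, page_content_column="text")
documents = loader.load()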

# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(documents, embeddings)

# Create QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

# Query
result = qa_chain.run("What is the average sales by region?")
print(result)

🔷 Part 2: Self-Moderated Commentary System

Innovate: Build moderation systems where one model moderates content before another model responds, ensuring safe and appropriate interactions.

Architecture:

graph LR
    A[User Input] --> B[Moderation Model]
    B -->|Safe| C[Response Model]
    B -->|Unsafe| D[Filtered Response]
    C --> E[User Output]
    D --> E
    
    style A fill:#e1f5ff
    style B fill:#fff3cd
    style C fill:#d4edda
    style E fill:#f8d7da

Three Implementation Approaches:

| Model Type | Use Case | Resources |
|---|---|---|
| OpenAI Models | Production-ready moderation | Article / Notebook |
| Llama 2 | Open-source alternative | Article / Notebook |
| GPT-J | Cost-effective solution | Similar pattern to Llama |

Moderation Flow:

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# Moderation prompt
moderation_prompt = PromptTemplate(
    input_variables=["input"],
    template="Is this input appropriate? {input}\nAnswer:"
)

# Response prompt
response_prompt = PromptTemplate(
    input_variables=["input"],
    template="Respond to: {input}"
)

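# Create chains (moderation_llm and response_llm are assumed to be LLM
# instances created beforehand)
moderation_chain = LLMChain(llm=moderation_llm, prompt=moderation_prompt)
response_chain = LLMChain(llm=response_llm, prompt=response_prompt)

# Moderation check
def moderated_response(user_input):
    verdict = moderation_chain.run(user_input).lower()
    # "inappropriate" contains "appropriate", so rule out negative verdicts first
    negative = "inappropriate" in verdict or "not appropriate" in verdict
    if "appropriate" in verdict and not negative:
        return response_chain.run(user_input)
    return "I cannot respond to that input."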
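
When a moderation verdict is parsed by keyword, one detail deserves care: the word "inappropriate" contains "appropriate", so a plain substring test accepts both answers. The sketch below shows one possible alternative, asking the moderator for an explicit YES or NO and accepting only a leading YES. The two `fake_*` functions are hypothetical stand-ins for the moderation and response models so the flow can run offline; they are not part of LangChain.

```python
from typing import Callable

REFUSAL = "I cannot respond to that input."


def is_safe(verdict: str) -> bool:
    """Accept only an explicit leading YES; anything else counts as unsafe."""
    return verdict.strip().upper().startswith("YES")


def moderated_reply(
    user_input: str,
    moderator: Callable[[str], str],
    responder: Callable[[str], str],
) -> str:
    """Run the moderator first; call the responder only for safe input."""
    verdict = moderator(f"Is this input appropriate? Answer YES or NO.\n{user_input}")
    return responder(user_input) if is_safe(verdict) else REFUSAL


# Hypothetical stand-ins for the two models (assumptions for this sketch).
def fake_moderator(prompt: str) -> str:
    return "NO" if "attack" in prompt.lower() else "YES"


def fake_responder(user_input: str) -> str:
    return f"Echo: {user_input}"


# The substring pitfall: this evaluates to True for a negative verdict.
naive_check = "appropriate" in "This is inappropriate.".lower()

safe_reply = moderated_reply("Hello", fake_moderator, fake_responder)
blocked_reply = moderated_reply("Plan an attack", fake_moderator, fake_responder)
```

Defaulting to a refusal whenever the verdict is not clearly positive keeps the failure mode conservative: an ambiguous moderator answer blocks the request instead of letting it through.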

🔷 Part 3: Data Analyst Agent

Retrieve: Create intelligent agents that can analyze tabular data and answer questions using natural language.

Capabilities:

  • CSV file interpretation
  • Data analysis and visualization
  • Natural language queries
  • Automated insights generation

Agent Architecture:

graph TB
    A[User Query] --> B[Agent]
    B --> C[Tool: Read CSV]
    B --> D[Tool: Analyze Data]
    B --> E[Tool: Generate Plot]
    C --> F[DataFrame]
    D --> G[Insights]
    E --> H[Visualization]
    F --> B
    G --> I[Response]
    H --> I
    
    style A fill:#e1f5ff
    style B fill:#fff3cd
    style I fill:#d4edda
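
The loop in the diagram (the agent picks a tool, observes its output, then decides the next step) can be sketched without any LLM. Everything here is an assumption made for illustration: the tool names, the inline CSV data, and `scripted_planner`, which stands in for the model's decision-making with a fixed script.

```python
import csv
import io
from typing import Callable

CSV_TEXT = "product,sales\nA,120\nB,300\nC,80\n"  # illustrative data only


def read_csv(_: str) -> str:
    """Tool: describe the table."""
    rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
    return f"{len(rows)} rows, columns: {', '.join(rows[0].keys())}"


def top_product(_: str) -> str:
    """Tool: find the product with the highest sales."""
    rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
    best = max(rows, key=lambda r: int(r["sales"]))
    return f"{best['product']} ({best['sales']})"


TOOLS: dict[str, Callable[[str], str]] = {
    "read_csv": read_csv,
    "top_product": top_product,
}


def scripted_planner(query: str, observations: list[str]) -> tuple[str, str]:
    """Hypothetical stand-in for the LLM: returns (action, argument)."""
    if not observations:
        return "read_csv", ""
    if len(observations) == 1:
        return "top_product", ""
    return "finish", f"Top product by sales: {observations[-1]}"


def run_agent(query: str, max_steps: int = 5) -> str:
    """Alternate between planning and tool calls until the planner finishes."""
    observations: list[str] = []
    for _ in range(max_steps):
        action, arg = scripted_planner(query, observations)
        if action == "finish":
            return arg
        observations.append(TOOLS[action](arg))  # tool output feeds the next decision
    return "Stopped: step limit reached."


answer = run_agent("Which product sells the most?")
```

A real agent replaces the scripted planner with a model call that reads the query and the observations so far; the step limit guards against a planner that never decides to finish.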

Example Usage:

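# Note: in recent LangChain releases this helper is provided by
# langchain_experimental.agents instead of langchain.agents
from langchain.agents import create_pandas_dataframe_agent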
from langchain.llms import OpenAI
import pandas as pd

# Load data
df = pd.read_csv("sales_data.csv")

# Create agent
agent = create_pandas_dataframe_agent(
    OpenAI(temperature=0),
    df,
    verbose=True
)

# Query
result = agent.run("What are the top 5 products by sales?")
print(result)

🔷 Part 4: Medical Assistant Chatbot

Innovate: Build specialized domain assistants using RAG with domain-specific knowledge bases.

Features:

  • Medical knowledge base integration
  • ChromaDB vector storage
  • Context-aware responses
  • Specialized domain expertise

System Architecture:

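from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.llms import OpenAI  # needed for OpenAI() below

# Load medical documents (load_medical_documents is a placeholder for your
# own loader; it should return a list of LangChain Document objects)
medical_docs = load_medical_documents()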

# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(medical_docs, embeddings)

# Create memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Create conversational chain
qa_chain = ConversationalRetrievalChain.from_llm(
    OpenAI(),
    vectorstore.as_retriever(),
    memory=memory
)

# Chat
response = qa_chain({"question": "What are the symptoms of diabetes?"})
print(response["answer"])
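
To illustrate what the memory contributes, here is a minimal plain-Python sketch of a conversation buffer and of how its contents might be combined with retrieved context. `BufferMemory` and `build_prompt` are hypothetical names for this sketch, not LangChain classes, and the prompt layout is an assumption rather than the template the chain actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class BufferMemory:
    """Toy conversation buffer: keeps every (role, text) turn in order."""

    turns: list[tuple[str, str]] = field(default_factory=list)

    def add(self, question: str, answer: str) -> None:
        """Store one question/answer exchange."""
        self.turns.append(("Human", question))
        self.turns.append(("Assistant", answer))

    def as_text(self) -> str:
        """Render the history as plain text for inclusion in a prompt."""
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


def build_prompt(memory: BufferMemory, context: str, question: str) -> str:
    """Combine chat history, retrieved context, and the new question."""
    return (
        f"Chat history:\n{memory.as_text()}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


memory = BufferMemory()
memory.add("What are the symptoms of diabetes?", "(assistant's earlier answer)")
prompt = build_prompt(memory, "(retrieved passages would go here)", "How is it diagnosed?")
```

Because the earlier exchange travels with the prompt, the model can resolve "it" in the follow-up question to diabetes, which is the point of pairing memory with retrieval.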

LangChain Components Summary

| Component | Purpose | Example Use Case |
|---|---|---|
| Chains | Sequential operations | RAG, moderation pipelines |
| Agents | Tool-using LLMs | Data analysis, web search |
| Memory | Conversation history | Chatbots, assistants |
| Vector Stores | Document retrieval | RAG systems, knowledge bases |
| Tools | External integrations | APIs, databases, calculators |

Key Takeaways

Retrieve: LangChain provides powerful abstractions for building complex LLM applications, from simple RAG systems to sophisticated agents.

Innovate: By combining LangChain's components (chains, agents, memory, and vector stores), you can create intelligent applications that go far beyond simple prompt engineering.

Curiosity → Retrieve → Innovation: Start with curiosity about building intelligent applications, retrieve knowledge about LangChain patterns, and innovate by creating specialized agents for your domain.

📚 Course Repository: https://github.com/peremartra/Large-Language-Model-Notebooks-Course/tree/main/3-LangChain

Next Steps:

  • Explore LangChain documentation
  • Build your first RAG system
  • Create a specialized agent
  • Integrate with your domain knowledge

I just finished reviewing the Large Language Models Evaluation section of the free course available on GitHub.

🔷 It starts with a brief introduction to n-grams and classic evaluation metrics such as BLEU for translations and ROUGE for summaries.

▪️ Evaluating translations with BLEU.

▪️ Evaluating summaries with ROUGE.
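
Both metrics are built on counting n-grams shared between a candidate text and a reference. The sketch below shows only that building block, under simplifying assumptions (whitespace tokenization, a single reference): clipped n-gram precision in the style of BLEU and n-gram recall in the style of ROUGE-N. Full BLEU additionally combines several n-gram orders and applies a brevity penalty, which this sketch omits.

```python
from collections import Counter


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(candidate: str, reference: str, n: int) -> tuple[int, int, int]:
    """Return (matched, candidate_total, reference_total) n-gram counts."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    matched = sum((cand & ref).values())  # counts clipped to the reference
    return matched, sum(cand.values()), sum(ref.values())


def bleu_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram precision: matched n-grams / candidate n-grams."""
    matched, cand_total, _ = overlap(candidate, reference, n)
    return matched / cand_total if cand_total else 0.0


def rouge_recall(candidate: str, reference: str, n: int = 1) -> float:
    """N-gram recall: matched n-grams / reference n-grams."""
    matched, _, ref_total = overlap(candidate, reference, n)
    return matched / ref_total if ref_total else 0.0


reference = "the cat sat on the mat"
candidate = "the cat is on the mat"
p1 = bleu_precision(candidate, reference, n=1)  # 5 of 6 candidate unigrams match
r2 = rouge_recall(candidate, reference, n=2)    # 3 of 5 reference bigrams match

# Clipping keeps repeated words from inflating precision.
p_repeat = bleu_precision("the the the", "the cat", n=1)  # 1 of 3
```

Precision asks how much of the candidate is supported by the reference, which suits translation; recall asks how much of the reference the candidate covers, which suits summaries.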

🔷 Once the metrics have been introduced, the course moves on to LangSmith, first to monitor the internal calls of an agent created with LangChain, and then to measure the quality of summaries using the distance between embeddings.

This second example introduces LangSmith's evaluators and shows how to use them to measure more than one metric at a time and to detect harmful content in summaries.

▪️ Evaluating the quality of summaries using embedding distance with LangSmith.
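
The idea behind embedding distance can be shown with a few lines of arithmetic. This is a sketch, not LangSmith's evaluator: the three-dimensional vectors are made-up stand-ins for real embeddings, and cosine distance is assumed as the distance measure.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


# Made-up vectors standing in for embeddings of a reference summary,
# a summary with the same meaning, and an unrelated summary.
reference_vec = [1.0, 0.0, 1.0]
close_summary_vec = [2.0, 0.0, 2.0]
off_topic_vec = [0.0, 1.0, 0.0]

d_close = cosine_distance(reference_vec, close_summary_vec)
d_far = cosine_distance(reference_vec, off_topic_vec)
```

A smaller distance means the generated summary points in nearly the same direction as the reference in embedding space, so the score rewards similar meaning even when the wording differs.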

🔷 Finally, the course presents Giskard, a powerful tool that serves, among other things, to evaluate RAG solutions. Like LangSmith, Giskard uses Large Language Models to evaluate other Large Language Models.

This is one of the evaluation trends that seems to be gaining the most traction.

▪️ Evaluating a RAG solution with Giskard.

The evaluation of tools built with Language Models is one of the fastest-evolving fields. Judging whether a result is correct is complex enough that it often leads to relying on one of the most advanced Large Language Models to evaluate the results of other, more specialized ones.
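
The pattern of one model grading another can be sketched as follows. This is not the Giskard or LangSmith API: the prompt template, the `SCORE: <n>` reply format, and `fake_judge` (a hypothetical stand-in for a call to a strong LLM) are all assumptions made for illustration.

```python
import re
from typing import Callable, Optional

JUDGE_TEMPLATE = (
    "You are grading an answer.\n"
    "Question: {question}\n"
    "Reference: {reference}\n"
    "Answer: {answer}\n"
    "Reply with a score from 1 to 5 as 'SCORE: <n>'."
)


def judge(
    question: str,
    reference: str,
    answer: str,
    judge_llm: Callable[[str], str],
) -> Optional[int]:
    """Ask the judge model for a score; return None if the reply is unparseable."""
    reply = judge_llm(
        JUDGE_TEMPLATE.format(question=question, reference=reference, answer=answer)
    )
    match = re.search(r"SCORE:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def fake_judge(prompt: str) -> str:
    """Hypothetical stand-in: a real judge would be an LLM call."""
    return "SCORE: 5" if "Answer: Paris" in prompt else "I am not sure."


good = judge("Capital of France?", "Paris", "Paris", fake_judge)
unparsed = judge("Capital of France?", "Paris", "Lyon", fake_judge)
```

Returning `None` for a reply that does not follow the format, instead of guessing a score, keeps parsing failures visible so they can be counted separately from low scores.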

In these examples, you see everything from the most classic metrics to the latest tools that not only evaluate the quality of the text produced by the Large Language Model but also all the layers that are part of a RAG solution.

This is just an introduction because several books could be written about this field. But if you go through all the examples, you will have a fairly broad overview and will have learned about different tools.

This post is licensed under CC BY 4.0 by the author.