LangChain is a framework for developing applications powered by large language models (LLMs).
- `langchain-core`: Base abstractions and LangChain Expression Language.
- Integration packages (e.g. `langchain-openai`, `langchain-anthropic`, etc.): Important integrations have been split into lightweight packages that are co-maintained by the LangChain team and the integration developers.
- `langchain`: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- `langchain-community`: Third-party integrations that are community maintained.
- LangGraph: Build robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. It integrates smoothly with LangChain but can be used without it. Useful class to finish later: https://www.deeplearning.ai/the-batch/how-agents-can-improve-llm-performance/
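To make the nodes-and-edges idea concrete, here is a minimal LangGraph sketch; the state shape and node name are made up for illustration, but `StateGraph`, `START`, and `END` are the real langgraph building blocks.

```python
from typing import TypedDict

from langgraph.graph import END, START, StateGraph


class State(TypedDict):
    text: str


def shout(state: State) -> State:
    # A node is just a function from the current state to a state update.
    return {"text": state["text"].upper()}


builder = StateGraph(State)
builder.add_node("shout", shout)   # nodes do the work
builder.add_edge(START, "shout")   # edges define the control flow
builder.add_edge("shout", END)

graph = builder.compile()            # a compiled graph is itself a Runnable
print(graph.invoke({"text": "hi"}))  # {'text': 'HI'}
```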
Tutorial 1 - Chat Models and Prompts
ChatModels are instances of LangChain Runnables, which means they expose a standard interface for interacting with them. To simply call the model, we can pass a list of messages to the `.invoke` method.

```python
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

model = ChatAnthropic(model="claude-3-5-sonnet-20240620")

messages = [
    SystemMessage("Translate the following from English into Italian"),
    HumanMessage("hi!"),
]

model.invoke(messages)
```
This is basically like the Anthropic API: we pass in the messages and get the response back after it is fully generated. Streaming is also available.
Streaming
```python
for token in model.stream(messages):
    print(token.content, end="|")
```
Prompt Templates
Right now we are passing in the whole prompt, like "Translate the following from English into Italian". LangChain's prompt templates let us reuse these prompts with different parameters, kind of like a function. So here, I can set a template where I choose the language:

```python
from langchain_core.prompts import ChatPromptTemplate

system_template = "Translate the following from English into {language}"

prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

# Filling in the template produces the messages we pass to the model.
# To see the messages, run prompt.to_messages()
prompt = prompt_template.invoke({"language": "Italian", "text": "hi!"})

response = model.invoke(prompt)
print(response.content)
```
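To check what the filled-in template actually produced, we can unpack the prompt value with `to_messages()`; the output shown in the comments is what I'd expect for the inputs above:

```python
for message in prompt.to_messages():
    # Each entry is a message object with a role-specific class and content.
    print(type(message).__name__, "->", message.content)
# SystemMessage -> Translate the following from English into Italian
# HumanMessage -> hi!
```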
Tutorial 2 - Semantic Search
Documents and Document Loaders
LangChain implements a Document abstraction, which is intended to represent a unit of text and associated metadata. It has three attributes:

- `page_content`: a string representing the content;
- `metadata`: a dict containing arbitrary metadata;
- `id`: (optional) a string identifier for the document.

The `metadata` attribute can capture information about the source of the document, its relationship to other documents, and other information.

```python
from langchain_core.documents import Document

documents = [
    Document(
        page_content="Dogs are great companions, known for their loyalty and friendliness.",
        metadata={"source": "mammal-pets-doc"},
    ),
    Document(
        page_content="Cats are independent pets that often enjoy their own space.",
        metadata={"source": "mammal-pets-doc"},
    ),
]
```
LangChain also provides document loaders that integrate with hundreds of common sources, which makes it easy to incorporate data from these sources into your AI application.
```python
from langchain_community.document_loaders import PyPDFLoader

file_path = "./data/nke-10k-2023.pdf"
loader = PyPDFLoader(file_path)

docs = loader.load()
print(len(docs))
```
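A quick way to see what one of these per-page Documents looks like; the exact metadata fields depend on the loader and the PDF:

```python
# Peek at the first page: a content snippet plus the loader-supplied metadata.
print(docs[0].page_content[:200])
print(docs[0].metadata)
```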
`PyPDFLoader` loads one `Document` object per PDF page.

Splitting
For both information retrieval and downstream question-answering purposes, a page may be too coarse a representation. Our goal in the end will be to retrieve `Document` objects that answer an input query, and further splitting our PDF will help ensure that the meanings of relevant portions of the document are not "washed out" by surrounding text.

We can use text splitters for this purpose. Here we will use a simple text splitter that partitions based on characters, splitting our documents into chunks of 1000 characters with 200 characters of overlap between chunks. The overlap helps mitigate the possibility of separating a statement from important context related to it. We use the RecursiveCharacterTextSplitter, which recursively splits the document using common separators like new lines until each chunk is the appropriate size. This is the recommended text splitter for generic text use cases.
```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    add_start_index=True,
)
all_splits = text_splitter.split_documents(docs)

len(all_splits)
```
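Because we passed `add_start_index=True`, each chunk's metadata should carry the character offset where it began in its source Document; a quick check (the example output is illustrative, not verified):

```python
# start_index records where this chunk starts within the original page.
print(all_splits[0].metadata)
# e.g. {'source': './data/nke-10k-2023.pdf', 'page': 0, 'start_index': 0}
```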
Embeddings
Vector search is a common way to store and search over unstructured data (such as unstructured text). The idea is to store numeric vectors that are associated with the text. Given a query, we can embed it as a vector of the same dimension and use vector similarity metrics (such as cosine similarity) to identify related text.
```python
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

vector_1 = embeddings.embed_query(all_splits[0].page_content)
vector_2 = embeddings.embed_query(all_splits[1].page_content)

assert len(vector_1) == len(vector_2)
print(f"Generated vectors of length {len(vector_1)}\n")
print(vector_1[:10])
```
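To make "vector similarity metrics" concrete, here is a hand-rolled cosine similarity applied to the two vectors above; it assumes numpy is installed, and it's just the normalized dot product, not a LangChain API:

```python
import numpy as np


def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); values near 1.0 mean "very similar".
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


print(cosine_similarity(vector_1, vector_2))
```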
Vector Stores
VectorStore objects contain methods for adding text and `Document` objects to the store, and for querying them using various similarity metrics. They are often initialized with embedding models, which determine how text data is translated to numeric vectors. This seems to be the main way to interact with vector databases; vectors can also be stored locally, as with the in-memory store below.

```python
from langchain_core.vectorstores import InMemoryVectorStore

vector_store = InMemoryVectorStore(embeddings)
ids = vector_store.add_documents(documents=all_splits)
```
VectorStore includes methods for querying:
- Synchronously and asynchronously;
- By string query and by vector;
- With and without returning similarity scores;
- By similarity and maximum marginal relevance (to balance similarity to the query with diversity in retrieved results); some of these variants are sketched after the example below.
Embeddings store text as "dense" vectors, such that texts with similar meanings are geometrically close. This means we can search with a plain natural-language query:
```python
results = vector_store.similarity_search(
    "How many distribution centers does Nike have in the US?"
)
print(results[0])
```
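Sketches of a few of the query variants listed above, run against the same in-memory store; the method names are the standard VectorStore API, but score semantics vary by store and metric:

```python
query = "How many distribution centers does Nike have in the US?"

# With similarity scores (whether higher or lower is better depends on the store).
doc, score = vector_store.similarity_search_with_score(query)[0]
print(score, doc.metadata)

# By vector: embed the query yourself, then search with the raw vector.
query_vector = embeddings.embed_query(query)
results = vector_store.similarity_search_by_vector(query_vector)
print(results[0].metadata)

# Asynchronously (inside a notebook or an async function):
# results = await vector_store.asimilarity_search(query)
```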
Runnables
The Runnable interface is the foundation for working with LangChain components, and it's implemented across many of them, such as language models, output parsers, retrievers, compiled LangGraph graphs and more.
It defines a standard set of operations that allow a Runnable component to be:
- Invoked: A single input is transformed into an output.
- Batched: Multiple inputs are efficiently transformed into outputs.
- Streamed: Outputs are streamed as they are produced.
- Inspected: Schematic information about a Runnable's input, output, and configuration can be accessed.
- Composed: Multiple Runnables can be composed to work together using the LangChain Expression Language (LCEL) to create complex pipelines.
Please review the LCEL Cheatsheet for some common patterns that involve the Runnable interface and LCEL expressions.
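As a small taste of LCEL composition, the pieces defined earlier in these notes can be piped together; `StrOutputParser` is a real langchain-core class that pulls the string content out of the model's message:

```python
from langchain_core.output_parsers import StrOutputParser

# prompt_template and model come from Tutorial 1 above.
chain = prompt_template | model | StrOutputParser()

print(chain.invoke({"language": "Italian", "text": "hi!"}))  # invoked

print(chain.batch([                                          # batched
    {"language": "Italian", "text": "hi!"},
    {"language": "French", "text": "bye!"},
]))

for chunk in chain.stream({"language": "Italian", "text": "hi!"}):  # streamed
    print(chunk, end="|")
```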
A quick way to make a function into a Runnable:
```python
from typing import List

from langchain_core.documents import Document
from langchain_core.runnables import chain


@chain
def retriever(query: str) -> List[Document]:
    return vector_store.similarity_search(query, k=1)


retriever.batch(
    [
        "How many distribution centers does Nike have in the US?",
        "When was Nike incorporated?",
    ],
)
```
Retrievers
The Retriever class returns Documents given a text query.
Vectorstores implement an `as_retriever` method that will generate a Retriever, specifically a VectorStoreRetriever. These retrievers include specific `search_type` and `search_kwargs` attributes that identify what methods of the underlying vector store to call, and how to parameterize them. For instance, we can replicate the above with the following:

```python
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 1},
)

retriever.batch(
    [
        "How many distribution centers does Nike have in the US?",
        "When was Nike incorporated?",
    ],
)
```
Reading Materials
Retrieval strategies can be rich and complex. For example:
- We can infer hard rules and filters from a query (e.g., "using documents published after 2020");
- We can return documents that are linked to the retrieved context in some way (e.g., via some document taxonomy);
- We can generate multiple embeddings for each unit of context;
- We can ensemble results from multiple retrievers (a sketch follows this list);
- We can assign weights to documents, e.g., to weigh recent documents higher.
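As one concrete instance of ensembling, langchain ships an EnsembleRetriever that rank-fuses the outputs of several retrievers; the sketch below just combines two configurations of the same vector store retriever, so the weights and k values are purely illustrative:

```python
from langchain.retrievers import EnsembleRetriever

# In practice you'd combine different retrieval methods (e.g. BM25 + dense
# vectors); here both retrievers wrap the vector store from above.
retriever_narrow = vector_store.as_retriever(search_kwargs={"k": 1})
retriever_broad = vector_store.as_retriever(search_kwargs={"k": 4})

ensemble = EnsembleRetriever(
    retrievers=[retriever_narrow, retriever_broad],
    weights=[0.5, 0.5],  # relative weight of each retriever in rank fusion
)

docs = ensemble.invoke("When was Nike incorporated?")
print([d.metadata for d in docs])
```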