LangChain is a framework for developing applications powered by large language models (LLMs).
- `langchain-core`: Base abstractions and LangChain Expression Language.
- Integration packages (e.g. `langchain-openai`, `langchain-anthropic`, etc.): Important integrations have been split into lightweight packages that are co-maintained by the LangChain team and the integration developers.
- `langchain`: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- `langchain-community`: Third-party integrations that are community maintained.
- LangGraph: Build robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. It integrates smoothly with LangChain but can be used without it. Useful class to finish later: https://www.deeplearning.ai/the-batch/how-agents-can-improve-llm-performance/
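To make the nodes-and-edges idea concrete, here is a minimal LangGraph sketch; the state shape and node name are made up for illustration, but `StateGraph`, `START`, and `END` are the real langgraph building blocks.

```python
from typing import TypedDict

from langgraph.graph import END, START, StateGraph


class State(TypedDict):
    text: str


def shout(state: State) -> State:
    # A node is just a function from the current state to a state update.
    return {"text": state["text"].upper()}


builder = StateGraph(State)
builder.add_node("shout", shout)   # nodes do the work
builder.add_edge(START, "shout")   # edges define the control flow
builder.add_edge("shout", END)

graph = builder.compile()            # a compiled graph is itself a Runnable
print(graph.invoke({"text": "hi"}))  # {'text': 'HI'}
```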
Tutorial 1 - Chat Models and Prompts
ChatModels are instances of LangChain Runnables, which means they expose a standard interface for interacting with them. To simply call the model, we can pass a list of messages to the `.invoke` method.

```python
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

model = ChatAnthropic(model="claude-3-5-sonnet-20240620")

messages = [
    SystemMessage("Translate the following from English into Italian"),
    HumanMessage("hi!"),
]

model.invoke(messages)
```
This is basically like the Anthropic API: we pass in the messages and get the response back after it is fully generated. Streaming is also available.
Streaming
```python
for token in model.stream(messages):
    print(token.content, end="|")
```
Prompt Templates
Right now we are passing in the whole prompt, like "Translate the following from English into Italian". LangChain's prompt templates let us reuse these prompts with different parameters, kind of like a function. So here, I can set a template where I choose the language:

```python
from langchain_core.prompts import ChatPromptTemplate

system_template = "Translate the following from English into {language}"

prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

# Filling in the template produces the messages we pass to the model.
# To see the messages, run prompt.to_messages()
prompt = prompt_template.invoke({"language": "Italian", "text": "hi!"})

response = model.invoke(prompt)
print(response.content)
```
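To check what the filled-in template actually produced, we can unpack the prompt value with `to_messages()`; the output shown in the comments is what I'd expect for the inputs above:

```python
for message in prompt.to_messages():
    # Each entry is a message object with a role-specific class and content.
    print(type(message).__name__, "->", message.content)
# SystemMessage -> Translate the following from English into Italian
# HumanMessage -> hi!
```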
Tutorial 2 - Semantic Search
Documents and Document Loaders
LangChain implements a Document abstraction, which is intended to represent a unit of text and associated metadata. It has three attributes:

- `page_content`: a string representing the content;
- `metadata`: a dict containing arbitrary metadata;
- `id`: (optional) a string identifier for the document.

The `metadata` attribute can capture information about the source of the document, its relationship to other documents, and other information.

```python
from langchain_core.documents import Document

documents = [
    Document(
        page_content="Dogs are great companions, known for their loyalty and friendliness.",
        metadata={"source": "mammal-pets-doc"},
    ),
    Document(
        page_content="Cats are independent pets that often enjoy their own space.",
        metadata={"source": "mammal-pets-doc"},
    ),
]
```
LangChain also provides document loaders that integrate with hundreds of common sources, which makes it easy to incorporate data from these sources into your AI application.
```python
from langchain_community.document_loaders import PyPDFLoader

file_path = "./data/nke-10k-2023.pdf"
loader = PyPDFLoader(file_path)

docs = loader.load()
print(len(docs))
```
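A quick way to see what one of these per-page Documents looks like; the exact metadata fields depend on the loader and the PDF:

```python
# Peek at the first page: a content snippet plus the loader-supplied metadata.
print(docs[0].page_content[:200])
print(docs[0].metadata)
```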
`PyPDFLoader` loads one `Document` object per PDF page.

Splitting
For both information retrieval and downstream question-answering purposes, a page may be too coarse a representation. Our goal in the end will be to retrieve `Document` objects that answer an input query, and further splitting our PDF will help ensure that the meanings of relevant portions of the document are not "washed out" by surrounding text.

We can use text splitters for this purpose. Here we will use a simple text splitter that partitions based on characters, splitting our documents into chunks of 1000 characters with 200 characters of overlap between chunks. The overlap helps mitigate the possibility of separating a statement from important context related to it. We use the RecursiveCharacterTextSplitter, which recursively splits the document using common separators like new lines until each chunk is the appropriate size. This is the recommended text splitter for generic text use cases.
```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    add_start_index=True,
)
all_splits = text_splitter.split_documents(docs)

len(all_splits)
```
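Because we passed `add_start_index=True`, each chunk's metadata should carry the character offset where it began in its source Document; a quick check (the example output is illustrative, not verified):

```python
# start_index records where this chunk starts within the original page.
print(all_splits[0].metadata)
# e.g. {'source': './data/nke-10k-2023.pdf', 'page': 0, 'start_index': 0}
```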
Embeddings
Vector search is a common way to store and search over unstructured data (such as unstructured text). The idea is to store numeric vectors that are associated with the text. Given a query, we can embed it as a vector of the same dimension and use vector similarity metrics (such as cosine similarity) to identify related text.
```python
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

vector_1 = embeddings.embed_query(all_splits[0].page_content)
vector_2 = embeddings.embed_query(all_splits[1].page_content)

assert len(vector_1) == len(vector_2)
print(f"Generated vectors of length {len(vector_1)}\n")
print(vector_1[:10])
```
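To make "vector similarity metrics" concrete, here is a hand-rolled cosine similarity applied to the two vectors above; it assumes numpy is installed, and it's just the normalized dot product, not a LangChain API:

```python
import numpy as np


def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); values near 1.0 mean "very similar".
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


print(cosine_similarity(vector_1, vector_2))
```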
Vector Stores
VectorStore objects contain methods for adding text and `Document` objects to the store, and for querying them using various similarity metrics. They are often initialized with embedding models, which determine how text data is translated to numeric vectors. This seems to be the main way to interact with vector databases; vectors can also be stored locally, as with the in-memory store below.

```python
from langchain_core.vectorstores import InMemoryVectorStore

vector_store = InMemoryVectorStore(embeddings)
ids = vector_store.add_documents(documents=all_splits)
```
VectorStore includes methods for querying:
- Synchronously and asynchronously;
- By string query and by vector;
- With and without returning similarity scores;
- By similarity and maximum marginal relevance (to balance similarity to the query with diversity in retrieved results); some of these variants are sketched after the example below.
Embeddings store text as "dense" vectors, such that texts with similar meanings are geometrically close. This means we can search with a plain natural-language query:
```python
results = vector_store.similarity_search(
    "How many distribution centers does Nike have in the US?"
)
print(results[0])
```
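Sketches of a few of the query variants listed above, run against the same in-memory store; the method names are the standard VectorStore API, but score semantics vary by store and metric:

```python
query = "How many distribution centers does Nike have in the US?"

# With similarity scores (whether higher or lower is better depends on the store).
doc, score = vector_store.similarity_search_with_score(query)[0]
print(score, doc.metadata)

# By vector: embed the query yourself, then search with the raw vector.
query_vector = embeddings.embed_query(query)
results = vector_store.similarity_search_by_vector(query_vector)
print(results[0].metadata)

# Asynchronously (inside a notebook or an async function):
# results = await vector_store.asimilarity_search(query)
```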
Runnables
The Runnable interface is the foundation for working with LangChain components, and it's implemented across many of them, such as language models, output parsers, retrievers, compiled LangGraph graphs and more.
It defines a standard set of operations that allow a Runnable component to be:
- Invoked: A single input is transformed into an output.
- Batched: Multiple inputs are efficiently transformed into outputs.
- Streamed: Outputs are streamed as they are produced.
- Inspected: Schematic information about a Runnable's input, output, and configuration can be accessed.
- Composed: Multiple Runnables can be composed to work together using the LangChain Expression Language (LCEL) to create complex pipelines.
Please review the LCEL Cheatsheet for some common patterns that involve the Runnable interface and LCEL expressions.
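As a small taste of LCEL composition, the pieces defined earlier in these notes can be piped together; `StrOutputParser` is a real langchain-core class that pulls the string content out of the model's message:

```python
from langchain_core.output_parsers import StrOutputParser

# prompt_template and model come from Tutorial 1 above.
chain = prompt_template | model | StrOutputParser()

print(chain.invoke({"language": "Italian", "text": "hi!"}))  # invoked

print(chain.batch([                                          # batched
    {"language": "Italian", "text": "hi!"},
    {"language": "French", "text": "bye!"},
]))

for chunk in chain.stream({"language": "Italian", "text": "hi!"}):  # streamed
    print(chunk, end="|")
```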
A quick way to make a function into a Runnable:
```python
from typing import List

from langchain_core.documents import Document
from langchain_core.runnables import chain


@chain
def retriever(query: str) -> List[Document]:
    return vector_store.similarity_search(query, k=1)


retriever.batch(
    [
        "How many distribution centers does Nike have in the US?",
        "When was Nike incorporated?",
    ],
)
```
Retrievers
The Retriever class returns Documents given a text query.
Vectorstores implement an `as_retriever` method that will generate a Retriever, specifically a VectorStoreRetriever. These retrievers include specific `search_type` and `search_kwargs` attributes that identify what methods of the underlying vector store to call, and how to parameterize them. For instance, we can replicate the above with the following:

```python
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 1},
)

retriever.batch(
    [
        "How many distribution centers does Nike have in the US?",
        "When was Nike incorporated?",
    ],
)
```
Reading Materials
Retrieval strategies can be rich and complex. For example:
- We can infer hard rules and filters from a query (e.g., "using documents published after 2020");
- We can return documents that are linked to the retrieved context in some way (e.g., via some document taxonomy);
- We can generate multiple embeddings for each unit of context;
- We can ensemble results from multiple retrievers (a sketch follows this list);
- We can assign weights to documents, e.g., to weigh recent documents higher.
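As one concrete instance of ensembling, langchain ships an EnsembleRetriever that rank-fuses the outputs of several retrievers; the sketch below just combines two configurations of the same vector store retriever, so the weights and k values are purely illustrative:

```python
from langchain.retrievers import EnsembleRetriever

# In practice you'd combine different retrieval methods (e.g. BM25 + dense
# vectors); here both retrievers wrap the vector store from above.
retriever_narrow = vector_store.as_retriever(search_kwargs={"k": 1})
retriever_broad = vector_store.as_retriever(search_kwargs={"k": 4})

ensemble = EnsembleRetriever(
    retrievers=[retriever_narrow, retriever_broad],
    weights=[0.5, 0.5],  # relative weight of each retriever in rank fusion
)

docs = ensemble.invoke("When was Nike incorporated?")
print([d.metadata for d in docs])
```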