LangChain Decoded: Part 5 - Memory
In this multi-part series, I explore various LangChain modules and use cases, and document my journey via Python notebooks on GitHub. The previous post covered LangChain Indexes; this post explores Memory. Feel free to follow along and fork the repository, or use individual notebooks on Google Colab. Shoutout to the official LangChain documentation though - much of the code is borrowed or influenced by it, and I'm thankful for the clarity it offers.
Over the course of this series, I'll dive into the following topics:
- Models
- Embeddings
- Prompts
- Indexes
- Memory (this post)
- Chains
- Agents
- Callbacks
Getting Started
LangChain is available on PyPI, so it can be easily installed with pip. By default, the dependencies (e.g. model providers, data stores) are not installed, and should be installed separately based on your specific needs. LangChain also offers an implementation in JavaScript, but we'll only use the Python libraries here.
LangChain supports several model providers, but this tutorial will only focus on OpenAI (unless explicitly stated otherwise). Set the OpenAI API key via the OPENAI_API_KEY environment variable, or directly inside the notebook (or your Python code); if you don't have the key, you can get it here. The first option is preferred in general, and especially in production - do not commit your API key accidentally to GitHub!
Follow along in your own Jupyter Python notebook, or click the link below to open the notebook directly in Google Colab.
# Install the LangChain package
!pip install langchain
# Install the OpenAI package
!pip install openai
# Configure the API key
import os
openai_api_key = os.environ.get('OPENAI_API_KEY', 'sk-XXX')
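Because the snippet above falls back to the placeholder 'sk-XXX' when the variable is unset, a missing key would only surface later as an authentication error. Here is a minimal sketch of a fail-fast check; require_api_key is a hypothetical helper written for this post, not part of LangChain or the OpenAI package.

```python
import os

def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the key from `env`, or raise if it is missing or a placeholder.

    Hypothetical helper for illustration; not part of LangChain or openai.
    """
    key = env.get(name, "").strip()
    if not key or key == "sk-XXX":
        raise RuntimeError(f"Set the {name} environment variable before running the notebook.")
    return key

# Example with an explicit mapping instead of the real environment
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In the notebook, you could call require_api_key() with no arguments right after the configuration cell.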
LangChain Memory
Much like the underlying LLMs and chat models, LangChain modules like chains and agents are stateless and do not persist information over time. In applications like chatbots, persisting state is critical, both for a coherent chat experience and for ensuring that the chatbot retains context during the conversation (and sometimes even longer than that). LangChain offers the Memory module to help with this - it provides wrappers for ingesting, storing, transforming, and retrieving memory, and also makes it easy to integrate the wrappers into chains.
ChatMessageHistory is one of the core classes for managing memory - it exposes methods for storing and retrieving user and AI chat messages. If you aren't using LangChain chains, this would be the class to use. Here's a simple example to illustrate the usage.
# Store and retrieve chat messages with ChatMessageHistory
from langchain.memory import ChatMessageHistory
history = ChatMessageHistory()
history.add_user_message("Hello")
history.add_ai_message("Hi, how can I help you?")
history.add_user_message("I want to write Python code.")
history.add_ai_message("Sure, I can help with that. What do you want to code?")
history.messages
*** Response ***
[HumanMessage(content='Hello', additional_kwargs={}, example=False),
AIMessage(content='Hi, how can I help you?', additional_kwargs={}, example=False),
HumanMessage(content='I want to write Python code.', additional_kwargs={}, example=False),
AIMessage(content='Sure, I can help with that. What do you want to code?', additional_kwargs={}, example=False)]
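To make the idea concrete, here is a rough sketch of what a message history boils down to: an ordered list of role-tagged messages with a couple of convenience methods. The Message and ToyMessageHistory names are my own, and this is a simplified stand-in for illustration, not LangChain's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Message:
    role: str      # "human" or "ai"
    content: str

@dataclass
class ToyMessageHistory:
    """Simplified stand-in for illustration; not LangChain's implementation."""
    messages: List[Message] = field(default_factory=list)

    def add_user_message(self, text: str) -> None:
        self.messages.append(Message("human", text))

    def add_ai_message(self, text: str) -> None:
        self.messages.append(Message("ai", text))

    def clear(self) -> None:
        self.messages = []

history = ToyMessageHistory()
history.add_user_message("Hello")
history.add_ai_message("Hi, how can I help you?")
print([(m.role, m.content) for m in history.messages])
```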
If you are using chains though, ConversationBufferMemory is a simple wrapper around ChatMessageHistory that extracts the messages into a variable, so the message history can be used in the chain.
# Retrieve chat messages with ConversationBufferMemory (as a variable)
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("Hello")
memory.chat_memory.add_ai_message("Hi, how can I help you?")
memory.chat_memory.add_user_message("I want to write Python code.")
memory.chat_memory.add_ai_message("Sure, I can help with that. What do you want to code?")
memory.load_memory_variables({})
*** Response ***
{'history': 'Human: Hello\nAI: Hi, how can I help you?\nHuman: I want to write Python code.\nAI: Sure, I can help with that. What do you want to code?'}
Instead of a single string variable, you can also retrieve the history as a list of messages with the return_messages=True parameter.
# Retrieve chat messages with ConversationBufferMemory (as a list of messages)
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(return_messages=True)
memory.chat_memory.add_user_message("Hello")
memory.chat_memory.add_ai_message("Hi, how can I help you?")
memory.chat_memory.add_user_message("I want to write Python code.")
memory.chat_memory.add_ai_message("Sure, I can help with that. What do you want to code?")
memory.load_memory_variables({})
*** Response ***
{'history': [HumanMessage(content='Hello', additional_kwargs={}, example=False),
AIMessage(content='Hi, how can I help you?', additional_kwargs={}, example=False),
HumanMessage(content='I want to write Python code.', additional_kwargs={}, example=False),
AIMessage(content='Sure, I can help with that. What do you want to code?', additional_kwargs={}, example=False)]}
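The two shapes above differ only in how the same messages are packaged. The sketch below uses plain Python to show that difference; render_history and its prefix parameters are illustrative names of my own, not LangChain functions.

```python
def render_history(turns, human_prefix="Human", ai_prefix="AI"):
    """Join (role, text) pairs into one prompt-ready string."""
    prefixes = {"human": human_prefix, "ai": ai_prefix}
    return "\n".join(f"{prefixes[role]}: {text}" for role, text in turns)

turns = [
    ("human", "Hello"),
    ("ai", "Hi, how can I help you?"),
    ("human", "I want to write Python code."),
    ("ai", "Sure, I can help with that. What do you want to code?"),
]

as_string = {"history": render_history(turns)}  # comparable to the default mode
as_list = {"history": list(turns)}              # comparable to return_messages=True
print(as_string["history"])
```

The string form drops straight into a text prompt template, while the list form is the more natural fit when the messages are passed to a chat model one by one.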
Here's an example of how you would use ConversationBufferMemory in a conversation chain to store and retrieve message history between users and your application. Pass the verbose=True parameter when instantiating the chain if you want to see the formatted prompt (and watch the chat build up).
# Use ConversationBufferMemory in a chain
from langchain.llms.openai import OpenAI
from langchain.chains import ConversationChain
llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())
conversation.predict(input="Hello")
*** Response ***
Hi there! It's nice to meet you. How can I help you today?
The predict() method lets you hold and advance a conversation using the instantiated chain. Run it a few times in the notebook with different inputs and explore the chain outputs.
conversation.predict(input="I want to write Python code.")
*** Response ***
Sure thing! Python is a great language to learn. Do you have any experience with coding?
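Since the model itself is stateless, the chain has to re-send the stored history with every call. The following sketch mimics that loop with a fake model so it runs without an API key; ToyConversation, fake_llm, and TEMPLATE are hypothetical names, and the real ConversationChain does more than this.

```python
TEMPLATE = "Current conversation:\n{history}\nHuman: {input}\nAI:"

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: reports how much history the prompt carries."""
    earlier = prompt.count("Human:") - 1  # the last "Human:" is the current input
    return f"I can see {earlier} earlier human message(s)."

class ToyConversation:
    """Illustrative only; not LangChain's ConversationChain."""

    def __init__(self, llm):
        self.llm = llm
        self.lines = []  # the "memory": plain transcript lines

    def predict(self, input: str) -> str:  # parameter name mirrors predict(input=...)
        prompt = TEMPLATE.format(history="\n".join(self.lines), input=input)
        reply = self.llm(prompt)
        # Save both sides of the turn so the next prompt includes them
        self.lines += [f"Human: {input}", f"AI: {reply}"]
        return reply

chat = ToyConversation(fake_llm)
print(chat.predict(input="Hello"))
print(chat.predict(input="I want to write Python code."))
```

The second reply reports one earlier message only because the first turn was written back into memory; without that step, every call would start from scratch.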
Given the current limitations with LLM context windows, it could be useful to summarize conversations as you go along - the ConversationSummaryMemory class does exactly that. Instead of storing all past messages in memory, this class helps you condense the conversation and store a smaller volume of text instead.
# Store a conversation summary with ConversationSummaryMemory
from langchain.llms.openai import OpenAI
from langchain.memory import ConversationSummaryMemory
llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
memory = ConversationSummaryMemory(llm=llm)
memory.save_context({"input": "Hello"}, {"output": "Hi, how can I help you?"})
memory.load_memory_variables({})
*** Response ***
{'history': '\nThe human greets the AI, to which the AI responds by asking how it can help.'}
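As before, use the predict() method to advance the conversation. Note that the conversation object at this point is still the chain created earlier with ConversationBufferMemory; the summary memory is attached to a chain in the next example.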
conversation.predict(input="I want to write Python code.")
*** Response ***
Sure, I can help you with that. Do you have any experience with coding in general, or is this your first time?
As with ConversationBufferMemory, you can also use ConversationSummaryMemory in a chain - here's a simple example. I've set the verbose=True parameter so you can see the prompt details as we go along.
# Use ConversationSummaryMemory in a chain
from langchain.llms.openai import OpenAI
from langchain.chains import ConversationChain
llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
memory = ConversationSummaryMemory(llm=llm)
conversation = ConversationChain(llm=llm, verbose=True, memory=memory)
conversation.predict(input="Hello")
*** Response ***
> Entering new chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hello
AI:
> Finished chain.
Hi there! It's nice to meet you. How can I help you today?
After the initial greeting, ask the AI to perform a specific task, say write Python code. The AI responds accordingly, but you will also see the conversation summary being generated along the way ("current conversation").
conversation.predict(input="I want to write Python code.")
*** Response ***
> Entering new chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
The human greets the AI, to which the AI responds with a friendly greeting and an offer to help.
Human: I want to write Python code.
AI:
> Finished chain.
Great! I can help you with that. Do you have any experience with Python programming?
Continue the conversation; you'll see the summary change over time to include the context from newer chat messages. This allows us to keep track of very long conversations without losing the essence or the context along the way.
conversation.predict(input="No, I'm a beginner.")
*** Response ***
> Entering new chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
The human greets the AI, to which the AI responds with a friendly greeting and an offer to help. The human then expresses a desire to write Python code, to which the AI responds positively and inquires about the human's experience with Python programming.
Human: No, I'm a beginner.
AI:
> Finished chain.
That's great! I'm happy to help you get started with Python programming. Do you have any questions about it?
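The key property of the summary approach is that the stored state is a single string that gets rewritten each turn, so it does not grow with the number of messages. Here is a toy sketch of that update loop. A crude tail-truncation function stands in for the LLM summarization call, and all names here (toy_summarizer, ToySummaryMemory) are my own rather than LangChain's.

```python
def toy_summarizer(previous_summary: str, new_lines: str, max_chars: int = 120) -> str:
    """Stand-in for the LLM summarization call: merge, then keep only the tail."""
    merged = (previous_summary + " " + new_lines).strip()
    return merged[-max_chars:]

class ToySummaryMemory:
    """Illustrative only; not LangChain's ConversationSummaryMemory."""

    def __init__(self, summarize):
        self.summarize = summarize
        self.summary = ""  # the only state kept between turns

    def save_context(self, human: str, ai: str) -> None:
        new_lines = f"Human: {human} AI: {ai}"
        # Fold the latest exchange into the running summary
        self.summary = self.summarize(self.summary, new_lines)

memory = ToySummaryMemory(toy_summarizer)
for i in range(50):
    memory.save_context(f"question {i}", f"answer {i}")
print(len(memory.summary))  # stays bounded by max_chars regardless of turn count
```

A real summarizer keeps the gist rather than the tail, at the cost of an extra model call per turn.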
Of course, this information is persisted in memory only for the duration of the conversation (session). If you want to persist the memory for a much longer period, you need a dedicated, external store. This is where open-source (and commercial) memory servers like Zep and Motörhead come in handy. Here is an example with Motörhead - I'm using the managed version here, so you'll need to sign up for an account with Metal and get the API key here.
# Managed Motorhead memory
from langchain.memory.motorhead_memory import MotorheadMemory
from langchain import OpenAI, LLMChain, PromptTemplate
template = """You are a chatbot having a conversation with a human.
{chat_history}
Human: {human_input}
AI:"""
prompt = PromptTemplate(input_variables=["chat_history", "human_input"], template=template)
memory = MotorheadMemory(
    api_key="API_KEY",
    client_id="CLIENT_ID",
    session_id="langchain-1",
    memory_key="chat_history",
)
# Load any previous state for this session; top-level await works in a notebook cell
await memory.init()
llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
llm_chain = LLMChain(llm=llm, prompt=prompt, memory=memory)
llm_chain.run("Hello, I'm Motorhead.")
*** Response ***
Hi Motorhead, it's nice to meet you. Is there something specific you want to talk about?
Instead of the predict() method, we'll use the chain's run() method to advance the conversation here. Metal's free plan is sufficient to get started and test the memory storage and retrieval capabilities.
llm_chain.run("What's my name?")
*** Response ***
Your name is Motorhead. Is there anything else I can help you with?
If you're self-hosting the Motörhead server though, just replace the api_key and client_id parameters with url when instantiating the class; excerpt below.
memory = MotorheadMemory(
    url="https://motorhead.example.com",
    session_id="langchain-1",
    memory_key="chat_history",
)
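The session_id is what lets an external store hand the right history back after your process restarts. As a rough illustration of that idea, the sketch below keeps one JSON file per session on local disk. ToySessionStore is a hypothetical class of my own; it is not Motörhead's API and makes no attempt at concurrency or summarization.

```python
import json
import tempfile
from pathlib import Path

class ToySessionStore:
    """File-backed store keyed by session ID; illustrative only, not Motörhead's API."""

    def __init__(self, root: Path):
        self.root = root

    def _path(self, session_id: str) -> Path:
        return self.root / f"{session_id}.json"

    def load(self, session_id: str) -> list:
        path = self._path(session_id)
        return json.loads(path.read_text()) if path.exists() else []

    def append(self, session_id: str, role: str, content: str) -> None:
        messages = self.load(session_id)
        messages.append({"role": role, "content": content})
        self._path(session_id).write_text(json.dumps(messages))

root = Path(tempfile.mkdtemp())
ToySessionStore(root).append("langchain-1", "human", "Hello, I'm Motorhead.")

# A brand-new store object (think: a restarted process) still sees the message
restored = ToySessionStore(root).load("langchain-1")
print(restored)
```

Reusing the same session ID resumes the earlier conversation, while a new ID starts with an empty history.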
LangChain provides a few other classes to help with specific use cases, e.g. the ConversationEntityMemory class helps extract information on specific entities in a conversation, while the ConversationKGMemory class uses a knowledge graph to recreate memory. LangChain also offers built-in integrations with other memory stores like DynamoDB, MongoDB, Cassandra, and more; see the official docs for details. That concludes this tutorial on memory.
The next post in this series covers LangChain Chains - do follow along if you liked this post. Finally, check out this handy compendium of all LangChain posts.