Agentic Amnesia: How an Effective Memory Strategy Can Take Your AI System to a New Cognitive Level


Human memory is a fascinating yet mysterious system: studies estimate our memory capacity to be the equivalent of 2.5 million gigabytes, we forget 50% of all new information within an hour of learning it, and left-handed people on average have better memory (a fact that your left-handed author is happy to share!).

Despite being such a vital structure, memory is often overlooked when designing agentic AI systems, even though we're often trying to emulate the human approach to tasks and problem solving.

In this article we'll explore the state of the art of agentic memory and how it can dramatically enhance your experience with agentic AI!

What's the value of agentic memory?

The power and promise of agentic AI comes from flexibility and creativity in the face of diverse tasks, a key advantage over workflow-based systems. We believe that an effective memory structure is essential for AI agents to learn, improve and personalise for the individual users they interact with.

In our internal research and testing, an agent's ability to learn from and remember previous interactions significantly reduces user frustration, leading to smoother application uptake as well as higher usage.

Memory also shines in areas such as agentic AI for contact centres: if an AI agent can remember customer preferences and habits, it is able to deliver a dramatically improved customer experience.

Types of memory

Short-term memory

Short-term memory is the system you may be most familiar with: the passing of previous message turns as context to an LLM, most commonly within a single "chat session" or task.

As the conversational context deepens, possibly through hundreds of message turns, we start to encounter some inherent limitations of the LLM. Each model has a finite context window; although these windows are growing at an astonishing rate with each release, it is still very possible to overload them. When compounded with large documents from RAG and a wide range of tool definitions, you may quickly run out of space to pass in all of these messages.
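One common way to stay within the window is to trim the oldest turns to a token budget. Here is a minimal sketch, assuming a rough four-characters-per-token estimate; a real system would use the model's own tokenizer:

```python
# A minimal sketch of short-term memory as a trimmed message window.
# Token counts are approximated as len(text) // 4; swap in the model's
# actual tokenizer for real budgeting.

def estimate_tokens(message: dict) -> int:
    """Rough token estimate: ~4 characters per token."""
    return max(1, len(message["content"]) // 4)

def trim_to_budget(messages: list[dict], budget: int) -> list[dict]:
    """Keep the most recent turns that fit within the token budget."""
    kept, used = [], 0
    for message in reversed(messages):  # walk newest first
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order
```

Walking newest-first means the most recent turns survive, which is usually what the next LLM call needs most.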

Further issues with this approach will be clear to those with a keen eye for token economics: passing in a very large message context leads to high input token consumption. Your model provider doesn't care whether the context you passed is relevant to your task; an input token is an input token! For those interested in advanced token optimisation, my colleague Derek Ho has an excellent article here: FinOps for GenAI: The Hidden Cost Saving You're Overlooking.
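The arithmetic is worth making concrete. A small illustration, using a purely hypothetical per-token price rather than any provider's actual rate:

```python
# Illustrative input-token cost calculation. The price below is a
# hypothetical placeholder, not a real provider's pricing.
PRICE_PER_1K_INPUT_TOKENS = 0.003  # assumed USD per 1,000 input tokens

def input_cost(context_tokens: int, calls: int) -> float:
    """Cost of resending the same context on every call."""
    return context_tokens * calls * PRICE_PER_1K_INPUT_TOKENS / 1000

# Resending a 50k-token history across 200 calls:
full_history = input_cost(50_000, 200)  # 30.0 USD
# versus a 5k-token summary of the same history:
summarised = input_cost(5_000, 200)     # 3.0 USD
```

The gap scales linearly with both context size and call count, which is why summarisation and selective retrieval pay off quickly in long-running sessions.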

Short-term memory is vital yet limited; in the next section we'll explore the various long-term memory strategies that can help to supplement your agentic systems.

Long-term memory

Long-term memory represents the intentional extraction, processing and storage of useful information, key insights and user preferences from conversations across multiple sessions. How these memories are identified, stored and retrieved varies between systems. These are just a few strategies for handling long-term memory:

Conversation Summaries

A high-level summary of the purpose, procedure and outcomes of each conversation, stored along with the conversation ID. When provided alongside a tool to retrieve an individual conversation, this gives the agent a powerful way to drill down into the user's previous conversations that may be relevant.
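A minimal sketch of this strategy, with illustrative field names rather than any specific product's schema: one summary per conversation, keyed by conversation ID, plus a tool-style lookup into the full transcript.

```python
from dataclasses import dataclass

# Illustrative conversation-summary store. Field names are assumptions
# for the sketch, not a particular service's API.

@dataclass
class ConversationSummary:
    conversation_id: str
    purpose: str
    procedure: str
    outcome: str

summaries: dict[str, ConversationSummary] = {}
transcripts: dict[str, list[dict]] = {}

def save_summary(summary: ConversationSummary, transcript: list[dict]) -> None:
    """Persist the summary and keep the full transcript for drill-down."""
    summaries[summary.conversation_id] = summary
    transcripts[summary.conversation_id] = transcript

def get_conversation(conversation_id: str) -> list[dict]:
    """Tool an agent can call to retrieve a full past conversation."""
    return transcripts.get(conversation_id, [])
```

The summaries are cheap to pass as context on every call; the transcript is only fetched when the agent decides a past conversation is relevant.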

Semantic Memory

A technique to extract key pieces of factual information and contextual knowledge from previous interactions. Enhancing these memories with extra metadata, such as the topics discussed or the tools used, makes them easier to retrieve accurately later.
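A sketch of what such metadata-enriched records might look like; the facts and field names here are illustrative assumptions:

```python
# Sketch of semantic memory: extracted facts enriched with metadata
# (topic, source tools) so they can be filtered at retrieval time.
semantic_memories = [
    {"fact": "User prefers invoices in PDF format",
     "topic": "billing", "tools_used": ["invoice_generator"]},
    {"fact": "User's fiscal year ends in March",
     "topic": "billing", "tools_used": []},
    {"fact": "User's deployment region is eu-west-1",
     "topic": "infrastructure", "tools_used": ["aws_cli"]},
]

def recall(topic: str) -> list[str]:
    """Return stored facts filtered by topic metadata."""
    return [m["fact"] for m in semantic_memories if m["topic"] == topic]
```

Production systems typically combine this metadata filtering with vector similarity search over the fact text itself.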

Episodic Memory

This memory strategy focuses more on the "how" an outcome was achieved, remembering reasoning steps, actions, outcomes and reflections. This allows an agent to avoid repeating mistakes, focus on reasoning chains that lead to success, and complete tasks more efficiently.
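A sketch of an episodic record and how past successes might be surfaced for a similar task; the schema is an illustrative assumption:

```python
# Illustrative episodic memory record: captures how an outcome was
# reached, not just what was learned.
episode = {
    "task": "Generate monthly usage report",
    "reasoning": ["Check data freshness before querying",
                  "Aggregate per team, then per user"],
    "actions": ["query_usage_db", "render_report"],
    "outcome": "success",
    "reflection": "Querying per user first caused a timeout last time; "
                  "aggregating per team first avoided it.",
}

def successful_episodes(episodes: list[dict], task_keyword: str) -> list[dict]:
    """Surface past successes for similar tasks so the agent can reuse
    reasoning chains that worked rather than repeating mistakes."""
    return [e for e in episodes
            if e["outcome"] == "success"
            and task_keyword.lower() in e["task"].lower()]
```

Feeding the matched episodes' reasoning and reflections into the prompt is what lets the agent skip dead ends it has already explored.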

Memory management technologies

Our internal exploration into the world of agentic memory has been built upon AWS Bedrock AgentCore Memory, a new service providing managed extraction, storage and retrieval of memories for agents. A key advantage of AgentCore Memory is the ability to run the identification and extraction of long-term memories asynchronously, meaning the response to the user is not delayed by this stage. Retrieval is fast and accurate, using a powerful vector store under the hood.

How does memory differ from knowledge?

Memory and knowledge serve a similar role: the insertion of extra context and information into an agentic LLM call to improve its output. Where they differ is the source: we define knowledge as human-produced, vetted information such as company guides and documents. Memory, on the other hand, is produced and managed by AI agents themselves.

Advanced and mature agentic systems could use memory to contribute to knowledge by raising common memories to the attention of a human in the loop, who can vet them and add them to the common knowledge base, making them available to all relevant agents in the system.

Memory segregation and security considerations

While AI agents are responsible for the production of memories, it is vital that they remain segregated by user, such that an agent can only access memories for the correct user. Bedrock AgentCore Memory achieves this with the concept of memory namespaces, separating each extracted memory into a hierarchy:

/strategy/{memoryStrategyId}/actor/{actorId}/session/{sessionId}/

The user ID inserted into the memory retrieval tool call should not be managed by the LLM; instead it should be inserted programmatically, guarding against prompt injection that could retrieve memories from an incorrect user's namespace.
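A sketch of that principle, with illustrative names rather than the AgentCore SDK: the actor ID comes from the authenticated session, never from model output, so a prompt-injected "user ID" cannot widen retrieval to another user's memories.

```python
# Illustrative namespace segregation. The store and function names are
# assumptions for this sketch, not the AgentCore Memory API.

def build_namespace(strategy_id: str, session_user_id: str, session_id: str) -> str:
    """Build the retrieval namespace from server-side identity only.
    session_user_id comes from the authenticated session, never the LLM."""
    return f"/strategy/{strategy_id}/actor/{session_user_id}/session/{session_id}/"

def retrieve_memories(store: dict[str, list[str]], namespace: str) -> list[str]:
    """Only return records whose namespace matches the caller's prefix."""
    return [record
            for ns, records in store.items() if ns.startswith(namespace)
            for record in records]
```

Because the namespace is constructed before the tool call reaches the model, there is no string the model can emit that changes which actor's memories are searched.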

Conclusion

If we’re serious about building agents that feel less like clever autocomplete engines and more like capable collaborators, then memory can’t be an afterthought.

Short-term context gets you through a conversation. Long-term memory lets you build a relationship.

When an agent can remember what worked, what failed, what a user prefers, and how similar problems were solved before, we move from isolated interactions to cumulative progress. The system adapts. It personalises. And crucially, it reduces friction in a way workflow-based automation simply can’t.

Of course, memory introduces responsibility. It must be structured, cost-aware, secure, and carefully segregated. The mechanisms we explored, alongside services such as AWS Bedrock AgentCore Memory, show that the tooling is rapidly maturing. The challenge now shifts from can we build memory-enabled agents, to how well we design them.

If we want truly agentic AI, we need to give our agents the ability not just to think, but to remember.

Josh Prout
