Memory-Augmented Agent Patterns

You know how to store memories, retrieve them, and manage context windows. Now it's time to put it all together into production patterns that make agents genuinely intelligent.

A memory-augmented agent isn't just an agent with a database attached. It's an agent that knows when to remember, what to surface, and how to handle the inevitable conflicts between past and present. This lesson covers four patterns that separate toy memory systems from production ones: pre-prompt injection, dynamic retrieval, conflict resolution, and testing strategies.

Pattern 1: Pre-Prompt Memory Injection

The simplest pattern: retrieve relevant memories before the conversation starts and inject them into the system prompt.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     PRE-PROMPT INJECTION                     │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   User Message → Memory Search → Inject into Prompt → LLM   │
│                                                              │
│   ┌─────────────┐    ┌───────────────┐    ┌────────────┐   │
│   │ "What's my  │ →  │ Search:       │ →  │ System:    │   │
│   │  schedule?" │    │ "schedule"    │    │ "User      │   │
│   └─────────────┘    │ + user_id     │    │  prefers   │   │
│                      └───────────────┘    │  morning   │   │
│                             │             │  meetings" │   │
│                             ▼             └────────────┘   │
│                      ┌───────────────┐          │          │
│                      │ Memories:     │          │          │
│                      │ - Preferences │          ▼          │
│                      │ - Projects    │    ┌────────────┐   │
│                      │ - History     │    │ LLM Call   │   │
│                      └───────────────┘    └────────────┘   │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Implementation

from mem0 import Memory
from openai import OpenAI

memory = Memory()
client = OpenAI()

class MemoryAugmentedAgent:
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.base_system_prompt = """You are a helpful task management assistant.
You help users organize their work and stay productive."""

    def build_system_prompt(self, user_message: str) -> str:
        """Inject relevant memories into system prompt."""
        # 1. Retrieve memories relevant to this message
        memories = memory.search(
            query=user_message,
            user_id=self.user_id,
            limit=5
        )

        # 2. Format memories for injection
        memory_context = self.format_memories(memories.get('results', []))

        # 3. Combine base prompt with memory context
        if memory_context:
            return f"""{self.base_system_prompt}

## Context About This User

{memory_context}

Use this context to personalize your responses."""

        return self.base_system_prompt

    def format_memories(self, memories: list) -> str:
        """Format memories for prompt injection."""
        if not memories:
            return ""

        lines = []
        for mem in memories:
            lines.append(f"- {mem['memory']}")

        return "\n".join(lines)

    def chat(self, user_message: str) -> str:
        """Process a chat message with memory augmentation."""
        # Build memory-augmented prompt
        system_prompt = self.build_system_prompt(user_message)

        # Call LLM
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_message}
            ]
        )

        assistant_message = response.choices[0].message.content

        # Store this interaction in memory
        memory.add(
            [
                {"role": "user", "content": user_message},
                {"role": "assistant", "content": assistant_message}
            ],
            user_id=self.user_id
        )

        return assistant_message

Usage

agent = MemoryAugmentedAgent(user_id="alex")

# First interaction
response = agent.chat("I prefer working on tasks in the morning")
# Agent responds and stores preference

# Later interaction (different session)
response = agent.chat("When should I schedule my important tasks?")
# Agent retrieves morning preference and recommends accordingly

When to Use

User preferences that should influence every response
Background context that's always relevant
Simple, predictable memory needs

Limitations

Retrieves before knowing full context of conversation
May inject irrelevant memories
Token budget consumed before conversation starts

Pattern 2: Dynamic Memory Retrieval

More sophisticated: give the agent a tool to retrieve memories on-demand during the conversation.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    DYNAMIC RETRIEVAL                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   User Message → LLM → Tool Call → Memory Search → Continue │
│                                                              │
│   ┌─────────────┐    ┌───────────────┐                      │
│   │ "What did   │ →  │ Agent thinks: │                      │
│   │  we discuss │    │ "Need to      │                      │
│   │  about the  │    │  recall past  │                      │
│   │  deadline?" │    │  discussions" │                      │
│   └─────────────┘    └───────┬───────┘                      │
│                              │                               │
│                              ▼                               │
│                      ┌───────────────┐                      │
│                      │ Tool Call:    │                      │
│                      │ recall_memory │                      │
│                      │ ("deadline    │                      │
│                      │  discussion") │                      │
│                      └───────┬───────┘                      │
│                              │                               │
│                              ▼                               │
│                      ┌───────────────┐    ┌────────────┐   │
│                      │ Memory Result │ →  │ Continue   │   │
│                      │ "Jan 15: Set  │    │ Response   │   │
│                      │  deadline to  │    │ with       │   │
│                      │  Feb 28"      │    │ Context    │   │
│                      └───────────────┘    └────────────┘   │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Implementation with OpenAI Agents SDK

from agents import Agent, Runner, function_tool
from mem0 import Memory

memory = Memory()

def create_memory_agent(user_id: str) -> Agent:
    """Create an agent with memory retrieval capabilities."""

    @function_tool
    def recall_memory(query: str) -> str:
        """Search your memory for relevant past interactions and facts.

        Args:
            query: What to search for in memory (e.g., "user preferences",
                   "project deadlines", "past decisions")
        """
        results = memory.search(
            query=query,
            user_id=user_id,
            limit=5
        )

        memories = results.get('results', [])
        if not memories:
            return "No relevant memories found."

        formatted = []
        for mem in memories:
            formatted.append(f"- {mem['memory']} (recorded: {mem.get('created_at', 'unknown')})")

        return "Relevant memories:\n" + "\n".join(formatted)

    @function_tool
    def store_memory(content: str, category: str = "general") -> str:
        """Store important information for future reference.

        Args:
            content: The information to remember
            category: Category for organization (preferences, projects, decisions, facts)
        """
        memory.add(
            [{"role": "system", "content": content}],
            user_id=user_id,
            metadata={"category": category}
        )
        return f"Stored: {content}"

    return Agent(
        name="MemoryAgent",
        instructions="""You are a task management assistant with memory capabilities.

You can:
1. Recall past interactions using the recall_memory tool
2. Store important information using the store_memory tool

Use recall_memory when:
- User asks about past discussions or decisions
- You need context about user preferences
- Answering requires historical information

Use store_memory when:
- User shares important preferences
- Decisions are made that should be remembered
- Facts are learned that will be useful later

Always check memory before making assumptions about the user.""",
        tools=[recall_memory, store_memory]
    )

# Usage
async def main():
    agent = create_memory_agent("alex")

    # Agent decides when to recall
    result = await Runner.run(
        agent,
        "What did we decide about the project deadline?"
    )
    print(result.final_output)

When to Use

Complex conversations where relevant context isn't obvious upfront
When the agent needs to decide what to remember
Conversations that build on past interactions

Advantages Over Pre-Prompt

Pre-Prompt	Dynamic
Always injects memories	Only retrieves when needed
May waste tokens on irrelevant context	Token-efficient
Simple to implement	More complex
Can't retrieve based on conversation flow	Adapts to conversation

Pattern 3: Memory Conflict Resolution

When memories contradict each other, the agent needs a strategy.

Conflict Types

Type 1: TEMPORAL CONFLICT
├── Old: "User prefers morning meetings" (January)
└── New: "User prefers afternoon meetings" (March)
    Resolution: Newer wins (with acknowledgment)

Type 2: AMBIGUOUS CONFLICT
├── Memory A: "Project deadline is Jan 31"
└── Memory B: "Project deadline is Feb 15"
    Resolution: Ask user for clarification

Type 3: PARTIAL CONFLICT
├── Memory A: "User is vegetarian"
└── Memory B: "User enjoyed the steak dinner"
    Resolution: Check timestamps, may need clarification

Implementation

from mem0 import Memory
from openai import OpenAI

memory = Memory()
client = OpenAI()

class ConflictAwareAgent:
    def __init__(self, user_id: str):
        self.user_id = user_id

    def detect_conflicts(self, memories: list) -> list:
        """Detect potential conflicts between memories."""
        if len(memories) < 2:
            return []

        # Use LLM to detect conflicts
        memory_texts = [f"- {m['memory']} (created: {m.get('created_at', 'unknown')})"
                       for m in memories]

        prompt = f"""Analyze these memories for potential conflicts or contradictions:

{chr(10).join(memory_texts)}

For each conflict found, respond in this format:
CONFLICT: [memory 1 summary] vs [memory 2 summary]
TYPE: temporal | ambiguous | partial
RESOLUTION: newer_wins | ask_user | needs_context

If no conflicts, respond: NO_CONFLICTS"""

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}]
        )

        result = response.choices[0].message.content

        if "NO_CONFLICTS" in result:
            return []

        # Parse conflicts (simplified)
        conflicts = []
        for line in result.split("\n"):
            if line.startswith("CONFLICT:"):
                conflicts.append({"description": line})

        return conflicts

    def resolve_conflict(self, conflict: dict, memories: list) -> str:
        """Generate resolution strategy for a conflict."""
        # For temporal conflicts: prefer newer
        # For ambiguous: prepare clarification question
        # For partial: gather more context

        return f"I noticed some potentially outdated information. Let me clarify: {conflict['description']}"

    def process_with_conflict_check(self, query: str) -> dict:
        """Process query with conflict detection."""
        # 1. Retrieve relevant memories
        results = memory.search(query, user_id=self.user_id, limit=10)
        memories = results.get('results', [])

        # 2. Check for conflicts
        conflicts = self.detect_conflicts(memories)

        # 3. If conflicts, prepare resolution
        if conflicts:
            return {
                "has_conflicts": True,
                "conflicts": conflicts,
                "memories": memories,
                "suggestion": self.resolve_conflict(conflicts[0], memories)
            }

        return {
            "has_conflicts": False,
            "memories": memories
        }

Conflict Resolution Patterns

def handle_temporal_conflict(old_memory, new_memory):
    """Newer information supersedes older."""
    # Delete old memory
    memory.delete(old_memory['id'])

    # Log the change
    memory.add(
        [{"role": "system", "content": f"Updated: {old_memory['memory']} → {new_memory['memory']}"}],
        user_id=user_id,
        metadata={"type": "update_log"}
    )

    return new_memory

def handle_ambiguous_conflict(memory_a, memory_b, user_id):
    """Ask user for clarification."""
    return {
        "action": "clarify",
        "question": f"I have conflicting information: '{memory_a['memory']}' and '{memory_b['memory']}'. Which is correct?",
        "pending_memories": [memory_a['id'], memory_b['id']]
    }

def handle_partial_conflict(memories, user_id):
    """Gather more context before resolving."""
    return {
        "action": "gather_context",
        "memories": memories,
        "note": "These memories may not actually conflict—need more context."
    }

Pattern 4: Memory Testing Strategies

Memory-augmented agents need specialized testing approaches.

Test Categories

1. STORAGE TESTS
├── Memory is stored correctly
├── Metadata is preserved
└── Duplicate detection works

2. RETRIEVAL TESTS
├── Relevant memories are retrieved
├── Irrelevant memories are filtered
└── Token budget is respected

3. CONFLICT TESTS
├── Conflicts are detected
├── Resolution strategies work
└── User is informed appropriately

4. INTEGRATION TESTS
├── Full conversation flow works
├── Memory persists across sessions
└── Agent behavior is personalized

Test Implementation

import pytest
from mem0 import Memory

@pytest.fixture
def memory_client():
    """Fresh memory instance for each test."""
    m = Memory()
    yield m
    # Cleanup after test (if needed)

@pytest.fixture
def test_user_id():
    return "test_user_123"

class TestMemoryStorage:
    def test_add_memory_stores_content(self, memory_client, test_user_id):
        """Verify memories are stored correctly."""
        messages = [
            {"role": "user", "content": "I prefer Python over JavaScript."}
        ]

        result = memory_client.add(messages, user_id=test_user_id)

        assert 'results' in result
        assert len(result['results']) > 0
        assert 'memory' in result['results'][0]

    def test_metadata_is_preserved(self, memory_client, test_user_id):
        """Verify metadata is stored and retrievable."""
        messages = [{"role": "user", "content": "My deadline is January 31st."}]
        metadata = {"category": "deadlines", "project": "Phoenix"}

        memory_client.add(messages, user_id=test_user_id, metadata=metadata)

        results = memory_client.search(
            "deadline",
            user_id=test_user_id,
            filters={"category": "deadlines"}
        )

        assert len(results['results']) > 0

class TestMemoryRetrieval:
    def test_relevant_memories_retrieved(self, memory_client, test_user_id):
        """Verify semantic search returns relevant results."""
        # Store some memories
        memory_client.add(
            [{"role": "user", "content": "I love basketball."}],
            user_id=test_user_id
        )
        memory_client.add(
            [{"role": "user", "content": "My favorite food is pizza."}],
            user_id=test_user_id
        )

        # Search for sports
        results = memory_client.search("sports hobbies", user_id=test_user_id)

        # Basketball should rank higher than pizza
        memories = results['results']
        assert any('basketball' in m['memory'].lower() for m in memories)

    def test_user_isolation(self, memory_client):
        """Verify memories are isolated by user."""
        # Store for user A
        memory_client.add(
            [{"role": "user", "content": "Secret information for user A."}],
            user_id="user_a"
        )

        # Search as user B
        results = memory_client.search("secret", user_id="user_b")

        # User B should not see user A's memories
        assert len(results['results']) == 0

class TestMemoryAugmentedAgent:
    def test_agent_uses_stored_preferences(self, memory_client, test_user_id):
        """Verify agent personalizes based on stored preferences."""
        # Store preference
        memory_client.add(
            [{"role": "user", "content": "I always want tasks scheduled for morning."}],
            user_id=test_user_id,
            metadata={"category": "preferences"}
        )

        # Create agent and ask about scheduling
        agent = MemoryAugmentedAgent(test_user_id)
        response = agent.chat("When should I schedule my important meeting?")

        # Response should reference morning preference
        assert 'morning' in response.lower()

    def test_memory_persists_across_sessions(self, memory_client, test_user_id):
        """Verify memory survives session restart."""
        # Session 1: Store memory
        memory_client.add(
            [{"role": "user", "content": "My name is Alex."}],
            user_id=test_user_id
        )

        # Simulate session restart (new Memory instance)
        new_memory_client = Memory()

        # Session 2: Retrieve memory
        results = new_memory_client.search("user name", user_id=test_user_id)

        assert any('alex' in m['memory'].lower() for m in results['results'])

Testing Best Practices

Practice	Rationale
Use isolated test users	Prevent test pollution
Clean up after tests	Don't accumulate test data
Test semantic relevance	Not just exact matches
Test multi-session	Memory must persist
Test edge cases	Empty results, conflicts

Putting It All Together

A production memory-augmented agent combines all four patterns:

class ProductionMemoryAgent:
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.memory = Memory()

        # Pre-load core user context
        self.core_context = self.load_core_context()

    def load_core_context(self) -> str:
        """Pre-prompt: Load always-relevant context."""
        results = self.memory.search(
            "user preferences and key facts",
            user_id=self.user_id,
            filters={"category": "core"},
            limit=5
        )
        return self.format_memories(results.get('results', []))

    def dynamic_recall(self, query: str) -> str:
        """Dynamic: Agent calls this during conversation."""
        results = self.memory.search(query, user_id=self.user_id, limit=5)
        return self.format_memories(results.get('results', []))

    def check_and_resolve_conflicts(self, new_info: str) -> dict:
        """Conflict: Check for and resolve contradictions."""
        related = self.memory.search(new_info, user_id=self.user_id, limit=10)
        conflicts = self.detect_conflicts(new_info, related.get('results', []))

        if conflicts:
            return self.resolve_conflicts(conflicts)
        return {"status": "no_conflicts"}

    def process(self, user_message: str) -> str:
        """Main processing loop."""
        # 1. Pre-prompt injection (always)
        system_prompt = f"""You are a helpful assistant.

Core user context:
{self.core_context}

You can use the recall_memory function for additional context."""

        # 2. Process with potential dynamic retrieval
        response = self.llm_with_tools(system_prompt, user_message)

        # 3. Check for conflicts before storing
        conflict_check = self.check_and_resolve_conflicts(user_message)

        # 4. Store interaction
        self.store_interaction(user_message, response)

        return response

Try With AI

Use these prompts to practice memory-augmented agent patterns with Claude or your preferred AI assistant.

Prompt 1: Design Memory Injection Strategy

I'm building a customer support agent that handles order inquiries.

Design a memory injection strategy that combines:
1. Pre-prompt injection for customer profile (name, tier, preferences)
2. Dynamic retrieval for order history and past issues

For each pattern:
- What memories should be pre-loaded vs. retrieved on-demand?
- What's the token budget allocation?
- How do you handle a customer with 500+ past orders?

Show the system prompt structure and the dynamic retrieval tool definition.

What you're learning: Memory injection isn't all-or-nothing. Strategic allocation between pre-prompt and dynamic retrieval optimizes both relevance and token efficiency.

Prompt 2: Implement Conflict Resolution

A user has interacted with an agent over 6 months. Their memories include:

January: "User is allergic to peanuts"
March: "User enjoyed the Thai peanut salad at lunch"
May: "User mentioned avoiding nuts at the company event"

Design a conflict resolution system that:
1. Detects the apparent contradiction
2. Determines the most likely truth
3. Decides whether to ask user or resolve automatically
4. Updates memories appropriately

Show the detection logic, resolution strategy, and example conversation where this plays out.

What you're learning: Memory conflicts aren't always errors—sometimes context matters. The peanut salad might have been peanut-free, or the allergy might be new. Good systems gather context before assuming.

Prompt 3: Build Test Suite for Memory Agent

Create a comprehensive test suite for a memory-augmented task management agent.

The agent should:
- Remember user preferences (work hours, priority styles)
- Track project context across sessions
- Provide personalized task suggestions
- Handle conflicting scheduling preferences

Design tests for:
1. Memory storage correctness
2. Retrieval relevance
3. Personalization behavior
4. Conflict detection
5. Multi-session persistence

For each test category, provide:
- Test scenario description
- Setup data
- Expected behavior
- Assertion criteria

What you're learning: Testing memory agents requires testing not just functionality but behavior—does the agent actually use the memories it stores? Tests must verify end-to-end personalization, not just storage operations.

Pattern 1: Pre-Prompt Memory Injection​

Architecture​

Implementation​

Usage​

When to Use​

Limitations​

Pattern 2: Dynamic Memory Retrieval​

Architecture​

Implementation with OpenAI Agents SDK​

When to Use​

Advantages Over Pre-Prompt​

Pattern 3: Memory Conflict Resolution​

Conflict Types​

Implementation​

Conflict Resolution Patterns​

Pattern 4: Memory Testing Strategies​

Test Categories​

Test Implementation​

Testing Best Practices​

Putting It All Together​

Try With AI​

Prompt 1: Design Memory Injection Strategy​

Prompt 2: Implement Conflict Resolution​

Prompt 3: Build Test Suite for Memory Agent​

Pattern 1: Pre-Prompt Memory Injection

Architecture

Implementation

Usage

When to Use

Limitations

Pattern 2: Dynamic Memory Retrieval

Architecture

Implementation with OpenAI Agents SDK

When to Use

Advantages Over Pre-Prompt

Pattern 3: Memory Conflict Resolution

Conflict Types

Implementation

Conflict Resolution Patterns

Pattern 4: Memory Testing Strategies

Test Categories

Test Implementation

Testing Best Practices

Putting It All Together

Try With AI

Prompt 1: Design Memory Injection Strategy

Prompt 2: Implement Conflict Resolution

Prompt 3: Build Test Suite for Memory Agent