How contextual memory works
When someone says an agent "remembers" the customer, most people picture a mystical AI that accumulates everything. It's not that. Agent memory is architecture — two distinct mechanisms working together, each with a clear job.
The two types of memory
Every serious agent has:
- Short-term memory (context window). The history of the last N turns of the current conversation, sent straight to the model on every call.
- Long-term memory (persisted facts). Specific facts about the member saved in a database, retrieved on demand in future conversations.
Confusing the two is the main cause of agents that "forget" or "make things up".
Short-term memory (window)
Every language model has a context limit: gpt-4o-mini supports 128k tokens, Claude Sonnet 200k. That sounds like a lot, but if you resend the raw history of a long conversation on every call, cost grows with each turn and quality drops, because models attend less reliably to details buried in a very long context.
At Member AI, the short-term window holds the last 15 turns verbatim plus a summary of the 15 turns before them. The summary is refreshed every 30 turns. This keeps cost predictable without the agent losing the thread.
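As a rough illustration of that window, here is a minimal Python sketch. It is not Member AI's actual code: the class name, the `summarize` callback, and the exact refresh rule are assumptions, and the stub summarizer stands in for what would normally be a model call.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical signature: (previous summary, turns to fold in) -> new summary.
Summarizer = Callable[[str, list[str]], str]


@dataclass
class ShortTermMemory:
    summarize: Summarizer
    raw_turns: int = 15       # turns kept verbatim in the prompt
    refresh_every: int = 30   # how often the summary is refreshed
    turns: list[str] = field(default_factory=list)
    summary: str = ""
    summarized_upto: int = 0  # index of the first turn not yet summarized

    def add_turn(self, text: str) -> None:
        self.turns.append(text)
        if len(self.turns) % self.refresh_every == 0:
            # Assumed rule: fold every turn that has left the raw window
            # since the last refresh into the running summary.
            cutoff = len(self.turns) - self.raw_turns
            pending = self.turns[self.summarized_upto:cutoff]
            self.summary = self.summarize(self.summary, pending)
            self.summarized_upto = cutoff

    def context(self) -> list[str]:
        """Return what would be sent to the model: summary + recent turns."""
        recent = self.turns[-self.raw_turns:]
        header = [f"[summary] {self.summary}"] if self.summary else []
        return header + recent


def stub_summarizer(previous: str, turns: list[str]) -> str:
    # Placeholder only: a real summarizer would call a language model.
    return f"{previous} +{len(turns)} turns".strip()


memory = ShortTermMemory(summarize=stub_summarizer)
for i in range(1, 61):
    memory.add_turn(f"turn {i}")
prompt = memory.context()
```

After 60 turns, `prompt` holds one summary line followed by turns 46 to 60. Under this reading the summary lags between refreshes: turns that have left the raw window wait for the next refresh before they are folded in, which is the price of summarizing less often.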
Long-term memory (facts)
Here's where the magic-that-isn't-magic lives. Every time a conversation ends, the agent runs a background process that extracts structured facts about the member:
- "Carla is a creator in financial education, focus on women 30-45";
- "Has 12,300 Instagram followers, 800 students in the paid community";
- "Hit Q1 2026 target (40 new clients)";
- "Prefers a casual tone, no formality";
- "Preferred call time: Tuesday/Thursday morning".
These facts live in a per-hub relational database, with tags. When the member comes back, the agent doesn't load all of them into the prompt — it retrieves only the ones relevant to the current context.
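To make "a relational database, with tags" concrete, here is a small sketch using Python's built-in `sqlite3`. The table names, columns, and helper functions are hypothetical, chosen for illustration; the real schema is not described in this post.

```python
import sqlite3
from datetime import date

# Hypothetical schema: one database per hub, facts plus a tag table.
SCHEMA = """
CREATE TABLE facts (
    id         INTEGER PRIMARY KEY,
    member_id  TEXT NOT NULL,
    text       TEXT NOT NULL,
    created_at TEXT NOT NULL,
    status     TEXT NOT NULL DEFAULT 'active'
);
CREATE TABLE fact_tags (
    fact_id INTEGER NOT NULL REFERENCES facts(id),
    tag     TEXT NOT NULL
);
CREATE INDEX idx_fact_tags_tag ON fact_tags(tag);
"""


def open_hub_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_fact(conn, member_id, text, tags, created_at=None):
    """Insert one fact and its tags; returns the new fact id."""
    created_at = created_at or date.today().isoformat()
    cur = conn.execute(
        "INSERT INTO facts (member_id, text, created_at) VALUES (?, ?, ?)",
        (member_id, text, created_at),
    )
    conn.executemany(
        "INSERT INTO fact_tags (fact_id, tag) VALUES (?, ?)",
        [(cur.lastrowid, tag) for tag in tags],
    )
    conn.commit()
    return cur.lastrowid


def facts_by_tag(conn, member_id, tag):
    """Return active facts for one member that carry the given tag."""
    rows = conn.execute(
        """
        SELECT f.text
        FROM facts AS f
        JOIN fact_tags AS t ON t.fact_id = f.id
        WHERE f.member_id = ? AND t.tag = ? AND f.status = 'active'
        ORDER BY f.created_at DESC
        """,
        (member_id, tag),
    )
    return [row[0] for row in rows]


hub = open_hub_db()
save_fact(hub, "carla", "Hit Q1 2026 target (40 new clients)", ["goals"])
save_fact(hub, "carla", "Prefers a casual tone, no formality", ["tone"])
goal_facts = facts_by_tag(hub, "carla", "goals")
```

Here `goal_facts` contains only the Q1 target fact. The tone preference stays in the database but never reaches the prompt unless something asks for the "tone" tag.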
How we do retrieval without bloating the prompt
The naive approach would be vector search with embeddings over every stored fact (the retrieval step most RAG, or retrieval-augmented generation, setups rely on). It works, but it's expensive and slow. We use a hybrid approach:
- Tag-based search in the relational DB — if the conversation mentioned "OKR" or "planning", pull facts tagged "goals";
- Vector search only on open-text facts that don't have a clear tag;
- Re-ranking by temporal relevance — recent facts weigh more than old ones.
In the end, no more than 8 to 12 relevant facts make it into the turn's prompt, which keeps cost low and quality high.
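The three steps above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the keyword-to-tag map, the 180-day half-life, the similarity threshold, the dates, and above all `toy_embed`, which uses plain word counts as a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Fact:
    text: str
    tags: tuple[str, ...]
    created: date


KEYWORD_TAGS = {"okr": "goals", "planning": "goals"}  # hypothetical map
MAX_FACTS = 12          # upper end of the 8 to 12 range
HALF_LIFE_DAYS = 180    # assumed: a fact's weight halves every 180 days


def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_embed(text: str) -> Counter:
    # Stand-in for an embedding model: a bag-of-words count vector.
    return Counter(tokens(text))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(message, facts, today, min_similarity=0.2):
    words = set(tokens(message))
    wanted = {tag for kw, tag in KEYWORD_TAGS.items() if kw in words}
    query = toy_embed(message)
    scored = []
    for fact in facts:
        if fact.tags:
            # Step 1: tagged facts are matched by tag only.
            match = 1.0 if wanted & set(fact.tags) else 0.0
        else:
            # Step 2: vector search only for untagged, open-text facts.
            sim = cosine(query, toy_embed(fact.text))
            match = sim if sim >= min_similarity else 0.0
        if match == 0.0:
            continue
        # Step 3: recency re-ranking with exponential decay.
        age_days = (today - fact.created).days
        scored.append((match * 0.5 ** (age_days / HALF_LIFE_DAYS), fact))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in scored[:MAX_FACTS]]


# Dates below are illustrative, not real data.
facts = [
    Fact("Hit Q1 2026 target (40 new clients)", ("goals",), date(2026, 4, 1)),
    Fact("Prefers a casual tone, no formality", ("tone",), date(2026, 1, 10)),
    Fact("Preferred call time: Tuesday/Thursday morning", (), date(2026, 3, 1)),
]
selected = retrieve(
    "Can we review my OKR planning on a call Tuesday morning?",
    facts,
    today=date(2026, 5, 1),
)
```

In this run the goals fact is picked up through its tag, the call-time fact through the similarity fallback, and the tone fact is left out because nothing in the message points at it.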
What the agent should forget
As important as remembering is forgetting. Three rules keep stored memory clean:
- Passwords, card data, and sensitive tax IDs are never stored (even if the member sends them);
- Contradicted facts are retired: when a new fact contradicts an old one, the old one is marked superseded;
- Facts unused for 18+ months are archived (still recoverable, but out of default retrieval).
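A minimal sketch of those forgetting rules follows, under loud assumptions: the regexes are deliberately crude examples rather than a complete sensitive-data filter, the `key` field is a hypothetical way to decide that two facts contradict each other (real contradiction detection is harder), and the dates and the second follower count are made up for the example.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta

# Crude illustrative patterns, not a complete sensitive-data detector.
CARD_LIKE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
PASSWORD_LIKE = re.compile(r"\bpassword\b", re.IGNORECASE)
ARCHIVE_AFTER = timedelta(days=548)  # roughly 18 months


@dataclass
class Fact:
    key: str          # hypothetical slot name, e.g. "follower_count"
    text: str
    last_used: date
    status: str = "active"


def is_sensitive(text: str) -> bool:
    return bool(CARD_LIKE.search(text) or PASSWORD_LIKE.search(text))


def remember(store: list, new_fact: Fact) -> bool:
    """Store a fact unless it looks sensitive; supersede older versions."""
    if is_sensitive(new_fact.text):
        return False  # never persisted, even though the member sent it
    for old in store:
        # Assumption: two active facts with the same key contradict.
        if old.key == new_fact.key and old.status == "active":
            old.status = "superseded"
    store.append(new_fact)
    return True


def archive_stale(store: list, today: date) -> int:
    """Archive active facts unused for about 18 months; returns the count."""
    archived = 0
    for fact in store:
        if fact.status == "active" and today - fact.last_used >= ARCHIVE_AFTER:
            fact.status = "archived"
            archived += 1
    return archived


store = []
remember(store, Fact("follower_count", "Has 12,300 Instagram followers", date(2026, 1, 10)))
remember(store, Fact("follower_count", "Has 15,000 Instagram followers", date(2026, 5, 1)))
card_saved = remember(store, Fact("payment", "My card is 4111 1111 1111 1111", date(2026, 5, 1)))
statuses = [fact.status for fact in store]
```

Nothing is physically deleted in this sketch: superseded and archived facts keep their rows and only change status, which is what makes them recoverable while staying out of default retrieval.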
Member control over memory
The principle we follow: the member owns the data. In the hub panel, the creator enables a command on the agent. Once it is on, the member can ask "what do you remember about me?" and the agent shows the list of persisted facts. The member can also ask the agent to forget any item, and that fact is deleted in real time.
Memory without control is surveillance. Memory with control is a relationship. We make a point of letting the member know it's a relationship.
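One way such a command could be wired up is sketched below. The function name, the `forget N` syntax, and the reply wording are all hypothetical; a production agent would more likely detect the intent with the model than with string matching.

```python
def handle_memory_command(message: str, facts: list):
    """Handle the two memory commands; return None for anything else."""
    text = message.strip().lower()

    if text.rstrip("?") == "what do you remember about me":
        if not facts:
            return "I have nothing saved about you."
        return "\n".join(f"{i}. {fact}" for i, fact in enumerate(facts, 1))

    if text.startswith("forget "):
        # Hypothetical syntax: "forget 2" removes item 2 of the listing.
        arg = text[len("forget "):].strip()
        if arg.isdigit() and 1 <= int(arg) <= len(facts):
            removed = facts.pop(int(arg) - 1)  # deleted immediately
            return f"Forgotten: {removed}"
        return "I could not find that item."

    return None  # not a memory command; normal conversation continues


member_facts = [
    "Prefers a casual tone, no formality",
    "Preferred call time: Tuesday/Thursday morning",
]
listing = handle_memory_command("What do you remember about me?", member_facts)
reply = handle_memory_command("forget 1", member_facts)
```

After the second call, `member_facts` holds only the call-time preference, so the next listing the member asks for no longer shows the deleted item.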
See contextual memory running
14 days free. Set up an agent, run 3 different conversations — by the fourth you'll see the agent pull context.
Read more at memberai.pro/en/blog/how-contextual-memory-works.