
Mastra Introduces 'Observational Memory' and Cuts AI Costs by Up to 10x
TL;DR
Observational memory, developed by Mastra, is a new memory architecture that promises to cut artificial intelligence (AI) costs by up to 10x.
Introduction to Observational Memory
Observational memory is a new approach to memory architecture that promises to reduce costs for artificial intelligence (AI) agents by up to 10 times. Developed by Mastra, the technique also reportedly outperforms Retrieval-Augmented Generation (RAG) pipelines on long-horizon benchmarks.
How It Works
Unlike RAG, which retrieves context dynamically at query time, observational memory uses two background agents, the Observer and the Reflector, to compress the conversation history into a dated log of observations. Because the full log travels with every prompt, no retrieval step is needed, which simplifies the architecture.
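A minimal sketch of what this structure implies, written in TypeScript (Mastra's language). All names here (Message, Observation, MemoryState, observe, reflect) are illustrative assumptions, not Mastra's actual API:

```typescript
// Minimal sketch of the two-block memory layout; names are illustrative.

interface Message {
  role: "user" | "assistant";
  content: string;
}

interface Observation {
  date: string; // ISO date the observation was recorded
  text: string; // a compressed fact or decision extracted from the history
}

interface MemoryState {
  observations: Observation[]; // block 1: compressed, dated log
  recent: Message[];           // block 2: raw messages from the current session
}

// The Observer turns a batch of raw messages into dated observations.
// In practice this is an LLM call; it is stubbed here for illustration.
async function observe(messages: Message[]): Promise<Observation[]> {
  return [
    {
      date: new Date().toISOString().slice(0, 10),
      text: `Condensed ${messages.length} messages into key facts and decisions`,
    },
  ];
}

// The Reflector periodically condenses the observation log itself
// so the log does not grow without bound. Also stubbed for illustration.
async function reflect(observations: Observation[]): Promise<Observation[]> {
  return observations;
}
```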
Compression Process
Observational memory divides the context window into two blocks: one holds the compressed, dated observations, while the other holds the raw message history of the current session. When the raw history reaches roughly 30,000 tokens, the Observer condenses it into observations, appends them to the first block, and discards the original messages.
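Continuing the sketch above, the trigger might look like the following. The ~4-characters-per-token estimate and the helper names are assumptions for illustration:

```typescript
// Hypothetical trigger: once the raw session history exceeds ~30k tokens,
// hand it to the Observer and discard the originals. Reuses the types and
// observe() stub from the sketch above.

const OBSERVE_THRESHOLD_TOKENS = 30_000;

// Rough heuristic (~4 characters per token); a real system would use
// the model's tokenizer.
function estimateTokens(messages: Message[]): number {
  return messages.reduce((n, m) => n + Math.ceil(m.content.length / 4), 0);
}

async function maybeObserve(state: MemoryState): Promise<MemoryState> {
  if (estimateTokens(state.recent) < OBSERVE_THRESHOLD_TOKENS) {
    return state; // under threshold: keep raw messages as-is
  }
  const fresh = await observe(state.recent);
  return {
    observations: [...state.observations, ...fresh], // append to block 1
    recent: [],                                      // originals are discarded
  };
}
```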
Economic Impact
The savings come from prompt caching, under which providers bill cached input tokens at up to a tenth of the uncached rate. Most memory systems rewrite the prompt on every interaction, which invalidates the cache and drives up costs; observational memory keeps the context prefix stable, so most tokens can be served as cheap cache reads.
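A back-of-envelope calculation makes the effect concrete. The prices below are placeholders, and the 1/10 cache-read rate is an assumption based on the article's 10x figure:

```typescript
// Illustrative cost arithmetic; prices and the 1/10 cache-read discount
// are assumptions, not quoted provider rates.

const INPUT_PRICE = 3 / 1_000_000;     // $ per uncached input token (placeholder)
const CACHED_PRICE = INPUT_PRICE / 10; // $ per cache-read token (assumed 10x cheaper)

const contextTokens = 100_000; // stable observation log plus session history
const turns = 50;              // interactions in a session

// A memory system that rewrites the prompt every turn caches nothing.
const uncachedCost = turns * contextTokens * INPUT_PRICE;

// With a stable prefix, every turn after the first is mostly cache reads.
const cachedCost =
  contextTokens * INPUT_PRICE + (turns - 1) * contextTokens * CACHED_PRICE;

console.log({ uncachedCost, cachedCost }); // the gap approaches 10x as turns grow
```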
Advantages Over Traditional Compression
Most coding agents rely on compaction: they summarize the entire history in one pass only when the context window is about to overflow, which can lose critical details. Observational memory instead processes small slices of history frequently, producing a running decision log that better supports consistent, long-running tasks; the sketch below contrasts the two triggers.
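Here is compact-on-overflow sketched with the same illustrative types and helpers as above; the 200k context limit and 90% trigger point are assumptions:

```typescript
// Compact-on-overflow: the whole history is summarized in one pass only when
// the context nears its limit, so fine-grained details are easier to lose
// than with frequent small observation passes.

const CONTEXT_LIMIT_TOKENS = 200_000; // illustrative model context limit

async function compactOnOverflow(state: MemoryState): Promise<MemoryState> {
  if (estimateTokens(state.recent) < CONTEXT_LIMIT_TOKENS * 0.9) {
    return state; // nothing happens until the window is nearly full
  }
  // One large, lossy summary replaces the entire history at once.
  const summary = await observe(state.recent);
  return { observations: summary, recent: [] };
}
```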
Business Use Cases
Observational memory suits a range of business applications, from chatbots on content management platforms to alert-management systems for engineering teams. What these applications share is the need to maintain context over months, so agents can reference past conversations without users having to repeat information.
Significance for Production AI Systems
Observational memory offers an architectural alternative to the vector databases and RAG pipelines that dominate the market today. Its simpler architecture and stable context make agents easier to debug and maintain, and because that stability keeps prompt caching viable, the approach can hold costs down at scale.
Final Considerations
For business teams evaluating memory approaches for AI, the fundamental questions are how long context must be maintained, how much lossy compression is tolerable, and whether dynamic retrieval is truly needed. Observational memory may be ideal for teams whose agents must preserve work continuity and user history. As AI becomes an integral part of systems of record and production workflows, memory design becomes as critical as the choice of model.


