Claude Dreaming: An In-Depth Study of AI Agent Memory Consolidation

🎯 Key Takeaways

AI agents are gaining an "offline growth" capability: no longer purely passive tools, they can organize experience, discover patterns, and correct errors while no user is providing input.
Three-stage memory consolidation model: Light (quick scan) → REM (cross-session clustering) → Deep (deep restructuring), a close analogy to the stages of human sleep.
Cross-agent pattern discovery: Dreaming lets agents see global connections that are invisible within any single session, which is a qualitative leap.
Auditable design: original memories are never modified; a new version is generated for review, which guards against memory pollution.
Theoretical foundation: the Sleep-time Compute paper reports that offline pre-computation can cut the test-time compute needed for the same accuracy by about 5x and raise accuracy by up to 13-18%.

I. Background: Why Do Agents Need to "Dream"?

1.1 LLM Context Window Limitations

No matter how large the context window is, it cannot hold every past conversation indefinitely. In practice, an agent's memory store bloats quickly with:

Duplicates: the same information written multiple times
Contradictions: records that are inconsistent with one another
Stale entries: outdated information that was never cleaned up

Anthropic found that as projects progressed, Memory Stores filled up with these problems and agent performance steadily degraded.
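The three failure modes above can be made concrete with a small audit pass. This is a minimal sketch under stated assumptions: the `MemoryEntry` shape, the `audit` function, and the 90-day staleness threshold are all hypothetical and are not part of any Anthropic API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MemoryEntry:
    key: str       # what the memory is about, e.g. "test_framework"
    value: str     # what was recorded
    written: date  # when it was written


def audit(entries, today, max_age_days=90):
    """Return (duplicates, contradictions, stale) found in a list of entries."""
    seen = set()
    values_by_key = {}
    duplicates, stale = [], []
    for e in entries:
        if (e.key, e.value) in seen:
            duplicates.append(e)  # same key and value written again
        seen.add((e.key, e.value))
        values_by_key.setdefault(e.key, set()).add(e.value)
        if (today - e.written).days > max_age_days:
            stale.append(e)       # older than the (assumed) freshness window
    # A key with more than one distinct value is treated as a contradiction.
    contradictions = sorted(k for k, v in values_by_key.items() if len(v) > 1)
    return duplicates, contradictions, stale


entries = [
    MemoryEntry("test_framework", "pytest", date(2026, 1, 10)),
    MemoryEntry("test_framework", "pytest", date(2026, 2, 1)),      # duplicate
    MemoryEntry("deploy_target", "staging", date(2026, 2, 3)),
    MemoryEntry("deploy_target", "production", date(2026, 2, 20)),  # contradiction
    MemoryEntry("api_version", "v1", date(2025, 6, 1)),             # stale
]
duplicates, contradictions, stale = audit(entries, today=date(2026, 3, 1))
```

A real consolidation pass would need semantic comparison rather than exact string matching, but the sketch shows why these problems accumulate silently when entries are only ever appended.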

1.2 From "Tool" to "Colleague"

Traditional agents are purely passive: they answer when asked and act only when told to. Their "learning" is confined to the context window of a single conversation.

Dreaming removes this limitation: during periods with no user input, an agent can organize its experience, discover patterns, and correct errors.

An agent that can "dream" is no longer just a tool that executes commands. It starts to accumulate "work experience", like a new employee who works during the day, reviews the day at night, and comes back stronger the next morning.

II. What Is Claude Dreaming?

2.1 Function Definition

Dreaming is a memory consolidation process triggered on a schedule. It runs automatically in the background, reads the agent's past session records and Memory Store, identifies repeated patterns, finds recurring errors, distills team preferences shared across agents, and then generates a reorganized, more concise memory store.

"Dreaming surfaces patterns that a single agent can't see on its own, including recurring mistakes, workflows that agents converge on, and preferences shared across a team."

2.2 Technical Implementation

Developers trigger the Dreaming pipeline via the client.beta.dreams.create() API:

import time

from anthropic import Anthropic

client = Anthropic()

# Trigger a Dreaming run
dream = client.beta.dreams.create(
    model="claude-sonnet-4-6",  # model that runs the Dreaming pass
    memory_store="ms_abc123",    # target Memory Store
)

# Poll until the run finishes, pausing between requests
while dream.status in ("pending", "running"):
    time.sleep(5)
    dream = client.beta.dreams.retrieve(dream.id)

# ID of the new Memory Store produced by the run
new_store_id = dream.outputs[0].id

2.3 Key Design Principles

Design Principle	Description
Immutable Input	Dreaming never modifies the original memories directly; it generates a new memory store for review
Supported Models	Opus 4.7 / Sonnet 4.6 (Research Preview)
Cross-agent Pattern Recognition	Discovers global patterns across multiple agents and sessions
Auto vs Review Mode	Users can choose fully automatic consolidation or review each change before it is applied
Session Limit	Each Dreaming task analyzes at most 100 historical sessions
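The immutable-input and review-mode principles can be illustrated with a small local model. This is a sketch under stated assumptions, not the service's implementation: `MemoryStore`, `consolidate`, and `review_diff` are hypothetical names, and "consolidation" here is reduced to exact-duplicate removal.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryStore:
    store_id: str
    entries: tuple  # a tuple of strings, so the input cannot be edited in place


def consolidate(store, new_id):
    """Return a NEW store with duplicates dropped; the input is left untouched."""
    deduped = tuple(dict.fromkeys(store.entries))  # keeps first-seen order
    return MemoryStore(new_id, deduped)


def review_diff(old, new):
    """Summarize what a reviewer would see: entries removed and entries added."""
    removed = Counter(old.entries) - Counter(new.entries)
    added = Counter(new.entries) - Counter(old.entries)
    return dict(removed), dict(added)


original = MemoryStore(
    "ms_v1", ("use pytest", "use pytest", "order IDs are 10 digits")
)
proposed = consolidate(original, "ms_v2")
removed, added = review_diff(original, proposed)

# Review mode: the new store only becomes active after approval,
# and the original remains available for rollback either way.
approved = True
active = proposed if approved else original
```

Because the original store is never mutated, rejecting a proposal costs nothing: the agent simply keeps pointing at the old store ID.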

III. Real Cases: What Can Dreaming Do?

Case 1: Cross-Agent Pattern Discovery

Scenario: An e-commerce team deploys 3 Claude Managed Agents that handle code review, documentation, and test writing respectively.

Problem: Without Dreaming, each agent works independently and their memories are isolated from one another. Agent A learns over 15 sessions that "the team prefers pytest over unittest", but that information lives only in Agent A's memory, and Agent B knows nothing about it.

After Dreaming runs:

It scans the memories and session records of all 3 agents
It discovers that a preference for pytest is mentioned in 15 sessions across the three agents
It distills this into a single global memory and writes it to the memory store shared by all agents

Result: The next time Agent B generates test documentation, it uses pytest format automatically. Nobody had to tell it; it learned this while "dreaming".
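The promotion step in this case can be sketched as a simple counting rule. Everything here is illustrative: `promote_shared` and its `min_sessions` and `min_agents` thresholds are assumptions chosen for the example, and a real system would have to match facts semantically rather than by exact string.

```python
from collections import defaultdict


def promote_shared(observations, min_sessions=10, min_agents=2):
    """Pick facts seen widely enough to be written to the shared store.

    observations: iterable of (agent, session_id, fact) tuples.
    """
    sessions = defaultdict(set)
    agents = defaultdict(set)
    for agent, session_id, fact in observations:
        sessions[fact].add((agent, session_id))
        agents[fact].add(agent)
    return sorted(
        fact
        for fact in sessions
        if len(sessions[fact]) >= min_sessions and len(agents[fact]) >= min_agents
    )


# 15 sessions across three agents mention the pytest preference;
# one session mentions an unrelated one-off preference.
observations = (
    [("code_review", f"cr-{i}", "prefer pytest over unittest") for i in range(7)]
    + [("docs", f"doc-{i}", "prefer pytest over unittest") for i in range(3)]
    + [("tests", f"t-{i}", "prefer pytest over unittest") for i in range(5)]
    + [("docs", "doc-9", "use British spelling")]
)
shared = promote_shared(observations)
```

Requiring both a session count and an agent count keeps a single agent's quirk from being promoted to a team-wide rule.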

Case 2: Automatic Correction of Recurring Errors

Scenario: A customer service agent keeps making the same mistake for two weeks straight: it always gets the order number format wrong. Users correct it every time, but the corrections are scattered across the memories of 50 different sessions.

After Dreaming runs:

It identifies that "this error appeared 50 times, and the correction also appeared 50 times"
It distills this into a strong rule: "Order numbers must be exactly 10 digits, with no other characters"
It writes the rule to memory

Result: The agent no longer needs to be corrected each time, and the error rate drops sharply.
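A rough sketch of this case follows: repeated corrections are counted, anything above a threshold becomes a rule, and the order-number rule is enforced with a regular expression. The function names and the `min_count` threshold are hypothetical; only the "10 digits" rule comes from the scenario above.

```python
import re
from collections import Counter

# The rule from the scenario: exactly 10 ASCII digits, nothing else.
ORDER_NUMBER = re.compile(r"[0-9]{10}")


def is_valid_order_number(text):
    """True only if the whole string is exactly 10 digits."""
    return ORDER_NUMBER.fullmatch(text) is not None


def rules_from_corrections(corrections, min_count=5):
    """Promote a correction to a rule once it has recurred often enough."""
    counts = Counter(corrections)
    return [rule for rule, n in counts.most_common() if n >= min_count]


corrections = (
    ["order numbers must be exactly 10 digits"] * 50
    + ["greet the customer by name"] * 2
)
rules = rules_from_corrections(corrections)
```

Turning the distilled rule into an executable check means the agent can validate its own output before replying instead of waiting for the next user correction.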

IV. Agent Dreaming vs Human Dreaming

The analogy is a striking one:

Dimension	Human Dreaming	Agent Dreaming
Trigger	After falling asleep, in NREM→REM cycles	Scheduled runs
Input	Daytime experiences, emotions, sensory information	Session transcripts, Memory Store
Core Function	Memory consolidation, emotional processing, creative association	Deduplication, pattern extraction, error correction
Autonomy	Fully autonomous; the brain does it automatically	Scheduling must be configured; execution is autonomous
Output	Strengthened long-term memory, creative insights	Refined memory store, insight reports
Skippable?	Long-term dream deprivation leads to cognitive decline	Can be disabled, but memory quality degrades
"Dreaming about dreaming"	Lucid dreaming	Review mode

The Most Striking Similarity

During REM sleep the human brain "replays" the day's experiences, but not as a simple playback: it reorganizes them and forms new neural connections. That is why a problem left unsolved during the day often suddenly makes sense after a night's sleep.

Claude Dreaming does almost the same thing: rather than simply compressing memories, it reorganizes the memory structure and finds cross-session connections that a single agent cannot see while it is actively working.

V. Cross-Platform Comparison

5.1 Claude Managed Agents: Dreaming (Official)

Dimension	Details
Method	Official built-in feature; scheduled memory consolidation
Trigger	`client.beta.dreams.create()` API, with cron scheduling support
Features	Cross-agent pattern recognition, auditable output, immutable input
Maturity	Research Preview; access requires an application

5.2 OpenClaw: Dreaming (Open Source)

Stage Model:

Stage	Purpose	Operations
Light	Quick scan	Mark high-value signals
REM	Cross-session clustering	Discover hidden connections
Deep	Deep restructuring	Promote/demote memory entries
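To make the three stages concrete, here is a minimal sketch of a Light → REM → Deep pipeline. It is not OpenClaw's actual code: the entry format, the keyword heuristic, and the `promote_at` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def light_scan(entries, keywords):
    """Light: cheap pass that keeps entries containing a high-value keyword."""
    return [e for e in entries if any(k in e["text"].lower() for k in keywords)]


def rem_cluster(entries):
    """REM: group the flagged entries by topic across sessions."""
    clusters = defaultdict(list)
    for e in entries:
        clusters[e["topic"]].append(e)
    return dict(clusters)


def deep_restructure(clusters, promote_at=2):
    """Deep: promote topics seen in several sessions, demote the rest."""
    promoted, demoted = [], []
    for topic, items in clusters.items():
        n_sessions = len({e["session"] for e in items})
        (promoted if n_sessions >= promote_at else demoted).append(topic)
    return sorted(promoted), sorted(demoted)


entries = [
    {"session": "s1", "topic": "testing", "text": "User prefers pytest"},
    {"session": "s2", "topic": "testing", "text": "Always run pytest before merging"},
    {"session": "s2", "topic": "style", "text": "Always use type hints"},
    {"session": "s3", "topic": "chat", "text": "Said good morning"},
]
flagged = light_scan(entries, keywords=("prefers", "always"))
clusters = rem_cluster(flagged)
promoted, demoted = deep_restructure(clusters)
```

The ordering matters for cost: the cheap Light pass shrinks the input before the more expensive clustering and restructuring stages run.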

5.3 Overall Comparison

Solution	Trigger	Cross-agent	Auditable	Status
Claude Dreaming	API/Scheduled	✅	✅	Research Preview
OpenClaw	cron (3 AM)	✅	✅	Released
Gemini	Background async	❌	❌	Deployed
OpenAI Memory	In-conversation extraction	❌	✅	Live

VI. Theoretical Foundation: Sleep-time Compute

In April 2025, teams from UC Berkeley and Letta published the paper "Sleep-time Compute: Beyond Inference Scaling at Test-time" (arXiv: 2504.13171), which provides theoretical support for the design of agent "dreaming".

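6.1 Core Idea

Traditional flow: user query → on-the-spot reasoning → answer
                  ↑ the same context is re-processed on every query

Sleep-time flow: pre-process during idle time → store learned context → answer quickly at query time
                 ↑ pre-process once, reuse many times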
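The difference between the two flows can be sketched with a call counter. This is a toy model, not the paper's method: `summarize` is a stand-in for an expensive model call, and `SleepTimeAgent` is a hypothetical class that runs that call once while idle and reuses the result.

```python
def summarize(context):
    """Stand-in for an expensive model call over the raw context."""
    summarize.calls += 1
    return dict(line.split("=", 1) for line in context)


summarize.calls = 0


def answer_without_sleep(context, key):
    """Traditional flow: the full context is re-processed on every query."""
    return summarize(context)[key]


class SleepTimeAgent:
    """Sleep-time flow: pre-process once while idle, then answer from the result."""

    def __init__(self, context):
        self.learned = summarize(context)  # one-time, offline

    def answer(self, key):
        return self.learned[key]           # cheap lookup at query time


context = ["framework=pytest", "order_digits=10", "deploy=staging"]
queries = ["framework", "deploy", "order_digits"]

baseline = [answer_without_sleep(context, q) for q in queries]
baseline_calls = summarize.calls

summarize.calls = 0
agent = SleepTimeAgent(context)
cached = [agent.answer(q) for q in queries]
sleep_calls = summarize.calls
```

Both flows return the same answers, but the expensive step runs once per query in the first and once per context in the second, which is where the savings on related queries come from.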

6.2 Experimental Results

Metric	Effect
Accuracy improvement	Up to 13%-18% (at the same test-time compute budget)
Test-time compute reduction	~5x (for the same accuracy)
Average cost reduction	2.5x per query (when multiple related queries share the pre-processed result)
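Why sharing pre-computation lowers the average cost can be seen from a simple amortization argument. The model below is a simplification introduced here for illustration, not the paper's cost model; the symbols $C_s$ (one-time sleep-time cost), $C_t$ (per-query cost with the learned context), and $C_t'$ (per-query cost without it) are assumptions.

```latex
% Average cost per query when n related queries share one pre-processing pass
\bar{C}(n) = \frac{C_s + n\,C_t}{n} = \frac{C_s}{n} + C_t

% Pre-processing pays off once the amortized cost drops below the baseline
\frac{C_s}{n} + C_t < C_t'
\quad\Longleftrightarrow\quad
n > \frac{C_s}{C_t' - C_t} \qquad (C_t' > C_t)
```

Under this simplified view, the more queries that reuse the same context, the closer the average cost gets to $C_t$, so the benefit grows with the number of related queries.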

VII. Implications for Agent Development

7.1 Three Key Signals

From "tool" to "colleague": an agent that can "dream" starts to accumulate "work experience"
From "single-shot" to "continuous" intelligence: Dreaming gives agent memory a managed lifecycle of writing, organizing, refining, and elevating
The race between open source and closed source: OpenClaw's three-stage Dreaming is already available as open source, and it arrived surprisingly fast

7.2 Architecture Design Recommendations

┌────────────────────────────────────────────────────┐
│                    Agent Memory                    │
├────────────────────────────────────────────────────┤
│                                                    │
│  ┌─────────────┐      ┌─────────────────────────┐  │
│  │   Memory    │ ───→ │        Dreaming         │  │
│  │   Write     │      │   Light → REM → Deep    │  │
│  │ (in session)│      │  (scheduled / offline)  │  │
│  └─────────────┘      └─────────────────────────┘  │
│         │                        │                 │
│         │                        ↓                 │
│         │              ┌─────────────────┐         │
│         └────────────→ │  Refined store  │         │
│                        │(review/rollback)│         │
│                        └─────────────────┘         │
└────────────────────────────────────────────────────┘

7.3 Implementation Key Points

Key Point	Description
Immutable Original	Always generate a new version so that review and rollback remain possible
Three-stage Process	Light quick scan → REM pattern clustering → Deep restructuring
Configurable Scheduling	Support cron scheduling, running in the early morning (3 AM) by default
Token Cost Control	Set a maximum number of iterations; review mode is optional
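The scheduling and cost-control points can be sketched with the standard library alone. This is a rough illustration: `next_run` and `run_with_budget` are hypothetical helpers, and a production setup would more likely rely on cron or a job scheduler than on hand-rolled date arithmetic.

```python
from datetime import datetime, time, timedelta


def next_run(now, hour=3):
    """Return the next daily run at `hour`:00 that is strictly after `now`."""
    candidate = datetime.combine(now.date(), time(hour))
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def run_with_budget(step, max_iterations=3):
    """Run consolidation passes until one changes nothing or the budget is spent.

    `step` performs one pass and returns True if it changed anything.
    Returns the number of passes actually executed.
    """
    done = 0
    while done < max_iterations:
        done += 1
        if not step():
            break
    return done


afternoon = next_run(datetime(2026, 1, 5, 14, 30))  # already past 3 AM today
early = next_run(datetime(2026, 1, 5, 1, 0))        # 3 AM is still ahead

# A consolidation that would keep finding changes for four passes
# is cut off after three by the iteration budget.
changes = iter([True, True, True, True, False])
passes = run_with_budget(lambda: next(changes))
```

Capping the number of passes bounds token spend per run even when the memory store keeps yielding further changes.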

VIII. Enterprise Case Studies

Company	Scenario	Effect
Harvey (legal AI)	Long-form legal document drafting	Agent completion rate rose about 6x
Netflix (platform engineering)	Log analysis agent	Multiple agents analyze in parallel and surface only recurring issues
Spiral (writing tool)	Parallel generation of multiple drafts	Haiku lead + Opus sub-agents + Outcomes scoring
Wisedocs (medical documents)	Quality review	50% faster, 30% more errors caught

IX. Conclusion

In 2026, AI agents started to "dream".

The point is not to build a more "human-like" machine but to solve a purely engineering problem: how to make agents smarter without enlarging the context window.

Perhaps one day we will seriously ask: when agents start dreaming, what do they dream about?

More efficient code? More accurate answers? Or a world without bugs?

Agents no longer need to wait for you to wake up before they start evolving.

💭 Reflections & Practice

For personal agent development: consider using OpenClaw's three-stage model as a reference and implementing a lightweight memory consolidation feature in your own agent
For enterprise applications: cross-agent pattern discovery is the most valuable feature and can significantly improve the efficiency of multi-agent collaboration
For AI product design: memory writing (Memory) and memory consolidation (Dreaming) should be designed as a pair of complementary mechanisms