🎯 Key Takeaways
- AI agents are gaining an "offline growth" capability: no longer passive tools, they can organize experience, discover patterns, and correct errors without user input
- Three-stage memory consolidation model: Light (quick scan) → REM (cross-session clustering) → Deep (deep restructuring), strikingly similar to human sleep
- Cross-agent pattern discovery: Dreaming lets agents see global connections that are invisible within a single session, which is a qualitative leap
- Auditable design: original memories are never modified; a new version is generated for review, which avoids memory contamination
- Theoretical foundation: the Sleep-time Compute paper reports that offline pre-computation can cut test-time compute by about 5x and improve accuracy by up to 13-18%
I. Background: Why Do Agents Need to "Dream"?
1.1 LLM Context Window Limitations
No matter how large the context window is, it cannot hold every historical conversation indefinitely. In practice, an agent's memory store grows quickly and accumulates:
- Duplicates: the same information written multiple times
- Contradictions: records that are inconsistent with one another
- Stale entries: outdated information that was never cleaned up
Anthropic found that as projects progress, Memory Stores fill up with these problems and agent performance steadily degrades.
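The three problem types above can be made concrete with a small scan. This is only a minimal sketch under stated assumptions: the `Entry` record, the `scan` function, and the 90-day staleness cutoff are all hypothetical names and values chosen for illustration, not part of any Anthropic API.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Entry:
    key: str       # what the memory is about, e.g. "test_framework" (hypothetical schema)
    value: str     # the remembered fact
    updated: date  # last time the entry was written


def scan(entries, today, max_age_days=90):
    """Return (duplicates, contradictions, stale) found in a list of entries."""
    seen = set()         # (key, value) pairs already encountered
    values_by_key = {}   # key -> set of distinct values recorded for it
    duplicates, stale = [], []
    for e in entries:
        if (e.key, e.value) in seen:
            duplicates.append(e)  # same fact written again
        seen.add((e.key, e.value))
        values_by_key.setdefault(e.key, set()).add(e.value)
        if today - e.updated > timedelta(days=max_age_days):
            stale.append(e)       # older than the assumed cutoff
    # A key with more than one distinct value is treated as a contradiction.
    contradictions = sorted(k for k, v in values_by_key.items() if len(v) > 1)
    return duplicates, contradictions, stale
```

Exact-match comparison is the simplest possible notion of "duplicate" and "contradiction"; a real consolidation pass would presumably rely on the model to judge semantic overlap.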
1.2 From "Tool" to "Colleague"
Traditional agents are purely passive: you ask and they answer; you tell them to act and only then do they act. Their "learning" is confined to the context window of a single conversation.
Dreaming breaks this limitation: during periods without user input, an agent can proactively organize its experience, discover patterns, and correct errors.
An agent that can "dream" is no longer just a tool that executes commands. It starts to accumulate "work experience", like a new employee who works during the day, reviews the day at night, and comes back stronger the next morning.
II. What Is Claude Dreaming?
2.1 Function Definition
Dreaming is a scheduled memory consolidation process. It runs automatically in the background, reads the agent's past session records and Memory Store, identifies repeated patterns, spots recurring errors, distills team preferences shared across agents, and then generates a reorganized, more refined memory store.
"Dreaming surfaces patterns that a single agent can't see on its own, including recurring mistakes, workflows that agents converge on, and preferences shared across a team."
2.2 Technical Implementation
Developers trigger the Dreaming pipeline via the client.beta.dreams.create() API:
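```python
import time

from anthropic import Anthropic

client = Anthropic()

# Trigger a Dreaming run
dream = client.beta.dreams.create(
    model="claude-sonnet-4-6",  # model that runs the Dreaming pass
    memory_store="ms_abc123",   # target Memory Store
)

# Poll until the run finishes, pausing between requests
# (the 5-second interval is an arbitrary choice)
while dream.status in ("pending", "running"):
    time.sleep(5)
    dream = client.beta.dreams.retrieve(dream.id)

# ID of the new Memory Store produced by the run
new_store_id = dream.outputs[0].id
```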
2.3 Key Design Principles
| Design Principle | Description |
|---|---|
| Immutable input | Dreaming never modifies the original memories directly; it generates a new memory store for review |
| Supported models | Opus 4.7 / Sonnet 4.6 (research preview) |
| Cross-agent pattern recognition | Discovers global patterns across multiple agents and sessions |
| Auto vs. review mode | Users can choose fully automatic consolidation or review each change before it is applied |
| Session limit | Each Dreaming task analyzes at most 100 historical sessions |
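The "immutable input" and "auto vs. review" principles can be illustrated with a tiny versioned store. This is a sketch under assumptions, not how Dreaming is implemented: `VersionedStore`, `dream`, and the `approve` callback are hypothetical names, and the consolidation step here is plain deduplication standing in for the real model-driven pass.

```python
class VersionedStore:
    """Keeps every version of a memory store so changes can be reviewed and rolled back."""

    def __init__(self, entries):
        self._versions = [tuple(entries)]  # version 0 is the untouched original

    @property
    def original(self):
        return self._versions[0]

    @property
    def current(self):
        return self._versions[-1]

    def apply(self, proposed, approve=lambda entry: True):
        """Append a new version holding only approved entries.

        The default approve() accepts everything (auto mode);
        pass a stricter callback to emulate review mode.
        """
        self._versions.append(tuple(e for e in proposed if approve(e)))

    def rollback(self):
        """Discard the latest version; the original is never removed."""
        if len(self._versions) > 1:
            self._versions.pop()


def dream(entries):
    """Stand-in consolidation step: strip whitespace and drop exact repeats."""
    return tuple(dict.fromkeys(e.strip() for e in entries))
```

Because every change lands in a new tuple, a rejected or faulty consolidation can be undone by calling `rollback()` without touching the input.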
III. Real Cases: What Can Dreaming Do?
Case 1: Cross-Agent Pattern Discovery
Scenario: An e-commerce team deploys 3 Claude Managed Agents that handle code review, documentation generation, and test writing respectively.
Problem: Without Dreaming, each agent works independently and their memories are isolated from one another. Agent A learns over 15 sessions that "the team prefers pytest over unittest", but that knowledge lives only in Agent A's memory, and Agent B knows nothing about it.
After Dreaming runs:
- Scans the memories and session records of all 3 agents
- Finds that the pytest preference recurs across 15 sessions in the scanned records
- Distills it into a single global memory and writes it to the memory store shared by all agents
Result: The next time Agent B generates test documentation, it uses the pytest format automatically. Nobody had to tell it; it learned this while "dreaming".
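The promotion step in this case can be sketched as a count over distinct sessions. The following is an illustrative assumption rather than the actual Dreaming logic: `promote_shared`, the `(agent, session_id, note)` tuple layout, and the `min_sessions` threshold are all hypothetical.

```python
from collections import defaultdict


def promote_shared(observations, min_sessions=3):
    """Pick notes worth sharing with every agent.

    observations: iterable of (agent, session_id, note) tuples.
    Returns {note: sorted list of agents that recorded it} for notes
    seen in at least min_sessions distinct sessions.
    """
    sessions = defaultdict(set)  # note -> {(agent, session_id), ...}
    agents = defaultdict(set)    # note -> {agent, ...}
    for agent, session_id, note in observations:
        sessions[note].add((agent, session_id))
        agents[note].add(agent)
    return {
        note: sorted(agents[note])
        for note, seen in sessions.items()
        if len(seen) >= min_sessions
    }
```

Feeding in 15 sessions where Agent A noted the pytest preference yields one promoted note attributed to Agent A, which a shared store could then expose to Agents B and C.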
Case 2: Recurring Error Auto-correction
Scenario: A customer service agent makes the same mistake for two weeks straight: it keeps getting the order number format wrong. Users correct it every time, but the corrections are scattered across the memories of 50 different sessions.
After Dreaming runs:
- Identifies that "this error appeared 50 times, and the correction appeared 50 times"
- Distills it into one strong rule: "Order numbers must be exactly 10 digits, numeric only"
- Writes the rule to memory
Result: The agent no longer needs to be corrected each time, and the error rate drops sharply.
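One way to picture this case is to collapse the scattered corrections into a single rule with an evidence count, then express the rule as an executable check. This is a hedged sketch: `distill`, `is_valid_order_number`, and the `min_count` threshold are hypothetical, and only the "10 digits, numeric only" rule comes from the example itself.

```python
import re
from collections import Counter

# The rule from the example: exactly 10 ASCII digits.
ORDER_RULE = re.compile(r"[0-9]{10}")


def distill(corrections, min_count=5):
    """Collapse repeated corrections into (rule_text, evidence_count) pairs."""
    counts = Counter(c.strip().lower() for c in corrections)
    return [(text, n) for text, n in counts.most_common() if n >= min_count]


def is_valid_order_number(value):
    """Check a candidate order number against the distilled rule."""
    return ORDER_RULE.fullmatch(value) is not None
```

Storing the evidence count next to the rule gives a reviewer a reason to trust it: a rule backed by 50 corrections is a stronger candidate than one seen once.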
IV. Agent Dreaming vs Human Dreaming
This is a fascinating analogy:
| Dimension | Human Dreaming | Agent Dreaming |
|---|---|---|
| Trigger | After falling asleep, in NREM→REM cycles | Scheduled runs |
| Input | Daytime experiences, emotions, sensory information | Session transcripts, Memory Store |
| Core function | Memory consolidation, emotional processing, creative association | Memory deduplication, pattern extraction, error correction |
| Autonomy | Fully autonomous; the brain does it automatically | Needs a preset schedule; execution itself is autonomous |
| Output | Strengthened long-term memory, creative insights | Refined memory store, insight reports |
| Skippable? | Prolonged dream deprivation leads to cognitive decline | Can be disabled, but memory quality degrades |
| "Dreaming about dreaming" | Lucid dreaming | Review mode |
The Most Fascinating Similarity
During REM sleep the human brain "replays" the day's experiences, not as a simple playback but by reorganizing them and forming new neural connections. That is one reason people so often crack a problem after sleeping on it.
Claude Dreaming does almost the same thing: rather than simply compressing memories, it reorganizes the memory structure and discovers cross-session connections that a single agent cannot see while it is actively working.
V. Cross-Platform Comparison
5.1 Claude Managed Agents: Dreaming (Official)
| Dimension | Details |
|---|---|
| Method | Official built-in feature; scheduled memory consolidation |
| Trigger | client.beta.dreams.create() API, with cron scheduling support |
| Features | Cross-agent pattern recognition, auditable output, original data left unmodified |
| Maturity | Research preview; access must be requested |
5.2 OpenClaw: Dreaming (Open Source)
Stage model:
| Stage | Purpose | Operations |
|---|---|---|
| Light | Quick scan | Mark high-value signals |
| REM | Cross-session clustering | Discover hidden connections |
| Deep | Deep restructuring | Promote/demote memory entries |
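The three stages in the table can be sketched as three small functions chained together. This is a toy illustration of the stage model only, not OpenClaw's code: the function names, the note schema (`session`, `topic`, `text`), the keyword list, and the `promote_at` threshold are all assumptions made for the example.

```python
from collections import defaultdict


def light(notes, keywords=("prefer", "always", "never", "must")):
    """Light: quick scan that keeps notes containing a high-value signal word."""
    return [n for n in notes if any(k in n["text"].lower() for k in keywords)]


def rem(flagged):
    """REM: cluster flagged notes by topic and record which sessions mention each."""
    sessions_by_topic = defaultdict(set)
    for n in flagged:
        sessions_by_topic[n["topic"]].add(n["session"])
    return dict(sessions_by_topic)


def deep(sessions_by_topic, promote_at=2):
    """Deep: promote topics seen in at least promote_at sessions, demote the rest."""
    return {
        topic: "promote" if len(sessions) >= promote_at else "demote"
        for topic, sessions in sessions_by_topic.items()
    }


def run_dream(notes):
    """Chain the stages: Light -> REM -> Deep."""
    return deep(rem(light(notes)))
```

Each stage narrows or enriches the output of the previous one, so the cheap Light scan limits how much data the more expensive clustering and restructuring steps have to touch.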
5.3 Overall Comparison
| Solution | Trigger | Cross-agent | Auditable | Maturity |
|---|---|---|---|---|
| Claude Dreaming | API / scheduled | ✅ | ✅ | Research preview |
| OpenClaw | cron (3 AM) | ✅ | ✅ | Released |
| Gemini | Background async | ❌ | ❌ | Deployed |
| OpenAI Memory | Extracted during conversation | ❌ | ✅ | Live |
VI. Theoretical Foundation: Sleep-time Compute
In April 2025, researchers from UC Berkeley and Letta published "Sleep-time Compute: Beyond Inference Scaling at Test-time" (arXiv: 2504.13171), which provides theoretical support for the design of agent "dreaming".
6.1 Core Formula
Traditional flow: user query → on-the-spot inference → answer
↑ the same context is recomputed on every query
Sleep-time flow: idle-time preprocessing → store learned context → fast answer at query time
↑ preprocess once, reuse many times
6.2 Experimental Results
| Metric | Effect |
|---|---|
| Accuracy improvement | Up to 13%-18% (at the same test-time compute budget) |
| Compute reduction | ~5x |
| Average cost reduction | 2.5x (multiple related queries share the pre-computed result) |
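The cost argument behind "preprocess once, reuse many times" is simple amortization, which the sketch below works through. The function names and every number used with them are made up for illustration and are not taken from the paper.

```python
def average_cost(sleep_cost, test_cost, n_queries):
    """Average cost per query when one preprocessing pass is shared by n_queries."""
    return (sleep_cost + n_queries * test_cost) / n_queries


def break_even(sleep_cost, test_cost, baseline_cost):
    """Number of queries at which preprocessing pays for itself.

    baseline_cost is the per-query cost without any preprocessing
    and must be larger than test_cost.
    """
    return sleep_cost / (baseline_cost - test_cost)
```

With hypothetical costs of 24 units for the idle-time pass, 2 units per query afterwards, and 10 units per query without preprocessing, a single query is more expensive with preprocessing (26 units versus 10), the two approaches break even at 3 queries, and at 10 queries the average falls to 4.4 units. The benefit therefore depends on several related queries reusing the same context.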
VII. Implications for Agent Development
7.1 Three Key Signals
- From "tool" to "colleague": agents that can "dream" begin to accumulate "work experience"
- From "single-shot" to "continuous" intelligence: Dreaming gives agent memory a managed lifecycle of write, organize, refine, and elevate
- The race between open source and closed source: OpenClaw's three-stage Dreaming is already available as open source, and it arrived surprisingly fast
7.2 Architecture Design Recommendations
┌─────────────────────────────────────────────────────┐
│                    Agent Memory                     │
├─────────────────────────────────────────────────────┤
│                                                     │
│  ┌─────────────┐      ┌─────────────────────────┐   │
│  │   Memory    │ ───→ │        Dreaming         │   │
│  │   Write     │      │   Light → REM → Deep    │   │
│  │ (in session)│      │  (scheduled / offline)  │   │
│  └─────────────┘      └─────────────────────────┘   │
│         │                          │                │
│         │                          ↓                │
│         │             ┌─────────────────────────┐   │
│         └───────────→ │  Refined memory store   │   │
│                       │ (auditable / rollback)  │   │
│                       └─────────────────────────┘   │
└─────────────────────────────────────────────────────┘
7.3 Implementation Key Points
| Key Point | Description |
|---|---|
| Immutable original data | Always generate a new version so that audit and rollback remain possible |
| Three-stage processing | Light quick scan → REM pattern clustering → Deep restructuring |
| Configurable scheduling | Support cron schedules; run in the early morning by default (for example 3 AM) |
| Token cost control | Set a maximum number of iterations; review mode is optional |
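The token cost control point can be sketched as a loop that stops on whichever limit is hit first. Everything here is an assumption for illustration: `run_passes`, the default limits, and the word-count token estimate are hypothetical, and `one_pass` stands in for whatever consolidation step a real system would run.

```python
def run_passes(entries, one_pass, max_iterations=3, token_budget=1000):
    """Repeat one_pass until the store is stable or a limit is reached.

    Returns (final_entries, tokens_spent). Token use is approximated
    by word count, which is a crude stand-in for a real tokenizer.
    """
    spent = 0
    for _ in range(max_iterations):
        cost = sum(len(e.split()) for e in entries)
        if spent + cost > token_budget:
            break  # running another pass would exceed the budget
        spent += cost
        new_entries = one_pass(entries)
        if new_entries == entries:
            break  # nothing changed, so further passes are wasted
        entries = new_entries
    return entries, spent
```

Checking the budget before each pass, rather than after, means the loop never overspends; the trade-off is that a pass is skipped even when it would have only slightly exceeded the limit.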
VIII. Enterprise Case Studies
| Company | Scenario | Effect |
|---|---|---|
| Harvey (legal AI) | Long-form legal document drafting | Agent completion rate rose about 6x |
| Netflix (platform engineering) | Log analysis agent | Multiple agents analyze in parallel and surface only recurring issues |
| Spiral (writing tool) | Parallel generation of multiple drafts | Haiku lead agent + Opus sub-agents + Outcomes scoring |
| Wisedocs (medical documents) | Quality review | 50% faster, 30% more errors caught |
IX. Conclusion
In 2026, AI agents began to "dream".
The point is not to build a more "human-like" machine but to solve a pure engineering problem: how to make agents smarter without enlarging the context window.
Perhaps one day we will seriously ask: when agents start dreaming, what do they dream about?
More efficient code? More accurate answers? Or a world without bugs?
Agents no longer need to wait for you to wake up before they start evolving.
💭 Reflections & Practice
- For personal agent development: consider using OpenClaw's three-stage model as a reference and implementing lightweight memory consolidation in your own agent
- For enterprise applications: cross-agent pattern discovery is the most valuable feature and can significantly improve multi-agent collaboration
- For AI product design: memory writing (Memory) and memory consolidation (Dreaming) should be designed as a pair of complementary mechanisms