Claude Dreaming - A "Dreaming" Memory Consolidation System for AI Agents

🎯 Key Takeaways

I. Background: Why Do Agents Need to "Dream"?

1.1 LLM Context Window Limitations

No matter how large the context window is, it cannot hold every past conversation indefinitely. In practice, an agent's memory store balloons quickly:

Anthropic found that as projects progressed, Memory Stores filled up with these problems, and agent performance steadily degraded.

1.2 From "Tool" to "Colleague"

Traditional agents are purely passive: they answer when asked and act only when told to. Their "learning" is confined to the context window of a single conversation.

Dreaming removes this limitation: during periods with no user input, an agent can proactively organize its experience, discover patterns, and correct errors.

An agent that can "dream" is no longer just a tool that executes commands. It starts to accumulate "work experience", like a new employee who works during the day, reviews the day at night, and comes back stronger the next morning.

II. What Is Claude Dreaming?

2.1 Function Definition

Dreaming is a scheduled memory consolidation process. It runs automatically in the background, reads an agent's past session records and Memory Store, identifies recurring patterns, finds repeated mistakes, distills team preferences shared across agents, and then generates a reorganized, more concise memory store.

"Dreaming surfaces patterns that a single agent can't see on its own, including recurring mistakes, workflows that agents converge on, and preferences shared across a team."

In short, dreaming lets agents see what no single agent can see while working alone.

2.2 Technical Implementation

Developers trigger the Dreaming pipeline via the client.beta.dreams.create() API:

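import time

from anthropic import Anthropic

client = Anthropic()

# Trigger Dreaming
dream = client.beta.dreams.create(
    model="claude-sonnet-4-6",  # model that runs the Dreaming job
    memory_store="ms_abc123",    # target Memory Store
)

# Poll until the job finishes; sleep between requests to avoid a busy loop
# (the 5-second interval is an arbitrary choice)
while dream.status in ("pending", "running"):
    time.sleep(5)
    dream = client.beta.dreams.retrieve(dream.id)

# ID of the new Memory Store produced by the job
new_store_id = dream.outputs[0].id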

2.3 Key Design Principles

| Design Principle | Description |
| --- | --- |
| Immutable input | Dreaming never modifies the original memories directly; it generates a new memory store for review |
| Supported models | Opus 4.7 / Sonnet 4.6 (research preview) |
| Cross-agent pattern recognition | Discovers global patterns across multiple agents and sessions |
| Auto vs. review mode | Users can choose fully automatic consolidation or review each entry before it is applied |
| Session limit | Each Dreaming task analyzes at most 100 historical sessions |
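
To make the immutable-input and review-mode principles concrete, here is a minimal sketch of how a reviewer might compare an original store with a proposed one and apply only the approved changes. It assumes, purely for illustration, that a memory store can be modelled as a plain dict of key to text; `Change`, `diff_stores`, and `apply_approved` are hypothetical names and not part of any Anthropic API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    key: str
    kind: str  # "added", "removed", or "changed"
    before: Optional[str]
    after: Optional[str]


def diff_stores(original: dict[str, str], proposed: dict[str, str]) -> list[Change]:
    """List the entry-level differences between two memory snapshots."""
    changes = []
    for key in sorted(original.keys() | proposed.keys()):
        before, after = original.get(key), proposed.get(key)
        if before == after:
            continue
        if before is None:
            kind = "added"
        elif after is None:
            kind = "removed"
        else:
            kind = "changed"
        changes.append(Change(key, kind, before, after))
    return changes


def apply_approved(
    original: dict[str, str], changes: list[Change], approved: set[str]
) -> dict[str, str]:
    """Build a new store from the approved changes; the input dict is never mutated."""
    result = dict(original)
    for change in changes:
        if change.key not in approved:
            continue
        if change.after is None:
            result.pop(change.key, None)
        else:
            result[change.key] = change.after
    return result
```

Because `apply_approved` returns a fresh dict, the original store remains available for audit or rollback, whichever changes the reviewer accepts.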

III. Real Cases: What Can Dreaming Do?

Case 1: Cross-Agent Pattern Discovery

Scenario: An e-commerce team deploys three Claude Managed Agents that handle code review, documentation generation, and test writing respectively.

Problem: Without Dreaming, each agent works independently and their memories are isolated from one another. Over 15 sessions, Agent A learns that "the team prefers pytest over unittest", but that fact lives only in Agent A's memory and Agent B knows nothing about it.

After Dreaming runs:

Result: The next time Agent B generates test documentation, it automatically uses the pytest format. Nobody had to tell it; it learned this while "dreaming".

Case 2: Automatic Correction of Recurring Errors

Scenario: A customer service agent keeps making the same mistake for two weeks straight: it always gets the order number format wrong. Users correct it every time, but the corrections are scattered across the memories of 50 different sessions.

After Dreaming runs:

Result: The agent no longer needs to be corrected each time, and the error rate drops sharply.
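
The consolidation step in this case can be sketched as follows, assuming each session's memory is simply a list of correction strings. The function name, the `min_sessions` threshold, and the sample order-number wording in the usage note are hypothetical; a real pipeline would rely on a model to match paraphrased corrections rather than on exact text.

```python
from collections import defaultdict


def consolidate_corrections(sessions, min_sessions=3):
    """Return corrections that recur in at least `min_sessions` distinct sessions.

    `sessions` maps a session ID to the list of corrections recorded in it.
    Texts are compared after trimming whitespace and lowercasing, so the
    returned rule text is the normalized form.
    """
    seen_in = defaultdict(set)
    for session_id, corrections in sessions.items():
        for text in corrections:
            seen_in[text.strip().lower()].add(session_id)
    rules = [
        (text, len(ids)) for text, ids in seen_in.items() if len(ids) >= min_sessions
    ]
    # Most frequently repeated corrections first, ties broken alphabetically
    return sorted(rules, key=lambda item: (-item[1], item[0]))
```

For example, if the same note about the order number format shows up in three of four sessions and a greeting note in only two, only the order number note is promoted to a standing rule when `min_sessions=3`.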

IV. Agent Dreaming vs. Human Dreaming

This is a fascinating analogy:

| Dimension | Human Dreaming | Agent Dreaming |
| --- | --- | --- |
| Trigger | After falling asleep, in NREM→REM cycles | Scheduled runs |
| Input | Daytime experiences, emotions, sensory information | Session transcripts, Memory Store |
| Core function | Memory consolidation, emotional processing, creative association | Memory deduplication, pattern extraction, error correction |
| Autonomy | Fully autonomous; the brain does it automatically | Needs a preset schedule; execution itself is autonomous |
| Output | Strengthened long-term memory, flashes of inspiration | Refined memory store, insight reports |
| Skippable? | Long-term dream deprivation leads to cognitive decline | Can be disabled, but memory quality degrades |
| "Dreaming about dreaming" | Lucid dreaming | Review mode |

The Most Striking Similarity

During REM sleep the human brain "replays" the day's experiences, but not as simple playback: it reorganizes them and builds new neural connections. That is why a problem you could not solve during the day often clicks after you "sleep on it".

Claude Dreaming does almost the same thing: rather than simply compressing memories, it reorganizes the memory structure and discovers cross-session connections that a single agent cannot see while it is actively working.

V. Cross-Platform Comparison

5.1 Claude Managed Agents: Dreaming (Official)

| Dimension | Details |
| --- | --- |
| Method | Official built-in feature; scheduled memory consolidation |
| Trigger | client.beta.dreams.create() API, with cron scheduling support |
| Features | Cross-agent pattern recognition, auditable output, original data left unmodified |
| Maturity | Research preview; access requires an application |

5.2 OpenClaw: Dreaming (Open Source)

Stage Model:

| Stage | Purpose | Operations |
| --- | --- | --- |
| Light | Quick scan | Mark high-value signals |
| REM | Cross-session clustering | Discover hidden connections |
| Deep | Deep restructuring | Promote/demote memory entries |
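
As a rough illustration of the three stages, the sketch below chains a Light scan, REM clustering, and a Deep promote/demote pass. It is not OpenClaw's actual implementation: the `Entry` fields, the `recalls` signal, and both thresholds are assumptions chosen only to make the stage boundaries visible.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    session: str
    topic: str
    text: str
    recalls: int  # how often the entry was read back; an assumed usefulness signal


def light_stage(entries, min_recalls=1):
    """Light: quick scan that keeps only entries showing some sign of value."""
    return [e for e in entries if e.recalls >= min_recalls]


def rem_stage(entries):
    """REM: cluster the flagged entries by topic, across all sessions."""
    clusters = defaultdict(list)
    for e in entries:
        clusters[e.topic].append(e)
    return dict(clusters)


def deep_stage(clusters, min_sessions=2):
    """Deep: promote topics seen in several sessions, demote the rest."""
    promoted, demoted = [], []
    for topic, members in sorted(clusters.items()):
        sessions = {e.session for e in members}
        (promoted if len(sessions) >= min_sessions else demoted).append(topic)
    return promoted, demoted


def dream(entries):
    """Run the three stages in order: Light, then REM, then Deep."""
    return deep_stage(rem_stage(light_stage(entries)))
```

Keeping each stage as a separate function means the cheap Light filter shrinks the input before the more expensive clustering and restructuring steps run.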

5.3 Overall Comparison

| Solution | Trigger | Cross-agent | Auditable | Maturity |
| --- | --- | --- | --- | --- |
| Claude Dreaming | API / scheduled | | | Research preview |
| OpenClaw | cron (3 AM) | | | Released |
| Gemini | Background async | | | Deployed |
| OpenAI Memory | Extracted during conversation | | | Live |

VI. Theoretical Foundation: Sleep-time Compute

In April 2025, teams from UC Berkeley and Letta published the paper "Sleep-time Compute: Beyond Inference Scaling at Test-time" (arXiv: 2504.13171), which provides theoretical support for the design of agent "dreaming".

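6.1 Core Formula

Traditional flow: user query → on-the-spot inference → answer
                  ↑ the same context is recomputed on every query

Sleep-time flow: idle-time preprocessing → stored learned context → fast answer at query time
                 ↑ preprocess once, reuse many times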
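
The contrast between the two flows can be sketched in a few lines. This is a toy model under stated assumptions, not the paper's method: `SleepTimeAgent` and `build_index` are hypothetical names, and `build_index` stands in for what would really be an expensive model call over the raw context.

```python
class SleepTimeAgent:
    """Digest the raw context once while idle, then reuse the digest per query."""

    def __init__(self, raw_context, preprocess):
        self._raw_context = raw_context
        self._preprocess = preprocess
        self._learned_context = None
        self.preprocess_calls = 0  # counts how often the expensive step ran

    def sleep(self):
        """Idle-time step: run the expensive preprocessing and store the result."""
        self._learned_context = self._preprocess(self._raw_context)
        self.preprocess_calls += 1

    def answer(self, query):
        """Query-time step: look up the stored digest instead of re-reading raw context."""
        if self._learned_context is None:
            self.sleep()  # fall back to preprocessing on first use
        return self._learned_context.get(query, "unknown")


def build_index(lines):
    """Stand-in for an expensive model call: parse 'key: value' notes into a dict."""
    index = {}
    for line in lines:
        key, _, value = line.partition(":")
        index[key.strip()] = value.strip()
    return index
```

However many queries follow a single `sleep()` call, `preprocess_calls` stays at 1, which is the "preprocess once, reuse many times" property of the sleep-time flow.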

6.2 Experimental Results

| Metric | Effect |
| --- | --- |
| Accuracy improvement | 13-18% (at the same test-time compute budget) |
| Compute reduction | ~5x |
| Average cost reduction | 2.5x (multiple related queries share the preprocessing result) |

VII. Implications for Agent Development

7.1 Three Key Signals

  1. From "tool" to "colleague": agents that can "dream" gain the ability to accumulate "work experience".
  2. From "single-shot intelligence" to "continuous intelligence": Dreaming gives agent memory a managed lifecycle of writing, organizing, refining, and elevating.
  3. The race between open source and closed source: OpenClaw's three-stage Dreaming is already open source and usable, which is surprisingly fast.

7.2 Architecture Design Recommendations

┌──────────────────────────────────────────────────────┐
│                     Agent Memory                     │
├──────────────────────────────────────────────────────┤
│                                                      │
│  ┌──────────────┐      ┌─────────────────────────┐   │
│  │    Memory    │ ───→ │        Dreaming         │   │
│  │    Write     │      │   Light → REM → Deep    │   │
│  │ (in session) │      │   (scheduled/offline)   │   │
│  └──────────────┘      └─────────────────────────┘   │
│         │                           │                │
│         │                           ↓                │
│         │              ┌────────────────────────┐    │
│         └────────────→ │  Refined memory store  │    │
│                        │ (review and rollback)  │    │
│                        └────────────────────────┘    │
└──────────────────────────────────────────────────────┘

7.3 Key Implementation Points

| Key Point | Description |
| --- | --- |
| Immutable original data | Always generate a new version so that audit and rollback remain possible |
| Three-stage processing | Light quick scan → REM pattern clustering → Deep restructuring |
| Configurable scheduling | Support cron scheduling; run at 3 AM by default |
| Token cost control | Set a maximum number of iterations; review mode is optional |
VIII. Enterprise Case Studies

| Company | Scenario | Effect |
| --- | --- | --- |
| Harvey (legal AI) | Long-form legal document drafting | Agent completion rate increased by about 6x |
| Netflix (platform engineering) | Log analysis agent | Multiple agents analyze in parallel and surface only recurring issues |
| Spiral (writing tool) | Parallel generation of multiple drafts | Haiku lead + Opus sub-agents + Outcomes scoring |
| Wisedocs (medical documents) | Quality review | 50% faster, 30% more errors caught |

IX. Conclusion

In 2026, AI agents have started to "dream".

This is not about building a more "human" machine. It is about solving a pure engineering problem: how to make agents smarter without enlarging the context window.

Perhaps one day we will seriously ask: when agents start dreaming, what do they dream about?

More efficient code? More accurate answers? Or a world without bugs?

Agents no longer need to wait for you to wake up before they start evolving.

🔗 Related Links

💭 Reflections & Practice

  1. For personal agent development: consider using OpenClaw's three-stage model as a reference and implementing lightweight memory consolidation in your own agent.
  2. For enterprise applications: cross-agent pattern discovery is the most valuable feature and can significantly improve the efficiency of multi-agent collaboration.
  3. For AI product design: memory writing (Memory) and memory consolidation (Dreaming) should be designed as a pair of complementary mechanisms.