← 返回开源项目

GenericAgent深度学习笔记:自我进化的极简Agent框架

GenericAgent Deep Dive: Self-Evolving Minimalist Agent Framework

🔗 GitHub 项目地址 📄 arXiv技术报告

一、项目概览

1. Project Overview

GenericAgent 是一个极简、可自我进化的自主Agent框架。核心仅 ~3K行代码,通过 9个原子工具 + ~100行Agent Loop,赋予任意LLM对本地计算机的系统级控制能力。

GenericAgent is a minimal, self-evolving autonomous agent framework. Its core is just ~3K lines of code. Through 9 atomic tools + ~100-line Agent Loop, it grants any LLM system-level control over a local computer.

💡 核心设计理念

"不预设技能,靠进化获得能力" — Don't preload skills, evolve them.

"Don't preload skills — evolve them."

项目信息 Project Info 内容
GitHub lsdefine/GenericAgent
Star数 446+ (单日新增)
代码量 ~3K行核心代码
上下文窗口 <30K(其他Agent的1/10)
支持模型 Claude / Gemini / Kimi / MiniMax

二、核心技术:上下文信息密度最大化

2. Core Technology: Context Information Density Maximization

技术报告的核心观点:长时程Agent性能不取决于上下文长度,而取决于有限上下文内维持了多少决策相关信息。

The core insight from the technical report: Long-horizon agent performance is determined not by context length, but by how much decision-relevant information is maintained within a finite context budget.

传统Agent:200K-1M上下文窗口 → 大量噪声 → 幻觉增加 → 成功率下降

Traditional Agents: 200K-1M context → Heavy noise → More hallucinations → Lower success rate


GenericAgent:<30K上下文 → 信息密度高 → 噪声少 → 成功率反而更高

GenericAgent: <30K context → High info density → Less noise → Higher success rate

四个核心组件

Four Core Components

组件 Component 功能 Function
1. 最小原子工具集 Minimal Atomic Toolset 保持接口简单,通过组合产生无限能力 Keep interface simple, generate infinite capabilities through combination
2. 分层按需记忆 Hierarchical On-demand Memory 默认只显示高层概览,需要时才展开 Show only high-level overview by default, expand when needed
3. 自我进化机制 Self-Evolution Mechanism 将验证过的轨迹转化为可复用SOP和代码 Transform verified trajectories into reusable SOPs and code
4. 上下文压缩层 Context Compression Layer 在长执行中保持信息密度 Maintain information density during long executions

三、自我进化机制详解

3. Self-Evolution Mechanism Deep Dive

这是GenericAgent区别于其他Agent框架的根本所在。每次解决新任务,Agent就将执行路径自动固化为Skill,供后续直接调用。

This is what fundamentally distinguishes GenericAgent from other agent frameworks. Every time it solves a new task, it automatically crystallizes the execution path into a Skill for direct reuse later.

[遇到新任务] → [自主摸索] → [将执行路径固化为Skill] → [写入记忆层] → [下次同类任务直接调用]

进化示例对比

Evolution Examples

你说的一句话 Your Input 第一次做了什么 First Time 之后每次 Every Time After
"监控股票并提醒我" Install mootdx → 构建选股流程 → 配置定时任务 → 保存Skill 一句话启动
"用Gmail发这个文件" 配置OAuth → 编写发送脚本 → 保存Skill 直接可用

🌟 用几周后,你的Agent实例将拥有一套任何人都没有的专属技能树,全部从3K行种子代码中生长而来。

🌟 After a few weeks, your agent instance will have a skill tree no one else in the world has — all grown from 3K lines of seed code.

四、5层分层记忆系统

4. 5-Layer Hierarchical Memory System

层级 名称 Name 功能 Function
L0 元规则 Meta Rules Agent的基础行为规则和系统约束 Core behavioral rules and system constraints
L1 记忆索引 Insight Index 极简索引层,用于快速路由与召回 Minimal memory index for fast routing and recall
L2 全局事实 Global Facts 长期运行中积累的稳定知识 Stable knowledge accumulated over long-term operation
L3 任务Skills/SOPs Task Skills / SOPs 完成特定任务的可复用流程 Reusable workflows for completing specific task types
L4 会话归档 Session Archive 从已完成任务提炼的归档记录 Archived task records distilled from finished sessions

五、9个原子工具

5. 9 Atomic Tools

GenericAgent仅提供9个原子工具,构成与外部世界交互的基础能力。工具数量少,但通过组合可以产生无限能力。

GenericAgent provides only 9 atomic tools, forming the foundational capabilities for interacting with the outside world. Few tools, but infinite capabilities through combination.

工具 功能 Function
code_run 执行任意代码 Execute arbitrary code
file_read 读取文件 Read files
file_write 写入文件 Write files
file_patch 修改文件 Patch/modify files
web_scan 感知网页内容 Perceive web content
web_execute_js 控制浏览器行为 Control browser behavior
ask_user 人机协作确认 Human-in-the-loop confirmation
update_working_checkpoint 记忆管理:持久化上下文 Memory management: persist context
start_long_term_update 记忆管理:跨会话积累经验 Memory management: accumulate experience across sessions

六、与同类产品对比

6. Comparison with Similar Products

特性 GenericAgent OpenClaw Claude Code
代码量 ~3K行 ~530K行 已开源(体量大)
部署方式 pip install + API Key 多服务编排 CLI + 订阅
浏览器控制 注入真实浏览器(保留登录态) 沙箱/无头浏览器 通过MCP插件
OS控制 键鼠、视觉、ADB 多Agent委派 文件+终端
自我进化 ✅ 自主生长Skill和工具 插件生态 ❌ 会话间无状态
Token消耗 <30K上下文 200K-1M 中等

七、快速开始

7. Quick Start

方法一:标准安装

Method 1: Standard Installation

# 1. 克隆仓库
git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent

# 2. 安装最小依赖
pip install requests streamlit pywebview

# 3. 配置API Key
cp mykey_template.py mykey.py
# 编辑mykey.py,填入你的LLM API Key

# 4. 启动
python launch.pyw

方法二:uv快速安装

Method 2: uv Quick Installation

git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent
uv pip install -e ".[ui]"
cp mykey_template.py mykey.py
python launch.pyw

多平台Bot支持

Multi-Platform Bot Support

# 微信Bot
pip install pycryptodome qrcode requests
python frontends/wechatapp.py

# QQ Bot
pip install qq-botpy
python frontends/qqapp.py

# 飞书Bot
pip install lark-oapi
python frontends/fsapp.py

八、核心金句

8. Key Quotes

"don't preload skills — evolve them."

"Long-horizon performance is determined not by context length, but by how much decision-relevant information is maintained within a finite context budget."

"After a few weeks, your agent instance will have a skill tree no one else in the world has — all grown from 3K lines of seed code."

九、相关链接

9. Related Links

资源 链接
GitHub https://github.com/lsdefine/GenericAgent
技术报告 https://arxiv.org/abs/2604.17091
Datawhale教程 https://datawhalechina.github.io/hello-generic-agent/