⚙ SkVM: An In-Depth Study of the Skills Virtual Machine

SJTU IPADS Lab · April 2026

📚 Learning Source

Type: Research Paper + Open Source
ArXiv: 2604.03088
GitHub: SJTU-IPADS/SkVM
Authors: Le Chen, Erhu Feng, Yubin Xia, Haibo Chen
Institution: Shanghai Jiao Tong University, IPADS Lab

🎯 Key Takeaways

1. Deep Problem Analysis

The same skill performs very differently across models and execution frameworks: 15% of tasks get worse once the skill is used, and for 87% of tasks at least one model fails to benefit from the skill.

2. Core Design Philosophy

Treat skills as "code" and LLMs as "heterogeneous processors", and apply compiler techniques to skill portability and efficiency, with the goal of "write once, run efficiently everywhere".

3. Systematic Solution

A complete compilation and runtime system built from capability profiling (26 primitive capabilities), AOT compilation (capability adaptation, environment binding, concurrency extraction), and JIT optimization (code solidification, adaptive recompilation).

4. Significant Results

Task completion rate improves by 15.3% on average, token consumption drops by up to 40%, parallel execution reaches a speedup of up to 3.2×, and code solidification cuts latency by 19-50×.

5. Strong Ecosystem Compatibility

Supports mainstream frameworks such as OpenClaw, Hermes, OpenCode, and pi Agent, and ships with pre-built capability profiles.

📖 Main Content

I. Background and Problems

1.1 Rapid Growth of the Skills Ecosystem

Since 2025, the AI agent ecosystem has grown explosively. Frameworks such as OpenClaw, Hermes, and Claude Code package domain knowledge into modular Skills, which has produced a thriving skills marketplace. The two major platforms, clawhub.ai and skills.sh, list more than 118,000 skill packages covering data analysis, financial analysis, office automation, software development, and nearly every other common scenario.

1.2 The "Failure to Acclimatize" Problem of Skills

The IPADS lab at Shanghai Jiao Tong University systematically analyzed these 118,000 skills and found a troubling fact: the same skill performs very differently depending on the model or framework that runs it.

  • 15%: tasks whose performance degrades with the skill
  • 17%: tasks whose score is unchanged
  • 87%: tasks where at least one model cannot benefit

1.3 Three Major "Mismatch" Challenges

🔴 Model Mismatch

AI models differ widely in capability. Skills are often written as if the model follows instructions perfectly, so weaker models fail to understand and execute them correctly.

🟡 Harness Mismatch

Agents run inside an "agent harness". Frameworks differ in their tool-call interfaces, context management, and error handling, and these differences directly affect how a skill executes.

🟠 Environment Mismatch

A skill may declare "requires Python 3.9+" or "depends on numpy" while the user's machine has neither installed, so the model retries blindly and wastes tokens.

II. SkVM's Core Design Philosophy

2.1 Inspiration: Compiler Concepts

The research team's core insight: in the agent era, Skills are "code" and different LLMs are "heterogeneous processors".

Traditional Computing | Agent Era
C source code | Skills (natural-language code)
Processors | LLM models
Compiler + VM | SkVM

2.2 SkVM Architecture

SkVM borrows three execution modes from traditional compilation technology:

  1. Interpreted execution: the current mainstream approach, which passes the skill text directly to the model
  2. Ahead-of-Time (AOT) compilation: compiles and optimizes the skill in advance, at install time
  3. Just-in-Time (JIT) optimization: optimizes dynamically while the skill runs

AOT Compilation (at install time)
  • Capability-based compilation
  • Environment binding
  • Concurrency extraction
JIT Optimization (at runtime)
  • Code solidification
  • Adaptive recompilation

III. Three Steps of AOT Compilation

3.1 Step 1: Capability-Based Compilation

The researchers distilled 26 primitive capabilities across four categories and defined several proficiency levels for each:

Category | Example capability | Proficiency levels
Code Generation | gen.code.shell | L1: basic commands / L2: pipes and redirection / L3: complex scripts
Tool Usage | tool.api.call | L1-L5, increasing complexity
Reasoning | reason.planning | L1-L3, increasing depth
Instruction Following | follow.json.format | L1-L3, increasing strictness
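
To make the idea concrete, here is a minimal sketch of how a capability profile could be compared with what a skill needs. The capability names come from the table above; everything else (the `find_gaps` helper, the dictionary layout, and the specific level numbers) is a hypothetical illustration, not SkVM's actual data format.

```python
# Hypothetical sketch: compare what a skill needs with what a model can do.
# The numeric levels below are invented for illustration only.

def find_gaps(required: dict[str, int], profile: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {capability: (required_level, model_level)} for every shortfall."""
    gaps = {}
    for capability, needed in required.items():
        have = profile.get(capability, 0)  # an unprofiled capability counts as level 0
        if have < needed:
            gaps[capability] = (needed, have)
    return gaps


skill_requirements = {"gen.code.shell": 3, "follow.json.format": 2}
model_profile = {"gen.code.shell": 2, "follow.json.format": 3, "reason.planning": 2}

gaps = find_gaps(skill_requirements, model_profile)
for capability, (needed, have) in gaps.items():
    print(f"{capability}: skill needs L{needed}, model is at L{have}")
```

One plausible use of such a gap list is to tell the compiler which parts of a skill to rewrite in simpler terms for the target model, for example turning one complex script into several basic commands.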

3.2 Step 2: Environment Binding

The compiler extracts every dependency from the skill and generates install/check scripts, so the environment can be set up in one step before the skill runs and the model does not retry blindly.

# Environment binding example: install pandas only if it cannot be imported
if ! python -c "import pandas" 2>/dev/null; then
    python -m pip install pandas
fi
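
The same check can be done without launching a subprocess per package. The sketch below is an assumption about what a generated check script might look like, not output from SkVM: `missing_modules` and `python_ok` are hypothetical helpers built on the standard library's `importlib.util.find_spec` and `sys.version_info`.

```python
import importlib.util
import sys


def missing_modules(modules: list[str]) -> list[str]:
    """Return the top-level module names that cannot be imported here."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def python_ok(minimum: tuple[int, int]) -> bool:
    """True if the running interpreter is at least `minimum` (major, minor)."""
    return sys.version_info[:2] >= minimum


# Hypothetical dependency list extracted from a skill.
declared = ["json", "surely_not_installed_pkg_xyz"]
print("missing:", missing_modules(declared))
print("python >= 3.9:", python_ok((3, 9)))
```

Reporting everything that is missing in one pass lets the installer fix the environment before the agent starts, which is the point of binding the environment ahead of time.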

3.3 Step 3: Concurrency Extraction

AOT compilation uncovers parallelism at several granularities in a skill's execution:

DLP (Data-Level Parallelism)

One instruction applied to many data items (e.g., batch processing of multiple files)

ILP (Instruction-Level Parallelism)

Instructions with no mutual dependencies issued in parallel (e.g., independent API calls)

TLP (Thread-Level Parallelism)

Multiple independent sub-agents, each completing a different subtask
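
The instruction-level case can be sketched with the standard library alone. This is an illustration under stated assumptions, not SkVM's scheduler: the `run_in_waves` helper, the step names, and the dependency map are all hypothetical. Steps whose dependencies are satisfied are issued together to a thread pool, and dependent steps wait for the next wave.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_in_waves(steps, deps):
    """Run steps in dependency order; independent steps run in parallel.

    steps: {name: zero-argument callable}
    deps:  {name: set of step names that must finish first}
    Returns (results, waves), where each wave lists the steps issued together.
    """
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in steps})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    results, waves = {}, []
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            waves.append(ready)
            # pool.map preserves input order, so zip pairs names with results
            for name, value in zip(ready, pool.map(lambda n: steps[n](), ready)):
                results[name] = value
                sorter.done(name)
    return results, waves


# Hypothetical skill: two independent fetches, then a report that needs both.
steps = {
    "fetch_prices": lambda: [3, 4],
    "fetch_rates": lambda: 2,
    "report": lambda: "done",
}
deps = {"report": {"fetch_prices", "fetch_rates"}}

results, waves = run_in_waves(steps, deps)
print(waves)  # [['fetch_prices', 'fetch_rates'], ['report']]
```

Data-level parallelism is the special case in which every step is the same function applied to a different input, and thread-level parallelism replaces the callables with whole sub-agents.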

IV. JIT Runtime Optimization

4.1 Code Solidification

Scripts defined in a skill are often code templates with variable parameters. If the LLM regenerates them on every run, tokens are wasted. Code solidification works in two stages:

  1. AOT stage: generate a code fingerprint, a template, and a parameter list
  2. Runtime stage: when a match succeeds, execute the solidified code directly and skip LLM regeneration
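
The two stages above can be sketched as follows. This is a minimal illustration under my own assumptions, not SkVM's implementation: the `SolidifiedScript` class, the `render` function, and the rule "the supplied arguments must equal the parameter list" are hypothetical, and the fingerprint is used only as a stable identifier for the template.

```python
import hashlib
import shlex
from dataclasses import dataclass
from string import Template
from typing import Optional


@dataclass(frozen=True)
class SolidifiedScript:
    template: str            # code with $placeholders, produced at AOT time
    params: tuple[str, ...]  # parameter names the template expects

    @property
    def fingerprint(self) -> str:
        """Short, stable identifier derived from the template text."""
        return hashlib.sha256(self.template.encode("utf-8")).hexdigest()[:12]


def render(script: SolidifiedScript, args: dict[str, str]) -> Optional[str]:
    """Fill in the template if args match the parameter list exactly.

    Returns None on a mismatch so the caller can fall back to the LLM.
    """
    if set(args) != set(script.params):
        return None
    quoted = {name: shlex.quote(value) for name, value in args.items()}
    return Template(script.template).substitute(quoted)


script = SolidifiedScript(template="head -n $rows $path", params=("rows", "path"))

print(render(script, {"rows": "5", "path": "data.csv"}))  # head -n 5 data.csv
print(render(script, {"rows": "5"}))                      # None
```

Quoting each argument with `shlex.quote` keeps a value such as a file name with spaces from changing the shape of the command, which matters once no model is reviewing the generated code.
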

4.2 Adaptive Recompilation

When errors or retries occur at runtime, the system collects the error logs and feeds them back to the compiler, which re-optimizes the skill automatically so that the same class of error does not recur.
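
A feedback loop of this kind needs some rule for deciding when to recompile. The sketch below assumes a simple one: recompile once the same kind of error has been seen a fixed number of times for a skill. The `FailureTracker` class, its threshold, and the skill name are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter


class FailureTracker:
    """Counts failures per (skill, error kind) and flags skills to recompile."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self.counts: Counter = Counter()

    def record(self, skill: str, error_kind: str) -> bool:
        """Record one failure; return True once the threshold is reached."""
        self.counts[(skill, error_kind)] += 1
        return self.counts[(skill, error_kind)] >= self.threshold

    def feedback(self, skill: str) -> dict[str, int]:
        """Error kinds and counts to hand back to the compiler for this skill."""
        return {kind: n for (name, kind), n in self.counts.items() if name == skill}


tracker = FailureTracker(threshold=2)
first = tracker.record("csv-report", "ModuleNotFoundError")
second = tracker.record("csv-report", "ModuleNotFoundError")
print(first, second)                   # False True
print(tracker.feedback("csv-report"))  # {'ModuleNotFoundError': 2}
```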

V. Experimental Results and Evaluation

  • +15.3%: task completion rate (average improvement)
  • -40%: token consumption (maximum reduction)
  • 3.2×: parallel speedup
  • 19-50×: latency reduction from code solidification

Supported Frameworks

OpenClaw, Hermes, OpenCode, pi Agent, BareAgent

VI. SkVM Installation and Usage Guide

6.1 Install SkVM

# One-line install with curl (macOS / Linux)
curl -fsSL https://skillvm.ai/install.sh | sh

# Or install via npm
npm i -g @ipads-skvm/skvm

6.2 Quick Start

# 1. Configure SkVM
skvm config init

# 2. Profile the model's capabilities (about 20 minutes)
skvm profile \
  --adapter=bare-agent \
  --model=anthropic/claude-sonnet-4.6

# 3. Compile the skill
skvm aot-compile \
  --skill=path/to/skill-dir \
  --model=anthropic/claude-sonnet-4.6 \
  --adapter=bare-agent \
  --pass=1

# 4. Run JIT optimization
skvm jit-optimize \
  --skill=path/to/skill-dir \
  --task-source=synthetic

VII. Implications for Knowledge Base Development

📋 Skill Metadata Standardization
  • State clearly which model capability levels the skill requires
  • Declare all dependencies (Python packages, tools, system services)
  • Provide hints for concurrent execution
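
As one way to picture such metadata, here is a small sketch of a record that holds the three kinds of information listed above. The `SkillMetadata` class and all of its field names are my own hypothetical schema, not a format defined by SkVM or by any skill marketplace.

```python
from dataclasses import dataclass, field


@dataclass
class SkillMetadata:
    name: str
    # capability name -> minimum proficiency level the skill assumes
    required_capabilities: dict[str, int] = field(default_factory=dict)
    python_packages: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)
    system_services: list[str] = field(default_factory=list)
    # each inner list names steps that are safe to run at the same time
    parallel_steps: list[list[str]] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the record looks usable."""
        issues = []
        if not self.required_capabilities:
            issues.append("no capability levels declared")
        for capability, level in self.required_capabilities.items():
            if level < 1:
                issues.append(f"{capability}: level must be at least 1")
        return issues


meta = SkillMetadata(
    name="csv-report",
    required_capabilities={"gen.code.shell": 2},
    python_packages=["pandas"],
    parallel_steps=[["load_sales", "load_costs"]],
)
print(meta.problems())                           # []
print(SkillMetadata(name="draft").problems())    # ['no capability levels declared']
```
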
🧪 Skill Testing System
  • Cross-model testing (at least 2-3 mainstream models)
  • Cross-framework testing (OpenClaw, Hermes, etc.)
  • Environment dependency testing
📝 Skill Writing Standards
  • Avoid complex relative paths
  • Avoid overly complex shell pipelines
  • Provide fallback options

VIII. Summary and Outlook

SkVM points to an important direction for agent skill systems: moving skills from "natural-language code" toward a skill file system that can be compiled and optimized.

Core Value

  • Improved stability: capability adaptation and environment binding address the instability of skills across environments
  • Higher efficiency: concurrency extraction and code solidification substantially reduce token consumption and latency
  • A healthier ecosystem: unified capability profiles and a common compilation interface make it easier to share skills across platforms

Future Directions

  • Richer primitive capabilities (extending the current 26)
  • Smarter compilation strategies (introducing machine learning)
  • Stronger security mechanisms
  • Broader ecosystem support

💭 Reflections and Practice

Implications for Agent Development

SkVM's success suggests that the future of agent systems lies in engineering rather than "magic prompts".

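  • Abandon the "universal prompt" illusion: accept that models and frameworks differ in capability
  • Embrace compiler thinking: write skills the way you write code
  • Focus on efficiency: token consumption and latency are key metrics in production
  • Build a testing system: a skill is not something you write and then forget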

Personal Reflection

What strikes me most about SkVM is how it takes a seemingly chaotic problem (skills behaving inconsistently across environments) and, through systematic analysis (the three mismatches) and a classic design pattern (the compiler), turns it into a problem that engineering can solve.

This reminds me of Lei Jun's "seven-character mantra" (专注、极致、口碑、快): focus, perfection, reputation, speed.

  • Focus: SkVM concentrates on one problem, the portability and efficiency of skills
  • Perfection: 26 primitive capabilities plus two layers of optimization (AOT and JIT) push the approach as far as it will go
  • Reputation: evaluated on 8 models and 3 frameworks, letting the data speak
  • Speed: 19-50× lower latency and a 3.2× speedup, genuinely fast

This is the power of engineering thinking: instead of complaining about a problem, solve it systematically.