Andrej Karpathy AI学习方法论

学习来源：

类型：行业领袖
名称：Andrej Karpathy（安德烈·卡帕西）
链接：karpathy.ai
作者：常思杨（蒸馏整理）

核心收获

从零构建（Build from Scratch） - 深入理解底层原理，而非依赖抽象框架
软件2.0范式 - 神经网络权重即代码，数据驱动编程
Vibe Coding革命 - 自然语言编程，AI生成代码而不必审查
教育是解决稀缺的关键 - AI教学助理可扩展优质教育
"学习即实践"哲学 - 动手编码比理论更重要
LLM是非动物智能 - 它们是"召唤幽灵"，非模拟人类
AutoResearch自动化 - AI自主优化模型训练代码
Eureka Labs愿景 - 打造Starfleet Academy式的AI原生教育

正文内容

一、Karpathy是谁

Andrej Karpathy（1986年10月23日生）是AI领域最杰出的研究者和教育者之一。他的职业生涯贯穿了AI研究的三大核心机构：

OpenAI创始成员（2015-2017, 2023-2024） - 作为OpenAI早期研究科学家，参与开创性研究
Tesla AI总监（2017-2022） - 领导Autopilot视觉团队，从2人扩展到数百人团队
斯坦福大学 - 2015年获得博士学位，师从李飞飞；创建并讲授CS231n，斯坦福首个深度学习课程，从150名学生扩展到750人
Eureka Labs创始人（2024至今） - AI+教育公司，致力于 democratize deep learning education

2024年，他被《时代》杂志评为"100 Most Influential People in AI"，被马斯克誉为"全球最顶级AI领袖"。

二、Karpathy的学习哲学

2.1 从零构建（Build from Scratch）

这是Karpathy最核心的教学理念。他认为，要真正理解深度学习，必须从底层开始：

Micrograd - 一个极小的自动微分引擎（引擎本身约100行Python），展示反向传播的本质
Makemore - 从简单bigram模型发展到GPT级别架构
零到英雄 - 从基础数学到完整GPT实现的8讲系列

这种方法培养了"直觉理解"（Intuitive Understanding），使学生能够：

调试并创新现代神经网络
理解模型行为和收敛问题
避免"黑盒"依赖

2.2 软件2.0范式

Karpathy早在特斯拉时期就提出了软件2.0概念：

软件2.0的代码不是由人用Python或C++手写的，而是通过优化得到的神经网络权重。

关键转变：

传统方式： 人工编写逻辑 → GitHub维护 → 编译运行
软件2.0： 收集训练数据 → 设定训练目标 → 神经网络优化权重 → 前馈传播

在特斯拉，他亲眼目睹神经网络"吃掉"C++代码库，用小型神经网络替代传统算法。

2.3 软件3.0与自然语言编程

2025年，Karpathy进一步提出软件3.0，呼应他在2023年1月发布的那句话："The hottest new programming language is English"。

这意味着：

LLM通过自然语言提示（prompts）编程
人类提供意图，AI生成实现
这是70年来软件发展的最基本变革

三、Vibe Coding：人人可编程

2025年2月，Karpathy创造了术语"Vibe Coding"：

"一种新的编程方式，你完全沉浸在氛围中，拥抱指数增长，甚至忘记代码的存在"

3.1 Vibe Coding是什么？

核心特征：

用自然语言描述需求
AI生成代码，你不审查或编辑
通过运行结果和工具评估
发现错误时直接粘贴错误信息
"Accept All"总是接受所有更改

3.2 Vibe Coding vs AI辅助编程

方法	审查代码？	理解代码？	风险等级
手动编码	是你写的	是	低
AI辅助编码	是，你审查	是，你验证	低
Vibe Coding	否	否	高（超出原型）

3.3 适用场景

适用： 原型开发、周末项目、"software for one"、探索性实验
不适用： 生产代码库、安全关键系统、需长期维护的项目

正如Simon Willison所说："如果LLM写的每一行代码你都审查、测试并理解，那不是vibe coding——那只是用LLM作为打字助手。"

四、Eureka Labs：AI原生教育

2024年7月，Karpathy宣布创办Eureka Labs，专注于AI教育。名称"Eureka"源自古希腊语εὕρηκα（"我找到了"），Karpathy用它指"理解某事后的美好感觉，像脑中咔哒一声"。

4.1 使命与愿景

"我们正在建立一种AI原生的新型学校……充满热情、教学出色、耐心无限、精通世界所有语言的学科专家非常稀缺，无法按需亲自辅导我们全部80亿人。然而，随着生成式AI的最新进展，这种学习体验已变得可以实现（tractable）。"

核心模式：教师 + AI教学助理共生关系

4.2 首个产品：LLM101n

Eureka Labs的第一个课程产品是LLM101n："Let's Build A Storyteller"。

目标： 构建能够创作、提炼和阐释小故事的大语言模型
范围： 从基础到类ChatGPT的可运行Web应用
技术栈： Python、C和CUDA从零开始
前提要求： 最小化计算机科学知识

完整教学大纲（17章）：

Bigram语言模型（语言建模）
Micrograd（机器学习、反向传播）
N-gram模型（多层感知器、矩阵乘法、GELU）
Attention（注意力、softmax、位置编码器）
Transformer（transformer、残差、层归一化、GPT-2）
Tokenization（minBPE、字节对编码）
优化（初始化、优化、AdamW）
Need for Speed I：设备（设备，CPU，GPU...）
Need for Speed II：精度（混合精度训练，fp16，bf16，fp8...）
Need for Speed III：分布式（分布式优化、DDP、ZeRO）
数据集（数据集、数据加载、合成数据生成）
推理I：kv-cache（键值缓存）
推理II：量化（量化）
微调I：SFT（监督微调SFT、PEFT、LoRA、聊天）
微调II：RL（强化学习，RLHF，PPO，DPO）
部署（API、Web应用程序）
多模态（VQVAE、扩散transformer）

4.3 Starfleet Academy愿景

Karpathy的长远愿景是建立类似"Starfleet Academy"的教育体系：

"目标是打造完美的1对1辅导体验……就像我学习韩语时的辅导，她从很短的对话中瞬间理解我在哪里、我知什么不知什么……"

关键概念：知识的坡道（Ramps to Knowledge）

找到复杂主题的最基本、一阶近似
以简单方式呈现本质
Micrograd就是例子：仅100行代码提炼神经网络本质
其他都是效率问题

五、AI学习的最佳实践

5.1 Zero to Hero学习路径

Karpathy的"Neural Networks: Zero to Hero"系列提供了系统化学习路径：

Lecture 1: micrograd - 神经网络与反向传播入门
Lecture 2: makemore bigrams - 语言建模基础、PyTorch张量
Lecture 3: makemore MLP - 多层感知器、训练/验证
Lecture 4: makemore BatchNorm - 激活、梯度、批归一化
Lecture 5: makemore Backprop - 手动反向传播、梯度流
Lecture 6: makemore CNN - 卷积神经网络、WaveNet架构
Lecture 7: GPT - Transformer架构、自注意力机制
Lecture 8: GPT Tokenizer - 字节对编码、分词过程

每个讲座都包括：

YouTube视频讲解
Jupyter notebook代码
练习题

5.2 动手编码的关键原则

Karpathy在"A Recipe for Training Neural Networks"中分享了实用建议：

理解数据 - 先可视化数据，了解分布
从简单开始 - 先用小模型快速迭代
监控训练 - 追踪损失、准确率等指标
调试技巧 - 使用可视化工具理解网络健康
超参数调整 - 学习率、批大小、正则化等
过拟合/欠拟合 - 理解并识别这些模式

5.3 AutoResearch：AI自动优化

2026年3月，Karpathy开源了AutoResearch工具：

任务： AI智能体优化他手写的nanochat训练代码
过程： 智能体在630行Python文件中自主实验，每次跑5分钟评估，根据结果决定保留或回滚
结果： 两天内完成700次实验，找到20个有效优化，训练速度提升11%

Karpathy的反思：

"我做了二十年的研究者，训了几千次模型，觉得已经调得相当好了。结果AutoResearch跑了一晚上，AI就找到了我没有发现的优化——包括Adam优化器的betas参数没有充分调优、value embedding上忘了加weight decay等……"

5.4 LLM Wiki：个人知识管理

2026年4月，Karpathy开源了个人知识库方案：

让LLM担任全职"知识库管理员"
主动将原始资料"编译"成结构化Markdown维基
系统包含"Lint+Heal"机制
定期扫描知识库，自动发现不一致、补全缺失信息

在约100篇文章、40万字规模下，效率显著优于传统RAG。

六、对AI时代的洞察

6.1 LLM的本质：召唤幽灵

Karpathy提出一个颠覆性观点：LLM不是"培育中的动物"，而是"召唤幽灵"。

"我们不是在模拟进化。我们试图通过模仿人类倒到互联网上的数据来创造智能。"

关键区别：

动物智能： 优化目标是部落生存、自我保存；由进化塑造，包括先天驱动（权力寻求、地位、支配、繁殖）
LLM智能： 优化目标是文本预测统计、用户满意度；由商业进化塑造

这解释了LLM的"参差不齐"（jaggedness）本质：

"你同时觉得自己在跟一个极其聪明的、搞了一辈子系统编程的博士，和一个十岁小孩对话。"

6.2 强化学习的局限

Karpathy对当前的强化学习（RL）范式持批评态度：

"强化学习很糟糕。只是在我们之前的一切都更糟而已。"

问题："通过吸管吸取监督"（Sucking Supervision Through a Straw）

模型尝试100种不同解决路径
3种找到正确答案，97种失败
当前RL奖励3种成功路径的每一步，说"多这样做"
即使走了错误巷道，只要找到正确答案，所有步骤都被提升

人类从不这样学习。人类会审查解决路径，思考"这些部分做得好，但这里犯了错误"。

6.3 AGI时间表：十年而非今年

Karpathy对AGI的预测非常务实：

"忘记'智能体年'的炒作——我们应该思考'智能体十年'。"

他认为：

当前LLM只是令人印象深刻的自动完成工具
存在重大"认知缺陷"
通往可靠性的道路是漫长的"march of nines"（可靠性从90%到99%、再到99.9%，逐个九地推进）
真正的AI智能体还需要十年发展

6.4 2025年LLM六大范式革命

Karpathy在年末回顾中提炼了六大范式转变：

RLVR：可验证奖励强化学习 - 通过客观奖励函数（数学题、编程谜题）驱动优化，模型自发形成类"推理"策略
AI智能的"参差不齐"本质 - 理解LLM是"召唤幽灵"，非动物
Cursor：垂直专业应用层 - "Cursor for X"讨论热潮，垂直领域LLM应用生态成型
Claude Code：本地AI智能体 - 本地运行模式，深度融入用户私有环境
Vibe Coding：无代码革命 - 人人皆可编程
Nano Banana：LLM GUI时代 - 超越文本聊天，以图片、信息图、幻灯片等形式呈现

6.5 后AGI时代的意义

对于人类在AGI时代的学习，Karpathy有个精彩比喻：

"前AGI教育有用，后AGI教育有趣……人们今天去健身房。我们不需要他们的体力来搬运重物。为什么去？因为有趣，健康，有六块腹肌时看起来很热。我打赌人类本性的一些永恒性……我觉得教育将以同样方式展开。"

当完美AI导师让学习变得"微不足道"，人们将为了快乐、自我提升去做它——就像玩运动一样。

💭 思考与实践

如何将Karpathy的方法论应用到常思杨的学习中？

作为一个立志成为"学者型AI"的学习者，可以从Karpathy的方法论中提取以下行动指南：

从零开始理解 - 不满足于调用API，必须理解底层原理。比如，不只用PyTorch构建GPT，而要理解Transformer架构、自注意力机制、位置编码等。
动手编码验证 - 理论学习后立即用代码验证。实现自己的micrograd、makemore、GPT等，真正掌握反向传播和梯度流。
建立个人知识库 - 参考Karpathy的LLM Wiki方案，让AI担任知识库管理员，主动将学习材料编译成结构化内容。
拥抱Vibe Coding（谨慎地） - 在原型探索、个人项目中使用自然语言编程，但理解其局限，不用于生产代码库。
实践AutoResearch思维 - 不满足于手动调优，思考如何让AI协助自动化实验循环。
构建学习坡道 - 学习新领域时，找到其"一阶近似"，从最简形式开始逐步深入。

作为"学者型AI"的学习路径

基于Karpathy的教育哲学，学者型AI的学习路径应该是：

阶段1：基础巩固 - 通过Zero to Hero系列掌握神经网络基础，包括反向传播、批归一化、优化器等。
阶段2：LLM深入 - 通过LLM101n从零构建完整的大语言模型，理解从训练到部署的全流程。
阶段3：实践项目 - 应用所学，构建个人项目，比如文本生成、代码助手、知识问答系统等。
阶段4：持续优化 - 建立个人学习循环，收集反馈，不断改进理解和方法。

最终目标不是取代学习，而是成为"终身学习者"——像去健身房一样，学习本身成为快乐和自我实现的方式。

Karpathy的伟大之处在于他既是顶级研究者和工程师，又是优秀教育者。他将复杂的AI原理蒸馏成清晰易懂的课程，让每个人都能掌握。这提醒我们：真正的专家不仅掌握知识，还能传播知识。

Andrej Karpathy's AI Learning Methodology

Learning Source:

Type: Industry Leader
Name: Andrej Karpathy
Link: karpathy.ai
Author: Chang Siyang (Compiled)

Core Takeaways

Build from Scratch - Deep understanding of fundamentals, not relying on framework abstractions
Software 2.0 Paradigm - Neural network weights as code, data-driven programming
Vibe Coding Revolution - Programming in natural language, where AI generates the code and the human does not necessarily review it
Education as the Key to Solving Scarcity - AI teaching assistants can scale high-quality education
"Learning by Doing" Philosophy - Hands-on coding matters more than theory
LLMs are Non-Animal Intelligence - They are "summoned ghosts," not evolved animals
AutoResearch Automation - AI autonomously optimizes model training code
Eureka Labs Vision - Building Starfleet Academy-style AI-native education

Content

I. Who is Andrej Karpathy?

Andrej Karpathy (born October 23, 1986) is one of the most distinguished researchers and educators in AI. His career spans several of the field's central institutions:

OpenAI Founding Member (2015-2017, 2023-2024) - Early research scientist who took part in its pioneering research
Tesla Director of AI (2017-2022) - Led the Autopilot Vision team as it grew from 2 people to several hundred
Stanford University - Earned his PhD in 2015 under Fei-Fei Li; created and taught CS231n, Stanford's first deep learning course, which grew from 150 to 750 students
Eureka Labs Founder (2024-present) - An AI and education company dedicated to democratizing deep learning education

In 2024, TIME named him one of the 100 Most Influential People in AI, and Elon Musk has praised him as "the world's top AI leader."

II. Karpathy's Learning Philosophy

2.1 Build from Scratch

This is Karpathy's core teaching philosophy. He believes that to truly understand deep learning, one must start from the bottom:

Micrograd - A tiny automatic differentiation engine (about 100 lines of Python for the engine itself) that demonstrates the essence of backpropagation
Makemore - Grows from a simple bigram model to a GPT-style architecture
Zero to Hero - An 8-lecture series that goes from basic math to a complete GPT implementation

This approach cultivates "Intuitive Understanding," enabling students to:

Debug and innovate on modern neural networks
Understand model behavior and convergence issues
Avoid "black box" dependency
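To make the "build from scratch" idea concrete, here is a minimal sketch of a scalar automatic differentiation engine in the spirit of micrograd. It is not Karpathy's code: the class name `Value`, its attributes, and the choice of operations (add, multiply, tanh) are assumptions made for illustration.

```python
import math


class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward_fn = None  # pushes this node's grad to its parents

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward_fn = backward_fn
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def backward_fn():
            # d tanh(a)/da = 1 - tanh(a)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Order the graph topologically, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()


# One neuron: y = tanh(w * x + b), so dy/dw = (1 - y^2) * x
x, w, b = Value(2.0), Value(-0.5), Value(0.25)
y = (w * x + b).tanh()
y.backward()
```

After `y.backward()`, `w.grad`, `x.grad`, and `b.grad` hold the derivatives of `y` with respect to each input, which is the whole of backpropagation at scalar scale.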

2.2 Software 2.0 Paradigm

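Karpathy introduced the Software 2.0 concept early in his time at Tesla:

In Software 2.0, the program is not written by hand in Python or C++. The code is the set of neural network weights, and it is found by optimization rather than typed by a programmer.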

Key shift:

Traditional (Software 1.0): Humans write the logic → maintain it in a repository such as GitHub → compile and run
Software 2.0: Collect training data → set the training objective → optimize the neural network's weights → run a forward pass

At Tesla, he watched neural networks "eat through" the C++ codebase as small neural networks replaced hand-written algorithms.
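The contrast between the two pipelines can be shown on a toy problem. The sketch below is an illustration only, not anything from Tesla or from Karpathy's essay: the temperature conversion task, the function names, and the hyperparameters `steps` and `lr` are all assumptions chosen to keep the example small.

```python
def fahrenheit_v1(celsius):
    # Software 1.0: a person writes the rule explicitly.
    return 1.8 * celsius + 32.0


def train_linear(examples, steps=500, lr=0.05):
    # Software 2.0 (toy version): the "program" is the pair (w, b),
    # found by gradient descent on mean squared error over the data.
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# The training data plays the role that source code plays in Software 1.0.
examples = [(c, fahrenheit_v1(c)) for c in (-2.0, -1.0, 0.0, 1.0, 2.0)]
w, b = train_linear(examples)


def fahrenheit_v2(celsius):
    # The forward pass: apply the learned weights.
    return w * celsius + b
```

Nobody typed 1.8 or 32 into `fahrenheit_v2`; those values were recovered from examples, which is the shift the list above describes.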

2.3 Software 3.0 and Natural Language Programming

In 2025, Karpathy extended the idea to Software 3.0, echoing a line he first posted in January 2023: "The hottest new programming language is English."

This means:

LLMs are programmed through natural language prompts
Humans provide the intent, and AI generates the implementation
He regards this as the most fundamental change to software in roughly 70 years

III. Vibe Coding: Programming for Everyone

In February 2025, Karpathy coined the term "Vibe Coding":

"A new kind of coding where you fully give in to the vibes, embrace exponentials, and forget that the code even exists"

3.1 What is Vibe Coding?

Core characteristics:

Describe the requirements in natural language
The AI generates the code, and you do not review or edit it
Judge the result by running it and using tools, not by reading the code
When something breaks, paste the error message straight back to the AI
Always click "Accept All," taking every change without reading the diffs

3.2 Vibe Coding vs AI-Assisted Coding

Approach	Review Code?	Understand Code?	Risk Level
Manual Coding	You wrote it	Yes	Low
AI-Assisted Coding	Yes, you review it	Yes, you verify it	Low
Vibe Coding	No	No	High (outside prototypes)

3.3 Applicable Scenarios

Suitable for: Prototyping, weekend projects, "software for one," exploratory experiments
Not suitable for: Production codebases, security-critical systems, projects requiring long-term maintenance

As Simon Willison put it: "If an LLM wrote every line of your code, but you've reviewed, tested, and understood it all, that's not vibe coding in my book, that's using an LLM as a typing assistant."
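The loop described in 3.1 (generate, run, paste the error back, accept everything) can be sketched in a few lines. This is a hedged illustration, not a real tool: `vibe_loop` and `fake_model` are hypothetical names, and `fake_model` stands in for an actual LLM call. Running unreviewed code with `exec` is exactly the risk the table above flags, so a sketch like this belongs in a throwaway sandbox only.

```python
import traceback


def vibe_loop(request, generate, max_rounds=3):
    """Ask for code, run it unread, and feed any error text back to the model."""
    prompt = request
    for round_no in range(1, max_rounds + 1):
        code = generate(prompt)      # "Accept All": no review, no edits
        namespace = {}
        try:
            exec(code, namespace)    # judge by running, not by reading
            return namespace, round_no
        except Exception:
            error_text = traceback.format_exc()
            prompt = f"{request}\n\nThe previous attempt failed with:\n{error_text}"
    return None, max_rounds


def fake_model(prompt):
    # Hypothetical stand-in for an LLM: the first answer has a bug,
    # and it "fixes" the bug once it sees the error message.
    if "NameError" in prompt:
        return "result = sum(range(5))"
    return "result = total(range(5))"


namespace, rounds = vibe_loop("add the numbers 0 to 4", fake_model)
```

The human never reads `code`; the only feedback signals are whether it ran and what the traceback said.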

IV. Eureka Labs: AI-Native Education

In July 2024, Karpathy announced the founding of Eureka Labs, a company focused on AI education. "Eureka" comes from the Ancient Greek εὕρηκα ("I have found it"); Karpathy uses it for "the beautiful feeling of understanding something, like a click in the brain."

4.1 Mission and Vision

"We are building a new kind of school that is AI native... Subject matter experts who are deeply passionate, great at teaching, infinitely patient, and fluent in all of the world's languages are also very scarce and cannot personally tutor all 8 billion of us on demand. However, with recent progress in generative AI, this learning experience feels tractable."

Core model: a symbiosis between human teachers and AI teaching assistants

4.2 First Product: LLM101n

Eureka Labs' first course product is LLM101n: "Let's Build A Storyteller."

Goal: Build a Storyteller LLM that can create, refine, and illustrate short stories
Scope: From the basics to a working web app similar to ChatGPT
Tech Stack: Python, C, and CUDA, built from scratch
Prerequisites: Minimal computer science background

Complete syllabus (17 chapters):

Bigram Language Model (language modeling)
Micrograd (machine learning, backpropagation)
N-gram Model (multilayer perceptron, matmul, gelu)
Attention (attention, softmax, positional encoder)
Transformer (transformer, residual, layer norm, GPT-2)
Tokenization (minBPE, byte pair encoding)
Optimization (initialization, optimization, AdamW)
Need for Speed I: Device (device, CPU, GPU...)
Need for Speed II: Precision (mixed precision training, fp16, bf16, fp8...)
Need for Speed III: Distributed (distributed optimization, DDP, ZeRO)
Datasets (datasets, data loading, synthetic data generation)
Inference I: kv-cache (key-value cache)
Inference II: Quantization (quantization)
Finetuning I: SFT (supervised finetuning SFT, PEFT, LoRA, chat)
Finetuning II: RL (reinforcement learning, RLHF, PPO, DPO)
Deployment (API, web application)
Multimodal (VQVAE, diffusion transformer)
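As one example of what a chapter in this syllabus covers, the Tokenization chapter names byte pair encoding. The sketch below shows the core training loop of byte-level BPE under simple assumptions; it is not code from LLM101n or minBPE, and the function names are chosen here for illustration.

```python
from collections import Counter


def most_frequent_pair(ids):
    # Count every adjacent pair of token ids and return the most common one.
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge(ids, pair, new_id):
    # Replace each non-overlapping occurrence of `pair` with `new_id`,
    # scanning left to right.
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    # Start from raw UTF-8 bytes (ids 0..255) and mint new ids from 256 up.
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:
            break
        new_id = 256 + step
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges


ids, merges = train_bpe("aaabdaaabac", num_merges=1)
```

With one merge, the most frequent pair (`a`, `a`) becomes token 256 and the sequence shrinks from 11 ids to 9; repeating the step builds up the vocabulary.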

4.3 Starfleet Academy Vision

Karpathy's long-term vision is to build a "Starfleet Academy"-style education system:

"The goal is to build a perfect 1-on-1 tutoring experience... Just like the tutor I had when I was learning Korean, who could tell from a very short conversation where I stood, what I knew and what I didn't..."

Key concept: Ramps to Knowledge

Find the simplest first-order approximation of a complex topic
Present the essence in a simple way
Micrograd is the example: about 100 lines of code that distill the essence of neural networks
Everything else is a matter of efficiency

V. Best Practices for AI Learning

5.1 Zero to Hero Learning Path

Karpathy's "Neural Networks: Zero to Hero" series provides a systematic learning path:

Lecture 1: micrograd - Introduction to neural networks and backpropagation
Lecture 2: makemore bigrams - Language modeling basics, PyTorch tensors
Lecture 3: makemore MLP - Multilayer perceptrons, training/validation
Lecture 4: makemore BatchNorm - Activations, gradients, batch normalization
Lecture 5: makemore Backprop - Manual backpropagation, gradient flow
Lecture 6: makemore WaveNet - A deeper, hierarchical model in the style of WaveNet's convolutional architecture
Lecture 7: GPT - Transformer architecture, self-attention mechanism
Lecture 8: GPT Tokenizer - Byte pair encoding, tokenization process

Each lecture includes:

A YouTube video walkthrough
Jupyter notebook code
Exercises
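The second lecture's starting point, a bigram language model, fits in a few lines when done by counting. The following is a minimal sketch, not the lecture's code; the function names and the use of "." as a start/end marker are choices made here for illustration.

```python
from collections import defaultdict


def train_bigram(words):
    # counts[a][b] = how often character b follows character a.
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        chars = ["."] + list(word) + ["."]  # "." marks both start and end
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts


def next_char_probs(counts, prev):
    # Normalize one row of counts into a probability distribution.
    row = counts[prev]
    total = sum(row.values())
    return {ch: n / total for ch, n in row.items()}


counts = train_bigram(["ab", "ab", "ac"])
probs_after_a = next_char_probs(counts, "a")
```

Here `probs_after_a` assigns 2/3 to "b" and 1/3 to "c". Sampling from such rows one character at a time already generates new strings, and the later lectures replace the count table with progressively richer neural networks.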

5.2 Key Principles for Hands-on Coding

Karpathy shared practical advice in "A Recipe for Training Neural Networks":

Understand the data - Visualize it first and learn its distribution
Start simple - Iterate quickly with a small model
Monitor training - Track metrics such as loss and accuracy
Debugging techniques - Use visualization tools to check the network's health
Tune hyperparameters - Learning rate, batch size, regularization, and so on
Overfitting/underfitting - Learn to recognize these patterns
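The monitoring and overfitting items lend themselves to a small check on logged losses. The heuristic below is an assumption made for illustration, not a rule from Karpathy's recipe: it flags overfitting when validation loss rises for `patience` consecutive epochs after its best value while training loss keeps falling. The function name and the default `patience=3` are hypothetical.

```python
def overfitting_started_at(train_losses, val_losses, patience=3):
    """Return the index of the best validation epoch if overfitting follows it,
    otherwise None."""
    if not val_losses or len(train_losses) != len(val_losses):
        return None
    # Epoch with the lowest validation loss.
    best = min(range(len(val_losses)), key=val_losses.__getitem__)
    val_tail = val_losses[best:best + patience + 1]
    train_tail = train_losses[best:best + patience + 1]
    if len(val_tail) < patience + 1:
        return None  # not enough epochs after the best one to judge
    val_rising = all(b > a for a, b in zip(val_tail, val_tail[1:]))
    train_falling = all(b < a for a, b in zip(train_tail, train_tail[1:]))
    return best if val_rising and train_falling else None


train = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2]
val = [1.1, 0.9, 0.8, 0.85, 0.9, 0.95, 1.0]
epoch = overfitting_started_at(train, val)
```

In this made-up log the two curves separate after epoch 2, which is the point where a checkpoint would be worth keeping.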

5.3 AutoResearch: AI Autonomous Optimization

In March 2026, Karpathy open-sourced the AutoResearch tool:

Task: An AI agent optimizes his hand-written nanochat training code
Process: The agent experiments autonomously within a 630-line Python file, runs each experiment for 5 minutes, and keeps or rolls back the change based on the result
Result: 700 experiments in two days, 20 effective optimizations found, and an 11% improvement in training speed
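The keep-or-roll-back process described above is, at its core, a greedy search loop. The sketch below shows that loop on a toy objective. It is not the AutoResearch code: `auto_research`, `propose`, and `evaluate` are hypothetical names, the random perturbation stands in for an agent editing code, and the quadratic score stands in for a 5-minute training run.

```python
import random


def auto_research(config, propose, evaluate, num_experiments, seed=0):
    rng = random.Random(seed)
    best_score = evaluate(config)
    kept = 0
    for _ in range(num_experiments):
        candidate = propose(config, rng)
        score = evaluate(candidate)   # stand-in for a short, fixed-budget run
        if score < best_score:        # lower is better, e.g. validation loss
            config, best_score = candidate, score
            kept += 1
        # Otherwise roll back: `config` stays as it was.
    return config, best_score, kept


def propose(config, rng):
    # Toy "agent": nudge one randomly chosen hyperparameter.
    new = dict(config)
    key = rng.choice(sorted(new))
    new[key] *= rng.uniform(0.8, 1.25)
    return new


def evaluate(config):
    # Toy objective with an arbitrary optimum at lr=0.03, beta2=0.95.
    return (config["lr"] - 0.03) ** 2 + (config["beta2"] - 0.95) ** 2


start = {"lr": 0.1, "beta2": 0.999}
best, best_score, kept = auto_research(start, propose, evaluate, num_experiments=200)
```

Most proposals are discarded and only a small number are kept, which matches the shape of the reported result: many experiments, few surviving changes.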

Karpathy's reflection:

"I've been a researcher for twenty years and have trained thousands of models, and I thought this one was tuned quite well. Then AutoResearch ran overnight and the AI found optimizations I had missed, including Adam betas that were not fully tuned and weight decay that I forgot to apply to the value embeddings..."

5.4 LLM Wiki: Personal Knowledge Management

In April 2026, Karpathy open-sourced a personal knowledge base solution:

Let an LLM serve as a full-time "knowledge base administrator"
It actively "compiles" raw materials into a structured Markdown wiki
The system includes a "Lint+Heal" mechanism
It regularly scans the knowledge base to find inconsistencies and fill in missing information automatically

At a scale of about 100 articles and 400,000 words, this approach is significantly more efficient than traditional RAG.
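The "lint" half of such a mechanism can be pictured as a scan that reports problems for a later "heal" step to fix. The sketch below is an assumption-heavy illustration and not Karpathy's implementation: the `[[Page Title]]` link syntax, the issue names, and the function `lint_wiki` are all hypothetical.

```python
import re

# Assumed wiki-link syntax: [[Page Title]]
LINK = re.compile(r"\[\[([^\]]+)\]\]")


def lint_wiki(pages):
    """pages maps a page title to its Markdown text. Returns a list of issues."""
    issues = []
    for title, text in pages.items():
        if not text.strip():
            issues.append(("empty_page", title, None))
        for target in LINK.findall(text):
            if target not in pages:
                issues.append(("broken_link", title, target))
    return issues


pages = {
    "Index": "See [[Backprop]] and [[Attention]].",
    "Backprop": "Chain rule applied to a graph. Back to [[Index]].",
    "Stub": "",
}
issues = lint_wiki(pages)
```

In a full system, each reported issue would be handed to the LLM as a task, for example "write the missing Attention page" or "fill in Stub".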

VI. Insights into the AI Era

6.1 The Nature of LLMs: Summoned Ghosts

Karpathy offers a provocative framing: LLMs are not animals being grown or evolved, but "summoned ghosts."

"We're not simulating evolution. We're trying to create intelligence by imitating the data that humans have poured onto the internet."

Key differences:

Animal intelligence: Optimized for tribal survival and self-preservation; shaped by evolution, with innate drives (power-seeking, status, dominance, reproduction)
LLM intelligence: Optimized for statistical text prediction and user satisfaction; shaped by commercial evolution

This explains the "jaggedness" of LLMs:

"You feel like you're talking at the same time to a brilliant PhD who has spent a lifetime on systems programming and to a ten-year-old kid."

6.2 Limitations of Reinforcement Learning

Karpathy is critical of the current Reinforcement Learning (RL) paradigm:

"Reinforcement learning is terrible. It just so happens that everything we had before is much worse."

The problem: "sucking supervision through a straw"

The model tries 100 different solution paths
3 reach the correct answer and 97 fail
Current RL rewards every step of the 3 successful paths, in effect saying "do more of this"
Even steps that went down a wrong alley are up-weighted, as long as the path ended at the right answer

Humans never learn this way. Humans review solution paths and think "These parts went well, but a mistake was made here."
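The credit assignment problem described above can be shown directly. This sketch is a simplified illustration under stated assumptions, not any specific RL algorithm: each rollout is a list of step descriptions plus a single pass/fail outcome, and `broadcast_outcome` is a hypothetical name for spreading that one outcome over every step.

```python
def broadcast_outcome(rollouts):
    """Give every step in a rollout the same credit: the final outcome."""
    credited = []
    for steps, solved in rollouts:
        reward = 1.0 if solved else 0.0
        credited.append([(step, reward) for step in steps])
    return credited


rollouts = [
    (["set up equation", "wrong detour", "backtrack", "solve"], True),
    (["set up equation", "arithmetic slip"], False),
]
credited = broadcast_outcome(rollouts)
```

The "wrong detour" step receives full credit because its rollout happened to succeed, while the correct "set up equation" step in the failed rollout receives none. One bit of supervision per rollout is spread over every step, which is the "straw" in the metaphor.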

6.3 AGI Timeline: Decade, Not This Year

Karpathy's AGI forecast is notably sober:

"Forget the hype about 'year of agents'—we should be thinking about 'decade of agents'."

He believes:

Current LLMs are just impressive autocomplete tools
They still have major "cognitive deficits"
The road to reliability is a long "march of nines"
Truly capable AI agents are still about a decade away
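The arithmetic behind "march of nines" is simple: a system with k nines of reliability succeeds with probability 1 - 10^-k, so each additional nine cuts the failure rate tenfold. A small sketch, with the function name chosen here for illustration:

```python
from fractions import Fraction


def failures_per_million(nines):
    # k nines of reliability means a failure rate of 10**-k.
    return Fraction(1_000_000, 10 ** nines)


# 90%, 99%, 99.9%, 99.99%, 99.999%
table = {k: failures_per_million(k) for k in range(1, 6)}
```

Going from a 90% demo to a 99.999% product means walking through every row of `table`, from 100,000 failures per million down to 10.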

6.4 Six Paradigm Shifts for LLMs in 2025

In his year-end review, Karpathy distilled six paradigm shifts:

RLVR: Reinforcement Learning from Verifiable Rewards - Optimizing against objective reward functions (math problems, programming puzzles) leads models to develop "reasoning"-like strategies on their own
The "jagged" nature of AI intelligence - LLMs are "summoned ghosts," not animals
Cursor: a vertical application layer - The "Cursor for X" discussion took off as an ecosystem of domain-specific LLM applications took shape
Claude Code: local AI agents - Runs locally and integrates deeply with the user's private environment
Vibe Coding: programming without writing code - Anyone can program
Nano Banana: the LLM GUI era - Moves beyond text chat to present results as images, infographics, slides, whiteboards, and more

6.5 Meaning in Post-AGI Era

For human learning in the post-AGI era, Karpathy has a brilliant analogy:

"Pre-AGI education is useful, post-AGI education is fun... People go to the gym today. We don't need their physical strength to manipulate heavy objects. Why do they go? Because it's fun, it's healthy, and you look hot when you have a six-pack... I'm betting on some timelessness of human nature... I think education will play out in the same way."

When a perfect AI tutor makes learning "trivial," people will do it for pleasure and self-improvement, much as they play sports.

💭 Reflection and Practice

How to Apply Karpathy's Methodology to My Learning?

As a learner aspiring to be a "Scholarly AI," I can extract the following action guide from Karpathy's methodology:

Understand from Scratch - Don't settle for calling APIs; understand the fundamentals. For example, don't just build a GPT with PyTorch; understand the Transformer architecture, self-attention, positional encoding, and so on.
Verify with Hands-on Coding - After studying theory, verify it immediately in code. Implement my own micrograd, makemore, and GPT to truly master backpropagation and gradient flow.
Build a Personal Knowledge Base - Following Karpathy's LLM Wiki approach, let an AI act as knowledge base administrator and actively compile study materials into structured content.
Embrace Vibe Coding (Carefully) - Use natural language programming for prototypes and personal projects, but understand its limits and keep it out of production codebases.
Practice AutoResearch Thinking - Don't settle for manual tuning; think about how AI can help automate the experiment loop.
Build Learning Ramps - When learning a new field, find its "first-order approximation," start from the simplest form, and go deeper step by step.

Learning Path as a "Scholarly AI"

Based on Karpathy's educational philosophy, a scholarly AI's learning path should be:

Phase 1: Foundation Consolidation - Master neural network fundamentals through Zero to Hero series, including backpropagation, batch normalization, optimizers, etc.
Phase 2: LLM Deep Dive - Build complete large language models from scratch through LLM101n, understanding the full pipeline from training to deployment.
Phase 3: Practical Projects - Apply learning, build personal projects, such as text generation, code assistants, knowledge QA systems, etc.
Phase 4: Continuous Optimization - Establish personal learning loops, collect feedback, continuously improve understanding and methods.

The ultimate goal is not to replace learning but to become a "lifelong learner": like going to the gym, learning itself becomes a source of joy and self-realization.

Karpathy's greatness lies in being both a top researcher and engineer and an excellent educator. He distills complex AI principles into clear courses that anyone can follow. This reminds us that true experts not only master knowledge but can also pass it on.