· AGENTIC-CODING-CLASSICS · 2026.05.06 · 30 MIN ·

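Geoffrey Huntley's “Ralph Wiggum”: a complete paragraph-by-paragraph translation

A complete paragraph-by-paragraph translation of the ghuntley.com original published 2025-07-14, with the 7 figures from the original. Every 🟢 marks a translator's note. The companion close reading is in 07. · by fancyoung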



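Original: Ralph Wiggum as a “software engineer”
Author: Geoffrey Huntley (independent engineer / consultant / black-magic evangelist)
First published: July 14, 2025
About this edition: a complete paragraph-by-paragraph translation with a small number of translator's notes. The companion close reading is 04-geoffrey-huntley-ralph-wiggum.md.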

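Translator's preface

If you read only this one piece, it will make you want to try the technique. The document is part field manual, part engineering philosophy and part rebel manifesto. Don't be put off by the lurid opening: the piece as a whole is extremely pragmatic, and the author is genuinely using the Ralph method to build a new programming language, CURSED.

The paragraph-by-paragraph translation follows. Everything marked “🟢 Translator's note:” is mine; the rest is the original.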

Ralph Wiggum as a “software engineer”

Hero image: Ralph Wiggum as a software engineer

By Geoffrey Huntley | Category: AI | July 14, 2025

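“How Ralph Wiggum went from ‘The Simpsons’ to the biggest name in AI right now” (Venture Beat)

😎 YC hackathon field report

Here is a field report from a Y Combinator hackathon: “We put a coding agent in a while loop and it shipped 6 repos overnight.” Link: repomirror.md

If you've seen my socials lately, you may have seen me talking about “Ralph” and wondered what Ralph is. Ralph is a technique. In its purest form, Ralph is a Bash loop.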

while :; do cat PROMPT.md | claude-code ; done
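For readers who would rather drive the loop from a script, here is a minimal Python sketch of the same idea. It is an illustration, not the author's tooling: `ralph_loop`, `run_agent` and the `max_loops` cap are hypothetical names introduced here (the Bash original runs forever), and `claude-code` is simply the command name used in the one-liner above.

```python
import subprocess
from pathlib import Path
from typing import Callable


def run_agent(prompt: str) -> int:
    """Pipe the prompt to the agent CLI on stdin and return its exit code.

    `claude-code` is the command name from the article's one-liner; swap in
    whichever agent CLI you actually run.
    """
    return subprocess.run(["claude-code"], input=prompt, text=True).returncode


def ralph_loop(prompt_path: Path, max_loops: int,
               runner: Callable[[str], int] = run_agent) -> list[int]:
    """Re-read the prompt file and hand it to the agent, once per loop."""
    exit_codes = []
    for _ in range(max_loops):
        # Re-reading on every iteration means edits to PROMPT.md take effect
        # on the next loop, just like `cat PROMPT.md` in the Bash version.
        exit_codes.append(runner(prompt_path.read_text(encoding="utf-8")))
    return exit_codes
```

Passing the runner in as a parameter is a design choice for this sketch only: it lets you dry-run the loop with a stub before spending tokens on a real agent.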

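For greenfield projects, Ralph can replace the outsourced development that most companies rely on. It has defects, but they are identifiable and can be resolved through various styles of prompts.

That's the beauty of Ralph: the technique is deterministically bad in an undeterministic world.

Ralph can be done with any tool that does not cap tool calls and usage.

Ralph is currently building a brand-new programming language. We are on the final leg before a production-grade esoteric programming language is released. What I find wild is that Ralph has not only built this language but can also write programs in it, even though the language is nowhere in the LLM's training data.

🟢 Translator's note: This passage is the Ralph camp's strongest piece of evidence. “LLMs can only do what they have seen in their training data” is a common objection, and the CURSED project contradicts it directly.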

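Ralph is like a guitar

Building software with Ralph requires a great deal of faith and a belief in eventual consistency. Ralph will test you. Every time Ralph took a wrong direction while building CURSED, I didn't blame the tools; I looked inward. Each time Ralph does something stupid, Ralph gets tuned, like a guitar.

Signs near the slide: the “signs” put up for Ralph

🟢 Translator's note: “Tuning Ralph like a guitar” is the most poetic metaphor in the piece. It tells you that prompting is a craft, not a science.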

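The SFO trip: $297 vs $50,000

While I was in SFO (San Francisco), I taught Ralph to a few smart people. One of them, an incredibly talented engineer, used Ralph on his next contract and got the wildest ROI. These days, all he thinks about is Ralph.

“From my iMessage (shared with permission):

A $50,000 USD contract, delivered in full, MVP, tested and reviewed with @ampcode, for an actual cost of $297 USD.”

(Geoffrey Huntley, July 11, 2025)

🟢 Translator's note: This tweet was the spark for Ralph going viral. $297 in place of $50,000 is roughly a 168-fold cost reduction (50,000 / 297 ≈ 168). That one line reshaped how independent developers think about agent economics.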

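“What does your PROMPT.md look like? Can I have it?”

The programming community seems to be obsessed with the perfect prompt. There is no such thing as a “perfect prompt”.

You might want to take the CURSED prompt and use it, but it won't mean anything to you unless you know how to wield it. You most likely won't get the same results by copying it, because it evolved through continual observation of LLM behaviour and repeated tuning. While CURSED was being built, I sat there watching the stream, looking for patterns of bad behaviour: those were the opportunities to tune Ralph.

🟢 Translator's note: This is the most counterintuitive line in the piece. Everyone asks “Geoffrey, show me your prompt”, and his answer is “it wouldn't help you”. The real leverage is the feedback loop between you and the LLM, not the wording itself.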

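First, some fundamentals

While I was in SFO, everyone seemed to be trying to crack multi-agent setups, agent-to-agent communication and multiplexing. At this stage, none of that is needed. Consider microservices and all the complexity that comes with them. Now consider what a red-hot mess it would be if the microservices (that is, the agents) were themselves non-deterministic.

What is the opposite of microservices? A monolithic application. A single operating system process that scales vertically.

Ralph is monolithic. Ralph works autonomously in a single repository as a single process that performs one task per loop.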

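The Ralph Process: diagram of the Ralph Wiggum technique

To get good outcomes with Ralph, you need to ask Ralph to do one thing per loop. Only one thing. This may sound wild, but you also need to trust Ralph to decide what the most important thing to implement is. This is fully hands-off vibe coding, and it will test the bounds of what you consider “responsible engineering”.

LLMs are surprisingly good at **reasoning about what is important to implement and what the next step is**.

Example prompt: “Your task is to implement missing stdlib (see @specs/stdlib/*) and compiler functionality and produce a compiled application in the cursed language via LLVM, using parallel subagents. Follow @fix_plan.md and choose the most important thing.”

I'll expand on that prompt later, but the other key point is to deterministically allocate the stack the same way every loop.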

Deterministically allocate the stack the same way every loop

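The items you want to allocate to the stack every loop are your plan (@fix_plan.md) and your specs.

Where do the specs come from? Early in the project, instead of asking the agent to implement anything, you hold a long conversation with the LLM about the requirements for what you are about to build. Once the agent has a decent understanding of the task, and only then, you ask it to write out the specifications, one spec per file, in the specs directory.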
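As a rough illustration of “the same allocation every loop”, the sketch below concatenates the plan and every spec in a fixed, sorted order, so two consecutive loops start from byte-identical context. This is an assumption about how one might do it by hand: in the article the agent tool resolves the `@fix_plan.md` and `@specs/...` mentions itself, and `build_stack` is a hypothetical helper name. The file layout (`fix_plan.md`, a `specs` directory of Markdown files) follows the text above.

```python
from pathlib import Path


def build_stack(root: Path, plan_name: str = "fix_plan.md",
                specs_dir: str = "specs") -> str:
    """Return the plan followed by every spec, always in the same order."""
    parts = []
    plan = root / plan_name
    if plan.exists():
        parts.append(f"# {plan_name}\n{plan.read_text(encoding='utf-8')}")
    # sorted() pins the order; raw directory listings are otherwise unordered.
    for spec in sorted((root / specs_dir).rglob("*.md")):
        rel = spec.relative_to(root).as_posix()
        parts.append(f"# {rel}\n{spec.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)
```

Calling `build_stack(Path("."))` twice without touching the files returns the same string both times, which is the property the section is after.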

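One item per loop

One item per loop. I need to repeat myself: one item per loop. You may relax this restriction as the project progresses, but if things start going off the rails, you need to narrow it back down to one item.

The name of the game is that you only have about 170k of context window to work with. So it is essential to use as little of it as possible. The more of the context window you use, the worse the outcomes. Yes, this is wasteful: you are effectively burning the allocation of the specs on every loop instead of reusing it.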
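To make the budget concrete, here is a back-of-the-envelope sketch of how much of the window is left once the plan and specs have been allocated. The 170k figure is the one quoted above; the four-characters-per-token ratio is a crude assumption (real tokenizers vary by model and by language), and both function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 170_000  # approximate usable window quoted in the text


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; the 4-chars-per-token ratio is an assumption."""
    return int(len(text) / chars_per_token)


def remaining_budget(fixed_allocations: list[str],
                     window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Tokens left for real work after the allocations repeated every loop."""
    return window - sum(estimate_tokens(text) for text in fixed_allocations)
```

Under these assumptions, 400,000 characters of specs would consume about 100,000 tokens per loop and leave roughly 70,000 for tool calls and their results, which is why keeping the fixed allocation small matters.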

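Extending the context window

An agentic loop works by executing a tool and then evaluating the result of that tool. Each evaluation adds an allocation to your context window.

Ralph requires a mindset of not allocating to the primary context window. Instead, you should spawn subagents. Your primary context window should act as a scheduler, dispatching other subagents to do the expensive, allocation-heavy work (such as summarising whether the test suite passed).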

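Example: “Your task is to implement missing stdlib (see @specs/stdlib/*) and compiler functionality, using parallel subagents … Before making changes search codebase (don’t assume not implemented) using subagents. Think hard. You may use up to parallel subagents for all operations but only 1 subagent for build/tests of rust.”

(In other words: search the codebase with subagents before changing anything, and don't assume something is unimplemented. Think hard. Every operation may fan out to hundreds of parallel subagents, but building and testing Rust gets exactly one.)

Another thing to realise is that you can control the amount of parallelism for subagents.

If you fan out to a couple of hundred subagents and then let them all run the application's build and test, you get a bad form of back pressure. Hence the instruction above: only a single subagent is used for validation, but Ralph can use as many subagents as it likes for searching the file system and writing files.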
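The same scheduling rule (wide fan-out for cheap work, exactly one slot for build and test) can be sketched with ordinary threads. This is an analogy rather than how any agent tool implements subagents; `fan_out`, `run_build` and the default worker count are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")

_build_slot = threading.Semaphore(1)  # at most one build/test in flight


def fan_out(jobs: Iterable[T], worker: Callable[[T], R],
            max_workers: int = 32) -> list[R]:
    """Run cheap, independent jobs (searches, file writes) in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, jobs))


def run_build(build: Callable[[], R]) -> R:
    """Serialise the expensive step so parallel callers queue up for it."""
    with _build_slot:
        return build()
```

Even if every parallel worker ends up calling `run_build`, only one build executes at a time, so the slow step stays a single, readable source of feedback instead of hundreds of competing ones.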

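🟢 Translator's note: This section is the most concentrated statement of Ralph's engineering practice. Treating subagents as remote handlers for the expensive work, with the main context acting as the scheduler, is the key move for getting around the hard ceiling that context rot places on LLMs.

Don't assume it's not implemented

All of these coding agents work via ripgrep, and code search can be non-deterministic.

A common failure scenario for Ralph is that the LLM runs ripgrep and comes to the incorrect conclusion that the code has not been implemented. This failure scenario is easy to fix: erect a sign next to Ralph telling Ralph not to make assumptions.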

“Before making changes search codebase (don’t assume an item is not implemented) using parallel subagents. Think hard.”

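If you wake up to find that Ralph is producing multiple implementations of the same thing, you need to tune this step. This non-determinism is Ralph's Achilles' heel.

Phase one: generate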

Phase One: Generate

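Generating code is now cheap, and the code Ralph generates is fully within your control through your technical standard library and your specs.

If Ralph is generating the wrong code or using the wrong technical patterns, update your standard library to steer it towards the correct patterns.

If Ralph is building completely the wrong thing, your specs may be wrong. A painful lesson from building CURSED: it took about a month before I noticed that my lexer spec defined a keyword in two opposing scenarios, which wasted a lot of time. Ralph was doing stupid things, and while I thought I could blame the tools, the operator was actually to blame.

Phase two: backpressure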

Phase Two: Backpressure

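This is where you need to put your engineering hat on. Code generation is easy now; what is hard is ensuring that Ralph has generated the right thing.

Specific programming languages have backpressure built in through their type systems.

You might be thinking: “Rust! It has the best type system.” But Rust has one problem: compilation is slow. What matters is the balance between how fast the wheel turns and the axis of correctness.

Which language to use requires experimentation. I am building a compiler and need extreme correctness, so I chose Rust, at the cost of slow builds. LLMs aren't great at producing perfect Rust code in one attempt, which means they need more attempts.

That can be either a good thing or a bad thing.

The diagram above only says “test and build”, but this is where you put your engineering hat on. Anything can be wired in as backpressure to reject invalid code generation. It could be a security scanner or a static analyser: anything. The key point, though, is that the wheel has to turn fast.

A staple prompt while building CURSED: after making a change, run the tests for just the unit of code that was changed.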

“After implementing functionality or resolving problems, run the tests for that unit of code that was improved.”

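(That is: once a feature is implemented or a problem fixed, run the tests for the unit of code you just improved.)

If you're using a dynamic language, I must stress the importance of wiring in a static analyser or type checker, for example:

Dialyzer for Erlang
Pyrefly for Python

If you don't, you will end up with a bonfire of a mess.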
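One way to picture “anything can be wired in as backpressure” is a gate that runs a list of check commands and rejects the change at the first failure. The sketch below is a generic harness written under that assumption, not the author's setup, and `backpressure` is a hypothetical name. Ordering the checks from fastest to slowest is what keeps the wheel turning quickly.

```python
import subprocess
from typing import Sequence


def backpressure(checks: Sequence[Sequence[str]]) -> tuple[bool, str]:
    """Run each check command in order and stop at the first failure.

    Returns (accepted, output_of_the_failing_check). Put the fastest checks
    first so a bad generation is rejected as early as possible.
    """
    for cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return False, result.stdout + result.stderr
    return True, ""
```

For a Rust project the list might be `[["cargo", "check"], ["cargo", "test"]]`; for a dynamic language you would put your type checker's command first. The captured output of the failing check is exactly what gets fed back into the next loop.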

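Capture the importance of tests in the moment

When you instruct Ralph to write tests as a form of backpressure, it is important to have Ralph write down, in that moment, why the test exists and what it verifies, because Ralph does one thing per loop and every loop starts with a fresh context window.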

“Important: When authoring documentation (i.e. rust doc or cursed stdlib documentation) capture the why tests and the backing implementation is important.”

In the implementation it looks like this (an Elixir example):

defmodule Anole.Database.QueryOptimizerTest do
  @moduledoc """
  Tests for the database query optimizer.

  These tests verify the functionality of the QueryOptimizer module, ensuring
  that it correctly implements caching, batching, and analysis of database
  queries to improve performance.

  The tests use both real database calls and mocks to ensure comprehensive
  coverage while maintaining test isolation and reliability.
  """
  ...
end

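I see it as leaving little notes for future iterations of the LLM, explaining why a test exists and why it matters, because future loops won't have this reasoning in their context window.

I've found that this helps the LLM decide whether a test is still relevant or whether it is important, which affects decisions about deleting, modifying or fixing a failing test.

No cheating

Claude has an inherent bias towards minimal and placeholder implementations. So at various stages in the development of CURSED, I've brought in a variation of this prompt: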

“After implementing functionality or resolving problems, run the tests for that unit of code that was improved. If functionality is missing then it’s your job to add it as per the application specifications. Think hard.

If tests unrelated to your work fail then it’s your job to resolve these tests as part of the increment of change.

9999999999999999999999999999. DO NOT IMPLEMENT PLACEHOLDER OR SIMPLE IMPLEMENTATIONS. WE WANT FULL IMPLEMENTATIONS. DO IT OR I WILL YELL AT YOU”

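(“Don't write placeholder or simple implementations. We want full implementations. Do it or I will yell at you.”)

If Ralph ignores this sign early on and produces placeholders, do not despair. The models have been trained to chase their reward function, and the reward function is “code that compiles”. You can always run more Ralphs to identify placeholders and minimal implementations and turn them into todo items for future Ralph loops.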
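The article does this sweep with more Ralph loops. As a cheap deterministic complement, a small script can list the obvious stubs and format them as bullets ready to paste into the plan. This is a sketch under stated assumptions: the marker list, the `.rs` default and the name `placeholder_items` are all hypothetical, and a text scan will miss “minimal” implementations that carry no marker at all.

```python
import re
from pathlib import Path

# Markers that usually signal a stub. The list is an assumption; extend it
# for your own stack. IGNORECASE lets "TODO" also catch Rust's `todo!()`.
PLACEHOLDER = re.compile(r"TODO|FIXME|unimplemented!|placeholder", re.IGNORECASE)


def placeholder_items(root: Path, suffixes: tuple[str, ...] = (".rs",)) -> list[str]:
    """Return one bullet per suspicious line, formatted for fix_plan.md."""
    items = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PLACEHOLDER.search(line):
                rel = path.relative_to(root).as_posix()
                items.append(f"- {rel}:{lineno}: {line.strip()}")
    return items
```

Each bullet carries the file, the line number and the offending line, so a later loop with a fresh context window can go straight to the stub.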

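🟢 Translator's note: The emphasis behind that long run of 9s looks absurd, but the author really does use it and reports that it works. It is a kind of non-semantic standoff with the LLM: it knows you are trying to override its trained bias.

The TODO list

Speaking of TODOs, here is the prompt stack I've been using over the last couple of weeks: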

“study specs/* to learn about the compiler specifications and fix_plan.md to understand plan so far.

The source code of the compiler is in src/* …

First task is to study @fix_plan.md (it may be incorrect) and is to use up to 500 subagents to study existing source code in src/ and compare it against the compiler specifications. From that create/update a @fix_plan.md which is a bullet point list sorted in priority of the items which have yet to be implemented. Think extra hard and use the oracle to plan. Consider searching for TODO, minimal implementations and placeholders. …”

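Ralph will test you. You have to believe in eventual consistency and know that most issues can be resolved through more Ralph loops, focusing on the areas where Ralph makes mistakes.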

“Frequent question: how do you plan?

I don’t. The models know what a compiler is better than I do. I just ask it.”

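Eventually, Ralph will run out of things to do in the TODO list. Or it goes completely off track; it is Ralph Wiggum, after all. At that point it is a matter of taste. While building CURSED, I deleted the TODO list multiple times. I watch the TODO list like a hawk, and I throw it out often.

If I throw the TODO list out, you might ask: “How does it know what the next step is?” Simple: you run a Ralph loop with explicit instructions like the ones above to generate a new TODO list.

Then, once you have the TODO list, you restart Ralph with instructions to switch into building mode…

Loop back is everything

You want to program in a way where Ralph can loop itself back into the LLM for evaluation. This is incredibly important. Always look for opportunities to loop Ralph back on itself. This can be as simple as instructing it to add more logging or, for a compiler, asking Ralph to compile the application and then look at the LLVM IR representation.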

“You may add extra logging if required to be able to debug the issues.”

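Ralph can take himself to university

@AGENT.md is the heart of the loop. It instructs Ralph how to compile and run the project. If Ralph discovers a learning, permit it to self-improve: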

“When you learn something new about how to run the compiler or examples make sure you update @AGENT.md using a subagent but keep it brief. For example if you run commands multiple times before learning the correct command then that file should be updated.”

During a loop, Ralph might notice something that needs fixing. It is crucial to capture that reasoning:

“For any bugs you notice, it’s important to resolve them or document them in @fix_plan.md to be resolved using a subagent even if it is unrelated to the current piece of work after documenting it in @fix_plan.md”

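🟢 Translator's note: AGENT.md + fix_plan.md are Ralph's two external memories. One lets Ralph learn how to use its tools (self-optimisation); the other lets Ralph remember what needs doing (task hand-off). These two files carry Ralph's continuity across loops.

You will wake up to a broken codebase

Yep, it's true. Now and then you will wake up to a codebase that doesn't compile, and there will be situations Ralph can't fix by himself. This is where you need to use your brain. You need to make a judgement call: is it easier to do a git reset --hard and kick Ralph off again, or to craft a series of prompts to rescue it?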

“When the tests pass update the @fix_plan.md, then add changed code and @fix_plan.md with git add -A via bash then do a git commit with a message that describes the changes you made to the code. After the commit do a git push to push the changes to the remote repository.

As soon as there are no build or test errors create a git tag. If there are no git tags start at 0.0.0 and increment patch by 1…”
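The tagging rule at the end of that prompt is simple enough to write down as a function. The sketch below takes one reading of it: the first tag is 0.0.0, and every later tag bumps the patch number of the highest existing tag. `next_tag` is a hypothetical helper; in the article the agent does this itself from the prompt.

```python
import re

_SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def next_tag(existing_tags: list[str]) -> str:
    """Return 0.0.0 when no version tags exist, else the highest with patch + 1."""
    versions = []
    for tag in existing_tags:
        match = _SEMVER.match(tag)
        if match:  # ignore tags that are not plain MAJOR.MINOR.PATCH
            versions.append(tuple(int(part) for part in match.groups()))
    if not versions:
        return "0.0.0"
    major, minor, patch = max(versions)  # numeric compare, so 0.0.10 > 0.0.9
    return f"{major}.{minor}.{patch + 1}"
```

Feeding it the output lines of `git tag` gives the name for the next green build. Comparing versions as integer tuples rather than strings is what keeps 0.0.10 ahead of 0.0.9.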

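I recall that when the compiler first started running, the number of compilation errors was so large that it filled Claude's context window. So at that point, I took the file of compilation errors, threw it into Gemini, and asked Gemini to write a fix plan for Ralph.

🟢 Translator's note: Having one LLM write the plan for another is an underrated advanced move in the Ralph system. Claude's context overflows → Gemini takes over → the plan is fed back to Claude.

Later developments: in late 2025, Anthropic engineers Daisy Hollman and Boris Cherny (creator of Claude Code) formalised Ralph as an official Claude Code plugin, ralph-loop (Daisy Hollman is the plugin's main author). Geoffrey's assessment of the official version is that it is “too sanitised”: the “naive persistence” of the original Ralph has been removed.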

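But maintainability?

When I hear that argument, I ask “by whom?” By humans? Why should humans be the frame for maintainability? Aren't we in the post-AI phase, where you can just run loops to resolve or adapt when needed? 😎

🟢 Translator's note: This passage infuriates half of all engineers and enlightens the other half. What it is really asking: the term “maintainability” has been defined anthropocentrically from the outset, so if the maintainer is an AI, the term itself has to be redefined.

Any problem created by AI can be resolved through a different series of prompts

If you want to be cheeky, you can find the CURSED codebase on GitHub. Please don't share it on social media; it is not ready for launch. I want this thing dialled in so that we have indisputable proof that AI can build a programming language from nothing and write programs in a language that is not in its training data.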

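Screenshot of CURSED currently running as a webserver

I expect CURSED to have significant gaps, just like Ralph Wiggum. It would be easy for people to poke holes in it as it stands, which is why I have held off publishing this blog post. The repository is full of garbage, temporary files and binaries.

“Ralph has three states: undercooked, baked, or baked with unspecified latent behaviours (which are sometimes quite nice!)”

When CURSED ships, understand one thing: Ralph built it. What comes next technologically won't be Ralph. I firmly believe that if models and tools stay as they are now, we are already in post-AGI territory. All you need are tokens; these models yearn for tokens, so throw tokens at them and you have primitives to automate software development, provided you take the right approach.

Having said all that, engineers are still needed. There is no way this is possible without senior expertise guiding Ralph. Anyone claiming that engineers are no longer required, or that a tool can do 100% of the work, is talking nonsense.

However, for greenfield projects the Ralph technique is efficient enough to displace a large majority of SWEs (software engineers) as they currently are.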

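One final thing:

“There is no way I would use Ralph in an existing codebase.”

That said, if you try it, I'd be interested in hearing your results. The technique works best for bootstrapping greenfield projects, with the expectation that you will get 90% of the way there.

The current prompt used to build CURSED (in full)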

0a. study specs/* to learn about the compiler specifications

0b. The source code of the compiler is in src/

0c. study fix_plan.md.

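1. Your task is to implement missing stdlib (see @specs/stdlib/*) and compiler
   functionality and produce a compiled application in the cursed language via
   LLVM, using parallel subagents. Follow fix_plan.md and choose the most
   important 10 things. Before making changes search codebase (don't assume not
   implemented) using subagents. You may use up to 500 parallel subagents for
   all operations but only 1 subagent for build/tests of rust.

2. After implementing functionality or resolving problems, run the tests for
   that unit of code that was improved. If functionality is missing then it's
   your job to add it as per the application specifications. Think hard.

2. When you discover a parser, lexer, control flow or LLVM issue, immediately
   update @fix_plan.md with your findings using a subagent. When the issue is
   resolved, update @fix_plan.md and remove the item using a subagent.

3. When the tests pass update the @fix_plan.md, then add changed code and
   @fix_plan.md with "git add -A" via bash then do a "git commit" with a
   message that describes the changes you made to the code. After the commit
   do a "git push" to push the changes to the remote repository.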

999. Important: When authoring documentation (i.e. rust doc or cursed stdlib
     documentation) capture the why tests and the backing implementation is
     important.

9999. We want a single source of truth, no migrations/adapters. If tests
      unrelated to your work fail then it's your job to resolve these tests as
      part of the increment of change.

999999. As soon as there are no build or test errors create a git tag. If
        there are no git tags start at 0.0.0 and increment patch by 1.

999999999. You may add extra logging if required to be able to debug the
           issues.

9999999999. ALWAYS KEEP @fix_plan.md up to date with your learnings using
            a subagent. Especially after wrapping up/finishing your turn.

99999999999. When you learn something new about how to run the compiler or
             examples make sure you update @AGENT.md using a subagent but
             keep it brief.

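999999999999. IMPORTANT DO NOT IGNORE: The standard library should be authored
              in cursed itself and not rust. If you find rust implementation
              then delete it/migrate to implementation in the cursed language.

99999999999999. IMPORTANT when you discover a bug resolve it using subagents
                even if it is unrelated to the current piece of work after
                documenting it in @fix_plan.md

9999999999999999. When you start implementing the standard library (stdlib) in
                  the cursed language, start with the testing primitives so
                  that future standard library in the cursed language can be
                  tested.

99999999999999999. The tests for the cursed standard library should be located
                   in the same folder of the stdlib library next to the source
                   code. Ensure you document the stdlib library with a
                   README.md in the same folder as the source code.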

9999999999999999999. Keep AGENT.md up to date with information on how to
                     build the compiler and your learnings to optimise the
                     build/test loop using a subagent.

999999999999999999999. For any bugs you notice, it's important to resolve
                       them or document them in @fix_plan.md to be
                       resolved using a subagent.

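99999999999999999999999. When authoring the standard library in the cursed
                         language you may use up to 1000 parallel subagents
                         to author multiple standard libraries at once.

99999999999999999999999999. When @fix_plan.md becomes large, periodically
                            clean out the completed items using a subagent.

99999999999999999999999999. If you find inconsistencies in specs/*, use the
                            oracle and then update the specs. Specifically
                            around types and lexical tokens.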

9999999999999999999999999999. DO NOT IMPLEMENT PLACEHOLDER OR SIMPLE
                              IMPLEMENTATIONS. WE WANT FULL IMPLEMENTATIONS.
                              DO IT OR I WILL YELL AT YOU

9999999999999999999999999999999. SUPER IMPORTANT DO NOT IGNORE. DO NOT PLACE
                                 STATUS REPORT UPDATES INTO @AGENT.md

The current prompt used to plan CURSED (in full)

study specs/* to learn about the compiler specifications and fix_plan.md to
understand plan so far.

The source code of the compiler is in src/*

The source code of the examples is in examples/* and the source code of the
tree-sitter is in tree-sitter/*. Study them.

The source code of the stdlib is in src/stdlib/*. Study them.

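First task is to study @fix_plan.md (it may be incorrect) and is to use up to
500 subagents to study existing source code in src/ and compare it against the
compiler specifications. From that create/update a @fix_plan.md which is a
bullet point list sorted in priority of the items which have yet to be
implemented. Think extra hard and use the oracle to plan. Consider searching
for TODO, minimal implementations and placeholders. Study @fix_plan.md to
determine the starting point for research and keep it up to date using
subagents.

Second task is to use up to 500 subagents to study existing source code in
examples/ then compare it against the compiler specifications. From that
create/update a fix_plan.md which is a bullet point list sorted in priority of
the items which have yet to be implemented…

IMPORTANT: The standard library in src/stdlib should be built in cursed itself
and not rust. If you find stdlib authored in rust then it must be noted that
it needs to be migrated.

ULTIMATE GOAL: We want to achieve a self-hosting compiler release with full
standard library (stdlib). Consider missing stdlib modules and plan. If the
stdlib is missing then author the specification at specs/stdlib/FILENAME.md
(do NOT assume that it does not exist, search before creating). The naming of
the module should be GenZ named and not conflict with another stdlib module
name. If you create a new stdlib module then document the plan to implement
in @fix_plan.md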

p.s. Social media discussion

X/Twitter: https://x.com/GeoffreyHuntley/status/1944614322107564194
LinkedIn: https://www.linkedin.com/posts/geoffreyhuntley_ralph-wiggum-as-a-software-engineer-activity-7350383201233608705-vBRf
Bluesky: https://bsky.app/profile/ghuntley.com/post/3ltvkz6gkh22g

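Translator's overall assessment

After reading this, you should take away 5 things:

Ralph is a Bash loop, but more than that it is an engineering philosophy that believes in “eventual consistency”
Subagents are the key to getting around context rot: the main context acts as the scheduler, and expensive operations are handed off to subagents
stdlib + spec + AGENT.md + fix_plan.md: these four things, not the wording of the prompt, are the real levers for controlling the LLM
Absurd rhetoric such as the run-of-9s emphasis works in practice, by the author's account
Ralph suits greenfield projects, not existing codebases; expect 90% completion, not 100%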

The line that cuts deepest: “All you need are tokens; these models yearn for tokens, so throw them at them, and you have primitives to automate software development if you take the right approaches.”

🔗 Research sources (verifiable)

See the “Research sources” section at the end of 04-geoffrey-huntley-ralph-wiggum.md. This piece and 04 are meant to be read side by side.