12-Factor Agents - Principles for building reliable LLM applications
In the spirit of 12 Factor Apps. The source for this project is public at https://github.com/humanlayer/12-factor-agents, and I welcome your feedback and contributions. Let's figure this out together!
Tip
Missed the AI Engineer World's Fair? Catch the talk here
Looking for Context Engineering? Jump straight to factor 3
Want to contribute to npx/uvx create-12-factor-agent? Check out the discussion thread
Hi, I'm Dex. I've been hacking on AI agents for a while.
I've tried every agent framework out there, from the plug-and-play crew/langchains to the "minimalist" smolagents of the world to the "production grade" langgraph, griptape, etc.
I've talked to a lot of really strong founders, in and out of YC, who are all building really impressive things with AI. Most of them are rolling the stack themselves. I don't see a lot of frameworks in production customer-facing agents.
I've been surprised to find that most of the products out there billing themselves as "AI Agents" are not all that agentic. A lot of them are mostly deterministic code, with LLM steps sprinkled in at just the right points to make the experience truly magical.
Agents, at least the good ones, don't follow the "here's your prompt, here's a bag of tools, loop until you hit the goal" pattern. Rather, they are mostly just software.
So, I set out to answer:
Welcome to 12-factor agents. As every Chicago mayor since Daley has consistently plastered all over the city's major airports, we're glad you're here.
Special thanks to @iantbutler01, @tnm, @hellovai, @stantonk, @balanceiskey, @AdjectiveAllison, @pfbyjy, @a-churchill, and the SF MLOps community for early feedback on this guide.
Even if LLMs continue to get exponentially more powerful, there will be core engineering techniques that make LLM-powered software more reliable, more scalable, and easier to maintain.
- How We Got Here: A Brief History of Software
- Factor 1: Natural Language to Tool Calls
- Factor 2: Own your prompts
- Factor 3: Own your context window
- Factor 4: Tools are just structured outputs
- Factor 5: Unify execution state and business state
- Factor 6: Launch/Pause/Resume with simple APIs
- Factor 7: Contact humans with tool calls
- Factor 8: Own your control flow
- Factor 9: Compact Errors into Context Window
- Factor 10: Small, Focused Agents
- Factor 11: Trigger from anywhere, meet users where they are
- Factor 12: Make your agent a stateless reducer
For a deeper dive on my agent journey and what led us here, check out A Brief History of Software - a quick summary here:
We're gonna talk a lot about Directed Graphs (DGs) and their Acyclic friends, DAGs. I'll start by pointing out that...well...software is a directed graph. There's a reason we used to represent programs as flow charts.
Around 20 years ago, we started to see DAG orchestrators become popular. We're talking classics like Airflow, Prefect, some predecessors, and some newer ones like Dagster, Inngest, and Windmill. These followed the same graph pattern, with the added benefits of observability, modularity, retries, administration, etc.
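To make the "graph pattern plus retries" idea concrete, here is a minimal sketch of a DAG runner built on Python's standard-library `graphlib`. It is not how Airflow, Prefect, or any of the tools above are implemented; `run_dag`, the task names, and the `flaky_transform` helper are all hypothetical names made up for illustration.

```python
from graphlib import TopologicalSorter


def run_dag(tasks, deps, max_retries=2):
    """Run each task after its dependencies, retrying failed tasks.

    tasks: name -> zero-argument callable (hypothetical task bodies)
    deps:  name -> set of task names that must finish first
    """
    results = {}
    # static_order() yields every node after all of its predecessors
    for name in TopologicalSorter(deps).static_order():
        for attempt in range(max_retries + 1):
            try:
                results[name] = tasks[name]()
                break
            except Exception:
                # Give up only once the retry budget is exhausted
                if attempt == max_retries:
                    raise
    return results


# Illustrative three-step pipeline; "transform" fails once, then succeeds.
calls = {"n": 0}


def flaky_transform():
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient failure")
    return "transformed"


tasks = {
    "extract": lambda: "raw",
    "transform": flaky_transform,
    "load": lambda: "loaded",
}
deps = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

results = run_dag(tasks, deps)
print(results)
```

Every node and edge here is written by an engineer ahead of time, which is exactly the part the agent approach described next tries to hand to the LLM.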
I'm not the first person to say this, but my biggest takeaway when I started learning about agents was that you get to throw the DAG away. Instead of software engineers coding each step and edge case, you can give the agent a goal and a set of transitions:
And let the LLM make decisions in real time to figure out the path
The promise here is that you write less software, you just give the LLM the "edges" of the graph and let it figure out the nodes. You can recover from errors, you can write less code, and you may find that LLMs find novel solutions to problems.
As we'll see later, it turns out this doesn't quite work.
Let's dive one step deeper - with agents you've got this loop consisting of 3 steps:
- LLM determines the next step in the workflow, outputting structured json ("tool calling")
- Deterministic code executes the tool call
- The result is appended to the context window
- Repeat until the next step is determined to be "done"
```python
# `llm` and `execute_step` are placeholders for your model client and tool runner
async def run_agent(initial_event):
    context = [initial_event]
    while True:
        # Ask the LLM for the next step, given everything that has happened so far
        next_step = await llm.determine_next_step(context)
        context.append(next_step)

        if next_step.intent == "done":
            return next_step.final_answer

        # Deterministic code executes the chosen tool call
        result = await execute_step(next_step)
        context.append(result)

# inside an async caller:
# answer = await run_agent({"message": "..."})
```

Our initial context is just the starting event (maybe a user message, maybe a cron fired, maybe a webhook, etc.), and we ask the LLM to choose the next step (tool) or to determine that we're done.
Here's a multi-step example:
027-agent-loop-animation.mp4
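The same loop can also be traced end to end in code. The sketch below swaps the model for a scripted stand-in so it runs without any API key. Everything in it is an assumption made for illustration: the `TOOLS` table, the `SCRIPT` of canned JSON tool calls, the event shapes, and the `max_steps` guard are not part of the original guide.

```python
import asyncio
import json

# Hypothetical tools: plain deterministic functions keyed by intent name.
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

# Stand-in for a real model: a fixed script of JSON "tool calls".
SCRIPT = [
    '{"intent": "add", "args": {"a": 2, "b": 3}}',
    '{"intent": "multiply", "args": {"a": 5, "b": 4}}',
    '{"intent": "done", "final_answer": "(2 + 3) * 4 = 20"}',
]


async def determine_next_step(context):
    # A real implementation would send `context` to an LLM here.
    # The stub picks the next scripted step by counting prior tool results.
    turn = sum(1 for event in context if event.get("type") == "tool_result")
    return json.loads(SCRIPT[turn])


async def execute_step(step):
    # Deterministic code runs the tool the "model" selected.
    result = TOOLS[step["intent"]](**step["args"])
    return {"type": "tool_result", "intent": step["intent"], "result": result}


async def run_agent(initial_event, max_steps=10):
    context = [initial_event]
    for _ in range(max_steps):  # bounded, unlike a bare `while True`
        next_step = await determine_next_step(context)
        context.append(next_step)
        if next_step["intent"] == "done":
            return next_step["final_answer"], context
        context.append(await execute_step(next_step))
    raise RuntimeError("agent did not finish within max_steps")


answer, context = asyncio.run(
    run_agent({"type": "user_message", "message": "compute (2 + 3) * 4"})
)
print(answer)
for event in context:
    print(event)
```

Printing `context` at the end shows the alternating pattern from the list above: a tool call chosen by the model, then the result appended by deterministic code, until the step is "done".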
At the end of the day, this approach just doesn't work as well as we want it to.
In building HumanLayer, I've talked to at least 100 SaaS builders (mostly technical founders) looking to make their existing product more agentic. The journey usually goes something like:
- Decide you want to build an agent
- Product design, UX mapping, what problems to solve
- Want to move fast, so grab $FRAMEWORK and get to building
- Get to 70-80% quality bar
- Realize that 80% isn't good enough for most customer-facing features
- Realize that getting past 80% requires reverse-engineering the framework, prompts, flow, etc.
- Start over from scratch
Random Disclaimers
DISCLAIMER: I'm not sure of the exact right place to say this, but here seems as good as any: this is BY NO MEANS meant to be a dig on either the many frameworks out there, or the pretty dang smart people who work on them. They enable incredible things and have accelerated the AI ecosystem.
I hope that one outcome of this post is that agent framework builders can learn from the journeys of myself and others, and make frameworks even better.
Especially for builders who want to move fast but need deep control.
DISCLAIMER 2: I'm not going to talk about MCP. I'm sure you can see where it fits in.
DISCLAIMER 3: I'm using mostly TypeScript, for reasons, but all this stuff works in Python or any other language you prefer.
Anyways back to the thing...
After digging through hundreds of AI libraries and working with dozens of founders, my instinct is this:
- There are some core things that make agents great
- Going all in on a framework and building what is essentially a greenfield rewrite may be counter-productive
- There are some core principles that make agents great, and you will get most/all of them if you pull in a framework
- BUT, the fastest way I've seen for builders to get high-quality AI software in the hands of customers is to take small, modular concepts from agent building, and incorporate them into their existing product
- These modular concepts from agents can be defined and applied by most skilled software engineers, even if they don't have an AI background
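As one illustration of taking a small, modular concept and dropping it into an existing product, the sketch below wraps a single LLM step in otherwise deterministic code. It is a sketch under stated assumptions, not a recommended implementation: the ticket-routing scenario, `Ticket`, `handle_ticket`, `ALLOWED_INTENTS`, and the `fake_llm_classify` stand-in (which replaces a real model call) are all hypothetical.

```python
import json
from dataclasses import dataclass

# Hypothetical set of intents the surrounding product knows how to handle.
ALLOWED_INTENTS = {"refund", "cancel", "other"}


@dataclass
class Ticket:
    customer_id: str
    text: str


def fake_llm_classify(text: str) -> str:
    """Stand-in for a model call; returns JSON shaped like a structured output."""
    intent = "refund" if "refund" in text.lower() else "other"
    return json.dumps({"intent": intent})


def handle_ticket(ticket: Ticket, classify=fake_llm_classify) -> str:
    # Deterministic: validate input before spending a model call
    if not ticket.text.strip():
        return "rejected: empty ticket"

    # The single LLM step: natural language in, structured JSON out
    intent = json.loads(classify(ticket.text)).get("intent")

    # Deterministic: fall back to a safe default on unexpected model output
    if intent not in ALLOWED_INTENTS:
        intent = "other"

    # Deterministic: ordinary routing code owns the control flow
    if intent == "refund":
        return f"queued refund review for {ticket.customer_id}"
    if intent == "cancel":
        return f"started cancellation for {ticket.customer_id}"
    return f"forwarded {ticket.customer_id} to a human"


reply = handle_ticket(Ticket("c_42", "I'd like a refund for my last order"))
print(reply)
```

Because the model call is just an injected function, the surrounding code can be tested and reasoned about like any other software, with or without an AI background.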
- How We Got Here: A Brief History of Software
- Factor 1: Natural Language to Tool Calls
- Factor 2: Own your prompts
- Factor 3: Own your context window
- Factor 4: Tools are just structured outputs
- Factor 5: Unify execution state and business state
- Factor 6: Launch/Pause/Resume with simple APIs
- Factor 7: Contact humans with tool calls
- Factor 8: Own your control flow
- Factor 9: Compact Errors into Context Window
- Factor 10: Small, Focused Agents
- Factor 11: Trigger from anywhere, meet users where they are
- Factor 12: Make your agent a stateless reducer
- Contribute to this guide here
- I talked about a lot of this on an episode of the Tool Use podcast in March 2025
- I write about some of this stuff at The Outer Loop
- I do webinars about Maximizing LLM Performance with @hellovai
- We build OSS agents with this methodology under got-agents/agents
- We ignored all our own advice and built a framework for running distributed agents in Kubernetes
- Other links from this guide:
  - 12 Factor Apps
  - Building Effective Agents (Anthropic)
  - Prompts are Functions
  - Library patterns: Why frameworks are evil
  - The Wrong Abstraction
  - Mailcrew Agent
  - Mailcrew Demo Video
  - Chainlit Demo
  - TypeScript for LLMs
  - Schema Aligned Parsing
  - Function Calling vs Structured Outputs vs JSON Mode
  - BAML on GitHub
  - OpenAI JSON vs Function Calling
  - Outer Loop Agents
  - Airflow
  - Prefect
  - Dagster
  - Inngest
  - Windmill
  - The AI Agent Index (MIT)
  - NotebookLM on Finding Model Capability Boundaries
Thanks to everyone who has contributed to 12-factor agents!
This is the current version of 12-factor agents, version 1.0. There is a draft of version 1.1 on the v1.1 branch. There are a few Issues to track work on v1.1.
All content and images are licensed under a CC BY-SA 4.0 License
Code is licensed under the Apache 2.0 License