这是用户在 2025-7-22 9:12 为 https://avi.im/blag/2025/rickrolling-turso/ 保存的双语快照页面,由 沉浸式翻译 提供双语支持。了解如何保存?
About Now  关于现在

Rickrolling Turso DB (SQLite rewrite in Rust)
《Rickrolling Turso DB (Rust 重写的 SQLite) - blag》

This is a beginner’s guide to hacking into Turso DB (formerly known as Limbo), the SQLite rewrite in Rust. In this short post, I will explore how to get familiar with Turso’s codebase, tooling, and tests (hereafter mentioned as just Turso).
这是一份关于如何开始探索 Turso DB(原名 Limbo,即用 Rust 重写的 SQLite)的入门指南。在这篇简短的文章中,我将介绍如何熟悉 Turso 的代码库、工具和测试(以下统称为 Turso)。

Disclosure: I work in the same organization where Turso is being developed. However, I am not a contributor (yet) and most of my work is on the Turso Server.
声明:我在 Turso 开发的同一组织工作。然而,我目前还不是贡献者,我的工作主要集中在 Turso 服务器上。

I don’t contribute to the Turso Database, but I am somewhat familiar with SQLite. I wanted to take Turso for a spin and explore the codebase. I timeboxed my experiment for 6 hours, including writing this blog post. Do note that Turso is under heavy development, so much so that core developers spend more time resolving merge conflicts than writing code. So expect most of the content in this blog post to get obsoleted by next month.
我不为 Turso 数据库做贡献,但对 SQLite 有一定的了解。我想试试 Turso 并探索其代码库。我给这个实验设定了 6 小时的时间,包括写这篇博客。请注意 Turso 正在积极开发中,以至于核心开发人员花更多时间解决合并冲突而不是编写代码。所以可以预期这篇博客中的大部分内容在下个月都会过时。

Getting and Compiling  获取和编译

Turso is written in Rust. So getting the code and running it was fairly simple:
Turso 是用 Rust 编写的。所以获取代码并运行它相当简单:

git clone git@github.com:tursodatabase/turso.git
cd turso

cargo run
   Compiling turso_cli v0.1.3-pre.3 (/turso/cli)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.74s
     Running `target/debug/tursodb`

Turso v0.1.3-pre.3
Enter ".help" for usage hints.
This software is ALPHA, only use for development, testing, and experimentation.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
turso>

Side note: compiling and running SQLite is also simple, btw.
附注:编译和运行 SQLite 也是简单的,顺便一提。

The big hurdle I find with large codebases is that I have no idea how to compile and get them to run. My background is not in systems programming. I really like how modern languages make it easy to compile and run something. Last time I tried make build on a large C codebase, I invoked Cthulhu and got a bunch of unrelated errors, and I had no idea how to solve any of them.
我在处理大型代码库时遇到的主要困难是我不知道如何编译并让它们运行。我的背景不是系统编程。我真的很喜欢现代语言如何让编译和运行变得简单。上次我在一个大型 C 代码库上尝试 make build 时,我召唤了克苏鲁并得到了一堆不相关的错误,而且我不知道如何解决其中任何一个。

turso> create table t(v);
turso> insert into t values("answer to everything");
turso> select * from t;
┌──────────────────────┐
│ v                    │
├──────────────────────┤
│ answer to everything │
└──────────────────────┘
turso>

My goal was to hack the column to provide some custom value instead of whatever is inserted. Perhaps, 42, the answer to everything.
我的目标是修改列以提供一些自定义值,而不是插入的值。也许, 42 ,万物的答案。

Some internals - op_column  一些内部细节 - op_column

To make this work, we need to figure out how a query is run and executed. How the db parses the bytes it fetched from disk into something meaningful. It all starts with a db connection, so we must first understand the entry point and lifecycle of a connection. The repo has a mermaid diagram here which explains the sequence of a query.
为了让这工作,我们需要弄清楚查询是如何运行和执行的。数据库如何解析从磁盘获取的字节以使其具有意义。这一切都始于数据库连接,因此我们必须首先理解连接的入口点和生命周期。仓库在这里有一个 mermaid 图表,解释了查询的顺序。

Knowing SQLite internals helps here. In short, SQLite has a VM interpreter. Every query statement is turned into a bunch of opcodes, then executed on this VM, called VDBE.
了解 SQLite 的内部机制有助于此操作。简而言之,SQLite 有一个虚拟机解释器。每个查询语句都会被转换为一系列操作码,然后在名为 VDBE 的虚拟机上执行。

SQLite also has a command equivalent to the EXPLAIN query we see in most databases. It gives a breakdown of a query plan and how it is run:
SQLite 还有一个与大多数数据库中看到的 EXPLAIN 查询等效的命令。它提供了查询计划的分析以及其执行方式:

turso> explain select * from t;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     7     0                    0   Start at 7
1     OpenRead           0     2     0                    0   table=t, root=2
2     Rewind             0     6     0                    0   Rewind table t
3       Column           0     0     1                    0   r[1]=t.v
4       ResultRow        1     1     0                    0   output=r[1]
5     Next               0     3     0                    0
6     Halt               0     0     0                    0
7     Transaction        0     0     0                    0   write=false
8     Goto               0     1     0                    0

turso>

Read this post by Ben Johnson to understand more about VDBE. Turso CLI also has an option to inspect any opcode:
阅读 Ben Johnson 的这篇帖子,以了解更多关于 VDBE 的信息。Turso CLI 也有一个选项可以检查任何操作码:

turso> .opcodes Column

Column
-------
Interpret the data that cursor P1 points to as a structure built using the MakeRecord instruction.
(See the MakeRecord opcode for additional information about the format of the data.) Extract the
P2-th column from this record. If there are less than (P2+1) values in the record, extract a NULL.
The value extracted is stored in register P3. If the record contains fewer than P2 fields, then
extract a NULL. Or, if the P4 argument is a P4_MEM use the value of the P4 argument as the result.
If the OPFLAG_LENGTHARG bit is set in P5 then the result is guaranteed to only be used by the
length() function or the equivalent. The content of large blobs is not loaded, thus saving CPU
cycles. If the OPFLAG_TYPEOFARG bit is set then the result will only be used by the typeof()
function or the IS NULL or IS NOT NULL operators or the equivalent. In this case, all content
loading can be omitted.

In short, the Column opcode is the one which is responsible for converting the values fetched from disk to an in-memory data structure. So if I wanted to hack it, this is where I need to modify. But how do I find how this opcode is implemented?
简而言之, Column 指令码负责将从磁盘获取的值转换为内存数据结构。所以如果我想修改它,就需要从这里进行修改。但我该如何找到这个指令码的实现方式?

Since this is a Rust codebase, I set RUST_LOG=trace and did cargo run. This spit out a bunch of log lines, walking through the VM execution. RUST_LOG env works for most, if not all, Rust projects; it is quite universal.
由于这是一个 Rust 项目,我设置了 RUST_LOG=trace 并执行了 cargo run 。这输出了大量日志行,展示了虚拟机执行过程。 RUST_LOG 环境适用于大多数甚至所有 Rust 项目,非常通用。

With some more exploration of the codebase, I found out the following:
通过进一步探索代码库,我发现以下几点:

  1. The Connection struct   Connection 结构体
  2. Once a connection is created, the SQL statement is parsed and executed here and here.
    一旦创建连接,SQL 语句就在这里和这里被解析和执行。
  3. translate_select generates the bytecode VM instructions
    translate_select 生成字节码虚拟机指令
  4. op_column is implemented here   op_column 在这里实现
  5. The Program builder runs the VM
    程序构建器运行虚拟机

With all this info, I modified the op_column to do just this:
有了这些信息,我修改了 op_column ,让它只做这件事:

state.registers[*dest] = Register::Value(Value::Integer(42));
state.pc += 1;
return Ok(InsnFunctionStepResult::Step);

and it worked!  而且它成功了!

turso> insert into t values("answer to everything");
turso>
turso> select * from t;
┌────┐
│ v  │
├────┤
│ 42 │
└────┘
turso>

But there is one big issue: how do I retrieve my data when I want to?
但是有一个大问题:我怎样才能在我需要的时候检索我的数据?

PRAGMA

SQLite lets you configure the database using PRAGMA statements. We are lucky today; Turso already supports them. Now I need to figure out how to hook up PRAGMA with the connection and use that when executing from the VM. This turned out to be easy too, as Program would have access to the Connection. So I added the flag in the Connection struct.
SQLite 允许你使用 PRAGMA 语句来配置数据库。我们今天很幸运,Turso 已经支持它们了。现在我需要弄清楚如何将 PRAGMA 与连接挂钩,并在从 VM 执行时使用它。这也很容易,因为 Program 将可以访问 Connection 。所以我将在 Connection 结构中添加这个标志。

We want to add a new PRAGMA, something that is not supported by SQLite yet. So we need to first modify the parser to consider our pragma. Then we need to update this flag to enable/disable. But this turned out to be much simpler than I expected. I added the new PRAGMA here, then fixed everywhere the Rust compiler complained.
我们希望添加一个新的 PRAGMA,这是 SQLite 目前还不支持的。因此,我们需要首先修改解析器以考虑我们的 PRAGMA。然后我们需要更新这个标志来启用/禁用。但结果比我想象的要简单得多。我在这里添加了新的 PRAGMA,然后修复了 Rust 编译器抱怨的所有地方。

Tests  测试

There are a bunch of bugs with this little hack. I also wanted to play with the DST (Deterministic Simulation Tests). To run the simulator, you can do:
这个小技巧有很多 bug。我还想试试 DST(确定性模拟测试)。要运行模拟器,你可以这样做:

cargo run -p limbo_sim

This runs the simulator till it finds a bug or till the end of time. But unfortunately, my time ran out, so I stopped my experiment. So next steps would be:
这个模拟器会一直运行,直到发现一个错误或者直到时间耗尽。但不幸的是,我的时间用完了,所以我停止了我的实验。接下来的步骤将是:

  1. Fix the bugs and add tests in the simulator
    修复模拟器中的错误并添加测试
  2. Selectively enable this feature(?) for some specific columns
    为某些特定列有选择地启用此功能(?)
  3. Add the same in SQLite. I notice a lot of method names in Turso to be similar to SQLite, so we could follow the same path
    为 SQLite 添加相同内容。我注意到 Turso 中有许多方法名称与 SQLite 相似,因此我们可以遵循相同的路径
  4. Explore adding the same in Postgres or MySQL (or try failing)
    探索在 Postgres 或 MySQL 中添加相同内容(或尝试失败)

Demo Time  演示时间

Finally, here is the demo. You can share this special build with your frens and whatever song they insert, they will always get their favourite song back ;)
最后,这是演示。你可以把这个特别版本分享给你的朋友们,无论他们输入什么歌曲,他们总会收到他们最喜欢的歌曲 ;)

Here are all my changes to run and play this.
这是我所有用于运行和播放这些的更改。

Though I am biased, I think Turso DB is a great project for someone new to jump into databases and start contributing. Rust makes it so accessible and easy to hack. If this post interested you, here are some good first issues.
虽然我有些偏见,但我认为 Turso DB 是一个非常适合新手进入数据库领域并开始贡献的好项目。Rust 让这一切变得非常容易和便捷。如果这篇文章引起了你的兴趣,这里有一些很好的初始问题。


1. Trivia: Db’s experimental experimental name was Ligma, written in Zig. Later it was rewritten to Rust and renamed to Limbo.
1. 趣闻:Db 的实验性名称最初是 Ligma,用 Zig 语言编写。后来改用 Rust 重写,并更名为 Limbo。

2. The blog posts and the linked papers have some more details on SQLite’s internals: How bloom filters made SQLite 10x faster and In search of a faster SQLite.
2. 博客文章和链接的论文中还有一些关于 SQLite 内部机制的详细信息:Bloom 过滤器如何使 SQLite 速度提升 10 倍,以及寻找更快的 SQLite。