Down with endless data. Alexander Supertramp

Neurosymbolic AI is the answer to large language models’ inability to stop hallucinating

The main problem with big tech’s experiment with artificial intelligence (AI) is not that it could take over humanity. It’s that large language models (LLMs) like OpenAI’s ChatGPT, Google’s Gemini and Meta’s Llama continue to get things wrong, and the problem is intractable.

Known as hallucinations, these errors were perhaps most prominently exemplified by the case of US law professor Jonathan Turley, who was falsely accused of sexual harassment by ChatGPT in 2023.

OpenAI’s solution seems to have been to basically “disappear” Turley by programming ChatGPT to say it can’t respond to questions about him, which is clearly not a fair or satisfactory solution. Trying to solve hallucinations after the event and case by case is clearly not the way to go.

The same can be said of LLMs amplifying stereotypes or giving western-centric answers. There’s also a total lack of accountability in the face of this widespread misinformation, since it’s difficult to ascertain how the LLM reached this conclusion in the first place.

We saw a fierce debate about these problems after the 2023 release of GPT-4, the most recent major paradigm in OpenAI’s LLM development. Arguably the debate has cooled since then, though without justification.

The EU passed its AI Act in record time in 2024, for instance, in a bid to be world leader in overseeing this field. But the act relies heavily on AI companies to regulate themselves without really addressing the issues in question. It hasn’t stopped tech companies from releasing LLMs worldwide to hundreds of millions of users and collecting their data without proper scrutiny.


Meanwhile, the latest tests indicate that even the most sophisticated LLMs remain unreliable. Despite this, the leading AI companies still resist taking responsibility for errors.

Unfortunately LLMs’ tendencies to misinform and reproduce bias can’t be solved with gradual improvements over time. And with the advent of agentic AI, where users will soon be able to assign projects to an LLM such as, say, booking their holiday or optimising the payment of all their bills each month, the potential for trouble is set to multiply.

The emerging field of neurosymbolic AI could solve these issues, while also reducing the enormous amounts of data required for training LLMs. So what is neurosymbolic AI and how does it work?

The LLM problem

LLMs work using a technique called deep learning, where they are given vast amounts of text data and use advanced statistics to infer patterns that determine what the next word or phrase in any given response should be. Each model – along with all the patterns it has learned – is stored in arrays of powerful computers in large data centres known as neural networks.
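To make the idea of statistical next-word prediction a little more concrete, here is a toy sketch in Python. The vocabulary, probabilities and sampling are invented for illustration only; real LLMs learn distributions over tens of thousands of tokens with neural networks, not a lookup table.

```python
import random

# Toy stand-in for a trained model: for a given two-word context, a learned
# probability distribution over possible next words (numbers are invented).
next_word_probs = {
    ("the", "cat"): {"sat": 0.6, "ran": 0.3, "flew": 0.1},
    ("cat", "sat"): {"on": 0.8, "quietly": 0.2},
}

def predict_next(context):
    """Sample the next word from the learned distribution for this context."""
    dist = next_word_probs[context]
    words, weights = zip(*dist.items())
    return random.choices(words, weights=weights, k=1)[0]

print(predict_next(("the", "cat")))  # usually "sat", occasionally "flew"
```

Even in this toy version, the output is chosen by probability rather than by checking whether it is true, which is the root of the hallucination problem.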

LLMs can appear to reason using a process called chain-of-thought, where they generate multi-step responses that mimic how humans might logically arrive at a conclusion, based on patterns seen in the training data.
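As a rough illustration of what a chain-of-thought interaction looks like, the prompt and the hypothetical reply below are invented; they are not output from any real model.

```python
# Chain-of-thought prompting: the model is nudged to spell out intermediate
# steps before giving its final answer (wording invented for illustration).
prompt = (
    "Q: A train leaves at 9:00 and the journey takes 2.5 hours. "
    "When does it arrive? Let's think step by step."
)

hypothetical_reply = (
    "Step 1: 2.5 hours is 2 hours and 30 minutes.\n"
    "Step 2: 9:00 plus 2 hours is 11:00; plus 30 minutes is 11:30.\n"
    "Answer: 11:30."
)

print(prompt)
print(hypothetical_reply)
```

Each step is still produced by next-word prediction, so an early mistake simply propagates into the steps that follow.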

Undoubtedly, LLMs are a great engineering achievement. They are impressive at summarising text and translating, and may improve the productivity of those diligent and knowledgeable enough to spot their mistakes. Nevertheless they have great potential to mislead because their conclusions are always based on probabilities – not understanding.

Misinformation in, misinformation out. Collagery

A popular workaround is called “human-in-the-loop”: making sure that humans using AIs still make the final decisions. However, apportioning blame to humans does not solve the problem. They’ll still often be misled by misinformation.
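A minimal sketch of the human-in-the-loop pattern, with invented function names (draft_answer stands in for a call to a real model):

```python
def draft_answer(question):
    # Stand-in for an LLM call; in practice this would query a model API.
    return "A fluent draft reply that may nonetheless contain errors."

def human_in_the_loop(question):
    """Let the model draft an answer, but require a person to approve it."""
    draft = draft_answer(question)
    print("Model draft:", draft)
    verdict = input("Approve this answer? [y/n] ")
    return draft if verdict.strip().lower() == "y" else None

# human_in_the_loop("Summarise this contract for me")
```

The catch, as above, is that the reviewer can only reject the errors they manage to spot.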

LLMs now need so much training data to advance that we’re having to feed them synthetic data, meaning data created by LLMs. This data can copy and amplify existing errors from its own source data, such that new models inherit the weaknesses of old ones. As a result, the cost of programming AIs to be more accurate after their training – known as “post-hoc model alignment” – is skyrocketing.
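A toy illustration of why this matters, with entirely invented numbers: if each new model learns partly from text generated by its predecessor, a share of the old errors is copied forward on top of the new model’s own mistakes.

```python
# Invented rates, purely to illustrate the compounding effect described above.
error_rate = 0.02          # hypothetical error rate of the first model
inherited_share = 0.5      # hypothetical share of old errors copied via synthetic data
fresh_errors = 0.02        # hypothetical new errors each generation introduces

for generation in range(1, 6):
    error_rate = inherited_share * error_rate + fresh_errors
    print(f"generation {generation}: error rate ~ {error_rate:.3f}")
```

The particular numbers don’t matter; the point is the direction of travel, with errors in the source data getting baked into each new generation rather than washing out.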

It also becomes increasingly difficult for programmers to see what’s going wrong because the number of steps in the model’s thought process becomes ever larger, making it harder and harder to correct for errors.

Neurosymbolic AI combines the predictive learning of neural networks with teaching the AI a series of formal rules that humans learn to be able to deliberate more reliably. These include logic rules, like “if a then b”, such as “if it’s raining then everything outside is normally wet”; mathematical rules, like “if a = b and b = c then a = c”; and the agreed upon meanings of things like words, diagrams and symbols. Some of these will be inputted directly into the AI system, while it will deduce others itself by analysing its training data and doing “knowledge extraction”.
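A minimal sketch of the idea, with all names and facts invented for illustration: a statistical model proposes an answer, and an explicit, human-readable rule decides whether the proposal is actually licensed by what the system knows.

```python
# Tiny knowledge base of facts the system can rely on.
facts = {("socrates", "is_a", "human")}

def rule_humans_are_mortal(claim):
    """Logic rule of the 'if a then b' kind: if x is a human, then x is mortal."""
    subject, predicate = claim
    return predicate == "is_mortal" and (subject, "is_a", "human") in facts

def neural_guess(question):
    # Stand-in for a neural network's pattern-based prediction.
    return ("socrates", "is_mortal")

def answer(question):
    claim = neural_guess(question)
    if rule_humans_are_mortal(claim):
        return claim            # the claim follows from an explicit rule
    return "cannot verify"      # refuse rather than risk a hallucination

print(answer("Is Socrates mortal?"))
```

The rule is inspectable: anyone can read it and see why a given answer was allowed through.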

This should create an AI that will never hallucinate and will learn faster and smarter by organising its knowledge into clear, reusable parts. For example if the AI has a rule about things being wet outside when it rains, there’s no need for it to retain every example of the things that might be wet outside – the rule can be applied to any new object, even one it has never seen before.
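The rain example can be sketched in a few lines (the function and the objects are invented): one stored rule stands in for any number of stored examples, and applies to things the system has never seen.

```python
def is_wet(obj, raining, outdoors=True):
    """Rule: if it's raining, then anything outside is normally wet."""
    return raining and outdoors

# No training example involving a "garden gnome" is needed; the rule still applies.
print(is_wet("garden gnome", raining=True))    # True
print(is_wet("garden gnome", raining=False))   # False
```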

During model development, neurosymbolic AI also integrates learning and formal reasoning using a process known as the “neurosymbolic cycle”. This involves a partially trained AI extracting rules from its training data then instilling this consolidated knowledge back into the network before further training with data.
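A highly simplified outline of that cycle, in which every function is a placeholder rather than a real training API:

```python
def train_on_data(model, data):
    ...  # ordinary statistical training on examples

def extract_rules(model):
    ...  # knowledge extraction: distil general rules from the partially trained model
    return []

def instil_rules(model, rules):
    ...  # push the consolidated rules back into the network

def neurosymbolic_cycle(model, data, rounds=3):
    """Alternate between learning from data and consolidating what was learned into rules."""
    for _ in range(rounds):
        train_on_data(model, data)
        rules = extract_rules(model)
        instil_rules(model, rules)
    return model
```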

This is more energy efficient because the AI needn’t store as much data, while the AI is more accountable because it’s easier for a user to control how it reaches particular conclusions and improves over time. It’s also fairer because it can be made to follow pre-existing rules, such as: “For any decision made by the AI, the outcome must not depend on a person’s race or gender”.
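One way such a rule could be enforced in practice is a counterfactual check, sketched below with invented attribute names and a deliberately simple decision function: flip the protected attributes and refuse to release the decision if the outcome changes.

```python
PROTECTED = ("race", "gender")
PLACEHOLDER_VALUES = ("group_a", "group_b")   # stand-ins for real categories

def decide(applicant):
    # Stand-in for the AI's decision; here it looks only at income.
    return applicant["income"] > 30000

def fair_decision(applicant):
    """Enforce: the outcome must not depend on a person's race or gender."""
    outcome = decide(applicant)
    for attr in PROTECTED:
        for value in PLACEHOLDER_VALUES:
            counterfactual = dict(applicant, **{attr: value})
            if decide(counterfactual) != outcome:
                raise ValueError(f"decision depends on {attr}")
    return outcome

print(fair_decision({"income": 45000, "race": "group_a", "gender": "group_b"}))
```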

The third wave

The first wave of AI in the 1980s, known as symbolic AI, was actually based on teaching computers formal rules that they could then apply to new information. Deep learning followed as the second wave in the 2010s, and many see neurosymbolic AI as the third.

It’s easiest to apply neurosymbolic principles to AI in niche areas, because the rules can be clearly defined. So it’s no surprise that we’ve seen it first emerge in Google’s AlphaFold, which predicts protein structures to help with drug discovery; and AlphaGeometry, which solves complex geometry problems.

For more broad-based AIs, China’s DeepSeek uses a learning technique called “distillation” which is a step in the same direction. But to make neurosymbolic AI fully feasible for general models, there still needs to be more research to refine their ability to discern general rules and perform knowledge extraction.
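As a bare-bones illustration of the distillation idea (toy numbers, no real models involved): a small “student” is trained to match the probability distributions produced by a larger “teacher”, rather than learning only from raw data.

```python
import numpy as np

teacher_probs = np.array([0.7, 0.2, 0.1])   # teacher's soft output for one example
student_logits = np.zeros(3)                # the student starts with no preference

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for step in range(200):
    student_probs = softmax(student_logits)
    # Gradient of the cross-entropy between teacher and student distributions.
    grad = student_probs - teacher_probs
    student_logits -= 0.5 * grad            # simple gradient-descent step

print(np.round(softmax(student_logits), 2))  # approaches the teacher's [0.7, 0.2, 0.1]
```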

It’s unclear to what extent LLM makers are working on this already. They certainly sound like they’re heading in the direction of trying to teach their models to think more cleverly, but they also seem wedded to the need to scale up with ever larger amounts of data.

The reality is that if AI is going to keep advancing, we will need systems that adapt to novelty from only a few examples, that check their understanding, that can multitask and reuse knowledge to improve data efficiency and that can reason reliably in sophisticated ways.

This way, well designed digital technology could potentially even offer an alternative to regulation, because the checks and balances would be built into the architecture and perhaps standardised across the industry. There’s a long way to go, but at least there’s a path ahead.