Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad
The International Mathematical Olympiad (“IMO”) is the world’s most prestigious competition for young mathematicians, and has been held annually since 1959. Each country taking part is represented by six elite, pre-university mathematicians who compete to solve six exceptionally difficult problems in algebra, combinatorics, geometry, and number theory. Medals are awarded to the top half of contestants, with approximately 8% receiving a prestigious gold medal.
Recently, the IMO has also become an aspirational challenge for AI systems as a test of their advanced mathematical problem-solving and reasoning capabilities. Last year, Google DeepMind’s combined AlphaProof and AlphaGeometry 2 systems achieved the silver-medal standard, solving four out of the six problems and scoring 28 points. Making use of specialist formal languages, this breakthrough demonstrated that AI was beginning to approach elite human mathematical reasoning.
This year, we were amongst an inaugural cohort to have our model results officially graded and certified by IMO coordinators using the same criteria as for student solutions. Recognizing the significant accomplishments of this year’s student-participants, we’re now excited to share the news of Gemini’s breakthrough performance.
Breakthrough Performance at IMO 2025 with Gemini Deep Think
An advanced version of Gemini Deep Think solved five out of the six IMO problems perfectly, earning 35 of a possible 42 points (each problem is worth 7) and achieving gold-medal-level performance. The solutions can be found online here.
"We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points — a gold medal score. Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow."
IMO President Prof. Dr. Gregor Dolinar
This achievement is a significant advance over last year’s breakthrough result. At IMO 2024, AlphaGeometry and AlphaProof required experts to first translate problems from natural language into domain-specific languages, such as Lean, and vice versa for the proofs. The computation also took two to three days. This year, our advanced Gemini model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions – all within the 4.5-hour competition time limit.
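To give a sense of what translating a problem into a formal language involves, here is a toy statement written in Lean 4. It is an illustrative sketch only: the theorem and its name are hypothetical, far simpler than any IMO problem, and not taken from AlphaProof or AlphaGeometry. The natural-language claim "the sum of two even numbers is even" becomes:

```lean
-- Hypothetical toy example, not an IMO problem or AlphaProof output.
-- Natural language: "the sum of two even natural numbers is even."
theorem sum_of_two_evens_is_even (a b : Nat)
    (ha : a % 2 = 0) (hb : b % 2 = 0) : (a + b) % 2 = 0 := by
  omega  -- decision procedure for linear arithmetic over Nat and Int
```

Every hypothesis has to be spelled out in a form the proof checker can verify, which is part of why last year's pipeline needed expert translation in both directions.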
Making the most of Deep Think mode
We achieved this year’s result using an advanced version of Gemini Deep Think – an enhanced reasoning mode for complex problems that incorporates some of our latest research techniques, including parallel thinking. This setup enables the model to simultaneously explore and combine multiple possible solutions before giving a final answer, rather than pursuing a single, linear chain of thought.
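The general idea of exploring several candidates at once and combining them can be sketched in a few lines of Python. This is a loose illustration, not a description of how Deep Think works internally: `explore` and `parallel_think` are hypothetical names, the "problem" is just summing a list, and simple majority voting stands in for whatever combination step the real system uses.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def explore(problem, path_id):
    """Hypothetical stand-in for one reasoning path.

    Sums the list of integers, but every fourth path makes a
    deliberate off-by-one slip to mimic a flawed line of reasoning.
    """
    answer = sum(problem)
    return answer + 1 if path_id % 4 == 0 else answer


def parallel_think(problem, n_paths=9):
    """Explore n_paths candidates concurrently, then combine them by vote."""
    with ThreadPoolExecutor(max_workers=n_paths) as pool:
        candidates = list(pool.map(lambda i: explore(problem, i), range(n_paths)))
    # Combine step (assumption): keep the answer that the most paths agree on.
    answer, votes = Counter(candidates).most_common(1)[0]
    return answer, votes, candidates


if __name__ == "__main__":
    answer, votes, candidates = parallel_think([3, 5, 7])
    print(f"candidates={candidates}")
    print(f"final answer={answer} (agreed by {votes} of {len(candidates)} paths)")
```

A single linear chain that happened to be one of the flawed paths would return the wrong sum; sampling several paths and combining them lets the correct answer win out in this toy setting.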
To make the most of the reasoning capabilities of Deep Think, we additionally trained this version of Gemini with novel reinforcement learning techniques that can leverage more multi-step reasoning, problem-solving and theorem-proving data. We also provided Gemini with access to a curated corpus of high-quality solutions to mathematics problems, and added some general hints and tips on how to approach IMO problems to its instructions.
We will be making a version of this Deep Think model available to a set of trusted testers, including mathematicians, before rolling it out to Google AI Ultra subscribers.
The Future of AI and Mathematics
Google DeepMind has ongoing collaborations with the mathematical community, but we are still only at the start of AI’s potential to contribute to mathematics. By teaching our systems to reason more flexibly and intuitively, we are getting closer to building AI that can solve more complex and advanced mathematics.
While our approach this year was based purely on natural language with Gemini, we also continue making progress on our formal systems, AlphaGeometry and AlphaProof. We believe agents that combine natural language fluency with rigorous reasoning (including verified reasoning in formal languages) will become invaluable tools for mathematicians, scientists, engineers, and researchers, helping us advance human knowledge on the path to AGI.
Acknowledgements
We thank the International Mathematical Olympiad organization for their support.
Thang Luong led the overall technical direction of the advanced Gemini model with Deep Think for IMO and, with Edward Lockhart, co-led the overall coordination of the IMO 2025 effort.
The IMO 2025 system would not have been possible without the following technical leads. Dawsen Hwang and Junehyuk Jung co-led training data and expert evaluation. Jonathan Lee, Nate Kushman, Pol Moreno, and Yi Tay co-led the training of the advanced Gemini Deep Think model, while Lei Yu led model evaluation. Golnaz Ghiazi, Garrett Bingham, and Lalit Jain co-led Deep Think inference, while Dawsen Hwang and Vincent Cohen-Addad co-led an enhanced inference approach.
The IMO 2025 system was also developed with key contributions from Theophane Weber and Ankesh Anand for modeling; Vinay Ramasesh, Andreas Kirsch, Jieming Mao, Zicheng Xu, Wilfried Bounsi, and Vahab Mirrokni for inference; and Hoang Nguyen, Fred Zhang, Mahan Malihi, and Yangsibo Huang for training data.
We thank related teams and efforts for their contributions: the AlphaGeometry team with Yuri Chervonyi (lead), Trieu Trinh, Hoang Nguyen, Junsu Kim, Mirek Olšák, Marcelo Menegali, and Xiaomeng Yang; and Miklós Z. Horváth, Aja Huang, and Goran Žužić for formal mathematics. We thank Fabian Pedregosa, Richard Song, Alex Zhai, Sara Javanmardi, YaGuang Li, Filipe Miguel de Almeida, Silvio Lattanzi, Ashkan Norouzi Fard, Tal Schuster, Honglu Fan, Xuezhi Wang, Aditi Mavalankar, Tom Schaul, and Rosemary Ke for support and collaboration.
We especially thank other core members of the Deep Think team (Archit Sharma, Tong He, Shubha Raghvendra), the post-training effort (Tianhe Kevin Yu, Siamak Shakeri, Hanzhao Lin, Cosmo Du, Sid Lall), and the Thinking Area research that the IMO 2025 system was built on.
This effort was advised by Quoc Le and Pushmeet Kohli, with program support from Kristen Chiafullo and Alex Goldin.
We’d also like to thank our experts for providing data and evaluations: Insuk Seo (lead), Jiwon Kang, Donghyun Kim, Junsu Kim, Jimin Kim, Seongbin Jeon, Yoonho Na, Seunghwan Lee, Jihoo Lee, Younghun Jo, Yongsuk Hur, Seongjae Park, Kyuhyeon Choi, Minkyu Choi, Su-Hyeok Moon, Seojin Kim, Yueun Lee, Taehun Kim, Jeeho Ryu, Seungwoo Lee, Dain Kim, Sanha Lee, Hyunwoo Choi, Aiden Jung, Youngbeom Jin, Jeonghyun Ahn, Junhwi Bae, Gyumin Kim, Nam Dung Tran, Cheng-Chiang Tsai, Kari Ragnarsson, Kiat Chuan Tan, Yahya Tabesh, Hamed Mahdavi, Azin Nazari, Xiangzhuo Ding, Chu-Lan Kao, Steven Creech, Tony Feng, Ciprian Manolescu.
Further thanks to Jessica Lo and Sajjad Zafar for their support for compute provision and management; Jane Labanowski, Andy Forbes, Sean Nakamoto for legal and logistics; and Omer Levy, Timothy Lillicrap, Jack Rae, Yifeng Lu, Heng-tze Cheng, Ed Chi, Vahab Mirrokni, Tulsee Doshi, Madhavi Sewak, Melvin Johnson, Koray Kavukcuoglu, Oriol Vinyals, Jeff Dean, Demis Hassabis, and Sergey Brin for their support and advice.
Finally, we thank Prof. Gregor Dolinar from the IMO Board for his support and endorsement.
The IMO have confirmed that our submitted answers are complete and correct solutions. It is important to note that their review does not extend to validating our system, processes, or underlying model (see more).