A Practical Survey of Attacker-Perspective LLM Security Research
To build a homelab security scanner that “thinks” like an attacker and delivers quick wins, you can draw on several recent research works. Below are eight influential, practical papers (found via Semantic Scholar) covering asset discovery, attack path analysis, LLM-driven security, and automated vulnerability scanning. For each, I summarize the key idea and how you might apply it for quick wins in your homelab project.
Identifying all externally exposed assets is the first step in attack surface management. Two notable works show innovative ways to do this efficiently:
ZMap – Internet-Wide Scanner (Durumeric et al., USENIX Security 2013): Introduces a fast, open-source network scanner optimized for Internet-scale surveys. ZMap can scan the entire IPv4 address space in under 45 minutes from one machine, approaching the maximum throughput of Gigabit Ethernet (usenix.org). Quick win: Use ZMap’s high-speed scanning approach to map your homelab’s externally exposed ports and services. Sweeping each port of interest across your own IP ranges with ZMap (or its descendant tools) quickly shows which hosts answer on which ports, giving immediate insight into unexpected exposures.
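For a handful of homelab hosts you may not need ZMap itself. The sketch below is a plain concurrent TCP connect scan using only the Python standard library. It is not ZMap's technique and is far slower, but it answers the same question (“which ports answer?”) for a small address range. The function names `is_open` and `scan` are illustrative, and the host and port list are assumptions to replace with your own.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a full TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


def scan(host: str, ports: list[int], workers: int = 64) -> list[int]:
    """Probe the given ports concurrently and return the open ones, sorted."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda p: is_open(host, p), ports))
    return sorted(p for p, ok in zip(ports, results) if ok)


if __name__ == "__main__":
    # Assumed target: only scan hosts you own.
    print(scan("127.0.0.1", [22, 80, 443, 8080]))
```

A connect scan completes the TCP handshake, so it is noisier and slower per probe than a stateless scanner, which is an acceptable trade-off at homelab scale.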
GAN for Subdomain Enumeration (Degani et al., ACM SAC 2022): Proposes a Generative Adversarial Network approach to finding hidden subdomains. Traditional subdomain brute-forcing is hit-or-miss, but this GAN learns patterns from public DNS data and generates high-quality candidate names. In experiments, integrating the GAN boosted subdomain discovery by up to 61%, and the model could guess about 32% of unknown subdomains on average (cybersecurityunitn.github.io). Quick win: This suggests you could apply ML to asset discovery. For example, train a model on known hostnames (or use the authors’ approach if available) to predict likely subdomains in your homelab’s domain. This unconventional strategy may reveal services (e.g. dev, staging, forgotten VMs) that a normal wordlist might miss, giving you early “wins” by uncovering unseen assets.
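Training a GAN is a project in itself. As a much simpler stand-in (explicitly not the authors' method), the sketch below derives candidate hostnames from the naming patterns already present in your inventory: it recombines name tokens with common environment labels and increments numeric suffixes. The function name `candidate_labels` and the default environment labels are assumptions; each candidate would still need a DNS lookup to confirm it exists.

```python
import itertools
import re


def candidate_labels(known: list[str],
                     envs: tuple[str, ...] = ("dev", "staging", "test")) -> list[str]:
    """Guess unseen hostnames from the patterns in already-known ones."""
    known_set = {name.lower() for name in known}
    tokens: set[str] = set()
    guesses: set[str] = set()

    for name in known_set:
        # Split on hyphens and digit runs to recover word-like tokens.
        tokens.update(t for t in re.split(r"[-\d]+", name) if t)
        # "nas01" suggests "nas02": bump a trailing number, keep zero padding.
        match = re.fullmatch(r"(.*?)(\d+)", name)
        if match:
            stem, num = match.groups()
            guesses.add(f"{stem}{int(num) + 1:0{len(num)}d}")

    for tok, env in itertools.product(tokens, envs):
        if tok in envs:
            continue  # skip combinations like "dev-staging"
        guesses.add(f"{tok}-{env}")
        guesses.add(f"{env}-{tok}")

    # Only report names that are not already in the inventory.
    return sorted(guesses - known_set)


if __name__ == "__main__":
    print(candidate_labels(["nas01", "grafana", "git-dev"]))
```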
Understanding how an attacker might chain weaknesses across your systems helps prioritize fixes. The following research focuses on attack graphs and multi-step attack simulation:
LLM-Generated Attack Graphs (Prapty et al., arXiv 2024): This work uses Large Language Models (ChatGPT) to automate building attack graphs. The LLM “intelligently chains CVEs based on their preconditions and effects” to generate possible attack paths (arxiv.org). It even parses textual threat reports to incorporate real-world tactics. Quick win: You can mimic this by feeding your homelab’s known vulnerabilities or configuration info into an LLM prompt (with a chain-of-thought request) to see how it might link them. Essentially, use the LLM to brainstorm “If an attacker has vuln X on machine A, then they could get access to B, and so on.” This can quickly highlight a potential attack chain in your environment that you hadn’t fully considered, so you can patch the most dangerous combinations first.
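One way to make that brainstorming repeatable is to generate the prompt from your findings instead of typing it by hand. The sketch below only builds the prompt text; it does not call any model. The finding fields (`host`, `service`, `issue`) and the prompt wording are assumptions for illustration, not the format used in the paper.

```python
def build_attack_graph_prompt(findings: list[dict]) -> str:
    """Turn a list of findings into a prompt asking for chained attack paths."""
    lines = [
        f"- host={f['host']} service={f['service']} issue={f['issue']}"
        for f in findings
    ]
    return (
        "You are helping a defender audit their own homelab.\n"
        "Findings:\n"
        + "\n".join(lines)
        + "\n"
        "For each finding, state its precondition and its effect. "
        "Then list every chain in which one finding's effect satisfies "
        "another finding's precondition, ordered by risk, and name the "
        "single fix that breaks each chain. Reason step by step."
    )


if __name__ == "__main__":
    # Hypothetical inventory entries, for illustration only.
    demo = [
        {"host": "web01", "service": "nginx", "issue": "outdated version"},
        {"host": "nas01", "service": "smb", "issue": "guest share readable"},
    ]
    print(build_attack_graph_prompt(demo))
```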
Prometheus – AI-Driven Attack Path Analysis (Jin et al., arXiv 2023): Proposes “Prometheus,” a system for holistic security posture analysis. Given details of your infrastructure (devices, software versions, etc.), it automatically identifies relevant vulnerabilities, constructs potential attack graphs, and scores the risk of each path (arxiv.org). It covers multiple layers (network, system, hardware, etc.) to see how a flaw in one layer could lead to exploitation in another. Quick win: While Prometheus is a complex framework, you can adopt its principle in parts. For example, maintain an inventory of your homelab assets and their software. Use feeds like NVD to find known CVEs for those software versions, then chain those CVEs, manually or with a script, by matching their prerequisites and effects (similar to Prometheus’s approach). Even a simple rule such as “CVE-1234 enables privilege escalation on Windows; if any Windows machine also has CVE-5678 (which requires admin access), chain them” can uncover a multi-step weakness. This graph-based thinking, even done by hand or with basic scripts, can simulate an attacker’s path from an external service deep into your network, giving you a roadmap of what to fix first.
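The chaining rule described above can be sketched in a few lines. This is a minimal illustration, not Prometheus's algorithm: each vulnerability is reduced to one required privilege and one granted privilege, and a depth-first search lists every chain from a starting privilege to a goal. The `Vuln` fields and the privilege strings are assumptions, and CVE-1234 and CVE-5678 are the placeholder identifiers from the text, not real CVEs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vuln:
    cve: str       # placeholder identifier
    requires: str  # privilege the attacker needs beforehand
    grants: str    # privilege the attacker holds afterwards


def attack_paths(vulns: list[Vuln], start: str, goal: str) -> list[list[str]]:
    """Return every chain of CVE ids leading from `start` to `goal`."""
    paths: list[list[str]] = []

    def walk(state: str, used: list[str], seen: frozenset) -> None:
        if state == goal:
            paths.append(used)
            return
        for v in vulns:
            # Chain when one vuln's effect satisfies the next one's precondition.
            if v.requires == state and v.grants not in seen:
                walk(v.grants, used + [v.cve], seen | {v.grants})

    walk(start, [], frozenset({start}))
    return paths


if __name__ == "__main__":
    inventory = [
        Vuln("CVE-1234", requires="user@win1", grants="admin@win1"),
        Vuln("CVE-5678", requires="admin@win1", grants="admin@domain"),
    ]
    print(attack_paths(inventory, start="user@win1", goal="admin@domain"))
```

Tracking visited privileges in `seen` keeps the search from looping when two vulnerabilities grant each other's preconditions.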
(Aside: Traditional attack graph research such as Sheyner et al. 2002 and Ou et al. 2006 laid the groundwork for these tools (arxiv.org). More recent methods even use reinforcement learning to simulate attackers, e.g. an RL agent finding optimal attack paths in a network (arxiv.org), though such approaches can be heavy to implement. The LLM/NLP-based methods above offer more immediate, data-driven insight.)
Large Language Models can be surprisingly effective “security analysts”: reading configs, reasoning about vulnerabilities, or orchestrating attack steps in natural language. Here are key papers and how they enable creative solutions:
PentestGPT – Automated Pentesting with LLMs (Deng et al., USENIX Security 2024): PentestGPT is an LLM-empowered penetration testing tool that works step by step to emulate a human pentester. It is designed with self-refining modules (for reasoning, generating attacks, and parsing results) to handle complex tests without losing context. In evaluation it outperformed vanilla GPT-3.5 by 228% on task completion, and it has been open-sourced (6.5k+ GitHub stars in a year) with strong community adoption (usenix.org). Quick win: You can try PentestGPT on your homelab directly. Since it’s open source, deploy it and let it guide an audit of your systems. Even without using the tool, you can replicate its idea: use an LLM (like GPT-4) in an interactive loop where you feed it the results of one recon step, ask “what next?”, execute that suggestion, feed the output back, and so on. This turns pentesting into a dialog with an AI assistant. Early wins might be as simple as the LLM suggesting “scan for open SSH ports on all hosts,” which you run and discover a forgotten SSH service. The conversational, surprise-driven style keeps things from getting boring and can yield quick discoveries.
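The “feed result, ask what next, execute, repeat” loop can be sketched independently of any particular model. In the sketch below, `ask_llm` and `run_step` are hypothetical callables you supply: the first wraps whichever LLM you use, the second is where a human (or a tightly restricted runner) carries out the suggestion and returns its output. This is an illustration of the loop only, not PentestGPT's architecture.

```python
from typing import Callable


def recon_loop(ask_llm: Callable[[str], str],
               run_step: Callable[[str], str],
               first_observation: str,
               max_steps: int = 5) -> list[tuple[str, str]]:
    """Alternate between asking for the next step and recording its result."""
    history: list[tuple[str, str]] = []
    observation = first_observation
    for _ in range(max_steps):
        transcript = "\n".join(f"step: {s}\nresult: {r}" for s, r in history)
        prompt = (
            f"{transcript}\n"
            f"latest result: {observation}\n"
            "What should I check next on my own homelab? "
            "Reply DONE if nothing is left."
        )
        suggestion = ask_llm(prompt).strip()
        if suggestion.upper() == "DONE":
            break
        observation = run_step(suggestion)
        history.append((suggestion, observation))
    return history


if __name__ == "__main__":
    # Stubbed model and executor so the loop can be run without any API.
    replies = iter(["scan for open SSH ports on all hosts", "DONE"])
    log = recon_loop(lambda _prompt: next(replies),
                     lambda step: f"(you ran: {step})",
                     "3 hosts are up")
    print(log)
```

Keeping `run_step` as a separate callable makes it easy to insert a manual approval before anything executes, and `max_steps` bounds the session if the model never replies DONE.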
PentestAgent – Multi-Agent LLM Pentesting (Shen et al., arXiv 2024): PentestAgent extends the above concept by using multiple specialized LLM agents for different stages of an attack (reconnaissance, search, planning, exploitation). It also integrates Retrieval-Augmented Generation (RAG) to pull in up-to-date information (such as exploit databases). This collaborative agent approach automates intelligence gathering, vulnerability analysis, and even exploit execution with minimal human input (arxiv.org). It demonstrated higher task completion and efficiency than prior work. Quick win: In your project, you could modularize tasks similarly. For instance, have one component that uses an LLM prompt template purely for port/service enumeration, another for vulnerability lookup (given a service, query a CVE database), and another for exploit attempts (perhaps guiding a tool like Metasploit). Even if you implement these modules by hand, treating them as “agents” that hand off information is powerful. Each step can be tuned on its own (and independent steps can run in parallel), and you get an automated pipeline from discovery to attempted exploitation. Breaking the problem into LLM-assisted agents lets you rack up quick wins at each stage (find subdomain -> find its vuln -> attempt exploit), reinforcing progress continuously.
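The hand-off between stages can be modeled as a shared context that each stage reads from and adds to. The sketch below is a minimal sequential pipeline under that assumption; it is not PentestAgent's design. The stage functions and the tiny `LOCAL_ADVISORIES` table are hypothetical stubs standing in for a real enumerator and a real CVE lookup.

```python
from typing import Callable

Stage = Callable[[dict], object]

# Hypothetical local lookup table standing in for a CVE database query.
LOCAL_ADVISORIES = {"http": ["outdated-web-app"]}


def enumerate_services(ctx: dict) -> list[str]:
    """Stub: a real stage would scan ctx['target'] for services."""
    return ["ssh", "http"]


def lookup_vulns(ctx: dict) -> dict[str, list[str]]:
    """Consume the previous stage's output and attach advisories per service."""
    return {svc: LOCAL_ADVISORIES.get(svc, []) for svc in ctx["services"]}


def run_pipeline(target: str, stages: list[tuple[str, Stage]]) -> dict:
    """Run stages in order; each result is stored under its stage name."""
    context: dict = {"target": target}
    for name, stage in stages:
        context[name] = stage(context)
    return context


if __name__ == "__main__":
    result = run_pipeline("web01.home.example", [
        ("services", enumerate_services),
        ("vulns", lookup_vulns),
    ])
    print(result)
```

Because every stage only depends on keys in the context, any one of them can later be swapped for an LLM-backed version without touching the others.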
LLM for Config Misconfiguration Fixes (Minna et al., Emp. Softw. Eng. 2025): This study looked at Kubernetes Helm charts (app deployment configs) and used LLMs to automatically detect and fix security misconfigurations. The authors built a pipeline: run static scanners to flag issues, have an LLM suggest mitigations and “refactor” the config, then re-scan to check whether the issue is resolved (link.springer.com). The LLM could often produce correct fixes, though the authors note it sometimes introduced unrelated changes that broke the app, which highlights the need for validation (link.springer.com). Quick win: Apply this to your homelab configs, even if you’re not using Helm/K8s. For example, take an Nginx config or a Docker Compose file and prompt an LLM with something like “Find potential security issues in this config and propose improvements.” The LLM might point out a dangerous default or suggest tighter settings, and you can test those suggestions quickly. This gives a fresh perspective on configuration errors that automated scanners might overlook. It’s a fast feedback loop: you get a fix suggestion, apply it, and immediately see a more secure setup. (Just remember to review the changes. As the paper found, LLMs can occasionally over-correct, so use domain knowledge to verify the fixes.)
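The scan, fix, re-scan pipeline can be sketched as a loop that accepts a proposed fix only when the scanner reports strictly fewer issues afterwards. In the sketch below, `scan` and `propose_fix` are hypothetical callables you provide: the first wraps whatever static scanner you use, the second wraps the LLM call. This is an illustration of the validation idea, not the authors' implementation, and it does not check whether the app still works after the change.

```python
from typing import Callable


def fix_with_validation(config: str,
                        scan: Callable[[str], list[str]],
                        propose_fix: Callable[[str, list[str]], str],
                        max_rounds: int = 3) -> tuple[str, list[str]]:
    """Apply proposed fixes only while they reduce the scanner's findings."""
    issues = scan(config)
    for _ in range(max_rounds):
        if not issues:
            break
        candidate = propose_fix(config, issues)
        remaining = scan(candidate)
        if len(remaining) < len(issues):
            # Re-scan confirms progress, so keep the candidate.
            config, issues = candidate, remaining
        else:
            # No improvement: discard the candidate and stop.
            break
    return config, issues


if __name__ == "__main__":
    # Toy scanner and toy "LLM" so the loop runs without external tools.
    def toy_scan(cfg: str) -> list[str]:
        return [line for line in cfg.splitlines() if "privileged: true" in line]

    def toy_fix(cfg: str, _issues: list[str]) -> str:
        return cfg.replace("privileged: true", "privileged: false")

    fixed, left = fix_with_validation("image: nginx\nprivileged: true\n",
                                      toy_scan, toy_fix)
    print(fixed, left)
```

Diffing `config` against the accepted candidate before deploying is a cheap way to spot the unrelated changes the paper warns about.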
Finally, a truly attacker-like scanner should not only find potential vulnerabilities but also attempt to validate and exploit them safely, which reduces false positives and proves real impact. One landmark piece of research embodies this:
Mayhem – Autonomous Vulnerability Discovery & Patching (Avgerinos et al., IEEE S&P 2018): Mayhem was the system that won DARPA’s Cyber Grand Challenge, the first all-machine hacking tournament. It is one of the first autonomous security “bots” that can find and fix vulnerabilities without human help (users.umiacs.umd.edu). Mayhem would analyze binary programs for bugs, craft exploits to prove each bug, and even apply patches on the fly, all at machine speed. While focused on binary software exploitation, its success proved that end-to-end automated vulnerability scanning and exploitation is possible (users.umiacs.umd.edu). Quick win: You don’t need to build Mayhem from scratch, but you can embrace its philosophy in your homelab scanner. For any vulnerability your tool suspects (say, an outdated web app), try to automate a safe exploit attempt. This could be as simple as running an existing exploit script or Metasploit module in a sandbox after detection. For example, if your scanner finds an open SMB share, automatically attempt to connect and read a test file to confirm the misconfiguration. If it finds a SQL injection issue, automatically send a benign payload to see whether the database responds differently. By incorporating an “exploit verification” step, you’ll immediately know which findings are truly critical. Each time the scanner successfully simulates an attack without harm (e.g. dumping a test file or getting a shell in a controlled environment), that’s a huge motivational win and concrete evidence of what to fix. Once you patch and the same check fails, you have proof your homelab is hardened against that attack.
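The “exploit verification” step can be sketched as a registry that maps each finding type to a harmless check, then sorts findings into confirmed and unverified. The finding fields and the registry keys below are assumptions for illustration; the actual checks (reading a test file from a share, comparing responses to a benign payload) are left as callables you write for your own environment. This is a sketch of the idea, not Mayhem's approach, which works on binaries.

```python
from typing import Callable

Finding = dict
Verifier = Callable[[Finding], bool]


def verify_findings(findings: list[Finding],
                    verifiers: dict[str, Verifier]
                    ) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (confirmed, unverified) using per-type checks."""
    confirmed: list[Finding] = []
    unverified: list[Finding] = []
    for finding in findings:
        check = verifiers.get(finding["type"])
        try:
            ok = bool(check(finding)) if check else False
        except Exception:
            # A crashing or unreachable check counts as "not confirmed".
            ok = False
        (confirmed if ok else unverified).append(finding)
    return confirmed, unverified


if __name__ == "__main__":
    # Stub check: a real one would try to read a known test file from the share.
    registry = {"open_share": lambda f: f.get("test_file_readable", False)}
    found = [
        {"type": "open_share", "host": "nas01", "test_file_readable": True},
        {"type": "sqli", "host": "web01"},  # no verifier registered yet
    ]
    print(verify_findings(found, registry))
```

Re-running the same registry after patching doubles as a regression test: a finding that moves from confirmed to unverified is evidence the fix worked.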
In summary, these research works suggest a path forward where your homelab scanner is agentic (autonomously gathering info and acting on it), uses AI/LLM smarts to think like an attacker, and continuously validates its findings through attack simulation. To keep yourself engaged, focus on the creative mini-projects inspired by the papers: generate likely new targets with AI (asset discovery), map out clever multi-hop exploits (attack graphs), have an AI buddy for pentesting interactions, and auto-exploit/auto-fix where possible. Each of these can give you quick, tangible results (one more exposed service found, one misconfig fixed, one attack path closed off), creating a positive reinforcement loop. By iterating with these “quick win” techniques, you’ll gradually build a powerful scanner that significantly improves your homelab’s security posture, while keeping the process fun and intellectually stimulating.
Sources:
Durumeric et al. 2013 – “ZMap: Fast Internet-Wide Scanning and Its Security Applications” (usenix.org)
Degani et al. 2022 – “Generative Adversarial Networks for Subdomain Enumeration” (cybersecurityunitn.github.io)
Prapty et al. 2024 – “Using Retriever Augmented Large Language Models for Attack Graph Generation” (arxiv.org)
Jin et al. 2023 – “Prometheus: Infrastructure Security Posture Analysis with AI-generated Attack Graphs” (arxiv.org)
Deng et al. 2024 – “PentestGPT: Harnessing LLMs for Automated Penetration Testing” (usenix.org)
Shen et al. 2024 – “PentestAgent: Incorporating LLM Agents to Automated Penetration Testing” (arxiv.org)
Minna et al. 2025 – “Analyzing and Mitigating (with LLMs) Security Misconfigurations of Helm Charts” (link.springer.com)
Avgerinos et al. 2018 – “The Mayhem Cyber Reasoning System” (users.umiacs.umd.edu)
Schwartz and Kurniawati 2019 – “Autonomous Penetration Testing using Reinforcement Learning” (arXiv:1905.05965)