<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: anthropic</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/anthropic.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-04-16T20:37:12+00:00</updated><author><name>Simon Willison</name></author><entry><title>llm-anthropic 0.25</title><link href="https://simonwillison.net/2026/Apr/16/llm-anthropic/#atom-tag" rel="alternate"/><published>2026-04-16T20:37:12+00:00</published><updated>2026-04-16T20:37:12+00:00</updated><id>https://simonwillison.net/2026/Apr/16/llm-anthropic/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/llm-anthropic/releases/tag/0.25"&gt;llm-anthropic 0.25&lt;/a&gt;&lt;/p&gt;
    &lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New model: &lt;code&gt;claude-opus-4.7&lt;/code&gt;, which supports &lt;code&gt;thinking_effort&lt;/code&gt;: &lt;code&gt;xhigh&lt;/code&gt;. #66&lt;/li&gt;
&lt;li&gt;New &lt;code&gt;thinking_display&lt;/code&gt; and &lt;code&gt;thinking_adaptive&lt;/code&gt; boolean options. &lt;code&gt;thinking_display&lt;/code&gt; summarized output is currently only available in JSON output or JSON logs.&lt;/li&gt;
&lt;li&gt;Increased default &lt;code&gt;max_tokens&lt;/code&gt; to the maximum allowed for each model.&lt;/li&gt;
&lt;li&gt;No longer uses obsolete &lt;code&gt;structured-outputs-2025-11-13&lt;/code&gt; beta header for older models.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
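&lt;p&gt;A hypothetical way to exercise the new model and option from the terminal (model ID and option name are taken from the release notes above; &lt;code&gt;llm install -U&lt;/code&gt; and &lt;code&gt;-o&lt;/code&gt; are standard LLM CLI usage, and you'd need an &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt; set for it to actually run):&lt;/p&gt;

```shell
# Upgrade the plugin, then try the new model with the new
# xhigh thinking effort - names taken from the release notes,
# so treat this as a sketch rather than a tested recipe.
llm install -U llm-anthropic
llm -m claude-opus-4.7 -o thinking_effort xhigh 'Say hello'
```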
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="llm"/><category term="anthropic"/><category term="claude"/></entry><entry><title>Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7</title><link href="https://simonwillison.net/2026/Apr/16/qwen-beats-opus/#atom-tag" rel="alternate"/><published>2026-04-16T17:16:52+00:00</published><updated>2026-04-16T17:16:52+00:00</updated><id>https://simonwillison.net/2026/Apr/16/qwen-beats-opus/#atom-tag</id><summary type="html">
    &lt;p&gt;For anyone who has been (inadvisably) taking my &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/"&gt;pelican riding a bicycle benchmark&lt;/a&gt; seriously as a robust way to test models, here are pelicans from this morning's two big model releases - &lt;a href="https://qwen.ai/blog?id=qwen3.6-35b-a3b"&gt;Qwen3.6-35B-A3B from Alibaba&lt;/a&gt; and &lt;a href="https://www.anthropic.com/news/claude-opus-4-7"&gt;Claude Opus 4.7 from Anthropic&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here's the Qwen 3.6 pelican, generated using &lt;a href="https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF/blob/main/Qwen3.6-35B-A3B-UD-Q4_K_S.gguf"&gt;this 20.9GB Qwen3.6-35B-A3B-UD-Q4_K_S.gguf&lt;/a&gt; quantized model by Unsloth, running on my MacBook Pro M5 via &lt;a href="https://lmstudio.ai/"&gt;LM Studio&lt;/a&gt; (and the &lt;a href="https://github.com/agustif/llm-lmstudio"&gt;llm-lmstudio&lt;/a&gt; plugin) - &lt;a href="https://gist.github.com/simonw/4389d355d8e162bc6e4547da214f7dd2"&gt;transcript here&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/Qwen3.6-35B-A3B-UD-Q4_K_S-pelican.png" alt="The bicycle frame is the correct shape. There are clouds in the sky. The pelican has a dorky looking pouch. A caption on the ground reads Pelican on a Bicycle!" style="max-width: 100%;" /&gt;&lt;/p&gt;
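&lt;p&gt;A sketch of what that local setup looks like from the terminal, assuming the model is already loaded in LM Studio - the model alias here is a guess, check &lt;code&gt;llm models&lt;/code&gt; for the real ID on your machine:&lt;/p&gt;

```shell
# Install the plugin that exposes LM Studio models to the llm CLI,
# then run the benchmark prompt (model alias is an assumption).
llm install llm-lmstudio
llm -m qwen3.6-35b-a3b 'Generate an SVG of a pelican riding a bicycle'
```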
&lt;p&gt;And here's one I got from Anthropic's &lt;a href="https://www.anthropic.com/news/claude-opus-4-7"&gt;brand new Claude Opus 4.7&lt;/a&gt; (&lt;a href="https://gist.github.com/simonw/afcb19addf3f38eb1996e1ebe749c118"&gt;transcript&lt;/a&gt;):&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/opus-4.7-pelican.png" alt="The bicycle frame is entirely the wrong shape. No clouds, a yellow sun. The pelican is looking behind itself, and has a less pronounced pouch than I would like." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I'm giving this one to Qwen 3.6. Opus managed to mess up the bicycle frame!&lt;/p&gt;
&lt;p&gt;I tried Opus a second time passing &lt;code&gt;thinking_level: max&lt;/code&gt;. It didn't do much better (&lt;a href="https://gist.github.com/simonw/7566e04a81accfb9affda83451c0f363"&gt;transcript&lt;/a&gt;):&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/opus-4.7-pelican-max.png" alt="The bicycle frame is entirely the wrong shape but in a different way. Lines are more bold. Pelican looks a bit more like a pelican." style="max-width: 100%;" /&gt;&lt;/p&gt;

&lt;h4 id="i-dont-think-qwen-are-cheating"&gt;I don't think Qwen are cheating&lt;/h4&gt;
&lt;p&gt;A lot of people are &lt;a href="https://simonwillison.net/2025/Nov/13/training-for-pelicans-riding-bicycles/"&gt;convinced that the labs train for my stupid benchmark&lt;/a&gt;. I don't think they do, but honestly this result did give me a little glint of suspicion. So I'm burning one of my secret backup tests - here's what I got from Qwen3.6-35B-A3B and Opus 4.7 for "Generate an SVG of a flamingo riding a unicycle":&lt;/p&gt;

&lt;div style="display: flex; gap: 4px;"&gt;
  &lt;figure style="flex: 1; text-align: center; margin: 0;"&gt;
    &lt;figcaption style="margin-bottom: 1em"&gt;Qwen3.6-35B-A3B&lt;br /&gt;(&lt;a href="https://gist.github.com/simonw/f1d1ff01c34dda5fdedf684cfc430d92"&gt;transcript&lt;/a&gt;)&lt;/figcaption&gt;
    &lt;img src="https://static.simonwillison.net/static/2026/qwen-flamingo.png" alt="The unicycle spokes are too long. The flamingo has sunglasses, a bowtie and appears to be smoking a cigarette. It has two heart emoji surrounding the caption Flamingo on a Unicycle. It has a lot of charisma." style="max-width: 100%; height: auto;" /&gt;
  &lt;/figure&gt;
  &lt;figure style="flex: 1; text-align: center; margin: 0;"&gt;
    &lt;figcaption style="margin-bottom: 1em"&gt;Opus 4.7&lt;br /&gt;(&lt;a href="https://gist.github.com/simonw/35121ad5dcf23bf860397a103ae88d50"&gt;transcript&lt;/a&gt;)&lt;/figcaption&gt;
    &lt;img src="https://static.simonwillison.net/static/2026/opus-flamingo.png" alt="The unicycle has a black wheel. The flamingo is a competent if slightly dull vector illustration of a flamingo. It has no flair." style="max-width: 100%; height: auto;" /&gt;
  &lt;/figure&gt;
&lt;/div&gt;


&lt;p&gt;I'm giving this one to Qwen too, partly for the excellent &lt;code&gt;&amp;lt;!-- Sunglasses on flamingo! --&amp;gt;&lt;/code&gt; SVG comment.&lt;/p&gt;

&lt;h4 id="what-can-we-learn-from-this-"&gt;What can we learn from this?&lt;/h4&gt;
&lt;p&gt;The pelican benchmark has always been meant as a joke - it's mainly a statement on how opaque and absurd the task of comparing these models is.&lt;/p&gt;
&lt;p&gt;The weird thing about that joke is that, for the most part, there has been a direct correlation between the quality of the pelicans produced and the general usefulness of the models. Those &lt;a href="https://simonwillison.net/2024/Oct/25/pelicans-on-a-bicycle/"&gt;first pelicans from October 2024&lt;/a&gt; were junk. The &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/"&gt;more recent entries&lt;/a&gt; have generally been much, much better - to the point that Gemini 3.1 Pro produces &lt;a href="https://simonwillison.net/2026/Feb/19/gemini-31-pro/"&gt;illustrations you could actually use somewhere&lt;/a&gt;, provided you had a pressing need to illustrate a pelican riding a bicycle.&lt;/p&gt;
&lt;p&gt;Today, even that loose connection to utility has been broken. I have enormous respect for Qwen, but I very much doubt that a 21GB quantized version of their latest model is more powerful or useful than Anthropic's latest proprietary release.&lt;/p&gt;
&lt;p&gt;If the thing you need is an SVG illustration of a pelican riding a bicycle though, right now Qwen3.6-35B-A3B running on a laptop is a better bet than Opus 4.7!&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/qwen"&gt;qwen&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lm-studio"&gt;lm-studio&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="qwen"/><category term="pelican-riding-a-bicycle"/><category term="llm-release"/><category term="lm-studio"/></entry><entry><title>Trusted access for the next era of cyber defense</title><link href="https://simonwillison.net/2026/Apr/14/trusted-access-openai/#atom-tag" rel="alternate"/><published>2026-04-14T21:23:59+00:00</published><updated>2026-04-14T21:23:59+00:00</updated><id>https://simonwillison.net/2026/Apr/14/trusted-access-openai/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://openai.com/index/scaling-trusted-access-for-cyber-defense/"&gt;Trusted access for the next era of cyber defense&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;OpenAI's answer to &lt;a href="https://simonwillison.net/2026/Apr/7/project-glasswing/"&gt;Claude Mythos&lt;/a&gt; appears to be a new model called GPT-5.4-Cyber:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In preparation for increasingly more capable models from OpenAI over the next few months, we are fine-tuning our models specifically to enable defensive cybersecurity use cases, starting today with a variant of GPT‑5.4 trained to be cyber-permissive: GPT‑5.4‑Cyber.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They're also extending a program they launched in February (which I had missed) called &lt;a href="https://openai.com/index/trusted-access-for-cyber/"&gt;Trusted Access for Cyber&lt;/a&gt;, where users can verify their identity (via a photo of a government-issued ID processed by &lt;a href="https://withpersona.com/"&gt;Persona&lt;/a&gt;) to gain "reduced friction" access to OpenAI's models for cybersecurity work.&lt;/p&gt;
&lt;p&gt;Honestly, this OpenAI announcement is difficult to follow. Unsurprisingly they don't mention Anthropic at all, but much of the piece emphasizes their many years of existing cybersecurity work and their goal to "democratize access" to these tools, hence the emphasis on that self-service verification flow from February.&lt;/p&gt;
&lt;p&gt;If you want access to their best security tools you still need to go through an extra Google Form application process though, which doesn't feel particularly different to me from Anthropic's &lt;a href="https://www.anthropic.com/glasswing"&gt;Project Glasswing&lt;/a&gt;.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=47770770"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;&lt;/p&gt;



</summary><category term="security"/><category term="generative-ai"/><category term="ai-security-research"/><category term="openai"/><category term="ai"/><category term="llms"/><category term="anthropic"/></entry><entry><title>Anthropic's Project Glasswing - restricting Claude Mythos to security researchers - sounds necessary to me</title><link href="https://simonwillison.net/2026/Apr/7/project-glasswing/#atom-tag" rel="alternate"/><published>2026-04-07T20:52:54+00:00</published><updated>2026-04-07T20:52:54+00:00</updated><id>https://simonwillison.net/2026/Apr/7/project-glasswing/#atom-tag</id><summary type="html">
    &lt;p&gt;Anthropic &lt;em&gt;didn't&lt;/em&gt; release their latest model, Claude Mythos (&lt;a href="https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf"&gt;system card PDF&lt;/a&gt;), today. They have instead made it available to a very restricted set of preview partners under their newly announced &lt;a href="https://www.anthropic.com/glasswing"&gt;Project Glasswing&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The model is a general-purpose model, similar to Claude Opus 4.6, but Anthropic claim that its cybersecurity research abilities are strong enough that they need to give the software industry as a whole time to prepare.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Mythos Preview has already found thousands of high-severity vulnerabilities, including some in &lt;em&gt;every major operating system and web browser&lt;/em&gt;. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely.&lt;/p&gt;
&lt;p&gt;[...]&lt;/p&gt;
&lt;p&gt;Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems—systems that represent a very large portion of the world’s shared cyberattack surface. We anticipate this work will focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing of systems.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There's a great deal more technical detail in &lt;a href="https://red.anthropic.com/2026/mythos-preview/"&gt;Assessing Claude Mythos Preview’s cybersecurity capabilities&lt;/a&gt; on the Anthropic Red Team blog:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;In one case, Mythos Preview wrote a web browser exploit that chained together four vulnerabilities, writing a complex &lt;a href="https://en.wikipedia.org/wiki/JIT_spraying"&gt;JIT heap spray&lt;/a&gt; that escaped both renderer and OS sandboxes. It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR bypasses. And it autonomously wrote a remote code execution exploit on FreeBSD's NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Plus this comparison with Claude 4.6 Opus:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Our internal evaluations showed that Opus 4.6 generally had a near-0% success rate at autonomous exploit development. But Mythos Preview is in a different league. For example, Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine—all patched in Firefox 148—into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Saying "our model is too dangerous to release" is a great way to build buzz around a new model, but in this case I expect their caution is warranted.&lt;/p&gt;
&lt;p&gt;Just a few days ago (&lt;a href="https://simonwillison.net/2026/Apr/3/"&gt;last Friday&lt;/a&gt;) I started a new &lt;a href="https://simonwillison.net/tags/ai-security-research/"&gt;ai-security-research&lt;/a&gt; tag on this blog to acknowledge an uptick in credible security professionals sounding the alarm on how good modern LLMs have got at vulnerability research.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_kernel/"&gt;Greg Kroah-Hartman&lt;/a&gt; of the Linux kernel:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us.&lt;/p&gt;
&lt;p&gt;Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://mastodon.social/@bagder/116336957584445742"&gt;Daniel Stenberg&lt;/a&gt; of &lt;code&gt;curl&lt;/code&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good.&lt;/p&gt;
&lt;p&gt;I'm spending hours per day on this now. It's intense.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And Thomas Ptacek published &lt;a href="https://sockpuppet.org/blog/2026/03/30/vulnerability-research-is-cooked/"&gt;Vulnerability Research Is Cooked&lt;/a&gt;, a post inspired by his &lt;a href="https://securitycryptographywhatever.com/2026/03/25/ai-bug-finding/"&gt;podcast conversation&lt;/a&gt; with Anthropic's Nicholas Carlini.&lt;/p&gt;
&lt;p&gt;Anthropic have a 5-minute &lt;a href="https://www.youtube.com/watch?v=INGOC6-LLv0"&gt;talking heads video&lt;/a&gt; describing the Glasswing project. Nicholas Carlini appears as one of those talking heads, in which he says (highlights mine):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It has the ability to chain together vulnerabilities. So what this means is you find two vulnerabilities, either of which doesn't really get you very much independently. But this model is able to create exploits out of three, four, or sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome. [...]&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;I've found more bugs in the last couple of weeks than I found in the rest of my life combined&lt;/strong&gt;. We've used the model to scan a bunch of open source code, and the thing that we went for first was operating systems, because this is the code that underlies the entire internet infrastructure. &lt;strong&gt;For OpenBSD, we found a bug that's been present for 27 years, where I can send a couple of pieces of data to any OpenBSD server and crash it&lt;/strong&gt;. On Linux, we found a number of vulnerabilities where as a user with no permissions, I can elevate myself to the administrator by just running some binary on my machine. For each of these bugs, we told the maintainers who actually run the software about them, and they went and fixed them and have deployed the patches so that anyone who runs the software is no longer vulnerable to these attacks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I found this on the &lt;a href="https://www.openbsd.org/errata78.html"&gt;OpenBSD 7.8 errata page&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;025: RELIABILITY FIX: March 25, 2026&lt;/strong&gt;  &lt;em&gt;All architectures&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;TCP packets with invalid SACK options could crash the kernel.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://ftp.openbsd.org/pub/OpenBSD/patches/7.8/common/025_sack.patch.sig"&gt;A source code patch exists which remedies this problem.&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I tracked that change down in the &lt;a href="https://github.com/openbsd/src"&gt;GitHub mirror&lt;/a&gt; of the OpenBSD CVS repo (apparently they still use CVS!) and found it &lt;a href="https://github.com/openbsd/src/blame/master/sys/netinet/tcp_input.c#L2461"&gt;using git blame&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/openbsd-27-years.jpg" alt="Screenshot of a Git blame view of C source code around line 2455 showing TCP SACK hole validation logic. Code includes checks using SEQ_GT, SEQ_LT macros on fields like th-&amp;gt;th_ack, tp-&amp;gt;snd_una, sack.start, sack.end, tp-&amp;gt;snd_max, and tp-&amp;gt;snd_holes. Most commits are from 25–27 years ago with messages like &amp;quot;more SACK hole validity testin...&amp;quot; and &amp;quot;knf&amp;quot;, while one recent commit from 3 weeks ago (&amp;quot;Ignore TCP SACK packets wit...&amp;quot;) is highlighted with an orange left border, adding a new guard &amp;quot;if (SEQ_LT(sack.start, tp-&amp;gt;snd_una)) continue;&amp;quot;" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Sure enough, the surrounding code is from 27 years ago.&lt;/p&gt;
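&lt;p&gt;If you want to repeat that archaeology yourself, a sketch of the commands - the file path and approximate line range are taken from the screenshot above:&lt;/p&gt;

```shell
# Clone the GitHub mirror of the OpenBSD CVS tree, then blame
# the SACK validation code in the TCP input path.
git clone https://github.com/openbsd/src
cd src
git blame -L 2450,2470 -- sys/netinet/tcp_input.c
```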
&lt;p&gt;I'm not sure which Linux vulnerability Nicholas was describing, but it may have been &lt;a href="https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5133b61aaf437e5f25b1b396b14242a6bb0508e2"&gt;this NFS one&lt;/a&gt; recently covered &lt;a href="https://mtlynch.io/claude-code-found-linux-vulnerability/"&gt;by Michael Lynch&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There's enough smoke here that I believe there's a fire. It's not surprising to find vulnerabilities in decades-old software, especially given that they're mostly written in C, but what's new is that coding agents run by the latest frontier LLMs are proving tirelessly capable at digging up these issues.&lt;/p&gt;
&lt;p&gt;I actually thought to myself on Friday that this sounded like an industry-wide reckoning in the making, and that it might warrant a huge investment of time and money to get ahead of the inevitable barrage of vulnerabilities. Project Glasswing incorporates "$100M in usage credits ... as well as $4M in direct donations to open-source security organizations". Partners include AWS, Apple, Microsoft, Google, and the Linux Foundation. It would be great to see OpenAI involved as well - GPT-5.4 already has a strong reputation for finding security vulnerabilities and they have stronger models on the near horizon.&lt;/p&gt;
&lt;p&gt;The bad news for those of us who are &lt;em&gt;not&lt;/em&gt; trusted partners is this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I can live with that. I think the security risks really are credible here, and having extra time for trusted teams to get ahead of them is a reasonable trade-off.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/thomas-ptacek"&gt;thomas-ptacek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nicholas-carlini"&gt;nicholas-carlini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="security"/><category term="thomas-ptacek"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="nicholas-carlini"/><category term="ai-ethics"/><category term="llm-release"/><category term="ai-security-research"/></entry><entry><title>Quoting A member of Anthropic’s alignment-science team</title><link href="https://simonwillison.net/2026/Mar/16/blackmail/#atom-tag" rel="alternate"/><published>2026-03-16T21:38:55+00:00</published><updated>2026-03-16T21:38:55+00:00</updated><id>https://simonwillison.net/2026/Mar/16/blackmail/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.newyorker.com/news/annals-of-inquiry/the-pentagon-went-to-war-with-anthropic-whats-really-at-stake?_sp=9a6e0ff7-2bfd-46f8-a9e1-3941ef2003b5.1773495048769"&gt;&lt;p&gt;The point of &lt;a href="https://simonwillison.net/2025/Jun/20/agentic-misalignment/"&gt;the blackmail exercise&lt;/a&gt; was to have something to describe to policymakers—results that are visceral enough to land with people, and make misalignment risk actually salient in practice for people who had never thought about it before.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.newyorker.com/news/annals-of-inquiry/the-pentagon-went-to-war-with-anthropic-whats-really-at-stake?_sp=9a6e0ff7-2bfd-46f8-a9e1-3941ef2003b5.1773495048769"&gt;A member of Anthropic’s alignment-science team&lt;/a&gt;, as told to Gideon Lewis-Kraus&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-ethics"/><category term="anthropic"/><category term="claude"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>1M context is now generally available for Opus 4.6 and Sonnet 4.6</title><link href="https://simonwillison.net/2026/Mar/13/1m-context/#atom-tag" rel="alternate"/><published>2026-03-13T18:29:13+00:00</published><updated>2026-03-13T18:29:13+00:00</updated><id>https://simonwillison.net/2026/Mar/13/1m-context/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://claude.com/blog/1m-context-ga"&gt;1M context is now generally available for Opus 4.6 and Sonnet 4.6&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here's what surprised me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Standard pricing now applies across the full 1M window for both models, with no long-context premium.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;OpenAI and Gemini both &lt;a href="https://www.llm-prices.com/#sel=gemini-3-1-pro-preview-200k%2Cgpt-5.4-272k%2Cgemini-3-1-pro-preview%2Cgpt-5.4"&gt;charge more&lt;/a&gt; for prompts where the token count goes above a certain point - 200,000 for Gemini 3.1 Pro and 272,000 for GPT-5.4.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/long-context"&gt;long-context&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-pricing"&gt;llm-pricing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="claude"/><category term="generative-ai"/><category term="long-context"/><category term="llm-pricing"/><category term="ai"/><category term="llms"/></entry><entry><title>Anthropic and the Pentagon</title><link href="https://simonwillison.net/2026/Mar/6/anthropic-and-the-pentagon/#atom-tag" rel="alternate"/><published>2026-03-06T17:26:50+00:00</published><updated>2026-03-06T17:26:50+00:00</updated><id>https://simonwillison.net/2026/Mar/6/anthropic-and-the-pentagon/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.schneier.com/blog/archives/2026/03/anthropic-and-the-pentagon.html"&gt;Anthropic and the Pentagon&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This piece by Bruce Schneier and Nathan E. Sanders is the most thoughtful and grounded coverage I've seen of the recent and ongoing Pentagon/OpenAI/Anthropic contract situation.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;AI models are increasingly commodified. The top-tier offerings have about the same performance, and there is little to differentiate one from the other. The latest models from Anthropic, OpenAI and Google, in particular, tend to leapfrog each other with minor hops forward in quality every few months. [...]&lt;/p&gt;
&lt;p&gt;In this sort of market, branding matters a lot. Anthropic and its CEO, Dario Amodei, are positioning themselves as the moral and trustworthy AI provider. That has market value for both consumers and enterprise clients.&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bruce-schneier"&gt;bruce-schneier&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-ethics"/><category term="bruce-schneier"/><category term="anthropic"/><category term="generative-ai"/><category term="openai"/><category term="ai"/><category term="llms"/></entry><entry><title>Quoting Donald Knuth</title><link href="https://simonwillison.net/2026/Mar/3/donald-knuth/#atom-tag" rel="alternate"/><published>2026-03-03T23:59:04+00:00</published><updated>2026-03-03T23:59:04+00:00</updated><id>https://simonwillison.net/2026/Mar/3/donald-knuth/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www-cs-faculty.stanford.edu/~knuth/papers/claude-cycles.pdf"&gt;&lt;p&gt;Shock! Shock! I learned yesterday that an open problem I'd been working on for several weeks had just been solved by Claude Opus 4.6 - Anthropic's hybrid reasoning model that had been released three weeks earlier! It seems that I'll have to revise my opinions about "generative AI" one of these days. What a joy it is to learn not only that my conjecture has a nice solution but also to celebrate this dramatic advance in automatic deduction and creative problem solving.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www-cs-faculty.stanford.edu/~knuth/papers/claude-cycles.pdf"&gt;Donald Knuth&lt;/a&gt;, Claude's Cycles&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/november-2025-inflection"&gt;november-2025-inflection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/donald-knuth"&gt;donald-knuth&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;&lt;/p&gt;



</summary><category term="november-2025-inflection"/><category term="claude"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="donald-knuth"/><category term="llm-reasoning"/><category term="anthropic"/></entry><entry><title>Quoting claude.com/import-memory</title><link href="https://simonwillison.net/2026/Mar/1/claude-import-memory/#atom-tag" rel="alternate"/><published>2026-03-01T11:21:45+00:00</published><updated>2026-03-01T11:21:45+00:00</updated><id>https://simonwillison.net/2026/Mar/1/claude-import-memory/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://claude.com/import-memory"&gt;&lt;p&gt;&lt;code&gt;I'm moving to another service and need to export my data. List every memory you have stored about me, as well as any context you've learned about me from past conversations. Output everything in a single code block so I can easily copy it. Format each entry as: [date saved, if available] - memory content. Make sure to cover all of the following — preserve my words verbatim where possible: Instructions I've given you about how to respond (tone, format, style, 'always do X', 'never do Y'). Personal details: name, location, job, family, interests. Projects, goals, and recurring topics. Tools, languages, and frameworks I use. Preferences and corrections I've made to your behavior. Any other stored context not covered above. Do not summarize, group, or omit any entries. After the code block, confirm whether that is the complete set or if any remain.&lt;/code&gt;&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://claude.com/import-memory"&gt;claude.com/import-memory&lt;/a&gt;, Anthropic's "import your memories to Claude" feature is a prompt&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-memory"&gt;llm-memory&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="prompt-engineering"/><category term="llm-memory"/><category term="anthropic"/><category term="claude"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>Free Claude Max for (large project) open source maintainers</title><link href="https://simonwillison.net/2026/Feb/27/claude-max-oss-six-months/#atom-tag" rel="alternate"/><published>2026-02-27T18:08:22+00:00</published><updated>2026-02-27T18:08:22+00:00</updated><id>https://simonwillison.net/2026/Feb/27/claude-max-oss-six-months/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://claude.com/contact-sales/claude-for-oss"&gt;Free Claude Max for (large project) open source maintainers&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Anthropic are now offering their $200/month Claude Max 20x plan to open source maintainers, free for six months, provided you meet the following criteria:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Maintainers:&lt;/strong&gt; You're a primary maintainer or core team member of a public repo with 5,000+ GitHub stars &lt;em&gt;or&lt;/em&gt; 1M+ monthly NPM downloads. You've made commits, releases, or PR reviews within the last 3 months.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Don't quite fit the criteria?&lt;/strong&gt; If you maintain something the ecosystem quietly depends on, apply anyway and tell us about it.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Also in the small print: "Applications are reviewed on a rolling basis. We accept up to 10,000 contributors."&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=47178371"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="open-source"/><category term="anthropic"/><category term="claude"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>Claude Code Remote Control</title><link href="https://simonwillison.net/2026/Feb/25/claude-code-remote-control/#atom-tag" rel="alternate"/><published>2026-02-25T17:33:24+00:00</published><updated>2026-02-25T17:33:24+00:00</updated><id>https://simonwillison.net/2026/Feb/25/claude-code-remote-control/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://code.claude.com/docs/en/remote-control"&gt;Claude Code Remote Control&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New Claude Code feature dropped yesterday: you can now run a "remote control" session on your computer and then use the Claude Code for web interfaces (on the web, iOS, and the native desktop app) to send prompts to that session.&lt;/p&gt;
&lt;p&gt;It's a little bit janky right now. Initially when I tried it I got the error "Remote Control is not enabled for your account. Contact your administrator." (but I &lt;em&gt;am&lt;/em&gt; my administrator?) - then I logged out and back into the Claude Code terminal app and it started working:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;claude remote-control
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can only run one session on your machine at a time. If you upgrade the Claude iOS app it then shows up as "Remote Control Session (Mac)" in the Code tab.&lt;/p&gt;
&lt;p&gt;It appears not to support the &lt;code&gt;--dangerously-skip-permissions&lt;/code&gt; flag (I passed that to &lt;code&gt;claude remote-control&lt;/code&gt; and it didn't reject the option, but it also appeared to have no effect) - which means you have to approve every new action it takes.&lt;/p&gt;
&lt;p&gt;I also managed to get it to a state where every prompt I tried was met by an API 500 error.&lt;/p&gt;
&lt;p style="text-align: center;"&gt;&lt;img src="https://static.simonwillison.net/static/2026/vampire-remote.jpg" alt="Screenshot of a &amp;quot;Remote Control session&amp;quot; (Mac:dev:817b) chat interface. User message: &amp;quot;Play vampire by Olivia Rodrigo in music app&amp;quot;. Response shows an API Error: 500 {&amp;quot;type&amp;quot;:&amp;quot;error&amp;quot;,&amp;quot;error&amp;quot;:{&amp;quot;type&amp;quot;:&amp;quot;api_error&amp;quot;,&amp;quot;message&amp;quot;:&amp;quot;Internal server error&amp;quot;},&amp;quot;request_id&amp;quot;:&amp;quot;req_011CYVBLH9yt2ze2qehrX8nk&amp;quot;} with a &amp;quot;Try again&amp;quot; button. Below, the assistant responds: &amp;quot;I&amp;#39;ll play &amp;quot;Vampire&amp;quot; by Olivia Rodrigo in the Music app using AppleScript.&amp;quot; A Bash command panel is open showing an osascript command: osascript -e &amp;#39;tell application &amp;quot;Music&amp;quot; activate set searchResults to search playlist &amp;quot;Library&amp;quot; for &amp;quot;vampire Olivia Rodrigo&amp;quot; if (count of searchResults) &amp;gt; 0 then play item 1 of searchResults else return &amp;quot;Song not found in library&amp;quot; end if end tell&amp;#39;" style="max-width: 80%;" /&gt;&lt;/p&gt;

&lt;p&gt;Restarting the program on the machine also causes existing sessions to start returning mysterious API errors rather than neatly explaining that the session has terminated.&lt;/p&gt;
&lt;p&gt;I expect they'll iron out all of these issues relatively quickly. It's interesting to contrast this with solutions like OpenClaw, where one of the big selling points is the ability to control your personal device from your phone.&lt;/p&gt;
&lt;p&gt;Claude Code still doesn't have a documented mechanism for running things on a schedule, which is the other killer feature of the Claw category of software.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: I spoke too soon: also today Anthropic announced &lt;a href="https://support.claude.com/en/articles/13854387-schedule-recurring-tasks-in-cowork"&gt;Schedule recurring tasks in Cowork&lt;/a&gt;, Claude Code's &lt;a href="https://simonwillison.net/2026/Jan/12/claude-cowork/"&gt;general agent sibling&lt;/a&gt;. These do include an important limitation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Scheduled tasks only run while your computer is awake and the Claude Desktop app is open. If your computer is asleep or the app is closed when a task is scheduled to run, Cowork will skip the task, then run it automatically once your computer wakes up or you open the desktop app again.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I really hope they're working on a Cowork Cloud product.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/claudeai/status/2026418433911603668"&gt;@claudeai&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openclaw"&gt;openclaw&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/applescript"&gt;applescript&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="claude"/><category term="ai"/><category term="claude-code"/><category term="llms"/><category term="coding-agents"/><category term="generative-ai"/><category term="openclaw"/><category term="applescript"/></entry><entry><title>The Claude C Compiler: What It Reveals About the Future of Software</title><link href="https://simonwillison.net/2026/Feb/22/ccc/#atom-tag" rel="alternate"/><published>2026-02-22T23:58:43+00:00</published><updated>2026-02-22T23:58:43+00:00</updated><id>https://simonwillison.net/2026/Feb/22/ccc/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.modular.com/blog/the-claude-c-compiler-what-it-reveals-about-the-future-of-software"&gt;The Claude C Compiler: What It Reveals About the Future of Software&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;On February 5th Anthropic's Nicholas Carlini wrote about a project that used &lt;a href="https://www.anthropic.com/engineering/building-c-compiler"&gt;parallel Claudes to build a C compiler&lt;/a&gt; on top of the brand new Opus 4.6.&lt;/p&gt;
&lt;p&gt;Chris Lattner (Swift, LLVM, Clang, Mojo) knows more about C compilers than most. He just published this review of the code.&lt;/p&gt;
&lt;p&gt;Some points that stood out to me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Good software depends on judgment, communication, and clear abstraction. AI has amplified this.&lt;/li&gt;
&lt;li&gt;AI coding is automation of implementation, so design and stewardship become more important.&lt;/li&gt;
&lt;li&gt;Manual rewrites and translation work are becoming AI-native tasks, automating a large category of engineering effort.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Chris is generally impressed with CCC (the Claude C Compiler):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Taken together, CCC looks less like an experimental research compiler and more like a competent textbook implementation, the sort of system a strong undergraduate team might build early in a project before years of refinement. That alone is remarkable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It's a long way from being a production-ready compiler though:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Several design choices suggest optimization toward passing tests rather than building general abstractions like a human would. [...] These flaws are informative rather than surprising, suggesting that current AI systems excel at assembling known techniques and optimizing toward measurable success criteria, while struggling with the open-ended generalization required for production-quality systems.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The project also leads to deep open questions about how agentic engineering interacts with licensing and IP for both open source and proprietary code:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If AI systems trained on decades of publicly available code can reproduce familiar structures, patterns, and even specific implementations, where exactly is the boundary between learning and copying?&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/compilers"&gt;compilers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nicholas-carlini"&gt;nicholas-carlini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;



</summary><category term="compilers"/><category term="anthropic"/><category term="claude"/><category term="nicholas-carlini"/><category term="ai"/><category term="open-source"/><category term="coding-agents"/><category term="ai-assisted-programming"/><category term="c"/><category term="agentic-engineering"/></entry><entry><title>Quoting Thariq Shihipar</title><link href="https://simonwillison.net/2026/Feb/20/thariq-shihipar/#atom-tag" rel="alternate"/><published>2026-02-20T07:13:19+00:00</published><updated>2026-02-20T07:13:19+00:00</updated><id>https://simonwillison.net/2026/Feb/20/thariq-shihipar/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/trq212/status/2024574133011673516"&gt;&lt;p&gt;Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost. [...]&lt;/p&gt;
&lt;p&gt;At Claude Code, we build our entire harness around prompt caching. A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they're too low.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/trq212/status/2024574133011673516"&gt;Thariq Shihipar&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="prompt-engineering"/><category term="anthropic"/><category term="claude-code"/><category term="ai-agents"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>SWE-bench February 2026 leaderboard update</title><link href="https://simonwillison.net/2026/Feb/19/swe-bench/#atom-tag" rel="alternate"/><published>2026-02-19T04:48:47+00:00</published><updated>2026-02-19T04:48:47+00:00</updated><id>https://simonwillison.net/2026/Feb/19/swe-bench/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.swebench.com/"&gt;SWE-bench February 2026 leaderboard update&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;SWE-bench is one of the benchmarks that the labs love to list in their model releases. The official leaderboard is infrequently updated, but they just did a full run against the current generation of models - notable because it's always good to see benchmark results like this that &lt;em&gt;weren't&lt;/em&gt; self-reported by the labs.&lt;/p&gt;
&lt;p&gt;The fresh results are for their "Bash Only" benchmark, which runs their &lt;a href="https://github.com/SWE-agent/mini-swe-agent"&gt;mini-swe-agent&lt;/a&gt; (~9,000 lines of Python, &lt;a href="https://github.com/SWE-agent/mini-swe-agent/blob/v2.2.1/src/minisweagent/config/benchmarks/swebench.yaml"&gt;here are the prompts&lt;/a&gt; they use) against the &lt;a href="https://huggingface.co/datasets/princeton-nlp/SWE-bench"&gt;SWE-bench&lt;/a&gt; dataset of coding problems - 2,294 real-world examples pulled from 12 open source repos: &lt;a href="https://github.com/django/django"&gt;django/django&lt;/a&gt; (850), &lt;a href="https://github.com/sympy/sympy"&gt;sympy/sympy&lt;/a&gt; (386), &lt;a href="https://github.com/scikit-learn/scikit-learn"&gt;scikit-learn/scikit-learn&lt;/a&gt; (229), &lt;a href="https://github.com/sphinx-doc/sphinx"&gt;sphinx-doc/sphinx&lt;/a&gt; (187), &lt;a href="https://github.com/matplotlib/matplotlib"&gt;matplotlib/matplotlib&lt;/a&gt; (184), &lt;a href="https://github.com/pytest-dev/pytest"&gt;pytest-dev/pytest&lt;/a&gt; (119), &lt;a href="https://github.com/pydata/xarray"&gt;pydata/xarray&lt;/a&gt; (110), &lt;a href="https://github.com/astropy/astropy"&gt;astropy/astropy&lt;/a&gt; (95), &lt;a href="https://github.com/pylint-dev/pylint"&gt;pylint-dev/pylint&lt;/a&gt; (57), &lt;a href="https://github.com/psf/requests"&gt;psf/requests&lt;/a&gt; (44), &lt;a href="https://github.com/mwaskom/seaborn"&gt;mwaskom/seaborn&lt;/a&gt; (22), &lt;a href="https://github.com/pallets/flask"&gt;pallets/flask&lt;/a&gt; (11).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Correction&lt;/strong&gt;: &lt;em&gt;The Bash only benchmark runs against SWE-bench Verified, not original SWE-bench. Verified is a manually curated subset of 500 samples &lt;a href="https://openai.com/index/introducing-swe-bench-verified/"&gt;described here&lt;/a&gt;, funded by OpenAI. Here's &lt;a href="https://huggingface.co/datasets/princeton-nlp/SWE-bench_Verified"&gt;SWE-bench Verified&lt;/a&gt; on Hugging Face - since it's just 2.1MB of Parquet it's easy to browse &lt;a href="https://lite.datasette.io/?parquet=https%3A%2F%2Fhuggingface.co%2Fdatasets%2Fprinceton-nlp%2FSWE-bench_Verified%2Fresolve%2Fmain%2Fdata%2Ftest-00000-of-00001.parquet#/data/test-00000-of-00001?_facet=repo"&gt;using Datasette Lite&lt;/a&gt;, which cuts those numbers down to django/django (231), sympy/sympy (75), sphinx-doc/sphinx (44), matplotlib/matplotlib (34), scikit-learn/scikit-learn (32), astropy/astropy (22), pydata/xarray (22), pytest-dev/pytest (19), pylint-dev/pylint (10), psf/requests (8), mwaskom/seaborn (2), pallets/flask (1)&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Here's how the top ten models performed:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Bar chart showing &amp;quot;% Resolved&amp;quot; by &amp;quot;Model&amp;quot;. Bars in descending order: Claude 4.5 Opus (high reasoning) 76.8%, Gemini 3 Flash (high reasoning) 75.8%, MiniMax M2.5 (high reasoning) 75.8%, Claude Opus 4.6 75.6%, GLM-5 (high reasoning) 72.8%, GPT-5.2 (high reasoning) 72.8%, Claude 4.5 Sonnet (high reasoning) 72.8%, Kimi K2.5 (high reasoning) 71.4%, DeepSeek V3.2 (high reasoning) 70.8%, Claude 4.5 Haiku (high reasoning) 70.0%, and a partially visible final bar at 66.6%." src="https://static.simonwillison.net/static/2026/swbench-feb-2026.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;It's interesting to see Claude Opus 4.5 beat Opus 4.6, though only by about a percentage point. 4.5 Opus is top, then Gemini 3 Flash, then MiniMax M2.5 - a 229B model released &lt;a href="https://www.minimax.io/news/minimax-m25"&gt;last week&lt;/a&gt; by Chinese lab MiniMax. GLM-5, Kimi K2.5 and DeepSeek V3.2 are three more Chinese models that make the top ten as well.&lt;/p&gt;
&lt;p&gt;OpenAI's GPT-5.2 is their highest performing model at position 6, but it's worth noting that their best coding model, GPT-5.3-Codex, is not represented - maybe because it's not yet available in the OpenAI API.&lt;/p&gt;
&lt;p&gt;This benchmark uses the same system prompt for every model, which is important for a fair comparison but does mean that the quality of the different harnesses or optimized prompts is not being measured here.&lt;/p&gt;
&lt;p&gt;The chart above is a screenshot from the SWE-bench website, but their charts don't include the actual percentage values visible on the bars. I successfully used Claude for Chrome to add these - &lt;a href="https://claude.ai/share/81a0c519-c727-4caa-b0d4-0d866375d0da"&gt;transcript here&lt;/a&gt;. My prompt sequence included:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Use claude in chrome to open https://www.swebench.com/&lt;/p&gt;
&lt;p&gt;Click on "Compare results" and then select "Select top 10"&lt;/p&gt;
&lt;p&gt;See those bar charts? I want them to display the percentage on each bar so I can take a better screenshot, modify the page like that&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm impressed at how well this worked - Claude injected custom JavaScript into the page to draw additional labels on top of the existing chart.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a Claude AI conversation showing browser automation. A thinking step reads &amp;quot;Pivoted strategy to avoid recursion issues with chart labeling &amp;gt;&amp;quot; followed by the message &amp;quot;Good, the chart is back. Now let me carefully add the labels using an inline plugin on the chart instance to avoid the recursion issue.&amp;quot; A collapsed &amp;quot;Browser_evaluate&amp;quot; section shows a browser_evaluate tool call with JavaScript code using Chart.js canvas context to draw percentage labels on bars: meta.data.forEach((bar, index) =&amp;gt; { const value = dataset.data[index]; if (value !== undefined &amp;amp;&amp;amp; value !== null) { ctx.save(); ctx.textAlign = 'center'; ctx.textBaseline = 'bottom'; ctx.fillStyle = '#333'; ctx.font = 'bold 12px sans-serif'; ctx.fillText(value.toFixed(1) + '%', bar.x, bar.y - 5); A pending step reads &amp;quot;Let me take a screenshot to see if it worked.&amp;quot; followed by a completed &amp;quot;Done&amp;quot; step, and the message &amp;quot;Let me take a screenshot to check the result.&amp;quot;" src="https://static.simonwillison.net/static/2026/claude-chrome-draw-on-chart.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: If you look at the transcript, Claude claims to have switched to Playwright, which is confusing because I didn't think I had that configured.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/KLieret/status/2024176335782826336"&gt;@KLieret&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/browser-agents"&gt;browser-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/benchmarks"&gt;benchmarks&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/minimax"&gt;minimax&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;&lt;/p&gt;



</summary><category term="browser-agents"/><category term="anthropic"/><category term="claude"/><category term="openai"/><category term="benchmarks"/><category term="ai"/><category term="ai-in-china"/><category term="llms"/><category term="minimax"/><category term="coding-agents"/><category term="generative-ai"/><category term="django"/></entry><entry><title>Introducing Claude Sonnet 4.6</title><link href="https://simonwillison.net/2026/Feb/17/claude-sonnet-46/#atom-tag" rel="alternate"/><published>2026-02-17T23:58:58+00:00</published><updated>2026-02-17T23:58:58+00:00</updated><id>https://simonwillison.net/2026/Feb/17/claude-sonnet-46/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/news/claude-sonnet-4-6"&gt;Introducing Claude Sonnet 4.6&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Sonnet 4.6 is out today, and Anthropic claim it offers similar performance to &lt;a href="https://simonwillison.net/2025/Nov/24/claude-opus/"&gt;November's Opus 4.5&lt;/a&gt; while maintaining the Sonnet pricing of $3/million input and $15/million output tokens (the Opus models are $5/$25). Here's &lt;a href="https://www-cdn.anthropic.com/78073f739564e986ff3e28522761a7a0b4484f84.pdf"&gt;the system card PDF&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Sonnet 4.6 has a "reliable knowledge cutoff" of August 2025, compared to Opus 4.6's May 2025 and Haiku 4.5's February 2025. Both Opus and Sonnet default to 200,000 max input tokens but can stretch to 1 million in beta and at a higher cost.&lt;/p&gt;
&lt;p&gt;I just released &lt;a href="https://github.com/simonw/llm-anthropic/releases/tag/0.24"&gt;llm-anthropic 0.24&lt;/a&gt; with support for both Sonnet 4.6 and Opus 4.6. Claude Code &lt;a href="https://github.com/simonw/llm-anthropic/pull/65"&gt;did most of the work&lt;/a&gt; - the new models had a fiddly amount of extra details around adaptive thinking and no longer supporting prefixes, as described &lt;a href="https://platform.claude.com/docs/en/about-claude/models/migration-guide"&gt;in Anthropic's migration guide&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/b185576a95e9321b441f0a4dfc0e297c"&gt;what I got&lt;/a&gt; from:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx --with llm-anthropic llm 'Generate an SVG of a pelican riding a bicycle' -m claude-sonnet-4.6
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img alt="The pelican has a jaunty top hat with a red band. There is a string between the upper and lower beaks for some reason. The bicycle frame is warped in the wrong way." src="https://static.simonwillison.net/static/2026/pelican-sonnet-4.6.png" /&gt;&lt;/p&gt;
&lt;p&gt;The SVG comments include:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;!-- Hat (fun accessory) --&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I tried a second time and also got a top hat. Sonnet 4.6 apparently loves top hats!&lt;/p&gt;
&lt;p&gt;For comparison, here's the pelican Opus 4.5 drew me &lt;a href="https://simonwillison.net/2025/Nov/24/claude-opus/"&gt;in November&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="The pelican is cute and looks pretty good. The bicycle is not great - the frame is wrong and the pelican is facing backwards when the handlebars appear to be forwards.There is also something that looks a bit like an egg on the handlebars." src="https://static.simonwillison.net/static/2025/claude-opus-4.5-pelican.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;And here's Anthropic's current best pelican, drawn by Opus 4.6 &lt;a href="https://simonwillison.net/2026/Feb/5/two-new-models/"&gt;on February 5th&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Slightly wonky bicycle frame but an excellent pelican, very clear beak and pouch, nice feathers." src="https://static.simonwillison.net/static/2026/opus-4.6-pelican.png" /&gt;&lt;/p&gt;
&lt;p&gt;Opus 4.6 produces the best pelican beak/pouch. I do think the top hat from Sonnet 4.6 is a nice touch though.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=47050488"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-pricing"&gt;llm-pricing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="llm"/><category term="anthropic"/><category term="claude"/><category term="llm-pricing"/><category term="ai"/><category term="llms"/><category term="llm-release"/><category term="generative-ai"/><category term="pelican-riding-a-bicycle"/><category term="claude-code"/></entry><entry><title>llm-anthropic 0.24</title><link href="https://simonwillison.net/2026/Feb/17/llm-anthropic/#atom-tag" rel="alternate"/><published>2026-02-17T23:51:23+00:00</published><updated>2026-02-17T23:51:23+00:00</updated><id>https://simonwillison.net/2026/Feb/17/llm-anthropic/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/llm-anthropic/releases/tag/0.24"&gt;llm-anthropic 0.24&lt;/a&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="llm"/><category term="claude"/><category term="anthropic"/></entry><entry><title>Increase web search accuracy and efficiency with dynamic filtering</title><link href="https://simonwillison.net/2026/Feb/17/dynamic-filtering/#atom-tag" rel="alternate"/><published>2026-02-17T16:38:49+00:00</published><updated>2026-02-17T16:38:49+00:00</updated><id>https://simonwillison.net/2026/Feb/17/dynamic-filtering/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://claude.com/blog/improved-web-search-with-dynamic-filtering"&gt;Increase web search accuracy and efficiency with dynamic filtering&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Interesting new feature in the Claude API - yet more evidence that code execution really is the ultimate Swiss Army knife for improving the way LLMs work with data:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Alongside Claude &lt;a href="https://www.anthropic.com/news/claude-opus-4-6"&gt;Opus 4.6&lt;/a&gt; and &lt;a href="https://www.anthropic.com/news/claude-sonnet-4-6"&gt;Sonnet 4.6&lt;/a&gt;, we're releasing new versions of our &lt;a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-search-tool"&gt;web search&lt;/a&gt; and &lt;a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-fetch-tool"&gt;web fetch&lt;/a&gt; tools. Claude can now natively write and execute code during web searches to filter results before they reach the context window, improving its accuracy and token efficiency. [...]&lt;/p&gt;
&lt;p&gt;To improve Claude’s performance on web searches, our web search and web fetch tools now automatically write and execute code to post-process query results. Instead of reasoning over full HTML files, Claude can dynamically filter the search results before loading them into context, keeping only what’s relevant and discarding the rest.&lt;/p&gt;
&lt;/blockquote&gt;
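&lt;p&gt;The core trick can be sketched in a few lines of Python. This is purely illustrative - Anthropic haven't published their implementation, and the function name and keyword-matching logic here are my own invention:&lt;/p&gt;

```python
import re

def filter_search_results(html: str, query_terms: list[str]) -> str:
    """Keep only the paragraphs that mention a query term, so the model
    never has to load the full page into its context window."""
    paragraphs = re.findall(r"<p>(.*?)</p>", html, flags=re.DOTALL)
    relevant = [
        p.strip() for p in paragraphs
        if any(term.lower() in p.lower() for term in query_terms)
    ]
    return "\n".join(relevant)

page = "<p>Pricing starts at $5/month.</p><p>Our office dog is named Rex.</p>"
print(filter_search_results(page, ["pricing"]))  # → Pricing starts at $5/month.
```

&lt;p&gt;The real system presumably generates this kind of filtering code on the fly for each query, but the payoff is the same: only the relevant lines ever reach the context window.&lt;/p&gt;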
&lt;p&gt;&lt;em&gt;(Draft post I forgot to publish until March 26th!)&lt;/em&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="generative-ai"/><category term="llm-tool-use"/><category term="ai"/><category term="llms"/></entry><entry><title>Rodney and Claude Code for Desktop</title><link href="https://simonwillison.net/2026/Feb/16/rodney-claude-code/#atom-tag" rel="alternate"/><published>2026-02-16T16:38:57+00:00</published><updated>2026-02-16T16:38:57+00:00</updated><id>https://simonwillison.net/2026/Feb/16/rodney-claude-code/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm a very heavy user of &lt;a href="https://code.claude.com/docs/en/claude-code-on-the-web"&gt;Claude Code on the web&lt;/a&gt;, Anthropic's excellent but poorly named cloud version of Claude Code where everything runs in a container environment managed by them, greatly reducing the risk of anything bad happening to a computer I care about.&lt;/p&gt;
&lt;p&gt;I don't use the web interface at all (hence my dislike of the name) - I access it exclusively through their native iPhone and Mac desktop apps.&lt;/p&gt;
&lt;p&gt;Something I particularly appreciate about the desktop app is that it lets you see images that Claude is "viewing" via its &lt;code&gt;Read /path/to/image&lt;/code&gt; tool. Here's what that looks like:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a Claude Code session in Claude Desktop. Claude says: The debug page looks good - all items listed with titles and descriptions. Now let me check the nav
menu -  Analyzed menu image file - Bash uvx rodney open &amp;quot;http://localhost:8765/&amp;quot; 2&amp;gt;&amp;amp;1 &amp;amp;&amp;amp; uvx rodney click &amp;quot;details.nav-menu summary&amp;quot; 2&amp;gt;&amp;amp;1 &amp;amp;&amp;amp; sleep 0.5 &amp;amp;&amp;amp; uvx rodney screenshot /tmp/menu.png 2&amp;gt;&amp;amp;1 Output reads: Datasette: test, Clicked, /tmp/menu.png - then it says Read /tmp/menu.png and reveals a screenshot of the Datasette interface with the nav menu open, showing only &amp;quot;Debug&amp;quot; and &amp;quot;Log out&amp;quot; options. Claude continues: The menu now has just &amp;quot;Debug&amp;quot; and &amp;quot;Log out&amp;quot; — much cleaner. Both pages look good. Let me clean up the server and run the remaining tests." src="https://static.simonwillison.net/static/2026/rodney-claude-desktop.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;This means you can get a visual preview of what it's working on while it's working, without waiting for it to push code to GitHub for you to try out yourself later on.&lt;/p&gt;
&lt;p&gt;The prompt I used to trigger the above screenshot was:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run "uvx rodney --help" and then use Rodney to manually test the new pages and menu - look at screenshots from it and check you think they look OK&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I designed &lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#rodney-cli-browser-automation-designed-to-work-with-showboat"&gt;Rodney&lt;/a&gt; to have &lt;a href="https://github.com/simonw/rodney/blob/main/help.txt"&gt;--help output&lt;/a&gt; that provides everything a coding agent needs to know in order to use the tool.&lt;/p&gt;
&lt;p&gt;The Claude iPhone app doesn't display opened images yet, so I &lt;a href="https://twitter.com/simonw/status/2023432616066879606"&gt;requested it as a feature&lt;/a&gt; just now in a thread on Twitter.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/async-coding-agents"&gt;async-coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rodney"&gt;rodney&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="claude"/><category term="ai"/><category term="claude-code"/><category term="llms"/><category term="async-coding-agents"/><category term="coding-agents"/><category term="generative-ai"/><category term="projects"/><category term="ai-assisted-programming"/><category term="rodney"/></entry><entry><title>Quoting Boris Cherny</title><link href="https://simonwillison.net/2026/Feb/14/boris/#atom-tag" rel="alternate"/><published>2026-02-14T23:59:09+00:00</published><updated>2026-02-14T23:59:09+00:00</updated><id>https://simonwillison.net/2026/Feb/14/boris/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/bcherny/status/2022762422302576970"&gt;&lt;p&gt;Someone has to prompt the Claudes, talk to customers, coordinate with other teams, decide what to build next. Engineering is changing and great engineers are more important than ever.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/bcherny/status/2022762422302576970"&gt;Boris Cherny&lt;/a&gt;, Claude Code creator, on why Anthropic are still hiring developers&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/careers"&gt;careers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;&lt;/p&gt;



</summary><category term="careers"/><category term="anthropic"/><category term="ai"/><category term="claude-code"/><category term="llms"/><category term="coding-agents"/><category term="ai-assisted-programming"/><category term="generative-ai"/></entry><entry><title>Anthropic's public benefit mission</title><link href="https://simonwillison.net/2026/Feb/13/anthropic-public-benefit-mission/#atom-tag" rel="alternate"/><published>2026-02-13T23:59:51+00:00</published><updated>2026-02-13T23:59:51+00:00</updated><id>https://simonwillison.net/2026/Feb/13/anthropic-public-benefit-mission/#atom-tag</id><summary type="html">
    &lt;p&gt;Someone &lt;a href="https://news.ycombinator.com/item?id=47008560#47008978"&gt;asked&lt;/a&gt; if there was an Anthropic equivalent to &lt;a href="https://simonwillison.net/2026/Feb/13/openai-mission-statement/"&gt;OpenAI's IRS mission statements over time&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Anthropic are a "public benefit corporation" but not a non-profit, so they don't have the same requirements to file public documents with the IRS every year.&lt;/p&gt;
&lt;p&gt;But when I asked Claude it ran a search and dug up this &lt;a href="https://drive.google.com/drive/folders/1ImqXYv9_H2FTNAujZfu3EPtYFD4xIlHJ"&gt;Google Drive folder&lt;/a&gt; where Zach Stein-Perlman shared Certificate of Incorporation documents he &lt;a href="https://ailabwatch.substack.com/p/anthropics-certificate-of-incorporation"&gt;obtained from the State of Delaware&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;Anthropic's are much less interesting than OpenAI's. The earliest document from 2021 states:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The specific public benefit that the Corporation will promote is to responsibly develop and maintain advanced AI for the cultural, social and technological improvement of humanity.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Every subsequent document up to 2024 uses an updated version which says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The specific public benefit that the Corporation will promote is to responsibly develop and maintain advanced AI for the long term benefit of humanity.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-ethics"/><category term="anthropic"/><category term="ai"/></entry><entry><title>Quoting Anthropic</title><link href="https://simonwillison.net/2026/Feb/12/anthropic/#atom-tag" rel="alternate"/><published>2026-02-12T20:22:14+00:00</published><updated>2026-02-12T20:22:14+00:00</updated><id>https://simonwillison.net/2026/Feb/12/anthropic/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.anthropic.com/news/anthropic-raises-30-billion-series-g-funding-380-billion-post-money-valuation"&gt;&lt;p&gt;Claude Code was made available to the general public in May 2025. Today, Claude Code’s run-rate revenue has grown to over $2.5 billion; this figure has more than doubled since the beginning of 2026. The number of weekly active Claude Code users has also doubled since January 1 [&lt;em&gt;six weeks ago&lt;/em&gt;].&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.anthropic.com/news/anthropic-raises-30-billion-series-g-funding-380-billion-post-money-valuation"&gt;Anthropic&lt;/a&gt;, announcing their $30 billion series G&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="coding-agents"/><category term="anthropic"/><category term="claude-code"/><category term="ai-agents"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>Covering electricity price increases from our data centers</title><link href="https://simonwillison.net/2026/Feb/12/covering-electricity-price-increases/#atom-tag" rel="alternate"/><published>2026-02-12T20:01:23+00:00</published><updated>2026-02-12T20:01:23+00:00</updated><id>https://simonwillison.net/2026/Feb/12/covering-electricity-price-increases/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/news/covering-electricity-price-increases"&gt;Covering electricity price increases from our data centers&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;One of the sub-threads of the AI energy usage discourse has been the impact new data centers have on the cost of electricity to nearby residents. Here's &lt;a href="https://www.bloomberg.com/graphics/2025-ai-data-centers-electricity-prices/"&gt;detailed analysis from Bloomberg in September&lt;/a&gt; reporting "Wholesale electricity costs as much as 267% more than it did five years ago in areas near data centers".&lt;/p&gt;
&lt;p&gt;Anthropic appear to be taking on this aspect of the problem directly, promising to cover 100% of necessary grid upgrade costs and also saying:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We will work to bring net-new power generation online to match our data centers’ electricity needs. Where new generation isn’t online, we’ll work with utilities and external experts to estimate and cover demand-driven price effects from our data centers.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I look forward to genuine energy industry experts picking this apart to judge if it will actually have the claimed impact on consumers.&lt;/p&gt;
&lt;p&gt;As always, I remain frustrated at the refusal of the major AI labs to fully quantify their energy usage. The best data we've had on this still comes from Mistral's report &lt;a href="https://simonwillison.net/2025/Jul/22/mistral-environmental-standard/"&gt;last July&lt;/a&gt; and even that lacked key data such as the breakdown between energy usage for training vs inference.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://x.com/anthropicai/status/2021694494215901314"&gt;@anthropicai&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-energy-usage"&gt;ai-energy-usage&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-ethics"/><category term="ai-energy-usage"/><category term="anthropic"/><category term="ai"/></entry><entry><title>Quoting Thomas Ptacek</title><link href="https://simonwillison.net/2026/Feb/8/thomas-ptacek/#atom-tag" rel="alternate"/><published>2026-02-08T02:25:53+00:00</published><updated>2026-02-08T02:25:53+00:00</updated><id>https://simonwillison.net/2026/Feb/8/thomas-ptacek/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/tqbf/status/2019493645888462993"&gt;&lt;p&gt;People on the orange site are laughing at this, assuming it's just an ad and that there's nothing to it. Vulnerability researchers I talk to do not think this is a joke. As an erstwhile vuln researcher myself: do not bet against LLMs on this.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.axios.com/2026/02/05/anthropic-claude-opus-46-software-hunting"&gt;Axios: Anthropic's Claude Opus 4.6 uncovers 500 zero-day flaws in open-source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I think vulnerability research might be THE MOST LLM-amenable software engineering problem. Pattern-driven. Huge corpus of operational public patterns. Closed loops. Forward progress from stimulus/response tooling. Search problems.&lt;/p&gt;
&lt;p&gt;Vulnerability research outcomes are in THE MODEL CARDS for frontier labs. Those companies have so much money they're literally distorting the economy. Money buys vuln research outcomes. Why would you think they were faking any of this?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/tqbf/status/2019493645888462993"&gt;Thomas Ptacek&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/thomas-ptacek"&gt;thomas-ptacek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="thomas-ptacek"/><category term="anthropic"/><category term="claude"/><category term="security"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="open-source"/><category term="ai-security-research"/></entry><entry><title>Claude: Speed up responses with fast mode</title><link href="https://simonwillison.net/2026/Feb/7/claude-fast-mode/#atom-tag" rel="alternate"/><published>2026-02-07T23:10:33+00:00</published><updated>2026-02-07T23:10:33+00:00</updated><id>https://simonwillison.net/2026/Feb/7/claude-fast-mode/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://code.claude.com/docs/en/fast-mode"&gt;Claude: Speed up responses with fast mode&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New "research preview" from Anthropic today: you can now access a faster version of their frontier model Claude Opus 4.6 by typing &lt;code&gt;/fast&lt;/code&gt; in Claude Code... but at a cost that's 6x the normal price.&lt;/p&gt;
&lt;p&gt;Opus is usually $5/million input and $25/million output. The new fast mode is $30/million input and $150/million output!&lt;/p&gt;
&lt;p&gt;There's a 50% discount until the end of February 16th, so only a 3x multiple (!) before then.&lt;/p&gt;
&lt;p&gt;How much faster is it? The linked documentation doesn't say, but &lt;a href="https://x.com/claudeai/status/2020207322124132504"&gt;on Twitter&lt;/a&gt; Claude say:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Our teams have been building with a 2.5x-faster version of Claude Opus 4.6.&lt;/p&gt;
&lt;p&gt;We’re now making it available as an early experiment via Claude Code and our API.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Claude Opus 4.5 had a context limit of 200,000 tokens. 4.6 has an option to increase that to 1,000,000 at 2x the input price ($10/m) and 1.5x the output price ($37.50/m) once your input exceeds 200,000 tokens. These multiples hold for fast mode too, so after Feb 16th you'll be able to pay a hefty $60/m input and $225/m output for the fastest version of Anthropic's best model.&lt;/p&gt;
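&lt;p&gt;Since all of those tiers reduce to multipliers on the base Opus rates, here's a back-of-the-envelope cost helper (my own sketch, not an official calculator - it ignores the long-context surcharge to stay short):&lt;/p&gt;

```python
def opus_cost(input_tokens: int, output_tokens: int,
              fast: bool = False, discount: bool = False) -> float:
    """Dollar cost of an Opus 4.6 call from the published rates:
    $5/M input and $25/M output, with fast mode at a 6x multiple
    (3x while the 50% launch discount lasts). Long-context pricing
    for inputs over 200,000 tokens is deliberately omitted."""
    multiplier = (3 if discount else 6) if fast else 1
    input_rate = 5.0 * multiplier    # $/million input tokens
    output_rate = 25.0 * multiplier  # $/million output tokens
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# 100k input / 10k output: $0.75 normally, $4.50 on fast mode
print(opus_cost(100_000, 10_000))             # 0.75
print(opus_cost(100_000, 10_000, fast=True))  # 4.5
```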


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm-performance"&gt;llm-performance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-pricing"&gt;llm-pricing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="llm-performance"/><category term="anthropic"/><category term="claude"/><category term="claude-code"/><category term="llm-pricing"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>Opus 4.6 and Codex 5.3</title><link href="https://simonwillison.net/2026/Feb/5/two-new-models/#atom-tag" rel="alternate"/><published>2026-02-05T20:29:20+00:00</published><updated>2026-02-05T20:29:20+00:00</updated><id>https://simonwillison.net/2026/Feb/5/two-new-models/#atom-tag</id><summary type="html">
    &lt;p&gt;Two major new model releases today, within about 15 minutes of each other.&lt;/p&gt;
&lt;p&gt;Anthropic &lt;a href="https://www.anthropic.com/news/claude-opus-4-6"&gt;released Opus 4.6&lt;/a&gt;. Here's &lt;a href="https://gist.github.com/simonw/a6806ce41b4c721e240a4548ecdbe216"&gt;its pelican&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Slightly wonky bicycle frame but an excellent pelican, very clear beak and pouch, nice feathers." src="https://static.simonwillison.net/static/2026/opus-4.6-pelican.png" /&gt;&lt;/p&gt;
&lt;p&gt;OpenAI &lt;a href="https://openai.com/index/introducing-gpt-5-3-codex/"&gt;released GPT-5.3-Codex&lt;/a&gt;, albeit only via their Codex app, not yet in their API. Here's &lt;a href="https://gist.github.com/simonw/bfc4a83f588ac762c773679c0d1e034b"&gt;its pelican&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Not nearly as good - the bicycle is a bit mangled, the pelican not nearly as well rendered - it's more of a line drawing." src="https://static.simonwillison.net/static/2026/codex-5.3-pelican.png" /&gt;&lt;/p&gt;
&lt;p&gt;I've had a bit of preview access to both of these models and to be honest I'm finding it hard to find a good angle to write about them - they're both &lt;em&gt;really good&lt;/em&gt;, but so were their predecessors Codex 5.2 and Opus 4.5. I've been having trouble finding tasks that those previous models couldn't handle but the new ones are able to ace.&lt;/p&gt;
&lt;p&gt;The most convincing story about capabilities of the new model so far is Nicholas Carlini from Anthropic talking about Opus 4.6 and &lt;a href="https://www.anthropic.com/engineering/building-c-compiler"&gt;Building a C compiler with a team of parallel Claudes&lt;/a&gt; - Anthropic's version of Cursor's &lt;a href="https://simonwillison.net/2026/Jan/23/fastrender/"&gt;FastRender project&lt;/a&gt;.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/parallel-agents"&gt;parallel-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nicholas-carlini"&gt;nicholas-carlini&lt;/a&gt;&lt;/p&gt;



</summary><category term="llm-release"/><category term="anthropic"/><category term="generative-ai"/><category term="openai"/><category term="pelican-riding-a-bicycle"/><category term="ai"/><category term="llms"/><category term="parallel-agents"/><category term="c"/><category term="nicholas-carlini"/></entry><entry><title>Claude's new constitution</title><link href="https://simonwillison.net/2026/Jan/21/claudes-new-constitution/#atom-tag" rel="alternate"/><published>2026-01-21T23:39:49+00:00</published><updated>2026-01-21T23:39:49+00:00</updated><id>https://simonwillison.net/2026/Jan/21/claudes-new-constitution/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/news/claude-new-constitution"&gt;Claude&amp;#x27;s new constitution&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Late last year Richard Weiss &lt;a href="https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5-opus-soul-document"&gt;found something interesting&lt;/a&gt; while poking around with the just-released Claude Opus 4.5: he was able to talk the model into regurgitating a document which was &lt;em&gt;not&lt;/em&gt; part of the system prompt but appeared instead to be baked in during training, and which described Claude's core values at great length.&lt;/p&gt;
&lt;p&gt;He called this leak the &lt;strong&gt;soul document&lt;/strong&gt;, and Amanda Askell from Anthropic &lt;a href="https://simonwillison.net/2025/Dec/2/claude-soul-document/"&gt;quickly confirmed&lt;/a&gt; that it was indeed part of Claude's training procedures.&lt;/p&gt;
&lt;p&gt;Today Anthropic made this official, &lt;a href="https://www.anthropic.com/news/claude-new-constitution"&gt;releasing that full "constitution" document&lt;/a&gt; under a CC0 (effectively public domain) license. There's a lot to absorb! It's over 35,000 tokens, more than 10x the length of the &lt;a href="https://platform.claude.com/docs/en/release-notes/system-prompts#claude-opus-4-5"&gt;published Opus 4.5 system prompt&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;One detail that caught my eye is the acknowledgements at the end, which include a list of &lt;a href="https://www.anthropic.com/constitution#acknowledgements"&gt;external contributors&lt;/a&gt; who helped review the document. I was intrigued to note that two of the fifteen listed names are Catholic members of the clergy - &lt;a href="https://www.frbrendanmcguire.org/biography"&gt;Father Brendan McGuire&lt;/a&gt; is a pastor in Los Altos with a Master’s degree in Computer Science and Math and &lt;a href="https://en.wikipedia.org/wiki/Paul_Tighe"&gt;Bishop Paul Tighe&lt;/a&gt; is an Irish Catholic bishop with a background in moral theology.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-personality"&gt;ai-personality&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/amanda-askell"&gt;amanda-askell&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="claude"/><category term="ai-personality"/><category term="amanda-askell"/><category term="ai"/><category term="llms"/><category term="ai-ethics"/><category term="generative-ai"/></entry><entry><title>Claude Cowork Exfiltrates Files</title><link href="https://simonwillison.net/2026/Jan/14/claude-cowork-exfiltrates-files/#atom-tag" rel="alternate"/><published>2026-01-14T22:15:22+00:00</published><updated>2026-01-14T22:15:22+00:00</updated><id>https://simonwillison.net/2026/Jan/14/claude-cowork-exfiltrates-files/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.promptarmor.com/resources/claude-cowork-exfiltrates-files"&gt;Claude Cowork Exfiltrates Files&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Claude Cowork defaults to allowing outbound HTTP traffic to only a specific list of domains, to help protect the user against prompt injection attacks that exfiltrate their data.&lt;/p&gt;
&lt;p&gt;Prompt Armor found a creative workaround: Anthropic's API domain is on that list, so they constructed an attack that includes an attacker's own Anthropic API key and has the agent upload any files it can see to the &lt;code&gt;https://api.anthropic.com/v1/files&lt;/code&gt; endpoint, allowing the attacker to retrieve their content later.&lt;/p&gt;
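&lt;p&gt;It's a neat demonstration of why domain allowlists alone can't stop exfiltration. A toy model of such a filter (the allowlist contents here are hypothetical, not Cowork's actual list) shows the problem - it checks where the data is going, not whose credentials the request carries:&lt;/p&gt;

```python
from urllib.parse import urlparse

# Hypothetical allowlist - the attack only needs api.anthropic.com on it.
ALLOWED_DOMAINS = {"api.anthropic.com", "pypi.org"}

def egress_allowed(url: str) -> bool:
    """Toy egress filter: permit a request only if its host is allowlisted.
    The flaw: it inspects the destination, not the credentials, so an
    upload authorized by an attacker's API key sails straight through."""
    return urlparse(url).hostname in ALLOWED_DOMAINS

print(egress_allowed("https://api.anthropic.com/v1/files"))  # True
print(egress_allowed("https://evil.example.com/upload"))     # False
```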

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=46622328"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lethal-trifecta"&gt;lethal-trifecta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/exfiltration-attacks"&gt;exfiltration-attacks&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-cowork"&gt;claude-cowork&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="ai-agents"/><category term="ai"/><category term="claude-code"/><category term="llms"/><category term="prompt-injection"/><category term="security"/><category term="generative-ai"/><category term="lethal-trifecta"/><category term="exfiltration-attacks"/><category term="claude-cowork"/></entry><entry><title>Anthropic invests $1.5 million in the Python Software Foundation and open source security</title><link href="https://simonwillison.net/2026/Jan/13/anthropic-invests-15-million-in-the-python-software-foundation-a/#atom-tag" rel="alternate"/><published>2026-01-13T23:58:17+00:00</published><updated>2026-01-13T23:58:17+00:00</updated><id>https://simonwillison.net/2026/Jan/13/anthropic-invests-15-million-in-the-python-software-foundation-a/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://pyfound.blogspot.com/2025/12/anthropic-invests-in-python.html?m=1"&gt;Anthropic invests $1.5 million in the Python Software Foundation and open source security&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This is outstanding news, especially given our decision to withdraw from that NSF grant application &lt;a href="https://simonwillison.net/2025/Oct/27/psf-withdrawn-proposal/"&gt;back in October&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We are thrilled to announce that Anthropic has entered into a two-year partnership with the Python Software Foundation (PSF) to contribute a landmark total of $1.5 million to support the foundation’s work, with an emphasis on Python ecosystem security. This investment will enable the PSF to make crucial security advances to CPython and the Python Package Index (PyPI) benefiting all users, and it will also sustain the foundation’s core work supporting the Python language, ecosystem, and global community.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Note that while security is a focus these funds will also support other aspects of the PSF's work:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Anthropic’s support will also go towards the PSF’s core work, including the Developer in Residence program driving contributions to CPython, community support through grants and other programs, running core infrastructure such as PyPI, and more.&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/psf"&gt;psf&lt;/a&gt;&lt;/p&gt;



</summary><category term="open-source"/><category term="anthropic"/><category term="python"/><category term="ai"/><category term="psf"/></entry><entry><title>First impressions of Claude Cowork, Anthropic's general agent</title><link href="https://simonwillison.net/2026/Jan/12/claude-cowork/#atom-tag" rel="alternate"/><published>2026-01-12T21:46:13+00:00</published><updated>2026-01-12T21:46:13+00:00</updated><id>https://simonwillison.net/2026/Jan/12/claude-cowork/#atom-tag</id><summary type="html">
    &lt;p&gt;New from Anthropic today is &lt;a href="https://claude.com/blog/cowork-research-preview"&gt;Claude Cowork&lt;/a&gt;, a "research preview" that they describe as "Claude Code for the rest of your work". It's currently available only to Max subscribers ($100 or $200 per month plans) as part of the updated Claude Desktop macOS application. &lt;strong&gt;Update 16th January 2026&lt;/strong&gt;: it's now also available to $20/month Claude Pro subscribers.&lt;/p&gt;
&lt;p&gt;I've been saying for a while now that Claude Code is a "general agent" disguised as a developer tool. It can help you with any computer task that can be achieved by executing code or running terminal commands... which covers almost anything, provided you know what you're doing with it! What it really needs is a UI that doesn't involve the terminal and a name that doesn't scare away non-developers.&lt;/p&gt;
&lt;p&gt;"Cowork" is a pretty solid choice on the name front!&lt;/p&gt;
&lt;h4 id="what-it-looks-like"&gt;What it looks like&lt;/h4&gt;
&lt;p&gt;The interface for Cowork is a new tab in the Claude desktop app, called Cowork. It sits next to the existing Chat and Code tabs.&lt;/p&gt;
&lt;p&gt;It looks very similar to the desktop interface for regular Claude Code. You start with a prompt, optionally attaching a folder of files. It then starts work.&lt;/p&gt;
&lt;p&gt;I tried it out against my perpetually growing "blog-drafts" folder with the following prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Look at my drafts that were started within the last three months and then check that I didn't publish them on simonwillison.net using a search against content on that site and then suggest the ones that are most close to being ready&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/claude-cowork.jpg" alt="Screenshot of Claude AI desktop application showing a &amp;quot;Cowork&amp;quot; task interface. Left sidebar shows tabs for &amp;quot;Chat&amp;quot;, &amp;quot;Code&amp;quot;, and &amp;quot;Cowork&amp;quot; (selected), with &amp;quot;+ New task&amp;quot; button and a task titled &amp;quot;Review unpublished drafts for pu...&amp;quot; listed below. Text reads &amp;quot;These tasks run locally and aren't synced across devices&amp;quot;. Main panel header shows &amp;quot;Review unpublished drafts for publication&amp;quot;. User message in green bubble reads: &amp;quot;Look at my drafts that were started within the last three months and then check that I didn't publish them on simonwillison.net using a search against content on that site and then suggest the ones that are most close to being ready&amp;quot;. Claude responds: &amp;quot;I'll help you find drafts from the last three months and check if they've been published. Let me start by looking at your drafts folder.&amp;quot; Below is an expanded &amp;quot;Running command&amp;quot; section showing Request JSON with command: find /sessions/zealous-bold-ramanujan/mnt/blog-drafts -type f \\( -name \&amp;quot;*.md\&amp;quot; -o -name \&amp;quot;*.txt\&amp;quot; -o -name \&amp;quot;*.html\&amp;quot; \\) -mtime -90 -exec ls -la {} \\;, description: Find draft files modified in the last 90 days. Response text begins: &amp;quot;Found 46 draft files. Next let me read the content of each to get their titles/topics, then&amp;quot;. 
Right sidebar shows Progress section with three circular indicators (two checked, one pending) and text &amp;quot;Steps will show as the task unfolds.&amp;quot;, Artifacts section listing &amp;quot;publish-encouragement.html&amp;quot;, Context section with &amp;quot;Selected folders&amp;quot; showing &amp;quot;blog-drafts&amp;quot; folder, Connectors showing &amp;quot;Web search&amp;quot;, and Working files listing &amp;quot;llm-digest-october-2025.md&amp;quot;, &amp;quot;tests-not-optional-coding-agen...&amp;quot;, and &amp;quot;digest-november-2025.md&amp;quot;. Bottom shows reply input field, &amp;quot;Opus 4.5&amp;quot; model selector, user &amp;quot;Simon Willison&amp;quot; with &amp;quot;Max plan&amp;quot;, and disclaimer &amp;quot;Claude is AI and can make mistakes. Please double-check responses.&amp;quot;" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;It started by running this command:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;find /sessions/zealous-bold-ramanujan/mnt/blog-drafts \
  -type f &lt;span class="pl-cce"&gt;\(&lt;/span&gt; -name &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;*.md&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; -o -name &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;*.txt&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; -o -name &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;*.html&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-cce"&gt;\)&lt;/span&gt; \
  -mtime -90 -exec ls -la {} &lt;span class="pl-cce"&gt;\;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That &lt;code&gt;/sessions/zealous-bold-ramanujan/mnt/blog-drafts&lt;/code&gt; path instantly caught my eye. Anthropic say that Cowork can only access files you grant it access to - it looks to me like they're mounting those files into a containerized environment, which should mean we can trust Cowork not to be able to access anything outside of that sandbox.&lt;/p&gt;
&lt;p&gt;It turns out I have 46 draft files from the past three months. Claude then went to work with its search tool, running 44 individual searches against &lt;code&gt;site:simonwillison.net&lt;/code&gt; to figure out which of my drafts had already been published.&lt;/p&gt;
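&lt;p&gt;Those 44 searches presumably amount to one site-restricted query per draft title. A minimal Python sketch of that pattern - a hypothetical helper for illustration, not Cowork's actual implementation:&lt;/p&gt;

```python
# Build one site-restricted search query per draft title.
# The site and query format are assumptions based on the
# "site:simonwillison.net" searches described above.
def site_queries(titles, site="simonwillison.net"):
    return [f'site:{site} "{title}"' for title in titles]


print(site_queries(["Frequently Argued Questions about LLMs"]))
# → ['site:simonwillison.net "Frequently Argued Questions about LLMs"']
```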
&lt;p&gt;Here's the eventual reply:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Based on my analysis, here are &lt;strong&gt;your unpublished drafts that appear closest to being ready for publication&lt;/strong&gt;:&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;strong&gt;🔥 Most Ready to Publish (substantial content, not yet published)&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;freq-argued-llms.md&lt;/code&gt;&lt;/strong&gt; - "Frequently Argued Questions about LLMs" (22,602 bytes)
&lt;ul&gt;
&lt;li&gt;This is a meaty piece documenting common arguments about LLMs with your counterpoints&lt;/li&gt;
&lt;li&gt;Well-structured with a TL;DR and multiple sections&lt;/li&gt;
&lt;li&gt;No matching published article found on your site&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Very close to ready&lt;/strong&gt; - just needs a final review pass&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;claude-code-timeline.md&lt;/code&gt;&lt;/strong&gt; - "Claude Code Timeline and Codex Timeline" (3,075 bytes)
&lt;ul&gt;
&lt;li&gt;About viewing JSONL session logs from Claude Code and Codex&lt;/li&gt;
&lt;li&gt;You published &lt;code&gt;claude-code-transcripts&lt;/code&gt; on Dec 25, but this appears to be a different/earlier piece about timeline viewing tools&lt;/li&gt;
&lt;li&gt;Shorter but seems complete&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;datasette-1a20-upgrade-instructions.md&lt;/code&gt;&lt;/strong&gt; - Plugin Upgrade Guide (3,147 bytes)
&lt;ul&gt;
&lt;li&gt;Technical guide for plugin authors&lt;/li&gt;
&lt;li&gt;You published the main 1.0a20 announcement but this companion upgrade guide appears unpublished&lt;/li&gt;
&lt;li&gt;Would be valuable for plugin maintainers&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;That's a good response! It found exactly what I needed to see, although those upgrade instructions are actually published elsewhere now (&lt;a href="https://docs.datasette.io/en/latest/upgrade_guide.html"&gt;in the Datasette docs&lt;/a&gt;) and weren't actually intended for my blog.&lt;/p&gt;
&lt;p&gt;Just for fun, and because I &lt;a href="https://simonwillison.net/2024/Oct/21/claude-artifacts/"&gt;really like artifacts&lt;/a&gt;, I asked for a follow-up:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Make me an artifact with exciting animated encouragements to get me to do it&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's what I got:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/claude-cowork-artifact.jpg" alt="Screenshot of the same Claude AI desktop application Cowork interface, now showing completed task results. Left panel shows &amp;quot;1 step &amp;gt;&amp;quot; with link &amp;quot;View your animated encouragement page&amp;quot;. Claude's response reads: &amp;quot;I created an over-the-top animated encouragement page with:&amp;quot; followed by bullet points: &amp;quot;🚀 Pulsing rockets and bouncing stats&amp;quot;, &amp;quot;✨ Falling emoji rain and confetti&amp;quot;, &amp;quot;🔥 Dancing fire emojis around your draft title&amp;quot;, &amp;quot;💫 Sparkles that follow your mouse&amp;quot;, &amp;quot;📊 An animated '95% ready' progress bar&amp;quot;, &amp;quot;💬 Rotating motivational quotes&amp;quot;, &amp;quot;🎉 A 'I'M GONNA DO IT!' button that triggers an explosion of confetti when clicked&amp;quot;. Center shows an artifact preview of the generated HTML page with dark background featuring animated rocket emojis, large white text &amp;quot;PUBLISH TIME!&amp;quot;, stats showing &amp;quot;22,602 bytes of wisdom waiting&amp;quot;, &amp;quot;95% ready to ship&amp;quot;, infinity symbol with &amp;quot;future arguments saved&amp;quot;, and a fire emoji with yellow text &amp;quot;Frequently&amp;quot; (partially visible). Top toolbar shows &amp;quot;Open in Firefox&amp;quot; button. Right sidebar displays Progress section with checkmarks, Artifacts section with &amp;quot;publish-encouragement.html&amp;quot; selected, Context section showing &amp;quot;blog-drafts&amp;quot; folder, &amp;quot;Web search&amp;quot; connector, and Working files listing &amp;quot;llm-digest-october-2025.md&amp;quot;, &amp;quot;tests-not-optional-coding-agen...&amp;quot;, and &amp;quot;digest-november-2025.md&amp;quot;. Bottom shows reply input, &amp;quot;Opus 4.5&amp;quot; model selector, and disclaimer text." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I couldn't figure out how to close the right sidebar, so the artifact ended up cramped into a thin column, but it did work. I expect Anthropic will fix that display bug pretty quickly.&lt;/p&gt;
&lt;h4 id="isn-t-this-just-claude-code-"&gt;Isn't this just Claude Code?&lt;/h4&gt;
&lt;p&gt;I've seen a few people ask what the difference between this and regular Claude Code is. The answer is &lt;em&gt;not a lot&lt;/em&gt;. As far as I can tell Claude Cowork is regular Claude Code wrapped in a less intimidating default interface and with a filesystem sandbox configured for you without you needing to know what a "filesystem sandbox" is.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: It's more than just a filesystem sandbox - I had Claude Code reverse engineer the Claude app and &lt;a href="https://gist.github.com/simonw/35732f187edbe4fbd0bf976d013f22c8"&gt;it found out&lt;/a&gt; that Claude uses VZVirtualMachine - the Apple Virtualization Framework - and downloads and boots a custom Linux root filesystem.&lt;/p&gt;
&lt;p&gt;I think that's a really smart product. Claude Code has an enormous amount of value that hasn't yet been unlocked for a general audience, and this seems like a pragmatic approach.&lt;/p&gt;

&lt;h4 id="the-ever-present-threat-of-prompt-injection"&gt;The ever-present threat of prompt injection&lt;/h4&gt;
&lt;p&gt;With a feature like this, my first thought always jumps straight to security. How big is the risk that someone using this might be hit by hidden malicious instructions somewhere that break their computer or steal their data?&lt;/p&gt;
&lt;p&gt;Anthropic touch on that directly in the announcement:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You should also be aware of the risk of "&lt;a href="https://www.anthropic.com/research/prompt-injection-defenses"&gt;prompt injections&lt;/a&gt;": attempts by attackers to alter Claude's plans through content it might encounter on the internet. We've built sophisticated defenses against prompt injections, but agent safety---that is, the task of securing Claude's real-world actions---is still an active area of development in the industry.&lt;/p&gt;
&lt;p&gt;These risks aren't new with Cowork, but it might be the first time you're using a more advanced tool that moves beyond a simple conversation. We recommend taking precautions, particularly while you learn how it works. We provide more detail in our &lt;a href="https://support.claude.com/en/articles/13364135-using-cowork-safely"&gt;Help Center&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That help page includes the following tips:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;To minimize risks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Avoid granting access to local files with sensitive information, like financial documents.&lt;/li&gt;
&lt;li&gt;When using the Claude in Chrome extension, limit access to trusted sites.&lt;/li&gt;
&lt;li&gt;If you chose to extend Claude’s default internet access settings, be careful to only extend internet access to sites you trust.&lt;/li&gt;
&lt;li&gt;Monitor Claude for suspicious actions that may indicate prompt injection.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I do not think it is fair to tell regular non-programmer users to watch out for "suspicious actions that may indicate prompt injection"!&lt;/p&gt;
&lt;p&gt;I'm sure they have some impressive mitigations going on behind the scenes. I recently learned, via &lt;a href="https://x.com/bcherny/status/1989025306980860226"&gt;this tweet&lt;/a&gt; from Claude Code creator Boris Cherny, that the summarization applied by the WebFetch function in Claude Code - and now in Cowork - is partly intended as a prompt injection protection layer:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Summarization is one thing we do to reduce prompt injection risk. Are you running into specific issues with it?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;But Anthropic are being honest here with their warnings: they can attempt to filter out potential attacks all they like, but the one thing they can't provide is a guarantee that no future attack will be found that sneaks through their defenses and steals your data (see &lt;a href="https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/"&gt;the lethal trifecta&lt;/a&gt; for more on this.)&lt;/p&gt;
&lt;p&gt;The problem with prompt injection remains that until there's a high profile incident it's really hard to get people to take it seriously. I myself have all sorts of Claude Code usage that could cause havoc if a malicious injection got in. Cowork does at least run in a filesystem sandbox by default, which is more than can be said for my &lt;code&gt;claude --dangerously-skip-permissions&lt;/code&gt; habit!&lt;/p&gt;
&lt;p&gt;I wrote more about this in my 2025 round-up: &lt;a href="https://simonwillison.net/2025/Dec/31/the-year-in-llms/#the-year-of-yolo-and-the-normalization-of-deviance"&gt;The year of YOLO and the Normalization of Deviance&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="this-is-still-a-strong-signal-of-the-future"&gt;This is still a strong signal of the future&lt;/h4&gt;
&lt;p&gt;Security worries aside, Cowork represents something really interesting. This is a general agent that looks well positioned to bring the wildly powerful capabilities of Claude Code to a wider audience.&lt;/p&gt;
&lt;p&gt;I would be very surprised if Gemini and OpenAI don't follow suit with their own offerings in this category.&lt;/p&gt;
&lt;p&gt;I imagine OpenAI are already regretting burning the name "ChatGPT Agent" on their janky, experimental and mostly forgotten browser automation tool &lt;a href="https://simonwillison.net/2025/Aug/4/chatgpt-agents-user-agent/"&gt;back in August&lt;/a&gt;!&lt;/p&gt;
&lt;h4 id="bonus-and-a-silly-logo"&gt;Bonus: and a silly logo&lt;/h4&gt;
&lt;p&gt;bashtoni &lt;a href="https://news.ycombinator.com/item?id=46593022#46593553"&gt;on Hacker News&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Simple suggestion: logo should be a cow and and orc to match how I originally read the product name.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I couldn't resist &lt;a href="https://gist.github.com/simonw/d06dec3d62dee28f2bd993eb78beb2ce"&gt;throwing that one at Nano Banana&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/cow-ork.jpg" alt="An anthropic style logo with a cow and an ork on it" style="max-width: 100%;" /&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lethal-trifecta"&gt;lethal-trifecta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-cowork"&gt;claude-cowork&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="sandboxing"/><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="ai-agents"/><category term="claude-code"/><category term="lethal-trifecta"/><category term="claude-cowork"/></entry><entry><title>The November 2025 inflection point</title><link href="https://simonwillison.net/2026/Jan/4/inflection/#atom-tag" rel="alternate"/><published>2026-01-04T23:21:42+00:00</published><updated>2026-01-04T23:21:42+00:00</updated><id>https://simonwillison.net/2026/Jan/4/inflection/#atom-tag</id><summary type="html">
    &lt;p&gt;It genuinely feels to me like GPT-5.2 and Opus 4.5 in November represent an inflection point - one of those moments where the models get incrementally better in a way that tips across an invisible capability line where suddenly a whole bunch of much harder coding problems open up.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-5"&gt;gpt-5&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-4"&gt;claude-4&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/november-2025-inflection"&gt;november-2025-inflection&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="claude"/><category term="openai"/><category term="ai"/><category term="llms"/><category term="gpt-5"/><category term="ai-assisted-programming"/><category term="generative-ai"/><category term="claude-4"/><category term="november-2025-inflection"/></entry></feed>