<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: cli</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/cli.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-09-02T11:05:23+00:00</updated><author><name>Simon Willison</name></author><entry><title>Rich Pixels</title><link href="https://simonwillison.net/2025/Sep/2/rich-pixels/#atom-tag" rel="alternate"/><published>2025-09-02T11:05:23+00:00</published><updated>2025-09-02T11:05:23+00:00</updated><id>https://simonwillison.net/2025/Sep/2/rich-pixels/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/darrenburns/rich-pixels"&gt;Rich Pixels&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Neat Python library by Darren Burns adding pixel image support to the Rich terminal library, using the trick of rendering images with full or half-height colored blocks.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/darrenburns/rich-pixels/blob/a0745ebcc26b966d9dbac5875720364ee5c6a1d3/rich_pixels/_renderer.py#L123C25-L123C26"&gt;the key trick&lt;/a&gt; - it renders Unicode ▄ (U+2584, "lower half block") characters after setting a foreground and background color for the two pixels it needs to display.&lt;/p&gt;
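That half-block trick is easy to sketch in plain Python with 24-bit ANSI escape codes - a hedged illustration of the general technique, not Rich Pixels' actual code:

```python
# Each terminal cell shows two vertical pixels: the background color
# paints the top pixel and the foreground color paints the bottom one,
# via the U+2584 "lower half block" character.
def half_block(top_rgb, bottom_rgb):
    tr, tg, tb = top_rgb
    br, bg, bb = bottom_rgb
    return (
        f"\x1b[48;2;{tr};{tg};{tb}m"  # background = top pixel
        f"\x1b[38;2;{br};{bg};{bb}m"  # foreground = bottom pixel
        "\u2584"                      # lower half block
        "\x1b[0m"                     # reset
    )

# Two cells side by side: red over blue, then green over white
print(half_block((255, 0, 0), (0, 0, 255)) + half_block((0, 255, 0), (255, 255, 255)))
```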
&lt;p&gt;I got GPT-5 to &lt;a href="https://chatgpt.com/share/68b6c443-2408-8006-8f4a-6862755cd1e4"&gt;vibe code up&lt;/a&gt; a &lt;code&gt;show_image.py&lt;/code&gt; terminal command which resizes the provided image to fit the width and height of the current terminal and displays it using Rich Pixels. That &lt;a href="https://github.com/simonw/tools/blob/main/python/show_image.py"&gt;script is here&lt;/a&gt;, you can run it with &lt;code&gt;uv&lt;/code&gt; like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uv run https://tools.simonwillison.net/python/show_image.py \
  image.jpg
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's what I got when I ran it against my V&amp;amp;A East Storehouse photo from &lt;a href="https://simonwillison.net/2025/Aug/27/london-culture/"&gt;this post&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Terminal window. I ran that command and it spat out quite a pleasing and recognizable pixel art version of the photograph." src="https://static.simonwillison.net/static/2025/pixel-storehouse.jpg" /&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ascii-art"&gt;ascii-art&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/unicode"&gt;unicode&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-5"&gt;gpt-5&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rich"&gt;rich&lt;/a&gt;&lt;/p&gt;



</summary><category term="ascii-art"/><category term="cli"/><category term="python"/><category term="unicode"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="uv"/><category term="vibe-coding"/><category term="gpt-5"/><category term="rich"/></entry><entry><title>f2</title><link href="https://simonwillison.net/2025/May/24/f2/#atom-tag" rel="alternate"/><published>2025-05-24T19:20:48+00:00</published><updated>2025-05-24T19:20:48+00:00</updated><id>https://simonwillison.net/2025/May/24/f2/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/ayoisaiah/f2"&gt;f2&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Really neat CLI tool for bulk renaming of files and directories by Ayooluwa Isaiah, written in Go and designed to work cross-platform.&lt;/p&gt;
&lt;p&gt;There's a &lt;em&gt;lot&lt;/em&gt; of great design in this. &lt;a href="https://f2.freshman.tech/guide/tutorial"&gt;Basic usage&lt;/a&gt; is intuitive - here's how to rename all &lt;code&gt;.svg&lt;/code&gt; files to &lt;code&gt;.tmp.svg&lt;/code&gt; in the current directory:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;f2 -f '.svg' -r '.tmp.svg' path/to/dir
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;f2 defaults to a dry run which looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;*————————————————————*————————————————————————*————————*
|      ORIGINAL      |        RENAMED         | STATUS |
*————————————————————*————————————————————————*————————*
| claude-pelican.svg | claude-pelican.tmp.svg | ok     |
| gemini-pelican.svg | gemini-pelican.tmp.svg | ok     |
*————————————————————*————————————————————————*————————*
dry run: commit the above changes with the -x/--exec flag
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Running &lt;code&gt;-x&lt;/code&gt; executes the rename.&lt;/p&gt;
&lt;p&gt;The really cool stuff is the advanced features - Ayooluwa has thought of &lt;em&gt;everything&lt;/em&gt;. The EXIF integration is particularly clever - here's an example &lt;a href="https://f2.freshman.tech/guide/organizing-image-library"&gt;from the advanced tutorial&lt;/a&gt; which renames a library of photos to use their EXIF creation date as part of the file path:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;f2 -r '{x.cdt.YYYY}/{x.cdt.MM}-{x.cdt.MMM}/{x.cdt.YYYY}-{x.cdt.MM}-{x.cdt.DD}/{f}{ext}' -R
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;-R&lt;/code&gt; flag means "recursive". The small &lt;code&gt;-r&lt;/code&gt; uses variable syntax &lt;a href="https://f2.freshman.tech/guide/exif-variables"&gt;for EXIF data&lt;/a&gt;. There are plenty of others too, including &lt;a href="https://f2.freshman.tech/guide/file-hash-variables"&gt;hash variables&lt;/a&gt; that use the hash of the file contents.&lt;/p&gt;
&lt;h4 id="f2-installation"&gt;Installation notes&lt;/h4&gt;

&lt;p&gt;I had Go 1.23.2 installed on my Mac via Homebrew. I ran this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;go install github.com/ayoisaiah/f2/v2/cmd/f2@latest
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And got an error:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;requires go &amp;gt;= 1.24.2 (running go 1.23.2; GOTOOLCHAIN=local)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So I upgraded Go using Homebrew:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew upgrade go
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which took me to 1.24.3 - then the &lt;code&gt;go install&lt;/code&gt; command worked. It put the binary in &lt;code&gt;~/go/bin/f2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There's also &lt;a href="https://www.npmjs.com/package/@ayoisaiah/f2"&gt;an npm package&lt;/a&gt;, similar to the pattern I wrote about a while ago of people &lt;a href="https://simonwillison.net/2022/May/23/bundling-binary-tools-in-python-wheels/"&gt;Bundling binary tools in Python wheels&lt;/a&gt;.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=44081850"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/go"&gt;go&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="go"/></entry><entry><title>OpenAI Codex</title><link href="https://simonwillison.net/2025/May/16/openai-codex/#atom-tag" rel="alternate"/><published>2025-05-16T19:12:06+00:00</published><updated>2025-05-16T19:12:06+00:00</updated><id>https://simonwillison.net/2025/May/16/openai-codex/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://platform.openai.com/docs/codex"&gt;OpenAI Codex&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;a href="https://openai.com/index/introducing-codex/"&gt;Announced today&lt;/a&gt;, here's the documentation for OpenAI's "cloud-based software engineering agent". It's not yet available for us $20/month Plus customers ("coming soon") but if you're a $200/month Pro user you can try it out now.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;At a high level, you specify a prompt, and the agent goes to work in its own environment. After about 8–10 minutes, the agent gives you back a diff.&lt;/p&gt;
&lt;p&gt;You can execute prompts in either &lt;em&gt;ask&lt;/em&gt; mode or &lt;em&gt;code&lt;/em&gt; mode. When you select &lt;em&gt;ask&lt;/em&gt;, Codex clones a read-only version of your repo, booting faster and giving you follow-up tasks. &lt;em&gt;Code&lt;/em&gt; mode, however, creates a full-fledged environment that the agent can run and test against.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This &lt;a href="https://twitter.com/openaidevs/status/1923492740526112819"&gt;4 minute demo video&lt;/a&gt; is a useful overview. One note that caught my eye is that the setup phase for an environment can pull from the internet (to install necessary dependencies) but the agent loop itself still runs in a network disconnected sandbox.&lt;/p&gt;
&lt;p&gt;It sounds similar to GitHub's own &lt;a href="https://githubnext.com/projects/copilot-workspace"&gt;Copilot Workspace&lt;/a&gt; project, which can compose PRs against your code based on a prompt. The big difference is that Codex incorporates a full Code Interpreter style environment, allowing it to build and run the code it's creating and execute tests in a loop.&lt;/p&gt;
&lt;p&gt;Copilot Workspace has a level of integration with Codespaces but still requires manual intervention to help exercise the code.&lt;/p&gt;
&lt;p&gt;Also similar to Copilot Workspace is a confusing name. OpenAI now have &lt;em&gt;four&lt;/em&gt; products called Codex:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://openai.com/codex/"&gt;OpenAI Codex&lt;/a&gt;, announced today.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/codex"&gt;Codex CLI&lt;/a&gt;, a completely different coding assistant tool they released a few weeks ago that is the same kind of shape as &lt;a href="https://docs.anthropic.com/en/docs/claude-code/overview"&gt;Claude Code&lt;/a&gt;. This one owns the &lt;a href="https://github.com/openai/codex"&gt;openai/codex&lt;/a&gt; namespace on GitHub.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.openai.com/docs/models/codex-mini-latest"&gt;codex-mini&lt;/a&gt;, a brand new model released today that is used by their Codex product. It's a fine-tuned o4-mini variant. I released &lt;a href="https://github.com/simonw/llm-openai-plugin/releases/tag/0.4"&gt;llm-openai-plugin 0.4&lt;/a&gt; adding support for that model.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://web.archive.org/web/20230203201912/https://openai.com/blog/openai-codex/"&gt;OpenAI Codex (2021)&lt;/a&gt; - Internet Archive link, OpenAI's first specialist coding model from the GPT-3 era. This was used by the original GitHub Copilot and is still the current topic of Wikipedia's &lt;a href="https://en.m.wikipedia.org/wiki/OpenAI_Codex"&gt;OpenAI Codex&lt;/a&gt; page.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My favorite thing about this most recent Codex product is that OpenAI shared &lt;a href="https://github.com/openai/codex-universal/blob/main/Dockerfile"&gt;the full Dockerfile&lt;/a&gt; for the environment that the system uses to run code - in &lt;code&gt;openai/codex-universal&lt;/code&gt; on GitHub because &lt;code&gt;openai/codex&lt;/code&gt; was taken already.&lt;/p&gt;
&lt;p&gt;This is extremely useful documentation for figuring out how to use this thing - I'm glad they're making this as transparent as possible.&lt;/p&gt;
&lt;p&gt;And to be fair, if you ignore its previous history Codex is a good name for this product. I'm just glad they didn't call it &lt;a href="https://twitter.com/simonw/status/1730259398990385355"&gt;Ada&lt;/a&gt;.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/async-coding-agents"&gt;async-coding-agents&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="github"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="ai-agents"/><category term="llm-release"/><category term="coding-agents"/><category term="async-coding-agents"/></entry><entry><title>sqlite-utils 4.0a0</title><link href="https://simonwillison.net/2025/May/9/sqlite-utils-40a0/#atom-tag" rel="alternate"/><published>2025-05-09T04:02:31+00:00</published><updated>2025-05-09T04:02:31+00:00</updated><id>https://simonwillison.net/2025/May/9/sqlite-utils-40a0/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/4.0a0"&gt;sqlite-utils 4.0a0&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New alpha release of &lt;a href="https://sqlite-utils.datasette.io/"&gt;sqlite-utils&lt;/a&gt;, my Python library and CLI tool for manipulating SQLite databases.&lt;/p&gt;
&lt;p&gt;It's the first 4.0 alpha because there's a (minor) backwards-incompatible change: I've upgraded the &lt;code&gt;.upsert()&lt;/code&gt; and &lt;code&gt;.upsert_all()&lt;/code&gt; methods to use SQLIte's &lt;a href="https://www.sqlite.org/lang_upsert.html"&gt;UPSERT&lt;/a&gt; mechanism, &lt;code&gt;INSERT INTO ... ON CONFLICT DO UPDATE&lt;/code&gt;. Details in &lt;a href="https://github.com/simonw/sqlite-utils/issues/652"&gt;this issue&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That feature was added to SQLite in version 3.24.0, released 2018-06-04. I'm pretty cautious about my SQLite version support since the underlying library can be difficult to upgrade, depending on your platform and operating system.&lt;/p&gt;
&lt;p&gt;I'm going to leave the new alpha to bake for a little while before pushing a stable release. Since this is a major version bump I'm going to &lt;a href="https://github.com/simonw/sqlite-utils/issues/656"&gt;take the opportunity&lt;/a&gt; to see if there are any other minor API warts that I can clean up at the same time.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="sqlite"/><category term="sqlite-utils"/></entry><entry><title>Feed a video to a vision LLM as a sequence of JPEG frames on the CLI (also LLM 0.25)</title><link href="https://simonwillison.net/2025/May/5/llm-video-frames/#atom-tag" rel="alternate"/><published>2025-05-05T17:38:25+00:00</published><updated>2025-05-05T17:38:25+00:00</updated><id>https://simonwillison.net/2025/May/5/llm-video-frames/#atom-tag</id><summary type="html">
    &lt;p&gt;The new &lt;strong&gt;&lt;a href="https://github.com/simonw/llm-video-frames"&gt;llm-video-frames&lt;/a&gt;&lt;/strong&gt; plugin can turn a video file into a sequence of JPEG frames and feed them directly into a long context vision LLM such as GPT-4.1, even when that LLM doesn't directly support video input. It depends on a plugin feature I added to &lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-25"&gt;LLM 0.25&lt;/a&gt;, which I released last night.&lt;/p&gt;
&lt;p&gt;Here's how to try it out:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;brew install ffmpeg &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; or apt-get or your package manager of choice&lt;/span&gt;
uv tool install llm &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; or pipx install llm or pip install llm&lt;/span&gt;
llm install llm-video-frames
llm keys &lt;span class="pl-c1"&gt;set&lt;/span&gt; openai
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Paste your OpenAI API key here&lt;/span&gt;

llm -f video-frames:video.mp4 \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe the key scenes in this video&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  -m gpt-4.1-mini&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;video-frames:filepath.mp4&lt;/code&gt; syntax is provided by the new plugin. It takes that video, converts it to a JPEG for every second of the video and then turns those into &lt;a href="https://llm.datasette.io/en/stable/usage.html#attachments"&gt;attachments&lt;/a&gt; that can be passed to the LLM.&lt;/p&gt;
&lt;p&gt;Here's a demo, against &lt;a href="https://static.simonwillison.net/static/2025/cleo.mp4"&gt;this video&lt;/a&gt; of Cleo:&lt;/p&gt;

&lt;div style="max-width: 100%; margin-bottom: 0.4em"&gt;
    &lt;video controls="controls" preload="none" aria-label="Cleo " poster="https://static.simonwillison.net/static/2025/cleo-still.jpg" loop="loop" style="width: 100%; height: auto;" muted="muted"&gt;
        &lt;source src="https://static.simonwillison.net/static/2025/cleo.mp4" type="video/mp4" /&gt;
    &lt;/video&gt;
&lt;/div&gt;

&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f video-frames:cleo.mp4 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe key moments&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -m gpt-4.1-mini&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And the output from the model (&lt;a href="https://gist.github.com/simonw/a4e26166a524f9c07b4cf32b2f8da6c9"&gt;transcript here&lt;/a&gt;):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The sequence of images captures the key moments of a dog being offered and then enjoying a small treat:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the first image, a hand is holding a small cupcake with purple frosting close to a black dog's face. The dog looks eager and is focused intently on the treat.&lt;/li&gt;
&lt;li&gt;The second image shows the dog beginning to take a bite of the cupcake from the person's fingers. The dog's mouth is open, gently nibbling on the treat.&lt;/li&gt;
&lt;li&gt;In the third image, the dog has finished or is almost done with the treat and looks content, with a slight smile and a relaxed posture. The treat is no longer in the person's hand, indicating that the dog has consumed it.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This progression highlights the dog's anticipation, enjoyment, and satisfaction with the treat.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Total cost: 7,072 input tokens, 156 output tokens - for GPT-4.1 mini that's 0.3078 cents (less than a third of a cent).&lt;/p&gt;
&lt;p&gt;In this case the plugin converted the video into three images: &lt;a href="https://static.simonwillison.net/static/2025/frame_00001.jpg"&gt;frame_00001.jpg&lt;/a&gt;, &lt;a href="https://static.simonwillison.net/static/2025/frame_00002.jpg"&gt;frame_00002.jpg&lt;/a&gt; and &lt;a href="https://static.simonwillison.net/static/2025/frame_00003.jpg"&gt;frame_00003.jpg&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The plugin accepts additional arguments. You can increase the frames-per-second using &lt;code&gt;?fps=2&lt;/code&gt; - for example:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;video-frames:video.mp4?fps=2&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;summarize this video&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or you can add &lt;code&gt;?timestamps=1&lt;/code&gt; to cause &lt;code&gt;ffmpeg&lt;/code&gt; to overlay a timestamp in the bottom right corner of each frame. This gives the model a chance to return timestamps in its output.&lt;/p&gt;
&lt;p&gt;Let's try that with the Cleo video:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;video-frames:cleo.mp4?timestamps=1&amp;amp;fps=5&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;key moments, include timestamps&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -m gpt-4.1-mini&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's the output (&lt;a href="https://gist.github.com/simonw/371719849a7d2260f1eb0f422ea63187"&gt;transcript here&lt;/a&gt;):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Here are the key moments from the video "cleo.mp4" with timestamps:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;00:00:00.000 - A dog on leash looks at a small cupcake with purple frosting being held by a person.&lt;/li&gt;
&lt;li&gt;00:00:00.800 - The dog closely sniffs the cupcake.&lt;/li&gt;
&lt;li&gt;00:00:01.400 - The person puts a small amount of the cupcake frosting on their finger.&lt;/li&gt;
&lt;li&gt;00:00:01.600 - The dog starts licking the frosting from the person's finger.&lt;/li&gt;
&lt;li&gt;00:00:02.600 - The dog continues licking enthusiastically.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let me know if you need more details or a description of any specific part.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That one sent 14 images to the API, at a total cost of 32,968 input, 141 output = 1.3413 cents.&lt;/p&gt;
&lt;p&gt;It sent 5.9MB of image data to OpenAI's API, encoded as base64 in the JSON API call.&lt;/p&gt;
&lt;p&gt;The GPT-4.1 model family accepts up to 1,047,576 tokens. Aside from a 20MB size limit per image I haven't seen any documentation of limits on the number of images. You can fit a whole lot of JPEG frames in a million tokens!&lt;/p&gt;
&lt;p&gt;Here's what one of those frames looks like with the timestamp overlaid in the corner:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/cleo-finger.jpg" alt="Cleo taking a treat from my fingers, in the bottom right corner is an overlay that says cleo.mp4 00:00:01.600" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="how-i-built-the-plugin-with-o4-mini"&gt;How I built the plugin with o4-mini&lt;/h4&gt;
&lt;p&gt;This is a great example of how rapid prototyping with an LLM can help demonstrate the value of a feature.&lt;/p&gt;
&lt;p&gt;I was considering whether it would make sense for fragment plugins to return images in &lt;a href="https://github.com/simonw/llm/issues/972#issuecomment-2849342103"&gt;issue 972&lt;/a&gt; when I had the idea to use &lt;code&gt;ffmpeg&lt;/code&gt; to split a video into frames.&lt;/p&gt;
&lt;p&gt;I know &lt;a href="https://simonwillison.net/2025/Apr/23/llm-fragment-symbex/"&gt;from past experience&lt;/a&gt; that a good model can write an entire plugin for LLM if you feed it the right example, so I started with this (reformatted here for readability):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m o4-mini -f github:simonw/llm-hacker-news -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;write a new plugin called llm_video_frames.py which takes video:path-to-video.mp4 and creates a temporary directory which it then populates with one frame per second of that video using ffmpeg - then it returns a list of [llm.Attachment(path="path-to-frame1.jpg"), ...] - it should also support passing video:video.mp4?fps=2 to increase to two frames per second, and if you pass ?timestamps=1 or &amp;amp;timestamps=1 then it should add a text timestamp to the bottom right conner of each image with the mm:ss timestamp of that frame (or hh:mm:ss if more than one hour in) and the filename of the video without the path as well.&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -o reasoning_effort high&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/4f545ecb347884d1d923dbc49550b8b0#response"&gt;the transcript&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The new attachment mechanism went from vague idea to "I should build that" as a direct result of having an LLM-built proof-of-concept that demonstrated the feasibility of the new feature.&lt;/p&gt;
&lt;p&gt;The code it produced was about 90% of the code I shipped in the finished plugin. Total cost 5,018 input, 2,208 output = 1.5235 cents.&lt;/p&gt;
&lt;h4 id="annotated-release-notes-for-everything-else-in-llm-0-25"&gt;Annotated release notes for everything else in LLM 0.25&lt;/h4&gt;
&lt;p&gt;Here are the annotated release notes for everything else:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New plugin feature: &lt;a href="https://llm.datasette.io/en/stable/plugins/plugin-hooks.html#plugin-hooks-register-fragment-loaders"&gt;register_fragment_loaders(register)&lt;/a&gt; plugins can now return a mixture of fragments and attachments. The &lt;a href="https://github.com/simonw/llm-video-frames"&gt;llm-video-frames&lt;/a&gt; plugin is the first to take advantage of this mechanism. &lt;a href="https://github.com/simonw/llm/issues/972"&gt;#972&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;As described above. The inspiration for this feature came from the &lt;a href="https://github.com/agustif/llm-arxiv"&gt;llm-arxiv&lt;/a&gt; plugin by &lt;a href="https://github.com/agustif"&gt;agustif&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New OpenAI models: &lt;code&gt;gpt-4.1&lt;/code&gt;, &lt;code&gt;gpt-4.1-mini&lt;/code&gt;, &lt;code&gt;gpt-4.1-nano&lt;/code&gt;, &lt;code&gt;o3&lt;/code&gt;, &lt;code&gt;o4-mini&lt;/code&gt;. &lt;a href="https://github.com/simonw/llm/issues/945"&gt;#945&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm/issues/965"&gt;#965&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm/issues/976"&gt;#976&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;My original plan was to leave these models exclusively to the new &lt;a href="https://github.com/simonw/llm-openai-plugin"&gt;llm-openai&lt;/a&gt; plugin, since that allows me to add support for new models without a full LLM release. I'm going to punt on that until I'm ready to entirely remove the OpenAI models from LLM core.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New environment variables: &lt;code&gt;LLM_MODEL&lt;/code&gt; and &lt;code&gt;LLM_EMBEDDING_MODEL&lt;/code&gt; for setting the model to use without needing to specify &lt;code&gt;-m model_id&lt;/code&gt; every time. &lt;a href="https://github.com/simonw/llm/issues/932"&gt;#932&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;A convenience feature for when you want to set the default model for a terminal session with LLM without using the global &lt;a href="https://llm.datasette.io/en/stable/setup.html#setting-a-custom-default-model"&gt;"default model" mechanism&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New command: &lt;code&gt;llm fragments loaders&lt;/code&gt;, to list all currently available fragment loader prefixes provided by plugins. &lt;a href="https://github.com/simonw/llm/issues/941"&gt;#941&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Mainly for consistency with the existing &lt;a href="https://llm.datasette.io/en/stable/help.html#llm-templates-loaders-help"&gt;llm templates loaders&lt;/a&gt; command. Here's the output when I run &lt;code&gt;llm fragments loaders&lt;/code&gt; on my machine:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;docs:
  Fetch the latest documentation for the specified package from
  https://github.com/simonw/docs-for-llms

  Use '-f docs:' for the documentation of your current version of LLM.

docs-preview:
  Similar to docs: but fetches the latest docs including alpha/beta releases.

symbex:
  Walk the given directory, parse every .py file, and for every
  top-level function or class-method produce its signature and
  docstring plus an import line.

github:
  Load files from a GitHub repository as fragments

  Argument is a GitHub repository URL or username/repository

issue:
  Fetch GitHub issue/pull and comments as Markdown

  Argument is either "owner/repo/NUMBER" or URL to an issue

pr:
  Fetch GitHub pull request with comments and diff as Markdown

  Argument is either "owner/repo/NUMBER" or URL to a pull request

hn:
  Given a Hacker News article ID returns the full nested conversation.

  For example: -f hn:43875136

video-frames:
  Fragment loader "video-frames:&amp;lt;path&amp;gt;?fps=N&amp;amp;timestamps=1"
  - extracts frames at `fps` per second (default 1)
  - if `timestamps=1`, overlays "filename hh:mm:ss" at bottom-right
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That's from &lt;a href="https://github.com/simonw/llm-docs"&gt;llm-docs&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-fragments-symbex"&gt;llm-fragments-symbex&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-hacker-news"&gt;llm-hacker-news&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-video-frames"&gt;llm-video-frames&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm fragments&lt;/code&gt; command now shows fragments ordered by the date they were first used. &lt;a href="https://github.com/simonw/llm/issues/973"&gt;#973&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This makes it easier to quickly debug a new fragment plugin - you can run &lt;code&gt;llm fragments&lt;/code&gt; and glance at the bottom few entries.&lt;/p&gt;
&lt;p&gt;I've also been using the new &lt;a href="https://github.com/simonw/llm-echo"&gt;llm-echo&lt;/a&gt; debugging plugin for this - it adds a new fake model called "echo" which simply outputs whatever the prompt, system prompt, fragments and attachments are that were passed to the model:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f docs:sqlite-utils -m &lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Show me the context&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/simonw/cb3249856887379759515022c76d0d9e"&gt;Output here&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm chat&lt;/code&gt; now includes a &lt;code&gt;!edit&lt;/code&gt; command for editing a prompt using your default terminal text editor. Thanks, &lt;a href="https://github.com/Hopiu"&gt;Benedikt Willi&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/pull/969"&gt;#969&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a really nice enhancement to &lt;code&gt;llm chat&lt;/code&gt;, making it much more convenient to edit longer prompts.&lt;/p&gt;
&lt;p&gt;And the rest:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Allow &lt;code&gt;-t&lt;/code&gt; and &lt;code&gt;--system&lt;/code&gt; to be used at the same time. &lt;a href="https://github.com/simonw/llm/issues/916"&gt;#916&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Fixed a bug where accessing a model via its alias would fail to respect any default options set for that model. &lt;a href="https://github.com/simonw/llm/issues/968"&gt;#968&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Improved documentation for &lt;a href="https://llm.datasette.io/en/stable/other-models.html#openai-compatible-models"&gt;extra-openai-models.yaml&lt;/a&gt;. Thanks, &lt;a href="https://github.com/rahimnathwani"&gt;Rahim Nathwani&lt;/a&gt; and &lt;a href="https://github.com/dguido"&gt;Dan Guido&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/pull/950"&gt;#950&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm/pull/957"&gt;#957&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;llm -c/--continue&lt;/code&gt; now works correctly with the &lt;code&gt;-d/--database&lt;/code&gt; option. &lt;code&gt;llm chat&lt;/code&gt; now accepts that &lt;code&gt;-d/--database&lt;/code&gt; option. Thanks, &lt;a href="https://github.com/sukhbinder"&gt;Sukhbinder Singh&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/issues/933"&gt;#933&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ffmpeg"&gt;ffmpeg&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vision-llms"&gt;vision-llms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="ffmpeg"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="vision-llms"/></entry><entry><title>llm-fragment-symbex</title><link href="https://simonwillison.net/2025/Apr/23/llm-fragment-symbex/#atom-tag" rel="alternate"/><published>2025-04-23T14:25:38+00:00</published><updated>2025-04-23T14:25:38+00:00</updated><id>https://simonwillison.net/2025/Apr/23/llm-fragment-symbex/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-fragments-symbex"&gt;llm-fragment-symbex&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I released a new LLM &lt;a href="https://llm.datasette.io/en/stable/fragments.html#using-fragments-from-plugins"&gt;fragment loader plugin&lt;/a&gt; that builds on top of my &lt;a href="https://simonwillison.net/2023/Jun/18/symbex/"&gt;Symbex&lt;/a&gt; project.&lt;/p&gt;
&lt;p&gt;Symbex is a CLI tool I wrote that can run against a folder full of Python code and output functions, classes, methods or just their docstrings and signatures, using the Python AST module to parse the code.&lt;/p&gt;
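&lt;p&gt;The core AST trick is straightforward. This is a simplified sketch of the idea, not Symbex's actual implementation - it walks the parse tree and emits each function's signature line plus its docstring:&lt;/p&gt;

```python
import ast
import textwrap

def signatures_with_docstrings(source):
    """Extract 'def name(args):' lines plus docstrings from Python source -
    a simplified sketch of the kind of output Symbex's --docs mode produces."""
    tree = ast.parse(source)
    out = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            out.append(f"def {node.name}({args}):")
            doc = ast.get_docstring(node)
            if doc:
                out.append(f'    """{doc}"""')
    return "\n".join(out)

example = textwrap.dedent('''
    def resolve_attachment(value):
        "Resolve an attachment from a string value."
        ...
''')
print(signatures_with_docstrings(example))
```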
&lt;p&gt;&lt;code&gt;llm-fragments-symbex&lt;/code&gt; brings that ability directly to LLM. It lets you do things like this:&lt;/p&gt;
&lt;pre&gt;llm install llm-fragments-symbex
llm -f symbex:path/to/project -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Describe this codebase&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;I just ran that against my LLM project itself like this:&lt;/p&gt;
&lt;pre&gt;cd llm
llm -f symbex:. -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;guess what this code does&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/b43d5b3ea897900f5c7de7173cc51c82#response"&gt;the full output&lt;/a&gt;, which starts like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This code listing appears to be an index or dump of Python functions, classes, and methods primarily belonging to a codebase related to large language models (LLMs). It covers a broad functionality set related to managing LLMs, embeddings, templates, plugins, logging, and command-line interface (CLI) utilities for interaction with language models. [...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That page also &lt;a href="https://gist.github.com/simonw/b43d5b3ea897900f5c7de7173cc51c82#prompt-fragments"&gt;shows the input generated by the fragment&lt;/a&gt; - here's a representative extract:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-c"&gt;# from llm.cli import resolve_attachment&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;resolve_attachment&lt;/span&gt;(&lt;span class="pl-s1"&gt;value&lt;/span&gt;):
    &lt;span class="pl-s"&gt;"""Resolve an attachment from a string value which could be:&lt;/span&gt;
&lt;span class="pl-s"&gt;    - "-" for stdin&lt;/span&gt;
&lt;span class="pl-s"&gt;    - A URL&lt;/span&gt;
&lt;span class="pl-s"&gt;    - A file path&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    Returns an Attachment object.&lt;/span&gt;
&lt;span class="pl-s"&gt;    Raises AttachmentError if the attachment cannot be resolved."""&lt;/span&gt;

&lt;span class="pl-c"&gt;# from llm.cli import AttachmentType&lt;/span&gt;
&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;AttachmentType&lt;/span&gt;:

    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;convert&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;, &lt;span class="pl-s1"&gt;value&lt;/span&gt;, &lt;span class="pl-s1"&gt;param&lt;/span&gt;, &lt;span class="pl-s1"&gt;ctx&lt;/span&gt;):

&lt;span class="pl-c"&gt;# from llm.cli import resolve_attachment_with_type&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;resolve_attachment_with_type&lt;/span&gt;(&lt;span class="pl-s1"&gt;value&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;, &lt;span class="pl-s1"&gt;mimetype&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-smi"&gt;Attachment&lt;/span&gt;:&lt;/pre&gt;

&lt;p&gt;If your Python code has good docstrings and type annotations, this should hopefully be a shortcut for providing full API documentation to a model without needing to dump in the entire codebase.&lt;/p&gt;
&lt;p&gt;The above example used 13,471 input tokens and 781 output tokens, using &lt;code&gt;openai/gpt-4.1-mini&lt;/code&gt;. That model is extremely cheap, so the total cost was 0.6638 cents - less than a cent.&lt;/p&gt;
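&lt;p&gt;That figure is easy to sanity-check, assuming gpt-4.1-mini's list pricing of $0.40 per million input tokens and $1.60 per million output tokens:&lt;/p&gt;

```python
# Back-of-the-envelope check on the cost quoted above, assuming
# gpt-4.1-mini list pricing: $0.40/M input tokens, $1.60/M output tokens.
input_tokens, output_tokens = 13_471, 781
cost_dollars = input_tokens * 0.40 / 1e6 + output_tokens * 1.60 / 1e6
print(f"{cost_dollars * 100:.4f} cents")  # 0.6638 cents
```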
&lt;p&gt;The plugin itself was mostly written by o4-mini using the &lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt; plugin to load the &lt;a href="https://github.com/simonw/symbex"&gt;simonw/symbex&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-hacker-news"&gt;simonw/llm-hacker-news&lt;/a&gt; repositories as example code:&lt;/p&gt;
&lt;pre&gt;llm \
  -f github:simonw/symbex \
  -f github:simonw/llm-hacker-news \
  -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Write a new plugin as a single llm_fragments_symbex.py file which&lt;/span&gt;
&lt;span class="pl-s"&gt;   provides a custom loader which can be used like this:&lt;/span&gt;
&lt;span class="pl-s"&gt;   llm -f symbex:path/to/folder - it then loads in all of the python&lt;/span&gt;
&lt;span class="pl-s"&gt;   function signatures with their docstrings from that folder using&lt;/span&gt;
&lt;span class="pl-s"&gt;   the same trick that symbex uses, effectively the same as running&lt;/span&gt;
&lt;span class="pl-s"&gt;   symbex . '*' '*.*' --docs --imports -n&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
   -m openai/o4-mini -o reasoning_effort high&lt;/pre&gt;

&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/c46390522bc839daab6c08bad3f87b39#response"&gt;the response&lt;/a&gt;. 27,819 input tokens, 2,918 output tokens = 4.344 cents.&lt;/p&gt;
&lt;p&gt;In working on this project I identified and fixed &lt;a href="https://github.com/simonw/symbex/issues/46"&gt;a minor cosmetic defect&lt;/a&gt; in Symbex itself. Technically this is a breaking change (it changes the output) so I shipped that as &lt;a href="https://github.com/simonw/symbex/releases/tag/2.0"&gt;Symbex 2.0&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/symbex"&gt;symbex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="symbex"/><category term="llm"/></entry><entry><title>Claude Code: Best practices for agentic coding</title><link href="https://simonwillison.net/2025/Apr/19/claude-code-best-practices/#atom-tag" rel="alternate"/><published>2025-04-19T22:17:38+00:00</published><updated>2025-04-19T22:17:38+00:00</updated><id>https://simonwillison.net/2025/Apr/19/claude-code-best-practices/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/engineering/claude-code-best-practices"&gt;Claude Code: Best practices for agentic coding&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Extensive new documentation from Anthropic on how to get the best results out of their &lt;a href="https://github.com/anthropics/claude-code"&gt;Claude Code&lt;/a&gt; CLI coding agent tool, which includes this fascinating tip:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We recommend using the word "think" to trigger extended thinking mode, which gives Claude additional computation time to evaluate alternatives more thoroughly. These specific phrases are mapped directly to increasing levels of thinking budget in the system: "think" &amp;lt; "think hard" &amp;lt; "think harder" &amp;lt; "ultrathink." Each level allocates progressively more thinking budget for Claude to use.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Apparently &lt;strong&gt;ultrathink&lt;/strong&gt; is a magic word!&lt;/p&gt;
&lt;p&gt;I was curious if this was a feature of the Claude model itself or Claude Code in particular. Claude Code isn't open source but you can view the obfuscated JavaScript for it, and make it a tiny bit less obfuscated by running it through &lt;a href="https://prettier.io/"&gt;Prettier&lt;/a&gt;. With &lt;a href="https://claude.ai/share/77c398ec-6a8b-4390-91d3-6e9f0403916e"&gt;Claude's help&lt;/a&gt; I used this recipe:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mkdir -p /tmp/claude-code-examine
cd /tmp/claude-code-examine
npm init -y
npm install @anthropic-ai/claude-code
cd node_modules/@anthropic-ai/claude-code
npx prettier --write cli.js
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then used &lt;a href="https://github.com/BurntSushi/ripgrep"&gt;ripgrep&lt;/a&gt; to search for "ultrathink":&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;rg ultrathink -C 30
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And found this chunk of code:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-v"&gt;B&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;W&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;message&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;toLowerCase&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think harder"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think intensely"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think longer"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think really hard"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think super hard"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think very hard"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"ultrathink"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;
    &lt;span class="pl-en"&gt;l1&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"tengu_thinking"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;tokenCount&lt;/span&gt;: &lt;span class="pl-c1"&gt;31999&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;messageId&lt;/span&gt;: &lt;span class="pl-v"&gt;Z&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;provider&lt;/span&gt;: &lt;span class="pl-v"&gt;G&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;31999&lt;/span&gt;
  &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think about it"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think a lot"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think deeply"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think hard"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think more"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt;
  &lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"megathink"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;
    &lt;span class="pl-en"&gt;l1&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"tengu_thinking"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;tokenCount&lt;/span&gt;: &lt;span class="pl-c1"&gt;1e4&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;messageId&lt;/span&gt;: &lt;span class="pl-v"&gt;Z&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;provider&lt;/span&gt;: &lt;span class="pl-v"&gt;G&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;1e4&lt;/span&gt;
  &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-v"&gt;B&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"think"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;
    &lt;span class="pl-en"&gt;l1&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"tengu_thinking"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;tokenCount&lt;/span&gt;: &lt;span class="pl-c1"&gt;4000&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;messageId&lt;/span&gt;: &lt;span class="pl-v"&gt;Z&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;provider&lt;/span&gt;: &lt;span class="pl-v"&gt;G&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;4000&lt;/span&gt;
  &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;So yeah, it looks like "ultrathink" is a Claude Code feature - presumably that 31999 sets the token &lt;a href="https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#implementing-extended-thinking"&gt;thinking budget&lt;/a&gt;, especially since "megathink" maps to 1e4 tokens (10,000) and plain "think" maps to 4,000.&lt;/p&gt;
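&lt;p&gt;Deobfuscated, the logic amounts to a tiered keyword lookup, checked from the largest budget down. Here's a readable Python paraphrase - a sketch of the behaviour, not Claude Code's actual source:&lt;/p&gt;

```python
# Readable paraphrase of the deobfuscated tiers: phrases are checked from
# the largest thinking budget down, so "think harder" wins over "think hard".
HIGH = ["think harder", "think intensely", "think longer",
        "think really hard", "think super hard", "think very hard",
        "ultrathink"]
MEDIUM = ["think about it", "think a lot", "think deeply",
          "think hard", "think more", "megathink"]

def thinking_budget(message):
    text = message.lower()
    if any(phrase in text for phrase in HIGH):
        return 31999
    if any(phrase in text for phrase in MEDIUM):
        return 10_000
    if "think" in text:
        return 4000
    return 0

print(thinking_budget("ultrathink about this bug"))  # 31999
```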

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/HamelHusain/status/1913702157108592719"&gt;@HamelHusain&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="llm-reasoning"/><category term="coding-agents"/><category term="claude-code"/></entry><entry><title>openai/codex</title><link href="https://simonwillison.net/2025/Apr/16/openai-codex/#atom-tag" rel="alternate"/><published>2025-04-16T17:25:39+00:00</published><updated>2025-04-16T17:25:39+00:00</updated><id>https://simonwillison.net/2025/Apr/16/openai-codex/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/openai/codex"&gt;openai/codex&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Just released by OpenAI, a "lightweight coding agent that runs in your terminal". Looks like their version of &lt;a href="https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview"&gt;Claude Code&lt;/a&gt;, though unlike Claude Code Codex is released under an open source (Apache 2) license.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/openai/codex/blob/9b733fc48fb81b3f3460c1fdda111ba9b861f81f/codex-cli/src/utils/agent/agent-loop.ts#L1001-L1046"&gt;the main prompt&lt;/a&gt; that runs in a loop, which starts like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;You are operating as and within the Codex CLI, a terminal-based agentic coding assistant built by OpenAI. It wraps OpenAI models to enable natural language interaction with a local codebase. You are expected to be precise, safe, and helpful.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;You can:&lt;/code&gt;&lt;br&gt;
&lt;code&gt;- Receive user prompts, project context, and files.&lt;/code&gt;&lt;br&gt;
&lt;code&gt;- Stream responses and emit function calls (e.g., shell commands, code edits).&lt;/code&gt;&lt;br&gt;
&lt;code&gt;- Apply patches, run commands, and manage user approvals based on policy.&lt;/code&gt;&lt;br&gt;
&lt;code&gt;- Work inside a sandboxed, git-backed workspace with rollback support.&lt;/code&gt;&lt;br&gt;
&lt;code&gt;- Log telemetry so sessions can be replayed or inspected later.&lt;/code&gt;&lt;br&gt;
&lt;code&gt;- More details on your functionality are available at codex --help&lt;/code&gt;&lt;br&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;The Codex CLI is open-sourced. Don't confuse yourself with the old Codex language model built by OpenAI many moons ago (this is understandably top of mind for you!). Within this context, Codex refers to the open-source agentic coding interface. [...]&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I like that the prompt describes OpenAI's previous Codex language model as being from "many moons ago". Prompt engineering is so weird.&lt;/p&gt;
&lt;p&gt;Since the prompt says that it works "inside a sandboxed, git-backed workspace" I went looking for the sandbox. On macOS &lt;a href="https://github.com/openai/codex/blob/9b733fc48fb81b3f3460c1fdda111ba9b861f81f/codex-cli/src/utils/agent/sandbox/macos-seatbelt.ts"&gt;it uses&lt;/a&gt; the little-known &lt;code&gt;sandbox-exec&lt;/code&gt; process, part of the OS but grossly under-documented. The best information I've found about it is &lt;a href="https://www.karltarvas.com/macos-app-sandboxing-via-sandbox-exec.html"&gt;this article from 2020&lt;/a&gt;, which notes that &lt;code&gt;man sandbox-exec&lt;/code&gt; lists it as deprecated. I didn't spot evidence in the Codex code of sandboxes for other platforms.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/macos"&gt;macos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex-cli"&gt;codex-cli&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="macos"/><category term="open-source"/><category term="sandboxing"/><category term="ai"/><category term="openai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="ai-agents"/><category term="coding-agents"/><category term="claude-code"/><category term="codex-cli"/></entry><entry><title>llm-openrouter 0.4</title><link href="https://simonwillison.net/2025/Mar/10/llm-openrouter-04/#atom-tag" rel="alternate"/><published>2025-03-10T21:40:56+00:00</published><updated>2025-03-10T21:40:56+00:00</updated><id>https://simonwillison.net/2025/Mar/10/llm-openrouter-04/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-openrouter/releases/tag/0.4"&gt;llm-openrouter 0.4&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I found out this morning that &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt; include support for a number of (rate-limited) &lt;a href="https://openrouter.ai/models?max_price=0"&gt;free API models&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I occasionally run workshops on top of LLMs (&lt;a href="https://simonwillison.net/2025/Mar/8/cutting-edge-web-scraping/"&gt;like this one&lt;/a&gt;) and being able to provide students with a quick way to obtain an API key against models where they don't have to setup billing is really valuable to me!&lt;/p&gt;
&lt;p&gt;This inspired me to upgrade my existing &lt;a href="https://github.com/simonw/llm-openrouter"&gt;llm-openrouter&lt;/a&gt; plugin, and in doing so I closed out a bunch of open feature requests.&lt;/p&gt;
&lt;p&gt;Consider this post the &lt;a href="https://simonwillison.net/tags/annotated-release-notes/"&gt;annotated release notes&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;LLM &lt;a href="https://llm.datasette.io/en/stable/schemas.html"&gt;schema support&lt;/a&gt; for OpenRouter models that &lt;a href="https://openrouter.ai/models?order=newest&amp;amp;supported_parameters=structured_outputs"&gt;support structured output&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm-openrouter/issues/23"&gt;#23&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm trying to get support for LLM's &lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/"&gt;new schema feature&lt;/a&gt; into as many plugins as possible.&lt;/p&gt;
&lt;p&gt;OpenRouter's OpenAI-compatible API includes support for the &lt;code&gt;response_format&lt;/code&gt; &lt;a href="https://openrouter.ai/docs/features/structured-outputs"&gt;structured content option&lt;/a&gt;, but with an important caveat: it only works for some models, and if you try to use it on others it is silently ignored.&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://github.com/OpenRouterTeam/openrouter-examples/issues/20"&gt;filed an issue&lt;/a&gt; with OpenRouter requesting they include schema support in their machine-readable model index. For the moment LLM will let you specify schemas for unsupported models and will ignore them entirely, which isn't ideal.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;llm openrouter key&lt;/code&gt; command displays information about your current API key. &lt;a href="https://github.com/simonw/llm-openrouter/issues/24"&gt;#24&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Useful for debugging and checking the details of your key's rate limit.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;llm -m ... -o online 1&lt;/code&gt; enables &lt;a href="https://openrouter.ai/docs/features/web-search"&gt;web search grounding&lt;/a&gt; against any model, powered by &lt;a href="https://exa.ai/"&gt;Exa&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm-openrouter/issues/25"&gt;#25&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;OpenRouter apparently make this feature available to every one of their supported models! They're using new-to-me &lt;a href="https://exa.ai/"&gt;Exa&lt;/a&gt; to power this feature, an AI-focused search engine startup who appear to have built their own index with their own crawlers (according to &lt;a href="https://docs.exa.ai/reference/faqs#how-often-is-the-index-updated"&gt;their FAQ&lt;/a&gt;). This feature is currently priced by OpenRouter at $4 per 1000 results, and since 5 results are returned for every prompt that's 2 cents per prompt.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;llm openrouter models&lt;/code&gt; command for listing details of the OpenRouter models, including a &lt;code&gt;--json&lt;/code&gt; option to get JSON and a &lt;code&gt;--free&lt;/code&gt; option to filter for just the free models. &lt;a href="https://github.com/simonw/llm-openrouter/issues/26"&gt;#26&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This offers a neat way to list the available models. There are examples of the output &lt;a href="https://github.com/simonw/llm-openrouter/issues/26#issuecomment-2711908704"&gt;in the comments on the issue&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New option to specify custom provider routing: &lt;code&gt;-o provider '{JSON here}'&lt;/code&gt;. &lt;a href="https://github.com/simonw/llm-openrouter/issues/17"&gt;#17&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Part of OpenRouter's USP is that it can route prompts to different providers depending on factors like latency, cost or as a fallback if your first choice is unavailable - great for if you are using open weight models like Llama which are hosted by competing companies.&lt;/p&gt;
&lt;p&gt;The options they provide for routing are &lt;a href="https://openrouter.ai/docs/features/provider-routing"&gt;very thorough&lt;/a&gt; - I had initially hoped to provide a set of CLI options that covered all of these bases, but I decided instead to reuse their JSON format and forward those options directly on to the model.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openrouter"&gt;openrouter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-search"&gt;ai-assisted-search&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="annotated-release-notes"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="openrouter"/><category term="ai-assisted-search"/></entry><entry><title>Mistral OCR</title><link href="https://simonwillison.net/2025/Mar/7/mistral-ocr/#atom-tag" rel="alternate"/><published>2025-03-07T01:39:26+00:00</published><updated>2025-03-07T01:39:26+00:00</updated><id>https://simonwillison.net/2025/Mar/7/mistral-ocr/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://mistral.ai/fr/news/mistral-ocr"&gt;Mistral OCR&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New closed-source specialist OCR model by Mistral - you can feed it images or a PDF and it produces Markdown with optional embedded images.&lt;/p&gt;
&lt;p&gt;It's available &lt;a href="https://docs.mistral.ai/api/#tag/ocr"&gt;via their API&lt;/a&gt;, or it's "available to self-host on a selective basis" for people with stringent privacy requirements who are willing to talk to their sales team.&lt;/p&gt;
&lt;p&gt;I decided to try out their API, so I copied and pasted example code &lt;a href="https://colab.research.google.com/drive/11NdqWVwC_TtJyKT6cmuap4l9SryAeeVt?usp=sharing"&gt;from their notebook&lt;/a&gt; into my &lt;a href="https://simonwillison.net/2024/Dec/19/one-shot-python-tools/"&gt;custom Claude project&lt;/a&gt; and &lt;a href="https://claude.ai/share/153d8eb8-82dd-4f8c-a3d0-6c23b4dc21a2"&gt;told it&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Turn this into a CLI app, depends on mistralai - it should take a file path and an optional API key defauling to env vironment called MISTRAL_API_KEY&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;After &lt;a href="https://claude.ai/share/b746cab4-293b-4e04-b662-858bb164ab78"&gt;some further&lt;/a&gt; iteration / vibe coding I got to something that worked, which I then tidied up and shared as &lt;a href="https://github.com/simonw/tools/blob/main/python/mistral_ocr.py"&gt;mistral_ocr.py&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can try it out like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export MISTRAL_API_KEY='...'
uv run http://tools.simonwillison.net/python/mistral_ocr.py \
  mixtral.pdf --html --inline-images &amp;gt; mixtral.html
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I fed in &lt;a href="https://arxiv.org/abs/2401.04088"&gt;the Mixtral paper&lt;/a&gt; as a PDF. The API returns Markdown, but my &lt;code&gt;--html&lt;/code&gt; option renders that Markdown as HTML and the &lt;code&gt;--inline-images&lt;/code&gt; option takes any images and inlines them as base64 URIs (inspired &lt;a href="https://simonwillison.net/2025/Mar/6/monolith/"&gt;by monolith&lt;/a&gt;). The result is &lt;a href="https://static.simonwillison.net/static/2025/mixtral.html"&gt;mixtral.html&lt;/a&gt;, a 972KB HTML file with images and text bundled together.&lt;/p&gt;
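The base64 inlining trick is simple enough to sketch in a few lines - this is just the core idea of turning image bytes into a data: URI, not the actual mistral_ocr.py code:

```python
import base64

def data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data: URI."""
    return f"data:{mime};base64," + base64.b64encode(image_bytes).decode("ascii")

# The first three bytes of any JPEG file, as a stand-in for a real image
uri = data_uri(b"\xff\xd8\xff")
```

The resulting string can be dropped straight into an img src attribute, which is how everything ends up bundled into a single HTML file.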
&lt;p&gt;This did a pretty great job!&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of part of the document, it has a heading, some text, an image and the start of a table. The table contains some unrendered MathML syntax." src="https://static.simonwillison.net/static/2025/mixtral-as-html.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;My script renders Markdown tables but I haven't figured out how to render inline Markdown MathML yet. I ran the command a second time and requested Markdown output (the default) like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uv run http://tools.simonwillison.net/python/mistral_ocr.py \
  mixtral.pdf &amp;gt; mixtral.md
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/023d1cf403c1cd9f41801c85510aef21"&gt;that Markdown rendered as a Gist&lt;/a&gt; - there are a few MathML glitches so clearly the Mistral OCR MathML dialect and the GitHub Formatted Markdown dialect don't quite line up.&lt;/p&gt;
&lt;p&gt;My tool can also output raw JSON as an alternative to Markdown or HTML - full details &lt;a href="https://tools.simonwillison.net/python/#mistral_ocrpy"&gt;in the documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The Mistral API is priced at roughly 1000 pages per dollar, with a 50% discount for batch usage.&lt;/p&gt;
&lt;p&gt;The big question with LLM-based OCR is always how well it copes with accidental instructions in the text (can you safely OCR a document full of prompting examples?) and how well it handles text it can't read.&lt;/p&gt;
&lt;p&gt;Mistral's Sophia Yang says it &lt;a href="https://x.com/sophiamyang/status/1897719199595720722"&gt;"should be robust"&lt;/a&gt; against following instructions in the text, and invited people to try and find counter-examples.&lt;/p&gt;
&lt;p&gt;Alexander Doria noted that &lt;a href="https://twitter.com/Dorialexander/status/1897702264543875535"&gt;Mistral OCR can hallucinate text&lt;/a&gt; when faced with handwriting that it cannot understand.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/sophiamyang/status/1897713370029068381"&gt;@sophiamyang&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ocr"&gt;ocr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pdf"&gt;pdf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mistral"&gt;mistral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vision-llms"&gt;vision-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="ocr"/><category term="pdf"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="claude"/><category term="mistral"/><category term="vision-llms"/><category term="uv"/></entry><entry><title>monolith</title><link href="https://simonwillison.net/2025/Mar/6/monolith/#atom-tag" rel="alternate"/><published>2025-03-06T15:37:48+00:00</published><updated>2025-03-06T15:37:48+00:00</updated><id>https://simonwillison.net/2025/Mar/6/monolith/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/Y2Z/monolith"&gt;monolith&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Neat CLI tool built in Rust that can create a single packaged HTML file of a web page plus all of its dependencies.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cargo install monolith  # or: brew install monolith
monolith https://simonwillison.net/ &amp;gt; simonwillison.html
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That command produced &lt;a href="https://static.simonwillison.net/static/2025/simonwillison.html"&gt;this 1.5MB single file result&lt;/a&gt;. All of the linked images, CSS and JavaScript assets have had their contents inlined into base64 URIs in their &lt;code&gt;src=&lt;/code&gt; and &lt;code&gt;href=&lt;/code&gt; attributes.&lt;/p&gt;
&lt;p&gt;I was intrigued as to how it works, so I dumped the whole repository into Gemini 2.0 Pro and asked for an architectural summary:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd /tmp
git clone https://github.com/Y2Z/monolith
cd monolith
files-to-prompt . -c | llm -m gemini-2.0-pro-exp-02-05 \
  -s 'architectural overview as markdown'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/2c80749935ae3339d6f7175dc7cf325b"&gt;what I got&lt;/a&gt;. Short version: it uses the &lt;code&gt;reqwest&lt;/code&gt;, &lt;code&gt;html5ever&lt;/code&gt;, &lt;code&gt;markup5ever_rcdom&lt;/code&gt; and &lt;code&gt;cssparser&lt;/code&gt; crates to fetch and parse HTML and CSS and extract, combine and rewrite the assets. It doesn't currently attempt to run any JavaScript.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=42933383#42935115"&gt;Comment on Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scraping"&gt;scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/files-to-prompt"&gt;files-to-prompt&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="scraping"/><category term="ai"/><category term="rust"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="files-to-prompt"/></entry><entry><title>Aider: Using uv as an installer</title><link href="https://simonwillison.net/2025/Mar/6/aider-using-uv-as-an-installer/#atom-tag" rel="alternate"/><published>2025-03-06T01:47:20+00:00</published><updated>2025-03-06T01:47:20+00:00</updated><id>https://simonwillison.net/2025/Mar/6/aider-using-uv-as-an-installer/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://aider.chat/2025/01/15/uv.html"&gt;Aider: Using uv as an installer&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Paul Gauthier has an innovative solution for the challenge of helping end users get a copy of his Aider CLI Python utility installed in an isolated virtual environment without first needing to teach them what an "isolated virtual environment" is.&lt;/p&gt;
&lt;p&gt;Provided you already have a Python install of version 3.8 or higher you can run this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install aider-install &amp;amp;&amp;amp; aider-install
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;a href="https://pypi.org/project/aider-install/"&gt;aider-install&lt;/a&gt; package itself depends on &lt;a href="https://github.com/astral-sh/uv"&gt;uv&lt;/a&gt;. When you run &lt;code&gt;aider-install&lt;/code&gt; it executes the following &lt;a href="https://github.com/Aider-AI/aider-install/blob/main/aider_install/main.py"&gt;Python code&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;install_aider&lt;/span&gt;():
    &lt;span class="pl-k"&gt;try&lt;/span&gt;:
        &lt;span class="pl-s1"&gt;uv_bin&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;uv&lt;/span&gt;.&lt;span class="pl-c1"&gt;find_uv_bin&lt;/span&gt;()
        &lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-c1"&gt;check_call&lt;/span&gt;([
            &lt;span class="pl-s1"&gt;uv_bin&lt;/span&gt;, &lt;span class="pl-s"&gt;"tool"&lt;/span&gt;, &lt;span class="pl-s"&gt;"install"&lt;/span&gt;, &lt;span class="pl-s"&gt;"--force"&lt;/span&gt;, &lt;span class="pl-s"&gt;"--python"&lt;/span&gt;, &lt;span class="pl-s"&gt;"python3.12"&lt;/span&gt;, &lt;span class="pl-s"&gt;"aider-chat@latest"&lt;/span&gt;
        ])
        &lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-c1"&gt;check_call&lt;/span&gt;([&lt;span class="pl-s1"&gt;uv_bin&lt;/span&gt;, &lt;span class="pl-s"&gt;"tool"&lt;/span&gt;, &lt;span class="pl-s"&gt;"update-shell"&lt;/span&gt;])
    &lt;span class="pl-k"&gt;except&lt;/span&gt; &lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-c1"&gt;CalledProcessError&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-s1"&gt;e&lt;/span&gt;:
        &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s"&gt;f"Failed to install aider: &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-s1"&gt;e&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;"&lt;/span&gt;)
        &lt;span class="pl-s1"&gt;sys&lt;/span&gt;.&lt;span class="pl-c1"&gt;exit&lt;/span&gt;(&lt;span class="pl-c1"&gt;1&lt;/span&gt;)&lt;/pre&gt;

&lt;p&gt;This first figures out the location of the &lt;code&gt;uv&lt;/code&gt; Rust binary, then uses it to install his &lt;a href="https://pypi.org/project/aider-chat/"&gt;aider-chat&lt;/a&gt; package by running the equivalent of this command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uv tool install --force --python python3.12 aider-chat@latest
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will in turn install a brand new standalone copy of Python 3.12 and tuck it away in uv's own managed directory structure where it shouldn't hurt anything else.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;aider-chat&lt;/code&gt; script defaults to being dropped in the XDG standard directory, which is probably &lt;code&gt;~/.local/bin&lt;/code&gt; - see &lt;a href="https://docs.astral.sh/uv/concepts/tools/#the-bin-directory"&gt;uv's documentation&lt;/a&gt;. The &lt;a href="https://docs.astral.sh/uv/concepts/tools/#overwriting-executables"&gt;--force flag&lt;/a&gt; ensures that &lt;code&gt;uv&lt;/code&gt; will overwrite any previous attempts at installing &lt;code&gt;aider-chat&lt;/code&gt; in that location with the new one.&lt;/p&gt;
&lt;p&gt;Finally, running &lt;code&gt;uv tool update-shell&lt;/code&gt; ensures that bin directory is &lt;a href="https://docs.astral.sh/uv/concepts/tools/#the-path"&gt;on the user's PATH&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I &lt;em&gt;think&lt;/em&gt; I like this. There is a LOT of stuff going on here, and experienced users may well opt for an &lt;a href="https://aider.chat/docs/install.html"&gt;alternative installation mechanism&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But for non-expert Python users who just want to start using Aider, I think this pattern represents quite a tasteful way of getting everything working with minimal risk of breaking the user's system.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: Paul &lt;a href="https://twitter.com/paulgauthier/status/1897486573857595877"&gt;adds&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Offering this install method dramatically reduced the number of GitHub issues from users with conflicted/broken python environments.&lt;/p&gt;
&lt;p&gt;I also really like the "curl | sh" aider installer based on uv. Even users who don't have python installed can use it.&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/aider"&gt;aider&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/paul-gauthier"&gt;paul-gauthier&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="python"/><category term="aider"/><category term="uv"/><category term="paul-gauthier"/></entry><entry><title>strip-tags 0.6</title><link href="https://simonwillison.net/2025/Feb/28/strip-tags/#atom-tag" rel="alternate"/><published>2025-02-28T22:02:16+00:00</published><updated>2025-02-28T22:02:16+00:00</updated><id>https://simonwillison.net/2025/Feb/28/strip-tags/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/strip-tags/releases/tag/0.6"&gt;strip-tags 0.6&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;It's been a while since I updated this tool, but in investigating &lt;a href="https://github.com/simonw/llm/issues/808"&gt;a tricky mistake&lt;/a&gt; in my tutorial for LLM schemas I discovered &lt;a href="https://github.com/simonw/strip-tags/issues/32"&gt;a bug&lt;/a&gt; that I needed to fix.&lt;/p&gt;
&lt;p&gt;Those release notes in full:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Fixed a bug where &lt;code&gt;strip-tags -t meta&lt;/code&gt; still removed &lt;code&gt;&amp;lt;meta&amp;gt;&lt;/code&gt; tags from the &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt; because the entire &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt; element was removed first. &lt;a href="https://github.com/simonw/strip-tags/issues/32"&gt;#32&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Kept &lt;code&gt;&amp;lt;meta&amp;gt;&lt;/code&gt; tags now default to keeping their &lt;code&gt;content&lt;/code&gt; and &lt;code&gt;property&lt;/code&gt; attributes.&lt;/li&gt;
&lt;li&gt;The CLI &lt;code&gt;-m/--minify&lt;/code&gt; option now also removes any remaining blank lines. &lt;a href="https://github.com/simonw/strip-tags/issues/33"&gt;#33&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A new &lt;code&gt;strip_tags(remove_blank_lines=True)&lt;/code&gt; option can be used to achieve the same thing with the Python library function.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
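The effect of that remove_blank_lines option is easy to sketch - this is an illustration of the behaviour, not the library's actual implementation:

```python
def remove_blank_lines(text: str) -> str:
    """Drop lines that are empty or contain only whitespace."""
    return "\n".join(line for line in text.splitlines() if line.strip())

stripped = "Headline\n\n   \nFirst paragraph\n\nSecond paragraph"
minified = remove_blank_lines(stripped)
```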
&lt;p&gt;Now I can do this and persist the &lt;code&gt;&amp;lt;meta&amp;gt;&lt;/code&gt; tags for the article along with the stripped text content:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl -s 'https://apnews.com/article/trump-federal-employees-firings-a85d1aaf1088e050d39dcf7e3664bb9f' | \
  strip-tags -t meta --minify
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/22902a75e2e73ca513231e1d8d0dac6e"&gt;the output from that command&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/html"&gt;html&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="html"/><category term="projects"/></entry><entry><title>Structured data extraction from unstructured content using LLM schemas</title><link href="https://simonwillison.net/2025/Feb/28/llm-schemas/#atom-tag" rel="alternate"/><published>2025-02-28T17:07:07+00:00</published><updated>2025-02-28T17:07:07+00:00</updated><id>https://simonwillison.net/2025/Feb/28/llm-schemas/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-23"&gt;LLM 0.23&lt;/a&gt; is out today, and the signature feature is support for &lt;strong&gt;&lt;a href="https://llm.datasette.io/en/stable/schemas.html"&gt;schemas&lt;/a&gt;&lt;/strong&gt; - a new way of providing structured output from a model that matches a specification provided by the user. I've also upgraded both the &lt;a href="https://github.com/simonw/llm-anthropic"&gt;llm-anthropic&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-gemini"&gt;llm-gemini&lt;/a&gt; plugins to add support for  schemas.&lt;/p&gt;
&lt;p&gt;TLDR: you can now do things like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm --schema &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;name,age int,short_bio&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;invent a cool dog&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And get back:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zylo&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"age"&lt;/span&gt;: &lt;span class="pl-c1"&gt;4&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"short_bio"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zylo is a unique hybrid breed, a mix between a Siberian Husky and a Corgi. With striking blue eyes and a fluffy, colorful coat that changes shades with the seasons, Zylo embodies the spirit of winter and summer alike. Known for his playful personality and intelligence, Zylo can perform a variety of tricks and loves to fetch his favorite frisbee. Always ready for an adventure, he's just as happy hiking in the mountains as he is cuddling on the couch after a long day of play.&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;More details &lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-23"&gt;in the release notes&lt;/a&gt; and &lt;a href="https://llm.datasette.io/en/stable/schemas.html#schemas-tutorial"&gt;LLM schemas tutorial&lt;/a&gt;, which includes an example (extracting people from news articles) that's even more useful than inventing dogs!&lt;/p&gt;



&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/#structured-data-extraction-is-a-killer-app-for-llms"&gt;Structured data extraction is a killer app for LLMs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/#designing-this-feature-for-llm"&gt;Designing this feature for LLM&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/#reusing-schemas-and-creating-templates"&gt;Reusing schemas and creating templates&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/#doing-more-with-the-logged-structured-data"&gt;Doing more with the logged structured data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/#using-schemas-from-llm-s-python-library"&gt;Using schemas from LLM's Python library&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/#what-s-next-for-llm-schemas-"&gt;What's next for LLM schemas?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="structured-data-extraction-is-a-killer-app-for-llms"&gt;Structured data extraction is a killer app for LLMs&lt;/h4&gt;
&lt;p&gt;I've suspected for a while that the single most commercially valuable application of LLMs is turning unstructured content into structured data. That's the trick where you feed an LLM an article, or a PDF, or a screenshot and use it to turn that into JSON or CSV or some other structured format.&lt;/p&gt;
&lt;p&gt;It's possible to achieve strong results on this with prompting alone: feed data into an LLM, give it an example of the output you would like and let it figure out the details.&lt;/p&gt;
&lt;p&gt;Many of the leading LLM providers now bake this in as a feature. OpenAI, Anthropic, Gemini and Mistral all offer variants of "structured output" as additional options through their API:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;OpenAI: &lt;a href="https://platform.openai.com/docs/guides/structured-outputs"&gt;Structured Outputs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Gemini: &lt;a href="https://ai.google.dev/gemini-api/docs/structured-output?lang=rest"&gt;Generate structured output with the Gemini API&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Mistral: &lt;a href="https://docs.mistral.ai/capabilities/structured-output/custom_structured_output/"&gt;Custom Structured Outputs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic's &lt;a href="https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview"&gt;tool use&lt;/a&gt; can be used for this, as shown in their &lt;a href="https://github.com/anthropics/anthropic-cookbook/blob/main/tool_use/extracting_structured_json.ipynb"&gt;Extracting Structured JSON using Claude and Tool Use&lt;/a&gt; cookbook example.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These mechanisms are all very similar: you pass a &lt;a href="https://json-schema.org/"&gt;JSON schema&lt;/a&gt; to the model defining the shape that you would like, they then use that schema to guide the output of the model.&lt;/p&gt;
&lt;p&gt;How reliable that is can vary! Some providers use tricks along the lines of &lt;a href="https://github.com/1rgs/jsonformer"&gt;Jsonformer&lt;/a&gt;, compiling the JSON schema into code that interacts with the model's next-token generation at runtime, limiting it to only generate tokens that are valid in the context of the schema.&lt;/p&gt;
&lt;p&gt;Other providers YOLO it - they trust that their model is "good enough" that showing it the schema will produce the right results!&lt;/p&gt;
&lt;p&gt;In practice, this means that you need to be aware that sometimes this stuff will go wrong. As with anything LLM, 100% reliability is never guaranteed.&lt;/p&gt;
&lt;p&gt;From my experiments so far, and depending on the model you choose, these mistakes are rare. If you're using a top tier model it will almost certainly do the right thing.&lt;/p&gt;
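Given that, a cheap defensive habit is to check the parsed response against the expected shape before using it - a minimal sketch of my own, not something built into LLM:

```python
import json

def check_shape(raw: str, expected: dict) -> dict:
    """Parse a model response and verify each expected key has the right type."""
    data = json.loads(raw)
    for key, expected_type in expected.items():
        if not isinstance(data.get(key), expected_type):
            raise ValueError(f"Bad or missing field: {key}")
    return data

response = '{"name": "Zylo", "age": 4, "short_bio": "A cool dog"}'
record = check_shape(response, {"name": str, "age": int, "short_bio": str})
```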
&lt;h4 id="designing-this-feature-for-llm"&gt;Designing this feature for LLM&lt;/h4&gt;
&lt;p&gt;I've wanted this feature for ages. I see it as an important step on the way to full tool usage, which is something I'm very excited to bring to the CLI tool and Python library.&lt;/p&gt;
&lt;p&gt;LLM is designed as an abstraction layer over different models. This makes building new features &lt;em&gt;much harder&lt;/em&gt;, because I need to figure out a common denominator and then build an abstraction that captures as much value as possible while still being general enough to work across multiple models.&lt;/p&gt;
&lt;p&gt;Support for structured output across multiple vendors has matured now to the point that I'm ready to commit to a design.&lt;/p&gt;
&lt;p&gt;My first version of this feature worked exclusively with JSON schemas. An earlier version of the tutorial started with this example:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl https://www.nytimes.com/ &lt;span class="pl-k"&gt;|&lt;/span&gt; uvx strip-tags &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  llm --schema &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;{&lt;/span&gt;
&lt;span class="pl-s"&gt;  "type": "object",&lt;/span&gt;
&lt;span class="pl-s"&gt;  "properties": {&lt;/span&gt;
&lt;span class="pl-s"&gt;    "items": {&lt;/span&gt;
&lt;span class="pl-s"&gt;      "type": "array",&lt;/span&gt;
&lt;span class="pl-s"&gt;      "items": {&lt;/span&gt;
&lt;span class="pl-s"&gt;        "type": "object",&lt;/span&gt;
&lt;span class="pl-s"&gt;        "properties": {&lt;/span&gt;
&lt;span class="pl-s"&gt;          "headline": {&lt;/span&gt;
&lt;span class="pl-s"&gt;            "type": "string"&lt;/span&gt;
&lt;span class="pl-s"&gt;          },&lt;/span&gt;
&lt;span class="pl-s"&gt;          "short_summary": {&lt;/span&gt;
&lt;span class="pl-s"&gt;            "type": "string"&lt;/span&gt;
&lt;span class="pl-s"&gt;          },&lt;/span&gt;
&lt;span class="pl-s"&gt;          "key_points": {&lt;/span&gt;
&lt;span class="pl-s"&gt;            "type": "array",&lt;/span&gt;
&lt;span class="pl-s"&gt;            "items": {&lt;/span&gt;
&lt;span class="pl-s"&gt;              "type": "string"&lt;/span&gt;
&lt;span class="pl-s"&gt;            }&lt;/span&gt;
&lt;span class="pl-s"&gt;          }&lt;/span&gt;
&lt;span class="pl-s"&gt;        },&lt;/span&gt;
&lt;span class="pl-s"&gt;        "required": ["headline", "short_summary", "key_points"]&lt;/span&gt;
&lt;span class="pl-s"&gt;      }&lt;/span&gt;
&lt;span class="pl-s"&gt;    }&lt;/span&gt;
&lt;span class="pl-s"&gt;  },&lt;/span&gt;
&lt;span class="pl-s"&gt;  "required": ["items"]&lt;/span&gt;
&lt;span class="pl-s"&gt;}&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; jq&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here we're feeding a full JSON schema document to the new &lt;code&gt;llm --schema&lt;/code&gt; option, then piping in the homepage of the New York Times (after running it through &lt;a href="https://github.com/simonw/strip-tags"&gt;strip-tags&lt;/a&gt;) and asking for &lt;code&gt;headline&lt;/code&gt;, &lt;code&gt;short_summary&lt;/code&gt; and &lt;code&gt;key_points&lt;/code&gt; for multiple items on the page.&lt;/p&gt;
&lt;p&gt;This example still works with the finished feature - you can see &lt;a href="https://gist.github.com/simonw/372d11e2729a9745654740ff3f5669ab"&gt;example JSON output here&lt;/a&gt; - but constructing those long-form schemas by hand was a big pain.&lt;/p&gt;
&lt;p&gt;So... I invented my own shortcut syntax.&lt;/p&gt;
&lt;p&gt;That earlier example is a simple illustration:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm --schema &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;name,age int,short_bio&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;invent a cool dog&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here the schema is a comma-separated list of field names, with an optional space-separated type.&lt;/p&gt;
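A toy parser shows how little is going on in that shortcut - this is an illustration of the syntax, not LLM's actual parser (which also handles descriptions and more types):

```python
def parse_concise_schema(spec: str) -> dict:
    """Expand 'name,age int,short_bio' into a JSON schema object."""
    type_map = {"int": "integer", "float": "number", "str": "string", "bool": "boolean"}
    properties = {}
    for field in spec.split(","):
        parts = field.strip().split()
        name = parts[0]
        # Default to string when no type annotation is given
        json_type = type_map.get(parts[1], "string") if len(parts) > 1 else "string"
        properties[name] = {"type": json_type}
    return {"type": "object", "properties": properties, "required": list(properties)}

schema = parse_concise_schema("name,age int,short_bio")
```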
&lt;p&gt;The full concise schema syntax &lt;a href="https://llm.datasette.io/en/stable/schemas.html#concise-llm-schema-syntax"&gt;is described here&lt;/a&gt;. There's a more complex example &lt;a href="https://llm.datasette.io/en/latest/schemas.html#extracting-people-from-a-news-articles"&gt;in the tutorial&lt;/a&gt;, which uses the newline-delimited form to extract information about people who are mentioned in a news article:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://apnews.com/article/trump-federal-employees-firings-a85d1aaf1088e050d39dcf7e3664bb9f&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  uvx strip-tags &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  llm --schema-multi &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;name: the person's name&lt;/span&gt;
&lt;span class="pl-s"&gt;organization: who they represent&lt;/span&gt;
&lt;span class="pl-s"&gt;role: their job title or role&lt;/span&gt;
&lt;span class="pl-s"&gt;learned: what we learned about them from this story&lt;/span&gt;
&lt;span class="pl-s"&gt;article_headline: the headline of the story&lt;/span&gt;
&lt;span class="pl-s"&gt;article_date: the publication date in YYYY-MM-DD&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; --system &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;extract people mentioned in this article&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;--schema-multi&lt;/code&gt; option here tells LLM to take that schema for a single object and upgrade it to an array of those objects (actually an object with a single &lt;code&gt;"items"&lt;/code&gt; property that's an array of objects), which is a quick way to request that the same schema be returned multiple times against a single input.&lt;/p&gt;
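That wrapping step is mechanical - roughly this, sketching what --schema-multi produces from a single-object schema:

```python
def to_multi(schema: dict) -> dict:
    """Wrap a single-object schema in an object with an 'items' array property."""
    return {
        "type": "object",
        "properties": {"items": {"type": "array", "items": schema}},
        "required": ["items"],
    }

single = {"type": "object", "properties": {"name": {"type": "string"}}}
multi = to_multi(single)
```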
&lt;h4 id="reusing-schemas-and-creating-templates"&gt;Reusing schemas and creating templates&lt;/h4&gt;
&lt;p&gt;My original plan with schemas was to provide a separate &lt;code&gt;llm extract&lt;/code&gt; command for running these kinds of operations. I ended up going in a different direction - I realized that adding &lt;code&gt;--schema&lt;/code&gt; to the default &lt;code&gt;llm prompt&lt;/code&gt; command would make it interoperable with other existing features (like &lt;a href="https://llm.datasette.io/en/stable/usage.html#attachments"&gt;attachments&lt;/a&gt; for feeding in images and PDFs).&lt;/p&gt;
&lt;p&gt;The most valuable way to apply schemas is across many different prompts, in order to gather the same structure of information from many different sources.&lt;/p&gt;
&lt;p&gt;I put a bunch of thought into the &lt;code&gt;--schema&lt;/code&gt; option. It takes a variety of different values - quoting &lt;a href="https://llm.datasette.io/en/latest/schemas.html#ways-to-specify-a-schema"&gt;the documentation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This option can take multiple forms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A string providing a JSON schema: &lt;code&gt;--schema '{"type": "object", ...}'&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;a href="https://llm.datasette.io/en/stable/schemas.html#schemas-dsl"&gt;condensed schema definition&lt;/a&gt;: &lt;code&gt;--schema 'name,age int'&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The name or path of a file on disk containing a JSON schema: &lt;code&gt;--schema dogs.schema.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The hexadecimal ID of a previously logged schema: &lt;code&gt;--schema 520f7aabb121afd14d0c6c237b39ba2d&lt;/code&gt; - these IDs can be found using the &lt;code&gt;llm schemas&lt;/code&gt; command.&lt;/li&gt;
&lt;li&gt;A schema that has been &lt;a href="https://llm.datasette.io/en/latest/templates.html#prompt-templates-save"&gt;saved in a template&lt;/a&gt;: &lt;code&gt;--schema t:name-of-template&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
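&lt;p&gt;To illustrate the condensed DSL, here's a rough sketch of how a spec like &lt;code&gt;name,age int&lt;/code&gt; could expand into a JSON schema - a simplified approximation for illustration only, not LLM's actual parser:&lt;/p&gt;

```python
# Illustrative sketch: expand a condensed spec like "name,age int"
# into a JSON schema. This approximates the idea only - it is NOT
# LLM's real implementation of the condensed schema DSL.
def expand_condensed_schema(spec: str) -> dict:
    type_map = {"int": "integer", "float": "number", "bool": "boolean"}
    properties = {}
    for field in spec.split(","):
        parts = field.strip().split()
        name = parts[0]
        # Fields default to string unless a type hint follows the name
        json_type = type_map.get(parts[1], "string") if len(parts) > 1 else "string"
        properties[name] = {"type": json_type}
    return {"type": "object", "properties": properties, "required": list(properties)}

print(expand_condensed_schema("name,age int"))
# {'type': 'object', 'properties': {'name': {'type': 'string'}, 'age': {'type': 'integer'}}, 'required': ['name', 'age']}
```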
&lt;p&gt;The &lt;a href="https://llm.datasette.io/en/latest/schemas.html#extracting-people-from-a-news-articles"&gt;tutorial&lt;/a&gt; demonstrates using a schema once, obtaining its ID through the new &lt;code&gt;llm schemas&lt;/code&gt; command, and then saving it to a &lt;a href="https://llm.datasette.io/en/stable/templates.html"&gt;template&lt;/a&gt; (along with the system prompt) like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm --schema 3b7702e71da3dd791d9e17b76c88730e \
  --system &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;extract people mentioned in this article&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  --save people&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And now we can feed in new articles using the &lt;code&gt;llm -t people&lt;/code&gt; shortcut to apply that newly saved template:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl https://www.theguardian.com/commentisfree/2025/feb/27/billy-mcfarland-new-fyre-festival-fantasist &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  strip-tags &lt;span class="pl-k"&gt;|&lt;/span&gt; llm -t people&lt;/pre&gt;&lt;/div&gt;
&lt;h4 id="doing-more-with-the-logged-structured-data"&gt;Doing more with the logged structured data&lt;/h4&gt;
&lt;p&gt;Once you've run a few prompts that use the same schema, an obvious next step is to do something with the data that has been collected.&lt;/p&gt;
&lt;p&gt;I ended up implementing this on top of the existing &lt;a href="https://llm.datasette.io/en/stable/logging.html"&gt;llm logs&lt;/a&gt; mechanism.&lt;/p&gt;
&lt;p&gt;LLM already defaults to logging every prompt and response it makes to a SQLite database - mine contains 4,747 of these records now, according to this query:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite3 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;llm logs path&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;select count(*) from responses&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With schemas, an increasing portion of those are valid JSON.&lt;/p&gt;
&lt;p&gt;Since LLM records the schema that was used for each response - using the schema ID, which is derived from a content hash of the expanded JSON schema - it's now possible to ask LLM for all responses that used a particular schema:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm logs --schema 3b7702e71da3dd791d9e17b76c88730e --short&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I got back:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;gpt-4o-mini&lt;/span&gt;
  &lt;span class="pl-ent"&gt;datetime&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;2025-02-28T07:37:18&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;conversation&lt;/span&gt;: &lt;span class="pl-s"&gt;01jn5qt397aaxskf1vjp6zxw2a&lt;/span&gt;
  &lt;span class="pl-ent"&gt;system&lt;/span&gt;: &lt;span class="pl-s"&gt;extract people mentioned in this article&lt;/span&gt;
  &lt;span class="pl-ent"&gt;prompt&lt;/span&gt;: &lt;span class="pl-s"&gt;Menu AP Logo Menu World U.S. Politics Sports Entertainment Business Science&lt;/span&gt;
    &lt;span class="pl-s"&gt;Fact Check Oddities Be Well Newsletters N...&lt;/span&gt;
- &lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;gpt-4o-mini&lt;/span&gt;
  &lt;span class="pl-ent"&gt;datetime&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;2025-02-28T07:38:58&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;conversation&lt;/span&gt;: &lt;span class="pl-s"&gt;01jn5qx4q5he7yq803rnexp28p&lt;/span&gt;
  &lt;span class="pl-ent"&gt;system&lt;/span&gt;: &lt;span class="pl-s"&gt;extract people mentioned in this article&lt;/span&gt;
  &lt;span class="pl-ent"&gt;prompt&lt;/span&gt;: &lt;span class="pl-s"&gt;Skip to main contentSkip to navigationSkip to navigationPrint subscriptionsNewsletters&lt;/span&gt;
    &lt;span class="pl-s"&gt;Sign inUSUS editionUK editionA...&lt;/span&gt;
- &lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;gpt-4o&lt;/span&gt;
  &lt;span class="pl-ent"&gt;datetime&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;2025-02-28T07:39:07&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;conversation&lt;/span&gt;: &lt;span class="pl-s"&gt;01jn5qxh20tksb85tf3bx2m3bd&lt;/span&gt;
  &lt;span class="pl-ent"&gt;system&lt;/span&gt;: &lt;span class="pl-s"&gt;extract people mentioned in this article&lt;/span&gt;
  &lt;span class="pl-ent"&gt;attachments&lt;/span&gt;:
  - &lt;span class="pl-ent"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;image/jpeg&lt;/span&gt;
    &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;https://static.simonwillison.net/static/2025/onion-zuck.jpg&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, I've run that example schema three times (while constructing the tutorial) - twice using GPT-4o mini against text content from &lt;code&gt;curl ... | strip-tags&lt;/code&gt; and once using GPT-4o against &lt;a href="https://static.simonwillison.net/static/2025/onion-zuck.jpg"&gt;a screenshot JPEG&lt;/a&gt; to demonstrate attachment support.&lt;/p&gt;
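&lt;p&gt;Those schema IDs are stable because they're content hashes: hashing a canonical serialization of the expanded JSON schema produces the same ID every time. Here's a minimal sketch of the idea - the choice of hash function and the canonicalization details are assumptions here, not necessarily what LLM itself does:&lt;/p&gt;

```python
import hashlib
import json

def schema_id(schema: dict) -> str:
    # Serialize with sorted keys so logically identical schemas
    # always produce the same hash. (The canonicalization and use
    # of MD5 are assumptions - LLM's implementation may differ.)
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

# Key order doesn't matter - both produce the same 32-character hex ID
a = schema_id({"type": "object", "properties": {"name": {"type": "string"}}})
b = schema_id({"properties": {"name": {"type": "string"}}, "type": "object"})
print(a == b, len(a))  # True 32
```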
&lt;p&gt;Extracting gathered JSON from the logs is clearly a useful next step... so I added several options to &lt;code&gt;llm logs&lt;/code&gt; to support that use case.&lt;/p&gt;
&lt;p&gt;The first is &lt;code&gt;--data&lt;/code&gt; - adding that will cause &lt;code&gt;llm logs&lt;/code&gt; to output just the data that was gathered using a schema. Mix that with &lt;code&gt;-c&lt;/code&gt; to see the JSON from the most recent response:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm logs -c --data&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Outputs:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{&lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zap&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-ent"&gt;"age"&lt;/span&gt;: &lt;span class="pl-c1"&gt;5&lt;/span&gt;, &lt;span class="pl-ent"&gt;"short_bio"&lt;/span&gt;: ...&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Combining that with the &lt;code&gt;--schema&lt;/code&gt; option is where things get really interesting. You can specify a schema using any of the mechanisms described earlier, which means you can see ALL of the data gathered using that schema by combining &lt;code&gt;--data&lt;/code&gt; with &lt;code&gt;--schema X&lt;/code&gt; (and &lt;code&gt;-n 0&lt;/code&gt; for everything).&lt;/p&gt;
&lt;p&gt;Here are all of the dogs I've invented:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm logs --schema &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;name,age int,short_bio&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; --data -n 0&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output (here truncated):&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{&lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zap&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-ent"&gt;"age"&lt;/span&gt;: &lt;span class="pl-c1"&gt;5&lt;/span&gt;, &lt;span class="pl-ent"&gt;"short_bio"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zap is a futuristic ...&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;}
{&lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zephyr&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-ent"&gt;"age"&lt;/span&gt;: &lt;span class="pl-c1"&gt;3&lt;/span&gt;, &lt;span class="pl-ent"&gt;"short_bio"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zephyr is an adventurous...&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;}
{&lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zylo&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-ent"&gt;"age"&lt;/span&gt;: &lt;span class="pl-c1"&gt;4&lt;/span&gt;, &lt;span class="pl-ent"&gt;"short_bio"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Zylo is a unique ...&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Some schemas gather multiple items, producing output that looks like this (from the tutorial):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{"items": [{"name": "Mark Zuckerberg", "organization": "...
{"items": [{"name": "Billy McFarland", "organization": "...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can get back the individual objects by adding &lt;code&gt;--data-key items&lt;/code&gt;. Here I'm also using the &lt;code&gt;--schema t:people&lt;/code&gt; shortcut to specify the schema that was saved to the &lt;code&gt;people&lt;/code&gt; template earlier on.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm logs --schema t:people --data-key items&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{"name": "Katy Perry", "organization": ...
{"name": "Gayle King", "organization": ...
{"name": "Lauren Sanchez", "organization": ...
&lt;/code&gt;&lt;/pre&gt;
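&lt;p&gt;Conceptually, &lt;code&gt;--data-key items&lt;/code&gt; just pulls the nested objects out of each newline-delimited record. A quick Python sketch of that transformation (illustrative only, not LLM's implementation):&lt;/p&gt;

```python
import json

# Two newline-delimited responses, each with an "items" key
# (names taken from the example output above)
ndjson = (
    '{"items": [{"name": "Katy Perry"}, {"name": "Gayle King"}]}\n'
    '{"items": [{"name": "Lauren Sanchez"}]}'
)

# Equivalent of --data-key items: flatten to one object per line
records = []
for line in ndjson.splitlines():
    records.extend(json.loads(line)["items"])

for record in records:
    print(json.dumps(record))
# {"name": "Katy Perry"}
# {"name": "Gayle King"}
# {"name": "Lauren Sanchez"}
```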
&lt;p&gt;This feature defaults to outputting newline-delimited JSON, but you can add the &lt;code&gt;--data-array&lt;/code&gt; flag to get back a JSON array of objects instead.&lt;/p&gt;
&lt;p&gt;... which means you can pipe it into &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-json-data"&gt;sqlite-utils insert&lt;/a&gt; to create a SQLite database!&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm logs --schema t:people --data-key items --data-array &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  sqlite-utils insert data.db people -&lt;/pre&gt;&lt;/div&gt;
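&lt;p&gt;If you don't have &lt;code&gt;sqlite-utils&lt;/code&gt; to hand, the insert step can be approximated with Python's standard library &lt;code&gt;sqlite3&lt;/code&gt; module - though you have to create the table yourself, which &lt;code&gt;sqlite-utils&lt;/code&gt; handles automatically. A sketch using made-up dog records shaped like the earlier schema:&lt;/p&gt;

```python
import sqlite3

# Hypothetical records shaped like the "name,age int,short_bio" schema
dogs = [
    {"name": "Zap", "age": 5, "short_bio": "A futuristic dog"},
    {"name": "Zephyr", "age": 3, "short_bio": "An adventurous dog"},
]

conn = sqlite3.connect(":memory:")  # use "data.db" for a file on disk
conn.execute("create table dogs (name text, age integer, short_bio text)")
conn.executemany(
    "insert into dogs values (:name, :age, :short_bio)", dogs
)
rows = conn.execute("select name, age from dogs order by age").fetchall()
print(rows)  # [('Zephyr', 3), ('Zap', 5)]
```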
&lt;p&gt;Add all of this together and we can construct a schema, run it against a bunch of sources and dump the resulting structured data into SQLite where we can explore it using SQL queries (and &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt;). It's a really powerful combination.&lt;/p&gt;
&lt;h4 id="using-schemas-from-llm-s-python-library"&gt;Using schemas from LLM's Python library&lt;/h4&gt;
&lt;p&gt;The most popular way to work with schemas in Python these days is with &lt;a href="https://docs.pydantic.dev/"&gt;Pydantic&lt;/a&gt;, to the point that many of the official API libraries for models directly incorporate Pydantic for this purpose.&lt;/p&gt;
&lt;p&gt;LLM depended on Pydantic already, and for this project I finally dropped my dual support for Pydantic v1 and v2 and &lt;a href="https://github.com/simonw/llm/pull/775"&gt;committed to v2 only&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A key reason Pydantic is popular for this is that it's trivial to use it to build a JSON schema document:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;pydantic&lt;/span&gt;, &lt;span class="pl-s1"&gt;json&lt;/span&gt;

&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Dog&lt;/span&gt;(&lt;span class="pl-s1"&gt;pydantic&lt;/span&gt;.&lt;span class="pl-c1"&gt;BaseModel&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;name&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;
    &lt;span class="pl-s1"&gt;age&lt;/span&gt;: &lt;span class="pl-smi"&gt;int&lt;/span&gt;
    &lt;span class="pl-s1"&gt;bio&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;

&lt;span class="pl-s1"&gt;schema&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Dog&lt;/span&gt;.&lt;span class="pl-c1"&gt;model_json_schema&lt;/span&gt;()
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;json&lt;/span&gt;.&lt;span class="pl-c1"&gt;dumps&lt;/span&gt;(&lt;span class="pl-s1"&gt;schema&lt;/span&gt;, &lt;span class="pl-s1"&gt;indent&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;2&lt;/span&gt;))&lt;/pre&gt;
&lt;p&gt;Outputs:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"properties"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Name&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;string&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    },
    &lt;span class="pl-ent"&gt;"age"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Age&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;integer&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    },
    &lt;span class="pl-ent"&gt;"bio"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Bio&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;string&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    }
  },
  &lt;span class="pl-ent"&gt;"required"&lt;/span&gt;: [
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;name&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;age&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;bio&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  ],
  &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Dog&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;object&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt;
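&lt;p&gt;One nice property of having the schema as plain JSON is that you can sanity-check model output against it without extra dependencies. Here's a deliberately minimal validator sketch for flat object schemas like the one above - real code should use a library such as &lt;code&gt;jsonschema&lt;/code&gt; instead:&lt;/p&gt;

```python
# Minimal validation sketch for flat object schemas like the Dog
# schema above. Deliberately incomplete - use the jsonschema
# library for real validation.
TYPE_CHECKS = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def validate(data: dict, schema: dict) -> list:
    errors = []
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"missing required property: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key in data and not isinstance(data[key], TYPE_CHECKS[spec["type"]]):
            errors.append(f"{key} should be {spec['type']}")
    return errors

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "bio": {"type": "string"},
    },
    "required": ["name", "age", "bio"],
}
print(validate({"name": "Cleo", "age": "eleven", "bio": "A good dog"}, schema))
# ['age should be integer']
```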
&lt;p&gt;LLM's Python library doesn't require you to use Pydantic, but it supports passing either a Pydantic &lt;code&gt;BaseModel&lt;/code&gt; subclass or a full JSON schema to the new &lt;code&gt;model.prompt(schema=)&lt;/code&gt; parameter. Here's &lt;a href="https://llm.datasette.io/en/latest/python-api.html#schemas"&gt;the usage example&lt;/a&gt; from the documentation:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;, &lt;span class="pl-s1"&gt;json&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;pydantic&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;BaseModel&lt;/span&gt;

&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Dog&lt;/span&gt;(&lt;span class="pl-v"&gt;BaseModel&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;name&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;
    &lt;span class="pl-s1"&gt;age&lt;/span&gt;: &lt;span class="pl-smi"&gt;int&lt;/span&gt;

&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-4o-mini"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Describe a nice dog"&lt;/span&gt;, &lt;span class="pl-s1"&gt;schema&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-v"&gt;Dog&lt;/span&gt;)
&lt;span class="pl-s1"&gt;dog&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;json&lt;/span&gt;.&lt;span class="pl-c1"&gt;loads&lt;/span&gt;(&lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;text&lt;/span&gt;())
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;dog&lt;/span&gt;)
&lt;span class="pl-c"&gt;# {"name":"Buddy","age":3}&lt;/span&gt;&lt;/pre&gt;
&lt;h4 id="what-s-next-for-llm-schemas-"&gt;What's next for LLM schemas?&lt;/h4&gt;
&lt;p&gt;So far I've implemented schema support for models from OpenAI, Anthropic and Gemini. The &lt;a href="https://llm.datasette.io/en/stable/plugins/advanced-model-plugins.html#supporting-schemas"&gt;plugin author documentation&lt;/a&gt; includes details on how to add this to further plugins - I'd love to see one of the local model plugins implement this pattern as well.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: &lt;a href="https://github.com/taketwo/llm-ollama"&gt;llm-ollama&lt;/a&gt; now supports schemas thanks to &lt;a href="https://github.com/taketwo/llm-ollama/pull/36"&gt;this PR&lt;/a&gt; by Adam Compton. And I've added support &lt;a href="https://simonwillison.net/2025/Mar/4/llm-mistral-011/"&gt;to llm-mistral&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm presenting a workshop at the &lt;a href="https://www.ire.org/training/conferences/nicar-2025/"&gt;NICAR 2025&lt;/a&gt; data journalism conference next week about &lt;a href="https://github.com/simonw/nicar-2025-scraping/"&gt;Cutting-edge web scraping techniques&lt;/a&gt;. LLM schemas is a great example of NDD - NICAR-Driven Development - where I'm churning out features I need for that conference (see also shot-scraper's new &lt;a href="https://shot-scraper.datasette.io/en/stable/har.html"&gt;HAR support&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;I expect the workshop will be a great opportunity to further refine the design and implementation of this feature!&lt;/p&gt;
&lt;p&gt;I'm also going to be using this new feature to add multiple model support to my &lt;a href="https://www.datasette.cloud/blog/2024/datasette-extract/"&gt;datasette-extract plugin&lt;/a&gt;, which provides a web UI for structured data extraction that writes the resulting records directly to a SQLite database table.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/data-journalism"&gt;data-journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mistral"&gt;mistral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ollama"&gt;ollama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/structured-extraction"&gt;structured-extraction&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="data-journalism"/><category term="projects"/><category term="ai"/><category term="annotated-release-notes"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="llm"/><category term="mistral"/><category term="gemini"/><category term="ollama"/><category term="structured-extraction"/></entry><entry><title>Claude 3.7 Sonnet and Claude Code</title><link href="https://simonwillison.net/2025/Feb/24/claude-37-sonnet-and-claude-code/#atom-tag" rel="alternate"/><published>2025-02-24T20:25:39+00:00</published><updated>2025-02-24T20:25:39+00:00</updated><id>https://simonwillison.net/2025/Feb/24/claude-37-sonnet-and-claude-code/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/news/claude-3-7-sonnet"&gt;Claude 3.7 Sonnet and Claude Code&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Anthropic released &lt;strong&gt;Claude 3.7 Sonnet&lt;/strong&gt; today - skipping the name "Claude 3.6" because the Anthropic user community had already started using that as the unofficial name for their &lt;a href="https://www.anthropic.com/news/3-5-models-and-computer-use"&gt;October update to 3.5 Sonnet&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As you may expect, 3.7 Sonnet is an improvement over 3.5 Sonnet - and is priced the same, at $3/million tokens for input and $15/million for output.&lt;/p&gt;
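&lt;p&gt;A quick back-of-envelope calculation at those prices:&lt;/p&gt;

```python
# Back-of-envelope cost for Claude 3.7 Sonnet at $3/million input
# tokens and $15/million output tokens (the token counts below are
# made-up example values).
def cost_usd(input_tokens, output_tokens):
    return input_tokens / 1_000_000 * 3 + output_tokens / 1_000_000 * 15

# e.g. a 10,000-token prompt producing a 2,000-token response:
print(f"${cost_usd(10_000, 2_000):.3f}")  # $0.060
```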
&lt;p&gt;The big difference is that this is Anthropic's first "reasoning" model - applying the same trick that we've now seen from OpenAI o1 and o3, Grok 3, Google Gemini 2.0 Thinking, DeepSeek R1 and Qwen's QwQ and QvQ. The only big model families without an official reasoning model now are Mistral and Meta's Llama.&lt;/p&gt;
&lt;p&gt;I'm still working on &lt;a href="https://github.com/simonw/llm-anthropic/pull/15"&gt;adding support to my llm-anthropic plugin&lt;/a&gt; but I've got enough working code that I was able to get it to draw me a pelican riding a bicycle. Here's the non-reasoning model:&lt;/p&gt;
&lt;p style="text-align: center"&gt;&lt;img src="https://static.simonwillison.net/static/2025/pelican-claude-3.7-sonnet.svg" alt="A very good attempt"&gt;&lt;/p&gt;

&lt;p&gt;And here's that same prompt but with "thinking mode" enabled:&lt;/p&gt;
&lt;p style="text-align: center"&gt;&lt;img src="https://static.simonwillison.net/static/2025/pelican-claude-3.7-sonnet-thinking.svg" alt="A very good attempt"&gt;&lt;/p&gt;

&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/9c2d119f815b4a6c3802ab591857bf40"&gt;the transcript&lt;/a&gt; for that second one, which mixes together the thinking and the output tokens. I'm still working through how best to differentiate between those two types of token.&lt;/p&gt;
&lt;p&gt;Claude 3.7 Sonnet has a training cut-off date of Oct 2024 - an improvement on 3.5 Haiku's July 2024 - and can output up to 64,000 tokens in thinking mode (some of which are used for thinking tokens) and up to 128,000 if you enable &lt;a href="https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#extended-output-capabilities-beta"&gt;a special header&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Claude 3.7 Sonnet can produce substantially longer responses than previous models with support for up to 128K output tokens (beta)---more than 15x longer than other Claude models. This expanded capability is particularly effective for extended thinking use cases involving complex reasoning, rich code generation, and comprehensive content creation.&lt;/p&gt;
&lt;p&gt;This feature can be enabled by passing an &lt;code&gt;anthropic-beta&lt;/code&gt; header of &lt;code&gt;output-128k-2025-02-19&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Anthropic's other big release today is a preview of &lt;strong&gt;Claude Code&lt;/strong&gt; - a CLI tool for interacting with Claude that includes the ability to prompt Claude in a terminal chat, have it read and modify files, and execute commands. This means it can both iterate on code and execute tests, making it an extremely powerful "agent" for coding assistance.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview"&gt;Anthropic's documentation&lt;/a&gt; on getting started with Claude Code, which uses OAuth (a first for Anthropic's API) to authenticate against your API account, so you'll need to configure billing.&lt;/p&gt;
&lt;p&gt;Short version:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;npm install -g @anthropic-ai/claude-code
claude
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It can burn a lot of tokens, so don't be surprised if a lengthy session with it adds up to single-digit dollars of API spend.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/oauth"&gt;oauth&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="oauth"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="anthropic"/><category term="claude"/><category term="ai-agents"/><category term="pelican-riding-a-bicycle"/><category term="llm-reasoning"/><category term="llm-release"/><category term="coding-agents"/><category term="claude-code"/></entry><entry><title>files-to-prompt 0.6</title><link href="https://simonwillison.net/2025/Feb/19/files-to-prompt/#atom-tag" rel="alternate"/><published>2025-02-19T06:12:12+00:00</published><updated>2025-02-19T06:12:12+00:00</updated><id>https://simonwillison.net/2025/Feb/19/files-to-prompt/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/files-to-prompt/releases/tag/0.6"&gt;files-to-prompt 0.6&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New release of my CLI tool for turning a whole directory of code into a single prompt ready to pipe or paste into an LLM.&lt;/p&gt;
&lt;p&gt;Here are the full release notes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;ul&gt;&lt;li&gt;New &lt;code&gt;-m/--markdown&lt;/code&gt; option for outputting results as Markdown with each file in a fenced code block. &lt;a href="https://github.com/simonw/files-to-prompt/issues/42"&gt;#42&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Support for reading a list of files from standard input. Thanks, &lt;a href="https://github.com/thelastnode"&gt;Ankit Shankar&lt;/a&gt;. &lt;a href="https://github.com/simonw/files-to-prompt/issues/44"&gt;#44&lt;/a&gt;&lt;br&gt;
  Here's how to process just files modified within the last day:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;find . -mtime -1 | files-to-prompt
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can also use the &lt;code&gt;-0/--null&lt;/code&gt; flag to accept lists of file paths separated by null delimiters, which is useful for handling file names with spaces in them:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;find . -name "*.txt" -print0 | files-to-prompt -0
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;I also have a potential fix for a reported bug concerning nested &lt;code&gt;.gitignore&lt;/code&gt; files that's currently &lt;a href="https://github.com/simonw/files-to-prompt/pull/45"&gt;sitting in a PR&lt;/a&gt;. I'm waiting for someone else to confirm that it behaves as they would expect. I've left &lt;a href="https://github.com/simonw/files-to-prompt/issues/40#issuecomment-2667571418"&gt;details in this issue comment&lt;/a&gt;, but the short version is that you can try out the version from the PR using this &lt;code&gt;uvx&lt;/code&gt; incantation:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx --with git+https://github.com/simonw/files-to-prompt@nested-gitignore files-to-prompt
&lt;/code&gt;&lt;/pre&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/files-to-prompt"&gt;files-to-prompt&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="annotated-release-notes"/><category term="llms"/><category term="uv"/><category term="files-to-prompt"/></entry><entry><title>LLM 0.22, the annotated release notes</title><link href="https://simonwillison.net/2025/Feb/17/llm/#atom-tag" rel="alternate"/><published>2025-02-17T06:19:00+00:00</published><updated>2025-02-17T06:19:00+00:00</updated><id>https://simonwillison.net/2025/Feb/17/llm/#atom-tag</id><summary type="html">
    &lt;p&gt;I released &lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-22"&gt;LLM 0.22&lt;/a&gt; this evening. Here are the &lt;a href="https://simonwillison.net/tags/annotated-release-notes/"&gt;annotated release notes&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="#model-prompt-key-for-api-keys"&gt;model.prompt(..., key=) for API keys&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="#chatgpt-4o-latest"&gt;chatgpt-4o-latest&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="#llm-logs-s-short"&gt;llm logs -s/--short&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="#llm-models-q-gemini-q-exp"&gt;llm models -q gemini -q exp&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="#llm-embed-multi-prepend-x"&gt;llm embed-multi --prepend X&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="#everything-else"&gt;Everything else&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="model-prompt-key-for-api-keys"&gt;model.prompt(..., key=) for API keys&lt;/h4&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Plugins that provide models that use API keys can now subclass the new &lt;code&gt;llm.KeyModel&lt;/code&gt; and &lt;code&gt;llm.AsyncKeyModel&lt;/code&gt; classes. This results in the API key being passed as a new &lt;code&gt;key&lt;/code&gt; parameter to their &lt;code&gt;.execute()&lt;/code&gt; methods, and means that Python users can pass a key as the &lt;code&gt;model.prompt(..., key=)&lt;/code&gt; - see &lt;a href="https://llm.datasette.io/en/stable/python-api.html#python-api-models-api-keys"&gt;Passing an API key&lt;/a&gt;. Plugin developers should consult the new documentation on writing &lt;a href="https://llm.datasette.io/en/stable/plugins/advanced-model-plugins.html#advanced-model-plugins-api-keys"&gt;Models that accept API keys&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/issues/744"&gt;#744&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the big change. It's only relevant to you if you use LLM as a Python library &lt;em&gt;and&lt;/em&gt; you need to pass in API keys for OpenAI, Anthropic, Gemini etc. yourself in Python code rather than setting them as environment variables.&lt;/p&gt;
&lt;p&gt;It turns out I need to do that for Datasette Cloud, where API keys are retrieved from individual customers' secret stores!&lt;/p&gt;
&lt;p&gt;Thanks to this change, it's now possible to do things like this - the &lt;code&gt;key=&lt;/code&gt; parameter to &lt;code&gt;model.prompt()&lt;/code&gt; is new:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;
&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-4o-mini"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Surprise me!"&lt;/span&gt;, &lt;span class="pl-s1"&gt;key&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"my-api-key"&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;text&lt;/span&gt;())&lt;/pre&gt;
&lt;p&gt;Other plugins need to be updated to take advantage of this new feature. Here's &lt;a href="https://llm.datasette.io/en/stable/plugins/advanced-model-plugins.html#models-that-accept-api-keys"&gt;the documentation for plugin developers&lt;/a&gt; - I've released &lt;a href="https://github.com/simonw/llm-anthropic/releases/tag/0.13"&gt;llm-anthropic 0.13&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-gemini/releases/tag/0.11"&gt;llm-gemini 0.11&lt;/a&gt; implementing the new pattern.&lt;/p&gt;
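The plugin-side mechanics are covered by the linked documentation; as a rough self-contained illustration of the pattern (these are toy classes and hypothetical names, not the real `llm.KeyModel` API), the idea is that a key resolved from the `key=` argument, falling back to an environment variable, gets handed straight through to `.execute()`:

```python
# Simplified sketch of the "key model" pattern - not the real llm classes.
# The explicit key= argument wins; otherwise an environment variable is used.
import os


class ToyKeyModel:
    needs_key = "example"           # hypothetical key name
    key_env_var = "EXAMPLE_API_KEY"  # hypothetical environment variable

    def prompt(self, text, key=None):
        # Resolve the key: explicit argument first, then the environment
        resolved = key or os.environ.get(self.key_env_var)
        if resolved is None:
            raise ValueError("No API key provided")
        return self.execute(text, key=resolved)

    def execute(self, text, key):
        # A real plugin would call the provider's API here using `key`
        return f"echo({text}) using key ending ...{key[-4:]}"


model = ToyKeyModel()
print(model.prompt("Surprise me!", key="my-api-key"))
```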
&lt;h4 id="chatgpt-4o-latest"&gt;chatgpt-4o-latest&lt;/h4&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New OpenAI model: &lt;code&gt;chatgpt-4o-latest&lt;/code&gt;. This model ID accesses the current model being used to power ChatGPT, which can change without warning. &lt;a href="https://github.com/simonw/llm/issues/752"&gt;#752&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This model has actually been around since &lt;a href="https://twitter.com/openaidevs/status/1823510395619000525"&gt;August 2024&lt;/a&gt; but I had somehow missed it. &lt;code&gt;chatgpt-4o-latest&lt;/code&gt; is a model alias that provides access to the current model that is being used for GPT-4o running on ChatGPT, which is &lt;em&gt;not&lt;/em&gt; the same as the GPT-4o models usually available via the API. It got &lt;a href="https://twitter.com/edwinarbus/status/1890841371675619728"&gt;an upgrade&lt;/a&gt; last week so it's currently the alias that provides access to the most recently released OpenAI model.&lt;/p&gt;
&lt;p&gt;Most OpenAI models such as &lt;code&gt;gpt-4o&lt;/code&gt; provide stable date-based aliases like &lt;code&gt;gpt-4o-2024-08-06&lt;/code&gt; which effectively let you "pin" to that exact model version. OpenAI technical staff &lt;a href="https://twitter.com/zedlander/status/1890937885848715443"&gt;have confirmed&lt;/a&gt; that they don't change the model without updating that name.&lt;/p&gt;
&lt;p&gt;The one exception is &lt;code&gt;chatgpt-4o-latest&lt;/code&gt; - that one can change without warning and doesn't appear to have release notes at all.&lt;/p&gt;
&lt;p&gt;It's also a little more expensive than &lt;code&gt;gpt-4o&lt;/code&gt; - currently priced at $5/million tokens for input and $15/million for output, compared to GPT-4o's $2.50/$10.&lt;/p&gt;
&lt;p&gt;It's a fun model to play with though! As of last week it appears to be very chatty and keen on &lt;a href="https://github.com/simonw/llm/issues/752#issuecomment-2661184024"&gt;using emoji&lt;/a&gt;. It also claims that it has a July 2024 training cut-off.&lt;/p&gt;
&lt;h4 id="llm-logs-s-short"&gt;llm logs -s/--short&lt;/h4&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New &lt;code&gt;llm logs -s/--short&lt;/code&gt; flag, which returns a greatly shortened version of the matching log entries in YAML format with a truncated prompt and without including the response. &lt;a href="https://github.com/simonw/llm/issues/737"&gt;#737&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;code&gt;llm logs&lt;/code&gt; command lets you search through logged prompt-response pairs - I have 4,419 of them in my database, according to this command:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils tables &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;llm logs path&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; --counts  &lt;span class="pl-k"&gt;|&lt;/span&gt; grep responses&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;By default it outputs the full prompts and responses as Markdown - and since I've started leaning more into long context models (&lt;a href="https://simonwillison.net/2025/Feb/14/files-to-prompt/"&gt;some recent examples&lt;/a&gt;) my logs have been getting pretty hard to navigate.&lt;/p&gt;
&lt;p&gt;The new &lt;code&gt;-s/--short&lt;/code&gt; flag provides a much more concise YAML format. Here are some of my recent prompts that I've run using Google's Gemini 2.0 Pro experimental model - the &lt;code&gt;-u&lt;/code&gt; flag includes usage statistics, and &lt;code&gt;-n 4&lt;/code&gt; limits the output to the most recent 4 entries:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm logs --short -m gemini-2.0-pro-exp-02-05 -u -n 4&lt;/pre&gt;&lt;/div&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;gemini-2.0-pro-exp-02-05&lt;/span&gt;
  &lt;span class="pl-ent"&gt;datetime&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;2025-02-13T22:30:48&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;conversation&lt;/span&gt;: &lt;span class="pl-s"&gt;01jm0q045fqp5xy5pn4j1bfbxs&lt;/span&gt;
  &lt;span class="pl-ent"&gt;prompt&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;documents&amp;gt; &amp;lt;document index="1"&amp;gt; &amp;lt;source&amp;gt;./index.md&amp;lt;/source&amp;gt; &amp;lt;document_content&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    # uv An extremely fast Python package...&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;usage&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;input&lt;/span&gt;: &lt;span class="pl-c1"&gt;281812&lt;/span&gt;
    &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-c1"&gt;1521&lt;/span&gt;
- &lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;gemini-2.0-pro-exp-02-05&lt;/span&gt;
  &lt;span class="pl-ent"&gt;datetime&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;2025-02-13T22:32:29&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;conversation&lt;/span&gt;: &lt;span class="pl-s"&gt;01jm0q045fqp5xy5pn4j1bfbxs&lt;/span&gt;
  &lt;span class="pl-ent"&gt;prompt&lt;/span&gt;: &lt;span class="pl-s"&gt;I want to set it globally so if I run uv run python anywhere on my computer&lt;/span&gt;
    &lt;span class="pl-s"&gt;I always get 3.13&lt;/span&gt;
  &lt;span class="pl-ent"&gt;usage&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;input&lt;/span&gt;: &lt;span class="pl-c1"&gt;283369&lt;/span&gt;
    &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-c1"&gt;1540&lt;/span&gt;
- &lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;gemini-2.0-pro-exp-02-05&lt;/span&gt;
  &lt;span class="pl-ent"&gt;datetime&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;2025-02-14T23:23:57&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;conversation&lt;/span&gt;: &lt;span class="pl-s"&gt;01jm3cek8eb4z8tkqhf4trk98b&lt;/span&gt;
  &lt;span class="pl-ent"&gt;prompt&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;documents&amp;gt; &amp;lt;document index="1"&amp;gt; &amp;lt;source&amp;gt;./LORA.md&amp;lt;/source&amp;gt; &amp;lt;document_content&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    # Fine-Tuning with LoRA or QLoRA You c...&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;usage&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;input&lt;/span&gt;: &lt;span class="pl-c1"&gt;162885&lt;/span&gt;
    &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-c1"&gt;2558&lt;/span&gt;
- &lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;gemini-2.0-pro-exp-02-05&lt;/span&gt;
  &lt;span class="pl-ent"&gt;datetime&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;2025-02-14T23:30:13&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;conversation&lt;/span&gt;: &lt;span class="pl-s"&gt;01jm3csstrfygp35rk0y1w3rfc&lt;/span&gt;
  &lt;span class="pl-ent"&gt;prompt&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;documents&amp;gt; &amp;lt;document index="1"&amp;gt; &amp;lt;source&amp;gt;huggingface_hub/__init__.py&amp;lt;/source&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    &amp;lt;document_content&amp;gt; # Copyright 2020 The...&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-ent"&gt;usage&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;input&lt;/span&gt;: &lt;span class="pl-c1"&gt;480216&lt;/span&gt;
    &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-c1"&gt;1791&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h4 id="llm-models-q-gemini-q-exp"&gt;llm models -q gemini -q exp&lt;/h4&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Both &lt;code&gt;llm models&lt;/code&gt; and &lt;code&gt;llm embed-models&lt;/code&gt; now take multiple &lt;code&gt;-q&lt;/code&gt; search fragments. You can now search for all models matching "gemini" and "exp" using &lt;code&gt;llm models -q gemini -q exp&lt;/code&gt;. &lt;a href="https://github.com/simonw/llm/issues/748"&gt;#748&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I have over 100 models installed in LLM now across a bunch of different plugins. I added the &lt;code&gt;-q&lt;/code&gt; option to help search through them a few months ago, and now I've upgraded it so you can pass it multiple times.&lt;/p&gt;
&lt;p&gt;Want to see all the Gemini experimental models?&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm models -q gemini -q exp&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Outputs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;GeminiPro: gemini-exp-1114
GeminiPro: gemini-exp-1121
GeminiPro: gemini-exp-1206
GeminiPro: gemini-2.0-flash-exp
GeminiPro: learnlm-1.5-pro-experimental
GeminiPro: gemini-2.0-flash-thinking-exp-1219
GeminiPro: gemini-2.0-flash-thinking-exp-01-21
GeminiPro: gemini-2.0-pro-exp-02-05 (aliases: g2)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For consistency I added the same options to the &lt;code&gt;llm embed-models&lt;/code&gt; command, which lists available &lt;a href="https://llm.datasette.io/en/stable/embeddings/cli.html"&gt;embedding models&lt;/a&gt;.&lt;/p&gt;
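The multi-fragment search behaves like a case-insensitive AND across fragments. Here's a sketch of that filtering logic (an illustration of the behavior, not LLM's actual implementation; the model names come from the listing above):

```python
def match(name, fragments):
    # A model matches only if every -q fragment appears in its name,
    # case-insensitively
    lowered = name.lower()
    return all(f.lower() in lowered for f in fragments)


models = [
    "gemini-exp-1206",
    "gemini-2.0-flash-exp",
    "gemini-2.0-pro-exp-02-05",
    "gpt-4o-mini",
]
matched = [m for m in models if match(m, ["gemini", "exp"])]
print(matched)  # the gpt-4o-mini entry is filtered out
```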
&lt;h4 id="llm-embed-multi-prepend-x"&gt;llm embed-multi --prepend X&lt;/h4&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New &lt;code&gt;llm embed-multi --prepend X&lt;/code&gt; option for prepending a string to each value before it is embedded - useful for models such as &lt;a href="https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe"&gt;nomic-embed-text-v2-moe&lt;/a&gt; that require passages to start with a string like &lt;code&gt;"search_document: "&lt;/code&gt;. &lt;a href="https://github.com/simonw/llm/issues/745"&gt;#745&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This was inspired by my initial experiments with &lt;a href="https://simonwillison.net/2025/Feb/12/nomic-embed-text-v2/"&gt;Nomic Embed Text V2 last week&lt;/a&gt;.&lt;/p&gt;
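Conceptually, &lt;code&gt;--prepend&lt;/code&gt; just concatenates the string onto each value before it reaches the embedding model - the stored value itself is unchanged. A minimal sketch of that idea (the embedding function here is a stand-in, not the real API):

```python
def embed_with_prepend(values, embed_fn, prepend=""):
    # Mirrors the idea behind llm embed-multi --prepend: the prefix is
    # added to the text that gets embedded, not to the stored value
    return [embed_fn(prepend + value) for value in values]


# Stand-in embedding function: just echoes the text it was asked to embed
fake_embed = lambda text: f"<vector for {text!r}>"

results = embed_with_prepend(
    ["first passage", "second passage"],
    fake_embed,
    prepend="search_document: ",
)
print(results[0])
```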
&lt;h4 id="everything-else"&gt;Everything else&lt;/h4&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;response.json()&lt;/code&gt; and &lt;code&gt;response.usage()&lt;/code&gt; methods are &lt;a href="https://llm.datasette.io/en/stable/python-api.html#python-api-underlying-json"&gt;now documented&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Someone asked a question about these methods online, which made me realize they weren't documented. I enjoy promptly turning questions like this into documentation!&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Fixed a bug where conversations that were loaded from the database could not be continued using &lt;code&gt;asyncio&lt;/code&gt; prompts. &lt;a href="https://github.com/simonw/llm/issues/742"&gt;#742&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This bug was reported by Romain Gehrig. It turned out not to be possible to execute a follow-up prompt in async mode if the previous conversation had been loaded from the database.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;% llm 'hi' --async
Hello! How can I assist you today?
% llm 'now in french' --async -c
Error: 'async for' requires an object with __aiter__ method, got Response
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I fixed the bug for the moment, but I'd like to make the whole mechanism of persisting and loading conversations from SQLite part of the documented and supported Python API - it's currently tucked away in CLI-specific internals which aren't safe for people to use in their own code.&lt;/p&gt;
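That error message comes from Python itself: &lt;code&gt;async for&lt;/code&gt; requires the target object to implement &lt;code&gt;__aiter__&lt;/code&gt;. A self-contained sketch of the failure mode (illustrating the underlying mechanics, not LLM's internals):

```python
import asyncio


class SyncResponse:
    # No __aiter__ - like the loaded-from-database Response in the bug
    chunks = ["Hello", "!"]


class AsyncResponse:
    chunks = ["Bonjour", "!"]

    def __aiter__(self):
        self._iter = iter(self.chunks)
        return self

    async def __anext__(self):
        try:
            return next(self._iter)
        except StopIteration:
            raise StopAsyncIteration


async def consume(response):
    parts = []
    async for chunk in response:  # raises TypeError without __aiter__
        parts.append(chunk)
    return "".join(parts)


async def main():
    print(await consume(AsyncResponse()))  # works
    try:
        await consume(SyncResponse())
    except TypeError as e:
        print(e)  # mirrors the error shown above


asyncio.run(main())
```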
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New plugin for macOS users: &lt;a href="https://github.com/simonw/llm-mlx"&gt;llm-mlx&lt;/a&gt;, which provides &lt;a href="https://simonwillison.net/2025/Feb/15/llm-mlx/"&gt;extremely high performance access&lt;/a&gt; to a wide range of local models using Apple's MLX framework.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Technically not part of the LLM 0.22 release, but I like using the release notes to help highlight significant new plugins, and &lt;strong&gt;llm-mlx&lt;/strong&gt; is fast becoming my new favorite way to run models on my own machine.&lt;/p&gt;


&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;llm-claude-3&lt;/code&gt; plugin has been renamed to &lt;a href="https://github.com/simonw/llm-anthropic"&gt;llm-anthropic&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I wrote about this previously when I &lt;a href="https://simonwillison.net/2025/Feb/2/llm-anthropic/"&gt;announced llm-anthropic&lt;/a&gt;. The new name prepares me for a world in which Anthropic release models that aren't called Claude 3 or Claude 3.5!&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="projects"/><category term="ai"/><category term="annotated-release-notes"/><category term="openai"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="llm"/><category term="anthropic"/><category term="gemini"/></entry><entry><title>shot-scraper 1.6 with support for HTTP Archives</title><link href="https://simonwillison.net/2025/Feb/13/shot-scraper/#atom-tag" rel="alternate"/><published>2025-02-13T21:02:37+00:00</published><updated>2025-02-13T21:02:37+00:00</updated><id>https://simonwillison.net/2025/Feb/13/shot-scraper/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/shot-scraper/releases/tag/1.6"&gt;shot-scraper 1.6 with support for HTTP Archives&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New release of my &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; CLI tool for taking screenshots and scraping web pages.&lt;/p&gt;
&lt;p&gt;The big new feature is &lt;a href="https://en.wikipedia.org/wiki/HAR_(file_format)"&gt;HTTP Archive (HAR)&lt;/a&gt; support. The new &lt;a href="https://shot-scraper.datasette.io/en/stable/har.html"&gt;shot-scraper har command&lt;/a&gt; can now create an archive of a page and all of its dependents like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper har https://datasette.io/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This produces a &lt;code&gt;datasette-io.har&lt;/code&gt; file (currently 163KB) which is JSON representing the full set of requests used to render that page. Here's &lt;a href="https://gist.github.com/simonw/b1fdf434e460814efdb89c95c354f794"&gt;a copy of that file&lt;/a&gt;. You can visualize that &lt;a href="https://ericduran.github.io/chromeHAR/?url=https://gist.githubusercontent.com/simonw/b1fdf434e460814efdb89c95c354f794/raw/924c1eb12b940ff02cefa2cc068f23c9d3cc5895/datasette.har.json"&gt;here using ericduran.github.io/chromeHAR&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="The HAR viewer shows a line for each of the loaded resources, with options to view timing information" src="https://static.simonwillison.net/static/2025/har-viewer.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;That JSON includes full copies of all of the responses, base64 encoded if they are binary files such as images.&lt;/p&gt;
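Since HAR is plain JSON, pulling the response bodies back out is straightforward. Here's a sketch that decodes the base64-encoded entries (the structure follows the HAR format; the sample data is made up):

```python
import base64
import json

# A made-up, minimal HAR document - real files written by shot-scraper har
# carry many more fields per entry
har_json = json.dumps({
    "log": {"entries": [
        {"request": {"url": "https://datasette.io/"},
         "response": {"content": {"mimeType": "text/html",
                                  "text": "<h1>hi</h1>"}}},
        {"request": {"url": "https://datasette.io/logo.png"},
         "response": {"content": {"mimeType": "image/png",
                                  "encoding": "base64",
                                  "text": base64.b64encode(b"\x89PNG...").decode()}}},
    ]}
})

bodies = []
for entry in json.loads(har_json)["log"]["entries"]:
    content = entry["response"]["content"]
    body = content.get("text", "")
    if content.get("encoding") == "base64":
        body = base64.b64decode(body)  # binary responses come back as bytes
    bodies.append((entry["request"]["url"], body))

print(bodies[1])
```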
&lt;p&gt;You can add the &lt;code&gt;--zip&lt;/code&gt; flag to instead get a &lt;code&gt;datasette-io.har.zip&lt;/code&gt; file, containing JSON data in &lt;code&gt;har.har&lt;/code&gt; but with the response bodies saved as separate files in that archive.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;shot-scraper multi&lt;/code&gt; command lets you run &lt;code&gt;shot-scraper&lt;/code&gt; against multiple URLs in sequence, specified using a YAML file. That command now takes a &lt;code&gt;--har&lt;/code&gt; option (or &lt;code&gt;--har-zip&lt;/code&gt; or &lt;code&gt;--har-file name-of-file&lt;/code&gt;), &lt;a href="https://shot-scraper.datasette.io/en/stable/multi.html#recording-to-an-http-archive"&gt;described in the documentation&lt;/a&gt;, which will produce a HAR at the same time as taking the screenshots.&lt;/p&gt;
&lt;p&gt;Shots are usually defined in YAML that looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-s"&gt;example.com.png&lt;/span&gt;
  &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;http://www.example.com/&lt;/span&gt;
- &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-s"&gt;w3c.org.png&lt;/span&gt;
  &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;https://www.w3.org/&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can now omit the &lt;code&gt;output:&lt;/code&gt; keys and generate a HAR file without taking any screenshots at all:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;http://www.example.com/&lt;/span&gt;
- &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;https://www.w3.org/&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Run like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper multi shots.yml --har
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which outputs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Skipping screenshot of 'https://www.example.com/'
Skipping screenshot of 'https://www.w3.org/'
Wrote to HAR file: trace.har
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;shot-scraper&lt;/code&gt; is built on top of Playwright, and the new features use the &lt;a href="https://playwright.dev/python/docs/next/api/class-browser#browser-new-context-option-record-har-path"&gt;browser.new_context(record_har_path=...)&lt;/a&gt; parameter.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scraping"&gt;scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="python"/><category term="scraping"/><category term="playwright"/><category term="shot-scraper"/></entry><entry><title>LLM 0.20</title><link href="https://simonwillison.net/2025/Jan/23/llm-020/#atom-tag" rel="alternate"/><published>2025-01-23T04:55:16+00:00</published><updated>2025-01-23T04:55:16+00:00</updated><id>https://simonwillison.net/2025/Jan/23/llm-020/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm/releases/tag/0.20"&gt;LLM 0.20&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New release of my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; CLI tool and Python library. A bunch of accumulated fixes and features since the start of December, most notably:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Support for OpenAI's &lt;a href="https://platform.openai.com/docs/models#o1"&gt;o1 model&lt;/a&gt; - a significant upgrade from &lt;code&gt;o1-preview&lt;/code&gt; given its 200,000 input and 100,000 output tokens (&lt;code&gt;o1-preview&lt;/code&gt; was 128,000/32,768). &lt;a href="https://github.com/simonw/llm/issues/676"&gt;#676&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Support for the &lt;code&gt;gpt-4o-audio-preview&lt;/code&gt; and &lt;code&gt;gpt-4o-mini-audio-preview&lt;/code&gt; models, which can accept audio input: &lt;code&gt;llm -m gpt-4o-audio-preview -a https://static.simonwillison.net/static/2024/pelican-joke-request.mp3&lt;/code&gt; &lt;a href="https://github.com/simonw/llm/issues/677"&gt;#677&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A new &lt;code&gt;llm -x/--extract&lt;/code&gt; option which extracts and returns the contents of the first fenced code block in the response. This is useful for prompts that generate code. &lt;a href="https://github.com/simonw/llm/issues/681"&gt;#681&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A new &lt;code&gt;llm models -q 'search'&lt;/code&gt; option for searching available models - useful if you've installed a lot of plugins. Searches are case insensitive. &lt;a href="https://github.com/simonw/llm/issues/700"&gt;#700&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
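The &lt;code&gt;-x/--extract&lt;/code&gt; behavior - grab the contents of the first fenced code block in a response - can be sketched with a regular expression. This is an illustration of the idea, not LLM's actual implementation:

```python
import re


def extract_first_fenced_block(text):
    # Capture the body of the first ``` fenced block; the language tag
    # after the opening fence is optional
    match = re.search(r"```[^\n]*\n(.*?)```", text, re.DOTALL)
    return match.group(1) if match else None


fence = "`" * 3  # avoids nesting literal fences in this example
response = f"Here you go:\n\n{fence}python\nprint('hello')\n{fence}\n"
print(extract_first_fenced_block(response))
```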


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/o1"&gt;o1&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="ai"/><category term="annotated-release-notes"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="o1"/></entry><entry><title>Building Python tools with a one-shot prompt using uv run and Claude Projects</title><link href="https://simonwillison.net/2024/Dec/19/one-shot-python-tools/#atom-tag" rel="alternate"/><published>2024-12-19T07:00:37+00:00</published><updated>2024-12-19T07:00:37+00:00</updated><id>https://simonwillison.net/2024/Dec/19/one-shot-python-tools/#atom-tag</id><summary type="html">
    &lt;p&gt;I've written a lot about how I've been using Claude to build one-shot HTML+JavaScript applications &lt;a href="https://simonwillison.net/tags/claude-artifacts/"&gt;via Claude Artifacts&lt;/a&gt;. I recently started using a similar pattern to create one-shot Python utilities, using a custom Claude Project combined with the dependency management capabilities of &lt;a href="https://github.com/astral-sh/uv"&gt;uv&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;(In LLM jargon a "one-shot" prompt is a prompt that produces the complete desired result on the first attempt. Confusingly it also sometimes means a prompt that includes a single example of the desired output format. Here I'm using the first of those two definitions.)&lt;/p&gt;
&lt;p&gt;I'll start with an example of a tool I built that way.&lt;/p&gt;
&lt;p&gt;I had another round of battle with Amazon S3 today trying to figure out why a file in one of my buckets couldn't be accessed via a public URL.&lt;/p&gt;
&lt;p&gt;Out of frustration I prompted Claude with a variant of the following (&lt;a href="https://gist.github.com/simonw/9f69cf35889b0445b80eeed691d44504"&gt;full transcript here&lt;/a&gt;):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;I can't access the file at EXAMPLE_S3_URL. Write me a Python CLI tool using Click and boto3 which takes a URL of that form and then uses EVERY single boto3 trick in the book to try and debug why the file is returning a 404&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It wrote me &lt;a href="https://github.com/simonw/tools/blob/main/python/debug_s3_access.py"&gt;this script&lt;/a&gt;, which gave me exactly what I needed. I ran it like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv run debug_s3_access.py \
  https://test-public-bucket-simonw.s3.us-east-1.amazonaws.com/0f550b7b28264d7ea2b3d360e3381a95.jpg&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/debug-s3.jpg" alt="Terminal screenshot showing S3 access analysis results. Command: '$ uv run http://tools.simonwillison.net/python/debug_s3_access.py url-to-image' followed by detailed output showing bucket exists (Yes), region (default), key exists (Yes), bucket policy (AllowAllGetObject), bucket owner (swillison), versioning (Not enabled), content type (image/jpeg), size (71683 bytes), last modified (2024-12-19 03:43:30+00:00) and public access settings (all False)" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;You can &lt;a href="https://github.com/simonw/tools/tree/main/python#debug_s3_accesspy"&gt;see the text output here&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="inline-dependencies-and-uv-run"&gt;Inline dependencies and uv run&lt;/h4&gt;
&lt;p&gt;Crucially, I didn't have to take any extra steps to install any of the dependencies that the script needed. That's because the script starts with this magic comment:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-c"&gt;# /// script&lt;/span&gt;
&lt;span class="pl-c"&gt;# requires-python = "&amp;gt;=3.12"&lt;/span&gt;
&lt;span class="pl-c"&gt;# dependencies = [&lt;/span&gt;
&lt;span class="pl-c"&gt;#     "click",&lt;/span&gt;
&lt;span class="pl-c"&gt;#     "boto3",&lt;/span&gt;
&lt;span class="pl-c"&gt;#     "urllib3",&lt;/span&gt;
&lt;span class="pl-c"&gt;#     "rich",&lt;/span&gt;
&lt;span class="pl-c"&gt;# ]&lt;/span&gt;
&lt;span class="pl-c"&gt;# ///&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;This is an example of &lt;a href="https://docs.astral.sh/uv/guides/scripts/#declaring-script-dependencies"&gt;inline script dependencies&lt;/a&gt;, a feature described in &lt;a href="https://peps.python.org/pep-0723/"&gt;PEP 723&lt;/a&gt; and implemented by &lt;code&gt;uv run&lt;/code&gt;. Running the script causes &lt;code&gt;uv&lt;/code&gt; to create a temporary virtual environment with those dependencies installed, a process that takes just a few milliseconds once the &lt;code&gt;uv&lt;/code&gt; cache has been populated.&lt;/p&gt;
&lt;p&gt;This even works if the script is specified by a URL! Anyone with &lt;code&gt;uv&lt;/code&gt; installed can run the following command (provided you trust me not to have replaced the script with something malicious) to debug one of their own S3 buckets:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv run http://tools.simonwillison.net/python/debug_s3_access.py \
  https://test-public-bucket-simonw.s3.us-east-1.amazonaws.com/0f550b7b28264d7ea2b3d360e3381a95.jpg&lt;/pre&gt;&lt;/div&gt;
&lt;h4 id="writing-these-with-the-help-of-a-claude-project"&gt;Writing these with the help of a Claude Project&lt;/h4&gt;
&lt;p&gt;The reason I can one-shot scripts like this now is that I've set up a &lt;a href="https://www.anthropic.com/news/projects"&gt;Claude Project&lt;/a&gt; called "Python app". Projects can have custom instructions, and I used those to "teach" Claude how to take advantage of inline script dependencies:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You write Python tools as single files. They always start with this comment:&lt;/p&gt;
&lt;pre&gt;&lt;span&gt;# /// script&lt;/span&gt;
&lt;span&gt;# requires-python = "&amp;gt;=3.12"&lt;/span&gt;
&lt;span&gt;# ///&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;These files can include dependencies on libraries such as Click. If they do, those dependencies are included in a list like this one in that same comment (here showing two dependencies):&lt;/p&gt;
&lt;pre&gt;&lt;span&gt;# /// script&lt;/span&gt;
&lt;span&gt;# requires-python = "&amp;gt;=3.12"&lt;/span&gt;
&lt;span&gt;# dependencies = [&lt;/span&gt;
&lt;span&gt;#     "click",&lt;/span&gt;
&lt;span&gt;#     "sqlite-utils",&lt;/span&gt;
&lt;span&gt;# ]&lt;/span&gt;
&lt;span&gt;# ///&lt;/span&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;That's everything Claude needs to reliably knock out full-featured Python tools as single scripts which can be run directly using whatever dependencies Claude chose to include.&lt;/p&gt;
&lt;p&gt;I didn't suggest that Claude use &lt;a href="https://github.com/Textualize/rich"&gt;rich&lt;/a&gt; for the &lt;code&gt;debug_s3_access.py&lt;/code&gt; script earlier but it decided to use it anyway!&lt;/p&gt;
&lt;p&gt;I've only recently started experimenting with this pattern but it seems to work &lt;em&gt;really&lt;/em&gt; well. Here's another example - my prompt was:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Starlette web app that provides an API where you pass in ?url= and it strips all HTML tags and returns just the text, using beautifulsoup&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/08957a1490ebde1ea38b4a8374989cf8"&gt;the chat transcript&lt;/a&gt; and &lt;a href="https://gist.githubusercontent.com/simonw/08957a1490ebde1ea38b4a8374989cf8/raw/143ee24dc65ca109b094b72e8b8c494369e763d6/strip_html.py"&gt;the raw code it produced&lt;/a&gt;. You can run that server directly on your machine (it uses port 8000) like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv run https://gist.githubusercontent.com/simonw/08957a1490ebde1ea38b4a8374989cf8/raw/143ee24dc65ca109b094b72e8b8c494369e763d6/strip_html.py&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then visit &lt;code&gt;http://127.0.0.1:8000/?url=https://simonwillison.net/&lt;/code&gt; to see it in action.&lt;/p&gt;
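The generated app uses BeautifulSoup for the tag stripping; the core idea can also be sketched with the standard library's &lt;code&gt;html.parser&lt;/code&gt; (this is an illustration of the technique, not the code Claude produced):

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, ignoring tags (and script/style bodies)."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_tags(html):
    parser = TextExtractor()
    parser.feed(html)
    return "".join(parser.parts)


print(strip_tags("<p>Hello <b>world</b></p><script>alert(1)</script>"))
```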
&lt;h4 id="custom-instructions"&gt;Custom instructions&lt;/h4&gt;
&lt;p&gt;The pattern here that's most interesting to me is using custom instructions or system prompts to show LLMs how to implement new patterns that may not exist in their training data. &lt;code&gt;uv run&lt;/code&gt; is less than a year old, but providing just a short example is enough to get the models to write code that takes advantage of its capabilities.&lt;/p&gt;
&lt;p&gt;I have a similar set of custom instructions I use for creating single page HTML and JavaScript tools, again running in a Claude Project:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Never use React in artifacts - always plain HTML and vanilla JavaScript and CSS with minimal dependencies.&lt;/p&gt;
&lt;p&gt;CSS should be indented with two spaces and should start like this:&lt;/p&gt;
&lt;div class="highlight highlight-text-html-basic"&gt;&lt;pre&gt;&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;style&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
* {
  box-sizing: border-box;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Inputs and textareas should be font size 16px. Font should always prefer Helvetica.&lt;/p&gt;
&lt;p&gt;JavaScript should be two space indents and start like this:&lt;/p&gt;
&lt;div class="highlight highlight-text-html-basic"&gt;&lt;pre&gt;&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;script&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;="&lt;span class="pl-s"&gt;module&lt;/span&gt;"&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
// code in here should not be indented at the first level&lt;/pre&gt;&lt;/div&gt;
&lt;/blockquote&gt;
&lt;p&gt;Most of the tools on my &lt;a href="https://tools.simonwillison.net/"&gt;tools.simonwillison.net&lt;/a&gt; site were created using versions of this custom instructions prompt.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/aws"&gt;aws&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/s3"&gt;s3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-artifacts"&gt;claude-artifacts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rich"&gt;rich&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-to-app"&gt;prompt-to-app&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/starlette"&gt;starlette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="aws"/><category term="cli"/><category term="python"/><category term="s3"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude"/><category term="claude-artifacts"/><category term="uv"/><category term="rich"/><category term="prompt-to-app"/><category term="starlette"/></entry><entry><title>"Rules" that terminal programs follow</title><link href="https://simonwillison.net/2024/Dec/12/rules-that-terminal-programs-follow/#atom-tag" rel="alternate"/><published>2024-12-12T20:37:07+00:00</published><updated>2024-12-12T20:37:07+00:00</updated><id>https://simonwillison.net/2024/Dec/12/rules-that-terminal-programs-follow/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://jvns.ca/blog/2024/11/26/terminal-rules/"&gt;&amp;quot;Rules&amp;quot; that terminal programs follow&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Julia Evans wrote down the unwritten rules of terminal programs. Lots of details in here I hadn’t fully understood before, like REPL programs that exit only if you hit Ctrl+D on an empty line.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/julia-evans"&gt;julia-evans&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="julia-evans"/></entry><entry><title>LLM 0.19</title><link href="https://simonwillison.net/2024/Dec/1/llm-019/#atom-tag" rel="alternate"/><published>2024-12-01T23:59:45+00:00</published><updated>2024-12-01T23:59:45+00:00</updated><id>https://simonwillison.net/2024/Dec/1/llm-019/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-19"&gt;LLM 0.19&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I just released version 0.19 of &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt;, my Python library and CLI utility for working with Large Language Models.&lt;/p&gt;
&lt;p&gt;I released 0.18 &lt;a href="https://simonwillison.net/2024/Nov/17/llm-018/"&gt;a couple of weeks ago&lt;/a&gt; adding support for calling models from Python &lt;code&gt;asyncio&lt;/code&gt; code. 0.19 improves on that, and also adds a new mechanism for models to report their token usage.&lt;/p&gt;
&lt;p&gt;LLM can log those usage numbers to a SQLite database, or make them available to custom Python code.&lt;/p&gt;
&lt;p&gt;My eventual goal with these features is to implement token accounting as a Datasette plugin so I can offer AI features in my SaaS platform without worrying about customers spending unlimited LLM tokens.&lt;/p&gt;
&lt;p&gt;Those 0.19 release notes in full:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Tokens used by a response are now logged to new &lt;code&gt;input_tokens&lt;/code&gt; and &lt;code&gt;output_tokens&lt;/code&gt; integer columns and a &lt;code&gt;token_details&lt;/code&gt; JSON string column, for the default OpenAI models and models from other plugins that &lt;a href="https://llm.datasette.io/en/stable/plugins/advanced-model-plugins.html#advanced-model-plugins-usage"&gt;implement this feature&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/issues/610"&gt;#610&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;llm prompt&lt;/code&gt; now takes a &lt;code&gt;-u/--usage&lt;/code&gt; flag to display token usage at the end of the response.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;llm logs -u/--usage&lt;/code&gt; shows token usage information for logged responses.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;llm prompt ... --async&lt;/code&gt; responses are now logged to the database. &lt;a href="https://github.com/simonw/llm/issues/641"&gt;#641&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;llm.get_models()&lt;/code&gt; and &lt;code&gt;llm.get_async_models()&lt;/code&gt; functions, &lt;a href="https://llm.datasette.io/en/stable/python-api.html#python-api-listing-models"&gt;documented here&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/issues/640"&gt;#640&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;response.usage()&lt;/code&gt; and async response &lt;code&gt;await response.usage()&lt;/code&gt; methods, returning a &lt;code&gt;Usage(input=2, output=1, details=None)&lt;/code&gt; dataclass. &lt;a href="https://github.com/simonw/llm/issues/644"&gt;#644&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;response.on_done(callback)&lt;/code&gt; and &lt;code&gt;await response.on_done(callback)&lt;/code&gt; methods for specifying a callback to be executed when a response has completed, &lt;a href="https://llm.datasette.io/en/stable/python-api.html#python-api-response-on-done"&gt;documented here&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/issues/653"&gt;#653&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix for bug running &lt;code&gt;llm chat&lt;/code&gt; on Windows 11. Thanks, &lt;a href="https://github.com/sukhbinder"&gt;Sukhbinder Singh&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/issues/495"&gt;#495&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
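&lt;p&gt;Since those token counts end up as plain columns in the SQLite log, aggregating them is a one-line query. A sketch - the &lt;code&gt;responses&lt;/code&gt; table name is my assumption here; only the &lt;code&gt;input_tokens&lt;/code&gt; and &lt;code&gt;output_tokens&lt;/code&gt; columns are confirmed by the release notes above:&lt;/p&gt;

```python
import sqlite3


def token_totals(db_path: str) -> tuple:
    """Sum logged token counts across all responses.

    Assumes a `responses` table with the new input_tokens and
    output_tokens columns described in the 0.19 release notes.
    """
    db = sqlite3.connect(db_path)
    try:
        return db.execute(
            "SELECT SUM(input_tokens), SUM(output_tokens) FROM responses"
        ).fetchone()
    finally:
        db.close()
```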
&lt;p&gt;I also released three new plugin versions that add support for the new usage tracking feature: &lt;a href="https://github.com/simonw/llm-gemini/releases/tag/0.5"&gt;llm-gemini 0.5&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-claude-3/releases/tag/0.10"&gt;llm-claude-3 0.10&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-mistral/releases/tag/0.9"&gt;llm-mistral 0.9&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/releasenotes"&gt;releasenotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/releases"&gt;releases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="releasenotes"/><category term="releases"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/></entry><entry><title>Ask questions of SQLite databases and CSV/JSON files in your terminal</title><link href="https://simonwillison.net/2024/Nov/25/ask-questions-of-sqlite/#atom-tag" rel="alternate"/><published>2024-11-25T01:33:03+00:00</published><updated>2024-11-25T01:33:03+00:00</updated><id>https://simonwillison.net/2024/Nov/25/ask-questions-of-sqlite/#atom-tag</id><summary type="html">
    &lt;p&gt;I built a new plugin for my &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html"&gt;sqlite-utils CLI tool&lt;/a&gt; that lets you ask human-language questions directly of SQLite databases and CSV/JSON files on your computer.&lt;/p&gt;
&lt;p&gt;It's called &lt;a href="https://github.com/simonw/sqlite-utils-ask"&gt;sqlite-utils-ask&lt;/a&gt;. Here's how you install it:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install sqlite-utils-ask&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It picks up API keys from an &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; environment variable, or you can &lt;a href="https://llm.datasette.io/"&gt;install LLM&lt;/a&gt; and use &lt;a href="https://llm.datasette.io/en/stable/setup.html#saving-and-using-stored-keys"&gt;llm keys set openai&lt;/a&gt; to store a key in a configuration file.&lt;/p&gt;
&lt;p&gt;Then you can use it like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl -O https://datasette.io/content.db
sqlite-utils ask content.db &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;how many sqlite-utils pypi downloads in 2024?&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This command will extract the SQL schema for the provided database file, send that through an LLM along with your question, get back a SQL query and attempt to run it to derive a result.&lt;/p&gt;
&lt;p&gt;If all goes well it spits out an answer something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;SELECT SUM(downloads)
FROM stats
WHERE package = 'sqlite-utils' AND date &amp;gt;= '2024-01-01' AND date &amp;lt; '2025-01-01';

[
    {
        "SUM(downloads)": 4300221
    }
]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the SQL query fails to execute (due to a syntax error of some kind) it passes that error back to the model for corrections and retries up to three times before giving up.&lt;/p&gt;
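&lt;p&gt;That schema-prompt-execute-retry loop is simple enough to sketch in a few lines. This is an illustration of the idea, not the plugin's actual code - the &lt;code&gt;ask()&lt;/code&gt; callable stands in for the LLM call, and the real implementation also parses the SQL out of a fenced code block in the model's response:&lt;/p&gt;

```python
import sqlite3


def run_with_retries(db: sqlite3.Connection, question: str, ask,
                     max_attempts: int = 3):
    """Sketch of the schema -> SQL -> execute -> retry loop.

    `ask(prompt)` stands in for the call to the language model and
    should return a SQL string.
    """
    schema = "\n".join(
        sql for (sql,) in db.execute(
            "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL"
        )
    )
    prompt = f"{schema}\n{question}"
    for _ in range(max_attempts):
        candidate_sql = ask(prompt)
        try:
            return db.execute(candidate_sql).fetchall()
        except sqlite3.Error as ex:
            # Pass the failing query and its error back for a correction
            prompt = f"{schema}\n{question}\nThis failed: {candidate_sql}\nError: {ex}"
    raise RuntimeError(f"No working query after {max_attempts} attempts")
```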
&lt;p&gt;Add &lt;code&gt;-v/--verbose&lt;/code&gt; to see the exact prompt it's using:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;System prompt:
You will be given a SQLite schema followed by a question. Generate a single SQL
query to answer that question. Return that query in a ```sql ... ```
fenced code block.

Example: How many repos are there?
Answer:
```sql
select count(*) from repos
```

Prompt:
...
CREATE TABLE [stats] (
   [package] TEXT,
   [date] TEXT,
   [downloads] INTEGER,
   PRIMARY KEY ([package], [date])
);
...
how many sqlite-utils pypi downloads in 2024?
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I've truncated the above to just the relevant table - it actually includes the full schema of every table in that database.&lt;/p&gt;
&lt;p&gt;By default, the tool sends just that database schema and your question to the LLM. If you add the &lt;code&gt;-e/--examples&lt;/code&gt; option it will also include five common values for each of the text columns in that schema with an average length less than 32 characters. This can sometimes help get a better result - for example, sending the values "CA", "FL" and "TX" for a &lt;code&gt;state&lt;/code&gt; column can tip the model off that it should use state abbreviations rather than full names in its queries.&lt;/p&gt;
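&lt;p&gt;The examples-gathering step can be sketched like this - an illustrative reimplementation of the idea, not the plugin's own code:&lt;/p&gt;

```python
import sqlite3


def example_values(db: sqlite3.Connection, table: str,
                   limit: int = 5, max_avg_len: int = 32) -> dict:
    """Collect the most common values for each short text column.

    Columns whose values average under max_avg_len characters get
    their top `limit` values included alongside the schema.
    """
    examples = {}
    for _, name, coltype, *_ in db.execute(f"PRAGMA table_info([{table}])"):
        if coltype.upper() != "TEXT":
            continue
        avg = db.execute(
            f"SELECT AVG(LENGTH([{name}])) FROM [{table}]"
        ).fetchone()[0]
        if avg is None or avg >= max_avg_len:
            continue
        rows = db.execute(
            f"SELECT [{name}] FROM [{table}] GROUP BY [{name}] "
            f"ORDER BY COUNT(*) DESC, [{name}] LIMIT ?",
            (limit,),
        ).fetchall()
        examples[name] = [value for (value,) in rows]
    return examples
```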
&lt;h4 id="ask-files"&gt;Asking questions of CSV and JSON data&lt;/h4&gt;
&lt;p&gt;The core &lt;code&gt;sqlite-utils&lt;/code&gt; CLI usually works against SQLite files directly, but three years ago I added the ability to run SQL queries against CSV and JSON files directly with the &lt;a href="https://simonwillison.net/2021/Jun/19/sqlite-utils-memory/"&gt;sqlite-utils memory&lt;/a&gt; command. This works by loading that data into an in-memory SQLite database before executing a SQL query.&lt;/p&gt;
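&lt;p&gt;The load-into-memory step works something like this - a stdlib-only sketch of the idea, not the actual sqlite-utils code (which also detects numeric column types; here the columns are left untyped, so values stay as the strings the CSV parser produced):&lt;/p&gt;

```python
import csv
import io
import sqlite3


def load_csv_into_memory(table: str, text: str) -> sqlite3.Connection:
    """Load CSV text into a table in an in-memory SQLite database,
    so plain SQL (or an LLM-generated query) can run against it."""
    rows = list(csv.reader(io.StringIO(text)))
    headers, data = rows[0], rows[1:]
    db = sqlite3.connect(":memory:")
    columns = ", ".join(f"[{h}]" for h in headers)
    db.execute(f"CREATE TABLE [{table}] ({columns})")
    placeholders = ", ".join("?" for _ in headers)
    db.executemany(f"INSERT INTO [{table}] VALUES ({placeholders})", data)
    return db
```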
&lt;p&gt;I decided to reuse that mechanism to enable LLM prompts against CSV and JSON data directly as well.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;sqlite-utils ask-files&lt;/code&gt; command looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils ask-files transactions.csv &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;total sales by year&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This command accepts one or more files, and you can provide a mix of CSV, TSV and JSON. Each provided file will be imported into a different table, allowing the model to construct join queries where necessary.&lt;/p&gt;
&lt;h4 id="implementation-notes"&gt;Implementation notes&lt;/h4&gt;
&lt;p&gt;The core of the plugin is implemented as around &lt;a href="https://github.com/simonw/sqlite-utils-ask/blob/0.2/sqlite_utils_ask.py"&gt;250 lines of Python&lt;/a&gt;, using the &lt;code&gt;sqlite-utils&lt;/code&gt; &lt;a href="https://sqlite-utils.datasette.io/en/stable/plugins.html#register-commands-cli"&gt;register_commands()&lt;/a&gt; plugin hook to add the &lt;code&gt;ask&lt;/code&gt; and &lt;code&gt;ask-files&lt;/code&gt; commands.&lt;/p&gt;
&lt;p&gt;It adds &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; as a dependency, and takes advantage of LLM's &lt;a href="https://llm.datasette.io/en/stable/python-api.html"&gt;Python API&lt;/a&gt; to abstract over the details of talking to the models. This means &lt;code&gt;sqlite-utils-ask&lt;/code&gt; can use any of the models supported by LLM or its plugins - if you want to run your prompt through Claude 3.5 Sonnet you can do this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install llm-claude-3
sqlite-utils ask content.db &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;count rows in news table&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; -m claude-3.5-sonnet&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The plugin defaults to &lt;a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/"&gt;gpt-4o-mini&lt;/a&gt;, which I initially picked to take advantage of that model's automatic prompt caching: if you run multiple questions against the same schema you'll send the same lengthy prompt prefix each time, and OpenAI's prompt caching should automatically kick in and provide a 50% discount on those input tokens.&lt;/p&gt;
&lt;p&gt;Then I ran the actual numbers and found that &lt;code&gt;gpt-4o-mini&lt;/code&gt; is cheap enough that, even without caching, a 4,000 token prompt (that's a pretty large SQL schema) should cost less than a tenth of a cent. So those caching savings aren't worth anything at all!&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="plugins"/><category term="projects"/><category term="sqlite"/><category term="ai"/><category term="sqlite-utils"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/></entry><entry><title>Docling</title><link href="https://simonwillison.net/2024/Nov/3/docling/#atom-tag" rel="alternate"/><published>2024-11-03T04:57:56+00:00</published><updated>2024-11-03T04:57:56+00:00</updated><id>https://simonwillison.net/2024/Nov/3/docling/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://ds4sd.github.io/docling/"&gt;Docling&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;MIT licensed document extraction Python library from the Deep Search team at IBM, who released &lt;a href="https://ds4sd.github.io/docling/v2/#changes-in-docling-v2"&gt;Docling v2&lt;/a&gt; on October 16th.&lt;/p&gt;
&lt;p&gt;Here's the &lt;a href="https://arxiv.org/abs/2408.09869"&gt;Docling Technical Report&lt;/a&gt; paper from August, which provides details of two custom models: a layout analysis model for figuring out the structure of the document (sections, figures, text, tables etc) and a TableFormer model specifically for extracting structured data from tables.&lt;/p&gt;
&lt;p&gt;Those models are &lt;a href="https://huggingface.co/ds4sd/docling-models"&gt;available on Hugging Face&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here's how to try out the Docling CLI interface using &lt;code&gt;uvx&lt;/code&gt; (avoiding the need to install it first - though since it downloads models it will take a while to run the first time):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx docling mydoc.pdf --to json --to md
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will output a &lt;code&gt;mydoc.json&lt;/code&gt; file with complex layout information and a &lt;code&gt;mydoc.md&lt;/code&gt; Markdown file which includes Markdown tables where appropriate.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://ds4sd.github.io/docling/usage/"&gt;Python API&lt;/a&gt; is a lot more comprehensive. It can even extract tables &lt;a href="https://ds4sd.github.io/docling/examples/export_tables/"&gt;as Pandas DataFrames&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;docling&lt;/span&gt;.&lt;span class="pl-s1"&gt;document_converter&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;DocumentConverter&lt;/span&gt;
&lt;span class="pl-s1"&gt;converter&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;DocumentConverter&lt;/span&gt;()
&lt;span class="pl-s1"&gt;result&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;converter&lt;/span&gt;.&lt;span class="pl-en"&gt;convert&lt;/span&gt;(&lt;span class="pl-s"&gt;"document.pdf"&lt;/span&gt;)
&lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;table&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;result&lt;/span&gt;.&lt;span class="pl-s1"&gt;document&lt;/span&gt;.&lt;span class="pl-s1"&gt;tables&lt;/span&gt;:
    &lt;span class="pl-s1"&gt;df&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;table&lt;/span&gt;.&lt;span class="pl-en"&gt;export_to_dataframe&lt;/span&gt;()
    &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;df&lt;/span&gt;)&lt;/pre&gt;

&lt;p&gt;I ran that inside &lt;code&gt;uv run --with docling python&lt;/code&gt;. It took a little while to run, but it demonstrated that the library works.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ibm"&gt;ibm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ocr"&gt;ocr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pdf"&gt;pdf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hugging-face"&gt;hugging-face&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="ibm"/><category term="ocr"/><category term="pdf"/><category term="python"/><category term="ai"/><category term="hugging-face"/><category term="uv"/></entry><entry><title>You can now run prompts against images, audio and video in your terminal using LLM</title><link href="https://simonwillison.net/2024/Oct/29/llm-multi-modal/#atom-tag" rel="alternate"/><published>2024-10-29T15:09:38+00:00</published><updated>2024-10-29T15:09:38+00:00</updated><id>https://simonwillison.net/2024/Oct/29/llm-multi-modal/#atom-tag</id><summary type="html">
    &lt;p&gt;I released &lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-17"&gt;LLM 0.17&lt;/a&gt; last night, the latest version of my combined CLI tool and Python library for interacting with hundreds of different Large Language Models such as GPT-4o, Llama, Claude and Gemini.&lt;/p&gt;
&lt;p&gt;The signature feature of 0.17 is that LLM can now be used to prompt &lt;strong&gt;multi-modal models&lt;/strong&gt; - which means you can now use it to send images, audio and video files to LLMs that can handle them.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Oct/29/llm-multi-modal/#processing-an-image-with-gpt-4o-mini"&gt;Processing an image with gpt-4o-mini&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Oct/29/llm-multi-modal/#using-a-plugin-to-run-audio-and-video-against-gemini"&gt;Using a plugin to run audio and video against Gemini&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Oct/29/llm-multi-modal/#there-s-a-python-api-too"&gt;There's a Python API too&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Oct/29/llm-multi-modal/#what-can-we-do-with-this-"&gt;What can we do with this?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="processing-an-image-with-gpt-4o-mini"&gt;Processing an image with gpt-4o-mini&lt;/h4&gt;
&lt;p&gt;Here's an example. First, &lt;a href="https://llm.datasette.io/en/stable/setup.html"&gt;install LLM&lt;/a&gt; - using &lt;code&gt;brew install llm&lt;/code&gt; or &lt;code&gt;pipx install llm&lt;/code&gt; or &lt;code&gt;uv tool install llm&lt;/code&gt;, pick your favourite. If you have it installed already you may need to upgrade to 0.17, e.g. with &lt;code&gt;brew upgrade llm&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Obtain &lt;a href="https://platform.openai.com/api-keys"&gt;an OpenAI key&lt;/a&gt; (or an alternative, see below) and provide it to the tool:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm keys &lt;span class="pl-c1"&gt;set&lt;/span&gt; openai
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; paste key here&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And now you can start running prompts against images.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe this image&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  -a https://static.simonwillison.net/static/2024/pelican.jpg&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;-a&lt;/code&gt; option stands for &lt;code&gt;--attachment&lt;/code&gt;. Attachments can be specified as URLs, as paths to files on disk or as &lt;code&gt;-&lt;/code&gt; to read from data piped into the tool.&lt;/p&gt;
&lt;p&gt;The above example uses the default model, &lt;code&gt;gpt-4o-mini&lt;/code&gt;. I got back this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The image features a brown pelican standing on rocky terrain near a body of water. The pelican has a distinct coloration, with dark feathers on its body and a lighter-colored head. Its long bill is characteristic of the species, and it appears to be looking out towards the water. In the background, there are boats, suggesting a marina or coastal area. The lighting indicates it may be a sunny day, enhancing the scene's natural beauty.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's that image:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/pelican.jpg" alt="A photograph of a fine looking pelican in the marina" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;You can run &lt;code&gt;llm logs --json -c&lt;/code&gt; for a hint of how much that cost:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;      &lt;span class="pl-ent"&gt;"usage"&lt;/span&gt;: {
        &lt;span class="pl-ent"&gt;"completion_tokens"&lt;/span&gt;: &lt;span class="pl-c1"&gt;89&lt;/span&gt;,
        &lt;span class="pl-ent"&gt;"prompt_tokens"&lt;/span&gt;: &lt;span class="pl-c1"&gt;14177&lt;/span&gt;,
        &lt;span class="pl-ent"&gt;"total_tokens"&lt;/span&gt;: &lt;span class="pl-c1"&gt;14266&lt;/span&gt;,&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Using &lt;a href="https://tools.simonwillison.net/llm-prices"&gt;my LLM pricing calculator&lt;/a&gt; that came to 0.218 cents - less than a quarter of a cent.&lt;/p&gt;
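&lt;p&gt;The arithmetic is easy to check yourself, assuming gpt-4o-mini's pricing of $0.150 per million input tokens and $0.600 per million output tokens:&lt;/p&gt;

```python
def cost_in_cents(input_tokens: int, output_tokens: int,
                  input_per_m: float, output_per_m: float) -> float:
    """Cost in cents given per-million-token dollar rates."""
    dollars = (input_tokens / 1_000_000) * input_per_m + (
        output_tokens / 1_000_000
    ) * output_per_m
    return dollars * 100


# 14,177 prompt tokens and 89 completion tokens at the assumed rates
print(round(cost_in_cents(14177, 89, 0.150, 0.600), 3))  # 0.218
```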
&lt;p&gt;Let's run that again with &lt;code&gt;gpt-4o&lt;/code&gt;. Add &lt;code&gt;-m gpt-4o&lt;/code&gt; to specify the model:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe this image&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  -a https://static.simonwillison.net/static/2024/pelican.jpg \
  -m gpt-4o&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;The image shows a pelican standing on rocks near a body of water. The bird has a large, long bill and predominantly gray feathers with a lighter head and neck. In the background, there is a docked boat, giving the impression of a marina or harbor setting. The lighting suggests it might be sunny, highlighting the pelican's features.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That time it cost 435 prompt tokens (GPT-4o mini charges more tokens per image than GPT-4o) and the total was 0.1787 cents.&lt;/p&gt;
&lt;h4 id="using-a-plugin-to-run-audio-and-video-against-gemini"&gt;Using a plugin to run audio and video against Gemini&lt;/h4&gt;
&lt;p&gt;Models in LLM are defined by &lt;a href="https://llm.datasette.io/en/stable/plugins/index.html"&gt;plugins&lt;/a&gt;. The application ships with a &lt;a href="https://github.com/simonw/llm/blob/0.17/llm/default_plugins/openai_models.py"&gt;default OpenAI plugin&lt;/a&gt; to get people started, but there are dozens of &lt;a href="https://llm.datasette.io/en/stable/plugins/directory.html"&gt;other plugins&lt;/a&gt; providing access to different models, including models that can run directly on your own device.&lt;/p&gt;
&lt;p&gt;Plugins need to be upgraded to add support for multi-modal input - here's &lt;a href="https://llm.datasette.io/en/stable/plugins/advanced-model-plugins.html"&gt;documentation on how to do that&lt;/a&gt;. I've shipped three plugins with support for multi-modal attachments so far: &lt;a href="https://github.com/simonw/llm-gemini"&gt;llm-gemini&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-claude-3"&gt;llm-claude-3&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-mistral"&gt;llm-mistral&lt;/a&gt; (for Pixtral).&lt;/p&gt;
&lt;p&gt;So far these are all remote API plugins. It's definitely possible to build a plugin that runs attachments through local models but I haven't got one of those into good enough condition to release just yet.&lt;/p&gt;
&lt;p&gt;The Google Gemini series are my favourite multi-modal models right now due to the size and breadth of content they support. Gemini models can handle images, audio &lt;em&gt;and&lt;/em&gt; video!&lt;/p&gt;
&lt;p&gt;Let's try that out. Start by installing &lt;code&gt;llm-gemini&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-gemini&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Obtain a &lt;a href="https://aistudio.google.com/app/apikey"&gt;Gemini API key&lt;/a&gt;. These include a &lt;em&gt;free tier&lt;/em&gt;, so you can get started without needing to spend any money. Paste that in here:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm keys &lt;span class="pl-c1"&gt;set&lt;/span&gt; gemini
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; paste key here&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The three Gemini 1.5 models are called Pro, Flash and Flash-8B. Let's try it with Pro:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe this image&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  -a https://static.simonwillison.net/static/2024/pelican.jpg \
  -m gemini-1.5-pro-latest&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;A brown pelican stands on a rocky surface, likely a jetty or breakwater, with blurred boats in the background. The pelican is facing right, and its long beak curves downwards. Its plumage is primarily grayish-brown, with lighter feathers on its neck and breast. [...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/simonw/2f7ae62f37d99cf9588a6c36ba318be6"&gt;Very detailed&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;But let's do something a bit more interesting. I shared a 7m40s MP3 of a &lt;a href="https://simonwillison.net/2024/Oct/17/notebooklm-pelicans/"&gt;NotebookLM podcast&lt;/a&gt; a few weeks ago. Let's use Flash-8B - the cheapest Gemini model - to try and obtain a transcript.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;transcript&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  -a https://static.simonwillison.net/static/2024/video-scraping-pelicans.mp3 \
  -m gemini-1.5-flash-8b-latest&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It worked!&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Hey everyone, welcome back. You ever find yourself wading through mountains of data, trying to pluck out the juicy bits? It's like hunting for a single shrimp in a whole kelp forest, am I right? Oh, tell me about it. I swear, sometimes I feel like I'm gonna go cross-eyed from staring at spreadsheets all day. [...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/simonw/ab05cf3464534a3442e771148defa8e1"&gt;Full output here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Once again, &lt;code&gt;llm logs -c --json&lt;/code&gt; will show us the tokens used. Here it's 14754 prompt tokens and 1865 completion tokens. The pricing calculator says that adds up to... 0.0833 cents. Less than a tenth of a cent to transcribe a 7m40s audio clip.&lt;/p&gt;
&lt;h4 id="there-s-a-python-api-too"&gt;There's a Python API too&lt;/h4&gt;
&lt;p&gt;Here's what it looks like to execute multi-modal prompts with attachments using the &lt;a href="https://llm.datasette.io/en/stable/python-api.html"&gt;LLM Python library&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;

&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-en"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-4o-mini"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-en"&gt;prompt&lt;/span&gt;(
    &lt;span class="pl-s"&gt;"Describe these images"&lt;/span&gt;,
    &lt;span class="pl-s1"&gt;attachments&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[
        &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-v"&gt;Attachment&lt;/span&gt;(&lt;span class="pl-s1"&gt;path&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"pelican.jpg"&lt;/span&gt;),
        &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-v"&gt;Attachment&lt;/span&gt;(
            &lt;span class="pl-s1"&gt;url&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"https://static.simonwillison.net/static/2024/pelicans.jpg"&lt;/span&gt;
        ),
    ]
)&lt;/pre&gt;
&lt;p&gt;You can send multiple attachments with a single prompt, and both file paths and URLs are supported - or even binary content, using &lt;code&gt;llm.Attachment(content=b'binary goes here')&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Any model plugin becomes available to Python with the same interface, making this LLM library a useful abstraction layer to try out the same prompts against many different models, both local and remote.&lt;/p&gt;
&lt;h4 id="what-can-we-do-with-this-"&gt;What can we do with this?&lt;/h4&gt;
&lt;p&gt;I've only had this working for a couple of days and the potential applications are somewhat dizzying. It's trivial to spin up a Bash script that can do things like generate &lt;code&gt;alt=&lt;/code&gt; text for every image in a directory, for example. Here's one &lt;a href="https://gist.github.com/simonw/a26046b0f9d74c46ee5af6eb47b73db9"&gt;Claude wrote just now&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#!&lt;/span&gt;/bin/bash&lt;/span&gt;
&lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-smi"&gt;img&lt;/span&gt; &lt;span class="pl-k"&gt;in&lt;/span&gt; &lt;span class="pl-k"&gt;*&lt;/span&gt;.{jpg,jpeg}&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-k"&gt;do&lt;/span&gt;
    &lt;span class="pl-k"&gt;if&lt;/span&gt; [ &lt;span class="pl-k"&gt;-f&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$img&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; ]&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-k"&gt;then&lt;/span&gt;
        output=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;${img&lt;span class="pl-k"&gt;%&lt;/span&gt;.&lt;span class="pl-k"&gt;*&lt;/span&gt;}&lt;/span&gt;.txt&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
        llm -m gpt-4o-mini &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;return just the alt text for this image&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -a &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$img&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$output&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    &lt;span class="pl-k"&gt;fi&lt;/span&gt;
&lt;span class="pl-k"&gt;done&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;On the &lt;a href="https://datasette.io/discord-llm"&gt;#llm Discord channel&lt;/a&gt; Drew Breunig suggested this one-liner:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm prompt -m gpt-4o &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;tell me if it's foggy in this image, reply on a scale from&lt;/span&gt;
&lt;span class="pl-s"&gt;1-10 with 10 being so foggy you can't see anything and 1&lt;/span&gt;
&lt;span class="pl-s"&gt;being clear enough to see the hills in the distance.&lt;/span&gt;
&lt;span class="pl-s"&gt;Only respond with a single number.&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -a https://cameras.alertcalifornia.org/public-camera-data/Axis-Purisma1/latest-frame.jpg&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That URL is to &lt;a href="https://cameras.alertcalifornia.org/public-camera-data/Axis-Purisma1/latest-frame.jpg"&gt;a live webcam feed&lt;/a&gt;, so here's an instant GPT-4o vision powered weather report!&lt;/p&gt;
&lt;p&gt;We can have &lt;em&gt;so much fun&lt;/em&gt; with this stuff.&lt;/p&gt;
&lt;p&gt;All of the usual AI caveats apply: it can make mistakes, it can hallucinate, safety filters may kick in and refuse to transcribe audio based on the content. A &lt;em&gt;lot&lt;/em&gt; of work is needed to evaluate how well the models perform at different tasks. There's a lot still to explore here.&lt;/p&gt;
&lt;p&gt;But at 1/10th of a cent for 7 minutes of audio, at least those explorations can be plentiful and inexpensive!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update 12th November 2024&lt;/strong&gt;: If you want to try running prompts against images using a local model that runs on your own machine you can now do so using &lt;a href="https://simonwillison.net/2024/Nov/13/ollama-llama-vision/"&gt;Ollama, llm-ollama and Llama 3.2 Vision&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mistral"&gt;mistral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vision-llms"&gt;vision-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-pricing"&gt;llm-pricing&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="projects"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="llm"/><category term="anthropic"/><category term="claude"/><category term="mistral"/><category term="gemini"/><category term="vision-llms"/><category term="llm-pricing"/></entry><entry><title>python-imgcat</title><link href="https://simonwillison.net/2024/Oct/28/python-imgcat/#atom-tag" rel="alternate"/><published>2024-10-28T05:13:30+00:00</published><updated>2024-10-28T05:13:30+00:00</updated><id>https://simonwillison.net/2024/Oct/28/python-imgcat/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/wookayin/python-imgcat"&gt;python-imgcat&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I was &lt;a href="https://github.com/simonw/llm/issues/587#issuecomment-2440549543"&gt;investigating options&lt;/a&gt; for displaying images in a terminal window (for multi-modal logging output of &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt;) and I found this neat Python library for displaying images using iTerm 2.&lt;/p&gt;
&lt;p&gt;It includes a CLI tool, which means you can run it without installation using &lt;code&gt;uvx&lt;/code&gt; like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx imgcat filename.png
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img alt="Screenshot of an iTerm2 terminal window. I have run uvx imgcat output_4.png and an image is shown below that in the terminal of a slide from a FEMA deck about Tropical Storm Ian." src="https://static.simonwillison.net/static/2024/imgcat.jpg" /&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://github.com/Textualize/rich/discussions/384#discussioncomment-9821180"&gt;rich/discussions&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="python"/><category term="llm"/><category term="uv"/></entry><entry><title>Run a prompt to generate and execute jq programs using llm-jq</title><link href="https://simonwillison.net/2024/Oct/27/llm-jq/#atom-tag" rel="alternate"/><published>2024-10-27T04:26:36+00:00</published><updated>2024-10-27T04:26:36+00:00</updated><id>https://simonwillison.net/2024/Oct/27/llm-jq/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://github.com/simonw/llm-jq"&gt;llm-jq&lt;/a&gt; is a brand new plugin for &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; which lets you pipe JSON directly into the &lt;code&gt;llm jq&lt;/code&gt; command along with a human-language description of how you'd like to manipulate that JSON and have a &lt;a href="https://jqlang.github.io/jq/"&gt;jq&lt;/a&gt; program generated and executed for you on the fly.&lt;/p&gt;

&lt;p&gt;Thomas Ptacek &lt;a href="https://twitter.com/tqbf/status/1850350668965359801"&gt;on Twitter&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The JQ CLI should just BE a ChatGPT client, so there's no pretense of actually understanding this syntax. Cut out the middleman, just look up what I'm trying to do, for me.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I &lt;a href="https://xkcd.com/356/"&gt;couldn't resist&lt;/a&gt; writing a plugin. Here's an example of &lt;code&gt;llm-jq&lt;/code&gt; in action:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre class="notranslate"&gt;llm install llm-jq
curl -s https://api.github.com/repos/simonw/datasette/issues &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  llm jq &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;count by user login, top 3&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This outputs the following:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;[
  {
    &lt;span class="pl-ent"&gt;"login"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;simonw&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;11&lt;/span&gt;
  },
  {
    &lt;span class="pl-ent"&gt;"login"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;king7532&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;5&lt;/span&gt;
  },
  {
    &lt;span class="pl-ent"&gt;"login"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;dependabot[bot]&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;2&lt;/span&gt;
  }
]
&lt;span style="color: blue"&gt;group_by(.user.login) | map({login: .[0].user.login, count: length}) | sort_by(-.count) | .[0:3]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The JSON result is sent to standard output, the &lt;code&gt;jq&lt;/code&gt; program it generated and executed is sent to standard error. Add the &lt;code&gt;-s/--silent&lt;/code&gt; option to tell it not to output the program, or the &lt;code&gt;-v/--verbose&lt;/code&gt; option for verbose output that shows the prompt it sent to the LLM as well.&lt;/p&gt;
&lt;p&gt;Under the hood it passes the first 1024 bytes of the JSON piped to it, plus the program description "count by user login, top 3", to the default LLM model (usually &lt;code&gt;gpt-4o-mini&lt;/code&gt; unless you set another with e.g. &lt;code&gt;llm models default claude-3.5-sonnet&lt;/code&gt;) along with a system prompt. It then runs &lt;code&gt;jq&lt;/code&gt; in a subprocess and pipes in the full JSON that was passed to it.&lt;/p&gt;
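The aggregation that generated jq program performs is straightforward to sanity-check in plain Python. A rough equivalent, using hypothetical sample records in place of the real GitHub API response:

```python
from collections import Counter

# Hypothetical sample of GitHub issue records - the real input is the
# full JSON piped into `llm jq`
issues = [
    {"user": {"login": "simonw"}},
    {"user": {"login": "simonw"}},
    {"user": {"login": "king7532"}},
]

# Equivalent of:
# group_by(.user.login) | map({login: .[0].user.login, count: length})
#   | sort_by(-.count) | .[0:3]
counts = Counter(issue["user"]["login"] for issue in issues)
top3 = [{"login": login, "count": n} for login, n in counts.most_common(3)]
print(top3)
```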
&lt;p&gt;Here's the system prompt it uses, adapted from my &lt;a href="https://github.com/simonw/llm-cmd"&gt;llm-cmd plugin&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code class="notranslate"&gt;Based on the example JSON snippet and the desired query, write a jq program&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code class="notranslate"&gt;Return only the jq program to be executed as a raw string, no string delimiters wrapping it, no yapping, no markdown, no fenced code blocks, what you return will be passed to subprocess.check_output('jq', [...]) directly. For example, if the user asks: extract the name of the first person You return only: .people[0].name&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I &lt;a href="https://gist.github.com/simonw/484d878877f53537f38e48a7a3845df2"&gt;used Claude&lt;/a&gt; to figure out how to pipe content from the parent process to the child and detect and return the correct exit code.&lt;/p&gt;
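That plumbing boils down to a pattern like this sketch - not the plugin's actual code, and using `cat` as a stand-in child process so it runs anywhere, where the real plugin invokes `jq` with the generated program:

```python
import subprocess
import sys

def pipe_to_child(argv, stdin_text):
    # Send the full JSON to the child's stdin, relay both of its output
    # streams, and surface its exit code so the parent can exit with it
    proc = subprocess.run(argv, input=stdin_text, capture_output=True, text=True)
    sys.stdout.write(proc.stdout)
    sys.stderr.write(proc.stderr)
    return proc.returncode

exit_code = pipe_to_child(["cat"], '{"ok": true}')
```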

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/llm-jq-card.jpg" alt="Example terminal screenshot of llm jq with the verbose option." style="max-width: 100%" /&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/thomas-ptacek"&gt;thomas-ptacek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jq"&gt;jq&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="plugins"/><category term="projects"/><category term="thomas-ptacek"/><category term="ai"/><category term="jq"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/></entry><entry><title>TIL: Using uv to develop Python command-line applications</title><link href="https://simonwillison.net/2024/Oct/24/uv-cli/#atom-tag" rel="alternate"/><published>2024-10-24T05:56:21+00:00</published><updated>2024-10-24T05:56:21+00:00</updated><id>https://simonwillison.net/2024/Oct/24/uv-cli/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/python/uv-cli-apps"&gt;TIL: Using uv to develop Python command-line applications&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I've been increasingly using &lt;a href="https://docs.astral.sh/uv/"&gt;uv&lt;/a&gt; to try out new software (via &lt;code&gt;uvx&lt;/code&gt;) and experiment with new ideas, but I hadn't quite figured out the right way to use it for developing my own projects.&lt;/p&gt;
&lt;p&gt;It turns out I was missing a few things - in particular that there's no need to use &lt;code&gt;uv pip&lt;/code&gt; at all when working with a local development environment: you can get by entirely on &lt;code&gt;uv run&lt;/code&gt; (plus maybe &lt;code&gt;uv sync --extra test&lt;/code&gt; to install test dependencies).&lt;/p&gt;
&lt;p&gt;I bounced &lt;a href="https://gist.github.com/simonw/975dfa41e9b03bca2513a986d9aa3dcf"&gt;a few questions&lt;/a&gt; off Charlie Marsh and filled in the missing gaps - this TIL shows my new uv-powered process for hacking on Python CLI apps built using Click and my &lt;a href="https://github.com/simonw/click-app"&gt;simonw/click-app&lt;/a&gt; cookiecutter template.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/packaging"&gt;packaging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pip"&gt;pip&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/til"&gt;til&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cookiecutter"&gt;cookiecutter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/astral"&gt;astral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/charlie-marsh"&gt;charlie-marsh&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="packaging"/><category term="pip"/><category term="python"/><category term="til"/><category term="cookiecutter"/><category term="uv"/><category term="astral"/><category term="charlie-marsh"/></entry><entry><title>files-to-prompt 0.3</title><link href="https://simonwillison.net/2024/Sep/9/files-to-prompt-03/#atom-tag" rel="alternate"/><published>2024-09-09T05:57:35+00:00</published><updated>2024-09-09T05:57:35+00:00</updated><id>https://simonwillison.net/2024/Sep/9/files-to-prompt-03/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/files-to-prompt/releases/tag/0.3"&gt;files-to-prompt 0.3&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New version of my &lt;code&gt;files-to-prompt&lt;/code&gt; CLI tool for turning a bunch of files into a prompt suitable for piping to an LLM, &lt;a href="https://simonwillison.net/2024/Apr/8/files-to-prompt/"&gt;described here previously&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It now has a &lt;code&gt;-c/--cxml&lt;/code&gt; flag for outputting the files in Claude XML-ish notation (XML-ish because it's not actually valid XML) using the format Anthropic describe as &lt;a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips#essential-tips-for-long-context-prompts"&gt;recommended for long context&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;files-to-prompt llm-*/README.md --cxml | llm -m claude-3.5-sonnet \
  --system 'return an HTML page about these plugins with usage examples' \
  &amp;gt; /tmp/fancy.html
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;a href="https://static.simonwillison.net/static/2024/llm-cxml-demo.html"&gt;Here's what that gave me&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The format itself looks something like this:&lt;/p&gt;
&lt;div class="highlight highlight-text-xml"&gt;&lt;pre&gt;&amp;lt;&lt;span class="pl-ent"&gt;documents&lt;/span&gt;&amp;gt;
&amp;lt;&lt;span class="pl-ent"&gt;document&lt;/span&gt; &lt;span class="pl-e"&gt;index&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;1&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&amp;gt;
&amp;lt;&lt;span class="pl-ent"&gt;source&lt;/span&gt;&amp;gt;llm-anyscale-endpoints/README.md&amp;lt;/&lt;span class="pl-ent"&gt;source&lt;/span&gt;&amp;gt;
&amp;lt;&lt;span class="pl-ent"&gt;document_content&lt;/span&gt;&amp;gt;
# llm-anyscale-endpoints
...
&amp;lt;/&lt;span class="pl-ent"&gt;document_content&lt;/span&gt;&amp;gt;
&amp;lt;/&lt;span class="pl-ent"&gt;document&lt;/span&gt;&amp;gt;
&amp;lt;/&lt;span class="pl-ent"&gt;documents&lt;/span&gt;&amp;gt;&lt;/pre&gt;&lt;/div&gt;
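The format is simple enough to reproduce in a few lines of Python - an illustrative sketch, not the tool's actual implementation:

```python
# Illustrative re-implementation of the Claude XML-ish format shown above
# (the real files-to-prompt handles directory walking, filtering, etc.)
def to_cxml(files):
    parts = ["<documents>"]
    for index, (path, content) in enumerate(files, start=1):
        parts += [
            f'<document index="{index}">',
            f"<source>{path}</source>",
            "<document_content>",
            content,
            "</document_content>",
            "</document>",
        ]
    parts.append("</documents>")
    return "\n".join(parts)

example = to_cxml([("llm-anyscale-endpoints/README.md", "# llm-anyscale-endpoints")])
print(example)
```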


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/files-to-prompt"&gt;files-to-prompt&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="tools"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="files-to-prompt"/></entry><entry><title>LLM 0.15</title><link href="https://simonwillison.net/2024/Jul/18/llm-015/#atom-tag" rel="alternate"/><published>2024-07-18T19:44:24+00:00</published><updated>2024-07-18T19:44:24+00:00</updated><id>https://simonwillison.net/2024/Jul/18/llm-015/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-15"&gt;LLM 0.15&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A new release of my &lt;a href="https://llm.datasette.io/"&gt;LLM CLI tool&lt;/a&gt; for interacting with Large Language Models from the terminal (see &lt;a href="https://simonwillison.net/2024/Jun/17/cli-language-models/"&gt;this recent talk&lt;/a&gt; for plenty of demos).&lt;/p&gt;
&lt;p&gt;This release adds support for the brand new &lt;a href="https://simonwillison.net/2024/Jul/18/gpt-4o-mini/"&gt;GPT-4o mini&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm -m gpt-4o-mini "rave about pelicans in Spanish"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It also sets that model as the default used by the tool if no other model is specified. This replaces GPT-3.5 Turbo, the default since the first release of LLM. 4o-mini is both cheaper and &lt;em&gt;way&lt;/em&gt; more capable than 3.5 Turbo.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/></entry></feed>