<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: plugins</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/plugins.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-03-25T21:24:31+00:00</updated><author><name>Simon Willison</name></author><entry><title>datasette-llm 0.1a1</title><link href="https://simonwillison.net/2026/Mar/25/datasette-llm/#atom-tag" rel="alternate"/><published>2026-03-25T21:24:31+00:00</published><updated>2026-03-25T21:24:31+00:00</updated><id>https://simonwillison.net/2026/Mar/25/datasette-llm/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/datasette/datasette-llm/releases/tag/0.1a1"&gt;datasette-llm 0.1a1&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;New release of the base plugin that makes models from &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; available for use by other Datasette plugins such as &lt;a href="https://github.com/datasette/datasette-enrichments-llm"&gt;datasette-enrichments-llm&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New &lt;a href="https://github.com/datasette/datasette-llm/blob/main/README.md#register_llm_purposes"&gt;&lt;code&gt;register_llm_purposes()&lt;/code&gt; plugin hook&lt;/a&gt; and &lt;code&gt;get_purposes()&lt;/code&gt; function for retrieving registered purpose strings. &lt;a href="https://github.com/datasette/datasette-llm/issues/1"&gt;#1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;One of the responsibilities of this plugin is to configure which models are used for which purposes, so you can say in one place "data enrichment uses GPT-5.4-nano but SQL query assistance happens using Sonnet 4.6", for example.&lt;/p&gt;
&lt;p&gt;Plugins that depend on this can use &lt;code&gt;model = await llm.model(purpose="enrichment")&lt;/code&gt; to indicate the purpose of the prompts they wish to execute against the model. Those plugins can now also use the new &lt;code&gt;register_llm_purposes()&lt;/code&gt; hook to register those purpose strings, which means future plugins can list those purposes in one place to power things like an admin UI for assigning models to purposes.&lt;/p&gt;
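&lt;p&gt;As a rough sketch of how that registry pattern works - a simplified stand-in of my own, not the real &lt;code&gt;datasette-llm&lt;/code&gt; API, which uses Datasette's pluggy-based hook system - plugins contribute purpose strings and a central function gathers them:&lt;/p&gt;
&lt;pre&gt;PLUGIN_HOOKS = []  # hypothetical stand-in for the real plugin hook registry

def register_llm_purposes_hook(fn):
    PLUGIN_HOOKS.append(fn)
    return fn

@register_llm_purposes_hook
def enrichment_plugin_purposes():
    return ["enrichment"]

@register_llm_purposes_hook
def sql_assistant_purposes():
    return ["sql-assistance"]

def get_purposes():
    # de-duplicate and sort the purposes registered by all plugins
    purposes = set()
    for hook in PLUGIN_HOOKS:
        purposes.update(hook())
    return sorted(purposes)

print(get_purposes())  # ['enrichment', 'sql-assistance']&lt;/pre&gt;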
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="datasette"/><category term="annotated-release-notes"/><category term="llm"/></entry><entry><title>Video + notes on upgrading a Datasette plugin for the latest 1.0 alpha, with help from uv and OpenAI Codex CLI</title><link href="https://simonwillison.net/2025/Nov/6/upgrading-datasette-plugins/#atom-tag" rel="alternate"/><published>2025-11-06T18:26:05+00:00</published><updated>2025-11-06T18:26:05+00:00</updated><id>https://simonwillison.net/2025/Nov/6/upgrading-datasette-plugins/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm upgrading various plugins for compatibility with the new &lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/"&gt;Datasette 1.0a20 alpha release&lt;/a&gt; and I decided to record &lt;a href="https://www.youtube.com/watch?v=qy4ci7AoF9Y"&gt;a video&lt;/a&gt; of the process. This post accompanies that video with detailed additional notes.&lt;/p&gt;

&lt;p&gt;&lt;lite-youtube videoid="qy4ci7AoF9Y" js-api="js-api" title="My process for upgrading Datasette plugins with uv and OpenAI Codex CLI" playlabel="Play: My process for upgrading Datasette plugins with uv and OpenAI Codex CLI"&gt; &lt;/lite-youtube&gt;&lt;/p&gt;

&lt;h4 id="the-datasette-checkbox-plugin"&gt;The datasette-checkbox plugin&lt;/h4&gt;
&lt;p&gt;I picked a very simple plugin to illustrate the upgrade process (possibly too simple). &lt;a href="https://github.com/datasette/datasette-checkbox"&gt;datasette-checkbox&lt;/a&gt; adds just one feature to Datasette: if you are viewing a table with boolean columns (detected as integer columns with names like &lt;code&gt;is_active&lt;/code&gt; or &lt;code&gt;has_attachments&lt;/code&gt; or &lt;code&gt;should_notify&lt;/code&gt;) &lt;em&gt;and&lt;/em&gt; your current user has permission to update rows in that table, it adds an inline checkbox UI that looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/datasette-checkbox.gif" alt="Animated demo of a table with name, is_done, should_be_deleted and is_happy columns. Each column has checkboxes, and clicking a checkbox flashes a little &amp;quot;updated&amp;quot; message." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I built the first version with the help of Claude back in August 2024 - details &lt;a href="https://github.com/datasette/datasette-checkbox/issues/1#issuecomment-2294168693"&gt;in this issue comment&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Most of the implementation is JavaScript that makes calls to Datasette 1.0's &lt;a href="https://simonwillison.net/2022/Dec/2/datasette-write-api/"&gt;JSON write API&lt;/a&gt;. The Python code just checks that the user has the necessary permissions before including the extra JavaScript.&lt;/p&gt;
&lt;h4 id="running-the-plugin-s-tests"&gt;Running the plugin's tests&lt;/h4&gt;
&lt;p&gt;The first step in upgrading any plugin is to run its tests against the latest Datasette version.&lt;/p&gt;
&lt;p&gt;Thankfully &lt;code&gt;uv&lt;/code&gt; makes it easy to run code in scratch virtual environments that include the different code versions you want to test against.&lt;/p&gt;
&lt;p&gt;I have a test utility called &lt;code&gt;tadd&lt;/code&gt; (for "test against development Datasette") which I use for that purpose. I can run it in any plugin directory like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;tadd&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And it will run the existing plugin tests against whatever version of Datasette I have checked out in my &lt;code&gt;~/dev/datasette&lt;/code&gt; directory.&lt;/p&gt;
&lt;p&gt;You can see the full implementation of &lt;code&gt;tadd&lt;/code&gt; (and its friend &lt;code&gt;radd&lt;/code&gt; described below) &lt;a href="https://til.simonwillison.net/python/uv-tests#variants-tadd-and-radd"&gt;in this TIL&lt;/a&gt; - the basic version looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#!&lt;/span&gt;/bin/sh&lt;/span&gt;
uv run --no-project --isolated \
  --with-editable &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;.[test]&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; --with-editable &lt;span class="pl-k"&gt;~&lt;/span&gt;/dev/datasette \
  python -m pytest &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$@&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I started by running &lt;code&gt;tadd&lt;/code&gt; in the &lt;code&gt;datasette-checkbox&lt;/code&gt; directory, and got my first failure... but it wasn't due to permissions, it was because the &lt;code&gt;pyproject.toml&lt;/code&gt; for the plugin was &lt;a href="https://github.com/datasette/datasette-checkbox/blob/0.1a3/pyproject.toml#L13C1-L15C2"&gt;pinned&lt;/a&gt; to a specific mismatched version of Datasette:&lt;/p&gt;
&lt;div class="highlight highlight-source-toml"&gt;&lt;pre&gt;&lt;span class="pl-smi"&gt;dependencies&lt;/span&gt; = [
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;datasette==1.0a19&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I fixed this problem by swapping &lt;code&gt;==&lt;/code&gt; to &lt;code&gt;&amp;gt;=&lt;/code&gt; and ran the tests again... and they passed! Which was a problem because I was expecting permission-related failures.&lt;/p&gt;
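&lt;p&gt;After the swap the dependency block looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-toml"&gt;&lt;pre&gt;dependencies = [
    "datasette&amp;gt;=1.0a19"
]&lt;/pre&gt;&lt;/div&gt;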
&lt;p&gt;It turns out when I first wrote the plugin I was &lt;a href="https://github.com/datasette/datasette-checkbox/blob/0.1a3/tests/test_checkbox.py"&gt;lazy with the tests&lt;/a&gt; - they weren't actually confirming that the table page loaded without errors.&lt;/p&gt;
&lt;p&gt;I needed to actually run the code myself to see the expected bug.&lt;/p&gt;
&lt;p&gt;First I created myself a demo database using &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html#creating-tables"&gt;sqlite-utils create-table&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils create-table demo.db \
  demo id integer is_checked integer --pk id&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then I ran it with Datasette against the plugin's code like so:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;radd demo.db&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Sure enough, visiting &lt;code&gt;/demo/demo&lt;/code&gt; produced a 500 error about the missing &lt;code&gt;Datasette.permission_allowed()&lt;/code&gt; method.&lt;/p&gt;
&lt;p&gt;The next step was to update the test to also trigger this error:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;pytest&lt;/span&gt;.&lt;span class="pl-c1"&gt;mark&lt;/span&gt;.&lt;span class="pl-c1"&gt;asyncio&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;test_plugin_adds_javascript&lt;/span&gt;():
    &lt;span class="pl-s1"&gt;datasette&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;Datasette&lt;/span&gt;()
    &lt;span class="pl-s1"&gt;db&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;datasette&lt;/span&gt;.&lt;span class="pl-c1"&gt;add_memory_database&lt;/span&gt;(&lt;span class="pl-s"&gt;"demo"&lt;/span&gt;)
    &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;db&lt;/span&gt;.&lt;span class="pl-c1"&gt;execute_write&lt;/span&gt;(
        &lt;span class="pl-s"&gt;"CREATE TABLE IF NOT EXISTS test (id INTEGER PRIMARY KEY, is_active INTEGER)"&lt;/span&gt;
    )
    &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;datasette&lt;/span&gt;.&lt;span class="pl-c1"&gt;invoke_startup&lt;/span&gt;()
    &lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;datasette&lt;/span&gt;.&lt;span class="pl-c1"&gt;client&lt;/span&gt;.&lt;span class="pl-c1"&gt;get&lt;/span&gt;(&lt;span class="pl-s"&gt;"/demo/test"&lt;/span&gt;)
    &lt;span class="pl-k"&gt;assert&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;status_code&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-c1"&gt;200&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;And now &lt;code&gt;tadd&lt;/code&gt; fails as expected.&lt;/p&gt;
&lt;h4 id="upgrading-the-plugin-with-codex"&gt;Upgrading the plugin with Codex&lt;/h4&gt;
&lt;p&gt;At this point I could have manually fixed the plugin itself - which would likely have been faster given the small size of the fix - but instead I demonstrated a bash one-liner I've been using to apply these kinds of changes automatically:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;codex &lt;span class="pl-c1"&gt;exec&lt;/span&gt; --dangerously-bypass-approvals-and-sandbox \
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Run the command tadd and look at the errors and then&lt;/span&gt;
&lt;span class="pl-s"&gt;read ~/dev/datasette/docs/upgrade-1.0a20.md and apply&lt;/span&gt;
&lt;span class="pl-s"&gt;fixes and run the tests again and get them to pass&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;codex exec&lt;/code&gt; runs OpenAI Codex in non-interactive mode - it will loop until it has finished the prompt you give it.&lt;/p&gt;
&lt;p&gt;I tell it to consult the subset of the &lt;a href="https://docs.datasette.io/en/latest/upgrade_guide.html#datasette-1-0a20-plugin-upgrade-guide"&gt;Datasette upgrade documentation&lt;/a&gt; that talks about Datasette permissions and then get the &lt;code&gt;tadd&lt;/code&gt; command to pass its tests.&lt;/p&gt;
&lt;p&gt;This is an example of what I call &lt;a href="https://simonwillison.net/2025/Sep/30/designing-agentic-loops/"&gt;designing agentic loops&lt;/a&gt; - I gave Codex the tools it needed (&lt;code&gt;tadd&lt;/code&gt;) and a clear goal and let it get to work on my behalf.&lt;/p&gt;
&lt;p&gt;The remainder of the video covers finishing up the work - testing the fix manually, committing my work using:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;git commit -a -m &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;basename &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$PWD&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt; for datasette&amp;gt;=1.0a20&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -m &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Refs https://github.com/simonw/datasette/issues/2577&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then shipping a &lt;a href="https://pypi.org/project/datasette-checkbox/0.1a4/"&gt;0.1a4 release&lt;/a&gt; to PyPI using the pattern &lt;a href="https://til.simonwillison.net/pypi/pypi-releases-from-github"&gt;described in this TIL&lt;/a&gt;.
Finally, I demonstrated that the shipped plugin worked in a fresh environment using &lt;code&gt;uvx&lt;/code&gt; like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx --prerelease=allow --with datasette-checkbox \
  datasette --root &lt;span class="pl-k"&gt;~&lt;/span&gt;/dev/ecosystem/datasette-checkbox/demo.db&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Executing this command installs and runs a fresh Datasette instance with a fresh copy of the new alpha plugin (&lt;code&gt;--prerelease=allow&lt;/code&gt;). It's a neat way of confirming that freshly released software works as expected.&lt;/p&gt;
&lt;h4 id="a-colophon-for-the-video"&gt;A colophon for the video&lt;/h4&gt;
&lt;p&gt;This video was shot in a single take using &lt;a href="https://www.descript.com/"&gt;Descript&lt;/a&gt;, with no rehearsal and perilously little preparation in advance. I recorded through my AirPods and applied the "Studio Sound" filter to clean up the audio. I pasted in a &lt;code&gt;simonwillison.net&lt;/code&gt; closing slide from &lt;a href="https://simonwillison.net/2025/Oct/23/claude-code-for-web-video/"&gt;my previous video&lt;/a&gt; and exported it locally at 1080p, then uploaded it to YouTube.&lt;/p&gt;
&lt;p&gt;Something I learned from the Software Carpentry &lt;a href="https://simonwillison.net/2020/Sep/26/weeknotes-software-carpentry-sqlite/"&gt;instructor training course&lt;/a&gt; is that making mistakes in front of an audience is actively helpful - it helps them see a realistic version of how software development works and they can learn from watching you recover. I see this as a great excuse for not editing out all of my mistakes!&lt;/p&gt;
&lt;p&gt;I'm trying to build new habits around video content that let me produce useful videos while minimizing the amount of time I spend on production.&lt;/p&gt;
&lt;p&gt;I plan to iterate more on the format as I get more comfortable with the process. I'm hoping I can find the right balance between production time and value to viewers.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/youtube"&gt;youtube&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex-cli"&gt;codex-cli&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="python"/><category term="youtube"/><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="uv"/><category term="coding-agents"/><category term="codex-cli"/></entry><entry><title>A new SQL-powered permissions system in Datasette 1.0a20</title><link href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#atom-tag" rel="alternate"/><published>2025-11-04T21:34:42+00:00</published><updated>2025-11-04T21:34:42+00:00</updated><id>https://simonwillison.net/2025/Nov/4/datasette-10a20/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://docs.datasette.io/en/latest/changelog.html#a20-2025-11-03"&gt;Datasette 1.0a20 is out&lt;/a&gt; with the biggest breaking API change on the road to 1.0, improving how Datasette's permissions system works by migrating permission logic to SQL running in SQLite. This release involved &lt;a href="https://github.com/simonw/datasette/compare/1.0a19...1.0a20"&gt;163 commits&lt;/a&gt;, with 10,660 additions and 1,825 deletions, most of which was written with the help of Claude Code.&lt;/p&gt;


&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#understanding-the-permissions-system"&gt;Understanding the permissions system&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#permissions-systems-need-to-be-able-to-efficiently-list-things"&gt;Permissions systems need to be able to efficiently list things&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#the-new-permission-resources-sql-plugin-hook"&gt;The new permission_resources_sql() plugin hook&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#hierarchies-plugins-vetoes-and-restrictions"&gt;Hierarchies, plugins, vetoes, and restrictions&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#new-debugging-tools"&gt;New debugging tools&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#the-missing-feature-list-actors-who-can-act-on-this-resource"&gt;The missing feature: list actors who can act on this resource&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#upgrading-plugins-for-datasette-1-0a20"&gt;Upgrading plugins for Datasette 1.0a20&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#using-claude-code-to-implement-this-change"&gt;Using Claude Code to implement this change&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#starting-with-a-proof-of-concept"&gt;Starting with a proof-of-concept&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#miscellaneous-tips-i-picked-up-along-the-way"&gt;Miscellaneous tips I picked up along the way&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Nov/4/datasette-10a20/#what-s-next-"&gt;What's next?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="understanding-the-permissions-system"&gt;Understanding the permissions system&lt;/h4&gt;
&lt;p&gt;Datasette's &lt;a href="https://docs.datasette.io/en/latest/authentication.html"&gt;permissions system&lt;/a&gt; exists to answer the following question:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Is this &lt;strong&gt;actor&lt;/strong&gt; allowed to perform this &lt;strong&gt;action&lt;/strong&gt;, optionally against this particular &lt;strong&gt;resource&lt;/strong&gt;?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;An &lt;strong&gt;actor&lt;/strong&gt; is usually a user, but might also be an automation operating via the Datasette API.&lt;/p&gt;
&lt;p&gt;An &lt;strong&gt;action&lt;/strong&gt; is something they need to do - &lt;code&gt;view-table&lt;/code&gt;, &lt;code&gt;execute-sql&lt;/code&gt;, or &lt;code&gt;insert-row&lt;/code&gt;, for example.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;resource&lt;/strong&gt; is the subject of the action - the database you are executing SQL against, the table you want to insert a row into.&lt;/p&gt;
&lt;p&gt;Datasette's default configuration is public but read-only: anyone can view databases and tables or execute read-only SQL queries but no-one can modify data.&lt;/p&gt;
&lt;p&gt;Datasette plugins can enable all sorts of additional ways to interact with databases, many of which need to be protected by some form of authentication. Datasette 1.0 also includes &lt;a href="https://simonwillison.net/2022/Dec/2/datasette-write-api/"&gt;a write API&lt;/a&gt;, which makes it necessary to configure who can insert, update, and delete rows or create new tables.&lt;/p&gt;
&lt;p&gt;Actors can be authenticated in a number of different ways provided by plugins using the &lt;a href="https://docs.datasette.io/en/latest/plugin_hooks.html#actor-from-request-datasette-request"&gt;actor_from_request()&lt;/a&gt; plugin hook. &lt;a href="https://datasette.io/plugins/datasette-auth-passwords"&gt;datasette-auth-passwords&lt;/a&gt; and &lt;a href="https://datasette.io/plugins/datasette-auth-github"&gt;datasette-auth-github&lt;/a&gt; and &lt;a href="https://datasette.io/plugins/datasette-auth-existing-cookies"&gt;datasette-auth-existing-cookies&lt;/a&gt; are examples of authentication plugins.&lt;/p&gt;
&lt;h4 id="permissions-systems-need-to-be-able-to-efficiently-list-things"&gt;Permissions systems need to be able to efficiently list things&lt;/h4&gt;
&lt;p&gt;The previous implementation included a design flaw common to permissions systems of this nature: each permission check involved a function call which would delegate to one or more plugins and return a True/False result.&lt;/p&gt;
&lt;p&gt;This works well for single checks, but has a significant problem: what if you need to show the user a list of things they can access, for example the tables they can view?&lt;/p&gt;
&lt;p&gt;I want Datasette to be able to handle potentially thousands of tables - tables in SQLite are cheap! I don't want to have to run 1,000+ permission checks just to show the user a list of tables.&lt;/p&gt;
&lt;p&gt;Since Datasette is built on top of SQLite we already have a powerful mechanism to help solve this problem. SQLite is &lt;em&gt;really&lt;/em&gt; good at filtering large numbers of records.&lt;/p&gt;
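&lt;p&gt;The idea, sketched with the standard library's &lt;code&gt;sqlite3&lt;/code&gt; (my own illustration, not Datasette's real catalog schema or query): instead of running one permission check per table, join a catalog of tables against allow rules in a single query:&lt;/p&gt;
&lt;pre&gt;import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical catalog of 1,000 tables in one database
conn.execute("CREATE TABLE catalog_tables (database_name TEXT, table_name TEXT)")
conn.executemany(
    "INSERT INTO catalog_tables VALUES (?, ?)",
    [("accounting", "table_%d" % i) for i in range(1000)],
)
# Rules shaped like the new hook's rows: parent, child, allow
conn.execute("CREATE TABLE rules (parent TEXT, child TEXT, allow INTEGER)")
conn.execute("INSERT INTO rules VALUES ('accounting', 'table_7', 1)")
# One query filters all 1,000 tables down to the visible ones
visible = conn.execute(
    """
    SELECT c.database_name, c.table_name
    FROM catalog_tables c
    JOIN rules r ON r.parent = c.database_name AND r.child = c.table_name
    WHERE r.allow = 1
    """
).fetchall()
print(visible)  # [('accounting', 'table_7')]&lt;/pre&gt;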
&lt;h4 id="the-new-permission-resources-sql-plugin-hook"&gt;The new permission_resources_sql() plugin hook&lt;/h4&gt;
&lt;p&gt;The biggest change in the new release is that I've replaced the previous  &lt;code&gt;permission_allowed(actor, action, resource)&lt;/code&gt; plugin hook - which let a plugin determine if an actor could perform an action against a resource - with a new &lt;a href="https://docs.datasette.io/en/latest/plugin_hooks.html#plugin-hook-permission-resources-sql"&gt;permission_resources_sql(actor, action)&lt;/a&gt; plugin hook.&lt;/p&gt;
&lt;p&gt;Instead of returning a True/False result, this new hook returns a SQL query producing rules that help determine which resources the current actor can execute the specified action against.&lt;/p&gt;
&lt;p&gt;Here's an example, lifted from the documentation:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;datasette&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;hookimpl&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;datasette&lt;/span&gt;.&lt;span class="pl-s1"&gt;permissions&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;PermissionSQL&lt;/span&gt;


&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;hookimpl&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;permission_resources_sql&lt;/span&gt;(&lt;span class="pl-s1"&gt;datasette&lt;/span&gt;, &lt;span class="pl-s1"&gt;actor&lt;/span&gt;, &lt;span class="pl-s1"&gt;action&lt;/span&gt;):
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;action&lt;/span&gt; &lt;span class="pl-c1"&gt;!=&lt;/span&gt; &lt;span class="pl-s"&gt;"view-table"&lt;/span&gt;:
        &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-c1"&gt;not&lt;/span&gt; &lt;span class="pl-s1"&gt;actor&lt;/span&gt; &lt;span class="pl-c1"&gt;or&lt;/span&gt; &lt;span class="pl-s1"&gt;actor&lt;/span&gt;.&lt;span class="pl-c1"&gt;get&lt;/span&gt;(&lt;span class="pl-s"&gt;"id"&lt;/span&gt;) &lt;span class="pl-c1"&gt;!=&lt;/span&gt; &lt;span class="pl-s"&gt;"alice"&lt;/span&gt;:
        &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;

    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-en"&gt;PermissionSQL&lt;/span&gt;(
        &lt;span class="pl-s1"&gt;sql&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"""&lt;/span&gt;
&lt;span class="pl-s"&gt;            SELECT&lt;/span&gt;
&lt;span class="pl-s"&gt;                'accounting' AS parent,&lt;/span&gt;
&lt;span class="pl-s"&gt;                'sales' AS child,&lt;/span&gt;
&lt;span class="pl-s"&gt;                1 AS allow,&lt;/span&gt;
&lt;span class="pl-s"&gt;                'alice can view accounting/sales' AS reason&lt;/span&gt;
&lt;span class="pl-s"&gt;        """&lt;/span&gt;,
    )&lt;/pre&gt;
&lt;p&gt;This hook grants the actor with ID "alice" permission to view the "sales" table in the "accounting" database.&lt;/p&gt;
&lt;p&gt;The SQL query wrapped in the &lt;code&gt;PermissionSQL&lt;/code&gt; object should always return four columns: a parent, a child, an allow flag (1 or 0), and a reason string for debugging.&lt;/p&gt;
&lt;p&gt;When you ask Datasette to list the resources an actor can access for a specific action, it will combine the SQL returned by all installed plugins into a single query that joins against &lt;a href="https://docs.datasette.io/en/latest/internals.html#internal-database-schema"&gt;the internal catalog tables&lt;/a&gt; and efficiently lists all the resources the actor can access.&lt;/p&gt;
&lt;p&gt;This query can then be limited or paginated to avoid loading too many results at once.&lt;/p&gt;
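&lt;p&gt;That combination step can be sketched like this (again my own illustration using &lt;code&gt;sqlite3&lt;/code&gt;, not Datasette's actual implementation) - rule queries from several plugins are concatenated with &lt;code&gt;UNION ALL&lt;/code&gt; and filtered in one pass:&lt;/p&gt;
&lt;pre&gt;import sqlite3

# Hypothetical rule queries, as two different plugins might return them -
# each produces (parent, child, allow, reason) rows
plugin_sqls = [
    "SELECT 'accounting' AS parent, 'sales' AS child, "
    "1 AS allow, 'plugin-a: allow' AS reason",
    "SELECT 'accounting' AS parent, 'payroll' AS child, "
    "0 AS allow, 'plugin-b: deny' AS reason",
]
combined = " UNION ALL ".join(plugin_sqls)
conn = sqlite3.connect(":memory:")
allowed = conn.execute(
    "SELECT parent, child, reason FROM (" + combined + ") WHERE allow = 1"
).fetchall()
print(allowed)  # [('accounting', 'sales', 'plugin-a: allow')]&lt;/pre&gt;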
&lt;h4 id="hierarchies-plugins-vetoes-and-restrictions"&gt;Hierarchies, plugins, vetoes, and restrictions&lt;/h4&gt;
&lt;p&gt;Datasette has several additional requirements that make the permissions system more complicated.&lt;/p&gt;
&lt;p&gt;Datasette permissions can optionally act against a two-level &lt;strong&gt;hierarchy&lt;/strong&gt;. You can grant a user the ability to insert-row against a specific table, or every table in a specific database, or every table in &lt;em&gt;every&lt;/em&gt; database in that Datasette instance.&lt;/p&gt;
&lt;p&gt;Some actions apply at the table level, others at the database level, and others only make sense globally - enabling a new feature that isn't tied to tables or databases, for example.&lt;/p&gt;
&lt;p&gt;Datasette currently has &lt;a href="https://docs.datasette.io/en/latest/authentication.html#built-in-actions"&gt;ten default actions&lt;/a&gt; but &lt;strong&gt;plugins&lt;/strong&gt; that add additional features can &lt;a href="https://docs.datasette.io/en/latest/plugin_hooks.html#register-actions-datasette"&gt;register new actions&lt;/a&gt; to better participate in the permission systems.&lt;/p&gt;
&lt;p&gt;Datasette's permission system has a mechanism to &lt;strong&gt;veto&lt;/strong&gt; permission checks - a plugin can return a deny for a specific permission check which will override any allows. This needs to be hierarchy-aware - a deny at the database level can be outvoted by an allow at the table level.&lt;/p&gt;
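&lt;p&gt;One way to picture that hierarchy-aware resolution (a sketch of my own, not Datasette's actual resolution code): collect the applicable rules, keep only the most specific ones, and let a deny veto at that winning level:&lt;/p&gt;
&lt;pre&gt;rules = [
    {"parent": "accounting", "child": None, "allow": 0},    # deny the whole database
    {"parent": "accounting", "child": "sales", "allow": 1}, # allow one table
]

def specificity(rule):
    # instance-wide = 0, database-level = 1, table-level = 2
    return int(rule["parent"] is not None) + int(rule["child"] is not None)

def check(parent, child):
    applicable = [
        r for r in rules
        if r["parent"] in (None, parent) and r["child"] in (None, child)
    ]
    if not applicable:
        return False
    best = max(specificity(r) for r in applicable)
    most_specific = [r for r in applicable if specificity(r) == best]
    # at the winning level, any deny still vetoes
    return all(r["allow"] for r in most_specific)

print(check("accounting", "sales"))   # True - table-level allow outvotes the deny
print(check("accounting", "payroll")) # False - the database-level deny applies&lt;/pre&gt;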
&lt;p&gt;Finally, Datasette includes a mechanism for applying additional &lt;strong&gt;restrictions&lt;/strong&gt; to a request. This was introduced for Datasette's API - it allows a user to create an API token that can act on their behalf but is only allowed to perform a subset of their capabilities - just reading from two specific tables, for example. Restrictions are &lt;a href="https://docs.datasette.io/en/latest/authentication.html#restricting-the-actions-that-a-token-can-perform"&gt;described in more detail&lt;/a&gt; in the documentation.&lt;/p&gt;
&lt;p&gt;That's a lot of different moving parts for the new implementation to cover.&lt;/p&gt;
&lt;h4 id="new-debugging-tools"&gt;New debugging tools&lt;/h4&gt;
&lt;p&gt;Since permissions are critical to the security of a Datasette deployment it's vital that they are as easy to understand and debug as possible.&lt;/p&gt;
&lt;p&gt;The new alpha adds several new debugging tools, including this page that shows the full list of resources matching a specific action for the current user:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/datasette-allowed-resources.jpg" alt="Allowed resources. Tabs are Playground, Check, Allowed, Rules, Actions, Allow debug. There is a form where you can select an action (here view-table) and optionally filter by parent and child. Below is a table of results listing resource paths - e.g. /fixtures/name-of-table - plus parent, child and reason columns. The reason is a JSON list for example &amp;quot;datasette.default_permissions: root user&amp;quot;,&amp;quot;datasette.default_permissions: default allow for view-table&amp;quot;." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;And this page listing the &lt;em&gt;rules&lt;/em&gt; that apply to that question - since different plugins may return different rules which get combined together:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/datasette-rules.jpg" alt="The rules tab for the same view-table question. Here there are two allow rules - one from datasette.default_permissions for the root user and another from default_permissions labelled default allow for view-table." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;This screenshot illustrates two of Datasette's built-in rules: there is a default allow for read-only operations such as view-table (which can be overridden by plugins) and another rule that says the root user can do anything (provided Datasette was started with the &lt;code&gt;--root&lt;/code&gt; option).&lt;/p&gt;
&lt;p&gt;Those rules are defined in the &lt;a href="https://github.com/simonw/datasette/blob/1.0a20/datasette/default_permissions.py"&gt;datasette/default_permissions.py&lt;/a&gt; Python module.&lt;/p&gt;
&lt;h4 id="the-missing-feature-list-actors-who-can-act-on-this-resource"&gt;The missing feature: list actors who can act on this resource&lt;/h4&gt;
&lt;p&gt;There's one question that the new system cannot answer: provide a full list of actors who can perform this action against this resource.&lt;/p&gt;
&lt;p&gt;It's not possible to provide this globally for Datasette because Datasette doesn't have a way to track what "actors" exist in the system. SSO plugins such as &lt;code&gt;datasette-auth-github&lt;/code&gt; mean a new authenticated GitHub user might show up at any time, with the ability to perform actions despite the Datasette system never having encountered that particular username before.&lt;/p&gt;
&lt;p&gt;API tokens and actor restrictions come into play here as well. A user might create a signed API token that can perform a subset of actions on their behalf - the existence of that token can't be predicted by the permissions system.&lt;/p&gt;
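&lt;p&gt;The reason such tokens can't be enumerated in advance is that signed tokens are typically stateless: the server verifies a signature rather than consulting a stored list. A generic HMAC sketch of that idea - not Datasette's actual token format, which has its own signing machinery:&lt;/p&gt;

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # assumption: a single server-side signing key


def create_token(payload):
    # Serialize and sign the payload - nothing is stored server-side,
    # so the permissions system cannot know this token exists.
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{sig}"


def verify_token(token):
    # Recompute the signature; return the payload only if it matches.
    body, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if hmac.compare_digest(sig, expected):
        return json.loads(base64.urlsafe_b64decode(body))
    return None
```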
&lt;p&gt;This is a notable omission, but it's also quite common in other systems. AWS cannot provide a list of all actors who have permission to access a specific S3 bucket, for example - presumably for similar reasons.&lt;/p&gt;
&lt;h4 id="upgrading-plugins-for-datasette-1-0a20"&gt;Upgrading plugins for Datasette 1.0a20&lt;/h4&gt;
&lt;p&gt;Datasette's plugin ecosystem is the reason I'm paying so much attention to ensuring Datasette 1.0 has a stable API. I don't want plugin authors to need to chase breaking changes once that 1.0 release is out.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://docs.datasette.io/en/latest/upgrade_guide.html"&gt;Datasette upgrade guide&lt;/a&gt; includes detailed notes on upgrades that are needed between the 0.x and 1.0 alpha releases. I've added an extensive section about the permissions changes to that document.&lt;/p&gt;
&lt;p&gt;I've also been experimenting with dumping those instructions directly into coding agent tools - Claude Code and Codex CLI - to have them upgrade existing plugins for me. This has been working &lt;em&gt;extremely well&lt;/em&gt;. I've even had Claude Code &lt;a href="https://github.com/simonw/datasette/commit/fa978ec1006297416e2cd87a2f0d3cac99283cf8"&gt;update those notes itself&lt;/a&gt; with things it learned during an upgrade process!&lt;/p&gt;
&lt;p&gt;This is greatly helped by the fact that every single Datasette plugin has an automated test suite that demonstrates the core functionality works as expected. Coding agents can use those tests to verify that their changes have had the desired effect.&lt;/p&gt;
&lt;p&gt;I've also been leaning heavily on &lt;code&gt;uv&lt;/code&gt; to help with the upgrade process. I wrote myself two new helper scripts - &lt;code&gt;tadd&lt;/code&gt; and &lt;code&gt;radd&lt;/code&gt; - to help test the new plugins.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;tadd&lt;/code&gt; = "test against datasette dev" - it runs a plugin's existing test suite against the current development version of Datasette checked out on my machine. It passes extra options through to &lt;code&gt;pytest&lt;/code&gt; so I can run &lt;code&gt;tadd -k test_name&lt;/code&gt; or &lt;code&gt;tadd -x --pdb&lt;/code&gt; as needed.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;radd&lt;/code&gt; = "run against datasette dev" - it runs the latest dev &lt;code&gt;datasette&lt;/code&gt; command with the plugin installed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;tadd&lt;/code&gt; and &lt;code&gt;radd&lt;/code&gt; implementations &lt;a href="https://til.simonwillison.net/python/uv-tests#variants-tadd-and-radd"&gt;can be found in this TIL&lt;/a&gt;.&lt;/p&gt;
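&lt;p&gt;As a rough illustration of the pattern - the real implementations are in the TIL above, and the exact flags here are my guess - such helpers can be small &lt;code&gt;uv&lt;/code&gt; wrappers:&lt;/p&gt;

```shell
# Hypothetical reconstruction of the tadd/radd pattern - see the linked TIL
# for the real versions. Assumes a Datasette checkout at ~/dev/datasette.
tadd() {
    # Run this plugin's test suite with the dev Datasette swapped in,
    # passing extra arguments (like -k or -x --pdb) through to pytest.
    uv run --extra test --with-editable ~/dev/datasette pytest "$@"
}

radd() {
    # Serve the dev Datasette with the current plugin installed.
    uv run --with-editable ~/dev/datasette --with-editable . datasette "$@"
}
```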
&lt;p&gt;Some of my plugin upgrades have been reduced to a single call to the &lt;code&gt;codex exec&lt;/code&gt; command, which runs OpenAI's Codex CLI with a prompt without entering interactive mode:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;codex &lt;span class="pl-c1"&gt;exec&lt;/span&gt; --dangerously-bypass-approvals-and-sandbox \
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Run the command tadd and look at the errors and then&lt;/span&gt;
&lt;span class="pl-s"&gt;read ~/dev/datasette/docs/upgrade-1.0a20.md and apply&lt;/span&gt;
&lt;span class="pl-s"&gt;fixes and run the tests again and get them to pass&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There are still a bunch more to go - there's &lt;a href="https://github.com/simonw/datasette/issues/2577"&gt;a list in this tracking issue&lt;/a&gt; - but I expect to have the plugins I maintain all upgraded pretty quickly now that I have a solid process in place.&lt;/p&gt;
&lt;h4 id="using-claude-code-to-implement-this-change"&gt;Using Claude Code to implement this change&lt;/h4&gt;
&lt;p&gt;This change to Datasette core is &lt;em&gt;by far&lt;/em&gt; the most ambitious piece of work I've ever attempted using a coding agent.&lt;/p&gt;
&lt;p&gt;Last year I agreed with the prevailing opinion that LLM assistance was much more useful for greenfield coding tasks than working on existing codebases. The amount you could usefully get done was greatly limited by the need to fit the entire codebase into the model's context window.&lt;/p&gt;
&lt;p&gt;Coding agents have entirely changed that calculation. Claude Code and Codex CLI still have relatively limited token windows - albeit larger than last year - but their ability to search through the codebase, read extra files on demand and "reason" about the code they are working with has made them vastly more capable.&lt;/p&gt;
&lt;p&gt;I no longer see codebase size as a limiting factor for how useful they can be.&lt;/p&gt;
&lt;p&gt;I've also spent enough time with Claude Sonnet 4.5 to build a weird level of trust in it. I can usually predict exactly what changes it will make for a prompt. If I tell it "extract this code into a separate function" or "update every instance of this pattern" I know it's likely to get it right.&lt;/p&gt;
&lt;p&gt;For something like permission code I still review everything it does, often by watching it as it works since it displays diffs in the UI.&lt;/p&gt;
&lt;p&gt;I also pay extremely close attention to the tests it's writing. Datasette 1.0a19 already had 1,439 tests, many of which exercised the existing permission system. 1.0a20 increases that to 1,583 tests. I feel very good about that, especially since most of the existing tests continued to pass without modification.&lt;/p&gt;
&lt;h4 id="starting-with-a-proof-of-concept"&gt;Starting with a proof-of-concept&lt;/h4&gt;
&lt;p&gt;I built several different proof-of-concept implementations of SQL permissions before settling on the final design. My &lt;a href="https://github.com/simonw/research/tree/main/sqlite-permissions-poc"&gt;research/sqlite-permissions-poc&lt;/a&gt; project was the one that finally convinced me of a viable approach.&lt;/p&gt;
&lt;p&gt;That one started as a &lt;a href="https://claude.ai/share/8fd432bc-a718-4883-9978-80ab82a75c87"&gt;free-ranging conversation with Claude&lt;/a&gt;, at the end of which I told it to generate a specification which I then &lt;a href="https://chatgpt.com/share/68f6532f-9920-8006-928a-364e15b6e9ef"&gt;fed into GPT-5&lt;/a&gt; to implement. You can see that specification &lt;a href="https://github.com/simonw/research/tree/main/sqlite-permissions-poc#original-prompt"&gt;at the end of the README&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I later fed the POC itself into Claude Code and had it implement the first version of the new Datasette system based on that previous experiment.&lt;/p&gt;
&lt;p&gt;This is admittedly a very weird way of working, but it helped me finally break through on a problem that I'd been struggling with for months.&lt;/p&gt;
&lt;h4 id="miscellaneous-tips-i-picked-up-along-the-way"&gt;Miscellaneous tips I picked up along the way&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;When working on anything relating to plugins it's vital to have at least a few real plugins that you upgrade in lock-step with the core changes. The &lt;code&gt;tadd&lt;/code&gt; and &lt;code&gt;radd&lt;/code&gt; shortcuts were invaluable for productively working on those plugins while I made changes to core.&lt;/li&gt;
&lt;li&gt;Coding agents make experiments &lt;em&gt;much&lt;/em&gt; cheaper. I threw away so much code on the way to the final implementation, which was psychologically easier because the cost to create that code in the first place was so low.&lt;/li&gt;
&lt;li&gt;Tests, tests, tests. This project would have been impossible without that existing test suite. The additional tests we built along the way give me confidence that the new system is as robust as I need it to be.&lt;/li&gt;
&lt;li&gt;Claude writes good commit messages now! I finally gave in and let it write these - previously I've been determined to write them myself. It's a big time saver to be able to say "write a tasteful commit message for these changes".&lt;/li&gt;
&lt;li&gt;Claude is also great at breaking up changes into smaller commits. It can also productively rewrite history to make it easier to follow, especially useful if you're still working in a branch.&lt;/li&gt;
&lt;li&gt;A really great way to review Claude's changes is with the GitHub PR interface. You can attach comments to individual lines of code and then later prompt Claude like this: &lt;code&gt;Use gh CLI to fetch comments on URL-to-PR and make the requested changes&lt;/code&gt;. This is a very quick way to apply little nitpick changes - rename this function, refactor this repeated code, add types here etc.&lt;/li&gt;
&lt;li&gt;The code I write with LLMs is &lt;em&gt;higher quality code&lt;/em&gt;. I usually find myself making constant trade-offs while coding: this function would be neater if I extracted this helper, it would be nice to have inline documentation here, changing this would be good but would break a dozen tests... for each of those I have to determine if the additional time is worth the benefit. Claude can apply changes so much faster than me that these calculations have changed - almost any improvement is worth applying, no matter how trivial, because the time cost is so low.&lt;/li&gt;
&lt;li&gt;Internal tools are cheap now. The new debugging interfaces were mostly written by Claude and are significantly nicer to use and look at than the hacky versions I would have knocked out myself, if I had even taken the extra time to build them.&lt;/li&gt;
&lt;li&gt;That trick with a Markdown file full of upgrade instructions works astonishingly well - it's the same basic idea as &lt;a href="https://simonwillison.net/2025/Oct/16/claude-skills/"&gt;Claude Skills&lt;/a&gt;. I maintain over 100 Datasette plugins now and I expect I'll be automating all sorts of minor upgrades in the future using this technique.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="what-s-next-"&gt;What's next?&lt;/h4&gt;
&lt;p&gt;Now that the new alpha is out, my focus is on upgrading the existing plugin ecosystem to use it and supporting other plugin authors who are doing the same.&lt;/p&gt;
&lt;p&gt;The new permissions system unlocks some key improvements to Datasette Cloud concerning finely-grained permissions for larger teams, so I'll be integrating the new alpha there this week.&lt;/p&gt;
&lt;p&gt;This is the single biggest backwards-incompatible change required before Datasette 1.0. I plan to apply the lessons I learned from this project to the other, less intimidating changes. I'm hoping this can result in a final 1.0 release before the end of the year!&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex-cli"&gt;codex-cli&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="python"/><category term="sql"/><category term="sqlite"/><category term="datasette"/><category term="annotated-release-notes"/><category term="uv"/><category term="coding-agents"/><category term="claude-code"/><category term="codex-cli"/></entry><entry><title>Superpowers: How I'm using coding agents in October 2025</title><link href="https://simonwillison.net/2025/Oct/10/superpowers/#atom-tag" rel="alternate"/><published>2025-10-10T23:30:14+00:00</published><updated>2025-10-10T23:30:14+00:00</updated><id>https://simonwillison.net/2025/Oct/10/superpowers/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.fsck.com/2025/10/09/superpowers/"&gt;Superpowers: How I&amp;#x27;m using coding agents in October 2025&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A follow-up to Jesse Vincent's post &lt;a href="https://blog.fsck.com/2025/10/05/how-im-using-coding-agents-in-september-2025/"&gt;about September&lt;/a&gt;, but this is a really significant piece in its own right.&lt;/p&gt;
&lt;p&gt;Jesse is one of the most creative users of coding agents (Claude Code in particular) that I know. He's put a great amount of work into evolving an effective process for working with them, encouraging red/green TDD (watch the test fail first), planning steps, self-updating memory notes and even implementing a &lt;a href="https://blog.fsck.com/2025/05/28/dear-diary-the-user-asked-me-if-im-alive/"&gt;feelings journal&lt;/a&gt; ("I feel engaged and curious about this project" - Claude).&lt;/p&gt;
&lt;p&gt;Claude Code &lt;a href="https://www.anthropic.com/news/claude-code-plugins"&gt;just launched plugins&lt;/a&gt;, and Jesse is celebrating by wrapping up a whole host of his accumulated tricks as a new plugin called &lt;a href="https://github.com/obra/superpowers"&gt;Superpowers&lt;/a&gt;. You can add it to your Claude Code like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/plugin marketplace add obra/superpowers-marketplace
/plugin install superpowers@superpowers-marketplace
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There's a lot in here! It's worth spending some time &lt;a href="https://github.com/obra/superpowers"&gt;browsing the repository&lt;/a&gt; - here's just one fun example, in &lt;a href="https://github.com/obra/superpowers/blob/main/skills/debugging/root-cause-tracing/SKILL.md"&gt;skills/debugging/root-cause-tracing/SKILL.md&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;---
name: Root Cause Tracing
description: Systematically trace bugs backward through call stack to find original trigger
when_to_use: Bug appears deep in call stack but you need to find where it originates
version: 1.0.0
languages: all
---
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Overview&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Bugs often manifest deep in the call stack (git init in wrong directory, file created in wrong location, database opened with wrong path). Your instinct is to fix where the error appears, but that's treating a symptom.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Core principle:&lt;/strong&gt; Trace backward through the call chain until you find the original trigger, then fix at the source.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When to Use&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;digraph when_to_use {
    "Bug appears deep in stack?" [shape=diamond];
    "Can trace backwards?" [shape=diamond];
    "Fix at symptom point" [shape=box];
    "Trace to original trigger" [shape=box];
    "BETTER: Also add defense-in-depth" [shape=box];

    "Bug appears deep in stack?" -&amp;gt; "Can trace backwards?" [label="yes"];
    "Can trace backwards?" -&amp;gt; "Trace to original trigger" [label="yes"];
    "Can trace backwards?" -&amp;gt; "Fix at symptom point" [label="no - dead end"];
    "Trace to original trigger" -&amp;gt; "BETTER: Also add defense-in-depth";
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;[...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This one is particularly fun because it then includes a &lt;a href="https://en.wikipedia.org/wiki/DOT_(graph_description_language)"&gt;Graphviz DOT graph&lt;/a&gt; illustrating the process - it turns out Claude can interpret those as workflow instructions just fine, and Jesse has been &lt;a href="https://blog.fsck.com/2025/09/29/using-graphviz-for-claudemd/"&gt;wildly experimenting with them&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://claude.ai/share/2b78a93e-cdc3-4b1d-9b02-457eb62140a5"&gt;vibe-coded up&lt;/a&gt; a quick URL-based DOT visualizer, &lt;a href="https://tools.simonwillison.net/dot#digraph%20when_to_use%20%7B%0A%20%20%20%20%22Bug%20appears%20deep%20in%20stack%3F%22%20%5Bshape%3Ddiamond%5D%3B%0A%20%20%20%20%22Can%20trace%20backwards%3F%22%20%5Bshape%3Ddiamond%5D%3B%0A%20%20%20%20%22Fix%20at%20symptom%20point%22%20%5Bshape%3Dbox%5D%3B%0A%20%20%20%20%22Trace%20to%20original%20trigger%22%20%5Bshape%3Dbox%5D%3B%0A%20%20%20%20%22BETTER%3A%20Also%20add%20defense-in-depth%22%20%5Bshape%3Dbox%5D%3B%0A%0A%20%20%20%20%22Bug%20appears%20deep%20in%20stack%3F%22%20-%3E%20%22Can%20trace%20backwards%3F%22%20%5Blabel%3D%22yes%22%5D%3B%0A%20%20%20%20%22Can%20trace%20backwards%3F%22%20-%3E%20%22Trace%20to%20original%20trigger%22%20%5Blabel%3D%22yes%22%5D%3B%0A%20%20%20%20%22Can%20trace%20backwards%3F%22%20-%3E%20%22Fix%20at%20symptom%20point%22%20%5Blabel%3D%22no%20-%20dead%20end%22%5D%3B%0A%20%20%20%20%22Trace%20to%20original%20trigger%22%20-%3E%20%22BETTER%3A%20Also%20add%20defense-in-depth%22%3B%0A%7D"&gt;here's that one rendered&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="The above DOT rendered as an image" src="https://static.simonwillison.net/static/2025/jesse-dot.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;There is &lt;em&gt;so much&lt;/em&gt; to learn about putting these tools to work in the most effective way possible. Jesse is way ahead of the curve, so it's absolutely worth spending some time exploring what he's shared so far.&lt;/p&gt;
&lt;p&gt;And if you're worried about filling up your context with a bunch of extra stuff, here's &lt;a href="https://bsky.app/profile/s.ly/post/3m2srmkergc2p"&gt;a reassuring note from Jesse&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The core of it is VERY token light. It pulls in one doc of fewer than 2k tokens. As it needs bits of the process, it runs a shell script to search for them.  The long end to end chat for the planning and implementation process for that todo list app was 100k tokens.&lt;/p&gt;
&lt;p&gt;It uses subagents to manage token-heavy stuff, including all the actual implementation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(Jesse's post also tipped me off about Claude's &lt;code&gt;/mnt/skills/public&lt;/code&gt; folder, see &lt;a href="https://simonwillison.net/2025/Oct/10/claude-skills/"&gt;my notes here&lt;/a&gt;.)&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sub-agents"&gt;sub-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jesse-vincent"&gt;jesse-vincent&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/skills"&gt;skills&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="vibe-coding"/><category term="coding-agents"/><category term="claude-code"/><category term="sub-agents"/><category term="jesse-vincent"/><category term="skills"/></entry><entry><title>Claude can write complete Datasette plugins now</title><link href="https://simonwillison.net/2025/Oct/8/claude-datasette-plugins/#atom-tag" rel="alternate"/><published>2025-10-08T23:43:43+00:00</published><updated>2025-10-08T23:43:43+00:00</updated><id>https://simonwillison.net/2025/Oct/8/claude-datasette-plugins/#atom-tag</id><summary type="html">
    &lt;p&gt;This isn't necessarily surprising, but it's worth noting anyway. Claude Sonnet 4.5 is capable of building a full Datasette plugin now.&lt;/p&gt;
&lt;p&gt;I've seen models complete aspects of this in the past, but today is the first time I've shipped a new plugin where every line of code and test was written by Claude, with minimal prompting from myself.&lt;/p&gt;
&lt;p&gt;The plugin is called &lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-os-info"&gt;datasette-os-info&lt;/a&gt;&lt;/strong&gt;. It's a simple debugging tool - all it does is add a &lt;code&gt;/-/os&lt;/code&gt; JSON page which dumps out as much information as it can about the OS it's running on. Here's a &lt;a href="https://til.simonwillison.net/-/os"&gt;live demo&lt;/a&gt; on my TIL website.&lt;/p&gt;
&lt;p&gt;I built it to help experiment with changing the Docker base container that Datasette uses to &lt;a href="https://docs.datasette.io/en/stable/publish.html"&gt;publish images&lt;/a&gt; to one that uses Python 3.14.&lt;/p&gt;
&lt;p&gt;Here's the full set of commands I used to create the plugin. I started with my &lt;a href="https://github.com/simonw/datasette-plugin"&gt;datasette-plugin&lt;/a&gt; cookiecutter template:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx cookiecutter gh:simonw/datasette-plugin

  [1/8] &lt;span class="pl-en"&gt;plugin_name&lt;/span&gt; (): os-info
  [2/8] &lt;span class="pl-en"&gt;description&lt;/span&gt; (): Information about the current OS
  [3/8] hyphenated (os-info): 
  [4/8] underscored (os_info): 
  [5/8] &lt;span class="pl-en"&gt;github_username&lt;/span&gt; (): datasette
  [6/8] &lt;span class="pl-en"&gt;author_name&lt;/span&gt; (): Simon Willison
  [7/8] &lt;span class="pl-en"&gt;include_static_directory&lt;/span&gt; (): 
  [8/8] &lt;span class="pl-en"&gt;include_templates_directory&lt;/span&gt; (): &lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This created a &lt;code&gt;datasette-os-info&lt;/code&gt; directory with the initial &lt;code&gt;pyproject.toml&lt;/code&gt; and &lt;code&gt;tests/&lt;/code&gt; and &lt;code&gt;datasette_os_info/__init__.py&lt;/code&gt; files. Here's an example of &lt;a href="https://github.com/simonw/datasette-plugin-template-demo"&gt;that starter template&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I created a &lt;code&gt;uv&lt;/code&gt; virtual environment for it, installed the initial test dependencies and ran &lt;code&gt;pytest&lt;/code&gt; to check that worked:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c1"&gt;cd&lt;/span&gt; datasette-os-info
uv venv
uv sync --extra &lt;span class="pl-c1"&gt;test&lt;/span&gt;
uv run pytest&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then I fired up &lt;a href="https://www.claude.com/product/claude-code"&gt;Claude Code&lt;/a&gt; in that directory in YOLO mode:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;claude --dangerously-skip-permissions&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(I actually used my &lt;code&gt;claude-yolo&lt;/code&gt; shortcut which runs the above.)&lt;/p&gt;
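&lt;p&gt;That shortcut can be as simple as a one-line shell function:&lt;/p&gt;

```shell
# claude-yolo: run Claude Code with permission prompts disabled.
claude-yolo() {
    claude --dangerously-skip-permissions "$@"
}
```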
&lt;p&gt;Then, in Claude, I told it how to run the tests:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run uv run pytest&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;When that worked, I told it to build the plugin:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;This is a Datasette plugin which should add a new page /-/os which returns pretty-printed JSON about the current operating system - implement it. I want to pick up as many details as possible across as many OS as possible, including if possible figuring out the base image if it is in a docker container - otherwise the Debian OS release name and suchlike would be good&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;... and that was it! Claude &lt;a href="https://github.com/datasette/datasette-os-info/blob/0.1/datasette_os_info/__init__.py"&gt;implemented the plugin&lt;/a&gt; using Datasette's &lt;a href="https://docs.datasette.io/en/stable/plugin_hooks.html#register-routes-datasette"&gt;register_routes() plugin hook&lt;/a&gt; to add the &lt;code&gt;/-/os&lt;/code&gt; page, and then without me prompting it to do so &lt;a href="https://github.com/datasette/datasette-os-info/blob/0.1/tests/test_os_info.py"&gt;built this basic test as well&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It ran the new test, spotted a bug (it had guessed a non-existent &lt;code&gt;Response(..., default_repr=)&lt;/code&gt; parameter), fixed the bug and declared itself done.&lt;/p&gt;
&lt;p&gt;I built myself a wheel:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv pip install build
uv run python -m build&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then uploaded that to an S3 bucket and deployed it to test it out using &lt;code&gt;datasette publish ... --install URL-to-wheel&lt;/code&gt;.  It did exactly what I had hoped - here's what that &lt;code&gt;/-/os&lt;/code&gt; page looked like:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"platform"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"system"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Linux&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"release"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;4.4.0&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"version"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;#1 SMP Sun Jan 10 15:06:54 PST 2016&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"machine"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;x86_64&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"processor"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"architecture"&lt;/span&gt;: [
      &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;64bit&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    ],
    &lt;span class="pl-ent"&gt;"platform"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Linux-4.4.0-x86_64-with-glibc2.41&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"python_version"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;3.14.0&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"python_implementation"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;CPython&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  },
  &lt;span class="pl-ent"&gt;"hostname"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;localhost&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"cpu_count"&lt;/span&gt;: &lt;span class="pl-c1"&gt;2&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"linux"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"os_release"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"PRETTY_NAME"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Debian GNU/Linux 13 (trixie)&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"NAME"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Debian GNU/Linux&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"VERSION_ID"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;13&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"VERSION"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;13 (trixie)&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"VERSION_CODENAME"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;trixie&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"DEBIAN_VERSION_FULL"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;13.1&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"ID"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;debian&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"HOME_URL"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://www.debian.org/&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"SUPPORT_URL"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://www.debian.org/support&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"BUG_REPORT_URL"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://bugs.debian.org/&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    },
    &lt;span class="pl-ent"&gt;"debian_version"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;13.1&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"kernel"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"sysname"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Linux&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"nodename"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;localhost&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"release"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;4.4.0&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"version"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;#1 SMP Sun Jan 10 15:06:54 PST 2016&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"machine"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;x86_64&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    },
    &lt;span class="pl-ent"&gt;"cpu_model"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;unknown&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"memory"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"MemTotal"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;1048576 kB&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"MemFree"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;958244 kB&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"MemAvailable"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;958244 kB&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"SwapTotal"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;0 kB&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"SwapFree"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;0 kB&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    }
  },
  &lt;span class="pl-ent"&gt;"environment"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"HOME"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;/root&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"PATH"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  },
  &lt;span class="pl-ent"&gt;"python_executable"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;/usr/local/bin/python3.14&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I dread to think how long it would have taken me to figure out all of those different useful OS details!&lt;/p&gt;
&lt;p&gt;To finish up, I had Claude fill out the README via a couple of extra prompts:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Add usage instructions to the README (about the /-/os page)&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Then had a brief conversation about how safe the output was:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Is there anything potentially sensitive in that output?&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Claude said:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Yes, there are several potentially sensitive items in the output:&lt;/p&gt;
&lt;p&gt;Most sensitive:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;hostname - Could reveal internal naming schemes or organization details&lt;/li&gt;
&lt;li&gt;Environment variables:
&lt;ul&gt;
&lt;li&gt;HOME - Reveals user's home directory path&lt;/li&gt;
&lt;li&gt;USER - Reveals the username&lt;/li&gt;
&lt;li&gt;PATH - Reveals filesystem structure and software installation locations&lt;/li&gt;
&lt;li&gt;VIRTUAL_ENV - Could reveal project paths and directory structure&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;python_executable - Reveals filesystem paths and user directories&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Moderately sensitive:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;CPU count and memory info - Could reveal infrastructure capacity&lt;/li&gt;
&lt;li&gt;Kernel version details - Could potentially be used to identify specific vulnerabilities&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Considerations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;This endpoint is accessible without authentication by default (at /-/os)&lt;/li&gt;
&lt;li&gt;If you're running Datasette on a public server, this information would be visible to anyone&lt;/li&gt;
&lt;li&gt;The information could help attackers understand your infrastructure setup&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Suggestions:&lt;/p&gt;
&lt;p&gt;You might want to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Add authentication requirements to this endpoint&lt;/li&gt;
&lt;li&gt;Make certain fields optional/configurable&lt;/li&gt;
&lt;li&gt;Add a warning in the README that this exposes system information&lt;/li&gt;
&lt;li&gt;Consider filtering out or redacting sensitive paths (replace usernames with ***, etc.)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Would you like me to add any of these protections to the plugin?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In this case I'm fine with the current behavior provided people understand the risks, so I told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Add the warning to the README&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And then (because the warning it added was quite long):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Drop the bit about configuring the permission system, just have a short warning telling people to review what it exposes&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And requested an extra README note:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Add a note that you can also see the output by running: datasette --get /-/os&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Three last prompts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Add uv.lock to gitignore&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Drop Python 3.9 and add Python 3.14 - to the GitHub workflows, also min version in pyproject.toml&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Bump to setup-python@v6&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;... and that was the project finished. I pushed it &lt;a href="https://github.com/datasette/datasette-os-info"&gt;to GitHub&lt;/a&gt;, &lt;a href="https://til.simonwillison.net/pypi/pypi-releases-from-github"&gt;configured Trusted Publishing&lt;/a&gt; for it on PyPI and posted &lt;a href="https://github.com/datasette/datasette-os-info/releases/tag/0.1"&gt;the 0.1 release&lt;/a&gt;, which ran &lt;a href="https://github.com/datasette/datasette-os-info/blob/0.1/.github/workflows/publish.yml"&gt;this GitHub Actions publish.yml&lt;/a&gt; and deployed that release &lt;a href="https://pypi.org/project/datasette-os-info/"&gt;to datasette-os-info on PyPI&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Now that it's live you can try it out without even installing Datasette using a &lt;code&gt;uv&lt;/code&gt; one-liner like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv run --isolated \
  --with datasette-os-info \
  datasette --get /-/os&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That's using the &lt;code&gt;--get PATH&lt;/code&gt; CLI option to show what that path in the Datasette instance would return, as &lt;a href="https://docs.datasette.io/en/stable/cli-reference.html#datasette-get"&gt;described in the Datasette documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I've shared &lt;a href="https://gist.github.com/simonw/85fd7a76589dc01950e71d8e606cd5dd"&gt;my full Claude Code transcript&lt;/a&gt; in a Gist.&lt;/p&gt;
&lt;p&gt;A year ago I'd have been &lt;em&gt;very&lt;/em&gt; impressed by this. Today I wasn't even particularly surprised that this worked - the coding agent pattern implemented by Claude Code is spectacularly effective when you combine it with pre-existing templates, and Datasette has been around for long enough now that plenty of examples of plugins have made it into the training data for the leading models.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="python"/><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="uv"/><category term="coding-agents"/><category term="claude-code"/></entry><entry><title>llm-fragments-youtube</title><link href="https://simonwillison.net/2025/Jun/14/llm-fragments-youtube/#atom-tag" rel="alternate"/><published>2025-06-14T18:19:37+00:00</published><updated>2025-06-14T18:19:37+00:00</updated><id>https://simonwillison.net/2025/Jun/14/llm-fragments-youtube/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/redraw/llm-fragments-youtube"&gt;llm-fragments-youtube&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Excellent new &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; plugin by Agustin Bacigalup which lets you use the subtitles of any YouTube video as a &lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/"&gt;fragment&lt;/a&gt; for running prompts against.&lt;/p&gt;
&lt;p&gt;I tried it out like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-fragments-youtube
llm -f youtube:dQw4w9WgXcQ \
  'summary of people and what they do'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which returned (&lt;a href="https://gist.github.com/simonw/6098f6e48981d84d1130f34566d0a116"&gt;full transcript&lt;/a&gt;):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The lyrics you've provided are from the song "Never Gonna Give You Up" by Rick Astley. The song features a narrator who is expressing unwavering love and commitment to another person. Here's a summary of the people involved and their roles:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Narrator (Singer):&lt;/strong&gt; A person deeply in love, promising loyalty, honesty, and emotional support. They emphasize that they will never abandon, hurt, or deceive their partner.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Partner (Implied Listener):&lt;/strong&gt; The person the narrator is addressing, who is experiencing emotional pain or hesitation ("Your heart's been aching but you're too shy to say it"). The narrator is encouraging them to understand and trust in the commitment being offered.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In essence, the song portrays a one-sided but heartfelt pledge of love, with the narrator assuring their partner of their steadfast dedication.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The plugin works by including &lt;a href="https://github.com/yt-dlp/yt-dlp"&gt;yt-dlp&lt;/a&gt; as a Python dependency and then executing it via a call to &lt;code&gt;subprocess.run()&lt;/code&gt;.&lt;/p&gt;
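&lt;p&gt;A minimal sketch of that approach might look like this - note the exact &lt;code&gt;yt-dlp&lt;/code&gt; flags here are my illustration, not necessarily the ones the plugin uses:&lt;/p&gt;

```python
import subprocess

def subtitles_command(video_id, lang="en"):
    # Hypothetical sketch: ask yt-dlp for auto-generated subtitles
    # without downloading the video itself. The real plugin's flags
    # may differ.
    return [
        "yt-dlp",
        "--skip-download",
        "--write-auto-subs",
        "--sub-langs", lang,
        f"https://www.youtube.com/watch?v={video_id}",
    ]

cmd = subtitles_command("dQw4w9WgXcQ")
# The plugin would then shell out roughly like this:
# subprocess.run(cmd, check=True, capture_output=True)
print(" ".join(cmd))
```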


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/youtube"&gt;youtube&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="youtube"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/></entry><entry><title>llm-mistral 0.14</title><link href="https://simonwillison.net/2025/May/29/llm-mistral-014/#atom-tag" rel="alternate"/><published>2025-05-29T03:33:17+00:00</published><updated>2025-05-29T03:33:17+00:00</updated><id>https://simonwillison.net/2025/May/29/llm-mistral-014/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-mistral/releases/tag/0.14"&gt;llm-mistral 0.14&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://github.com/simonw/llm-mistral/issues/31"&gt;added tool-support&lt;/a&gt; to my plugin for accessing the Mistral API from LLM today, plus support for Mistral's new &lt;a href="https://simonwillison.net/2025/May/28/codestral-embed/"&gt;Codestral Embed&lt;/a&gt; embedding model.&lt;/p&gt;
&lt;p&gt;An interesting challenge here is that I'm not using an official client library for &lt;code&gt;llm-mistral&lt;/code&gt; - I rolled my own client on top of their streaming HTTP API using Florimond Manca's &lt;a href="https://github.com/florimondmanca/httpx-sse"&gt;httpx-sse&lt;/a&gt; library. It's a very pleasant way to interact with streaming APIs - here's &lt;a href="https://github.com/simonw/llm-mistral/blob/098a4eaf624a3a723f91381915f93b4783d498bc/llm_mistral.py#L456-L502"&gt;my code that does most of the work&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The problem I faced is that Mistral's API &lt;a href="https://docs.mistral.ai/capabilities/function_calling/"&gt;documentation for function calling&lt;/a&gt; has examples in Python and TypeScript but doesn't include &lt;code&gt;curl&lt;/code&gt; or direct documentation of their HTTP endpoints!&lt;/p&gt;
&lt;p&gt;I needed documentation at the HTTP level. Could I maybe extract that directly from Mistral's official Python library?&lt;/p&gt;
&lt;p&gt;It turns out &lt;a href="https://github.com/simonw/llm-mistral/issues/31#issuecomment-2917121330"&gt;I could&lt;/a&gt;. I started by cloning the repo:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;git clone https://github.com/mistralai/client-python
&lt;span class="pl-c1"&gt;cd&lt;/span&gt; client-python/src/mistralai
files-to-prompt &lt;span class="pl-c1"&gt;.&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; ttok&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;My &lt;a href="https://github.com/simonw/ttok"&gt;ttok&lt;/a&gt; tool gave me a token count of 212,410 (counted using OpenAI's tokenizer, but that's normally a close enough estimate) - Mistral's models tap out at 128,000 so I switched to Gemini 2.5 Flash which can easily handle that many.&lt;/p&gt;
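&lt;p&gt;As a rough sanity check on counts like that one, English text averages around four characters per token - a heuristic only, where &lt;code&gt;ttok&lt;/code&gt; gives the real number:&lt;/p&gt;

```python
def estimate_tokens(text):
    # Very rough heuristic: ~4 characters per token for English text.
    # ttok (using OpenAI's tokenizer) gives the precise count; this is
    # only a back-of-envelope approximation.
    return len(text) // 4

# A ~212,000-token codebase is roughly 850k characters, well past
# Mistral's 128,000-token context window.
print(estimate_tokens("a" * 850_000))
```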
&lt;p&gt;I ran this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;files-to-prompt -c &lt;span class="pl-c1"&gt;.&lt;/span&gt; &lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt; /tmp/mistral.txt

llm -f /tmp/mistral.txt \
  -m gemini-2.5-flash-preview-05-20 \
  -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Generate comprehensive HTTP API documentation showing
how function calling works, include example curl commands for each step&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The results were pretty spectacular! Gemini 2.5 Flash produced a &lt;a href="https://gist.github.com/simonw/03f2049cd9af6dc072e1ee33461f3437#response"&gt;detailed description&lt;/a&gt; of the exact set of HTTP APIs I needed to interact with, and the JSON formats I should pass to them.&lt;/p&gt;
&lt;p&gt;There are a bunch of steps needed to get tools working in a new model, as described in &lt;a href="https://llm.datasette.io/en/stable/plugins/advanced-model-plugins.html#supporting-tools"&gt;the LLM plugin authors documentation&lt;/a&gt;. I started working through them by hand... and then got lazy and decided to see if I could get a model to do the work for me.&lt;/p&gt;
&lt;p&gt;This time I tried the new Claude Opus 4. I fed it three files: my existing, incomplete &lt;code&gt;llm_mistral.py&lt;/code&gt;, a full copy of &lt;a href="https://github.com/simonw/llm-gemini/blob/6177aa2a0676bf004b374a8863914585aa93ca52/llm_gemini.py"&gt;llm_gemini.py&lt;/a&gt; with its working tools implementation and a copy of the API docs Gemini had written for me earlier. I prompted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;I need to update this Mistral code to add tool support. I've included examples of that code for Gemini, and a detailed README explaining the Mistral format.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Claude churned away and wrote me code that was &lt;em&gt;most&lt;/em&gt; of what I needed. I tested it in a bunch of different scenarios, pasted problems back into Claude to see what would happen, and eventually took over and finished the rest of the code myself. Here's &lt;a href="https://claude.ai/share/7c609a61-4b32-45ca-bdca-31bf4ef25d2d"&gt;the full transcript&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm a little sad I didn't use Mistral to write the code to support Mistral, but I'm pleased to add yet another model family to the list that's supported for tool usage in LLM.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/httpx"&gt;httpx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mistral"&gt;mistral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-4"&gt;claude-4&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="python"/><category term="ai"/><category term="httpx"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="claude"/><category term="mistral"/><category term="gemini"/><category term="llm-tool-use"/><category term="claude-4"/></entry><entry><title>llm-llama-server 0.2</title><link href="https://simonwillison.net/2025/May/28/llama-server-tools/#atom-tag" rel="alternate"/><published>2025-05-28T06:27:14+00:00</published><updated>2025-05-28T06:27:14+00:00</updated><id>https://simonwillison.net/2025/May/28/llama-server-tools/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-llama-server/releases/tag/0.2"&gt;llm-llama-server 0.2&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here's a second option for using LLM's &lt;a href="https://simonwillison.net/2025/May/27/llm-tools/"&gt;new tool support&lt;/a&gt; against local models (the first was via &lt;a href="https://github.com/taketwo/llm-ollama/releases/tag/0.11a0"&gt;llm-ollama&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;It turns out the &lt;code&gt;llama.cpp&lt;/code&gt; ecosystem has pretty robust OpenAI-compatible tool support already, so my &lt;code&gt;llm-llama-server&lt;/code&gt; plugin only needed &lt;a href="https://github.com/simonw/llm-llama-server/commit/f61626fb4737f4f17dc6a9689274d14c3f3cb8ad#diff-66221cd67281bbbfbc677b6e7a3fd6d1b2e5562f0f55bde58250bf3953b1853a"&gt;a quick upgrade&lt;/a&gt; to get those working there.&lt;/p&gt;
&lt;p&gt;Unfortunately it looks like streaming support doesn't work with tools in &lt;code&gt;llama-server&lt;/code&gt; at the moment, so I added a new model ID called &lt;code&gt;llama-server-tools&lt;/code&gt; which disables streaming and enables tools.&lt;/p&gt;
&lt;p&gt;Here's how to try it out. First, ensure you have &lt;code&gt;llama-server&lt;/code&gt; - the easiest way to get that on macOS is via Homebrew:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew install llama.cpp
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start the server running like this. This command will download and cache the 3.2GB &lt;a href="https://huggingface.co/unsloth/gemma-3-4b-it-GGUF"&gt;unsloth/gemma-3-4b-it-GGUF:Q4_K_XL&lt;/a&gt; if you don't yet have it:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llama-server --jinja -hf unsloth/gemma-3-4b-it-GGUF:Q4_K_XL
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then in another window:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-llama-server
llm -m llama-server-tools -T llm_time 'what time is it?' --td
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And since you don't even need an API key for this, even if you've never used LLM before you can try it out with this uvx one-liner:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx --with llm-llama-server llm -m llama-server-tools -T llm_time 'what time is it?' --td
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For more notes on using &lt;code&gt;llama.cpp&lt;/code&gt; with LLM see &lt;a href="https://simonwillison.net/2025/May/10/llama-cpp-vision/"&gt;Trying out llama.cpp’s new vision support&lt;/a&gt; from a couple of weeks ago.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama-cpp"&gt;llama-cpp&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llm"/><category term="llm-tool-use"/><category term="uv"/><category term="llama-cpp"/></entry><entry><title>llm-pdf-to-images</title><link href="https://simonwillison.net/2025/May/18/llm-pdf-to-images/#atom-tag" rel="alternate"/><published>2025-05-18T20:48:41+00:00</published><updated>2025-05-18T20:48:41+00:00</updated><id>https://simonwillison.net/2025/May/18/llm-pdf-to-images/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-pdf-to-images"&gt;llm-pdf-to-images&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Inspired by my previous &lt;a href="https://github.com/simonw/llm-video-frames"&gt;llm-video-frames&lt;/a&gt; plugin, I thought it would be neat to have a plugin for LLM that can take a PDF and turn that into an image-per-page so you can feed PDFs into models that support image inputs but don't yet support PDFs.&lt;/p&gt;
&lt;p&gt;This should now do exactly that:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-pdf-to-images
llm -f pdf-to-images:path/to/document.pdf &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Summarize this document&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Under the hood it's using the &lt;a href="https://github.com/pymupdf/PyMuPDF"&gt;PyMuPDF&lt;/a&gt; library. The key code to convert a PDF into images looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;fitz&lt;/span&gt;
&lt;span class="pl-s1"&gt;doc&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;fitz&lt;/span&gt;.&lt;span class="pl-c1"&gt;open&lt;/span&gt;(&lt;span class="pl-s"&gt;"input.pdf"&lt;/span&gt;)
&lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;page&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;doc&lt;/span&gt;:
    &lt;span class="pl-s1"&gt;pix&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;page&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_pixmap&lt;/span&gt;(&lt;span class="pl-s1"&gt;matrix&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;fitz&lt;/span&gt;.&lt;span class="pl-c1"&gt;Matrix&lt;/span&gt;(&lt;span class="pl-c1"&gt;300&lt;/span&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;span class="pl-c1"&gt;72&lt;/span&gt;, &lt;span class="pl-c1"&gt;300&lt;/span&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;span class="pl-c1"&gt;72&lt;/span&gt;))
    &lt;span class="pl-s1"&gt;jpeg_bytes&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;pix&lt;/span&gt;.&lt;span class="pl-c1"&gt;tobytes&lt;/span&gt;(&lt;span class="pl-s1"&gt;output&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"jpg"&lt;/span&gt;, &lt;span class="pl-s1"&gt;jpg_quality&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;30&lt;/span&gt;)&lt;/pre&gt;

&lt;p&gt;Once I'd figured out that code I got o4-mini to write most of the rest of the plugin, using &lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt; to load in the example code from the video plugin:&lt;/p&gt;
&lt;pre&gt;llm -f github:simonw/llm-video-frames &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;import fitz&lt;/span&gt;
&lt;span class="pl-s"&gt;doc = fitz.open("input.pdf")&lt;/span&gt;
&lt;span class="pl-s"&gt;for page in doc:&lt;/span&gt;
&lt;span class="pl-s"&gt;    pix = page.get_pixmap(matrix=fitz.Matrix(300/72, 300/72))&lt;/span&gt;
&lt;span class="pl-s"&gt;    jpeg_bytes = pix.tobytes(output="jpg", jpg_quality=30)&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;output llm_pdf_to_images.py which adds a pdf-to-images: &lt;/span&gt;
&lt;span class="pl-s"&gt; fragment loader that converts a PDF to frames using fitz like in the example&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
-m o4-mini&lt;/pre&gt;

&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/27af84e4e533872bfd59fcba69b4166f"&gt;the transcript&lt;/a&gt; - more details in &lt;a href="https://github.com/simonw/llm-pdf-to-images/issues/1"&gt;this issue&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I had some &lt;em&gt;weird&lt;/em&gt; results testing this with GPT 4.1 mini. I created &lt;a href="https://github.com/simonw/llm-pdf-to-images/blob/main/tests/blank-pages.pdf"&gt;a test PDF&lt;/a&gt; with two pages - one white, one black - and ran a test prompt like this:&lt;/p&gt;
&lt;pre&gt;llm -f &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;pdf-to-images:blank-pages.pdf&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe these images&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;blockquote&gt;
&lt;p&gt;The first image features a stylized red maple leaf with triangular facets, giving it a geometric appearance. The maple leaf is a well-known symbol associated with Canada.&lt;/p&gt;
&lt;p&gt;The second image is a simple black silhouette of a cat sitting and facing to the left. The cat's tail curls around its body. The design is minimalistic and iconic.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I got even wilder hallucinations for other prompts, like "summarize this document" or "describe all figures". I have a collection of those &lt;a href="https://gist.github.com/simonw/2fbef11b1737a9ae7da1b2ff58998454"&gt;in this Gist&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Thankfully this behavior is limited to GPT-4.1 mini. I upgraded to full GPT-4.1 and got &lt;a href="https://gist.github.com/simonw/0713dc0ce00bd6cd4d5990f44c865964#prompt-1"&gt;much more sensible results&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;llm -f &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;pdf-to-images:blank-pages.pdf&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe these images&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -m gpt-4.1&lt;/pre&gt;

&lt;blockquote&gt;
&lt;p&gt;Certainly! Here are the descriptions of the two images you provided:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;First image:&lt;/strong&gt; This image is completely white. It appears blank, with no discernible objects, text, or features.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Second image:&lt;/strong&gt; This image is entirely black. Like the first, it is blank and contains no visible objects, text, or distinct elements.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If you have questions or need a specific kind of analysis or modification, please let me know!&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/pdf"&gt;pdf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hallucinations"&gt;hallucinations&lt;/a&gt;&lt;/p&gt;



</summary><category term="pdf"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="hallucinations"/></entry><entry><title>django-simple-deploy</title><link href="https://simonwillison.net/2025/May/17/django-simple-deploy/#atom-tag" rel="alternate"/><published>2025-05-17T12:49:52+00:00</published><updated>2025-05-17T12:49:52+00:00</updated><id>https://simonwillison.net/2025/May/17/django-simple-deploy/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://django-simple-deploy.readthedocs.io/"&gt;django-simple-deploy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Eric Matthes presented a lightning talk about this project at PyCon US this morning. "Django has a deploy command now". You can run it like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install django-simple-deploy[fly_io]
# Add django_simple_deploy to INSTALLED_APPS.
python manage.py deploy --automate-all
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It's plugin-based (&lt;a href="https://github.com/django-simple-deploy/django-simple-deploy/issues/313"&gt;inspired by Datasette!&lt;/a&gt;) and the project has stable plugins for three hosting platforms: &lt;a href="https://github.com/django-simple-deploy/dsd-flyio"&gt;dsd-flyio&lt;/a&gt;, &lt;a href="https://github.com/django-simple-deploy/dsd-heroku"&gt;dsd-heroku&lt;/a&gt; and &lt;a href="https://github.com/django-simple-deploy/dsd-platformsh"&gt;dsd-platformsh&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Currently in development: &lt;a href="https://github.com/django-simple-deploy/dsd-vps"&gt;dsd-vps&lt;/a&gt; - a plugin that should work with any VPS provider, using &lt;a href="https://www.paramiko.org/"&gt;Paramiko&lt;/a&gt; to connect to a newly created instance and &lt;a href="https://github.com/django-simple-deploy/dsd-vps/blob/a372fc7b7fd31cd2ad3cf22d68b9c9fecb65d17a/dsd_vps/utils.py"&gt;run all of the commands&lt;/a&gt; needed to start serving a Django application.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/paramiko"&gt;paramiko&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/heroku"&gt;heroku&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fly"&gt;fly&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="paramiko"/><category term="plugins"/><category term="python"/><category term="heroku"/><category term="datasette"/><category term="fly"/></entry><entry><title>llm -f issue:... -m echo --no-log</title><link href="https://simonwillison.net/2025/May/9/private-issues/#atom-tag" rel="alternate"/><published>2025-05-09T18:55:24+00:00</published><updated>2025-05-09T18:55:24+00:00</updated><id>https://simonwillison.net/2025/May/9/private-issues/#atom-tag</id><summary type="html">
    &lt;p&gt;I had some notes in a GitHub issue thread in a private repository that I wanted to export as Markdown. I realized that I could get them using a combination of several recent projects.&lt;/p&gt;
&lt;p&gt;Here's what I ran:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export GITHUB_TOKEN="$(llm keys get github)"                                             
llm -f issue:https://github.com/simonw/todos/issues/170 \
  -m echo --no-log | jq .prompt -r &amp;gt; notes.md
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I have a GitHub personal access token stored in my LLM keys, for use with Anthony Shaw's &lt;a href="https://github.com/tonybaloney/llm-github-models"&gt;llm-github-models&lt;/a&gt; plugin.&lt;/p&gt;
&lt;p&gt;My own &lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt; plugin expects an optional &lt;code&gt;GITHUB_TOKEN&lt;/code&gt; environment variable, so I set that first - here's &lt;a href="https://github.com/simonw/llm-fragments-github/issues/11"&gt;an issue&lt;/a&gt; to have it use the &lt;code&gt;github&lt;/code&gt; key instead.&lt;/p&gt;
&lt;p&gt;With that set, the &lt;code&gt;issue:&lt;/code&gt; fragment loader can take a URL to a private GitHub issue thread and load it via the API using the token, then concatenate the comments together as Markdown. Here's &lt;a href="https://github.com/simonw/llm-fragments-github/blob/87555488805ffc973b5fb45aa51eac83be2c839f/llm_fragments_github.py#L92-L126"&gt;the code for that&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Fragments are meant to be used as input to LLMs. I recently built an &lt;a href="https://github.com/simonw/llm-echo"&gt;llm-echo&lt;/a&gt; plugin that adds a fake LLM called "echo", which simply echoes its input back out again.&lt;/p&gt;
&lt;p&gt;Adding &lt;code&gt;--no-log&lt;/code&gt; prevents that junk data from being stored in my &lt;a href="https://llm.datasette.io/en/stable/logging.html"&gt;LLM log database&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The output is JSON with a &lt;code&gt;"prompt"&lt;/code&gt; key for the original prompt. I use &lt;code&gt;jq .prompt&lt;/code&gt; to extract that out, then &lt;code&gt;-r&lt;/code&gt; to get it as raw text (not a &lt;code&gt;"JSON string"&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;... and I write the result to &lt;code&gt;notes.md&lt;/code&gt;.&lt;/p&gt;
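&lt;p&gt;If you don't have &lt;code&gt;jq&lt;/code&gt; to hand, that extraction step is a couple of lines of Python. This is a sketch using a made-up stand-in for the JSON shape llm-echo emits, not its real output:&lt;/p&gt;

```python
import json

# Illustrative stand-in for llm-echo's JSON output: the "prompt" key
# holds the original prompt text (here with JSON-escaped newlines).
raw = '{"prompt": "# My notes\\n\\nFirst comment...", "system": ""}'

prompt = json.loads(raw)["prompt"]  # equivalent of: jq .prompt -r
print(prompt)
```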

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jq"&gt;jq&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthony-shaw"&gt;anthony-shaw&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="plugins"/><category term="projects"/><category term="jq"/><category term="llm"/><category term="anthony-shaw"/></entry><entry><title>Feed a video to a vision LLM as a sequence of JPEG frames on the CLI (also LLM 0.25)</title><link href="https://simonwillison.net/2025/May/5/llm-video-frames/#atom-tag" rel="alternate"/><published>2025-05-05T17:38:25+00:00</published><updated>2025-05-05T17:38:25+00:00</updated><id>https://simonwillison.net/2025/May/5/llm-video-frames/#atom-tag</id><summary type="html">
    &lt;p&gt;The new &lt;strong&gt;&lt;a href="https://github.com/simonw/llm-video-frames"&gt;llm-video-frames&lt;/a&gt;&lt;/strong&gt; plugin can turn a video file into a sequence of JPEG frames and feed them directly into a long context vision LLM such as GPT-4.1, even when that LLM doesn't directly support video input. It depends on a plugin feature I added to &lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-25"&gt;LLM 0.25&lt;/a&gt;, which I released last night.&lt;/p&gt;
&lt;p&gt;Here's how to try it out:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;brew install ffmpeg &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; or apt-get or your package manager of choice&lt;/span&gt;
uv tool install llm &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; or pipx install llm or pip install llm&lt;/span&gt;
llm install llm-video-frames
llm keys &lt;span class="pl-c1"&gt;set&lt;/span&gt; openai
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Paste your OpenAI API key here&lt;/span&gt;

llm -f video-frames:video.mp4 \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe the key scenes in this video&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  -m gpt-4.1-mini&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;video-frames:filepath.mp4&lt;/code&gt; syntax is provided by the new plugin. It takes that video, extracts one JPEG frame per second and turns those frames into &lt;a href="https://llm.datasette.io/en/stable/usage.html#attachments"&gt;attachments&lt;/a&gt; that can be passed to the LLM.&lt;/p&gt;
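&lt;p&gt;Under the hood the heavy lifting is an &lt;code&gt;ffmpeg&lt;/code&gt; invocation along these lines. This is a hedged sketch of the approach - the plugin's exact flags and filename pattern may differ:&lt;/p&gt;

```python
# Build the kind of ffmpeg command that dumps N JPEG frames per second
# of video into a directory. The flags are my guess at the technique,
# not copied from the plugin's source.
def ffmpeg_frames_command(video_path, out_dir, fps=1):
    return [
        "ffmpeg", "-i", video_path,
        "-vf", f"fps={fps}",           # emit `fps` frames per second of video
        f"{out_dir}/frame_%05d.jpg",   # frame_00001.jpg, frame_00002.jpg, ...
    ]

print(ffmpeg_frames_command("cleo.mp4", "/tmp/frames", fps=2))
```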
&lt;p&gt;Here's a demo, against &lt;a href="https://static.simonwillison.net/static/2025/cleo.mp4"&gt;this video&lt;/a&gt; of Cleo:&lt;/p&gt;

&lt;div style="max-width: 100%; margin-bottom: 0.4em"&gt;
    &lt;video controls="controls" preload="none" aria-label="Cleo " poster="https://static.simonwillison.net/static/2025/cleo-still.jpg" loop="loop" style="width: 100%; height: auto;" muted="muted"&gt;
        &lt;source src="https://static.simonwillison.net/static/2025/cleo.mp4" type="video/mp4" /&gt;
    &lt;/video&gt;
&lt;/div&gt;

&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f video-frames:cleo.mp4 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;describe key moments&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -m gpt-4.1-mini&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And the output from the model (&lt;a href="https://gist.github.com/simonw/a4e26166a524f9c07b4cf32b2f8da6c9"&gt;transcript here&lt;/a&gt;):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The sequence of images captures the key moments of a dog being offered and then enjoying a small treat:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the first image, a hand is holding a small cupcake with purple frosting close to a black dog's face. The dog looks eager and is focused intently on the treat.&lt;/li&gt;
&lt;li&gt;The second image shows the dog beginning to take a bite of the cupcake from the person's fingers. The dog's mouth is open, gently nibbling on the treat.&lt;/li&gt;
&lt;li&gt;In the third image, the dog has finished or is almost done with the treat and looks content, with a slight smile and a relaxed posture. The treat is no longer in the person's hand, indicating that the dog has consumed it.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This progression highlights the dog's anticipation, enjoyment, and satisfaction with the treat.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Total cost: 7,072 input tokens, 156 output tokens - for GPT-4.1 mini that's 0.3078 cents (less than a third of a cent).&lt;/p&gt;
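&lt;p&gt;That figure checks out against GPT-4.1 mini's pricing at the time - $0.40 per million input tokens and $1.60 per million output tokens:&lt;/p&gt;

```python
# GPT-4.1 mini pricing in USD per million tokens (as billed at the time)
input_cost = 7_072 * 0.40 / 1_000_000    # $0.0028288
output_cost = 156 * 1.60 / 1_000_000     # $0.0002496
cents = (input_cost + output_cost) * 100
print(round(cents, 4))  # 0.3078 cents
```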
&lt;p&gt;In this case the plugin converted the video into three images: &lt;a href="https://static.simonwillison.net/static/2025/frame_00001.jpg"&gt;frame_00001.jpg&lt;/a&gt;, &lt;a href="https://static.simonwillison.net/static/2025/frame_00002.jpg"&gt;frame_00002.jpg&lt;/a&gt; and &lt;a href="https://static.simonwillison.net/static/2025/frame_00003.jpg"&gt;frame_00003.jpg&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The plugin accepts additional arguments. You can increase the frames-per-second using &lt;code&gt;?fps=2&lt;/code&gt; - for example:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;video-frames:video.mp4?fps=2&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;summarize this video&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or you can add &lt;code&gt;?timestamps=1&lt;/code&gt; to cause &lt;code&gt;ffmpeg&lt;/code&gt; to overlay a timestamp in the bottom right corner of each frame. This gives the model a chance to return timestamps in its output.&lt;/p&gt;
&lt;p&gt;Let's try that with the Cleo video:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;video-frames:cleo.mp4?timestamps=1&amp;amp;fps=5&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;key moments, include timestamps&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -m gpt-4.1-mini&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's the output (&lt;a href="https://gist.github.com/simonw/371719849a7d2260f1eb0f422ea63187"&gt;transcript here&lt;/a&gt;):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Here are the key moments from the video "cleo.mp4" with timestamps:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;00:00:00.000 - A dog on leash looks at a small cupcake with purple frosting being held by a person.&lt;/li&gt;
&lt;li&gt;00:00:00.800 - The dog closely sniffs the cupcake.&lt;/li&gt;
&lt;li&gt;00:00:01.400 - The person puts a small amount of the cupcake frosting on their finger.&lt;/li&gt;
&lt;li&gt;00:00:01.600 - The dog starts licking the frosting from the person's finger.&lt;/li&gt;
&lt;li&gt;00:00:02.600 - The dog continues licking enthusiastically.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let me know if you need more details or a description of any specific part.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That one sent 14 images to the API, at a total cost of 32,968 input, 141 output = 1.3413 cents.&lt;/p&gt;
&lt;p&gt;It sent 5.9MB of image data to OpenAI's API, encoded as base64 in the JSON API call.&lt;/p&gt;
&lt;p&gt;The GPT-4.1 model family accepts up to 1,047,576 tokens. Aside from a 20MB size limit per image I haven't seen any documentation of limits on the number of images. You can fit a whole lot of JPEG frames in a million tokens!&lt;/p&gt;
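&lt;p&gt;Base64 is the expensive part of sending images this way: it encodes every 3 bytes of image data as 4 ASCII characters, so the JSON payload carries roughly a third more bytes than the JPEGs themselves:&lt;/p&gt;

```python
import base64

# 3,000 bytes standing in for a JPEG frame (starts with the JPEG magic bytes)
jpeg_bytes = b"\xff\xd8" + b"\x00" * 2998

encoded = base64.b64encode(jpeg_bytes)
print(len(jpeg_bytes), len(encoded))  # 3000 4000 - a 4/3 inflation
```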
&lt;p&gt;Here's what one of those frames looks like with the timestamp overlaid in the corner:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/cleo-finger.jpg" alt="Cleo taking a treat from my fingers, in the bottom right corner is an overlay that says cleo.mp4 00:00:01.600" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="how-i-built-the-plugin-with-o4-mini"&gt;How I built the plugin with o4-mini&lt;/h4&gt;
&lt;p&gt;This is a great example of how rapid prototyping with an LLM can help demonstrate the value of a feature.&lt;/p&gt;
&lt;p&gt;I was considering whether it would make sense for fragment plugins to return images in &lt;a href="https://github.com/simonw/llm/issues/972#issuecomment-2849342103"&gt;issue 972&lt;/a&gt; when I had the idea to use &lt;code&gt;ffmpeg&lt;/code&gt; to split a video into frames.&lt;/p&gt;
&lt;p&gt;I know &lt;a href="https://simonwillison.net/2025/Apr/23/llm-fragment-symbex/"&gt;from past experience&lt;/a&gt; that a good model can write an entire plugin for LLM if you feed it the right example, so I started with this (reformatted here for readability):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m o4-mini -f github:simonw/llm-hacker-news -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;write a new plugin called llm_video_frames.py which takes video:path-to-video.mp4 and creates a temporary directory which it then populates with one frame per second of that video using ffmpeg - then it returns a list of [llm.Attachment(path="path-to-frame1.jpg"), ...] - it should also support passing video:video.mp4?fps=2 to increase to two frames per second, and if you pass ?timestamps=1 or &amp;amp;timestamps=1 then it should add a text timestamp to the bottom right conner of each image with the mm:ss timestamp of that frame (or hh:mm:ss if more than one hour in) and the filename of the video without the path as well.&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -o reasoning_effort high&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/4f545ecb347884d1d923dbc49550b8b0#response"&gt;the transcript&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The new attachment mechanism went from vague idea to "I should build that" as a direct result of having an LLM-built proof-of-concept that demonstrated the feasibility of the new feature.&lt;/p&gt;
&lt;p&gt;The code it produced was about 90% of the code I shipped in the finished plugin. Total cost 5,018 input, 2,208 output = 1.5235 cents.&lt;/p&gt;
&lt;h4 id="annotated-release-notes-for-everything-else-in-llm-0-25"&gt;Annotated release notes for everything else in LLM 0.25&lt;/h4&gt;
&lt;p&gt;Here are the annotated release notes for everything else:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New plugin feature: &lt;a href="https://llm.datasette.io/en/stable/plugins/plugin-hooks.html#plugin-hooks-register-fragment-loaders"&gt;register_fragment_loaders(register)&lt;/a&gt; plugins can now return a mixture of fragments and attachments. The &lt;a href="https://github.com/simonw/llm-video-frames"&gt;llm-video-frames&lt;/a&gt; plugin is the first to take advantage of this mechanism. &lt;a href="https://github.com/simonw/llm/issues/972"&gt;#972&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;As described above. The inspiration for this feature came from the &lt;a href="https://github.com/agustif/llm-arxiv"&gt;llm-arxiv&lt;/a&gt; plugin by &lt;a href="https://github.com/agustif"&gt;agustif&lt;/a&gt;.&lt;/p&gt;
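&lt;p&gt;In outline, a loader using this hook returns attachments alongside (or instead of) string fragments. This sketch uses a stand-in &lt;code&gt;Attachment&lt;/code&gt; class rather than the real &lt;code&gt;llm.Attachment&lt;/code&gt;, and the paths are invented, so treat the shapes as illustrative:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Attachment:  # stand-in for llm.Attachment(path=...)
    path: str

def video_frames_loader(argument):
    """Toy version of a register_fragment_loaders() callback: given the
    argument after 'video-frames:', return a list mixing text fragments
    and attachments (the frame paths here are made up)."""
    frames = [f"/tmp/frames/frame_{n:05d}.jpg" for n in range(1, 4)]
    return [f"Frames extracted from {argument}"] + [Attachment(p) for p in frames]

result = video_frames_loader("cleo.mp4")
print(result[0])
print(result[1].path)
```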
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New OpenAI models: &lt;code&gt;gpt-4.1&lt;/code&gt;, &lt;code&gt;gpt-4.1-mini&lt;/code&gt;, &lt;code&gt;gpt-4.1-nano&lt;/code&gt;, &lt;code&gt;o3&lt;/code&gt;, &lt;code&gt;o4-mini&lt;/code&gt;. &lt;a href="https://github.com/simonw/llm/issues/945"&gt;#945&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm/issues/965"&gt;#965&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm/issues/976"&gt;#976&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;My original plan was to leave these models exclusively to the new &lt;a href="https://github.com/simonw/llm-openai-plugin"&gt;llm-openai&lt;/a&gt; plugin, since that allows me to add support for new models without a full LLM release. I'm going to punt on that until I'm ready to entirely remove the OpenAI models from LLM core.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New environment variables: &lt;code&gt;LLM_MODEL&lt;/code&gt; and &lt;code&gt;LLM_EMBEDDING_MODEL&lt;/code&gt; for setting the model to use without needing to specify &lt;code&gt;-m model_id&lt;/code&gt; every time. &lt;a href="https://github.com/simonw/llm/issues/932"&gt;#932&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;A convenience feature for when you want to set the default model for a terminal session without using LLM's global &lt;a href="https://llm.datasette.io/en/stable/setup.html#setting-a-custom-default-model"&gt;"default model" mechanism&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New command: &lt;code&gt;llm fragments loaders&lt;/code&gt;, to list all currently available fragment loader prefixes provided by plugins. &lt;a href="https://github.com/simonw/llm/issues/941"&gt;#941&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Mainly for consistency with the existing &lt;a href="https://llm.datasette.io/en/stable/help.html#llm-templates-loaders-help"&gt;llm templates loaders&lt;/a&gt; command. Here's the output when I run &lt;code&gt;llm fragments loaders&lt;/code&gt; on my machine:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;docs:
  Fetch the latest documentation for the specified package from
  https://github.com/simonw/docs-for-llms

  Use '-f docs:' for the documentation of your current version of LLM.

docs-preview:
  Similar to docs: but fetches the latest docs including alpha/beta releases.

symbex:
  Walk the given directory, parse every .py file, and for every
  top-level function or class-method produce its signature and
  docstring plus an import line.

github:
  Load files from a GitHub repository as fragments

  Argument is a GitHub repository URL or username/repository

issue:
  Fetch GitHub issue/pull and comments as Markdown

  Argument is either "owner/repo/NUMBER" or URL to an issue

pr:
  Fetch GitHub pull request with comments and diff as Markdown

  Argument is either "owner/repo/NUMBER" or URL to a pull request

hn:
  Given a Hacker News article ID returns the full nested conversation.

  For example: -f hn:43875136

video-frames:
  Fragment loader "video-frames:&amp;lt;path&amp;gt;?fps=N&amp;amp;timestamps=1"
  - extracts frames at `fps` per second (default 1)
  - if `timestamps=1`, overlays "filename hh:mm:ss" at bottom-right
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That's from &lt;a href="https://github.com/simonw/llm-docs"&gt;llm-docs&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-fragments-symbex"&gt;llm-fragments-symbex&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-hacker-news"&gt;llm-hacker-news&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-video-frames"&gt;llm-video-frames&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm fragments&lt;/code&gt; command now shows fragments ordered by the date they were first used. &lt;a href="https://github.com/simonw/llm/issues/973"&gt;#973&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This makes it easier to quickly debug a new fragment plugin - you can run &lt;code&gt;llm fragments&lt;/code&gt; and glance at the bottom few entries.&lt;/p&gt;
&lt;p&gt;I've also been using the new &lt;a href="https://github.com/simonw/llm-echo"&gt;llm-echo&lt;/a&gt; debugging plugin for this - it adds a fake model called "echo" which simply outputs the prompt, system prompt, fragments and attachments that were passed to it:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f docs:sqlite-utils -m &lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Show me the context&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/simonw/cb3249856887379759515022c76d0d9e"&gt;Output here&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm chat&lt;/code&gt; now includes a &lt;code&gt;!edit&lt;/code&gt; command for editing a prompt using your default terminal text editor. Thanks, &lt;a href="https://github.com/Hopiu"&gt;Benedikt Willi&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/pull/969"&gt;#969&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a really nice enhancement to &lt;code&gt;llm chat&lt;/code&gt;, making it much more convenient to edit longer prompts.&lt;/p&gt;
&lt;p&gt;And the rest:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Allow &lt;code&gt;-t&lt;/code&gt; and &lt;code&gt;--system&lt;/code&gt; to be used at the same time. &lt;a href="https://github.com/simonw/llm/issues/916"&gt;#916&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Fixed a bug where accessing a model via its alias would fail to respect any default options set for that model. &lt;a href="https://github.com/simonw/llm/issues/968"&gt;#968&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Improved documentation for &lt;a href="https://llm.datasette.io/en/stable/other-models.html#openai-compatible-models"&gt;extra-openai-models.yaml&lt;/a&gt;. Thanks, &lt;a href="https://github.com/rahimnathwani"&gt;Rahim Nathwani&lt;/a&gt; and &lt;a href="https://github.com/dguido"&gt;Dan Guido&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/pull/950"&gt;#950&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm/pull/957"&gt;#957&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;llm -c/--continue&lt;/code&gt; now works correctly with the &lt;code&gt;-d/--database&lt;/code&gt; option. &lt;code&gt;llm chat&lt;/code&gt; now accepts that &lt;code&gt;-d/--database&lt;/code&gt; option. Thanks, &lt;a href="https://github.com/sukhbinder"&gt;Sukhbinder Singh&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/issues/933"&gt;#933&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ffmpeg"&gt;ffmpeg&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vision-llms"&gt;vision-llms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="ffmpeg"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="vision-llms"/></entry><entry><title>llm-fragments-github 0.2</title><link href="https://simonwillison.net/2025/Apr/20/llm-fragments-github/#atom-tag" rel="alternate"/><published>2025-04-20T14:01:09+00:00</published><updated>2025-04-20T14:01:09+00:00</updated><id>https://simonwillison.net/2025/Apr/20/llm-fragments-github/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-fragments-github/releases/tag/0.2"&gt;llm-fragments-github 0.2&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I upgraded my &lt;code&gt;llm-fragments-github&lt;/code&gt; plugin to add a new fragment type called &lt;code&gt;issue&lt;/code&gt;. It lets you pull the entire content of a GitHub issue thread into your prompt as a concatenated Markdown file.&lt;/p&gt;
&lt;p&gt;(If you haven't seen fragments before I introduced them in &lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/"&gt;Long context support in LLM 0.24 using fragments and template plugins&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;I used it just now to have Gemini 2.5 Pro provide feedback and attempt an implementation of a complex issue against my &lt;a href="https://github.com/simonw/llm"&gt;LLM&lt;/a&gt; project:&lt;/p&gt;
&lt;pre&gt;llm install llm-fragments-github
llm -f github:simonw/llm \
  -f issue:simonw/llm/938 \
  -m gemini-2.5-pro-exp-03-25 \
  --system &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;muse on this issue, then propose a whole bunch of code to help implement it&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here I'm loading the FULL content of the &lt;code&gt;simonw/llm&lt;/code&gt; repo using that &lt;code&gt;-f github:simonw/llm&lt;/code&gt; fragment (&lt;a href="https://github.com/simonw/llm-fragments-github?tab=readme-ov-file#usage"&gt;documented here&lt;/a&gt;), then loading all of the comments from &lt;a href="https://github.com/simonw/llm/issues/938"&gt;issue 938&lt;/a&gt; where I discuss quite a complex potential refactoring. I ask Gemini 2.5 Pro to "muse on this issue" and come up with some code.&lt;/p&gt;
&lt;p&gt;This worked &lt;em&gt;shockingly&lt;/em&gt; well. Here's &lt;a href="https://gist.github.com/simonw/a5f0c1e8184f4ddc8b71b30890fe690c#response"&gt;the full response&lt;/a&gt;, which highlighted a few things I hadn't considered yet (such as the need to migrate old database records to the new tree hierarchy) and then spat out a whole bunch of code which looks like a solid start to the actual implementation work I need to do.&lt;/p&gt;
&lt;p&gt;I ran this against Google's free Gemini 2.5 Preview, but if I'd used the paid model it would have cost me 202,680 input tokens, 10,460 output tokens and 1,859 thinking tokens for a total of 62.989 cents.&lt;/p&gt;
&lt;p&gt;As a fun extra, the new &lt;code&gt;issue:&lt;/code&gt; feature itself was written almost entirely by OpenAI o3, again using fragments. I ran this:&lt;/p&gt;
&lt;pre&gt;llm -m openai/o3 \
  -f https://raw.githubusercontent.com/simonw/llm-hacker-news/refs/heads/main/llm_hacker_news.py \
  -f https://raw.githubusercontent.com/simonw/tools/refs/heads/main/github-issue-to-markdown.html \
  -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Write a new fragments plugin in Python that registers issue:org/repo/123 which fetches that issue&lt;/span&gt;
&lt;span class="pl-s"&gt;      number from the specified github repo and uses the same markdown logic as the HTML page to turn that into a fragment&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here I'm using the ability to pass a URL to &lt;code&gt;-f&lt;/code&gt; and giving it the full source of my &lt;a href="https://github.com/simonw/llm-hacker-news/blob/main/llm_hacker_news.py"&gt;llm_hacker_news.py&lt;/a&gt; plugin (which shows how a fragment can load data from an API) plus the &lt;a href="https://github.com/simonw/tools/blob/main/github-issue-to-markdown.html"&gt;HTML source&lt;/a&gt; of my &lt;a href="https://tools.simonwillison.net/github-issue-to-markdown"&gt;github-issue-to-markdown&lt;/a&gt; tool (which I wrote a few months ago &lt;a href="https://gist.github.com/simonw/cd1afb97e595b40fdeedebb48be7f4f1"&gt;with Claude&lt;/a&gt;). I effectively asked o3 to take that HTML/JavaScript tool and port it to Python to work with my fragments plugin mechanism.&lt;/p&gt;
&lt;p&gt;o3 provided &lt;a href="https://gist.github.com/simonw/249e16edffe6350f7265012bee9e3305#response"&gt;almost the exact implementation I needed&lt;/a&gt;, and even included support for a &lt;code&gt;GITHUB_TOKEN&lt;/code&gt; environment variable without me thinking to ask for it. Total cost: 19.928 cents.&lt;/p&gt;
&lt;p&gt;On a final note of curiosity I tried running this prompt against &lt;a href="https://simonwillison.net/2025/Apr/19/gemma-3-qat-models/"&gt;Gemma 3 27B QAT&lt;/a&gt; running on my Mac via MLX and &lt;a href="https://github.com/simonw/llm-mlx"&gt;llm-mlx&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;llm install llm-mlx
llm mlx download-model mlx-community/gemma-3-27b-it-qat-4bit

llm -m mlx-community/gemma-3-27b-it-qat-4bit \
  -f https://raw.githubusercontent.com/simonw/llm-hacker-news/refs/heads/main/llm_hacker_news.py \
  -f https://raw.githubusercontent.com/simonw/tools/refs/heads/main/github-issue-to-markdown.html \
  -s &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Write a new fragments plugin in Python that registers issue:org/repo/123 which fetches that issue&lt;/span&gt;
&lt;span class="pl-s"&gt;      number from the specified github repo and uses the same markdown logic as the HTML page to turn that into a fragment&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;That worked &lt;a href="https://gist.github.com/simonw/feccff6ce3254556b848c27333f52543#response"&gt;pretty well too&lt;/a&gt;. It turns out a 16GB local model file is powerful enough to write me an LLM plugin now!&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/o3"&gt;o3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/long-context"&gt;long-context&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemma"&gt;gemma&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="plugins"/><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="gemini"/><category term="mlx"/><category term="o3"/><category term="long-context"/><category term="gemma"/></entry><entry><title>llm-fragments-rust</title><link href="https://simonwillison.net/2025/Apr/11/llm-fragments-rust/#atom-tag" rel="alternate"/><published>2025-04-11T17:36:27+00:00</published><updated>2025-04-11T17:36:27+00:00</updated><id>https://simonwillison.net/2025/Apr/11/llm-fragments-rust/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/huitseeker/llm-fragments-rust"&gt;llm-fragments-rust&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Inspired by Filippo Valsorda's &lt;a href="https://simonwillison.net/2025/Apr/10/llm-fragments-go/"&gt;llm-fragments-go&lt;/a&gt;, Francois Garillot created &lt;code&gt;llm-fragments-rust&lt;/code&gt;, an &lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/"&gt;LLM fragments&lt;/a&gt; plugin that lets you pull documentation for any Rust crate directly into a prompt to LLM.&lt;/p&gt;
&lt;p&gt;I really like this example, which uses two fragments to load documentation for two crates at once:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;llm -f rust:rand@0.8.5 -f rust:tokio "How do I generate random numbers asynchronously?"&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;a href="https://github.com/huitseeker/llm-fragments-rust/blob/main/llm_fragments_rust.py"&gt;code&lt;/a&gt; uses some neat tricks: it creates a new Rust project in a temporary directory (similar to how &lt;code&gt;llm-fragments-go&lt;/code&gt; works), adds the crates and uses &lt;code&gt;cargo doc --no-deps --document-private-items&lt;/code&gt; to generate documentation. Then it runs &lt;code&gt;cargo tree --edges features&lt;/code&gt; to add dependency information, and &lt;code&gt;cargo metadata --format-version=1&lt;/code&gt; to include additional metadata about the crate.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/huitseeker/status/1910741742363562325"&gt;@huitseeker&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="ai"/><category term="rust"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/></entry><entry><title>llm-docsmith</title><link href="https://simonwillison.net/2025/Apr/10/llm-docsmith/#atom-tag" rel="alternate"/><published>2025-04-10T18:09:18+00:00</published><updated>2025-04-10T18:09:18+00:00</updated><id>https://simonwillison.net/2025/Apr/10/llm-docsmith/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://mathpn.com/posts/llm-docsmith/"&gt;llm-docsmith&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Matheus Pedroni released this neat plugin for LLM for adding docstrings to existing Python code. You can run it like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-docsmith
llm docsmith ./scripts/main.py -o
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;-o&lt;/code&gt; option previews the changes that will be made - without &lt;code&gt;-o&lt;/code&gt; it edits the files directly.&lt;/p&gt;
&lt;p&gt;It also accepts a &lt;code&gt;-m claude-3.7-sonnet&lt;/code&gt; parameter to use an alternative model in place of the default (GPT-4o mini).&lt;/p&gt;
&lt;p&gt;The implementation uses the Python &lt;a href="https://pypi.org/project/libcst/"&gt;libcst&lt;/a&gt; "Concrete Syntax Tree" package to manipulate the code, which means there's no chance of it making edits to anything other than the docstrings.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/mathpn/llm-docsmith/blob/v0.1/docsmith.py#L10-L30"&gt;the full system prompt&lt;/a&gt; it uses.&lt;/p&gt;
&lt;p&gt;One neat trick comes at the end of the system prompt, where it says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;You will receive a JSON template. Fill the slots marked with &amp;lt;SLOT&amp;gt; with the appropriate description. Return as JSON.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The template itself is JSON generated from these Pydantic classes:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Argument&lt;/span&gt;(&lt;span class="pl-v"&gt;BaseModel&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;name&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;
    &lt;span class="pl-s1"&gt;description&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;
    &lt;span class="pl-s1"&gt;annotation&lt;/span&gt;: &lt;span class="pl-s1"&gt;str&lt;/span&gt; &lt;span class="pl-c1"&gt;|&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;
    &lt;span class="pl-s1"&gt;default&lt;/span&gt;: &lt;span class="pl-s1"&gt;str&lt;/span&gt; &lt;span class="pl-c1"&gt;|&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;

&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Return&lt;/span&gt;(&lt;span class="pl-v"&gt;BaseModel&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;description&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;
    &lt;span class="pl-s1"&gt;annotation&lt;/span&gt;: &lt;span class="pl-s1"&gt;str&lt;/span&gt; &lt;span class="pl-c1"&gt;|&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;

&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Docstring&lt;/span&gt;(&lt;span class="pl-v"&gt;BaseModel&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;node_type&lt;/span&gt;: &lt;span class="pl-v"&gt;Literal&lt;/span&gt;[&lt;span class="pl-s"&gt;"class"&lt;/span&gt;, &lt;span class="pl-s"&gt;"function"&lt;/span&gt;]
    &lt;span class="pl-s1"&gt;name&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;
    &lt;span class="pl-s1"&gt;docstring&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;
    &lt;span class="pl-s1"&gt;args&lt;/span&gt;: &lt;span class="pl-s1"&gt;list&lt;/span&gt;[&lt;span class="pl-smi"&gt;Argument&lt;/span&gt;] &lt;span class="pl-c1"&gt;|&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;
    &lt;span class="pl-s1"&gt;ret&lt;/span&gt;: &lt;span class="pl-v"&gt;Return&lt;/span&gt; &lt;span class="pl-c1"&gt;|&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;

&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Documentation&lt;/span&gt;(&lt;span class="pl-v"&gt;BaseModel&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;entries&lt;/span&gt;: &lt;span class="pl-s1"&gt;list&lt;/span&gt;[&lt;span class="pl-smi"&gt;Docstring&lt;/span&gt;]&lt;/pre&gt;

&lt;p&gt;The code inserts &lt;code&gt;&amp;lt;SLOT&amp;gt;&lt;/code&gt; markers into that structure in various places, so the template included in the prompt ends up looking like this:&lt;/p&gt;
&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"entries"&lt;/span&gt;: [
    {
      &lt;span class="pl-ent"&gt;"node_type"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;function&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;create_docstring_node&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"docstring"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&amp;lt;SLOT&amp;gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"args"&lt;/span&gt;: [
        {
          &lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;docstring_text&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
          &lt;span class="pl-ent"&gt;"description"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&amp;lt;SLOT&amp;gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
          &lt;span class="pl-ent"&gt;"annotation"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;str&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
          &lt;span class="pl-ent"&gt;"default"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;
        },
        {
          &lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;indent&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
          &lt;span class="pl-ent"&gt;"description"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&amp;lt;SLOT&amp;gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
          &lt;span class="pl-ent"&gt;"annotation"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;str&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
          &lt;span class="pl-ent"&gt;"default"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;
        }
      ],
      &lt;span class="pl-ent"&gt;"ret"&lt;/span&gt;: {
        &lt;span class="pl-ent"&gt;"description"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&amp;lt;SLOT&amp;gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
        &lt;span class="pl-ent"&gt;"annotation"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;cst.BaseStatement&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
      }
    }
  ]
}&lt;/pre&gt;
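&lt;p&gt;The slot-filling mechanics are easy to illustrate. This is a toy sketch, not the plugin's actual code (in practice the model returns the whole filled template as JSON) - it just shows how a template of &lt;code&gt;&amp;lt;SLOT&amp;gt;&lt;/code&gt; markers maps to filled values:&lt;/p&gt;

```python
import json

SLOT = "<SLOT>"


def fill_slots(node, values):
    """Recursively replace "<SLOT>" strings with values drawn from an iterator."""
    if isinstance(node, dict):
        return {key: fill_slots(value, values) for key, value in node.items()}
    if isinstance(node, list):
        return [fill_slots(item, values) for item in node]
    if node == SLOT:
        return next(values)
    return node


template = {
    "entries": [
        {
            "node_type": "function",
            "name": "create_docstring_node",
            "docstring": SLOT,
            "args": [{"name": "docstring_text", "description": SLOT}],
        }
    ]
}

# Fill the two slots in document order (depth-first, dict order).
filled = fill_slots(
    template,
    iter(["Create a docstring node.", "The docstring text."]),
)
print(json.dumps(filled, indent=2))
```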

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/pnmath/status/1909386592944292018"&gt;@pnmath&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pydantic"&gt;pydantic&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="python"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llm"/><category term="pydantic"/></entry><entry><title>llm-fragments-go</title><link href="https://simonwillison.net/2025/Apr/10/llm-fragments-go/#atom-tag" rel="alternate"/><published>2025-04-10T15:19:33+00:00</published><updated>2025-04-10T15:19:33+00:00</updated><id>https://simonwillison.net/2025/Apr/10/llm-fragments-go/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/FiloSottile/mostly-harmless/tree/main/llm-fragments-go"&gt;llm-fragments-go&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Filippo Valsorda released the first plugin by someone other than me that uses LLM's new &lt;a href="https://llm.datasette.io/en/stable/plugins/plugin-hooks.html#register-fragment-loaders-register"&gt;register_fragment_loaders()&lt;/a&gt; plugin hook I announced &lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/"&gt;the other day&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Install with &lt;code&gt;llm install llm-fragments-go&lt;/code&gt; and then:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You can feed the docs of a Go package into LLM using the &lt;code&gt;go:&lt;/code&gt; &lt;a href="https://llm.datasette.io/en/stable/fragments.html"&gt;fragment&lt;/a&gt; with the package name, optionally followed by a version suffix.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;llm -f go:golang.org/x/mod/sumdb/note@v0.23.0 "Write a single file command that generates a key, prints the verifier key, signs an example message, and prints the signed note."&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The implementation is &lt;a href="https://github.com/FiloSottile/mostly-harmless/blob/44fb3e6e0b56decd72e893409e8085d88ad43e3d/llm-fragments-go/llm_fragments_go.py"&gt;just 33 lines of Python&lt;/a&gt; and works by running these commands in a temporary directory:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;go mod init llm_fragments_go
go get golang.org/x/mod/sumdb/note@v0.23.0
go doc -all golang.org/x/mod/sumdb/note
&lt;/code&gt;&lt;/pre&gt;
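&lt;p&gt;The real implementation is the 33 lines of Python linked above. As a rough sketch of the same flow (the helper names here are my own invention, not the plugin's), you could build and run that command sequence like this:&lt;/p&gt;

```python
import subprocess
import tempfile


def go_doc_commands(package, version=None):
    """The command sequence llm-fragments-go runs in a fresh temp directory."""
    spec = f"{package}@{version}" if version else package
    return [
        ["go", "mod", "init", "llm_fragments_go"],
        ["go", "get", spec],
        ["go", "doc", "-all", package],
    ]


def go_package_docs(package, version=None):
    """Hypothetical runner: executes the commands, returns the `go doc` output."""
    with tempfile.TemporaryDirectory() as tmp:
        output = ""
        for cmd in go_doc_commands(package, version):
            result = subprocess.run(cmd, cwd=tmp, capture_output=True, text=True)
            output = result.stdout  # keep the last command's output (go doc -all)
        return output
```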

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://bsky.app/profile/filippo.abyssdomain.expert/post/3lmhhqccp2c2i"&gt;@filippo.abyssdomain.expert&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/go"&gt;go&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/filippo-valsorda"&gt;filippo-valsorda&lt;/a&gt;&lt;/p&gt;



</summary><category term="go"/><category term="plugins"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="filippo-valsorda"/></entry><entry><title>llm-hacker-news</title><link href="https://simonwillison.net/2025/Apr/8/llm-hacker-news/#atom-tag" rel="alternate"/><published>2025-04-08T00:11:30+00:00</published><updated>2025-04-08T00:11:30+00:00</updated><id>https://simonwillison.net/2025/Apr/8/llm-hacker-news/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-hacker-news"&gt;llm-hacker-news&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I built this new plugin to exercise the new &lt;a href="https://llm.datasette.io/en/stable/plugins/plugin-hooks.html#register-fragment-loaders-register"&gt;register_fragment_loaders()&lt;/a&gt; plugin hook I added to &lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/"&gt;LLM 0.24&lt;/a&gt;. It's the plugin equivalent of &lt;a href="https://til.simonwillison.net/llms/claude-hacker-news-themes"&gt;the Bash script&lt;/a&gt; I've been using to summarize &lt;a href="https://news.ycombinator.com/"&gt;Hacker News&lt;/a&gt; conversations for the past 18 months.&lt;/p&gt;
&lt;p&gt;You can use it like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-hacker-news
llm -f hn:43615912 'summary with illustrative direct quotes'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can see the output &lt;a href="https://github.com/simonw/llm-hacker-news/issues/1#issuecomment-2784887743"&gt;in this issue&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The plugin registers a &lt;code&gt;hn:&lt;/code&gt; prefix - combine that with the ID of a Hacker News conversation to pull that conversation into the context.&lt;/p&gt;
&lt;p&gt;It uses the Algolia Hacker News API which returns &lt;a href="https://hn.algolia.com/api/v1/items/43615912"&gt;JSON like this&lt;/a&gt;. Rather than feed the JSON directly to the LLM it instead converts it to a hopefully more LLM-friendly format that looks like this example from &lt;a href="https://github.com/simonw/llm-hacker-news/blob/0.1/tests/test_hacker_news.py#L5-L18"&gt;the plugin's test&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[1] BeakMaster: Fish Spotting Techniques

[1.1] CoastalFlyer: The dive technique works best when hunting in shallow waters.

[1.1.1] PouchBill: Agreed. Have you tried the hover method near the pier?

[1.1.2] WingSpan22: My bill gets too wet with that approach.

[1.1.2.1] CoastalFlyer: Try tilting at a 40° angle like our Australian cousins.

[1.2] BrownFeathers: Anyone spotted those "silver fish" near the rocks?

[1.2.1] GulfGlider: Yes! They're best caught at dawn.
Just remember: swoop &amp;gt; grab &amp;gt; lift
&lt;/code&gt;&lt;/pre&gt;
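&lt;p&gt;Producing that numbering is a nice little recursive exercise. Here's a minimal sketch (assuming the Algolia item shape of &lt;code&gt;author&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt;/&lt;code&gt;text&lt;/code&gt; and nested &lt;code&gt;children&lt;/code&gt; - the plugin's real code differs):&lt;/p&gt;

```python
def flatten_thread(item, path=(1,)):
    """Render a nested comment tree as numbered [1.1.2]-style lines."""
    label = ".".join(str(n) for n in path)
    body = item.get("title") or item.get("text") or ""
    lines = [f"[{label}] {item['author']}: {body}"]
    # Children get their parent's path plus a 1-based position.
    for i, child in enumerate(item.get("children", []), start=1):
        lines.extend(flatten_thread(child, path + (i,)))
    return lines


thread = {
    "author": "BeakMaster",
    "title": "Fish Spotting Techniques",
    "children": [
        {
            "author": "CoastalFlyer",
            "text": "The dive technique works best in shallow waters.",
            "children": [
                {"author": "PouchBill", "text": "Agreed.", "children": []},
            ],
        },
    ],
}
print("\n\n".join(flatten_thread(thread)))
```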
&lt;p&gt;That format was suggested by Claude, which then wrote most of the plugin implementation for me. Here's &lt;a href="https://claude.ai/share/6da6ec5a-b8b3-4572-ab1b-141bb37ef70b"&gt;that Claude transcript&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/hacker-news"&gt;hacker-news&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;&lt;/p&gt;



</summary><category term="hacker-news"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="anthropic"/><category term="claude"/></entry><entry><title>Long context support in LLM 0.24 using fragments and template plugins</title><link href="https://simonwillison.net/2025/Apr/7/long-context-llm/#atom-tag" rel="alternate"/><published>2025-04-07T17:45:11+00:00</published><updated>2025-04-07T17:45:11+00:00</updated><id>https://simonwillison.net/2025/Apr/7/long-context-llm/#atom-tag</id><summary type="html">
    &lt;p&gt;LLM 0.24 is &lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-24"&gt;now available&lt;/a&gt; with new features to help take advantage of the increasingly long input context supported by modern LLMs.&lt;/p&gt;
&lt;p&gt;(&lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; is my command-line tool and &lt;a href="https://llm.datasette.io/en/stable/python-api.html"&gt;Python library&lt;/a&gt; for interacting with LLMs, supported by &lt;a href="https://llm.datasette.io/en/stable/plugins/directory.html"&gt;20+ plugins&lt;/a&gt; adding support for both local and remote models from a bunch of different providers.)&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/#trying-it-out"&gt;Trying it out&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/#improving-llm-s-support-for-long-context-models"&gt;Improving LLM's support for long context models&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/#asking-questions-of-llm-s-documentation"&gt;Asking questions of LLM's documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/#publishing-sharing-and-reusing-templates"&gt;Publishing, sharing and reusing templates&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/#template-and-fragment-loader-plugins"&gt;Template and fragment loader plugins so far&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Apr/7/long-context-llm/#everything-else-in-llm-0-24"&gt;Everything else in LLM 0.24&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h4 id="trying-it-out"&gt;Trying it out&lt;/h4&gt;

&lt;p&gt;To install LLM with &lt;a href="https://github.com/astral-sh/uv"&gt;uv&lt;/a&gt; (there are &lt;a href="https://llm.datasette.io/en/stable/setup.html#installation"&gt;several other options&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv tool install llm&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You'll need to either provide &lt;a href="https://platform.openai.com/api-keys"&gt;an OpenAI API key&lt;/a&gt; or install &lt;a href="https://llm.datasette.io/en/stable/plugins/directory.html"&gt;a plugin&lt;/a&gt; to use local models or models from other providers:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm keys &lt;span class="pl-c1"&gt;set&lt;/span&gt; openai
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Paste OpenAI API key here&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To &lt;a href="https://llm.datasette.io/en/stable/setup.html#upgrading-to-the-latest-version"&gt;upgrade LLM&lt;/a&gt; from a previous version:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install -U llm&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The biggest new feature is &lt;strong&gt;&lt;a href="https://llm.datasette.io/en/stable/fragments.html"&gt;fragments&lt;/a&gt;&lt;/strong&gt;. You can now use &lt;code&gt;-f filename&lt;/code&gt; or &lt;code&gt;-f url&lt;/code&gt; to add one or more fragments to your prompt, which means you can do things like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f https://simonwillison.net/2025/Apr/5/llama-4-notes/ &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;bullet point summary&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/7ffb19f647603558c575359e678185c0"&gt;the output&lt;/a&gt; from that prompt, exported using &lt;code&gt;llm logs -c --expand --usage&lt;/code&gt;. Token cost was 5,372 input, 374 output which works out as 0.103 cents (around 1/10th of a cent) using the default GPT-4o mini model.&lt;/p&gt;
&lt;p&gt;Plugins can implement custom fragment loaders with a prefix. The &lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt; plugin adds a &lt;code&gt;github:&lt;/code&gt; prefix that can be used to load every text file in a GitHub repository as a list of fragments:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-fragments-github
llm -f github:simonw/s3-credentials &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Suggest new features for this tool&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/c9bbbc5f6560b01f4b7882ac0194fb25"&gt;the output&lt;/a&gt;. That took 49,856 input tokens for a total cost of 0.7843 cents - nearly a whole cent!&lt;/p&gt;
&lt;h4 id="improving-llm-s-support-for-long-context-models"&gt;Improving LLM's support for long context models&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://simonwillison.net/tags/long-context/"&gt;Long context&lt;/a&gt;&lt;/strong&gt; is one of the most exciting trends in LLMs over the past eighteen months. Saturday's &lt;a href="https://simonwillison.net/2025/Apr/5/llama-4-notes/"&gt;Llama 4 Scout release&lt;/a&gt; gave us the first model with a full 10 million token context. Google's &lt;a href="https://simonwillison.net/tags/gemini/"&gt;Gemini&lt;/a&gt; family has several 1-2 million token models, and the baseline for recent models from both OpenAI and Anthropic is 100 or 200 thousand.&lt;/p&gt;
&lt;p&gt;Two years ago most models capped out at 8,000 tokens of input. Long context opens up many new interesting ways to apply this class of technology.&lt;/p&gt;
&lt;p&gt;I've been using long context models via &lt;a href="https://simonwillison.net/tags/files-to-prompt/"&gt;my files-to-prompt tool&lt;/a&gt; to summarize large codebases, explain how they work and even to &lt;a href="https://simonwillison.net/2024/Sep/25/o1-preview-llm/"&gt;debug gnarly bugs&lt;/a&gt;. As demonstrated above, it's surprisingly inexpensive to drop tens of thousands of tokens into models like GPT-4o mini or most of the Google Gemini series, and the results are often very impressive.&lt;/p&gt;
&lt;p&gt;One of LLM's most useful features is that it &lt;a href="https://llm.datasette.io/en/stable/logging.html"&gt;logs every prompt and response&lt;/a&gt; to a SQLite database. This is great for comparing the same prompt against different models and tracking experiments over time - my own database contained thousands of responses from hundreds of different models accumulated over the past couple of years.&lt;/p&gt;
&lt;p&gt;This is where long context prompts were starting to be a problem. Since LLM stores the full prompt and response in the database, asking five questions of the same source code could result in five duplicate copies of that text in the database!&lt;/p&gt;
&lt;p&gt;The new &lt;strong&gt;fragments&lt;/strong&gt; feature targets this problem head on. Each fragment is stored once in a &lt;a href="https://llm.datasette.io/en/stable/logging.html#sql-schema"&gt;fragments&lt;/a&gt; table, then de-duplicated in the future using a SHA256 hash of its content.&lt;/p&gt;
&lt;p&gt;This saves on storage, and also enables features like &lt;code&gt;llm logs -f X&lt;/code&gt; for seeing all logged responses &lt;a href="https://llm.datasette.io/en/stable/logging.html#filtering-by-prompts-that-used-a-specific-fragment"&gt;that use a particular fragment&lt;/a&gt;.&lt;/p&gt;
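&lt;p&gt;The de-duplication trick is easy to illustrate. This is a toy sketch, not LLM's actual schema: key the fragments table on the SHA256 of the content and let &lt;code&gt;INSERT OR IGNORE&lt;/code&gt; collapse duplicates:&lt;/p&gt;

```python
import hashlib
import sqlite3


def fragment_id(content: str) -> str:
    # Identical content always hashes to the same ID.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE fragments (hash TEXT PRIMARY KEY, content TEXT)")


def store_fragment(content: str) -> str:
    h = fragment_id(content)
    # INSERT OR IGNORE de-duplicates: the same content is stored once.
    db.execute("INSERT OR IGNORE INTO fragments VALUES (?, ?)", (h, content))
    return h


a = store_fragment("SELECT * FROM logs")
b = store_fragment("SELECT * FROM logs")  # same content, no new row
count = db.execute("SELECT COUNT(*) FROM fragments").fetchone()[0]
```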
&lt;p&gt;Fragments can be specified in several different ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a path to a file&lt;/li&gt;
&lt;li&gt;a URL to data online&lt;/li&gt;
&lt;li&gt;an alias that's been set against a previous fragment (see &lt;a href="https://llm.datasette.io/en/latest/help.html#llm-fragments-set-help"&gt;llm fragments set&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;a hash ID of the content of a fragment&lt;/li&gt;
&lt;li&gt;using &lt;code&gt;prefix:argument&lt;/code&gt; to specify fragments from a plugin&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="asking-questions-of-llm-s-documentation"&gt;Asking questions of LLM's documentation&lt;/h4&gt;
&lt;p&gt;Wouldn't it be neat if LLM could answer questions about its own documentation?&lt;/p&gt;
&lt;p&gt;The new &lt;a href="https://github.com/simonw/llm-docs"&gt;llm-docs&lt;/a&gt; plugin (built with the new &lt;a href="https://llm.datasette.io/en/stable/plugins/plugin-hooks.html#register-fragment-loaders-register"&gt;register_fragment_loaders() plugin hook&lt;/a&gt;) enables exactly that:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-docs
llm -f docs: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;How do I embed a binary file?&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;a href="https://gist.github.com/simonw/df115ffa63e101833a100f71555bf53d"&gt;output&lt;/a&gt; starts like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;To embed a binary file using the LLM command-line interface, you can use the &lt;code&gt;llm embed&lt;/code&gt; command with the &lt;code&gt;--binary&lt;/code&gt; option. Here’s how you can do it:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Make sure you have the appropriate embedding model installed that supports binary input.&lt;/li&gt;
&lt;li&gt;Use the following command syntax:&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm embed -m &lt;span class="pl-k"&gt;&amp;lt;&lt;/span&gt;model_id&lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt; --binary -i &lt;span class="pl-k"&gt;&amp;lt;&lt;/span&gt;path_to_your_binary_file&lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Replace &lt;code&gt;&amp;lt;model_id&amp;gt;&lt;/code&gt; with the identifier for the embedding model you want to use (e.g., &lt;code&gt;clip&lt;/code&gt; for the CLIP model) and &lt;code&gt;&amp;lt;path_to_your_binary_file&amp;gt;&lt;/code&gt; with the path to your actual binary file.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(74,570 input, 240 output = 1.1329 cents with GPT-4o mini)&lt;/p&gt;
&lt;p&gt;Using &lt;code&gt;-f docs:&lt;/code&gt; with just the prefix is the same as using &lt;code&gt;-f docs:llm&lt;/code&gt;. The plugin fetches the documentation for your current version of LLM from my new &lt;a href="https://github.com/simonw/docs-for-llms"&gt;simonw/docs-for-llms&lt;/a&gt; repo, which also provides packaged documentation files for my &lt;code&gt;datasette&lt;/code&gt;, &lt;code&gt;s3-credentials&lt;/code&gt;, &lt;code&gt;shot-scraper&lt;/code&gt; and &lt;code&gt;sqlite-utils&lt;/code&gt; projects.&lt;/p&gt;
&lt;p&gt;Datasette's documentation has got pretty long, so you might need to run that through a Gemini model instead (using the &lt;a href="https://github.com/simonw/llm-gemini"&gt;llm-gemini plugin&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -f docs:datasette -m gemini-2.0-flash \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Build a render_cell plugin that detects and renders markdown&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/simonw/f08b3832c379ca89d8cff3c61ec5662e"&gt;Here's the output&lt;/a&gt;. 132,042 input, 1,129 output with Gemini 2.0 Flash = 1.3656 cents.&lt;/p&gt;
&lt;p&gt;You can browse the combined documentation files this uses in &lt;a href="https://github.com/simonw/docs-for-llms"&gt;docs-for-llms&lt;/a&gt;. They're built using GitHub Actions.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://llmstxt.org/"&gt;llms-txt&lt;/a&gt; is a project lead by Jeremy Howard that encourages projects to publish similar files to help LLMs ingest a succinct copy of their documentation.&lt;/p&gt;
&lt;h4 id="publishing-sharing-and-reusing-templates"&gt;Publishing, sharing and reusing templates&lt;/h4&gt;
&lt;p&gt;The new &lt;a href="https://llm.datasette.io/en/stable/plugins/plugin-hooks.html#register-template-loaders-register"&gt;register_template_loaders() plugin hook&lt;/a&gt; allows plugins to register &lt;code&gt;prefix:value&lt;/code&gt; custom template loaders, for use with the &lt;code&gt;llm -t&lt;/code&gt; option.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/llm-templates-github"&gt;llm-templates-github&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-templates-fabric"&gt;llm-templates-fabric&lt;/a&gt; are two new plugins that make use of that hook.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;llm-templates-github&lt;/code&gt; lets you share and use templates via a public GitHub repository. Here's how to run my &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/"&gt;Pelican riding a bicycle&lt;/a&gt; benchmark against a specific model:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-templates-github
llm -t gh:simonw/pelican-svg -m o3-mini&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This executes &lt;a href="https://github.com/simonw/llm-templates/blob/main/pelican-svg.yaml"&gt;this pelican-svg.yaml&lt;/a&gt; template stored in my &lt;a href="https://github.com/simonw/llm-templates"&gt;simonw/llm-templates&lt;/a&gt; repository, using a new repository naming convention.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;llm -t gh:simonw/pelican-svg&lt;/code&gt; will load that &lt;code&gt;pelican-svg.yaml&lt;/code&gt; file from the &lt;code&gt;simonw/llm-templates&lt;/code&gt; repo. You can also use &lt;code&gt;llm -t gh:simonw/name-of-repo/name-of-template&lt;/code&gt; to load a template from a repository that doesn't follow that convention.&lt;/p&gt;
&lt;p&gt;To share your own templates, create a repository on GitHub under your user account called &lt;code&gt;llm-templates&lt;/code&gt; and start saving &lt;code&gt;.yaml&lt;/code&gt; files to it.&lt;/p&gt;
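&lt;p&gt;One plausible way to picture the naming convention is as a resolver from a &lt;code&gt;gh:&lt;/code&gt; spec to a file in that repository. This is my own hypothetical sketch - the plugin's real lookup may work differently:&lt;/p&gt;

```python
def template_url(spec):
    """Map gh:user/template or gh:user/repo/template to a raw GitHub URL.

    Illustrative guess at the convention, not the plugin's actual code.
    """
    parts = spec.removeprefix("gh:").split("/")
    if len(parts) == 2:
        user, name = parts
        repo = "llm-templates"  # default repository name convention
    else:
        user, repo, name = parts
    return f"https://raw.githubusercontent.com/{user}/{repo}/main/{name}.yaml"
```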
&lt;p&gt;&lt;a href="https://github.com/simonw/llm-templates-fabric"&gt;llm-templates-fabric&lt;/a&gt; provides a similar mechanism for loading templates from  Daniel Miessler's extensive &lt;a href="https://github.com/danielmiessler/fabric"&gt;fabric collection&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-templates-fabric
curl https://simonwillison.net/2025/Apr/6/only-miffy/ &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  llm -t f:extract_main_idea&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A conversation with Daniel was the inspiration for this new plugin hook.&lt;/p&gt;

&lt;h4 id="template-and-fragment-loader-plugins"&gt;Template and fragment loader plugins so far&lt;/h4&gt;
&lt;p&gt;Here's the list of fragment and template loader plugins so far from the &lt;a href="https://llm.datasette.io/en/stable/plugins/directory.html#fragments-and-template-loaders"&gt;LLM plugins directory&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-templates-github"&gt;llm-templates-github&lt;/a&gt;&lt;/strong&gt; supports loading templates shared on GitHub, e.g. &lt;code&gt;llm -t gh:simonw/pelican-svg&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-templates-fabric"&gt;llm-templates-fabric&lt;/a&gt;&lt;/strong&gt; provides access to the &lt;a href="https://github.com/danielmiessler/fabric"&gt;Fabric&lt;/a&gt; collection of prompts: &lt;code&gt;cat setup.py | llm -t fabric:explain_code&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-fragments-github"&gt;llm-fragments-github&lt;/a&gt;&lt;/strong&gt; can load entire GitHub repositories in a single operation: &lt;code&gt;llm -f github:simonw/files-to-prompt 'explain this code'&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-hacker-news"&gt;llm-hacker-news&lt;/a&gt;&lt;/strong&gt; imports conversations from Hacker News as fragments: &lt;code&gt;llm -f hn:43615912 'summary with illustrative direct quotes'&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h4 id="everything-else-in-llm-0-24"&gt;Everything else in LLM 0.24&lt;/h4&gt;
&lt;p&gt;LLM 0.24 is a big release, spanning &lt;a href="https://github.com/simonw/llm/compare/0.23...0.24"&gt;51 commits&lt;/a&gt;. The &lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-24"&gt;release notes&lt;/a&gt; cover everything that's new in full - here are a few of my highlights:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;The new &lt;a href="https://github.com/simonw/llm-openai-plugin"&gt;llm-openai plugin&lt;/a&gt; provides support for &lt;strong&gt;o1-pro&lt;/strong&gt; (which is not supported by the OpenAI mechanism used by LLM core). Future OpenAI features will migrate to this plugin instead of LLM core itself.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;The problem with OpenAI models being handled by LLM core is that I have to release a whole new version of LLM every time OpenAI releases a new model or feature. Migrating this stuff out to a plugin means I can release new versions of that plugin independently of LLM itself - something I frequently do for &lt;a href="https://github.com/simonw/llm-anthropic"&gt;llm-anthropic&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm-gemini"&gt;llm-gemini&lt;/a&gt; and others.&lt;/p&gt;
&lt;p&gt;The new &lt;code&gt;llm-openai&lt;/code&gt; plugin uses their Responses API, a new shape of API which &lt;a href="https://simonwillison.net/2025/Mar/11/responses-vs-chat-completions/"&gt;I covered last month&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm -t $URL&lt;/code&gt; option can now take a URL to a YAML template. &lt;a href="https://github.com/simonw/llm/issues/856"&gt;#856&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;The new custom template loaders are fun, but being able to paste in a URL to a YAML file somewhere provides a simpler way to share templates.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Templates can now store default model options. &lt;a href="https://github.com/simonw/llm/issues/845"&gt;#845&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Attachments can now be stored in templates. &lt;a href="https://github.com/simonw/llm/issues/826"&gt;#826&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;The quickest way to create your own template is with the &lt;code&gt;llm prompt ... --save name-of-template&lt;/code&gt; command. This now works with attachments, fragments and default model options, each of which is persisted in &lt;a href="https://llm.datasette.io/en/stable/templates.html#templates-as-yaml-files"&gt;the template YAML file&lt;/a&gt;.&lt;/p&gt;
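&lt;p&gt;Put together, a saved template might look something like this - the key names here follow the templates documentation, but the model, prompt and attachment values are invented for illustration:&lt;/p&gt;

```yaml
# Hypothetical saved template - key names per the LLM templates docs,
# values invented for illustration
model: gpt-4o-mini
system: You are a concise code reviewer
prompt: 'Review this code: $input'
options:
  temperature: 0.5
attachments:
- https://example.com/screenshot.png
```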
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New &lt;a href="https://llm.datasette.io/en/stable/usage.html#setting-default-options-for-models"&gt;llm models options&lt;/a&gt; family of commands for setting default options for particular models. &lt;a href="https://github.com/simonw/llm/issues/829"&gt;#829&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I built this when I &lt;a href="https://twitter.com/bindureddy/status/1900345517256958140"&gt;learned that&lt;/a&gt; Qwen's QwQ-32b model works best with temperature 0.7 and top p 0.95.&lt;/p&gt;
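&lt;p&gt;Based on the linked documentation, setting those defaults once looks something like this - the model ID here is a guess at how a local QwQ install might be registered, so substitute your own:&lt;/p&gt;

```shell
# Persist default options for a model (model ID is illustrative)
llm models options set qwq:32b temperature 0.7
llm models options set qwq:32b top_p 0.95
```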
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm prompt -d path-to-sqlite.db&lt;/code&gt; option can now be used to write logs to a custom SQLite database. &lt;a href="https://github.com/simonw/llm/issues/858"&gt;#858&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This proved extremely useful for testing fragments - it meant I could run a prompt and save the full response to a separate SQLite database which I could then upload to S3 and share &lt;a href="https://github.com/simonw/llm/issues/863#issuecomment-2781723372"&gt;as a link to Datasette Lite&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm similar -p/--plain&lt;/code&gt; option providing more human-readable output than the default JSON. &lt;a href="https://github.com/simonw/llm/issues/853"&gt;#853&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'd like this to be the default output, but I'm holding off on changing that until LLM 1.0 since it's a breaking change for people building automations against the JSON from &lt;code&gt;llm similar&lt;/code&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Set the &lt;code&gt;LLM_RAISE_ERRORS=1&lt;/code&gt; environment variable to raise errors during prompts rather than suppressing them, which means you can run &lt;code&gt;python -i -m llm 'prompt'&lt;/code&gt; and then drop into a debugger on errors with &lt;code&gt;import pdb; pdb.pm()&lt;/code&gt;. &lt;a href="https://github.com/simonw/llm/issues/817"&gt;#817&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Really useful for debugging new model plugins.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm prompt -q gpt -q 4o&lt;/code&gt; option - pass &lt;code&gt;-q searchterm&lt;/code&gt; one or more times to execute a prompt against the first model that matches all of those strings - useful if you can't remember the full model ID. &lt;a href="https://github.com/simonw/llm/issues/841"&gt;#841&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Pretty obscure, but I found myself needing this. Vendors love releasing models with names like &lt;code&gt;gemini-2.5-pro-exp-03-25&lt;/code&gt;, so now I can run &lt;code&gt;llm -q gem -q 2.5 -q exp 'say hi'&lt;/code&gt; instead of looking up the full model ID.&lt;/p&gt;
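&lt;p&gt;The matching logic is roughly "first model whose ID contains every search term" - here's a minimal sketch of that idea (illustrative only, not LLM's actual implementation):&lt;/p&gt;

```python
def first_matching_model(model_ids, terms):
    """Return the first model ID containing every search term (case-insensitive)."""
    terms = [t.lower() for t in terms]
    for model_id in model_ids:
        if all(t in model_id.lower() for t in terms):
            return model_id
    return None

models = ["gpt-4o", "gpt-4o-mini", "gemini-2.5-pro-exp-03-25"]
print(first_matching_model(models, ["gem", "2.5", "exp"]))
# prints gemini-2.5-pro-exp-03-25
```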
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://llm.datasette.io/en/stable/other-models.html#openai-compatible-models"&gt;OpenAI compatible models&lt;/a&gt; configured using &lt;code&gt;extra-openai-models.yaml&lt;/code&gt; now support &lt;code&gt;supports_schema: true&lt;/code&gt;, &lt;code&gt;vision: true&lt;/code&gt; and &lt;code&gt;audio: true&lt;/code&gt; options. Thanks &lt;a href="https://github.com/adaitche"&gt;@adaitche&lt;/a&gt; and &lt;a href="https://github.com/giuli007"&gt;@giuli007&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm/pull/819"&gt;#819&lt;/a&gt;, &lt;a href="https://github.com/simonw/llm/pull/843"&gt;#843&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I don't use this feature myself, but it's clearly popular: this isn't the first time I've had PRs with improvements from the wider community.&lt;/p&gt;
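&lt;p&gt;A configured entry might look something like this - &lt;code&gt;model_id&lt;/code&gt;, &lt;code&gt;model_name&lt;/code&gt; and &lt;code&gt;api_base&lt;/code&gt; come from the linked documentation, while the specific values here are hypothetical:&lt;/p&gt;

```yaml
# Hypothetical entry in extra-openai-models.yaml
- model_id: my-local-model
  model_name: qwen2.5-7b-instruct
  api_base: http://localhost:8000/v1
  supports_schema: true
  vision: true
  audio: true
```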
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/long-context"&gt;long-context&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/files-to-prompt"&gt;files-to-prompt&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="ai"/><category term="annotated-release-notes"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="gemini"/><category term="long-context"/><category term="files-to-prompt"/></entry><entry><title>llm-openrouter 0.4</title><link href="https://simonwillison.net/2025/Mar/10/llm-openrouter-04/#atom-tag" rel="alternate"/><published>2025-03-10T21:40:56+00:00</published><updated>2025-03-10T21:40:56+00:00</updated><id>https://simonwillison.net/2025/Mar/10/llm-openrouter-04/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-openrouter/releases/tag/0.4"&gt;llm-openrouter 0.4&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I found out this morning that &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt; include support for a number of (rate-limited) &lt;a href="https://openrouter.ai/models?max_price=0"&gt;free API models&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I occasionally run workshops on top of LLMs (&lt;a href="https://simonwillison.net/2025/Mar/8/cutting-edge-web-scraping/"&gt;like this one&lt;/a&gt;) and being able to provide students with a quick way to obtain an API key against models where they don't have to set up billing is really valuable to me!&lt;/p&gt;
&lt;p&gt;This inspired me to upgrade my existing &lt;a href="https://github.com/simonw/llm-openrouter"&gt;llm-openrouter&lt;/a&gt; plugin, and in doing so I closed out a bunch of open feature requests.&lt;/p&gt;
&lt;p&gt;Consider this post the &lt;a href="https://simonwillison.net/tags/annotated-release-notes/"&gt;annotated release notes&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;LLM &lt;a href="https://llm.datasette.io/en/stable/schemas.html"&gt;schema support&lt;/a&gt; for OpenRouter models that &lt;a href="https://openrouter.ai/models?order=newest&amp;amp;supported_parameters=structured_outputs"&gt;support structured output&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm-openrouter/issues/23"&gt;#23&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm trying to get support for LLM's &lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/"&gt;new schema feature&lt;/a&gt; into as many plugins as possible.&lt;/p&gt;
&lt;p&gt;OpenRouter's OpenAI-compatible API includes support for the &lt;code&gt;response_format&lt;/code&gt; &lt;a href="https://openrouter.ai/docs/features/structured-outputs"&gt;structured content option&lt;/a&gt;, but with an important caveat: it only works for some models, and if you try to use it on others it is silently ignored.&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://github.com/OpenRouterTeam/openrouter-examples/issues/20"&gt;filed an issue&lt;/a&gt; with OpenRouter requesting they include schema support in their machine-readable model index. For the moment LLM will let you specify schemas for unsupported models and will ignore them entirely, which isn't ideal.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;llm openrouter key&lt;/code&gt; command displays information about your current API key. &lt;a href="https://github.com/simonw/llm-openrouter/issues/24"&gt;#24&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Useful for debugging and checking the details of your key's rate limit.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;llm -m ... -o online 1&lt;/code&gt; enables &lt;a href="https://openrouter.ai/docs/features/web-search"&gt;web search grounding&lt;/a&gt; against any model, powered by &lt;a href="https://exa.ai/"&gt;Exa&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm-openrouter/issues/25"&gt;#25&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;OpenRouter apparently make this feature available for every one of their supported models! It's powered by &lt;a href="https://exa.ai/"&gt;Exa&lt;/a&gt;, a new-to-me AI-focused search engine startup who appear to have built their own index with their own crawlers (according to &lt;a href="https://docs.exa.ai/reference/faqs#how-often-is-the-index-updated"&gt;their FAQ&lt;/a&gt;). OpenRouter currently price the feature at $4 per 1,000 results, and since 5 results are returned for every prompt that works out to 2 cents per prompt.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;llm openrouter models&lt;/code&gt; command for listing details of the OpenRouter models, including a &lt;code&gt;--json&lt;/code&gt; option to get JSON and a &lt;code&gt;--free&lt;/code&gt; option to filter for just the free models. &lt;a href="https://github.com/simonw/llm-openrouter/issues/26"&gt;#26&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This offers a neat way to list the available models. There are examples of the output &lt;a href="https://github.com/simonw/llm-openrouter/issues/26#issuecomment-2711908704"&gt;in the comments on the issue&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New option to specify custom provider routing: &lt;code&gt;-o provider '{JSON here}'&lt;/code&gt;. &lt;a href="https://github.com/simonw/llm-openrouter/issues/17"&gt;#17&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Part of OpenRouter's USP is that it can route prompts to different providers depending on factors like latency, cost or as a fallback if your first choice is unavailable - great if you are using open weight models like Llama which are hosted by competing companies.&lt;/p&gt;
&lt;p&gt;The options they provide for routing are &lt;a href="https://openrouter.ai/docs/features/provider-routing"&gt;very thorough&lt;/a&gt; - I had initially hoped to provide a set of CLI options that covered all of these bases, but I decided instead to reuse their JSON format and forward those options directly on to the model.&lt;/p&gt;
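&lt;p&gt;For example, pinning a preferred provider with fallbacks disabled might look something like this - the &lt;code&gt;order&lt;/code&gt; and &lt;code&gt;allow_fallbacks&lt;/code&gt; fields come from OpenRouter's provider routing documentation, while the model ID here is illustrative:&lt;/p&gt;

```shell
# Hypothetical model ID; routing JSON fields per OpenRouter's docs
llm -m openrouter/meta-llama/llama-3.3-70b-instruct \
  -o provider '{"order": ["Together"], "allow_fallbacks": false}'
```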


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openrouter"&gt;openrouter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-search"&gt;ai-assisted-search&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="annotated-release-notes"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="openrouter"/><category term="ai-assisted-search"/></entry><entry><title>llm-ollama 0.9.0</title><link href="https://simonwillison.net/2025/Mar/4/llm-ollama-090/#atom-tag" rel="alternate"/><published>2025-03-04T07:17:52+00:00</published><updated>2025-03-04T07:17:52+00:00</updated><id>https://simonwillison.net/2025/Mar/4/llm-ollama-090/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/taketwo/llm-ollama/releases/tag/0.9.0"&gt;llm-ollama 0.9.0&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This release of the &lt;code&gt;llm-ollama&lt;/code&gt; plugin adds support for &lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/"&gt;schemas&lt;/a&gt;, thanks to a &lt;a href="https://github.com/taketwo/llm-ollama/pull/36"&gt;PR by Adam Compton&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Ollama provides very robust support for this pattern thanks to their &lt;a href="https://ollama.com/blog/structured-outputs"&gt;structured outputs&lt;/a&gt; feature, which works across all of the models that they support by intercepting the logic that outputs the next token and restricting it to only tokens that would be valid in the context of the provided schema.&lt;/p&gt;
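&lt;p&gt;The idea can be sketched with a toy example: at each decoding step, drop any candidate token that would make the output invalid under the constraint, then take the best remaining one. This sketch uses a trivial "digits only" constraint rather than a real JSON schema, and a fake "model" that just returns a fixed preference order:&lt;/p&gt;

```python
def constrained_generate(propose_tokens, is_valid_prefix, max_steps=10):
    """Greedy decoding that only accepts tokens preserving constraint validity."""
    output = ""
    for _ in range(max_steps):
        # Candidates arrive in model-preference order; skip any token
        # that would break the constraint - the "token masking" step.
        for token in propose_tokens(output):
            if is_valid_prefix(output + token):
                output += token
                break
        else:
            break  # no valid continuation exists
    return output

# Toy "model": always prefers "a", then "1", then "2"
propose = lambda prefix: ["a", "1", "2"]
digits_only = lambda s: s == "" or s.isdigit()
print(constrained_generate(propose, digits_only, max_steps=3))
# prints 111 - the preferred "a" is masked out at every step
```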
&lt;p&gt;With Ollama and &lt;code&gt;llm-ollama&lt;/code&gt; installed you can even run structured schemas against vision prompts for local models. Here's one against Ollama's &lt;a href="https://ollama.com/library/llama3.2-vision"&gt;llama3.2-vision&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm -m llama3.2-vision:latest \
  'describe images' \
  --schema 'species,description,count int' \
  -a https://static.simonwillison.net/static/2025/two-pelicans.jpg
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I got back this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
    "species": "Pelicans",
    "description": "The image features a striking brown pelican with its distinctive orange beak, characterized by its large size and impressive wingspan.",
    "count": 1
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(Actually a bit disappointing, as there are &lt;a href="https://static.simonwillison.net/static/2025/two-pelicans.jpg"&gt;two pelicans&lt;/a&gt; and their beaks are brown.)&lt;/p&gt;
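&lt;p&gt;That compact &lt;code&gt;--schema&lt;/code&gt; string is shorthand for a full JSON Schema. Here's a rough sketch of how such a string could expand - this is not LLM's actual parser, which handles more types and per-field descriptions:&lt;/p&gt;

```python
def concise_to_json_schema(spec):
    """Expand e.g. 'species,description,count int' into a JSON Schema dict."""
    type_map = {"int": "integer", "float": "number", "str": "string", "bool": "boolean"}
    properties = {}
    for field in spec.split(","):
        parts = field.strip().split()
        name = parts[0]
        # Fields default to string unless a type keyword follows the name
        json_type = type_map.get(parts[1], "string") if len(parts) > 1 else "string"
        properties[name] = {"type": json_type}
    return {"type": "object", "properties": properties, "required": list(properties)}

schema = concise_to_json_schema("species,description,count int")
print(schema["properties"]["count"])
# prints {'type': 'integer'}
```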


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vision-llms"&gt;vision-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ollama"&gt;ollama&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="ai"/><category term="generative-ai"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="llm"/><category term="vision-llms"/><category term="ollama"/></entry><entry><title>llm-mistral 0.11</title><link href="https://simonwillison.net/2025/Mar/4/llm-mistral-011/#atom-tag" rel="alternate"/><published>2025-03-04T07:05:21+00:00</published><updated>2025-03-04T07:05:21+00:00</updated><id>https://simonwillison.net/2025/Mar/4/llm-mistral-011/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-mistral/releases/tag/0.11"&gt;llm-mistral 0.11&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I added &lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/"&gt;schema support&lt;/a&gt; to this plugin, which provides access to the &lt;a href="https://docs.mistral.ai/api/"&gt;Mistral API&lt;/a&gt; from LLM. Release notes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Support for LLM &lt;a href="https://llm.datasette.io/en/stable/schemas.html"&gt;schemas&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm-mistral/issues/19"&gt;#19&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-o prefix '{'&lt;/code&gt; option for forcing a response prefix. &lt;a href="https://github.com/simonw/llm-mistral/issues/18"&gt;#18&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Schemas now work with OpenAI, Anthropic, Gemini and Mistral hosted models, plus self-hosted models via &lt;a href="https://www.ollama.com/"&gt;Ollama&lt;/a&gt; and &lt;a href="https://github.com/taketwo/llm-ollama"&gt;llm-ollama&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mistral"&gt;mistral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ollama"&gt;ollama&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="llm"/><category term="mistral"/><category term="ollama"/></entry><entry><title>Run LLMs on macOS using llm-mlx and Apple's MLX framework</title><link href="https://simonwillison.net/2025/Feb/15/llm-mlx/#atom-tag" rel="alternate"/><published>2025-02-15T19:13:29+00:00</published><updated>2025-02-15T19:13:29+00:00</updated><id>https://simonwillison.net/2025/Feb/15/llm-mlx/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://github.com/simonw/llm-mlx"&gt;llm-mlx&lt;/a&gt; is a brand new plugin for my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; Python Library and CLI utility which builds on top of Apple's excellent &lt;a href="https://github.com/ml-explore/mlx"&gt;MLX array framework library&lt;/a&gt; and &lt;a href="https://pypi.org/project/mlx-lm/"&gt;mlx-lm&lt;/a&gt; package. If you're a terminal user or Python developer with a Mac this may be the new easiest way to start exploring local Large Language Models.&lt;/p&gt;
&lt;h4 id="running-llama-3-2-3b-using-llm-mlx"&gt;Running Llama 3.2 3B using llm-mlx&lt;/h4&gt;
&lt;p&gt;If you haven't already got LLM installed you'll need &lt;a href="https://llm.datasette.io/en/stable/setup.html"&gt;to install it&lt;/a&gt; - you can do that in a bunch of different ways - in order of preference I like &lt;code&gt;uv tool install llm&lt;/code&gt; or &lt;code&gt;pipx install llm&lt;/code&gt; or &lt;code&gt;brew install llm&lt;/code&gt; or &lt;code&gt;pip install llm&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Next, install the new plugin (macOS only):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-mlx&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now download and register a model. Llama 3.2 3B is an excellent first choice - it's pretty small (a 1.8GB download) but is a surprisingly capable starter model.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm mlx download-model mlx-community/Llama-3.2-3B-Instruct-4bit&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This will download 1.8GB of model weights from &lt;a href="https://huggingface.co/mlx-community/Llama-3.2-3B-Instruct-4bit"&gt;mlx-community/Llama-3.2-3B-Instruct-4bit&lt;/a&gt; on Hugging Face and store them here:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;~/.cache/huggingface/hub/models--mlx-community--Llama-3.2-3B-Instruct-4bit
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you can start running prompts:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m mlx-community/Llama-3.2-3B-Instruct-4bit &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Python code to traverse a tree, briefly&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Which &lt;a href="https://gist.github.com/simonw/85ce6a0faf1a58a243f9e5f4b2daeb3d"&gt;output this&lt;/a&gt; for me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Here's a brief example of how to traverse a tree in Python:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-v"&gt;Node&lt;/span&gt;:
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;__init__&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;, &lt;span class="pl-s1"&gt;value&lt;/span&gt;):
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;value&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;value&lt;/span&gt;
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; []

&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;traverse_tree&lt;/span&gt;(&lt;span class="pl-s1"&gt;node&lt;/span&gt;):
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;node&lt;/span&gt; &lt;span class="pl-c1"&gt;is&lt;/span&gt; &lt;span class="pl-c1"&gt;None&lt;/span&gt;:
        &lt;span class="pl-k"&gt;return&lt;/span&gt;
    
    &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;node&lt;/span&gt;.&lt;span class="pl-c1"&gt;value&lt;/span&gt;)
    
    &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;child&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;node&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt;:
        &lt;span class="pl-en"&gt;traverse_tree&lt;/span&gt;(&lt;span class="pl-s1"&gt;child&lt;/span&gt;)

&lt;span class="pl-c"&gt;# Example usage:&lt;/span&gt;
&lt;span class="pl-s1"&gt;root&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"A"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;root&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [&lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"B"&lt;/span&gt;), &lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"C"&lt;/span&gt;)]
&lt;span class="pl-s1"&gt;root&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt;[&lt;span class="pl-c1"&gt;0&lt;/span&gt;].&lt;span class="pl-c1"&gt;children&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [&lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"D"&lt;/span&gt;), &lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"E"&lt;/span&gt;)]
&lt;span class="pl-s1"&gt;root&lt;/span&gt;.&lt;span class="pl-c1"&gt;children&lt;/span&gt;[&lt;span class="pl-c1"&gt;1&lt;/span&gt;].&lt;span class="pl-c1"&gt;children&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [&lt;span class="pl-en"&gt;Node&lt;/span&gt;(&lt;span class="pl-s"&gt;"F"&lt;/span&gt;)]

&lt;span class="pl-en"&gt;traverse_tree&lt;/span&gt;(&lt;span class="pl-s1"&gt;root&lt;/span&gt;)  &lt;span class="pl-c"&gt;# Output: A, B, D, E, C, F&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;In this example, we define a &lt;code&gt;Node&lt;/code&gt; class to represent each node in the tree, with a &lt;code&gt;value&lt;/code&gt; attribute and a list of &lt;code&gt;children&lt;/code&gt;. The &lt;code&gt;traverse_tree&lt;/code&gt; function recursively visits each node in the tree, printing its value.&lt;/p&gt;
&lt;p&gt;This is a basic example of a tree traversal, and there are many variations and optimizations depending on the specific use case.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That generation ran at an impressive &lt;a href="https://gist.github.com/simonw/85ce6a0faf1a58a243f9e5f4b2daeb3d?permalink_comment_id=5440701#gistcomment-5440701"&gt;152 tokens per second&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;That command was a bit of a mouthful, so let's assign an alias to the model:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm aliases &lt;span class="pl-c1"&gt;set&lt;/span&gt; l32 mlx-community/Llama-3.2-3B-Instruct-4bit&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now we can use that shorter alias instead:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m l32 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;a joke about a haggis buying a car&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(The &lt;a href="https://gist.github.com/simonw/af322b399ce448f766d9349d9dc1cb5d#response"&gt;joke&lt;/a&gt; isn't very good.)&lt;/p&gt;
&lt;p&gt;As with other models supported by LLM, you can also pipe things to it. Here's how to get it to explain a piece of Python code (in this case &lt;a href="https://github.com/simonw/llm-mlx/blob/0.2/llm_mlx.py"&gt;itself&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;cat llm_mlx.py &lt;span class="pl-k"&gt;|&lt;/span&gt; llm -m l32 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;explain this code&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The response started like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This code is a part of a larger project that uses the Hugging Face Transformers library to create a text-to-text conversational AI model. The code defines a custom model class &lt;code&gt;MlxModel&lt;/code&gt; and a set of command-line interface (CLI) commands for working with MLX models. [...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/a577c820b18591d4825b70679cee3f9c"&gt;the rest of the response&lt;/a&gt;. I'm pretty amazed at how well it did for a tiny 1.8GB model!&lt;/p&gt;

&lt;p&gt;This plugin can only run models that have been converted to work with Apple's MLX framework. Thankfully the &lt;a href="https://huggingface.co/mlx-community"&gt;mlx-community&lt;/a&gt; organization on Hugging Face has published over 1,000 of these. A few I've tried successfully:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/Qwen2.5-0.5B-Instruct-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/Qwen2.5-0.5B-Instruct-4bit"&gt;278 MB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/Mistral-7B-Instruct-v0.3-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/Mistral-7B-Instruct-v0.3-4bit"&gt;4.08 GB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/Mistral-Small-24B-Instruct-2501-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/Mistral-Small-24B-Instruct-2501-4bit"&gt;13.26 GB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit"&gt;18.5 GB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mlx-community/Llama-3.3-70B-Instruct-4bit&lt;/code&gt; - &lt;a href="https://huggingface.co/mlx-community/Llama-3.3-70B-Instruct-4bit"&gt;40 GB&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Mistral-Small-24B really does feel GPT-4 quality despite only needing around 12GB of RAM to run - so it's a good default model if you want to leave space to run other apps.&lt;/p&gt;

&lt;p&gt;DeepSeek-R1-Distill-Qwen-32B-4bit is a great model if you want to try out R1 style "reasoning" which outputs a &lt;code&gt;&amp;lt;think&amp;gt;...&amp;lt;/think&amp;gt;...&lt;/code&gt; block at the start of each response.&lt;/p&gt;

&lt;p&gt;That last Llama 70B one needs a machine with 64GB of RAM to run but is &lt;em&gt;very&lt;/em&gt; capable - I wrote about trying out the Ollama version of Llama 3.3 70B in &lt;a href="https://simonwillison.net/2024/Dec/9/llama-33-70b/"&gt;I can now run a GPT-4 class model on my laptop&lt;/a&gt; back in December.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm mlx download-model mlx-community/Llama-3.3-70B-Instruct-4bit&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/mlx-download-llama-33.jpg" alt="Screenshot of the download-model command running against the Llama 3.3 70B model, it shows fetching 13 files and displays progress bars for 8 different safetensors files each weighing between 2.6 and 3.3GB." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;For larger models like that I recommend using the &lt;code&gt;llm chat&lt;/code&gt; command as that ensures they stay in memory for multiple rounds of the conversation:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm chat -m mlx-community/Llama-3.3-70B-Instruct-4bit&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Activity Monitor shows the process using 37.5GB of RAM while the model is running.&lt;/p&gt;
&lt;p&gt;Some of these jokes are actually pretty good!&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;Chatting with mlx-community/Llama-3.3-70B-Instruct-4bit
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
&amp;gt; hi
How's your day going so far? Is there something I can help you with or would you like to chat?
&amp;gt; jokes about weird animals please
Here are a few:

1. Why did the axolotl go to therapy? Because it was struggling to come to terms with its identity crisis... and also because it had a lot of "regrowing" to do!
2. What did the aye-aye say to the other aye-aye when they were out on a date? "You've got a finger on the pulse of my heart!"
3. Why did the naked mole rat go to the party? Because it was a "hair-raising" experience and it wanted to get to the "root" of the fun!
4. What do you call a group of flamingos playing instruments? The Pink Floyd... of the avian world, of course!
5. Why did the blobfish go to the doctor? It had a "gut" feeling something was wrong... and also because it was feeling a little "deflated"!

Want more? I've got a whole zoo of 'em!
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;That ran at &lt;a href="https://gist.github.com/simonw/ec99b6cf65ae61256a34f3ac269951dc"&gt;8.8 tokens per second&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As with every LLM plugin, all prompts and responses are logged to &lt;a href="https://llm.datasette.io/en/stable/logging.html"&gt;a SQLite database&lt;/a&gt; that can be accessed using the &lt;code&gt;llm logs&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;You can use LLM models from your Python code as well. The README includes &lt;a href="https://github.com/simonw/llm-mlx/blob/0.2.1/README.md#using-models-from-python"&gt;notes on how to do this&lt;/a&gt;, as does the &lt;a href="https://llm.datasette.io/en/stable/python-api.html"&gt;LLM Python API documentation&lt;/a&gt;. TLDR:&lt;/p&gt;

&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;
&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"mlx-community/Llama-3.2-3B-Instruct-4bit"&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"hi"&lt;/span&gt;).&lt;span class="pl-c1"&gt;text&lt;/span&gt;())&lt;/pre&gt;

&lt;h4 id="mlx-and-mlx-lm"&gt;MLX and mlx-lm&lt;/h4&gt;
&lt;p&gt;The secret sauce behind this new plugin is Apple's &lt;a href="https://github.com/ml-explore/mlx"&gt;MLX library&lt;/a&gt;, which has been improving at an extraordinary pace over the past year - it first launched on GitHub in November 2023.&lt;/p&gt;
&lt;p&gt;The core library is C++, but the MLX team at Apple also maintains an &lt;a href="https://github.com/ml-explore/mlx-examples"&gt;mlx-examples&lt;/a&gt; repo with a wealth of Python examples, including the &lt;a href="https://github.com/ml-explore/mlx-examples/blob/main/llms/README.md"&gt;mlx-lm&lt;/a&gt; and &lt;a href="https://github.com/ml-explore/mlx-examples/tree/main/whisper"&gt;mlx-whisper&lt;/a&gt; packages.&lt;/p&gt;
&lt;p&gt;The community around it is growing too. I'm particularly impressed by Prince Canuma's &lt;a href="https://github.com/Blaizzy/mlx-vlm"&gt;mlx-vlm&lt;/a&gt; library for accessing multi-modal vision LLMs - I'd love to get that working as an LLM plugin once I figure out &lt;a href="https://github.com/Blaizzy/mlx-vlm/issues/68"&gt;how to use it for conversations&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I've used MLX for &lt;a href="https://simonwillison.net/tags/mlx/"&gt;a few experiments&lt;/a&gt; in the past, but &lt;a href="https://twitter.com/awnihannun/status/1890524526821126620"&gt;this tweet&lt;/a&gt; from MLX core developer Awni Hannun finally convinced me to wrap it up as an LLM plugin:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In the latest MLX small LLMs are a lot faster.&lt;/p&gt;
&lt;p&gt;On M4 Max 4-bit Qwen 0.5B generates 1k tokens at a whopping 510 toks/sec. And runs at over 150 tok/sec on iPhone 16 pro.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is &lt;em&gt;really&lt;/em&gt; good software. This small team at Apple appears to be almost single-handedly giving NVIDIA's CUDA a run for its money!&lt;/p&gt;
&lt;h4 id="building-the-plugin"&gt;Building the plugin&lt;/h4&gt;
&lt;p&gt;The &lt;a href="https://github.com/simonw/llm-mlx"&gt;llm-mlx&lt;/a&gt; plugin came together pretty quickly. The &lt;a href="https://github.com/simonw/llm-mlx/blob/0.1/llm_mlx.py"&gt;first version&lt;/a&gt; was ~100 lines of Python, much of it repurposed from my existing &lt;a href="https://github.com/simonw/llm-gguf"&gt;llm-gguf&lt;/a&gt; plugin.&lt;/p&gt;
&lt;p&gt;The hardest problem was figuring out how to hide the Hugging Face progress bars that displayed every time it attempted to access a model!&lt;/p&gt;
&lt;p&gt;I eventually found the &lt;code&gt;from huggingface_hub.utils import disable_progress_bars&lt;/code&gt; utility function by piping library code through Gemini 2.0.&lt;/p&gt;
&lt;p&gt;I then added &lt;a href="https://github.com/simonw/llm-mlx/issues/3"&gt;model options support&lt;/a&gt; allowing you to pass options like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m l32 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;a greeting&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -o temperature 1.0 -o seed 2&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So far using a fixed seed &lt;a href="https://github.com/simonw/llm-mlx/issues/3#issuecomment-2661025591"&gt;appears to produce repeatable results&lt;/a&gt;, which is exciting for writing more predictable tests in the future.&lt;/p&gt;

&lt;p&gt;For the &lt;a href="https://github.com/simonw/llm-mlx/blob/0.2/tests/test_mlx.py"&gt;automated tests&lt;/a&gt; that run in GitHub Actions I decided to use a small model - I went with the tiny 75MB &lt;a href="https://huggingface.co/mlx-community/SmolLM-135M-Instruct-4bit/tree/main"&gt;mlx-community/SmolLM-135M-Instruct-4bit&lt;/a&gt; (explored &lt;a href="https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/"&gt;previously&lt;/a&gt;). I configured GitHub Actions to cache the model in between CI runs by adding the following YAML to my &lt;a href="https://github.com/simonw/llm-mlx/blob/0.2/.github/workflows/test.yml"&gt;.github/workflows/test.yml&lt;/a&gt; file:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;    - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Cache models&lt;/span&gt;
      &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/cache@v4&lt;/span&gt;
      &lt;span class="pl-ent"&gt;with&lt;/span&gt;:
        &lt;span class="pl-ent"&gt;path&lt;/span&gt;: &lt;span class="pl-s"&gt;~/.cache/huggingface&lt;/span&gt;
        &lt;span class="pl-ent"&gt;key&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ runner.os }}-huggingface-&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/qwen"&gt;qwen&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prince-canuma"&gt;prince-canuma&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apple"/><category term="plugins"/><category term="projects"/><category term="ai"/><category term="github-actions"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="llm"/><category term="qwen"/><category term="mlx"/><category term="prince-canuma"/><category term="ai-in-china"/></entry><entry><title>llm-sort</title><link href="https://simonwillison.net/2025/Feb/11/llm-sort/#atom-tag" rel="alternate"/><published>2025-02-11T20:50:21+00:00</published><updated>2025-02-11T20:50:21+00:00</updated><id>https://simonwillison.net/2025/Feb/11/llm-sort/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/vagos/llm-sort"&gt;llm-sort&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Delightful &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; plugin by Evangelos Lamprou which adds the ability to perform "semantic sorting" - sorting the contents of a file using a prompt against an LLM to determine the sort order.&lt;/p&gt;
&lt;p&gt;Best illustrated by these examples from the README:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm sort --query "Which names is more suitable for a pet monkey?" names.txt

cat titles.txt | llm sort --query "Which book should I read to cook better?"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It works using this pairwise prompt, which is executed multiple times using Python's &lt;code&gt;sorted(documents, key=functools.cmp_to_key(compare_callback))&lt;/code&gt; mechanism:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Given the query:
{query}

Compare the following two lines:

Line A:
{docA}

Line B:
{docB}

Which line is more relevant to the query? Please answer with "Line A" or "Line B".
&lt;/code&gt;&lt;/pre&gt;
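&lt;p&gt;The sorting itself is plain Python comparator machinery. Here's a runnable sketch of the pattern - the LLM call is replaced by a deterministic fake (a length heuristic) so the example stands on its own; the real plugin sends the pairwise prompt above and parses the "Line A"/"Line B" answer:&lt;/p&gt;

```python
import functools


def compare_callback(doc_a, doc_b):
    # Hypothetical stand-in for llm-sort's LLM call: pretend the model
    # judged the longer line more relevant to the query.
    answer = "Line A" if len(doc_a) - len(doc_b) > 0 else "Line B"
    # "Line A" means doc_a ranks higher, i.e. should sort first.
    return -1 if answer == "Line A" else 1


documents = ["short", "a much longer line", "medium line"]
ranked = sorted(documents, key=functools.cmp_to_key(compare_callback))
print(ranked)  # longest line first, per our fake judgement
```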
&lt;p&gt;From &lt;a href="https://lobste.rs/s/yxlisx/llm_sort_sort_input_lines_semantically#c_enduz7"&gt;the lobste.rs comments&lt;/a&gt;, Cole Kurashige:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I'm not saying I'm prescient, but in The Before Times &lt;a href="https://github.com/cole-k/turksort"&gt;I did something similar&lt;/a&gt; with Mechanical Turk&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This made me realize that &lt;em&gt;so many&lt;/em&gt; of the patterns we were using against Mechanical Turk a decade+ ago can provide hints about potential ways to apply LLMs.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/yxlisx/llm_sort_sort_input_lines_semantically"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/mechanical-turk"&gt;mechanical-turk&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="mechanical-turk"/><category term="plugins"/><category term="python"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/></entry><entry><title>Using pip to install a Large Language Model that's under 100MB</title><link href="https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/#atom-tag" rel="alternate"/><published>2025-02-07T06:34:59+00:00</published><updated>2025-02-07T06:34:59+00:00</updated><id>https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/#atom-tag</id><summary type="html">
    &lt;p&gt;I just released &lt;a href="https://github.com/simonw/llm-smollm2"&gt;llm-smollm2&lt;/a&gt;, a new plugin for &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; that bundles a quantized copy of the &lt;a href="https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct"&gt;SmolLM2-135M-Instruct&lt;/a&gt; LLM inside of the Python package.&lt;/p&gt;
&lt;p&gt;This means you can now &lt;code&gt;pip install&lt;/code&gt; a full LLM!&lt;/p&gt;

&lt;p&gt;If you're already using &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; you can install it like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-smollm2&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then run prompts like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m SmolLM2 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Are dogs real?&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(New favourite test prompt for tiny models, courtesy of &lt;a href="https://bsky.app/profile/timfduffy.com/post/3lhknvwijps2o"&gt;Tim Duffy&lt;/a&gt;. &lt;a href="https://gist.github.com/simonw/a71d83cd8c6caa49cec4a9f296c267df"&gt;Here's the result&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;If you don't have LLM yet, first &lt;a href="https://llm.datasette.io/en/stable/setup.html"&gt;follow these installation instructions&lt;/a&gt;, or use &lt;code&gt;brew install llm&lt;/code&gt;, &lt;code&gt;pipx install llm&lt;/code&gt; or &lt;code&gt;uv tool install llm&lt;/code&gt;, depending on your preferred way of getting your Python tools.&lt;/p&gt;
&lt;p&gt;If you have &lt;a href="https://github.com/astral-sh/uv"&gt;uv&lt;/a&gt; set up you don't need to install anything at all! The following command will spin up an ephemeral environment, install the necessary packages and start a chat session with the model, all in one go:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx --with llm-smollm2 llm chat -m SmolLM2&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/smol.gif" alt="$ uvx --with llm-smollm2 llm chat -m SmolLM2
Chatting with SmolLM2
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
&gt; tell me a long story about a dog
The story of a dog is a tale of love, loss, and the unspoken rules of the universe. It begins with a small, timid, and timid dog named Max. Max was born into a loving family, but his life was forever changed by a cruel and cruel owner. He was left to fend for himself, and he was left to fend for himself.

Max's owner, a kind and gentle soul named Elena, was a cruel and cruel one. She had no patience, no love, and she would do anything to keep Max away from her. Max was left to fend for himself, and he was left to fend for himself.

Max's owner, Elena, was determined to make Max happy. She took Max to a beautiful and beautiful place, where Max was allowed to roam freely. Max was allowed to roam freely, and Max was allowed to roam freely. [Then repeats that sentence many times]" style="max-width: 100%;" /&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/#finding-a-tiny-model"&gt;Finding a tiny model&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/#building-the-plugin"&gt;Building the plugin&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/#packaging-the-plugin"&gt;Packaging the plugin&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/#publishing-to-pypi"&gt;Publishing to PyPI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Feb/7/pip-install-llm-smollm2/#is-the-model-any-good-"&gt;Is the model any good?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="finding-a-tiny-model"&gt;Finding a tiny model&lt;/h4&gt;
&lt;p&gt;The fact that the model is almost exactly 100MB is no coincidence: that's the &lt;a href="https://pypi.org/help/#file-size-limit"&gt;default size limit&lt;/a&gt; for a Python package that can be uploaded to the Python Package Index (PyPI).&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://bsky.app/profile/simonwillison.net/post/3lhklqd62jc2x"&gt;asked on Bluesky&lt;/a&gt; if anyone had seen a just-about-usable GGUF model that was under 100MB, and Artisan Loaf &lt;a href="https://bsky.app/profile/artisanloaf.bsky.social/post/3lhklumfhvs2r"&gt;pointed me&lt;/a&gt; to &lt;a href="https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct"&gt;SmolLM2-135M-Instruct&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I ended up using &lt;a href="https://huggingface.co/QuantFactory/SmolLM2-135M-Instruct-GGUF/tree/main"&gt;this quantization&lt;/a&gt; by &lt;a href="https://huggingface.co/QuantFactory"&gt;QuantFactory&lt;/a&gt; just because it was the first sub-100MB model I tried that worked.&lt;/p&gt;
&lt;p&gt;Trick for finding quantized models: Hugging Face has a neat "model tree" feature in the side panel of their model pages, which includes links to relevant quantized models. I find most of my GGUFs using that feature.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/hugging-face-model-tree.jpg" alt="Model tree for HuggingFaceTB/SmolLM2-135M-Instruct. 60 Quantizations, 6 adapters, 80 finetunes, 1 merge." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="building-the-plugin"&gt;Building the plugin&lt;/h4&gt;
&lt;p&gt;I first tried the model out using Python and the &lt;a href="https://github.com/abetlen/llama-cpp-python"&gt;llama-cpp-python&lt;/a&gt; library like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv run --with llama-cpp-python python&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;llama_cpp&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;Llama&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;pprint&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;pprint&lt;/span&gt;
&lt;span class="pl-s1"&gt;llm&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;Llama&lt;/span&gt;(&lt;span class="pl-s1"&gt;model_path&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"SmolLM2-135M-Instruct.Q4_1.gguf"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;output&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;create_chat_completion&lt;/span&gt;(&lt;span class="pl-s1"&gt;messages&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[
    {&lt;span class="pl-s"&gt;"role"&lt;/span&gt;: &lt;span class="pl-s"&gt;"user"&lt;/span&gt;, &lt;span class="pl-s"&gt;"content"&lt;/span&gt;: &lt;span class="pl-s"&gt;"Hi"&lt;/span&gt;}
])
&lt;span class="pl-en"&gt;pprint&lt;/span&gt;(&lt;span class="pl-s1"&gt;output&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;This gave me the output I was expecting:&lt;/p&gt;
&lt;pre&gt;{&lt;span class="pl-s"&gt;'choices'&lt;/span&gt;: [{&lt;span class="pl-s"&gt;'finish_reason'&lt;/span&gt;: &lt;span class="pl-s"&gt;'stop'&lt;/span&gt;,
              &lt;span class="pl-s"&gt;'index'&lt;/span&gt;: &lt;span class="pl-c1"&gt;0&lt;/span&gt;,
              &lt;span class="pl-s"&gt;'logprobs'&lt;/span&gt;: &lt;span class="pl-c1"&gt;None&lt;/span&gt;,
              &lt;span class="pl-s"&gt;'message'&lt;/span&gt;: {&lt;span class="pl-s"&gt;'content'&lt;/span&gt;: &lt;span class="pl-s"&gt;'Hello! How can I assist you today?'&lt;/span&gt;,
                          &lt;span class="pl-s"&gt;'role'&lt;/span&gt;: &lt;span class="pl-s"&gt;'assistant'&lt;/span&gt;}}],
 &lt;span class="pl-s"&gt;'created'&lt;/span&gt;: &lt;span class="pl-c1"&gt;1738903256&lt;/span&gt;,
 &lt;span class="pl-s"&gt;'id'&lt;/span&gt;: &lt;span class="pl-s"&gt;'chatcmpl-76ea1733-cc2f-46d4-9939-90efa2a05e7c'&lt;/span&gt;,
 &lt;span class="pl-s"&gt;'model'&lt;/span&gt;: &lt;span class="pl-s"&gt;'SmolLM2-135M-Instruct.Q4_1.gguf'&lt;/span&gt;,
 &lt;span class="pl-s"&gt;'object'&lt;/span&gt;: &lt;span class="pl-s"&gt;'chat.completion'&lt;/span&gt;,
 &lt;span class="pl-s"&gt;'usage'&lt;/span&gt;: {&lt;span class="pl-s"&gt;'completion_tokens'&lt;/span&gt;: &lt;span class="pl-c1"&gt;9&lt;/span&gt;, &lt;span class="pl-s"&gt;'prompt_tokens'&lt;/span&gt;: &lt;span class="pl-c1"&gt;31&lt;/span&gt;, &lt;span class="pl-s"&gt;'total_tokens'&lt;/span&gt;: &lt;span class="pl-c1"&gt;40&lt;/span&gt;}}&lt;/pre&gt;
&lt;p&gt;But it also &lt;em&gt;spammed&lt;/em&gt; my terminal with a huge volume of debugging output - which started like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llama_model_load_from_file_impl: using device Metal (Apple M2 Max) - 49151 MiB free
llama_model_loader: loaded meta data with 33 key-value pairs and 272 tensors from SmolLM2-135M-Instruct.Q4_1.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then continued for more than &lt;a href="https://gist.github.com/simonw/9ef7acd836b1cc40c14686eae4dca340"&gt;500 lines&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;I've had this problem with &lt;code&gt;llama-cpp-python&lt;/code&gt; and &lt;code&gt;llama.cpp&lt;/code&gt; in the past, and was sad to find that the documentation still doesn't have a great answer for how to avoid this.&lt;/p&gt;
&lt;p&gt;So I turned to the just released &lt;a href="https://simonwillison.net/2025/Feb/5/gemini-2/"&gt;Gemini 2.0 Pro (Experimental)&lt;/a&gt;, because I know it's a strong model with a long input limit.&lt;/p&gt;
&lt;p&gt;I ran the entire &lt;code&gt;llama-cpp-python&lt;/code&gt; codebase through it like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c1"&gt;cd&lt;/span&gt; /tmp
git clone https://github.com/abetlen/llama-cpp-python
&lt;span class="pl-c1"&gt;cd&lt;/span&gt; llama-cpp-python
files-to-prompt -e py &lt;span class="pl-c1"&gt;.&lt;/span&gt; -c &lt;span class="pl-k"&gt;|&lt;/span&gt; llm -m gemini-2.0-pro-exp-02-05 \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;How can I prevent this library from logging any information at all while it is running - no stderr or anything like that&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/20476c2c6f7604df2994212cebfafef4#response"&gt;the answer I got back&lt;/a&gt;. It recommended setting the logger to &lt;code&gt;logging.CRITICAL&lt;/code&gt;, passing &lt;code&gt;verbose=False&lt;/code&gt; to the constructor and, most importantly, using the following context manager to suppress all output:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;contextlib&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;contextmanager&lt;/span&gt;, &lt;span class="pl-s1"&gt;redirect_stderr&lt;/span&gt;, &lt;span class="pl-s1"&gt;redirect_stdout&lt;/span&gt;

&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;contextmanager&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;suppress_output&lt;/span&gt;():
    &lt;span class="pl-s"&gt;"""&lt;/span&gt;
&lt;span class="pl-s"&gt;    Suppresses all stdout and stderr output within the context.&lt;/span&gt;
&lt;span class="pl-s"&gt;    """&lt;/span&gt;
    &lt;span class="pl-k"&gt;with&lt;/span&gt; &lt;span class="pl-en"&gt;open&lt;/span&gt;(&lt;span class="pl-s1"&gt;os&lt;/span&gt;.&lt;span class="pl-c1"&gt;devnull&lt;/span&gt;, &lt;span class="pl-s"&gt;"w"&lt;/span&gt;) &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-s1"&gt;devnull&lt;/span&gt;:
        &lt;span class="pl-k"&gt;with&lt;/span&gt; &lt;span class="pl-en"&gt;redirect_stdout&lt;/span&gt;(&lt;span class="pl-s1"&gt;devnull&lt;/span&gt;), &lt;span class="pl-en"&gt;redirect_stderr&lt;/span&gt;(&lt;span class="pl-s1"&gt;devnull&lt;/span&gt;):
            &lt;span class="pl-k"&gt;yield&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;This worked! It turned out most of the output came from initializing the &lt;code&gt;LLM&lt;/code&gt; class, so I wrapped that like so:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;with&lt;/span&gt; &lt;span class="pl-en"&gt;suppress_output&lt;/span&gt;():
    &lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;Llama&lt;/span&gt;(&lt;span class="pl-s1"&gt;model_path&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;model_path&lt;/span&gt;, &lt;span class="pl-s1"&gt;verbose&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;False&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;Proof of concept in hand, I set about writing the plugin. I started with my &lt;a href="https://github.com/simonw/llm-plugin"&gt;simonw/llm-plugin&lt;/a&gt; cookiecutter template:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx cookiecutter gh:simonw/llm-plugin&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;  [1/6] plugin_name (): smollm2
  [2/6] description (): SmolLM2-135M-Instruct.Q4_1 for LLM
  [3/6] hyphenated (smollm2): 
  [4/6] underscored (smollm2): 
  [5/6] github_username (): simonw
  [6/6] author_name (): Simon Willison
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;a href="https://github.com/simonw/llm-smollm2/blob/0.1.1/llm_smollm2/__init__.py"&gt;rest of the plugin&lt;/a&gt; was mostly borrowed from my existing &lt;a href="https://github.com/simonw/llm-gguf/blob/0.2/llm_gguf.py"&gt;llm-gguf&lt;/a&gt; plugin, updated based on the latest README for the &lt;code&gt;llama-cpp-python&lt;/code&gt; project.&lt;/p&gt;
&lt;p&gt;There's more information on building plugins in &lt;a href="https://llm.datasette.io/en/stable/plugins/tutorial-model-plugin.html"&gt;the tutorial on writing a plugin&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="packaging-the-plugin"&gt;Packaging the plugin&lt;/h4&gt;
&lt;p&gt;Once I had that working the last step was to figure out how to package it for PyPI. I'm never quite sure of the best way to bundle a binary file in a Python package, especially one that uses a &lt;code&gt;pyproject.toml&lt;/code&gt; file... so I dumped a copy of my existing &lt;code&gt;pyproject.toml&lt;/code&gt; file into o3-mini-high and prompted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Modify this to bundle a SmolLM2-135M-Instruct.Q4_1.gguf file inside the package. I don't want to use hatch or a manifest or anything, I just want to use setuptools.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://chatgpt.com/share/67a59122-67c8-8006-9be4-29f8419343ad"&gt;the shared transcript&lt;/a&gt; - it gave me exactly what I wanted. I bundled it by adding this to the end of the &lt;code&gt;toml&lt;/code&gt; file:&lt;/p&gt;
&lt;div class="highlight highlight-source-toml"&gt;&lt;pre&gt;[&lt;span class="pl-en"&gt;tool&lt;/span&gt;.&lt;span class="pl-en"&gt;setuptools&lt;/span&gt;.&lt;span class="pl-en"&gt;package-data&lt;/span&gt;]
&lt;span class="pl-smi"&gt;llm_smollm2&lt;/span&gt; = [&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;SmolLM2-135M-Instruct.Q4_1.gguf&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then I dropped that &lt;code&gt;.gguf&lt;/code&gt; file into the &lt;code&gt;llm_smollm2/&lt;/code&gt; directory and put my plugin code in &lt;code&gt;llm_smollm2/__init__.py&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I tested it locally by running this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;python -m pip install build
python -m build&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I fired up a fresh virtual environment and ran &lt;code&gt;pip install ../path/to/llm-smollm2/dist/llm_smollm2-0.1-py3-none-any.whl&lt;/code&gt; to confirm that the package worked as expected.&lt;/p&gt;
&lt;h4 id="publishing-to-pypi"&gt;Publishing to PyPI&lt;/h4&gt;
&lt;p&gt;My cookiecutter template comes with &lt;a href="https://github.com/simonw/llm-smollm2/blob/main/.github/workflows/publish.yml"&gt;a GitHub Actions workflow&lt;/a&gt; that publishes the package to PyPI when a new release is created using the GitHub web interface. Here's the relevant YAML:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;  &lt;span class="pl-ent"&gt;deploy&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;runs-on&lt;/span&gt;: &lt;span class="pl-s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="pl-ent"&gt;needs&lt;/span&gt;: &lt;span class="pl-s"&gt;[test]&lt;/span&gt;
    &lt;span class="pl-ent"&gt;environment&lt;/span&gt;: &lt;span class="pl-s"&gt;release&lt;/span&gt;
    &lt;span class="pl-ent"&gt;permissions&lt;/span&gt;:
      &lt;span class="pl-ent"&gt;id-token&lt;/span&gt;: &lt;span class="pl-s"&gt;write&lt;/span&gt;
    &lt;span class="pl-ent"&gt;steps&lt;/span&gt;:
    - &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/checkout@v4&lt;/span&gt;
    - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Set up Python&lt;/span&gt;
      &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/setup-python@v5&lt;/span&gt;
      &lt;span class="pl-ent"&gt;with&lt;/span&gt;:
        &lt;span class="pl-ent"&gt;python-version&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;3.13&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
        &lt;span class="pl-ent"&gt;cache&lt;/span&gt;: &lt;span class="pl-s"&gt;pip&lt;/span&gt;
        &lt;span class="pl-ent"&gt;cache-dependency-path&lt;/span&gt;: &lt;span class="pl-s"&gt;pyproject.toml&lt;/span&gt;
    - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Install dependencies&lt;/span&gt;
      &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|&lt;/span&gt;
&lt;span class="pl-s"&gt;        pip install setuptools wheel build&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;    - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Build&lt;/span&gt;
      &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|&lt;/span&gt;
&lt;span class="pl-s"&gt;        python -m build&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;    - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Publish&lt;/span&gt;
      &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;pypa/gh-action-pypi-publish@release/v1&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This runs after the &lt;code&gt;test&lt;/code&gt; job has passed. It uses the &lt;a href="https://github.com/pypa/gh-action-pypi-publish"&gt;pypa/gh-action-pypi-publish&lt;/a&gt; Action to publish to PyPI - I wrote more about how that works &lt;a href="https://til.simonwillison.net/pypi/pypi-releases-from-github"&gt;in this TIL&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="is-the-model-any-good-"&gt;Is the model any good?&lt;/h4&gt;
&lt;p&gt;This one really isn't! It's not surprising, but it turns out 94MB just isn't enough space for a model that can do anything useful.&lt;/p&gt;
&lt;p&gt;It's &lt;em&gt;super&lt;/em&gt; fun to play with, and I continue to maintain that small, weak models are a great way to help build a mental model of how this technology actually works.&lt;/p&gt;
&lt;p&gt;That's not to say SmolLM2 isn't a fantastic model family. I'm running the smallest, most restricted version here. &lt;a href="https://huggingface.co/blog/smollm"&gt;SmolLM - blazingly fast and remarkably powerful&lt;/a&gt; describes the full model family - which comes in 135M, 360M, and 1.7B sizes. The larger versions are a whole lot more capable.&lt;/p&gt;
&lt;p&gt;If anyone can figure out something genuinely useful to do with the 94MB version I'd love to hear about it.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/pip"&gt;pip&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pypi"&gt;pypi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/smollm"&gt;smollm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/o3"&gt;o3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama-cpp"&gt;llama-cpp&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="pip"/><category term="plugins"/><category term="projects"/><category term="pypi"/><category term="python"/><category term="ai"/><category term="github-actions"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="gemini"/><category term="uv"/><category term="smollm"/><category term="o3"/><category term="llama-cpp"/></entry><entry><title>llm-anthropic</title><link href="https://simonwillison.net/2025/Feb/2/llm-anthropic/#atom-tag" rel="alternate"/><published>2025-02-02T06:17:53+00:00</published><updated>2025-02-02T06:17:53+00:00</updated><id>https://simonwillison.net/2025/Feb/2/llm-anthropic/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-anthropic"&gt;llm-anthropic&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I've renamed my &lt;a href="https://github.com/simonw/llm-claude-3"&gt;llm-claude-3&lt;/a&gt; plugin to &lt;code&gt;llm-anthropic&lt;/code&gt;, on the basis that Claude 4 will probably happen at some point, so this is a better name for the plugin.&lt;/p&gt;
&lt;p&gt;If you're a previous user of &lt;code&gt;llm-claude-3&lt;/code&gt; you can upgrade to the new plugin like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install -U llm-claude-3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This should remove the old plugin and install the new one, because the latest &lt;code&gt;llm-claude-3&lt;/code&gt; depends on &lt;code&gt;llm-anthropic&lt;/code&gt;. Just installing &lt;code&gt;llm-anthropic&lt;/code&gt; may leave you with both plugins installed at once.&lt;/p&gt;
&lt;p&gt;There is one extra manual step you'll need to take during this upgrade: creating a new &lt;code&gt;anthropic&lt;/code&gt; stored key with the same API token you previously stored under &lt;code&gt;claude&lt;/code&gt;. You can do that like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm keys set anthropic --value "$(llm keys get claude)"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I released &lt;a href="https://github.com/simonw/llm-anthropic/releases/tag/0.12"&gt;llm-anthropic 0.12&lt;/a&gt; yesterday with new features not previously included in &lt;code&gt;llm-claude-3&lt;/code&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Support for Claude's &lt;a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/prefill-claudes-response"&gt;prefill&lt;/a&gt; feature, using the new &lt;code&gt;-o prefill '{'&lt;/code&gt; option and the accompanying &lt;code&gt;-o hide_prefill 1&lt;/code&gt; option to prevent the prefill from being included in the output text. &lt;a href="https://github.com/simonw/llm-anthropic/issues/2"&gt;#2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;New &lt;code&gt;-o stop_sequences '```'&lt;/code&gt; option for specifying one or more stop sequences. To specify multiple stop sequences pass a JSON array of strings: &lt;code&gt;-o stop_sequences '["end", "stop"]'&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Model options are now documented in the README.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you install or upgrade &lt;code&gt;llm-claude-3&lt;/code&gt; you will now get &lt;code&gt;llm-anthropic&lt;/code&gt; instead, thanks to a tiny package on PyPI which depends on the new plugin name. I created that with my &lt;a href="https://github.com/simonw/pypi-rename"&gt;pypi-rename&lt;/a&gt; cookiecutter template.&lt;/p&gt;
&lt;p&gt;Here's the &lt;a href="https://github.com/simonw/llm-claude-3/issues/31"&gt;issue for the rename&lt;/a&gt;. I archived the &lt;a href="https://github.com/simonw/llm-claude-3"&gt;llm-claude-3 repository on GitHub&lt;/a&gt;, and got to use the brand new &lt;a href="https://simonwillison.net/2025/Jan/30/pypi-now-supports-project-archival/"&gt;PyPI archiving feature&lt;/a&gt; to archive the &lt;a href="https://pypi.org/project/llm-claude-3/"&gt;llm-claude-3 project on PyPI&lt;/a&gt; as well.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pypi"&gt;pypi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="pypi"/><category term="python"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="anthropic"/><category term="claude"/></entry><entry><title>llm-openrouter 0.3</title><link href="https://simonwillison.net/2024/Dec/8/llm-openrouter-03/#atom-tag" rel="alternate"/><published>2024-12-08T23:56:14+00:00</published><updated>2024-12-08T23:56:14+00:00</updated><id>https://simonwillison.net/2024/Dec/8/llm-openrouter-03/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-openrouter/releases/tag/0.3"&gt;llm-openrouter 0.3&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New release of my &lt;a href="https://github.com/simonw/llm-openrouter"&gt;llm-openrouter&lt;/a&gt; plugin, which allows &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; to access models hosted by &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Quoting the release notes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Enable image attachments for models that support images. Thanks, &lt;a href="https://github.com/montasaurus"&gt;Adam Montgomery&lt;/a&gt;. &lt;a href="https://github.com/simonw/llm-openrouter/issues/12"&gt;#12&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Provide async model access. &lt;a href="https://github.com/simonw/llm-openrouter/issues/15"&gt;#15&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix documentation to list correct &lt;code&gt;LLM_OPENROUTER_KEY&lt;/code&gt; environment variable. &lt;a href="https://github.com/simonw/llm-openrouter/issues/10"&gt;#10&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/releases"&gt;releases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openrouter"&gt;openrouter&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="releases"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="openrouter"/></entry><entry><title>datasette-enrichments-llm</title><link href="https://simonwillison.net/2024/Dec/5/datasette-enrichments-llm/#atom-tag" rel="alternate"/><published>2024-12-05T23:46:48+00:00</published><updated>2024-12-05T23:46:48+00:00</updated><id>https://simonwillison.net/2024/Dec/5/datasette-enrichments-llm/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-enrichments-llm"&gt;datasette-enrichments-llm&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Today's new alpha release is &lt;strong&gt;datasette-enrichments-llm&lt;/strong&gt;, a plugin for Datasette 1.0a+ that provides an &lt;a href="https://simonwillison.net/2023/Dec/1/datasette-enrichments/"&gt;enrichment&lt;/a&gt; that lets you run prompts against data from one or more columns and store the result in another column.&lt;/p&gt;
&lt;p&gt;So far it's a light re-implementation of the existing &lt;a href="https://github.com/datasette/datasette-enrichments-gpt"&gt;datasette-enrichments-gpt&lt;/a&gt; plugin, now using the new &lt;a href="https://llm.datasette.io/en/stable/python-api.html#python-api-listing-models"&gt;llm.get_async_models()&lt;/a&gt; method to allow users to select any async-enabled model that has been registered by a plugin - so currently any of the models from OpenAI, Anthropic, Gemini or Mistral via their &lt;a href="https://llm.datasette.io/en/stable/plugins/directory.html#remote-apis"&gt;respective plugins&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Still plenty to do on this one. Next step is to integrate it with &lt;a href="https://simonwillison.net/2024/Dec/2/datasette-llm-usage/"&gt;datasette-llm-usage&lt;/a&gt; and use it to drive a design-complete stable version of that.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/releases"&gt;releases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/enrichments"&gt;enrichments&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="releases"/><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="enrichments"/></entry><entry><title>datasette-queries</title><link href="https://simonwillison.net/2024/Dec/3/datasette-queries/#atom-tag" rel="alternate"/><published>2024-12-03T23:59:26+00:00</published><updated>2024-12-03T23:59:26+00:00</updated><id>https://simonwillison.net/2024/Dec/3/datasette-queries/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-queries"&gt;datasette-queries&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I released the first alpha of a new plugin to replace the crusty old &lt;a href="https://github.com/simonw/datasette-saved-queries"&gt;datasette-saved-queries&lt;/a&gt;. This one adds a new UI element to the top of the query results page with an expandable form for saving the query as a new &lt;a href="https://docs.datasette.io/en/stable/sql_queries.html#canned-queries"&gt;canned query&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Animated demo. I start on the table page, run a search, click View and edit SQL, then on the SQL query page open a Save query dialog, click a Suggest title and description button, wait for that to suggest something and click save." src="https://static.simonwillison.net/static/2024/datasette-queries.gif" /&gt;&lt;/p&gt;
&lt;p&gt;It's my first plugin to depend on LLM and &lt;a href="https://simonwillison.net/2024/Dec/2/datasette-llm-usage/"&gt;datasette-llm-usage&lt;/a&gt; - it uses GPT-4o mini to power an optional "Suggest title and description" button, labeled with the becoming-standard ✨ sparkles emoji to indicate an LLM-powered feature.&lt;/p&gt;
&lt;p&gt;I intend to expand this to work across multiple models as I continue to iterate on &lt;code&gt;datasette-llm-usage&lt;/code&gt; to better support those kinds of patterns.&lt;/p&gt;
&lt;p&gt;For the moment though each suggested title and description call costs about 250 input tokens and 50 output tokens, which against GPT-4o mini adds up to 0.0067 cents.&lt;/p&gt;
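&lt;p&gt;Rough arithmetic behind that figure, assuming GPT-4o mini's list prices of $0.15 (15 cents) per million input tokens and $0.60 (60 cents) per million output tokens:&lt;/p&gt;

```python
# GPT-4o mini list prices (assumed): 15 cents per million input tokens,
# 60 cents per million output tokens.
total_cents = (250 * 15 + 50 * 60) / 1_000_000
print(total_cents)  # 0.00675 - roughly 0.0067 cents per call
```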


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/releases"&gt;releases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="releases"/><category term="ai"/><category term="datasette"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/></entry><entry><title>datasette-llm-usage</title><link href="https://simonwillison.net/2024/Dec/2/datasette-llm-usage/#atom-tag" rel="alternate"/><published>2024-12-02T21:33:05+00:00</published><updated>2024-12-02T21:33:05+00:00</updated><id>https://simonwillison.net/2024/Dec/2/datasette-llm-usage/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-llm-usage"&gt;datasette-llm-usage&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I released the first alpha of a Datasette plugin to help track LLM usage by other plugins, with the goal of supporting token allowances - both for things like free public apps that stop working after a daily allowance, and for free previews of AI features in paid-account-based projects such as Datasette Cloud.&lt;/p&gt;
&lt;p&gt;It's using the usage features I added in &lt;a href="https://simonwillison.net/2024/Dec/1/llm-019/"&gt;LLM 0.19&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The alpha doesn't do much yet - it will start getting interesting once I upgrade other plugins to depend on it.&lt;/p&gt;
&lt;p&gt;Design notes so far in &lt;a href="https://github.com/datasette/datasette-llm-usage/issues/1"&gt;issue #1&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/releases"&gt;releases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="releases"/><category term="ai"/><category term="datasette"/><category term="datasette-cloud"/><category term="generative-ai"/><category term="llms"/><category term="llm"/></entry><entry><title>Ask questions of SQLite databases and CSV/JSON files in your terminal</title><link href="https://simonwillison.net/2024/Nov/25/ask-questions-of-sqlite/#atom-tag" rel="alternate"/><published>2024-11-25T01:33:03+00:00</published><updated>2024-11-25T01:33:03+00:00</updated><id>https://simonwillison.net/2024/Nov/25/ask-questions-of-sqlite/#atom-tag</id><summary type="html">
    &lt;p&gt;I built a new plugin for my &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html"&gt;sqlite-utils CLI tool&lt;/a&gt; that lets you ask human-language questions directly of SQLite databases and CSV/JSON files on your computer.&lt;/p&gt;
&lt;p&gt;It's called &lt;a href="https://github.com/simonw/sqlite-utils-ask"&gt;sqlite-utils-ask&lt;/a&gt;. Here's how you install it:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install sqlite-utils-ask&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It picks up API keys from an &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; environment variable, or you can &lt;a href="https://llm.datasette.io/"&gt;install LLM&lt;/a&gt; and use &lt;a href="https://llm.datasette.io/en/stable/setup.html#saving-and-using-stored-keys"&gt;llm keys set openai&lt;/a&gt; to store a key in a configuration file.&lt;/p&gt;
&lt;p&gt;Then you can use it like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl -O https://datasette.io/content.db
sqlite-utils ask content.db &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;how many sqlite-utils pypi downloads in 2024?&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This command will extract the SQL schema for the provided database file, send that through an LLM along with your question, get back a SQL query and attempt to run it to derive a result.&lt;/p&gt;
&lt;p&gt;If all goes well it spits out an answer something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;SELECT SUM(downloads)
FROM stats
WHERE package = 'sqlite-utils' AND date &amp;gt;= '2024-01-01' AND date &amp;lt; '2025-01-01';

[
    {
        "SUM(downloads)": 4300221
    }
]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the SQL query fails to execute (due to a syntax error of some kind) it passes that error back to the model for corrections and retries up to three times before giving up.&lt;/p&gt;
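&lt;p&gt;The generate-run-retry loop can be sketched in a few lines. This is my illustration rather than the plugin's actual code, with &lt;code&gt;ask_model&lt;/code&gt; standing in for the real LLM call:&lt;/p&gt;

```python
import sqlite3

def answer(db, question, ask_model, retries=3):
    # Ask the model for SQL, try to execute it, and on failure feed the
    # error message back in for a corrected attempt (up to `retries` tries).
    prompt = question
    for _ in range(retries):
        sql = ask_model(prompt)
        try:
            return sql, db.execute(sql).fetchall()
        except sqlite3.OperationalError as ex:
            prompt = f"{question}\nThat query failed with: {ex}\nTry again."
    raise RuntimeError(f"Gave up after {retries} attempts")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stats (package TEXT, downloads INTEGER)")
db.execute("INSERT INTO stats VALUES ('sqlite-utils', 4300221)")

# Simulate a model that gets the syntax wrong once, then corrects itself:
attempts = iter(["SELCT oops", "SELECT SUM(downloads) FROM stats"])
sql, rows = answer(db, "how many downloads?", lambda prompt: next(attempts))
print(rows)  # [(4300221,)]
```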
&lt;p&gt;Add &lt;code&gt;-v/--verbose&lt;/code&gt; to see the exact prompt it's using:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;System prompt:
You will be given a SQLite schema followed by a question. Generate a single SQL
query to answer that question. Return that query in a ```sql ... ```
fenced code block.

Example: How many repos are there?
Answer:
```sql
select count(*) from repos
```

Prompt:
...
CREATE TABLE [stats] (
   [package] TEXT,
   [date] TEXT,
   [downloads] INTEGER,
   PRIMARY KEY ([package], [date])
);
...
how many sqlite-utils pypi downloads in 2024?
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I've truncated the above to just the relevant table - it actually includes the full schema of every table in that database.&lt;/p&gt;
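&lt;p&gt;Since the system prompt asks for the answer inside a fenced SQL code block, getting the query back out is a small parsing step. A minimal sketch of that idea - my illustration, not the plugin's actual implementation:&lt;/p&gt;

```python
import re

FENCE = "`" * 3  # triple backtick

def extract_sql(text):
    # Pull the SQL out of a fenced sql code block in the model's reply,
    # falling back to the whole reply if no fence is found.
    match = re.search(FENCE + r"sql\s*(.*?)" + FENCE, text, re.DOTALL)
    return match.group(1).strip() if match else text.strip()

reply = f"Here you go:\n{FENCE}sql\nselect count(*) from repos\n{FENCE}\n"
print(extract_sql(reply))  # select count(*) from repos
```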
&lt;p&gt;By default, the tool sends just that database schema and your question to the LLM. If you add the &lt;code&gt;-e/--examples&lt;/code&gt; option it will also include five common values for each of the text columns in that schema with an average length of less than 32 characters. This can sometimes help get a better result - for example, sending the values "CA", "FL" and "TX" for a &lt;code&gt;state&lt;/code&gt; column can tip the model off that it should use state abbreviations rather than full names in its queries.&lt;/p&gt;
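&lt;p&gt;The selection logic behind &lt;code&gt;-e/--examples&lt;/code&gt; can be approximated like this - a sketch of the idea, not the plugin's actual code, with &lt;code&gt;common_values()&lt;/code&gt; as a hypothetical helper:&lt;/p&gt;

```python
import sqlite3

def common_values(conn, table, limit=5, max_avg_length=32):
    # For each TEXT column whose values average under max_avg_length
    # characters, collect its most common values.
    examples = {}
    for cid, name, ctype, *_ in conn.execute(f"PRAGMA table_info([{table}])"):
        if ctype.upper() != "TEXT":
            continue
        avg = conn.execute(
            f"SELECT AVG(LENGTH([{name}])) FROM [{table}]"
        ).fetchone()[0]
        if avg is None or avg >= max_avg_length:
            continue
        rows = conn.execute(
            f"SELECT [{name}], COUNT(*) AS n FROM [{table}] "
            f"GROUP BY [{name}] ORDER BY n DESC LIMIT ?",
            (limit,),
        ).fetchall()
        examples[name] = [value for value, _ in rows]
    return examples

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (state TEXT, notes TEXT)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?)",
    [("CA", "x" * 100), ("CA", "y" * 100), ("FL", "z" * 100), ("TX", "w" * 100)],
)
print(common_values(conn, "places"))  # long "notes" column is excluded
```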
&lt;h4 id="ask-files"&gt;Asking questions of CSV and JSON data&lt;/h4&gt;
&lt;p&gt;The core &lt;code&gt;sqlite-utils&lt;/code&gt; CLI usually works against SQLite files directly, but three years ago I added the ability to run SQL queries against CSV and JSON files directly with the &lt;a href="https://simonwillison.net/2021/Jun/19/sqlite-utils-memory/"&gt;sqlite-utils memory&lt;/a&gt; command. This works by loading that data into an in-memory SQLite database before executing a SQL query.&lt;/p&gt;
&lt;p&gt;I decided to reuse that mechanism to enable LLM prompts against CSV and JSON data directly as well.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;sqlite-utils ask-files&lt;/code&gt; command looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils ask-files transactions.csv &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;total sales by year&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This command accepts one or more files, and you can provide a mix of CSV, TSV and JSON. Each provided file will be imported into a different table, allowing the model to construct join queries where necessary.&lt;/p&gt;
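&lt;p&gt;The underlying mechanism - each file loaded into its own table in an in-memory SQLite database - can be sketched like this. This is a simplified illustration with a hypothetical &lt;code&gt;load_csv()&lt;/code&gt; helper; the real implementation also handles type detection, TSV and JSON:&lt;/p&gt;

```python
import csv
import io
import sqlite3

def load_csv(conn, table, text):
    # Load one CSV "file" into its own table, so SQL (including joins
    # across several such tables) can run against it.
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f"[{col}]" for col in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE [{table}] ({cols})")
    conn.executemany(f"INSERT INTO [{table}] VALUES ({placeholders})", data)

conn = sqlite3.connect(":memory:")
load_csv(conn, "transactions", "year,amount\n2023,10\n2024,20\n2024,5")
totals = conn.execute(
    "SELECT year, SUM(amount) FROM transactions GROUP BY year ORDER BY year"
).fetchall()
print(totals)  # [('2023', 10), ('2024', 25)]
```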
&lt;h4 id="implementation-notes"&gt;Implementation notes&lt;/h4&gt;
&lt;p&gt;The core of the plugin is implemented as around &lt;a href="https://github.com/simonw/sqlite-utils-ask/blob/0.2/sqlite_utils_ask.py"&gt;250 lines of Python&lt;/a&gt;, using the &lt;code&gt;sqlite-utils&lt;/code&gt; &lt;a href="https://sqlite-utils.datasette.io/en/stable/plugins.html#register-commands-cli"&gt;register_commands()&lt;/a&gt; plugin hook to add the &lt;code&gt;ask&lt;/code&gt; and &lt;code&gt;ask-files&lt;/code&gt; commands.&lt;/p&gt;
&lt;p&gt;It adds &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; as a dependency, and takes advantage of LLM's &lt;a href="https://llm.datasette.io/en/stable/python-api.html"&gt;Python API&lt;/a&gt; to abstract over the details of talking to the models. This means &lt;code&gt;sqlite-utils-ask&lt;/code&gt; can use any of the models supported by LLM or its plugins - if you want to run your prompt through Claude 3.5 Sonnet you can do this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install llm-claude-3
sqlite-utils ask content.db &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;count rows in news table&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; -m claude-3.5-sonnet&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The plugin defaults to &lt;a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/"&gt;gpt-4o-mini&lt;/a&gt; initially to take advantage of that model's automatic prompt caching: if you run multiple questions against the same schema you'll end up sending the same lengthy prompt prefix multiple times, and OpenAI's prompt caching should automatically kick in and provide a 50% discount on those input tokens.&lt;/p&gt;
&lt;p&gt;Then I ran the actual numbers and found that &lt;code&gt;gpt-4o-mini&lt;/code&gt; is cheap enough that even without caching a 4,000 token prompt (that's a pretty large SQL schema) should cost less than a tenth of a cent. So those caching savings aren't worth anything at all!&lt;/p&gt;
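&lt;p&gt;The arithmetic, again assuming GPT-4o mini's list price of $0.15 (15 cents) per million input tokens:&lt;/p&gt;

```python
# 4,000 input tokens at 15 cents per million tokens
cents = 4_000 * 15 / 1_000_000
print(cents)  # 0.06 - well under a tenth of a cent, even before caching
```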
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="plugins"/><category term="projects"/><category term="sqlite"/><category term="ai"/><category term="sqlite-utils"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/></entry></feed>