<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: showboat</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/showboat.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-03-06T05:43:54+00:00</updated><author><name>Simon Willison</name></author><entry><title>Agentic manual testing</title><link href="https://simonwillison.net/guides/agentic-engineering-patterns/agentic-manual-testing/#atom-tag" rel="alternate"/><published>2026-03-06T05:43:54+00:00</published><updated>2026-03-06T05:43:54+00:00</updated><id>https://simonwillison.net/guides/agentic-engineering-patterns/agentic-manual-testing/#atom-tag</id><summary type="html">
    &lt;p&gt;The defining characteristic of a coding agent is that it can &lt;em&gt;execute the code&lt;/em&gt; that it writes. This is what makes coding agents so much more useful than LLMs that simply spit out code without any way to verify it.&lt;/p&gt;
&lt;p&gt;Never assume that code generated by an LLM works until that code has been executed.&lt;/p&gt;
&lt;p&gt;Coding agents have the ability to confirm that the code they have produced works as intended, or iterate further on that code until it does.&lt;/p&gt;
&lt;p&gt;Getting agents to &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/"&gt;write unit tests&lt;/a&gt;, especially using test-first TDD, is a powerful way to ensure they have exercised the code they are writing.&lt;/p&gt;
&lt;p&gt;That's not the only worthwhile approach, though. &lt;/p&gt;
&lt;p&gt;Just because code passes tests doesn't mean it works as intended. Anyone who's worked with automated tests will have seen cases where the tests all pass but the code itself fails in some obvious way - it might crash the server on startup, fail to display a crucial UI element, or miss some detail that the tests failed to cover.&lt;/p&gt;
&lt;p&gt;Automated tests are no replacement for &lt;strong&gt;manual testing&lt;/strong&gt;. I like to see a feature working with my own eyes before I land it in a release.&lt;/p&gt;
&lt;p&gt;I've found that getting agents to manually test code is valuable as well, frequently revealing issues that weren't spotted by the automated tests.&lt;/p&gt;
&lt;h2 id="mechanisms-for-agentic-manual-testing"&gt;Mechanisms for agentic manual testing&lt;/h2&gt;
&lt;p&gt;How an agent should "manually" test a piece of code varies depending on what that code is.&lt;/p&gt;
&lt;p&gt;For Python libraries a useful pattern is &lt;code&gt;python -c "... code ..."&lt;/code&gt;. You can pass a string (or multiline string) of Python code directly to the Python interpreter, including code that imports other modules.&lt;/p&gt;
&lt;p&gt;The coding agents are all familiar with this trick and will sometimes use it without prompting. Reminding them to test using &lt;code&gt;python -c&lt;/code&gt; can often be effective though:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Try that new function on some edge cases using `python -c`&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
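&lt;p&gt;As a minimal sketch of what that looks like in practice, here is the pattern applied to &lt;code&gt;math.isqrt()&lt;/code&gt; from the Python standard library. That function is only a stand-in chosen for illustration; an agent would import the function it just wrote instead. This assumes &lt;code&gt;python3&lt;/code&gt; is on the path.&lt;/p&gt;

```shell
# Exercise a function on edge cases without creating any files.
# math.isqrt is a stand-in here; swap in the function under test.
output=$(python3 -c '
import math
for n in [0, 1, 15, 16]:
    print(n, math.isqrt(n))
try:
    math.isqrt(-1)
except ValueError:
    print("negative input raises ValueError")
')
echo "$output"
```

&lt;p&gt;Passing a multiline string keeps the whole experiment in a single shell command, with nothing left behind on disk.&lt;/p&gt;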
&lt;p&gt;Other languages may have similar mechanisms, and if they don't it's still quick for an agent to write out a demo file and then compile and run it. I sometimes encourage it to use &lt;code&gt;/tmp&lt;/code&gt; purely to avoid those files being accidentally committed to the repository later on.&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Write code in `/tmp` to try edge cases of that function and then compile and run it&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
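&lt;p&gt;A rough sketch of that flow, using a throwaway Python script so it runs without a compiler installed. The directory name, the script and its input are all hypothetical; for a compiled language the &lt;code&gt;py_compile&lt;/code&gt; step would be replaced by the real compiler invocation.&lt;/p&gt;

```shell
# Create a scratch directory under /tmp so nothing lands in the repository
workdir=$(mktemp -d /tmp/edge-demo.XXXXXX)

# Write a tiny demo script (hypothetical stand-in for the code under test)
printf '%s\n' 'import sys' 'print(sys.argv[1].strip().upper())' > "$workdir/demo.py"

# Byte-compile first to catch syntax errors, then run it on an edge case
python3 -m py_compile "$workdir/demo.py"
result=$(python3 "$workdir/demo.py" "  padded  ")
echo "$result"

# Clean up the scratch directory
rm -rf "$workdir"
```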
&lt;p&gt;Many of my projects involve building web applications with JSON APIs. For these I tell the agent to exercise them using &lt;code&gt;curl&lt;/code&gt;:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Run a dev server and explore that new JSON API using `curl`&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
&lt;p&gt;Telling an agent to "explore" often results in it trying out a bunch of different aspects of a new API, which can quickly cover a whole lot of ground.&lt;/p&gt;
&lt;p&gt;If an agent finds something that doesn't work through their manual testing, I like to tell them to fix it with red/green TDD. This ensures the new case ends up covered by the permanent automated tests.&lt;/p&gt;
&lt;h2 id="using-browser-automation-for-web-uis"&gt;Using browser automation for web UIs&lt;/h2&gt;
&lt;p&gt;Having a manual testing procedure in place becomes even more valuable if a project involves an interactive web UI.&lt;/p&gt;
&lt;p&gt;Historically these have been difficult to test from code, but the past decade has seen notable improvements in systems for automating real web browsers. Running a real Chrome or Firefox or Safari browser against an application can uncover all sorts of interesting problems in a realistic setting.&lt;/p&gt;
&lt;p&gt;Coding agents know how to use these tools extremely well.&lt;/p&gt;
&lt;p&gt;The most powerful of these today is &lt;strong&gt;&lt;a href="https://playwright.dev/"&gt;Playwright&lt;/a&gt;&lt;/strong&gt;, an open source library developed by Microsoft. Playwright offers a full-featured API with bindings in multiple popular programming languages and can automate any of the popular browser engines.&lt;/p&gt;
&lt;p&gt;Simply telling your agent to "test that with Playwright" may be enough. The agent can then select the language binding that makes the most sense, or use Playwright's &lt;a href="https://github.com/microsoft/playwright-cli"&gt;playwright-cli&lt;/a&gt; tool.&lt;/p&gt;
&lt;p&gt;Coding agents work really well with dedicated CLIs. &lt;a href="https://github.com/vercel-labs/agent-browser"&gt;agent-browser&lt;/a&gt; by Vercel is a comprehensive CLI wrapper around Playwright specially designed for coding agents to use.&lt;/p&gt;
&lt;p&gt;My own project &lt;a href="https://github.com/simonw/rodney"&gt;Rodney&lt;/a&gt; serves a similar purpose, albeit using the Chrome DevTools Protocol to directly control an instance of Chrome.&lt;/p&gt;
&lt;p&gt;Here's an example prompt I use to test things with Rodney:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Start a dev server and then use `uvx rodney --help` to test the new homepage, look at screenshots to confirm the menu is in the right place&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
&lt;p&gt;There are three tricks in this prompt:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Saying "use &lt;code&gt;uvx rodney --help&lt;/code&gt;" causes the agent to run &lt;code&gt;rodney --help&lt;/code&gt; via the &lt;a href="https://docs.astral.sh/uv/guides/tools/"&gt;uvx&lt;/a&gt; package management tool, which automatically installs Rodney the first time it is called.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;rodney --help&lt;/code&gt; command is specifically designed to give agents everything they need to know to both understand and use the tool. Here's &lt;a href="https://github.com/simonw/rodney/blob/main/help.txt"&gt;that help text&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Saying "look at screenshots" hints to the agent that it should use the &lt;code&gt;rodney screenshot&lt;/code&gt; command and reminds it that it can use its own vision abilities against the resulting image files to evaluate the visual appearance of the page.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That's a whole lot of manual testing baked into a short prompt!&lt;/p&gt;
&lt;p&gt;Rodney and tools like it offer a wide array of capabilities, from running JavaScript on the loaded site to scrolling, clicking, typing, and even reading the accessibility tree of the page.&lt;/p&gt;
&lt;p&gt;As with other forms of manual tests, issues found and fixed via browser automation can then be added to permanent automated tests as well.&lt;/p&gt;
&lt;p&gt;Many developers have avoided too many automated browser tests in the past due to their reputation for flakiness - the smallest tweak to the HTML of a page can result in frustrating waves of test breaks.&lt;/p&gt;
&lt;p&gt;Having coding agents maintain those tests over time greatly reduces the friction involved in keeping them up-to-date in the face of design changes to the web interfaces.&lt;/p&gt;
&lt;h2 id="have-them-take-notes-with-showboat"&gt;Have them take notes with Showboat&lt;/h2&gt;
&lt;p&gt;Having agents manually test code can catch extra problems, but it can also be used to create artifacts that can help document the code and demonstrate how it has been tested.&lt;/p&gt;
&lt;p&gt;I'm fascinated by the challenge of having agents &lt;em&gt;show their work&lt;/em&gt;. Being able to see demos or documented experiments is a really useful way of confirming that the agent has comprehensively solved the challenge it was given.&lt;/p&gt;
&lt;p&gt;I built &lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt; to facilitate building documents that capture the agentic manual testing flow.&lt;/p&gt;
&lt;p&gt;Here's a prompt I frequently use:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Run `uvx showboat --help` and then create a `notes/api-demo.md` showboat document and use it to test and document that new API.&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
&lt;p&gt;As with Rodney above, the &lt;code&gt;showboat --help&lt;/code&gt; command teaches the agent what Showboat is and how to use it. Here's &lt;a href="https://github.com/simonw/showboat/blob/main/help.txt"&gt;that help text in full&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The three key Showboat commands are &lt;code&gt;note&lt;/code&gt;, &lt;code&gt;exec&lt;/code&gt;, and &lt;code&gt;image&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;note&lt;/code&gt; appends a Markdown note to the Showboat document. &lt;code&gt;exec&lt;/code&gt; records a command, then runs that command and records its output. &lt;code&gt;image&lt;/code&gt; adds an image to the document - useful for screenshots of web applications taken using Rodney.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;exec&lt;/code&gt; command is the most important of these, because it captures a command along with the resulting output. This shows you what the agent did and what the result was, and is designed to discourage the agent from cheating and writing what it &lt;em&gt;hoped&lt;/em&gt; had happened into the document.&lt;/p&gt;
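&lt;p&gt;To illustrate why recording a command together with its real output matters, here is a minimal shell sketch of the idea. The &lt;code&gt;record_exec&lt;/code&gt; helper and the file names are hypothetical and this is not how Showboat itself is implemented; it only shows that the output in the document comes from actually running the command rather than from anything typed by hand.&lt;/p&gt;

```shell
# Hypothetical helper, NOT Showboat's implementation: append a command and
# the standard output it actually produced to a Markdown file, as an
# indented code block.
record_exec() {
  target=$1
  shift
  {
    printf '    $ %s\n' "$*"
    "$@" | sed 's/^/    /'
    printf '\n'
  } >> "$target"
}

# Start a notes file, record one command, then show the result
doc=$(mktemp /tmp/notes-demo.XXXXXX)
printf '%s\n\n' '# Demo notes' > "$doc"
record_exec "$doc" expr 40 + 2
contents=$(cat "$doc")
echo "$contents"
rm -f "$doc"
```

&lt;p&gt;Because the helper runs the command itself, the only way to get a particular result into the notes is for the command to really produce it.&lt;/p&gt;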
&lt;p&gt;I've been finding the Showboat pattern to work really well for documenting the work that has been achieved during my agent sessions. I'm hoping to see similar patterns adopted across a wider set of tools.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rodney"&gt;rodney&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="playwright"/><category term="testing"/><category term="agentic-engineering"/><category term="ai"/><category term="llms"/><category term="coding-agents"/><category term="ai-assisted-programming"/><category term="rodney"/><category term="showboat"/></entry><entry><title>Linear walkthroughs</title><link href="https://simonwillison.net/guides/agentic-engineering-patterns/linear-walkthroughs/#atom-tag" rel="alternate"/><published>2026-02-25T01:07:10+00:00</published><updated>2026-02-25T01:07:10+00:00</updated><id>https://simonwillison.net/guides/agentic-engineering-patterns/linear-walkthroughs/#atom-tag</id><summary type="html">
    &lt;p&gt;Sometimes it's useful to have a coding agent give you a structured walkthrough of a codebase. &lt;/p&gt;
&lt;p&gt;Maybe it's existing code you need to get up to speed on, maybe it's your own code that you've forgotten the details of, or maybe you vibe coded the whole thing and need to understand how it actually works.&lt;/p&gt;
&lt;p&gt;Frontier models with the right agent harness can construct a detailed walkthrough to help you understand how code works.&lt;/p&gt;
&lt;h2 id="an-example-using-showboat-and-present"&gt;An example using Showboat and Present&lt;/h2&gt;
&lt;p&gt;I recently &lt;a href="https://simonwillison.net/2026/Feb/25/present/"&gt;vibe coded a SwiftUI slide presentation app&lt;/a&gt; on my Mac using Claude Code and Opus 4.6.&lt;/p&gt;
&lt;p&gt;I was speaking about the advances in frontier models between November 2025 and February 2026, and I like to include at least one gimmick in my talks (a &lt;a href="https://simonwillison.net/2019/Dec/10/better-presentations/"&gt;STAR moment&lt;/a&gt; - Something They'll Always Remember). In this case I decided the gimmick would be revealing at the end of the presentation that the slide mechanism itself was an example of what vibe coding could do.&lt;/p&gt;
&lt;p&gt;I released the code &lt;a href="https://github.com/simonw/present"&gt;to GitHub&lt;/a&gt; and then realized I didn't know anything about how it actually worked - I had prompted the whole thing into existence (&lt;a href="https://gisthost.github.io/?bfbc338977ceb71e298e4d4d5ac7d63c"&gt;partial transcript here&lt;/a&gt;) without paying any attention to the code it was writing.&lt;/p&gt;
&lt;p&gt;So I fired up a new instance of Claude Code for web, pointed it at my repo and prompted:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Read the source and then plan a linear walkthrough of the code that explains how it all works in detail

Then run “uvx showboat --help” to learn showboat - use showboat to create a walkthrough.md file in the repo and build the walkthrough in there, using showboat note for commentary and showboat exec plus sed or grep or cat or whatever you need to include snippets of code you are talking about&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt; is a tool I built to help coding agents write documents that demonstrate their work. You can see the &lt;a href="https://github.com/simonw/showboat/blob/main/help.txt"&gt;showboat --help output here&lt;/a&gt;, which is designed to give the model everything it needs to know in order to use the tool.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;showboat note&lt;/code&gt; command adds Markdown to the document. The &lt;code&gt;showboat exec&lt;/code&gt; command accepts a shell command, executes it and then adds both the command and its output to the document.&lt;/p&gt;
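&lt;p&gt;As a sketch of the kind of shell command an agent might hand to &lt;code&gt;showboat exec&lt;/code&gt; for this purpose, the following pulls an exact range of lines out of a file with &lt;code&gt;sed&lt;/code&gt; and locates a symbol with &lt;code&gt;grep&lt;/code&gt;. The Swift source here is a made-up stand-in written to a temporary file, not code from the Present repository.&lt;/p&gt;

```shell
# Hypothetical source file standing in for one of the project's .swift files
src=$(mktemp /tmp/snippet-demo.XXXXXX)
printf '%s\n' \
  'import SwiftUI' \
  '' \
  'struct ContentView: View {' \
  '    var body: some View { Text("Hi") }' \
  '}' > "$src"

# Print lines 3 to 5 exactly as they appear on disk
snippet=$(sed -n '3,5p' "$src")

# Find where a symbol is defined, with its line number
match=$(grep -n 'struct ContentView' "$src")

echo "$snippet"
echo "$match"
rm -f "$src"
```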
&lt;p&gt;By telling it to use "sed or grep or cat or whatever you need to include snippets of code you are talking about" I ensured that Claude Code would not manually copy snippets of code into the document, since that could introduce a risk of hallucinations or mistakes.&lt;/p&gt;
&lt;p&gt;This worked extremely well. Here's the &lt;a href="https://github.com/simonw/present/blob/main/walkthrough.md"&gt;document Claude Code created with Showboat&lt;/a&gt;, which talks through all six &lt;code&gt;.swift&lt;/code&gt; files in detail and provides a clear and actionable explanation about how the code works.&lt;/p&gt;
&lt;p&gt;I learned a great deal about how SwiftUI apps are structured and absorbed some solid details about the Swift language itself just from reading this document.&lt;/p&gt;
&lt;p&gt;If you are concerned that LLMs might reduce the speed at which you learn new skills, I strongly recommend adopting patterns like this one. Even a ~40-minute vibe coded toy project can become an opportunity to explore new ecosystems and pick up some interesting new tricks.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/swift"&gt;swift&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="agentic-engineering"/><category term="ai"/><category term="llms"/><category term="vibe-coding"/><category term="coding-agents"/><category term="ai-assisted-programming"/><category term="swift"/><category term="generative-ai"/><category term="showboat"/></entry><entry><title>go-size-analyzer</title><link href="https://simonwillison.net/2026/Feb/24/go-size-analyzer/#atom-tag" rel="alternate"/><published>2026-02-24T16:10:06+00:00</published><updated>2026-02-24T16:10:06+00:00</updated><id>https://simonwillison.net/2026/Feb/24/go-size-analyzer/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/Zxilly/go-size-analyzer"&gt;go-size-analyzer&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The Go ecosystem is &lt;em&gt;really&lt;/em&gt; good at tooling. I just learned about this tool for analyzing the size of Go binaries using a pleasing treemap view of their bundled dependencies.&lt;/p&gt;
&lt;p&gt;You can install and run the tool locally, but it's also compiled to WebAssembly and hosted at &lt;a href="https://gsa.zxilly.dev/"&gt;gsa.zxilly.dev&lt;/a&gt; - which means you can open compiled Go binaries and analyze them directly in your browser.&lt;/p&gt;
&lt;p&gt;I tried it with an 8.1MB macOS compiled copy of my Go &lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt; tool and got this:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Treemap visualization of a Go binary named &amp;quot;showboat&amp;quot; showing size breakdown across four major categories: &amp;quot;Unknown Sections Size&amp;quot; (containing __rodata __TEXT, __rodata __DATA_CONST, __data __DATA, and Debug Sections Size with __zdebug_line __DWARF, __zdebug_loc __DWARF, __zdebug_info __DWARF), &amp;quot;Std Packages Size&amp;quot; (showing standard library packages like runtime, net, crypto, reflect, math, os, fmt, strings, syscall, context, and many subpackages such as crypto/tls, crypto/x509, net/http, with individual .go files visible at deeper levels), &amp;quot;Main Packages Size&amp;quot; (showing main, showboat, cmd), and &amp;quot;Generated Packages Size&amp;quot; (showing &amp;lt;autogenerated&amp;gt;). A tooltip is visible over __zdebug_line __DWARF showing: Section: __zdebug_line __DWARF, Size: 404.44 KB, File Size: 404.44 KB, Known size: 0 B, Unknown size: 404.44 KB, Offset: 0x52814a – 0x58d310, Address: 0x1005c014a – 0x1005c5310, Memory: false, Debug: true. The treemap uses green for main/generated packages, blue-gray for unknown sections, and shades of purple/pink for standard library packages." src="https://static.simonwillison.net/static/2026/showboat-treemap.jpg" /&gt;
&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://www.datadoghq.com/blog/engineering/agent-go-binaries/"&gt;Datadog: How we reduced the size of our Agent Go binaries by up to 77%&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/go"&gt;go&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;



</summary><category term="go"/><category term="webassembly"/><category term="showboat"/></entry><entry><title>showboat v0.6.1</title><link href="https://simonwillison.net/2026/Feb/23/showboat/#atom-tag" rel="alternate"/><published>2026-02-23T14:08:09+00:00</published><updated>2026-02-23T14:08:09+00:00</updated><id>https://simonwillison.net/2026/Feb/23/showboat/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.6.1"&gt;showboat v0.6.1&lt;/a&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="showboat"/></entry><entry><title>datasette-showboat 0.1a1</title><link href="https://simonwillison.net/2026/Feb/18/datasette-showboat/#atom-tag" rel="alternate"/><published>2026-02-18T12:53:00+00:00</published><updated>2026-02-18T12:53:00+00:00</updated><id>https://simonwillison.net/2026/Feb/18/datasette-showboat/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/datasette-showboat/releases/tag/0.1a1"&gt;datasette-showboat 0.1a1&lt;/a&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="showboat"/><category term="datasette"/></entry><entry><title>Nano Banana Pro diff to webcomic</title><link href="https://simonwillison.net/2026/Feb/17/release-notes-webcomic/#atom-tag" rel="alternate"/><published>2026-02-17T04:51:58+00:00</published><updated>2026-02-17T04:51:58+00:00</updated><id>https://simonwillison.net/2026/Feb/17/release-notes-webcomic/#atom-tag</id><summary type="html">
    &lt;p&gt;Given the threat of &lt;a href="https://simonwillison.net/tags/cognitive-debt/"&gt;cognitive debt&lt;/a&gt; brought on by AI-accelerated software development leading to more projects and less deep understanding of how they work and what they actually do, it's interesting to consider artifacts that might be able to help.&lt;/p&gt;
&lt;p&gt;Nathan Baschez &lt;a href="https://twitter.com/nbaschez/status/2023501535343509871"&gt;on Twitter&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;my current favorite trick for reducing "cognitive debt" (h/t @simonw) is to ask the LLM to write two versions of the plan:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The version for it (highly technical and detailed)&lt;/li&gt;
&lt;li&gt;The version for me (an entertaining essay designed to build my intuition)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Works great&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This inspired me to try something new. I generated &lt;a href="https://github.com/simonw/showboat/compare/v0.5.0...v0.6.0.diff"&gt;the diff&lt;/a&gt; between v0.5.0 and v0.6.0 of my Showboat project - which introduced &lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#showboat-remote-publishing"&gt;the remote publishing feature&lt;/a&gt; - and dumped that into Nano Banana Pro with the prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Create a webcomic that explains the new feature as clearly and entertainingly as possible&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://gemini.google.com/share/cce6da8e5083"&gt;what it produced&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="A six-panel comic strip illustrating a tool called &amp;quot;Showboat&amp;quot; for live-streaming document building. Panel 1, titled &amp;quot;THE OLD WAY: Building docs was a lonely voyage. You finished it all before anyone saw it.&amp;quot;, shows a sad bearded man on a wooden boat labeled &amp;quot;THE LOCALHOST&amp;quot; holding papers and saying &amp;quot;Almost done... then I have to export and email the HTML...&amp;quot;. Panel 2, titled &amp;quot;THE UPGRADE: Just set the environment variable!&amp;quot;, shows the same man excitedly plugging in a device with a speech bubble reading &amp;quot;ENV VAR: SHOWBOAT_REMOTE_URL&amp;quot; and the sound effect &amp;quot;*KA-CHUNK!*&amp;quot;. Panel 3, titled &amp;quot;init establishes the uplink and generates a unique UUID beacon.&amp;quot;, shows the man typing at a keyboard with a terminal reading &amp;quot;$ showboat init 'Live Demo'&amp;quot;, a satellite dish transmitting to a floating label &amp;quot;UUID: 550e84...&amp;quot;, and a monitor reading &amp;quot;WAITING FOR STREAM...&amp;quot;. Panel 4, titled &amp;quot;Every note and exec is instantly beamed to the remote viewer!&amp;quot;, shows the man coding with sound effects &amp;quot;*HAMMER!*&amp;quot;, &amp;quot;ZAP!&amp;quot;, &amp;quot;ZAP!&amp;quot;, &amp;quot;BANG!&amp;quot; as red laser beams shoot from a satellite dish to a remote screen displaying &amp;quot;NOTE: Step 1...&amp;quot; and &amp;quot;SUCCESS&amp;quot;. Panel 5, titled &amp;quot;Even image files are teleported in real-time!&amp;quot;, shows a satellite dish firing a cyan beam with the sound effect &amp;quot;*FOOMP!*&amp;quot; toward a monitor displaying a bar chart. Panel 6, titled &amp;quot;You just build. 
The audience gets the show live.&amp;quot;, shows the man happily working at his boat while a crowd of cheering people watches a projected screen reading &amp;quot;SHOWBOAT LIVE STREAM: Live Demo&amp;quot;, with a label &amp;quot;UUID: 550e84...&amp;quot; and one person in the foreground eating popcorn." src="https://static.simonwillison.net/static/2026/nano-banana-diff.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Good enough to publish with the release notes? I don't think so. I'm sharing it here purely to demonstrate the idea. Creating assets like this as a personal tool for thinking about novel ways to explain a feature feels worth exploring further.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/text-to-image"&gt;text-to-image&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nano-banana"&gt;nano-banana&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cognitive-debt"&gt;cognitive-debt&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="gemini"/><category term="text-to-image"/><category term="nano-banana"/><category term="showboat"/><category term="cognitive-debt"/></entry><entry><title>Two new Showboat tools: Chartroom and datasette-showboat</title><link href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#atom-tag" rel="alternate"/><published>2026-02-17T00:43:45+00:00</published><updated>2026-02-17T00:43:45+00:00</updated><id>https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#atom-tag</id><summary type="html">
    &lt;p&gt;I &lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/"&gt;introduced Showboat&lt;/a&gt; a week ago - my CLI tool that helps coding agents create Markdown documents that demonstrate the code that they have created. I've been finding new ways to use it on a daily basis, and I've just released two new tools to help get the best out of the Showboat pattern. &lt;a href="https://github.com/simonw/chartroom"&gt;Chartroom&lt;/a&gt; is a CLI charting tool that works well with Showboat, and &lt;a href="https://github.com/simonw/datasette-showboat"&gt;datasette-showboat&lt;/a&gt; lets Showboat's new remote publishing feature incrementally push documents to a Datasette instance.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#showboat-remote-publishing"&gt;Showboat remote publishing&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#datasette-showboat"&gt;datasette-showboat&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#chartroom"&gt;Chartroom&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#how-i-built-chartroom"&gt;How I built Chartroom&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#the-burgeoning-showboat-ecosystem"&gt;The burgeoning Showboat ecosystem&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="showboat-remote-publishing"&gt;Showboat remote publishing&lt;/h4&gt;
&lt;p&gt;I normally use Showboat in Claude Code for web (see &lt;a href="https://simonwillison.net/2026/Feb/16/rodney-claude-code/"&gt;note from this morning&lt;/a&gt;). I've used it in several different projects in the past few days, each of them with a prompt that looks something like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Use "uvx showboat --help" to perform a very thorough investigation of what happens if you use the Python sqlite-chronicle and sqlite-history-json libraries against the same SQLite database table&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/research/blob/main/sqlite-chronicle-vs-history-json/demo.md"&gt;the resulting document&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Just telling Claude Code to run &lt;code&gt;uvx showboat --help&lt;/code&gt; is enough for it to learn how to use the tool - the &lt;a href="https://github.com/simonw/showboat/blob/main/help.txt"&gt;help text&lt;/a&gt; is designed to work as a sort of ad-hoc Skill document.&lt;/p&gt;
&lt;p&gt;The one catch with this approach is that I can't &lt;em&gt;see&lt;/em&gt; the new Showboat document until it's finished. I have to wait for Claude to commit the document plus embedded screenshots and push that to a branch in my GitHub repo - then I can view it through the GitHub interface.&lt;/p&gt;
&lt;p&gt;For a while I've been thinking it would be neat to have a remote web server of my own which Claude instances can submit updates to while they are working. Then this morning I realized Showboat might be the ideal mechanism to set that up...&lt;/p&gt;
&lt;p&gt;Showboat &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.6.0"&gt;v0.6.0&lt;/a&gt; adds a new "remote" feature. It's almost invisible to users of the tool itself, instead being configured by an environment variable.&lt;/p&gt;
&lt;p&gt;Set a variable like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;export&lt;/span&gt; SHOWBOAT_REMOTE_URL=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://www.example.com/submit?token=xyz&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And every time you run a &lt;code&gt;showboat init&lt;/code&gt; or &lt;code&gt;showboat note&lt;/code&gt; or &lt;code&gt;showboat exec&lt;/code&gt; or &lt;code&gt;showboat image&lt;/code&gt; command the resulting document fragments will be POSTed to that API endpoint, in addition to the Showboat Markdown file itself being updated.&lt;/p&gt;
&lt;p&gt;There are &lt;a href="https://github.com/simonw/showboat/blob/v0.6.0/README.md#remote-document-streaming"&gt;full details in the Showboat README&lt;/a&gt; - it's a very simple API format, using regular POST form variables or a multipart form upload for the image attached to &lt;code&gt;showboat image&lt;/code&gt;.&lt;/p&gt;
&lt;h4 id="datasette-showboat"&gt;datasette-showboat&lt;/h4&gt;
&lt;p&gt;It's simple enough to build a webapp to receive these updates from Showboat, but I needed one that I could easily deploy and would work well with the rest of my personal ecosystem.&lt;/p&gt;
&lt;p&gt;So I had Claude Code write me a Datasette plugin that could act as a Showboat remote endpoint. I actually had this building at the same time as the Showboat remote feature, a neat example of running &lt;a href="https://simonwillison.net/2025/Oct/5/parallel-coding-agents/"&gt;parallel agents&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-showboat"&gt;datasette-showboat&lt;/a&gt;&lt;/strong&gt; is a Datasette plugin that adds a &lt;code&gt;/-/showboat&lt;/code&gt; endpoint to Datasette for viewing documents and a &lt;code&gt;/-/showboat/receive&lt;/code&gt; endpoint for receiving updates from Showboat.&lt;/p&gt;
&lt;p&gt;Here's a very quick way to try it out:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx --with datasette-showboat --prerelease=allow \
  datasette showboat.db --create \
  -s plugins.datasette-showboat.database showboat \
  -s plugins.datasette-showboat.token secret123 \
  --root --secret cookie-secret-123&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Click on the "sign in as root" link that shows up in the console, then navigate to &lt;a href="http://127.0.0.1:8001/-/showboat"&gt;http://127.0.0.1:8001/-/showboat&lt;/a&gt; to see the interface.&lt;/p&gt;
&lt;p&gt;Now set your environment variable to point to this instance:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;export&lt;/span&gt; SHOWBOAT_REMOTE_URL=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;http://127.0.0.1:8001/-/showboat/receive?token=secret123&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And run Showboat like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx showboat init demo.md &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Showboat Feature Demo&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Refresh that page and you should see this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/datasette-showboat-documents.jpg" alt="Title: Showboat. Remote viewer for Showboat documents. Showboat Feature Demo 2026-02-17 00:06 · 6 chunks, UUID. To send showboat output to this server, set the SHOWBOAT_REMOTE_URL environment variable: export SHOWBOAT_REMOTE_URL=&amp;quot;http://127.0.0.1:8001/-/showboat/receive?token=your-token&amp;quot;" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Click through to the document, then start Claude Code or Codex or your agent of choice and prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run 'uvx showboat --help' and then use showboat to add to the existing demo.md document with notes and exec and image to demonstrate the tool - fetch a placekitten for the image demo.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;code&gt;init&lt;/code&gt; command assigns a UUID and title and sends those up to Datasette.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/datasette-showboat.gif" alt="Animated demo - in the foreground a terminal window runs Claude Code, which executes various Showboat commands. In the background a Firefox window where the Showboat Feature Demo adds notes then some bash commands, then a placekitten image." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The best part of this is that it works in Claude Code for web. Run the plugin on a server somewhere (an exercise left up to the reader - I use &lt;a href="https://fly.io/"&gt;Fly.io&lt;/a&gt; to host mine) and set that &lt;code&gt;SHOWBOAT_REMOTE_URL&lt;/code&gt; environment variable in your Claude environment, then any time you tell it to use Showboat the document it creates will be transmitted to your server and viewable in real time.&lt;/p&gt;
&lt;p&gt;I built &lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#rodney-cli-browser-automation-designed-to-work-with-showboat"&gt;Rodney&lt;/a&gt;, a CLI browser automation tool, specifically to work with Showboat. It makes it easy to have a Showboat document load up web pages, interact with them via clicks or injected JavaScript, and capture screenshots to embed in the Showboat document and show the effects.&lt;/p&gt;
&lt;p&gt;This is wildly useful for hacking on web interfaces using Claude Code for web, especially when coupled with the new remote publishing feature. I only got this stuff working this morning and I've already had several sessions where Claude Code has published screenshots of its work in progress, which I've then been able to provide feedback on directly in the Claude session while it's still working.&lt;/p&gt;
&lt;h3 id="chartroom"&gt;Chartroom&lt;/h3&gt;
&lt;p&gt;A few days ago I had another idea for a way to extend the Showboat ecosystem: what if Showboat documents could easily include charts?&lt;/p&gt;
&lt;p&gt;I sometimes fire up Claude Code for data analysis tasks, often telling it to download a SQLite database and then run queries against it to figure out interesting things from the data.&lt;/p&gt;
&lt;p&gt;With a simple CLI tool that produced PNG images I could have Claude use Showboat to build a document with embedded charts to help illustrate its findings.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/chartroom"&gt;Chartroom&lt;/a&gt;&lt;/strong&gt; is exactly that. It's effectively a thin wrapper around the excellent &lt;a href="https://matplotlib.org/"&gt;matplotlib&lt;/a&gt; Python library, designed to be used by coding agents to create charts that can be embedded in Showboat documents.&lt;/p&gt;
&lt;p&gt;Here's how to render a simple bar chart:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;name,value&lt;/span&gt;
&lt;span class="pl-s"&gt;Alice,42&lt;/span&gt;
&lt;span class="pl-s"&gt;Bob,28&lt;/span&gt;
&lt;span class="pl-s"&gt;Charlie,35&lt;/span&gt;
&lt;span class="pl-s"&gt;Diana,51&lt;/span&gt;
&lt;span class="pl-s"&gt;Eve,19&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; uvx chartroom bar --csv \
  --title &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Sales by Person&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; --ylabel &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Sales&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;a target="_blank" rel="noopener noreferrer nofollow" href="https://raw.githubusercontent.com/simonw/chartroom/8812afc02e1310e9eddbb56508b06005ff2c0ed5/demo/1f6851ec-2026-02-14.png"&gt;&lt;img src="https://raw.githubusercontent.com/simonw/chartroom/8812afc02e1310e9eddbb56508b06005ff2c0ed5/demo/1f6851ec-2026-02-14.png" alt="A chart of those numbers, with a title and y-axis label" style="max-width: 100%;" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;It can also do line charts, scatter charts, and histograms - as seen in &lt;a href="https://github.com/simonw/chartroom/blob/0.2.1/demo/README.md"&gt;this demo document&lt;/a&gt; that was built using Showboat.&lt;/p&gt;
&lt;p&gt;Chartroom can also generate alt text. If you add &lt;code&gt;-f alt&lt;/code&gt; to the above it will output the alt text for the chart instead of the image:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;name,value&lt;/span&gt;
&lt;span class="pl-s"&gt;Alice,42&lt;/span&gt;
&lt;span class="pl-s"&gt;Bob,28&lt;/span&gt;
&lt;span class="pl-s"&gt;Charlie,35&lt;/span&gt;
&lt;span class="pl-s"&gt;Diana,51&lt;/span&gt;
&lt;span class="pl-s"&gt;Eve,19&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; uvx chartroom bar --csv \
  --title &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Sales by Person&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; --ylabel &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Sales&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -f alt&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Outputs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Sales by Person. Bar chart of value by name — Alice: 42, Bob: 28, Charlie: 35, Diana: 51, Eve: 19
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or you can use &lt;code&gt;-f html&lt;/code&gt; or &lt;code&gt;-f markdown&lt;/code&gt; to get the image tag with alt text directly:&lt;/p&gt;
&lt;div class="highlight highlight-text-md"&gt;&lt;pre&gt;&lt;span class="pl-s"&gt;![&lt;/span&gt;Sales by Person. Bar chart of value by name — Alice: 42, Bob: 28, Charlie: 35, Diana: 51, Eve: 19&lt;span class="pl-s"&gt;]&lt;/span&gt;&lt;span class="pl-s"&gt;(&lt;/span&gt;&lt;span class="pl-corl"&gt;/Users/simon/chart-7.png&lt;/span&gt;&lt;span class="pl-s"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
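&lt;p&gt;To make the shape of that alt text concrete, here is a small Python sketch that produces the same kind of string from two-column CSV data and wraps it in a Markdown image tag. This is not Chartroom's implementation: &lt;code&gt;alt_text&lt;/code&gt; and &lt;code&gt;markdown_image&lt;/code&gt; are hypothetical helpers, and the wording is simply modeled on the output shown above.&lt;/p&gt;

```python
import csv
import io


def alt_text(title, csv_text):
    """Describe a bar chart of the second CSV column by the first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    x_col, y_col = reader.fieldnames[:2]
    pairs = ", ".join(f"{row[x_col]}: {row[y_col]}" for row in reader)
    # \u2014 is the long dash used as a separator in the sample output
    return f"{title}. Bar chart of {y_col} by {x_col} \u2014 {pairs}"


def markdown_image(alt, path):
    """Build a Markdown image reference carrying the alt text."""
    return f"![{alt}]({path})"


csv_text = "name,value\nAlice,42\nBob,28\n"
print(markdown_image(alt_text("Sales by Person", csv_text), "chart.png"))
```

&lt;p&gt;Listing every data point works for a handful of bars. A longer series would presumably need a summary instead, which this sketch does not attempt.&lt;/p&gt;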
&lt;p&gt;I added support for Markdown images with alt text to Showboat in &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.5.0"&gt;v0.5.0&lt;/a&gt;, to complement this feature of Chartroom.&lt;/p&gt;
&lt;p&gt;Finally, Chartroom has support for different &lt;a href="https://matplotlib.org/stable/gallery/style_sheets/style_sheets_reference.html"&gt;matplotlib styles&lt;/a&gt;. I had Claude build a Showboat document to demonstrate these all in one place - you can see that at &lt;a href="https://github.com/simonw/chartroom/blob/main/demo/styles.md"&gt;demo/styles.md&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="how-i-built-chartroom"&gt;How I built Chartroom&lt;/h4&gt;
&lt;p&gt;I started the Chartroom repository with my &lt;a href="https://github.com/simonw/click-app"&gt;click-app&lt;/a&gt; cookiecutter template, then told a fresh Claude Code for web session:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We are building a Python CLI tool which uses matplotlib to generate a PNG image containing a chart. It will have multiple sub commands for different chart types, controlled by command line options. Everything you need to know to use it will be available in the single "chartroom --help" output.&lt;/p&gt;
&lt;p&gt;It will accept data from files or standard input as CSV or TSV or JSON, similar to how sqlite-utils accepts data - clone simonw/sqlite-utils to /tmp for reference there. Clone matplotlib/matplotlib for reference as well&lt;/p&gt;
&lt;p&gt;It will also accept data from --sql path/to/sqlite.db "select ..." which runs in read-only mode&lt;/p&gt;
&lt;p&gt;Start by asking clarifying questions - do not use the ask user tool though it is broken - and generate a spec for me to approve&lt;/p&gt;
&lt;p&gt;Once approved proceed using red/green TDD running tests with "uv run pytest"&lt;/p&gt;
&lt;p&gt;Also while building maintain a demo/README.md document using the "uvx showboat --help" tool - each time you get a new chart type working commit the tests, implementation, root level
README update and a new version of that demo/README.md document with an inline image demo of the new chart type (which should be a UUID image filename managed by the showboat image command and should be stored in the demo/ folder&lt;/p&gt;
&lt;p&gt;Make sure "uv build" runs cleanly without complaining about extra directories but also ensure dist/ and uv.lock are in gitignore&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This got most of the work done. You can see the rest &lt;a href="https://github.com/simonw/chartroom/pulls?q=is%3Apr+is%3Aclosed"&gt;in the PRs&lt;/a&gt; that followed.&lt;/p&gt;
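&lt;p&gt;The prompt asks for &lt;code&gt;--sql&lt;/code&gt; queries to run in read-only mode. One way to get that behavior in Python is SQLite's URI filename syntax with &lt;code&gt;mode=ro&lt;/code&gt;. The sketch below shows that technique only; &lt;code&gt;query_read_only&lt;/code&gt; is a hypothetical helper and I have not checked how Chartroom itself does this.&lt;/p&gt;

```python
import sqlite3


def query_read_only(db_path, sql):
    """Run sql against db_path over a read-only SQLite connection."""
    # mode=ro makes SQLite refuse writes; uri=True enables the file: syntax.
    # Paths containing "?" or "#" would need escaping, which is skipped here.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        cursor = conn.execute(sql)
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        conn.close()
```

&lt;p&gt;With a connection opened this way, an &lt;code&gt;insert&lt;/code&gt; or &lt;code&gt;update&lt;/code&gt; raises &lt;code&gt;sqlite3.OperationalError&lt;/code&gt; rather than modifying the file, which is a useful guard when an agent is writing the SQL.&lt;/p&gt;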
&lt;h4 id="the-burgeoning-showboat-ecosystem"&gt;The burgeoning Showboat ecosystem&lt;/h4&gt;
&lt;p&gt;The Showboat family of tools now consists of &lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt; itself, &lt;a href="https://github.com/simonw/rodney"&gt;Rodney&lt;/a&gt; for browser automation, &lt;a href="https://github.com/simonw/chartroom"&gt;Chartroom&lt;/a&gt; for charting and &lt;a href="https://github.com/simonw/datasette-showboat"&gt;datasette-showboat&lt;/a&gt; for streaming remote Showboat documents to Datasette.&lt;/p&gt;
&lt;p&gt;I'm enjoying how these tools can operate together based on a very loose set of conventions. If a tool can output a path to an image Showboat can include that image in a document. Any tool that can output text can be used with Showboat.&lt;/p&gt;
&lt;p&gt;I'll almost certainly be building more tools that fit this pattern. They're very quick to knock out!&lt;/p&gt;
&lt;p&gt;The environment variable mechanism for Showboat's remote streaming is a fun hack too - so far I'm just using it to stream documents somewhere else, but it's effectively a webhook extension mechanism that could likely be used for all sorts of things I haven't thought of yet.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/charting"&gt;charting&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="charting"/><category term="projects"/><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="coding-agents"/><category term="claude-code"/><category term="showboat"/></entry><entry><title>datasette-showboat 0.1a0</title><link href="https://simonwillison.net/2026/Feb/16/datasette-showboat/#atom-tag" rel="alternate"/><published>2026-02-16T19:43:11+00:00</published><updated>2026-02-16T19:43:11+00:00</updated><id>https://simonwillison.net/2026/Feb/16/datasette-showboat/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/datasette-showboat/releases/tag/0.1a0"&gt;datasette-showboat 0.1a0&lt;/a&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="showboat"/><category term="datasette"/></entry><entry><title>showboat v0.6.0</title><link href="https://simonwillison.net/2026/Feb/16/showboat/#atom-tag" rel="alternate"/><published>2026-02-16T19:42:53+00:00</published><updated>2026-02-16T19:42:53+00:00</updated><id>https://simonwillison.net/2026/Feb/16/showboat/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.6.0"&gt;showboat v0.6.0&lt;/a&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="showboat"/></entry><entry><title>showboat v0.5.0</title><link href="https://simonwillison.net/2026/Feb/14/showboat/#atom-tag" rel="alternate"/><published>2026-02-14T19:48:14+00:00</published><updated>2026-02-14T19:48:14+00:00</updated><id>https://simonwillison.net/2026/Feb/14/showboat/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.5.0"&gt;showboat v0.5.0&lt;/a&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="showboat"/></entry><entry><title>Skills in OpenAI API</title><link href="https://simonwillison.net/2026/Feb/11/skills-in-openai-api/#atom-tag" rel="alternate"/><published>2026-02-11T19:19:22+00:00</published><updated>2026-02-11T19:19:22+00:00</updated><id>https://simonwillison.net/2026/Feb/11/skills-in-openai-api/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://developers.openai.com/cookbook/examples/skills_in_api"&gt;Skills in OpenAI API&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;OpenAI's adoption of Skills continues to gain ground. You can now use Skills directly in the OpenAI API with their &lt;a href="https://developers.openai.com/api/docs/guides/tools-shell/"&gt;shell tool&lt;/a&gt;. You can zip skills up and upload them first, but I think an even neater interface is the ability to send skills with the JSON request as inline base64-encoded zip data, as seen &lt;a href="https://github.com/simonw/research/blob/main/openai-api-skills/openai_inline_skills.py"&gt;in this script&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;OpenAI&lt;/span&gt;().&lt;span class="pl-c1"&gt;responses&lt;/span&gt;.&lt;span class="pl-c1"&gt;create&lt;/span&gt;(
    &lt;span class="pl-s1"&gt;model&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"gpt-5.2"&lt;/span&gt;,
    &lt;span class="pl-s1"&gt;tools&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[
      {
        &lt;span class="pl-s"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"shell"&lt;/span&gt;,
        &lt;span class="pl-s"&gt;"environment"&lt;/span&gt;: {
          &lt;span class="pl-s"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"container_auto"&lt;/span&gt;,
          &lt;span class="pl-s"&gt;"skills"&lt;/span&gt;: [
            {
              &lt;span class="pl-s"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"inline"&lt;/span&gt;,
              &lt;span class="pl-s"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;"wc"&lt;/span&gt;,
              &lt;span class="pl-s"&gt;"description"&lt;/span&gt;: &lt;span class="pl-s"&gt;"Count words in a file."&lt;/span&gt;,
              &lt;span class="pl-s"&gt;"source"&lt;/span&gt;: {
                &lt;span class="pl-s"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"base64"&lt;/span&gt;,
                &lt;span class="pl-s"&gt;"media_type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/zip"&lt;/span&gt;,
                &lt;span class="pl-s"&gt;"data"&lt;/span&gt;: &lt;span class="pl-s1"&gt;b64_encoded_zip_file&lt;/span&gt;,
              },
            }
          ],
        },
      }
    ],
    &lt;span class="pl-s1"&gt;input&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"Use the wc skill to count words in its own SKILL.md file."&lt;/span&gt;,
)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;r&lt;/span&gt;.&lt;span class="pl-c1"&gt;output_text&lt;/span&gt;)&lt;/pre&gt;

&lt;p&gt;I built that example script after first having Claude Code for web use &lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/"&gt;Showboat&lt;/a&gt; to explore the API for me and create &lt;a href="https://github.com/simonw/research/blob/main/openai-api-skills/README.md"&gt;this report&lt;/a&gt;. My opening prompt for the research project was:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run uvx showboat --help - you will use this tool later&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Fetch https://developers.openai.com/cookbook/examples/skills_in_api.md to /tmp with curl, then read it&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Use the OpenAI API key you have in your environment variables&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Use showboat to build up a detailed demo of this, replaying the examples from the documents and then trying some experiments of your own&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
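&lt;p&gt;The request shown earlier passes a &lt;code&gt;b64_encoded_zip_file&lt;/code&gt; value without showing where it comes from. Here is a minimal sketch of building one in memory with the Python standard library. The &lt;code&gt;zip_to_base64&lt;/code&gt; helper, the &lt;code&gt;wc/SKILL.md&lt;/code&gt; path and the file contents are all assumptions made for illustration, not the layout the linked script or the OpenAI API necessarily uses.&lt;/p&gt;

```python
import base64
import io
import zipfile


def zip_to_base64(files):
    """Zip a mapping of archive path to text content, return base64 text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return base64.b64encode(buffer.getvalue()).decode("ascii")


# Hypothetical skill contents: a single SKILL.md with name and description
skill_md = "---\nname: wc\ndescription: Count words in a file.\n---\n\nRun wc -w on the file.\n"
b64_encoded_zip_file = zip_to_base64({"wc/SKILL.md": skill_md})
```

&lt;p&gt;Building the archive in a &lt;code&gt;BytesIO&lt;/code&gt; buffer avoids writing a temporary zip file to disk before encoding it.&lt;/p&gt;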


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/skills"&gt;skills&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="skills"/><category term="showboat"/></entry><entry><title>Introducing Showboat and Rodney, so agents can demo what they’ve built</title><link href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#atom-tag" rel="alternate"/><published>2026-02-10T17:45:29+00:00</published><updated>2026-02-10T17:45:29+00:00</updated><id>https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#atom-tag</id><summary type="html">
    &lt;p&gt;A key challenge working with coding agents is having them both test what they’ve built and demonstrate that software to you, their supervisor. This goes beyond automated tests - we need artifacts that show their progress and help us see exactly what the agent-produced software is able to do. I’ve just released two new tools aimed at this problem: &lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt; and &lt;a href="https://github.com/simonw/rodney"&gt;Rodney&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#proving-code-actually-works"&gt;Proving code actually works&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#showboat-agents-build-documents-to-demo-their-work"&gt;Showboat: Agents build documents to demo their work&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#rodney-cli-browser-automation-designed-to-work-with-showboat"&gt;Rodney: CLI browser automation designed to work with Showboat&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#test-driven-development-helps-but-we-still-need-manual-testing"&gt;Test-driven development helps, but we still need manual testing&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#i-built-both-of-these-tools-on-my-phone"&gt;I built both of these tools on my phone&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="proving-code-actually-works"&gt;Proving code actually works&lt;/h4&gt;
&lt;p&gt;I recently wrote about how the job of a software engineer isn't to write code, it's to &lt;em&gt;&lt;a href="https://simonwillison.net/2025/Dec/18/code-proven-to-work/"&gt;deliver code that works&lt;/a&gt;&lt;/em&gt;. A big part of that is proving to ourselves and to other people that the code we are responsible for behaves as expected.&lt;/p&gt;
&lt;p&gt;This becomes even more important - and challenging - as we embrace coding agents as a core part of our software development process.&lt;/p&gt;
&lt;p&gt;The more code we churn out with agents, the more valuable it becomes to have tools that reduce the manual QA time we need to spend.&lt;/p&gt;
&lt;p&gt;One of the most interesting things about &lt;a href="https://simonwillison.net/2026/Feb/7/software-factory/"&gt;the StrongDM software factory model&lt;/a&gt; is how they ensure that their software is well tested and delivers value despite their policy that "code must not be reviewed by humans". Part of their solution involves expensive swarms of QA agents running through "scenarios" to exercise their software. It's fascinating, but I don't want to spend thousands of dollars on QA robots if I can avoid it!&lt;/p&gt;
&lt;p&gt;I need tools that allow agents to clearly demonstrate their work to me, while minimizing the opportunities for them to cheat about what they've done.&lt;/p&gt;

&lt;h4 id="showboat-agents-build-documents-to-demo-their-work"&gt;Showboat: Agents build documents to demo their work&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt;&lt;/strong&gt; is the tool I built to help agents demonstrate their work to me.&lt;/p&gt;
&lt;p&gt;It's a CLI tool (a Go binary, optionally &lt;a href="https://simonwillison.net/2026/Feb/4/distributing-go-binaries/"&gt;wrapped in Python&lt;/a&gt; to make it easier to install) that helps an agent construct a Markdown document demonstrating exactly what their newly developed code can do.&lt;/p&gt;
&lt;p&gt;It's not designed for humans to run, but here's how you would run it anyway:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;showboat init demo.md &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;How to use curl and jq&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
showboat note demo.md &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Here's how to use curl and jq together.&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
showboat &lt;span class="pl-c1"&gt;exec&lt;/span&gt; demo.md bash &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;curl -s https://api.github.com/repos/simonw/rodney | jq .description&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
showboat note demo.md &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;And the curl logo, to demonstrate the image command:&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
showboat image demo.md &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;curl -o curl-logo.png https://curl.se/logo/curl-logo.png &amp;amp;&amp;amp; echo curl-logo.png&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's what the result looks like if you open it up in VS Code and preview the Markdown:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/curl-demo.jpg" alt="Screenshot showing a Markdown file &amp;quot;demo.md&amp;quot; side-by-side with its rendered preview. The Markdown source (left) shows: &amp;quot;# How to use curl and jq&amp;quot;, italic timestamp &amp;quot;2026-02-10T01:12:30Z&amp;quot;, prose &amp;quot;Here's how to use curl and jq together.&amp;quot;, a bash code block with &amp;quot;curl -s https://api.github.com/repos/simonw/rodney | jq .description&amp;quot;, output block showing '&amp;quot;CLI tool for interacting with the web&amp;quot;', text &amp;quot;And the curl logo, to demonstrate the image command:&amp;quot;, a bash {image} code block with &amp;quot;curl -o curl-logo.png https://curl.se/logo/curl-logo.png &amp;amp;&amp;amp; echo curl-logo.png&amp;quot;, and a Markdown image reference &amp;quot;2056e48f-2026-02-10&amp;quot;. The rendered preview (right) displays the formatted heading, timestamp, prose, styled code blocks, and the curl logo image in dark teal showing &amp;quot;curl://&amp;quot; with circuit-style design elements." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Here's that &lt;a href="https://gist.github.com/simonw/fb0b24696ed8dd91314fe41f4c453563#file-demo-md"&gt;demo.md file in a Gist&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So a sequence of &lt;code&gt;showboat init&lt;/code&gt;, &lt;code&gt;showboat note&lt;/code&gt;, &lt;code&gt;showboat exec&lt;/code&gt; and &lt;code&gt;showboat image&lt;/code&gt; commands constructs a Markdown document one section at a time, with the output of those &lt;code&gt;exec&lt;/code&gt; commands automatically added to the document directly following the commands that were run.&lt;/p&gt;
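&lt;p&gt;The append-as-you-go idea is easy to sketch. The Python below runs a snippet with a given interpreter and appends both the snippet and its captured output to a Markdown file. It is an illustration only: Showboat itself is written in Go, &lt;code&gt;append_exec&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;output&lt;/code&gt; block label is an assumption rather than a statement of Showboat's file format.&lt;/p&gt;

```python
import subprocess
from pathlib import Path

# Built at runtime so this example can itself sit inside a fenced block
FENCE = "`" * 3


def append_exec(doc, interpreter, code):
    """Run code via "interpreter -c" and append command + output to doc."""
    result = subprocess.run(
        [interpreter, "-c", code], capture_output=True, text=True
    )
    output = result.stdout + result.stderr
    if output and not output.endswith("\n"):
        output += "\n"  # keep the closing fence on its own line
    section = (
        f"{FENCE}{interpreter}\n{code}\n{FENCE}\n\n"
        f"{FENCE}output\n{output}{FENCE}\n\n"
    )
    # Append mode: earlier sections of the document are never rewritten
    with Path(doc).open("a", encoding="utf-8") as handle:
        handle.write(section)
    return output
```

&lt;p&gt;Because the output is captured by the tool rather than typed by the agent, the recorded result reflects what actually ran, provided the agent does not edit the file by hand afterwards.&lt;/p&gt;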
&lt;p&gt;The &lt;code&gt;image&lt;/code&gt; command is a little special - it looks for a file path to an image in the output of the command and copies that image to the current folder and references it in the file.&lt;/p&gt;
&lt;p&gt;That's basically the whole thing! There's a &lt;code&gt;pop&lt;/code&gt; command to remove the most recently added section if something goes wrong, a &lt;code&gt;verify&lt;/code&gt; command to re-run the document and check nothing has changed (I'm not entirely convinced by the design of that one) and an &lt;code&gt;extract&lt;/code&gt; command that reverse-engineers the CLI commands that were used to create the document.&lt;/p&gt;
&lt;p&gt;It's pretty simple - just 172 lines of Go.&lt;/p&gt;
&lt;p&gt;I packaged it up with my &lt;a href="https://github.com/simonw/go-to-wheel"&gt;go-to-wheel&lt;/a&gt; tool which means you can run it without even installing it first like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx showboat --help&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That &lt;code&gt;--help&lt;/code&gt; command is really important: it's designed to provide a coding agent with &lt;em&gt;everything it needs to know&lt;/em&gt; in order to use the tool. Here's &lt;a href="https://github.com/simonw/showboat/blob/main/help.txt"&gt;that help text in full&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This means you can pop open Claude Code and tell it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run "uvx showboat --help" and then use showboat to create a demo.md document describing the feature you just built&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And that's it! The &lt;code&gt;--help&lt;/code&gt; text acts &lt;a href="https://simonwillison.net/2025/Oct/16/claude-skills/"&gt;a bit like a Skill&lt;/a&gt;. Your agent can read the help text and use every feature of Showboat to create a document that demonstrates whatever it is you need demonstrated.&lt;/p&gt;
&lt;p&gt;Here's a fun trick: if you set Claude off to build a Showboat document you can pop that open in VS Code and watch the preview pane update in real time as the agent runs through the demo. It's a bit like having your coworker talk you through their latest work in a screensharing session.&lt;/p&gt;
&lt;p&gt;And finally, some examples. Here are documents I had Claude create using Showboat to help demonstrate features I was working on in other projects:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/showboat-demos/blob/main/shot-scraper/README.md"&gt;shot-scraper: A Comprehensive Demo&lt;/a&gt; runs through the full suite of features of my &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; browser automation tool, mainly to exercise the &lt;code&gt;showboat image&lt;/code&gt; command.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/sqlite-history-json/blob/main/demos/cli.md"&gt;sqlite-history-json CLI demo&lt;/a&gt; demonstrates the CLI feature I added to my new &lt;a href="https://github.com/simonw/sqlite-history-json"&gt;sqlite-history-json&lt;/a&gt; Python library.
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/sqlite-history-json/blob/main/demos/row-state-sql.md"&gt;row-state-sql CLI Demo&lt;/a&gt; shows a new &lt;code&gt;row-state-sql&lt;/code&gt; command I added to that same project.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/sqlite-history-json/blob/main/demos/change-grouping.md"&gt;Change grouping with Notes&lt;/a&gt; demonstrates another feature where groups of changes within the same transaction can have a note attached to them.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/research/blob/main/libkrun-go-cli-tool/demo.md"&gt;krunsh: Pipe Shell Commands to an Ephemeral libkrun MicroVM&lt;/a&gt; is a particularly convoluted example where I managed to get Claude Code for web to run a libkrun microVM inside a QEMU emulated Linux environment inside the Claude gVisor sandbox.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I've now used Showboat often enough that I've convinced myself of its utility.&lt;/p&gt;
&lt;p&gt;(I've also seen agents cheat! Since the demo file is Markdown the agent will sometimes edit that file directly rather than using Showboat, which could result in command outputs that don't reflect what actually happened. Here's &lt;a href="https://github.com/simonw/showboat/issues/12"&gt;an issue about that&lt;/a&gt;.)&lt;/p&gt;
&lt;h4 id="rodney-cli-browser-automation-designed-to-work-with-showboat"&gt;Rodney: CLI browser automation designed to work with Showboat&lt;/h4&gt;
&lt;p&gt;Many of the projects I work on involve web interfaces. Agents often build entirely new pages for these, and I want to see those represented in the demos.&lt;/p&gt;
&lt;p&gt;Showboat's image feature was designed to allow agents to capture screenshots as part of their demos, originally using my &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper tool&lt;/a&gt; or &lt;a href="https://www.playwright.dev"&gt;Playwright&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The Showboat format benefits from CLI utilities. I went looking for good options for managing a multi-turn browser session from a CLI and came up short, so I decided to try building something new.&lt;/p&gt;
&lt;p&gt;Claude Opus 4.6 pointed me to the &lt;a href="https://github.com/go-rod/rod"&gt;Rod&lt;/a&gt; Go library for interacting with the Chrome DevTools protocol. It's fantastic - it provides a comprehensive wrapper across basically everything you can do with automated Chrome, all in a self-contained library that compiles to a few MBs.&lt;/p&gt;
&lt;p&gt;All Rod was missing was a CLI.&lt;/p&gt;
&lt;p&gt;I built the first version &lt;a href="https://github.com/simonw/research/blob/main/go-rod-cli/README.md"&gt;as an asynchronous report prototype&lt;/a&gt;, which convinced me it was worth spinning out into its own project.&lt;/p&gt;
&lt;p&gt;I called it Rodney as a nod to the Rod library it builds on and a reference to &lt;a href="https://en.wikipedia.org/wiki/Only_Fools_and_Horses"&gt;Only Fools and Horses&lt;/a&gt; - and because the package name was available on PyPI.&lt;/p&gt;
&lt;p&gt;You can run Rodney using &lt;code&gt;uvx rodney&lt;/code&gt; or install it like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv tool install rodney&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(Or grab a Go binary &lt;a href="https://github.com/simonw/rodney/releases/"&gt;from the releases page&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;Here's a simple example session:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;rodney start &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; starts Chrome in the background&lt;/span&gt;
rodney open https://datasette.io/
rodney js &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Array.from(document.links).map(el =&amp;gt; el.href).slice(0, 5)&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
rodney click &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;a[href="/for"]&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
rodney js location.href
rodney js document.title
rodney screenshot datasette-for-page.png
rodney stop&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's what that looks like in the terminal:&lt;/p&gt;
&lt;p&gt;&lt;img alt=";~ % rodney start
Chrome started (PID 91462)
Debug URL: ws://127.0.0.1:64623/devtools/browser/cac6988e-8153-483b-80b9-1b75c611868d
~ % rodney open https://datasette.io/
Datasette: An open source multi-tool for exploring and publishing data
~ % rodney js 'Array.from(document.links).map(el =&amp;gt; el.href).slice(0, 5)'
[
&amp;quot;https://datasette.io/for&amp;quot;,
&amp;quot;https://docs.datasette.io/en/stable/&amp;quot;,
&amp;quot;https://datasette.io/tutorials&amp;quot;,
&amp;quot;https://datasette.io/examples&amp;quot;,
&amp;quot;https://datasette.io/plugins&amp;quot;
]
~ % rodney click 'a[href=&amp;quot;/for&amp;quot;]'
Clicked
~ % rodney js location.href
https://datasette.io/for
~ % rodney js document.title
Use cases for Datasette
~ % rodney screenshot datasette-for-page.png
datasette-for-page.png
~ % rodney stop
Chrome stopped" src="https://static.simonwillison.net/static/2026/rodney-demo.jpg" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;As with Showboat, this tool is not designed to be used by humans! The goal is for coding agents to be able to run &lt;code&gt;rodney --help&lt;/code&gt; and see everything they need to know to start using the tool. You can see &lt;a href="https://github.com/simonw/rodney/blob/main/help.txt"&gt;that help output&lt;/a&gt; in the GitHub repo.&lt;/p&gt;
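&lt;p&gt;Because every Rodney action is a separate command, a session like the one above can also be scripted. Here's a minimal Python sketch, assuming &lt;code&gt;rodney&lt;/code&gt; is on the PATH and using only the subcommands from the example session. The &lt;code&gt;build_session&lt;/code&gt; and &lt;code&gt;run_session&lt;/code&gt; helpers are hypothetical names for illustration, not part of Rodney:&lt;/p&gt;

```python
import subprocess


def build_session(url, selector, screenshot_path):
    """Return the argv lists for one start-to-stop Rodney session.

    Only subcommands shown in the example session are used here.
    """
    return [
        ["rodney", "start"],
        ["rodney", "open", url],
        ["rodney", "click", selector],
        ["rodney", "js", "document.title"],
        ["rodney", "screenshot", screenshot_path],
        ["rodney", "stop"],
    ]


def run_session(commands, runner=subprocess.run):
    """Run each command in order and collect its stripped standard output.

    The runner is injectable so the sequence can be checked without Chrome.
    """
    outputs = []
    for argv in commands:
        result = runner(argv, capture_output=True, text=True, check=True)
        outputs.append(result.stdout.strip())
    return outputs


if __name__ == "__main__":
    session = build_session(
        "https://datasette.io/", 'a[href="/for"]', "datasette-for-page.png"
    )
    for line in run_session(session):
        print(line)
```

&lt;p&gt;Passing the runner as a parameter means the command list can be checked without launching Chrome. Note that this sketch does not clean up after a failure: if a command exits with an error, &lt;code&gt;rodney stop&lt;/code&gt; never runs.&lt;/p&gt;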
&lt;p&gt;Here are three demonstrations of Rodney that I created using Showboat:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/showboat-demos/blob/main/rodney/README.md"&gt;Rodney's original feature set&lt;/a&gt;, including screenshots of pages and executing JavaScript.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/rodney/blob/main/notes/accessibility-features/README.md"&gt;Rodney's new accessibility testing features&lt;/a&gt;, built during development of those features to show what they could do.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/showboat-demos/blob/main/datasette-database-page-accessibility-audit/README.md"&gt;Using those features to run a basic accessibility audit of a page&lt;/a&gt;. I was impressed at how well Claude Opus 4.6 responded to the prompt "Use showboat and rodney to perform an accessibility audit of &lt;a href="https://latest.datasette.io/fixtures"&gt;https://latest.datasette.io/fixtures&lt;/a&gt;" - &lt;a href="https://gisthost.github.io/?dce6b2680db4b05c04469ed8f251eb34/index.html"&gt;transcript here&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="test-driven-development-helps-but-we-still-need-manual-testing"&gt;Test-driven development helps, but we still need manual testing&lt;/h4&gt;
&lt;p&gt;After being a career-long skeptic of the test-first, maximum test coverage school of software development (I like &lt;a href="https://simonwillison.net/2022/Oct/29/the-perfect-commit/#tests"&gt;tests included&lt;/a&gt; development instead) I've recently come around to test-first processes as a way to force agents to write only the code that's necessary to solve the problem at hand.&lt;/p&gt;
&lt;p&gt;Many of my Python coding agent sessions start the same way:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run the existing tests with "uv run pytest". Build using red/green TDD.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Telling the agents how to run the tests doubles as an indicator that tests on this project exist and matter. Agents will read existing tests before writing their own, so having a clean test suite with good patterns makes it more likely they'll write good tests themselves.&lt;/p&gt;
&lt;p&gt;The frontier models all understand that "red/green TDD" means they should write the test first, run it, watch it fail, and then write the code to make it pass - it's a convenient shortcut.&lt;/p&gt;
&lt;p&gt;I find this greatly increases the quality of the code and the likelihood that the agent will produce the right thing with the fewest prompts to guide it.&lt;/p&gt;
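&lt;p&gt;To make the cycle concrete, here's a toy Python sketch of the three steps. The &lt;code&gt;slugify&lt;/code&gt; function, its stub and the &lt;code&gt;run_test&lt;/code&gt; helper are all hypothetical; in a real session the agent would use pytest rather than a hand-rolled runner:&lt;/p&gt;

```python
def run_test(check, implementation):
    """Return "red" if the check fails against implementation, else "green"."""
    try:
        check(implementation)
    except (AssertionError, NotImplementedError):
        return "red"
    return "green"


def check_slugify(slugify):
    # Step 1: the test is written before any implementation exists.
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Red   Green  ") == "red-green"


def slugify_stub(text):
    # Step 2: a placeholder so the test can run and be seen to fail.
    raise NotImplementedError


def slugify(text):
    # Step 3: the smallest implementation that makes the test pass.
    return "-".join(text.lower().split())


if __name__ == "__main__":
    print(run_test(check_slugify, slugify_stub))  # red
    print(run_test(check_slugify, slugify))       # green
```

&lt;p&gt;Seeing the red result first is the point: it shows the test really does exercise the code that is about to be written.&lt;/p&gt;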
&lt;p&gt;But anyone who's worked with tests will know that just because the automated tests pass doesn't mean the software actually works! That’s the motivation behind Showboat and Rodney - I never trust any feature until I’ve seen it running with my own eyes.&lt;/p&gt;
&lt;p&gt;Before building Showboat I'd often add a “manual” testing step to my agent sessions, something like:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Once the tests pass, start a development server and exercise the new feature using curl&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
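&lt;p&gt;Here's roughly what that step amounts to, sketched in Python so it is self-contained. This assumes a stand-in handler (&lt;code&gt;DemoHandler&lt;/code&gt;, hypothetical) in place of a real development server, and uses &lt;code&gt;urllib.request&lt;/code&gt; where the prompt asks for curl:&lt;/p&gt;

```python
import http.server
import json
import threading
import urllib.request


class DemoHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in for the feature under test."""

    def do_GET(self):
        # Echo the requested path back as JSON.
        body = json.dumps({"path": self.path, "ok": True}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the demo output quiet


def exercise(path):
    """Start the server on a free port, request one path, then shut down."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), DemoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        url = "http://127.0.0.1:{}{}".format(port, path)
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status, json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(exercise("/new-feature"))
```

&lt;p&gt;Unlike a unit test, this goes through a real socket and a real HTTP response, which is where the "crashes on startup" class of bug tends to show up.&lt;/p&gt;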
&lt;h4 id="i-built-both-of-these-tools-on-my-phone"&gt;I built both of these tools on my phone&lt;/h4&gt;
&lt;p&gt;Both Showboat and Rodney started life as Claude Code for web projects created via the Claude iPhone app. Most of the ongoing feature work for them happened in the same way.&lt;/p&gt;
&lt;p&gt;I'm still a little startled at how much of my coding work I get done on my phone now, but I'd estimate that the majority of code I ship to GitHub these days was written for me by coding agents driven via that iPhone app.&lt;/p&gt;
&lt;p&gt;I initially designed these two tools for use in asynchronous coding agent environments like Claude Code for the web. So far that's working out really well.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/go"&gt;go&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/markdown"&gt;markdown&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/async-coding-agents"&gt;async-coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rodney"&gt;rodney&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="go"/><category term="projects"/><category term="testing"/><category term="markdown"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="coding-agents"/><category term="async-coding-agents"/><category term="showboat"/><category term="rodney"/></entry><entry><title>showboat v0.4.0</title><link href="https://simonwillison.net/2026/Feb/9/showboat/#atom-tag" rel="alternate"/><published>2026-02-09T05:38:27+00:00</published><updated>2026-02-09T05:38:27+00:00</updated><id>https://simonwillison.net/2026/Feb/9/showboat/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.4.0"&gt;showboat v0.4.0&lt;/a&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="showboat"/></entry><entry><title>showboat v0.3.0</title><link href="https://simonwillison.net/2026/Feb/9/showboat-2/#atom-tag" rel="alternate"/><published>2026-02-09T01:42:18+00:00</published><updated>2026-02-09T01:42:18+00:00</updated><id>https://simonwillison.net/2026/Feb/9/showboat-2/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.3.0"&gt;showboat v0.3.0&lt;/a&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="showboat"/></entry></feed>