<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: playwright</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/playwright.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-03-06T05:43:54+00:00</updated><author><name>Simon Willison</name></author><entry><title>Agentic manual testing</title><link href="https://simonwillison.net/guides/agentic-engineering-patterns/agentic-manual-testing/#atom-tag" rel="alternate"/><published>2026-03-06T05:43:54+00:00</published><updated>2026-03-06T05:43:54+00:00</updated><id>https://simonwillison.net/guides/agentic-engineering-patterns/agentic-manual-testing/#atom-tag</id><summary type="html">
    &lt;p&gt;The defining characteristic of a coding agent is that it can &lt;em&gt;execute the code&lt;/em&gt; that it writes. This is what makes coding agents so much more useful than LLMs that simply spit out code without any way to verify it.&lt;/p&gt;
&lt;p&gt;Never assume that code generated by an LLM works until that code has been executed.&lt;/p&gt;
&lt;p&gt;Coding agents have the ability to confirm that the code they have produced works as intended, or iterate further on that code until it does.&lt;/p&gt;
&lt;p&gt;Getting agents to &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/"&gt;write unit tests&lt;/a&gt;, especially using test-first TDD, is a powerful way to ensure they have exercised the code they are writing.&lt;/p&gt;
&lt;p&gt;That's not the only worthwhile approach, though. &lt;/p&gt;
&lt;p&gt;Just because code passes tests doesn't mean it works as intended. Anyone who's worked with automated tests will have seen cases where the tests all pass but the code itself fails in some obvious way - it might crash the server on startup, fail to display a crucial UI element, or miss some detail that the tests failed to cover.&lt;/p&gt;
&lt;p&gt;Automated tests are no replacement for &lt;strong&gt;manual testing&lt;/strong&gt;. I like to see a feature working with my own eyes before I land it in a release.&lt;/p&gt;
&lt;p&gt;I've found that getting agents to manually test code is valuable as well, frequently revealing issues that weren't spotted by the automated tests.&lt;/p&gt;
&lt;h2 id="mechanisms-for-agentic-manual-testing"&gt;Mechanisms for agentic manual testing&lt;/h2&gt;
&lt;p&gt;How an agent should "manually" test a piece of code varies depending on what that code is.&lt;/p&gt;
&lt;p&gt;For Python libraries a useful pattern is &lt;code&gt;python -c "... code ..."&lt;/code&gt;. You can pass a string (or multiline string) of Python code directly to the Python interpreter, including code that imports other modules.&lt;/p&gt;
&lt;p&gt;The coding agents are all familiar with this trick and will sometimes use it without prompting. Reminding them to test using &lt;code&gt;python -c&lt;/code&gt; can often be effective though:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Try that new function on some edge cases using `python -c`&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
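&lt;p&gt;As a rough illustration of what that kind of check can look like, here's a sketch that drives &lt;code&gt;python -c&lt;/code&gt; from a small script. The standard library's &lt;code&gt;textwrap.shorten&lt;/code&gt; stands in for "that new function", and the edge cases and the &lt;code&gt;run_snippet&lt;/code&gt; helper are hypothetical examples of my own, not anything an agent is guaranteed to pick.&lt;/p&gt;

```python
import subprocess
import sys

# Hypothetical edge cases; textwrap.shorten stands in for the function under test.
SNIPPET = """
import textwrap
for text in ["", "   ", "one", "a much longer sentence than the width"]:
    print(repr(textwrap.shorten(text, width=12, placeholder="...")))
"""


def run_snippet(code):
    # Same effect as typing: python -c "...code..." in a shell
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

&lt;p&gt;Calling &lt;code&gt;run_snippet(SNIPPET)&lt;/code&gt; returns one line of output per edge case, which is enough to spot an empty-string or whitespace-only input misbehaving.&lt;/p&gt;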
&lt;p&gt;Other languages may have similar mechanisms, and if they don't it's still quick for an agent to write out a demo file and then compile and run it. I sometimes encourage it to use &lt;code&gt;/tmp&lt;/code&gt; purely to avoid those files being accidentally committed to the repository later on.&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Write code in `/tmp` to try edge cases of that function and then compile and run it&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
&lt;p&gt;Many of my projects involve building web applications with JSON APIs. For these I tell the agent to exercise them using &lt;code&gt;curl&lt;/code&gt;:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Run a dev server and explore that new JSON API using `curl`&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
&lt;p&gt;Telling an agent to "explore" often results in it trying out a bunch of different aspects of a new API, which can quickly cover a whole lot of ground.&lt;/p&gt;
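&lt;p&gt;The same kind of exploration can be sketched in Python instead of &lt;code&gt;curl&lt;/code&gt;. This is a minimal, self-contained illustration: the &lt;code&gt;DemoHandler&lt;/code&gt; server and the &lt;code&gt;/api/items&lt;/code&gt; endpoint are hypothetical stand-ins for a real dev server, and &lt;code&gt;explore&lt;/code&gt; just records the status code and JSON body for each path it tries, including the error case.&lt;/p&gt;

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    # Hypothetical stand-in for the dev server under test
    def do_GET(self):
        if self.path == "/api/items":
            self._send(200, {"items": ["a", "b"]})
        else:
            self._send(404, {"error": "not found"})

    def _send(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def explore(base_url, paths):
    # Ignore any proxy settings, since this only talks to localhost
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    results = {}
    for path in paths:
        try:
            with opener.open(base_url + path) as response:
                results[path] = (response.status, json.load(response))
        except urllib.error.HTTPError as error:
            # Error responses still carry a JSON body worth inspecting
            results[path] = (error.code, json.load(error))
    return results


def run_demo():
    # Port 0 asks the OS for any free port
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        base_url = "http://127.0.0.1:{}".format(server.server_port)
        return explore(base_url, ["/api/items", "/api/missing"])
    finally:
        server.shutdown()
        server.server_close()
```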
&lt;p&gt;If an agent finds something that doesn't work through their manual testing, I like to tell them to fix it with red/green TDD. This ensures the new case ends up covered by the permanent automated tests.&lt;/p&gt;
&lt;h2 id="using-browser-automation-for-web-uis"&gt;Using browser automation for web UIs&lt;/h2&gt;
&lt;p&gt;Having a manual testing procedure in place becomes even more valuable if a project involves an interactive web UI.&lt;/p&gt;
&lt;p&gt;Historically these have been difficult to test from code, but the past decade has seen notable improvements in systems for automating real web browsers. Running a real Chrome or Firefox or Safari browser against an application can uncover all sorts of interesting problems in a realistic setting.&lt;/p&gt;
&lt;p&gt;Coding agents know how to use these tools extremely well.&lt;/p&gt;
&lt;p&gt;The most powerful of these today is &lt;strong&gt;&lt;a href="https://playwright.dev/"&gt;Playwright&lt;/a&gt;&lt;/strong&gt;, an open source library developed by Microsoft. Playwright offers a full-featured API with bindings in multiple popular programming languages and can automate any of the popular browser engines.&lt;/p&gt;
&lt;p&gt;Simply telling your agent to "test that with Playwright" may be enough. The agent can then select the language binding that makes the most sense, or use Playwright's &lt;a href="https://github.com/microsoft/playwright-cli"&gt;playwright-cli&lt;/a&gt; tool.&lt;/p&gt;
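&lt;p&gt;For a sense of what the agent might write, here's a minimal sketch assuming it picks Playwright's Python binding and its synchronous API. The &lt;code&gt;check_homepage&lt;/code&gt; name and the &lt;code&gt;nav&lt;/code&gt; selector are hypothetical; a real script would target whatever element is actually being tested.&lt;/p&gt;

```python
def check_homepage(url, screenshot_path):
    # Imported lazily so this file can be loaded without Playwright installed
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            # "nav" is a hypothetical selector for the menu being checked
            menu_visible = page.locator("nav").first.is_visible()
            # Save a full-page screenshot for visual inspection afterwards
            page.screenshot(path=screenshot_path, full_page=True)
            return {"title": page.title(), "menu_visible": menu_visible}
        finally:
            browser.close()
```

&lt;p&gt;Running this needs &lt;code&gt;pip install playwright&lt;/code&gt; followed by &lt;code&gt;playwright install chromium&lt;/code&gt;, plus a dev server already listening at the URL passed in.&lt;/p&gt;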
&lt;p&gt;Coding agents work really well with dedicated CLIs. &lt;a href="https://github.com/vercel-labs/agent-browser"&gt;agent-browser&lt;/a&gt; by Vercel is a comprehensive CLI wrapper around Playwright specially designed for coding agents to use.&lt;/p&gt;
&lt;p&gt;My own project &lt;a href="https://github.com/simonw/rodney"&gt;Rodney&lt;/a&gt; serves a similar purpose, albeit using the Chrome DevTools Protocol to directly control an instance of Chrome.&lt;/p&gt;
&lt;p&gt;Here's an example prompt I use to test things with Rodney:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Start a dev server and then use `uvx rodney --help` to test the new homepage, look at screenshots to confirm the menu is in the right place&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
&lt;p&gt;There are three tricks in this prompt:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Saying "use &lt;code&gt;uvx rodney --help&lt;/code&gt;" causes the agent to run &lt;code&gt;rodney --help&lt;/code&gt; via the &lt;a href="https://docs.astral.sh/uv/guides/tools/"&gt;uvx&lt;/a&gt; package management tool, which automatically installs Rodney the first time it is called.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;rodney --help&lt;/code&gt; command is specifically designed to give agents everything they need to know to both understand and use the tool. Here's &lt;a href="https://github.com/simonw/rodney/blob/main/help.txt"&gt;that help text&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Saying "look at screenshots" hints to the agent that it should use the &lt;code&gt;rodney screenshot&lt;/code&gt; command and reminds it that it can use its own vision abilities against the resulting image files to evaluate the visual appearance of the page.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That's a whole lot of manual testing baked into a short prompt!&lt;/p&gt;
&lt;p&gt;Rodney and tools like it offer a wide array of capabilities, from running JavaScript on the loaded site to scrolling, clicking, typing, and even reading the accessibility tree of the page.&lt;/p&gt;
&lt;p&gt;As with other forms of manual tests, issues found and fixed via browser automation can then be added to permanent automated tests as well.&lt;/p&gt;
&lt;p&gt;Many developers have avoided too many automated browser tests in the past due to their reputation for flakiness - the smallest tweak to the HTML of a page can result in frustrating waves of test breaks.&lt;/p&gt;
&lt;p&gt;Having coding agents maintain those tests over time greatly reduces the friction involved in keeping them up-to-date in the face of design changes to the web interfaces.&lt;/p&gt;
&lt;h2 id="have-them-take-notes-with-showboat"&gt;Have them take notes with Showboat&lt;/h2&gt;
&lt;p&gt;Having agents manually test code can catch extra problems, but it can also be used to create artifacts that can help document the code and demonstrate how it has been tested.&lt;/p&gt;
&lt;p&gt;I'm fascinated by the challenge of having agents &lt;em&gt;show their work&lt;/em&gt;. Being able to see demos or documented experiments is a really useful way of confirming that the agent has comprehensively solved the challenge it was given.&lt;/p&gt;
&lt;p&gt;I built &lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt; to facilitate building documents that capture the agentic manual testing flow.&lt;/p&gt;
&lt;p&gt;Here's a prompt I frequently use:&lt;/p&gt;
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Run `uvx showboat --help` and then create a `notes/api-demo.md` showboat document and use it to test and document that new API.&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
&lt;p&gt;As with Rodney above, the &lt;code&gt;showboat --help&lt;/code&gt; command teaches the agent what Showboat is and how to use it. Here's &lt;a href="https://github.com/simonw/showboat/blob/main/help.txt"&gt;that help text in full&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The three key Showboat commands are &lt;code&gt;note&lt;/code&gt;, &lt;code&gt;exec&lt;/code&gt;, and &lt;code&gt;image&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;note&lt;/code&gt; appends a Markdown note to the Showboat document. &lt;code&gt;exec&lt;/code&gt; records a command, then runs that command and records its output. &lt;code&gt;image&lt;/code&gt; adds an image to the document - useful for screenshots of web applications taken using Rodney.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;exec&lt;/code&gt; command is the most important of these, because it captures a command along with the resulting output. This shows you what the agent did and what the result was, and is designed to discourage the agent from cheating and writing what it &lt;em&gt;hoped&lt;/em&gt; had happened into the document.&lt;/p&gt;
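&lt;p&gt;To illustrate the idea behind &lt;code&gt;exec&lt;/code&gt;, here's a hypothetical Python sketch of the "record the command, run it, record what really came back" pattern. This is not Showboat's actual code or document format; the &lt;code&gt;record_exec&lt;/code&gt; name and the Markdown layout are assumptions made purely for illustration.&lt;/p&gt;

```python
import shlex
import subprocess
import sys

# Built at runtime so the Markdown fence does not clash with this example
FENCE = "`" * 3


def record_exec(doc_path, argv):
    # Run the command for real, so the document can only contain genuine output
    result = subprocess.run(argv, capture_output=True, text=True)
    output = result.stdout + result.stderr
    block = "\n".join(
        [
            FENCE + "bash",
            shlex.join(argv),
            FENCE,
            "",
            FENCE + "output",
            output.rstrip("\n"),
            FENCE,
            "",
            "",
        ]
    )
    # Append, so each call adds one more command/output pair to the notes
    with open(doc_path, "a", encoding="utf-8") as doc:
        doc.write(block)
    return output
```

&lt;p&gt;The point of the design is that the output section is produced by the tool, not typed by the agent, so the notes can't drift from what actually ran.&lt;/p&gt;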
&lt;p&gt;I've been finding the Showboat pattern to work really well for documenting the work that has been achieved during my agent sessions. I'm hoping to see similar patterns adopted across a wider set of tools.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rodney"&gt;rodney&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="playwright"/><category term="testing"/><category term="agentic-engineering"/><category term="ai"/><category term="llms"/><category term="coding-agents"/><category term="ai-assisted-programming"/><category term="rodney"/><category term="showboat"/></entry><entry><title>Vibe scraping and vibe coding a schedule app for Open Sauce 2025 entirely on my phone</title><link href="https://simonwillison.net/2025/Jul/17/vibe-scraping/#atom-tag" rel="alternate"/><published>2025-07-17T19:38:50+00:00</published><updated>2025-07-17T19:38:50+00:00</updated><id>https://simonwillison.net/2025/Jul/17/vibe-scraping/#atom-tag</id><summary type="html">
    &lt;p&gt;This morning, working entirely on my phone, I scraped a conference website and vibe coded up an alternative UI for interacting with the schedule using a combination of OpenAI Codex and Claude Artifacts.&lt;/p&gt;
&lt;p&gt;This weekend is &lt;a href="https://opensauce.com/"&gt;Open Sauce 2025&lt;/a&gt;, the third edition of the Bay Area conference for YouTube creators in the science and engineering space. I have a couple of friends going and they were complaining that the official schedule was difficult to navigate on a phone - it's not even linked from the homepage on mobile, and once you do find &lt;a href="https://opensauce.com/agenda/"&gt;the agenda&lt;/a&gt; it isn't particularly mobile-friendly.&lt;/p&gt;
&lt;p&gt;We were out for coffee this morning so I only had my phone, but I decided to see if I could fix it anyway.&lt;/p&gt;
&lt;p&gt;TLDR: Working entirely on my iPhone, using a combination of &lt;a href="https://chatgpt.com/codex"&gt;OpenAI Codex&lt;/a&gt; in the ChatGPT mobile app and Claude Artifacts via the Claude app, I was able to scrape the full schedule and then build and deploy this: &lt;a href="https://tools.simonwillison.net/open-sauce-2025"&gt;tools.simonwillison.net/open-sauce-2025&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/open-sauce-2025-card.jpg" alt="Screenshot of a blue page, Open Sauce 2025, July 18-20 2025, Download Calendar ICS button, then Friday 18th and Saturday 19th and Sunday 20th pill buttons, Friday is selected, the Welcome to Open Sauce with William Osman event on the Industry Stage is visible." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The site offers a faster loading and more useful agenda view, but more importantly it includes an option to "Download Calendar (ICS)" which allows mobile phone users (Android and iOS) to easily import the schedule events directly into their calendar app of choice.&lt;/p&gt;
&lt;p&gt;Here are some detailed notes on how I built it.&lt;/p&gt;
&lt;h4 id="scraping-the-schedule"&gt;Scraping the schedule&lt;/h4&gt;
&lt;p&gt;Step one was to get that schedule in a structured format. I don't have good tools for viewing source on my iPhone, so I took a different approach to turning the schedule site into structured data.&lt;/p&gt;
&lt;p&gt;My first thought was to screenshot the schedule on my phone and then dump the images into a vision LLM - but the schedule was long enough that I didn't feel like scrolling through several different pages and stitching together dozens of images.&lt;/p&gt;
&lt;p&gt;If I was working on a laptop I'd turn to scraping: I'd dig around in the site itself and figure out where the data came from, then write code to extract it out.&lt;/p&gt;
&lt;p&gt;How could I do the same thing working on my phone?&lt;/p&gt;
&lt;p&gt;I decided to use &lt;strong&gt;OpenAI Codex&lt;/strong&gt; - the &lt;a href="https://simonwillison.net/2025/May/16/openai-codex/"&gt;hosted tool&lt;/a&gt;, not the confusingly named &lt;a href="https://simonwillison.net/2025/Apr/16/openai-codex/"&gt;CLI utility&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Codex recently &lt;a href="https://simonwillison.net/2025/Jun/3/codex-agent-internet-access/"&gt;grew the ability&lt;/a&gt; to interact with the internet while attempting to resolve a task. I have a dedicated Codex "environment" configured against a GitHub repository that doesn't do anything else, purely so I can run internet-enabled sessions there that can execute arbitrary network-enabled commands.&lt;/p&gt;
&lt;p&gt;I started a new task there (using the Codex interface inside the ChatGPT iPhone app) and prompted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Install playwright and use it to visit https://opensauce.com/agenda/ and grab the full details of all three day schedules from the tabs - Friday and Saturday and Sunday - then save and on Data in as much detail as possible in a JSON file and submit that as a PR&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Codex is frustrating in that you only get one shot: it can go away and work autonomously on a task for a long time, but while it's working you can't give it follow-up prompts. You can wait for it to finish entirely and then tell it to try again in a new session, but ideally the instructions you give it are enough for it to get to the finish state where it submits a pull request against your repo with the results.&lt;/p&gt;
&lt;p&gt;I got lucky: my above prompt worked exactly as intended.&lt;/p&gt;
&lt;p&gt;Codex churned for &lt;em&gt;13 minutes&lt;/em&gt;! I was sat chatting in a coffee shop, occasionally checking the logs to see what it was up to.&lt;/p&gt;
&lt;p&gt;It tried a whole bunch of approaches, all involving running the Playwright Python library to interact with the site. You can see &lt;a href="https://chatgpt.com/s/cd_687945dea5f48191892e0d73ebb45aa4"&gt;the full transcript here&lt;/a&gt;. It includes notes like "&lt;em&gt;Looks like xxd isn't installed. I'll grab "vim-common" or "xxd" to fix it.&lt;/em&gt;".&lt;/p&gt;
&lt;p&gt;Eventually it downloaded an enormous obfuscated chunk of JavaScript called &lt;a href="https://opensauce.com/wp-content/uploads/2025/07/schedule-overview-main-1752724893152.js"&gt;schedule-overview-main-1752724893152.js&lt;/a&gt; (316KB) and then ran a complex sequence of grep, grep, sed, strings, xxd and dd commands against it to figure out the location of the raw schedule data in order to extract it out.&lt;/p&gt;
&lt;p&gt;Here's the eventual &lt;a href="https://github.com/simonw/.github/blob/f671bf57f7c20a4a7a5b0642837811e37c557499/extract_schedule.py"&gt;extract_schedule.py&lt;/a&gt; Python script it wrote, which uses Playwright to save that &lt;code&gt;schedule-overview-main-1752724893152.js&lt;/code&gt; file and then extracts the raw data using the following code (which calls Node.js inside Python, just so it can use the JavaScript &lt;code&gt;eval()&lt;/code&gt; function):&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;node_script&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; (
    &lt;span class="pl-s"&gt;"const fs=require('fs');"&lt;/span&gt;
    &lt;span class="pl-s"&gt;f"const d=fs.readFileSync('&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-s1"&gt;tmp_path&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;','utf8');"&lt;/span&gt;
    &lt;span class="pl-s"&gt;"const m=d.match(/var oo=(&lt;span class="pl-cce"&gt;\\&lt;/span&gt;{.*?&lt;span class="pl-cce"&gt;\\&lt;/span&gt;});/s);"&lt;/span&gt;
    &lt;span class="pl-s"&gt;"if(!m){throw new Error('not found');}"&lt;/span&gt;
    &lt;span class="pl-s"&gt;"const obj=eval('(' + m[1] + ')');"&lt;/span&gt;
    &lt;span class="pl-s"&gt;f"fs.writeFileSync('&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;OUTPUT_FILE&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;', JSON.stringify(obj, null, 2));"&lt;/span&gt;
)
&lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-c1"&gt;run&lt;/span&gt;([&lt;span class="pl-s"&gt;'node'&lt;/span&gt;, &lt;span class="pl-s"&gt;'-e'&lt;/span&gt;, &lt;span class="pl-s1"&gt;node_script&lt;/span&gt;], &lt;span class="pl-s1"&gt;check&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;As instructed, it then filed &lt;a href="https://github.com/simonw/.github/pull/1"&gt;a PR against my repo&lt;/a&gt;. It included the Python Playwright script, but more importantly it also included that full extracted &lt;a href="https://github.com/simonw/.github/blob/f671bf57f7c20a4a7a5b0642837811e37c557499/schedule.json"&gt;schedule.json&lt;/a&gt; file. That meant I now had the schedule data, with a &lt;code&gt;raw.githubusercontent.com&lt;/code&gt; URL with open CORS headers that could be fetched by a web app!&lt;/p&gt;
&lt;h4 id="building-the-web-app"&gt;Building the web app&lt;/h4&gt;
&lt;p&gt;Now that I had the data, the next step was to build a web application to preview it and serve it up in a more useful format.&lt;/p&gt;
&lt;p&gt;I decided I wanted two things: a nice mobile friendly interface for browsing the schedule, and a mechanism for importing that schedule into a calendar application, such as Apple or Google Calendar.&lt;/p&gt;
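&lt;p&gt;The calendar half of that comes down to generating an iCalendar (ICS) file. Here's a minimal Python sketch of the format, not the code that ended up in the app: the &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;start&lt;/code&gt;, &lt;code&gt;length&lt;/code&gt; and &lt;code&gt;where&lt;/code&gt; keys, the &lt;code&gt;PRODID&lt;/code&gt; and the &lt;code&gt;UID&lt;/code&gt; scheme are all assumptions for illustration, and it skips details like folding long lines.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone


def escape_text(value):
    # RFC 5545 TEXT escaping for backslash, semicolon, comma and newline
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def build_ics(sessions, stamp):
    # stamp should be a UTC datetime; start times are written as floating local times
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//schedule-sketch//EN",  # hypothetical product id
    ]
    for index, session in enumerate(sessions):
        start = session["start"]
        end = start + timedelta(minutes=session["length"])
        lines += [
            "BEGIN:VEVENT",
            "UID:session-{}@example.invalid".format(index),
            "DTSTAMP:" + stamp.strftime("%Y%m%dT%H%M%SZ"),
            "DTSTART:" + start.strftime("%Y%m%dT%H%M%S"),
            "DTEND:" + end.strftime("%Y%m%dT%H%M%S"),
            "SUMMARY:" + escape_text(session["title"]),
            "LOCATION:" + escape_text(session["where"]),
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar files use CRLF line endings
    return "\r\n".join(lines) + "\r\n"


# Made-up session details, purely for illustration
EXAMPLE = build_ics(
    [
        {
            "title": "Welcome, everyone",
            "start": datetime(2025, 7, 18, 10, 0),
            "length": 30,
            "where": "Industry Stage",
        }
    ],
    datetime(2025, 7, 17, 12, 0, tzinfo=timezone.utc),
)
```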
&lt;p&gt;It took me several false starts to get this to work. The biggest challenge was getting that 63KB of schedule JSON data into the app. I tried a few approaches here, all on my iPhone while sitting in a coffee shop and later while driving with a friend to drop them off at the closest BART station.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Using ChatGPT Canvas and o3, since unlike Claude Artifacts a Canvas can fetch data from remote URLs if you allow-list that domain. I later found out that &lt;a href="https://chatgpt.com/share/687948b7-e8b8-8006-a450-0c07bdfd7f85"&gt;this had worked&lt;/a&gt; when I viewed it on my laptop, but on my phone it threw errors so I gave up on it.&lt;/li&gt;
&lt;li&gt;Uploading the JSON to Claude and telling it to build an artifact that read the file directly - this &lt;a href="https://claude.ai/share/25297074-37a9-4583-bc2f-630f6dea5c5d"&gt;failed with an error&lt;/a&gt; "undefined is not an object (evaluating 'window.fs.readFile')". The Claude 4 system prompt &lt;a href="https://simonwillison.net/2025/May/25/claude-4-system-prompt/#artifacts-the-missing-manual"&gt;had led me to expect this to work&lt;/a&gt;; I'm not sure why it didn't.&lt;/li&gt;
&lt;li&gt;Having Claude copy the full JSON into the artifact. This took too long - typing out 63KB of JSON is not a sensible use of LLM tokens, and it flaked out on me when my connection went intermittent driving through a tunnel.&lt;/li&gt;
&lt;li&gt;Telling Claude to fetch from the URL to that schedule JSON instead. This was my last resort because the Claude Artifacts UI blocks access to external URLs, so you have to copy and paste the code out to a separate interface (on an iPhone, which still lacks a "select all" button) making for a frustrating process.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That final option worked! Here's the full sequence of prompts I used with Claude to get to a working implementation - &lt;a href="https://claude.ai/share/e391bbcc-09a2-4f86-9bec-c6def8fc8dc9"&gt;full transcript here&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Use your analyst tool to read this JSON file and show me the top level keys&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This was to prime Claude - I wanted to remind it about its &lt;code&gt;window.fs.readFile&lt;/code&gt; function and have it read enough of the JSON to understand the structure.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Build an artifact with no react that turns the schedule into a nice mobile friendly webpage - there are three days Friday, Saturday and Sunday, which corresponded to the 25th and 26th and 27th of July 2025&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Don’t copy the raw JSON over to the artifact - use your fs function to read it instead&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Also include a button to download ICS at the top of the page which downloads a ICS version of the schedule&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I had noticed that the schedule data had keys for "friday" and "saturday" and "sunday" but no indication of the dates, so I told it those. It turned out later I'd got these wrong!&lt;/p&gt;
&lt;p&gt;This got me a version of the page that failed with an error, because that &lt;code&gt;fs.readFile()&lt;/code&gt; couldn't load the data from the artifact for some reason. So I fixed that with:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Change it so instead of using the readFile thing it fetches the same JSON from  https://raw.githubusercontent.com/simonw/.github/f671bf57f7c20a4a7a5b0642837811e37c557499/schedule.json&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;... then copied the HTML out to a Gist and previewed it with &lt;a href="https://gistpreview.github.io/"&gt;gistpreview.github.io&lt;/a&gt; - here's &lt;a href="https://gistpreview.github.io/?06a5d1f3bf0af81d55a411f32b2f37c7"&gt;that preview&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Then we spot-checked it, since there are &lt;em&gt;so many ways&lt;/em&gt; this could have gone wrong. Thankfully the schedule JSON itself never round-tripped through an LLM so we didn't need to worry about hallucinated session details, but this was almost pure vibe coding so there was a big risk of a mistake sneaking through.&lt;/p&gt;
&lt;p&gt;I'd set myself a deadline of "by the time we drop my friend at the BART station" and I hit that deadline with just seconds to spare. I pasted the resulting HTML &lt;a href="https://github.com/simonw/tools/blob/main/open-sauce-2025.html"&gt;into my simonw/tools GitHub repo&lt;/a&gt; using the GitHub mobile web interface which deployed it to that final &lt;a href="https://tools.simonwillison.net/open-sauce-2025"&gt;tools.simonwillison.net/open-sauce-2025&lt;/a&gt; URL.&lt;/p&gt;
&lt;p&gt;... then we noticed that we &lt;em&gt;had&lt;/em&gt; missed a bug: I had given it the dates of "25th and 26th and 27th of July 2025" but actually that was a week too late, the correct dates were July 18th-20th.&lt;/p&gt;
&lt;p&gt;Thankfully I have Codex configured against my &lt;code&gt;simonw/tools&lt;/code&gt; repo as well, so fixing that was a case of prompting a new Codex session with:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;The open sauce schedule got the dates wrong - Friday is 18 July 2025 and Saturday is 19 and Sunday is 20 - fix it&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://chatgpt.com/s/cd_68794c97a3d88191a2cbe9de78103334"&gt;that Codex transcript&lt;/a&gt;, which resulted in &lt;a href="https://github.com/simonw/tools/pull/34"&gt;this PR&lt;/a&gt; which I landed and deployed, again using the GitHub mobile web interface.&lt;/p&gt;
&lt;h4 id="what-this-all-demonstrates"&gt;What this all demonstrates&lt;/h4&gt;
&lt;p&gt;So, to recap: I was able to scrape a website (without even a view source tool), turn the resulting JSON data into a mobile-friendly website, add an ICS export feature and deploy the results to a static hosting platform (GitHub Pages) working entirely on my phone.&lt;/p&gt;
&lt;p&gt;If I'd had a laptop this project would have been faster, but honestly aside from a little bit more hands-on debugging I wouldn't have gone about it in a particularly different way.&lt;/p&gt;
&lt;p&gt;I was able to do other stuff at the same time - the Codex scraping project ran entirely autonomously, and the app build itself was more involved only because I had to work around the limitations of the tools I was using in terms of fetching data from external sources.&lt;/p&gt;
&lt;p&gt;As usual with this stuff, my 25+ years of previous web development experience was critical to being able to execute the project. I knew about Codex, and Artifacts, and GitHub, and Playwright, and CORS headers, and Artifacts sandbox limitations, and the capabilities of ICS files on mobile phones.&lt;/p&gt;
&lt;p&gt;This whole thing was &lt;em&gt;so much fun!&lt;/em&gt; Being able to spin up multiple coding agents directly from my phone and have them solve quite complex problems while only paying partial attention to the details is a solid demonstration of why I continue to enjoy exploring the edges of &lt;a href="https://simonwillison.net/tags/ai-assisted-programming/"&gt;AI-assisted programming&lt;/a&gt;.&lt;/p&gt;

&lt;h4 id="update-i-removed-the-speaker-avatars"&gt;Update: I removed the speaker avatars&lt;/h4&gt;
&lt;p&gt;Here's a beautiful cautionary tale about the dangers of vibe-coding on a phone with no access to performance profiling tools. A commenter on Hacker News &lt;a href="https://news.ycombinator.com/item?id=44597405#44597808"&gt;pointed out&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The web app makes 176 requests and downloads 130 megabytes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And yeah, it did! Turns out those speaker avatar images weren't optimized, and there were over 170 of them.&lt;/p&gt;
&lt;p&gt;I told &lt;a href="https://chatgpt.com/s/cd_6879631d99c48191b1ab7f84dfab8dea"&gt;a fresh Codex instance&lt;/a&gt; "Remove the speaker avatar images from open-sauce-2025.html" and now the page weighs 93.58 KB - about 1,400 times smaller!&lt;/p&gt;
&lt;h4 id="update-2-improved-accessibility"&gt;Update 2: Improved accessibility&lt;/h4&gt;
&lt;p&gt;That same commenter &lt;a href="https://news.ycombinator.com/item?id=44597405#44597808"&gt;on Hacker News&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It's also &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; soup and largely inaccessible.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Yeah, this HTML isn't great:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;dayContainer&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;innerHTML&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;sessions&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;session&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; `
    &amp;lt;div class="session-card"&amp;gt;
        &amp;lt;div class="session-header"&amp;gt;
            &amp;lt;div&amp;gt;
                &amp;lt;span class="session-time"&amp;gt;&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;session&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;time&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;&amp;lt;/span&amp;gt;
                &amp;lt;span class="length-badge"&amp;gt;&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;session&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;length&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt; min&amp;lt;/span&amp;gt;
            &amp;lt;/div&amp;gt;
            &amp;lt;div class="session-location"&amp;gt;&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;session&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;where&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;&amp;lt;/&lt;span class="pl-s1"&gt;div&lt;/span&gt;&lt;span class="pl-c1"&gt;&amp;gt;&lt;/span&gt;
        &amp;lt;/&lt;span class="pl-s1"&gt;div&lt;/span&gt;&lt;span class="pl-c1"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I &lt;a href="https://github.com/simonw/tools/issues/36"&gt;opened an issue&lt;/a&gt; and had both Claude Code and Codex look at it. Claude Code &lt;a href="https://github.com/simonw/tools/issues/36#issuecomment-3085516331"&gt;failed to submit a PR&lt;/a&gt; for some reason, but Codex &lt;a href="https://github.com/simonw/tools/pull/37"&gt;opened one&lt;/a&gt; with a fix that sounded good to me when I tried it with VoiceOver on iOS (using &lt;a href="https://codex-make-open-sauce-2025-h.tools-b1q.pages.dev/open-sauce-2025"&gt;a Cloudflare Pages preview&lt;/a&gt;) so I landed that. Here's &lt;a href="https://github.com/simonw/tools/commit/29c8298363869bbd4b4e7c51378c20dc8ac30c39"&gt;the diff&lt;/a&gt;, which added a hidden "skip to content" link, some &lt;code&gt;aria-&lt;/code&gt; attributes on buttons and upgraded the HTML to use &lt;code&gt;&amp;lt;h3&amp;gt;&lt;/code&gt; for the session titles.&lt;/p&gt;
&lt;p&gt;Next time I'll remember to specify accessibility as a requirement in the initial prompt. I'm disappointed that Claude didn't consider that without me having to ask.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/icalendar"&gt;icalendar&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mobile"&gt;mobile&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scraping"&gt;scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-artifacts"&gt;claude-artifacts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/async-coding-agents"&gt;async-coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-to-app"&gt;prompt-to-app&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="definitions"/><category term="github"/><category term="icalendar"/><category term="mobile"/><category term="scraping"/><category term="tools"/><category term="ai"/><category term="playwright"/><category term="openai"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude"/><category term="claude-artifacts"/><category term="ai-agents"/><category term="vibe-coding"/><category term="coding-agents"/><category term="async-coding-agents"/><category term="prompt-to-app"/></entry><entry><title>TIL: Using Playwright MCP with Claude Code</title><link href="https://simonwillison.net/2025/Jul/1/using-playwright-mcp-with-claude-code/#atom-tag" rel="alternate"/><published>2025-07-01T23:55:09+00:00</published><updated>2025-07-01T23:55:09+00:00</updated><id>https://simonwillison.net/2025/Jul/1/using-playwright-mcp-with-claude-code/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/claude-code/playwright-mcp-claude-code"&gt;TIL: Using Playwright MCP with Claude Code&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Inspired &lt;a href="https://simonwillison.net/2025/Jun/29/agentic-coding/"&gt;by Armin&lt;/a&gt; ("I personally use only one MCP - I only use Playwright") I decided to figure out how to use the official &lt;a href="https://github.com/microsoft/playwright-mcp"&gt;Playwright MCP server&lt;/a&gt; with &lt;a href="https://simonwillison.net/tags/claude-code/"&gt;Claude Code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It turns out it's easy:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;claude mcp add playwright npx '@playwright/mcp@latest'
claude
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;claude mcp add&lt;/code&gt; command only affects the current directory by default - it gets persisted in the &lt;code&gt;~/.claude.json&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;Now Claude can use Playwright to automate a Chrome browser! Tell it to "Use playwright mcp to open a browser to example.com" and watch it go - it can navigate pages, submit forms, execute custom JavaScript and take screenshots to feed back into the LLM.&lt;/p&gt;
&lt;p&gt;The browser window stays visible which means you can interact with it too, including signing into websites so Claude can act on your behalf.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/armin-ronacher"&gt;armin-ronacher&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/til"&gt;til&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="armin-ronacher"/><category term="til"/><category term="playwright"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="claude-code"/></entry><entry><title>shot-scraper 1.8</title><link href="https://simonwillison.net/2025/Mar/25/shot-scraper/#atom-tag" rel="alternate"/><published>2025-03-25T01:59:38+00:00</published><updated>2025-03-25T01:59:38+00:00</updated><id>https://simonwillison.net/2025/Mar/25/shot-scraper/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/shot-scraper/releases/tag/1.8"&gt;shot-scraper 1.8&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I've added a new feature to &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; that makes it easier to share scripts for other people to use with the &lt;a href="https://shot-scraper.datasette.io/en/stable/javascript.html"&gt;shot-scraper javascript&lt;/a&gt; command.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;shot-scraper javascript&lt;/code&gt; lets you load up a web page in an invisible Chrome browser (via Playwright), execute some JavaScript against that page and output the results to your terminal. It's a fun way of running complex screen-scraping routines as part of a terminal session, or even chained together with other commands using pipes.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;-i/--input&lt;/code&gt; option lets you load that JavaScript from a file on disk - but now you can also use a &lt;code&gt;gh:&lt;/code&gt; prefix to specify loading code from GitHub instead.&lt;/p&gt;
&lt;p&gt;To quote &lt;a href="https://github.com/simonw/shot-scraper/releases/tag/1.8"&gt;the release notes&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;shot-scraper javascript&lt;/code&gt; can now optionally &lt;a href="https://shot-scraper.datasette.io/en/stable/javascript.html#running-javascript-from-github"&gt;load scripts hosted on GitHub&lt;/a&gt; via the new &lt;code&gt;gh:&lt;/code&gt; prefix to the &lt;code&gt;shot-scraper javascript -i/--input&lt;/code&gt; option. &lt;a href="https://github.com/simonw/shot-scraper/issues/173"&gt;#173&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Scripts can be referenced as &lt;code&gt;gh:username/repo/path/to/script.js&lt;/code&gt; or, if the GitHub user has created a dedicated &lt;code&gt;shot-scraper-scripts&lt;/code&gt; repository and placed scripts in the root of it, using &lt;code&gt;gh:username/name-of-script&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;For example, to run this &lt;a href="https://github.com/simonw/shot-scraper-scripts/blob/main/readability.js"&gt;readability.js&lt;/a&gt; script against any web page you can use the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper javascript --input gh:simonw/readability \
  https://simonwillison.net/2025/Mar/24/qwen25-vl-32b/
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;a href="https://gist.github.com/simonw/60e196ec39a5a75dcabfd75fbe911a4c"&gt;output from that example&lt;/a&gt; starts like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
    &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Qwen2.5-VL-32B: Smarter and Lighter&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"byline"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Simon Willison&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"dir"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"lang"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;en-gb&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"content"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&amp;lt;div id=&lt;span class="pl-cce"&gt;\"&lt;/span&gt;readability-page-1&lt;span class="pl-cce"&gt;\"...&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;My &lt;a href="https://github.com/simonw/shot-scraper-scripts"&gt;simonw/shot-scraper-scripts&lt;/a&gt; repo only has that one file in it so far, but I'm looking forward to growing that collection and hopefully seeing other people create and share their own &lt;code&gt;shot-scraper-scripts&lt;/code&gt; repos as well.&lt;/p&gt;
&lt;p&gt;This feature is an imitation of &lt;a href="https://github.com/simonw/llm/issues/809"&gt;a similar feature&lt;/a&gt; that's coming in the next release of LLM.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scraping"&gt;scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="javascript"/><category term="projects"/><category term="scraping"/><category term="annotated-release-notes"/><category term="playwright"/><category term="shot-scraper"/></entry><entry><title>microsoft/playwright-mcp</title><link href="https://simonwillison.net/2025/Mar/25/playwright-mcp/#atom-tag" rel="alternate"/><published>2025-03-25T01:40:05+00:00</published><updated>2025-03-25T01:40:05+00:00</updated><id>https://simonwillison.net/2025/Mar/25/playwright-mcp/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/microsoft/playwright-mcp"&gt;microsoft/playwright-mcp&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The Playwright team at Microsoft have released an MCP (&lt;a href="https://github.com/microsoft/playwright-mcp"&gt;Model Context Protocol&lt;/a&gt;) server wrapping Playwright, and it's pretty fascinating.&lt;/p&gt;
&lt;p&gt;They implemented it on top of the Chrome accessibility tree, so MCP clients (such as the Claude Desktop app) can use it to drive an automated browser and use the accessibility tree to read and navigate pages that they visit.&lt;/p&gt;
&lt;p&gt;Trying it out is quite easy if you have Claude Desktop and Node.js installed already. Edit your &lt;code&gt;claude_desktop_config.json&lt;/code&gt; file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;code ~/Library/Application\ Support/Claude/claude_desktop_config.json
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And add this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"mcpServers"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"playwright"&lt;/span&gt;: {
      &lt;span class="pl-ent"&gt;"command"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;npx&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"args"&lt;/span&gt;: [
        &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;@playwright/mcp@latest&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
      ]
    }
  }
}&lt;/pre&gt;&lt;/div&gt;
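&lt;p&gt;If the config file already lists other MCP servers, hand-editing risks clobbering them. Here is a minimal Python sketch that merges the entry instead. The &lt;code&gt;add_mcp_server()&lt;/code&gt; helper is hypothetical, and the demo writes to a temporary directory rather than the real config file.&lt;/p&gt;

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, args: list) -> dict:
    """Add or replace one entry under mcpServers, keeping everything else."""
    config = {}
    if config_path.exists():
        # Treat an empty file the same as an empty JSON object
        config = json.loads(config_path.read_text() or "{}")
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already has one (made-up) server in it
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    path.write_text(json.dumps(existing))
    updated = add_mcp_server(path, "playwright", "npx", ["@playwright/mcp@latest"])
    print(json.dumps(updated, indent=2))
```

&lt;p&gt;To apply it for real on macOS, pass &lt;code&gt;Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"&lt;/code&gt; as the first argument.&lt;/p&gt;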

&lt;p&gt;Now when you launch Claude Desktop various new browser automation tools will be available to it, and you can tell Claude to navigate to a website and interact with it.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of Claude interface showing a conversation about Datasette. The interface shows Claude responding to a user (SW) after navigating to datasette.io. Claude's response includes page details (URL: https://datasette.io/, Title: Datasette: An open source multi-tool for exploring and publishing data) and a summary of what's visible on the site: a description of Datasette as an open-source tool for exploring and publishing data, the tagline &amp;quot;Find stories in data&amp;quot;, navigation options, and features including exploratory data analysis, instant data publishing, and rapid prototyping." src="https://static.simonwillison.net/static/2025/claude-playwright.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;I ran the following to get a list of the available tools:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd /tmp
git clone https://github.com/microsoft/playwright-mcp
cd playwright-mcp/src/tools
files-to-prompt . | llm -m claude-3.7-sonnet \
  'Output a detailed description of these tools'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;a href="https://gist.github.com/simonw/69200999149221c549c1f62e7befa20f"&gt;full output is here&lt;/a&gt;, but here's the truncated tool list:&lt;/p&gt;
&lt;blockquote&gt;
&lt;h4&gt;Navigation Tools (&lt;code&gt;common.ts&lt;/code&gt;)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;browser_navigate&lt;/strong&gt;: Navigate to a specific URL&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_go_back&lt;/strong&gt;: Navigate back in browser history&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_go_forward&lt;/strong&gt;: Navigate forward in browser history&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_wait&lt;/strong&gt;: Wait for a specified time in seconds&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_press_key&lt;/strong&gt;: Press a keyboard key&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_save_as_pdf&lt;/strong&gt;: Save current page as PDF&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_close&lt;/strong&gt;: Close the current page&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Screenshot and Mouse Tools (&lt;code&gt;screenshot.ts&lt;/code&gt;)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;browser_screenshot&lt;/strong&gt;: Take a screenshot of the current page&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_move_mouse&lt;/strong&gt;: Move mouse to specific coordinates&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_click&lt;/strong&gt; (coordinate-based): Click at specific x,y coordinates&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_drag&lt;/strong&gt; (coordinate-based): Drag mouse from one position to another&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_type&lt;/strong&gt; (keyboard): Type text and optionally submit&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Accessibility Snapshot Tools (&lt;code&gt;snapshot.ts&lt;/code&gt;)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;browser_snapshot&lt;/strong&gt;: Capture accessibility structure of the page&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_click&lt;/strong&gt; (element-based): Click on a specific element using accessibility reference&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_drag&lt;/strong&gt; (element-based): Drag between two elements&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_hover&lt;/strong&gt;: Hover over an element&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;browser_type&lt;/strong&gt; (element-based): Type text into a specific element&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/model-context-protocol"&gt;model-context-protocol&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/files-to-prompt"&gt;files-to-prompt&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="playwright"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="llm-tool-use"/><category term="model-context-protocol"/><category term="files-to-prompt"/></entry><entry><title>shot-scraper 1.6 with support for HTTP Archives</title><link href="https://simonwillison.net/2025/Feb/13/shot-scraper/#atom-tag" rel="alternate"/><published>2025-02-13T21:02:37+00:00</published><updated>2025-02-13T21:02:37+00:00</updated><id>https://simonwillison.net/2025/Feb/13/shot-scraper/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/shot-scraper/releases/tag/1.6"&gt;shot-scraper 1.6 with support for HTTP Archives&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New release of my &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; CLI tool for taking screenshots and scraping web pages.&lt;/p&gt;
&lt;p&gt;The big new feature is &lt;a href="https://en.wikipedia.org/wiki/HAR_(file_format)"&gt;HTTP Archive (HAR)&lt;/a&gt; support. The new &lt;a href="https://shot-scraper.datasette.io/en/stable/har.html"&gt;shot-scraper har command&lt;/a&gt; can create an archive of a page and all of its dependencies like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper har https://datasette.io/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This produces a &lt;code&gt;datasette-io.har&lt;/code&gt; file (currently 163KB) which is JSON representing the full set of requests used to render that page. Here's &lt;a href="https://gist.github.com/simonw/b1fdf434e460814efdb89c95c354f794"&gt;a copy of that file&lt;/a&gt;. You can visualize that &lt;a href="https://ericduran.github.io/chromeHAR/?url=https://gist.githubusercontent.com/simonw/b1fdf434e460814efdb89c95c354f794/raw/924c1eb12b940ff02cefa2cc068f23c9d3cc5895/datasette.har.json"&gt;here using ericduran.github.io/chromeHAR&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="The HAR viewer shows a line for each of the loaded resources, with options to view timing information" src="https://static.simonwillison.net/static/2025/har-viewer.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;That JSON includes full copies of all of the responses, base64 encoded if they are binary files such as images.&lt;/p&gt;
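&lt;p&gt;To make that structure concrete, here is a small Python sketch that walks the entries in a HAR document and decodes each response body. It relies only on fields from the HAR format (&lt;code&gt;log.entries&lt;/code&gt;, &lt;code&gt;request.url&lt;/code&gt;, &lt;code&gt;response.status&lt;/code&gt;, &lt;code&gt;response.content&lt;/code&gt;). The &lt;code&gt;summarize_har()&lt;/code&gt; name and the tiny hand-made example data are hypothetical.&lt;/p&gt;

```python
import base64
import json


def summarize_har(har: dict) -> list:
    """Return (url, status, body_size_in_bytes) for each entry in a parsed HAR."""
    rows = []
    for entry in har["log"]["entries"]:
        response = entry["response"]
        content = response.get("content", {})
        text = content.get("text", "")
        if content.get("encoding") == "base64":
            # Binary bodies such as images are stored base64 encoded
            body = base64.b64decode(text)
        else:
            body = text.encode("utf-8")
        rows.append((entry["request"]["url"], response["status"], len(body)))
    return rows


# Hand-made HAR fragment with one text response and one binary response
example = json.loads("""
{"log": {"entries": [
  {"request": {"url": "https://example.com/"},
   "response": {"status": 200,
                "content": {"mimeType": "text/html", "text": "hello"}}},
  {"request": {"url": "https://example.com/logo.png"},
   "response": {"status": 200,
                "content": {"mimeType": "image/png", "encoding": "base64",
                            "text": "iVBORw=="}}}
]}}
""")

for url, status, size in summarize_har(example):
    print(status, size, url)
```

&lt;p&gt;For a real file, something like &lt;code&gt;summarize_har(json.load(open("datasette-io.har")))&lt;/code&gt; should work the same way.&lt;/p&gt;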
&lt;p&gt;You can add the &lt;code&gt;--zip&lt;/code&gt; flag to instead get a &lt;code&gt;datasette-io.har.zip&lt;/code&gt; file, containing JSON data in &lt;code&gt;har.har&lt;/code&gt; but with the response bodies saved as separate files in that archive.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;shot-scraper multi&lt;/code&gt; command lets you run &lt;code&gt;shot-scraper&lt;/code&gt; against multiple URLs in sequence, specified using a YAML file. That command now takes a &lt;code&gt;--har&lt;/code&gt; option (or &lt;code&gt;--har-zip&lt;/code&gt; or &lt;code&gt;--har-file name-of-file&lt;/code&gt;), &lt;a href="https://shot-scraper.datasette.io/en/stable/multi.html#recording-to-an-http-archive"&gt;described in the documentation&lt;/a&gt;, which will produce a HAR at the same time as taking the screenshots.&lt;/p&gt;
&lt;p&gt;Shots are usually defined in YAML that looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-s"&gt;example.com.png&lt;/span&gt;
  &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;http://www.example.com/&lt;/span&gt;
- &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-s"&gt;w3c.org.png&lt;/span&gt;
  &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;https://www.w3.org/&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;You can now omit the &lt;code&gt;output:&lt;/code&gt; keys and generate a HAR file without taking any screenshots at all:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;http://www.example.com/&lt;/span&gt;
- &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;https://www.w3.org/&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Run like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper multi shots.yml --har
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which outputs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Skipping screenshot of 'https://www.example.com/'
Skipping screenshot of 'https://www.w3.org/'
Wrote to HAR file: trace.har
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;shot-scraper&lt;/code&gt; is built on top of Playwright, and the new features use the &lt;a href="https://playwright.dev/python/docs/next/api/class-browser#browser-new-context-option-record-har-path"&gt;browser.new_context(record_har_path=...)&lt;/a&gt; parameter.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scraping"&gt;scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="projects"/><category term="python"/><category term="scraping"/><category term="playwright"/><category term="shot-scraper"/></entry><entry><title>Guidepup</title><link href="https://simonwillison.net/2024/Mar/14/guidepup/#atom-tag" rel="alternate"/><published>2024-03-14T04:07:49+00:00</published><updated>2024-03-14T04:07:49+00:00</updated><id>https://simonwillison.net/2024/Mar/14/guidepup/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/guidepup/guidepup"&gt;Guidepup&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I’ve been hoping to find something like this for years. Guidepup is “a screen reader driver for test automation”—you can use it to automate both VoiceOver on macOS and NVDA on Windows, and it can both drive the screen reader for automated tests and even produce a video at the end of the test.&lt;/p&gt;

&lt;p&gt;Also available: @guidepup/playwright, providing integration with the Playwright browser automation testing framework.&lt;/p&gt;

&lt;p&gt;I’d love to see open source JavaScript libraries both use something like this for their testing and publish videos of the tests to demonstrate how they work in these common screen readers.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/accessibility"&gt;accessibility&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/screen-readers"&gt;screen-readers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;&lt;/p&gt;



</summary><category term="accessibility"/><category term="screen-readers"/><category term="playwright"/></entry><entry><title>Weeknotes: datasette-test, datasette-build, PSF board retreat</title><link href="https://simonwillison.net/2024/Jan/21/weeknotes/#atom-tag" rel="alternate"/><published>2024-01-21T11:34:43+00:00</published><updated>2024-01-21T11:34:43+00:00</updated><id>https://simonwillison.net/2024/Jan/21/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;I wrote about &lt;a href="https://simonwillison.net/2024/Jan/7/page-caching-and-custom-templates-for-datasette-cloud/"&gt;Page caching and custom templates&lt;/a&gt; in my last weeknotes. This week I wrapped up that work, modifying &lt;a href="https://github.com/simonw/datasette-edit-templates/releases"&gt;datasette-edit-templates&lt;/a&gt; to be compatible with the &lt;a href="https://docs.datasette.io/en/latest/plugin_hooks.html#jinja2-environment-from-request-datasette-request-env"&gt;jinja2_environment_from_request()&lt;/a&gt; plugin hook. This means you can edit templates directly in Datasette itself and have those served either for the full instance or just for the instance when served from a specific domain (the Datasette Cloud case).&lt;/p&gt;
&lt;h4 id="testing-plugins-with-playwright"&gt;Testing plugins with Playwright&lt;/h4&gt;
&lt;p&gt;As Datasette 1.0 draws closer, I've started thinking about plugin compatibility. This is heavily inspired by my work on Datasette Cloud, which has been running the latest Datasette alphas for several months.&lt;/p&gt;
&lt;p&gt;I spotted that &lt;code&gt;datasette-cluster-map&lt;/code&gt; wasn't working correctly on &lt;a href="https://www.datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt;, as it hadn't been upgraded to account for JSON API changes in Datasette 1.0.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/datasette-cluster-map/releases/tag/0.18"&gt;datasette-cluster-map 0.18&lt;/a&gt; fixed that, while continuing to work with previous versions of Datasette. More importantly, it introduced &lt;a href="https://playwright.dev/python/"&gt;Playwright&lt;/a&gt; tests to exercise the plugin in a real Chromium browser running in GitHub Actions.&lt;/p&gt;
&lt;p&gt;I've been wanting to establish a good pattern for this for a while, since a lot of Datasette plugins include JavaScript behaviour that warrants browser automation testing.&lt;/p&gt;
&lt;p&gt;Alex Garcia figured this out for &lt;a href="https://github.com/datasette/datasette-comments/blob/main/tests/test_ui.py"&gt;datasette-comments&lt;/a&gt; - inspired by his code I wrote up a TIL on &lt;a href="https://til.simonwillison.net/datasette/playwright-tests-datasette-plugin"&gt;Writing Playwright tests for a Datasette Plugin&lt;/a&gt; which I've now also used in &lt;a href="https://github.com/simonw/datasette-search-all/blob/770f95018f106d3b754a526b84d2f877d4725cf9/tests/test_playwright.py"&gt;datasette-search-all&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="datasette-test"&gt;datasette-test&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://github.com/datasette/datasette-test"&gt;datasette-test&lt;/a&gt; is a new library that provides testing utilities for Datasette plugins. So far it offers two:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;datasette_test&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;Datasette&lt;/span&gt;
&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;pytest&lt;/span&gt;

&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;pytest&lt;/span&gt;.&lt;span class="pl-s1"&gt;mark&lt;/span&gt;.&lt;span class="pl-s1"&gt;asyncio&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;test_datasette&lt;/span&gt;():
    &lt;span class="pl-s1"&gt;ds&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Datasette&lt;/span&gt;(&lt;span class="pl-s1"&gt;plugin_config&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;{&lt;span class="pl-s"&gt;"my-plugin"&lt;/span&gt;: {&lt;span class="pl-s"&gt;"config"&lt;/span&gt;: &lt;span class="pl-s"&gt;"goes here"&lt;/span&gt;})&lt;/pre&gt;
&lt;p&gt;This &lt;code&gt;datasette_test.Datasette&lt;/code&gt; class is a subclass of &lt;code&gt;Datasette&lt;/code&gt; which helps write tests that work against both Datasette &amp;lt;1.0 and Datasette &amp;gt;=1.0a8 (releasing shortly). The way plugin configuration works is changing, and this &lt;code&gt;plugin_config=&lt;/code&gt; parameter papers over that difference for plugin tests.&lt;/p&gt;
&lt;p&gt;The other utility is a &lt;code&gt;wait_until_responds("http://localhost:8001")&lt;/code&gt; function. This can be used to wait until a server has started, useful for testing with Playwright. I extracted this from Alex's &lt;code&gt;datasette-comments&lt;/code&gt; tests.&lt;/p&gt;
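&lt;p&gt;The general idea behind a helper like that can be sketched with just the standard library. This is not the &lt;code&gt;datasette-test&lt;/code&gt; implementation: the &lt;code&gt;wait_until_responds_sketch()&lt;/code&gt; name, its &lt;code&gt;timeout&lt;/code&gt; and &lt;code&gt;interval&lt;/code&gt; parameters and the use of &lt;code&gt;urllib&lt;/code&gt; are all assumptions made for illustration.&lt;/p&gt;

```python
import http.server
import threading
import time
import urllib.error
import urllib.request


def wait_until_responds_sketch(url: str, timeout: float = 5.0, interval: float = 0.1) -> None:
    """Poll url until it returns any HTTP response, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            urllib.request.urlopen(url, timeout=1.0).close()
            return
        except urllib.error.HTTPError:
            # The server answered, even if only with an error status
            return
        except OSError:
            # Connection refused or similar: the server is not up yet
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{url} did not respond within {timeout}s")
            time.sleep(interval)


# Demo: start a throwaway local server on a free port, then wait for it
server = http.server.HTTPServer(("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
wait_until_responds_sketch(f"http://127.0.0.1:{server.server_port}/")
print("server is responding")
server.shutdown()
server.server_close()
```

&lt;p&gt;In a Playwright test the same call would sit between starting the Datasette process and opening the first page.&lt;/p&gt;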
&lt;h4 id="datasette-build"&gt;datasette-build&lt;/h4&gt;
&lt;p&gt;So far this is just the skeleton of a new tool. I plan for &lt;a href="https://github.com/datasette/datasette-build"&gt;datasette-build&lt;/a&gt; to offer comprehensive support for converting a directory full of static data files - JSON, TSV, CSV and more - into a SQLite database, and eventually to other database backends as well.&lt;/p&gt;
&lt;p&gt;So far it's pretty minimal, but my goal is to use plugins to provide optional support for further formats, such as GeoJSON or Parquet or even &lt;code&gt;.xlsx&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I really like using GitHub to keep smaller (less than 1GB) datasets under version control. My plan is for &lt;code&gt;datasette-build&lt;/code&gt; to support that pattern, making it easy to load version-controlled data files into a SQLite database you can then query directly.&lt;/p&gt;
&lt;h4 id="psf-in-person"&gt;PSF board in-person meeting&lt;/h4&gt;
&lt;p&gt;I spent the last two days of this week at the annual &lt;a href="https://www.python.org/psf-landing/"&gt;Python Software Foundation&lt;/a&gt; in-person board meeting. It's been fantastic catching up with the other board members over more than just a Zoom connection, and we had a very thorough two days figuring out strategy for the next year and beyond.&lt;/p&gt;
&lt;h4 id="blog-entries-2024-01-21"&gt;Blog entries&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2024/Jan/17/oxide-and-friends/"&gt;Talking about Open Source LLMs on Oxide and Friends&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2024/Jan/16/python-lib-pypi/"&gt;Publish Python packages to PyPI with a python-lib cookiecutter template and GitHub Actions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2024/Jan/9/what-i-should-have-said-about-ai/"&gt;What I should have said about the term Artificial Intelligence&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="releases-2024-01-21"&gt;Releases&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-edit-templates/releases/tag/0.4.3"&gt;datasette-edit-templates 0.4.3&lt;/a&gt;&lt;/strong&gt; - 2024-01-17&lt;br /&gt;Plugin allowing Datasette templates to be edited within Datasette&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-test/releases/tag/0.2"&gt;datasette-test 0.2&lt;/a&gt;&lt;/strong&gt; - 2024-01-16&lt;br /&gt;Utilities to help write tests for Datasette plugins and applications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-cluster-map/releases/tag/0.18.1"&gt;datasette-cluster-map 0.18.1&lt;/a&gt;&lt;/strong&gt; - 2024-01-16&lt;br /&gt;Datasette plugin that shows a map for any data with latitude/longitude columns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-build/releases/tag/0.1a0"&gt;datasette-build 0.1a0&lt;/a&gt;&lt;/strong&gt; - 2024-01-15&lt;br /&gt;Build a directory full of files into a SQLite database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-auth-tokens/releases/tag/0.4a7"&gt;datasette-auth-tokens 0.4a7&lt;/a&gt;&lt;/strong&gt; - 2024-01-13&lt;br /&gt;Datasette plugin for authenticating access using API tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-search-all/releases/tag/1.1.2"&gt;datasette-search-all 1.1.2&lt;/a&gt;&lt;/strong&gt; - 2024-01-08&lt;br /&gt;Datasette plugin for searching all searchable tables at once&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="tils-2024-01-21"&gt;TILs&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/pypi/pypi-releases-from-github"&gt;Publish releases to PyPI from GitHub Actions without a password or token&lt;/a&gt; - 2024-01-15&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/python/pprint-no-sort-dicts"&gt;Using pprint() to print dictionaries while preserving their key order&lt;/a&gt; - 2024-01-15&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/playwright/expect-selector-count"&gt;Using expect() to wait for a selector to match multiple items&lt;/a&gt; - 2024-01-13&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/sphinx/literalinclude-with-markers"&gt;literalinclude with markers for showing code in documentation&lt;/a&gt; - 2024-01-10&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/datasette/playwright-tests-datasette-plugin"&gt;Writing Playwright tests for a Datasette Plugin&lt;/a&gt; - 2024-01-09&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/cloudflare/cloudflare-cache-html"&gt;How to get Cloudflare to cache HTML&lt;/a&gt; - 2024-01-09&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/fly/varnish-on-fly"&gt;Running Varnish on Fly&lt;/a&gt; - 2024-01-08&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/psf"&gt;psf&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="datasette-cloud"/><category term="playwright"/><category term="psf"/></entry><entry><title>nat/natbot</title><link href="https://simonwillison.net/2022/Sep/30/natbot/#atom-tag" rel="alternate"/><published>2022-09-30T01:01:30+00:00</published><updated>2022-09-30T01:01:30+00:00</updated><id>https://simonwillison.net/2022/Sep/30/natbot/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/nat/natbot"&gt;nat/natbot&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Extremely devious hack by Nat Friedman: opens a browser using Playwright and then passes a DOM representation to GPT-3 in order to power a chat-style interface for driving the browser. Worth diving into the code to look at the prompt it uses - it’s fascinating.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/natfriedman/status/1575631194032549888"&gt;@natfriedman&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-3"&gt;gpt-3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;&lt;/p&gt;



</summary><category term="playwright"/><category term="gpt-3"/><category term="openai"/></entry><entry><title>Bundling binary tools in Python wheels</title><link href="https://simonwillison.net/2022/May/23/bundling-binary-tools-in-python-wheels/#atom-tag" rel="alternate"/><published>2022-05-23T15:06:04+00:00</published><updated>2022-05-23T15:06:04+00:00</updated><id>https://simonwillison.net/2022/May/23/bundling-binary-tools-in-python-wheels/#atom-tag</id><summary type="html">
    &lt;p&gt;I spotted a new (to me) pattern which I think is pretty interesting: projects are bundling compiled binary applications as part of their Python packaging wheels. I think it’s really neat.&lt;/p&gt;
&lt;h4&gt;pip install ziglang&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://ziglang.org/"&gt;Zig&lt;/a&gt; is a new programming language led by Andrew Kelley that sits somewhere near Rust: Wikipedia &lt;a href="https://en.wikipedia.org/wiki/Zig_(programming_language)"&gt;calls it&lt;/a&gt; an "imperative, general-purpose, statically typed, compiled system programming language".&lt;/p&gt;
&lt;p&gt;One of its most notable features is that it bundles its own C/C++ compiler, as a “hermetic” compiler - it’s completely standalone, unaffected by the system that it is operating within. I learned about this usage of the word hermetic this morning from &lt;a href="https://jakstys.lt/2022/how-uber-uses-zig/"&gt;How Uber Uses Zig&lt;/a&gt; by Motiejus Jakštys.&lt;/p&gt;
&lt;p&gt;The concept reminds me of Gregory Szorc's &lt;a href="https://github.com/indygreg/python-build-standalone"&gt;python-build-standalone&lt;/a&gt;, which provides redistributable Python builds and was key to getting &lt;a href="https://simonwillison.net/2021/Sep/8/datasette-desktop/"&gt;my Datasette Desktop Electron application&lt;/a&gt; working with its own hermetic build of Python.&lt;/p&gt;
&lt;p&gt;One of the options provided for installing Zig (and its bundled toolchain) is &lt;a href="https://github.com/ziglang/zig-pypi/blob/main/README.pypi.md"&gt;to use pip&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;% pip install ziglang
...
% python -m ziglang cc --help
OVERVIEW: clang LLVM compiler

USAGE: zig [options] file...

OPTIONS:
  -#&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt;#                    Print (but do not run) the commands to run for this compilation&lt;/span&gt;
  --amdgpu-arch-tool=&lt;span class="pl-k"&gt;&amp;lt;&lt;/span&gt;value&lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt;
                          Tool used &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-smi"&gt;detecting AMD GPU arch&lt;/span&gt; &lt;span class="pl-k"&gt;in&lt;/span&gt; the system.
...&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This means you can now &lt;code&gt;pip install&lt;/code&gt; a full C compiler for your current platform!&lt;/p&gt;
&lt;p&gt;The way this works is really simple. The &lt;code&gt;ziglang&lt;/code&gt; package that you install has two key files: a &lt;code&gt;zig&lt;/code&gt; binary (155MB on my system) containing the full compiled Zig implementation, and a &lt;code&gt;__main__.py&lt;/code&gt; module containing the following:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;os&lt;/span&gt;, &lt;span class="pl-s1"&gt;sys&lt;/span&gt;, &lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;
&lt;span class="pl-s1"&gt;sys&lt;/span&gt;.&lt;span class="pl-en"&gt;exit&lt;/span&gt;(&lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-en"&gt;call&lt;/span&gt;([
    &lt;span class="pl-s1"&gt;os&lt;/span&gt;.&lt;span class="pl-s1"&gt;path&lt;/span&gt;.&lt;span class="pl-en"&gt;join&lt;/span&gt;(&lt;span class="pl-s1"&gt;os&lt;/span&gt;.&lt;span class="pl-s1"&gt;path&lt;/span&gt;.&lt;span class="pl-en"&gt;dirname&lt;/span&gt;(&lt;span class="pl-s1"&gt;__file__&lt;/span&gt;), &lt;span class="pl-s"&gt;"zig"&lt;/span&gt;),
    &lt;span class="pl-c1"&gt;*&lt;/span&gt;&lt;span class="pl-s1"&gt;sys&lt;/span&gt;.&lt;span class="pl-s1"&gt;argv&lt;/span&gt;[&lt;span class="pl-c1"&gt;1&lt;/span&gt;:]
]))&lt;/pre&gt;
&lt;p&gt;The package also bundles &lt;code&gt;lib&lt;/code&gt; and &lt;code&gt;doc&lt;/code&gt; folders with supporting files used by Zig itself, unrelated to Python.&lt;/p&gt;
&lt;p&gt;The Zig project then bundles and ships eight different Python wheels targeting different platforms. &lt;a href="https://github.com/ziglang/zig-pypi/blob/de14cf728fa35c014821f62a4fa9abd9f4bb560e/make_wheels.py#L115-L124"&gt;Here's their code&lt;/a&gt; that does that, which lists the platforms that are supported:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;zig_platform&lt;/span&gt;, &lt;span class="pl-s1"&gt;python_platform&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; {
    &lt;span class="pl-s"&gt;'windows-i386'&lt;/span&gt;:   &lt;span class="pl-s"&gt;'win32'&lt;/span&gt;,
    &lt;span class="pl-s"&gt;'windows-x86_64'&lt;/span&gt;: &lt;span class="pl-s"&gt;'win_amd64'&lt;/span&gt;,
    &lt;span class="pl-s"&gt;'macos-x86_64'&lt;/span&gt;:   &lt;span class="pl-s"&gt;'macosx_10_9_x86_64'&lt;/span&gt;,
    &lt;span class="pl-s"&gt;'macos-aarch64'&lt;/span&gt;:  &lt;span class="pl-s"&gt;'macosx_11_0_arm64'&lt;/span&gt;,
    &lt;span class="pl-s"&gt;'linux-i386'&lt;/span&gt;:     &lt;span class="pl-s"&gt;'manylinux_2_12_i686.manylinux2010_i686'&lt;/span&gt;,
    &lt;span class="pl-s"&gt;'linux-x86_64'&lt;/span&gt;:   &lt;span class="pl-s"&gt;'manylinux_2_12_x86_64.manylinux2010_x86_64'&lt;/span&gt;,
    &lt;span class="pl-s"&gt;'linux-armv7a'&lt;/span&gt;:   &lt;span class="pl-s"&gt;'manylinux_2_17_armv7l.manylinux2014_armv7l'&lt;/span&gt;,
    &lt;span class="pl-s"&gt;'linux-aarch64'&lt;/span&gt;:  &lt;span class="pl-s"&gt;'manylinux_2_17_aarch64.manylinux2014_aarch64'&lt;/span&gt;,
}.&lt;span class="pl-en"&gt;items&lt;/span&gt;():
    &lt;span class="pl-c"&gt;# Build the wheel here...&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;They &lt;a href="https://github.com/ziglang/zig-pypi/blob/main/README.pypi.md#usage"&gt;suggest&lt;/a&gt; that if you want to run their tools from a Python program you do so like this, to ensure your script can find the installed binary:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;sys&lt;/span&gt;, &lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;

&lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-en"&gt;call&lt;/span&gt;([&lt;span class="pl-s1"&gt;sys&lt;/span&gt;.&lt;span class="pl-s1"&gt;executable&lt;/span&gt;, &lt;span class="pl-s"&gt;"-m"&lt;/span&gt;, &lt;span class="pl-s"&gt;"ziglang"&lt;/span&gt;])&lt;/pre&gt;
&lt;p&gt;I find this whole approach pretty fascinating. I really love the idea that I can add a full C/C++ compiler as a dependency to any of my Python projects, and thanks to Python wheels I'll automatically get a binary executable compiled for my current platform.&lt;/p&gt;
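&lt;p&gt;As a minimal sketch of what depending on that compiler might look like, here is a helper that builds a &lt;code&gt;python -m ziglang cc&lt;/code&gt; invocation. The function names and file names are hypothetical, and the only compiler flag used is &lt;code&gt;-o&lt;/code&gt;, which &lt;code&gt;zig cc&lt;/code&gt; takes from clang. Actually running it assumes &lt;code&gt;pip install ziglang&lt;/code&gt; has been run in the same environment.&lt;/p&gt;

```python
import subprocess
import sys


def zig_cc_command(source, output):
    # Build the argument list for "python -m ziglang cc SOURCE -o OUTPUT".
    # sys.executable keeps the call inside the current Python environment,
    # which is where pip installed the ziglang wheel.
    return [sys.executable, "-m", "ziglang", "cc", source, "-o", output]


def compile_c(source, output):
    # Hypothetical helper: runs the compiler and returns its exit code.
    # This only works once the ziglang package is installed.
    return subprocess.call(zig_cc_command(source, output))


if __name__ == "__main__":
    # Show the command without running it; hello.c is a made-up file name.
    print(" ".join(zig_cc_command("hello.c", "hello")))
```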
&lt;h4&gt;Playwright Python&lt;/h4&gt;
&lt;p&gt;I spotted another example of this pattern recently in &lt;a href="https://playwright.dev/python/docs/intro"&gt;Playwright Python&lt;/a&gt;. Playwright is Microsoft's open source browser automation and testing framework - a kind of modern Selenium. I used it recently to build my &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; screenshot automation tool.&lt;/p&gt;
&lt;p&gt;Playwright provides a full-featured API for controlling headless (and headful) browser instances, with implementations in Node.js, Python, Java and .NET.&lt;/p&gt;
&lt;p&gt;I was intrigued as to how they had developed such a sophisticated API for four different platforms/languages at once, providing full equivalence for all of their features across all four.&lt;/p&gt;
&lt;p&gt;So I dug around in their Python package (from &lt;code&gt;pip install playwright&lt;/code&gt;) and found this:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;77M ./venv/lib/python3.10/site-packages/playwright/driver/node&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;That's a full copy of the Node.js binary!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;% ./venv/lib/python3.10/site-packages/playwright/driver/node --version
v16.13.0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Playwright Python works by providing a Python layer on top of the existing JavaScript API library. It runs a Node.js process which does the actual work; the Python library just communicates with the JavaScript for you.&lt;/p&gt;
&lt;p&gt;As with Zig, the Playwright team offer &lt;a href="https://pypi.org/project/playwright/#files"&gt;seven pre-compiled wheels&lt;/a&gt; for different platforms. The list today is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;playwright-1.22.0-py3-none-win_amd64.whl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;playwright-1.22.0-py3-none-win32.whl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;playwright-1.22.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;playwright-1.22.0-py3-none-manylinux1_x86_64.whl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;playwright-1.22.0-py3-none-macosx_11_0_universal2.whl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;playwright-1.22.0-py3-none-macosx_11_0_arm64.whl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;playwright-1.22.0-py3-none-macosx_10_13_x86_64.whl&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
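&lt;p&gt;Those filenames follow the standard wheel naming scheme, so the platform each one targets can be read straight out of the name. A rough sketch, assuming the five-part &lt;code&gt;name-version-python-abi-platform&lt;/code&gt; form used above (it ignores the optional build tag, and &lt;code&gt;parse_wheel_filename&lt;/code&gt; is a hypothetical helper, not part of pip):&lt;/p&gt;

```python
def parse_wheel_filename(filename):
    # Wheel names look like name-version-python_tag-abi_tag-platform_tag.whl.
    # This sketch assumes exactly five dash-separated parts (no build tag).
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    name, version, python_tag, abi_tag, platform_tag = filename[:-4].split("-")
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        # A dot separates multiple platform tags packed into one wheel.
        "platforms": platform_tag.split("."),
    }


if __name__ == "__main__":
    info = parse_wheel_filename(
        "playwright-1.22.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl"
    )
    print(info["platforms"])
```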
&lt;p&gt;I wish I could say "you can now &lt;code&gt;pip install&lt;/code&gt; a browser!" but Playwright doesn't actually bundle the browsers themselves - you need to run &lt;code&gt;python -m playwright install&lt;/code&gt; to download those separately.&lt;/p&gt;
&lt;p&gt;Pretty fascinating example of the same pattern though!&lt;/p&gt;
&lt;h4&gt;pip install a SQLite database&lt;/h4&gt;
&lt;p&gt;It's not quite the same thing, since it's not packaging an executable, but the one project I have that fits this mould if you squint a little is my &lt;a href="https://datasette.io/plugins/datasette-basemap"&gt;datasette-basemap&lt;/a&gt; plugin.&lt;/p&gt;
&lt;p&gt;It's a Datasette plugin which bundles a 23MB SQLite database file containing OpenStreetMap tiles for the first seven zoom levels of their world map - 5,461 tile images total.&lt;/p&gt;
&lt;p&gt;I built it so that people could use my &lt;a href="https://datasette.io/plugins/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt; and &lt;a href="https://datasette.io/plugins/datasette-leaflet-geojson"&gt;datasette-leaflet-geojson&lt;/a&gt; entirely standalone, without needing to load tiles from a central tile server.&lt;/p&gt;
&lt;p&gt;You can &lt;a href="https://datasette-tiles-demo.datasette.io/-/tiles/basemap"&gt;play with a demo here&lt;/a&gt;. I wrote more about that project in &lt;a href="https://simonwillison.net/2021/Feb/4/datasette-tiles/"&gt;Serving map tiles from SQLite with MBTiles and datasette-tiles&lt;/a&gt;. It's pretty fun to be able to run &lt;code&gt;pip install datasette-basemap&lt;/code&gt; to install a full map of the world.&lt;/p&gt;
&lt;p&gt;Seen any other interesting examples of &lt;code&gt;pip install&lt;/code&gt; being (ab)used in this way? Ping them to me &lt;a href="https://twitter.com/simonw/status/1528754782311047168"&gt;on Twitter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt;  Paul O'Leary McCann &lt;a href="https://twitter.com/polm23/status/1528937321139122177"&gt;points out&lt;/a&gt; that PyPI has a default 60MB size limit for packages, though it can be raised on a case-by-case basis. He wrote about this in &lt;a href="https://www.dampfkraft.com/code/distributing-large-files-with-pypi.html"&gt;Distributing Large Files with PyPI Packages&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/packaging"&gt;packaging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pypi"&gt;pypi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/zig"&gt;zig&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="packaging"/><category term="pypi"/><category term="python"/><category term="playwright"/><category term="zig"/></entry><entry><title>@newshomepages</title><link href="https://simonwillison.net/2022/Mar/12/newshomepages/#atom-tag" rel="alternate"/><published>2022-03-12T19:21:34+00:00</published><updated>2022-03-12T19:21:34+00:00</updated><id>https://simonwillison.net/2022/Mar/12/newshomepages/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://twitter.com/newshomepages"&gt;@newshomepages&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Ben Welsh used my shot-scraper tool and GitHub Actions to launch a Twitter bot which tweets screenshots of newspaper homepages on a scheduled basis. Ben says: “The tech is so easy, I was able to pull it off in a couple hours at zero cost. A decade ago I ran a similar project using the cloud resources of the day. [...] It costs thousands of dollars and the screenshots were of much lower quality. Incredible progress!”

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/palewire/status/1502679775973834752"&gt;@palewire&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ben-welsh"&gt;ben-welsh&lt;/a&gt;&lt;/p&gt;



</summary><category term="twitter"/><category term="github-actions"/><category term="playwright"/><category term="shot-scraper"/><category term="ben-welsh"/></entry><entry><title>Weeknotes: Distracted by Playwright</title><link href="https://simonwillison.net/2022/Mar/12/weeknotes-playwright/#atom-tag" rel="alternate"/><published>2022-03-12T00:30:26+00:00</published><updated>2022-03-12T00:30:26+00:00</updated><id>https://simonwillison.net/2022/Mar/12/weeknotes-playwright/#atom-tag</id><summary type="html">
    &lt;p&gt;My goal for this week was to unblock progress on Datasette by finally finishing the dash encoding implementation I described last week. I was getting close, and then I got very distracted by &lt;a href="https://playwright.dev/"&gt;Playwright&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;Dash encoding v2&lt;/h4&gt;
&lt;p&gt;In &lt;a href="https://simonwillison.net/2022/Mar/5/dash-encoding/"&gt;Why I invented “dash encoding”, a new encoding scheme for URL paths&lt;/a&gt; I described a new mechanism I had invented for handling the gnarly problem of including table names with &lt;code&gt;/&lt;/code&gt; characters in the URL path on Datasette. The very short version: you can't use URL encoding in a path, because common proxies (including Apache and Nginx) will decode the encoded characters before they get to your application.&lt;/p&gt;
&lt;p&gt;Thanks to feedback on that post I actually changed my design: I'm now using a variant of percent encoding that uses the &lt;code&gt;-&lt;/code&gt; instead of the &lt;code&gt;%&lt;/code&gt;. More &lt;a href="https://github.com/simonw/datasette/issues/1439#issuecomment-1059851259"&gt;details in the issue&lt;/a&gt; - and I'll write this up fully once I've finished landing the change.&lt;/p&gt;
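&lt;p&gt;To illustrate the idea, here is a rough sketch of percent-style encoding with &lt;code&gt;-&lt;/code&gt; as the escape character. It is not the Datasette implementation: the function names are hypothetical, and the set of characters left unescaped (ASCII letters, digits and underscore) is an assumption made for the example.&lt;/p&gt;

```python
import re
import string

# Assumption for this sketch: only these characters pass through unescaped.
SAFE = frozenset(string.ascii_letters + string.digits + "_")


def dash_encode(text):
    # Escape every UTF-8 byte outside SAFE as "-XX" (two uppercase hex digits).
    # The dash itself is outside SAFE, so a literal "-" becomes "-2D".
    parts = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        parts.append(char if char in SAFE else "-{:02X}".format(byte))
    return "".join(parts)


def dash_decode(text):
    # Reverse dash_encode: turn each "-XX" back into the byte it stands for.
    raw = re.sub(
        rb"-([0-9A-F]{2})",
        lambda match: bytes([int(match.group(1), 16)]),
        text.encode("ascii"),
    )
    return raw.decode("utf-8")


if __name__ == "__main__":
    print(dash_encode("table/with-dash"))
```

&lt;p&gt;Because the encoded form contains no &lt;code&gt;%&lt;/code&gt; or &lt;code&gt;/&lt;/code&gt; characters, a proxy that decodes percent escapes has nothing to rewrite.&lt;/p&gt;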
&lt;h4&gt;shot-scraper and Playwright&lt;/h4&gt;
&lt;p&gt;I thoroughly &lt;a href="https://xkcd.com/356/"&gt;nerd-sniped&lt;/a&gt; myself with this one. I started investigating possibilities for automatically generating screenshots for documentation, and realized that &lt;a href="https://playwright.dev/"&gt;Playwright&lt;/a&gt; made this substantially easier than it has been in the past.&lt;/p&gt;
&lt;p&gt;The result was &lt;strong&gt;&lt;a href="https://simonwillison.net/2022/Mar/10/shot-scraper/"&gt;shot-scraper&lt;/a&gt;&lt;/strong&gt; - a new command-line utility for taking screenshots of web pages, or portions of web pages - and for running through a set of screenshots defined in a YAML file.&lt;/p&gt;
&lt;p&gt;I still can't quite believe how quickly this came together.&lt;/p&gt;
&lt;p&gt;Every now and then a tool comes along which adds a fundamental new set of capabilities to your toolbox, and can be multiplied against other tools to open up a huge range of possibilities.&lt;/p&gt;
&lt;p&gt;Playwright feels like one of those tools.&lt;/p&gt;
&lt;p&gt;A quick &lt;code&gt;pip install playwright&lt;/code&gt; is all it takes to start writing robust browser automation tools, using dedicated standalone headless instances of multiple browsers that are installed for you using &lt;code&gt;playwright install&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It's easy to run in CI - getting it working in GitHub Actions was trivial.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;shot-scraper&lt;/code&gt; is my first project built on Playwright, but there will definitely be more.&lt;/p&gt;
&lt;h4&gt;shot-scraper accessibility&lt;/h4&gt;
&lt;p&gt;I started &lt;a href="https://twitter.com/simonw/status/1502044953836503048"&gt;a Twitter conversation&lt;/a&gt; asking for ways to write automated tests that exercise screen readers - not just running audit rules, but actually simulating what happens when a screen reader user attempts to navigate through a specific flow within an application.&lt;/p&gt;
&lt;p&gt;The most interesting answer I had was &lt;a href="https://twitter.com/bmustillrose/status/1502066504401141767"&gt;from Ben Mustill-Rose&lt;/a&gt;, who built a system for automating tests against an Android screen reader while working on BBC iPlayer - &lt;a href="https://youtu.be/-vEHOiIggss?t=253"&gt;demo here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;@fardarter &lt;a href="https://twitter.com/fardarter/status/1502045993667280905"&gt;pointed me&lt;/a&gt; back to Playwright again, which turns out to have an &lt;a href="https://playwright.dev/python/docs/api/class-accessibility"&gt;Accessibility snapshot&lt;/a&gt; mechanism that can dump out the current state of the Chromium accessibility tree.&lt;/p&gt;
&lt;p&gt;I couldn't resist &lt;a href="https://github.com/simonw/shot-scraper/issues/22"&gt;adding that to shot-scraper&lt;/a&gt; - so now you can run the following to see the accessibility tree for a web page:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;~ % shot-scraper accessibility https://datasette.io
{
    "role": "WebArea",
    "name": "Datasette: An open source multi-tool for exploring and publishing data",
    "children": [
        {
            "role": "link",
            "name": "Uses"
        },
        {
            "role": "link",
            "name": "Documentation"
        },
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/simonw/431e9075441463236850bab042b9d20d"&gt;Full output here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As a really fun bonus trick: since the output is JSON, you can pipe it into &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-json-data"&gt;sqlite-utils insert&lt;/a&gt; to get a SQLite database:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper accessibility https://datasette.io \
    | jq .children | sqlite-utils insert \
    /tmp/accessibility.db nodes - --alter
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then open it in &lt;a href="https://datasette.io/desktop"&gt;Datasette Desktop&lt;/a&gt; and start faceting by role and heading level!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/datasette-desktop-accessibility.jpg" alt="Datasette Desktop browsing the nodes table - it has text, link, heading, button and textbox roles and four different heading levels." style="max-width:100%;" /&gt;&lt;/p&gt;
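&lt;p&gt;The same load-then-facet idea can be sketched with just the Python standard library. This is an illustration rather than what &lt;code&gt;sqlite-utils&lt;/code&gt; does internally: the sample JSON is abbreviated, the heading node is made up for the example, and the table schema is an assumption.&lt;/p&gt;

```python
import json
import sqlite3

# Abbreviated sample in the same shape as the accessibility output.
# The heading entry is hypothetical, added so there is more than one role.
SAMPLE = """
{
    "role": "WebArea",
    "name": "Datasette: An open source multi-tool for exploring and publishing data",
    "children": [
        {"role": "link", "name": "Uses"},
        {"role": "link", "name": "Documentation"},
        {"role": "heading", "name": "Example heading", "level": 1}
    ]
}
"""


def load_nodes(tree_json):
    # Equivalent of "jq .children": keep only the child nodes of the tree.
    nodes = json.loads(tree_json)["children"]
    db = sqlite3.connect(":memory:")
    db.execute("create table nodes (role text, name text, level integer)")
    db.executemany(
        "insert into nodes (role, name, level) values (?, ?, ?)",
        [(n.get("role"), n.get("name"), n.get("level")) for n in nodes],
    )
    return db


def facet_by_role(db):
    # Count nodes per role, most common first (a hand-rolled facet).
    sql = "select role, count(*) from nodes group by role order by count(*) desc, role"
    return db.execute(sql).fetchall()


if __name__ == "__main__":
    for role, count in facet_by_role(load_nodes(SAMPLE)):
        print(role, count)
```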
&lt;h4&gt;sqlite-utils documentation improvements&lt;/h4&gt;
&lt;p&gt;I complained on Twitter that the way type information was displayed in the Sphinx &lt;a href="https://sqlite-utils.datasette.io/en/stable/reference.html"&gt;sqlite-utils API reference documentation&lt;/a&gt; was ugly:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/docs-ugly.png" alt="Really long ugly type signatures" style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Adam Johnson &lt;a href="https://twitter.com/AdamChainz/status/1502311047612575745"&gt;pointed me&lt;/a&gt; to the &lt;code&gt;autodoc_typehints = "description"&lt;/code&gt; option which fixes this. I spent a while tidying up the documentation to work better with this, mainly by adding a whole bunch of &lt;code&gt;:param name: description&lt;/code&gt; tags that I had previously omitted. That work happened in &lt;a href="https://github.com/simonw/sqlite-utils/issues/413"&gt;this issue&lt;/a&gt;. I think it looks &lt;em&gt;much&lt;/em&gt; better now:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/docs-pretty.png" alt="Type signatures are much easier to read now, and there's a detailed list of parameters with descriptions." style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;h4&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/image-diff"&gt;image-diff&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/image-diff/releases/tag/0.2.1"&gt;0.2.1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/image-diff/releases"&gt;3 releases total&lt;/a&gt;) - 2022-03-11
&lt;br /&gt;CLI tool for comparing images&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.25.1"&gt;3.25.1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/sqlite-utils/releases"&gt;98 releases total&lt;/a&gt;) - 2022-03-11
&lt;br /&gt;Python CLI utility and library for manipulating SQLite databases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/shot-scraper/releases/tag/0.4"&gt;0.4&lt;/a&gt; - (&lt;a href="https://github.com/simonw/shot-scraper/releases"&gt;5 releases total&lt;/a&gt;) - 2022-03-10
&lt;br /&gt;Automated website screenshots using GitHub Actions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/django-sql-dashboard"&gt;django-sql-dashboard&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/django-sql-dashboard/releases/tag/1.0.2"&gt;1.0.2&lt;/a&gt; - (&lt;a href="https://github.com/simonw/django-sql-dashboard/releases"&gt;34 releases total&lt;/a&gt;) - 2022-03-08
&lt;br /&gt;Django app for building dashboards using raw SQL queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/geojson-to-sqlite"&gt;geojson-to-sqlite&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/geojson-to-sqlite/releases/tag/1.0"&gt;1.0&lt;/a&gt; - (&lt;a href="https://github.com/simonw/geojson-to-sqlite/releases"&gt;8 releases total&lt;/a&gt;) - 2022-03-04
&lt;br /&gt;CLI tool for converting GeoJSON files to SQLite (with SpatiaLite)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/xml-analyser"&gt;xml-analyser&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/xml-analyser/releases/tag/1.3"&gt;1.3&lt;/a&gt; - (&lt;a href="https://github.com/simonw/xml-analyser/releases"&gt;4 releases total&lt;/a&gt;) - 2022-03-01
&lt;br /&gt;Simple command line tool for quickly analysing the structure of an arbitrary XML file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-dateutil"&gt;datasette-dateutil&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-dateutil/releases/tag/0.3"&gt;0.3&lt;/a&gt; - (&lt;a href="https://github.com/simonw/datasette-dateutil/releases"&gt;4 releases total&lt;/a&gt;) - 2022-03-01
&lt;br /&gt;dateutil functions for Datasette&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/datasette/crawling-datasette-with-datasette"&gt;Crawling Datasette with Datasette&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/homebrew/latest-sqlite"&gt;Running the latest SQLite in Datasette using Homebrew&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/macos/python-installer-macos"&gt;Installing Python on macOS with the official Python installer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/gis/natural-earth-in-spatialite-and-datasette"&gt;Natural Earth in SpatiaLite and Datasette&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/pytest/coverage-with-context"&gt;pytest coverage with context&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/accessibility"&gt;accessibility&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/documentation"&gt;documentation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-docs"&gt;sphinx-docs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="accessibility"/><category term="documentation"/><category term="datasette"/><category term="weeknotes"/><category term="sphinx-docs"/><category term="playwright"/><category term="shot-scraper"/></entry><entry><title>shot-scraper: automated screenshots for documentation, built on Playwright</title><link href="https://simonwillison.net/2022/Mar/10/shot-scraper/#atom-tag" rel="alternate"/><published>2022-03-10T00:13:30+00:00</published><updated>2022-03-10T00:13:30+00:00</updated><id>https://simonwillison.net/2022/Mar/10/shot-scraper/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://github.com/simonw/shot-scraper"&gt;shot-scraper&lt;/a&gt; is a new tool that I’ve built to help automate the process of keeping screenshots up-to-date in my documentation. It also doubles as a scraping tool - hence the name - which I picked as a complement to my &lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;git scraping&lt;/a&gt; and &lt;a href="https://simonwillison.net/2022/Feb/2/help-scraping/"&gt;help scraping&lt;/a&gt; techniques.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update 13th March 2022:&lt;/strong&gt; The new &lt;code&gt;shot-scraper javascript&lt;/code&gt; command can now be used to &lt;a href="https://simonwillison.net/2022/Mar/14/scraping-web-pages-shot-scraper/"&gt;scrape web pages from the command line&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update 14th October 2022:&lt;/strong&gt; &lt;a href="https://simonwillison.net/2022/Oct/14/automating-screenshots/"&gt;Automating screenshots for the Datasette documentation using shot-scraper&lt;/a&gt; offers a tutorial introduction to using the tool.&lt;/p&gt;
&lt;h4&gt;The problem&lt;/h4&gt;
&lt;p&gt;I like to include screenshots in documentation. I recently &lt;a href="https://simonwillison.net/2022/Feb/27/datasette-tutorials/"&gt;started writing end-user tutorials&lt;/a&gt; for Datasette, which are particularly image heavy (&lt;a href="https://datasette.io/tutorials/explore"&gt;for example&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;As software changes over time, screenshots get out-of-date. I don't like the idea of stale screenshots, but I also don't want to have to manually recreate them every time I make the tiniest tweak to the visual appearance of my software.&lt;/p&gt;
&lt;h4&gt;Introducing shot-scraper&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;shot-scraper&lt;/code&gt; is a tool for automating this process. You can install it using &lt;code&gt;pip&lt;/code&gt; like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install shot-scraper
shot-scraper install
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That second &lt;code&gt;shot-scraper install&lt;/code&gt; line will install the browser it needs to do its job - more on that later.&lt;/p&gt;
&lt;p&gt;You can use it in two ways. To take a one-off screenshot, you can run it like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper https://simonwillison.net/ -o simonwillison.png
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or if you want to take a set of screenshots in a repeatable way, you can define them in a YAML file that looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;https://simonwillison.net/&lt;/span&gt;
  &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-s"&gt;simonwillison.png&lt;/span&gt;
- &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;https://www.example.com/&lt;/span&gt;
  &lt;span class="pl-ent"&gt;width&lt;/span&gt;: &lt;span class="pl-c1"&gt;400&lt;/span&gt;
  &lt;span class="pl-ent"&gt;height&lt;/span&gt;: &lt;span class="pl-c1"&gt;400&lt;/span&gt;
  &lt;span class="pl-ent"&gt;quality&lt;/span&gt;: &lt;span class="pl-c1"&gt;80&lt;/span&gt;
  &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-s"&gt;example.jpg&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And then use &lt;code&gt;shot-scraper multi&lt;/code&gt; to execute every screenshot in one go:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;% shot-scraper multi shots.yml 
Screenshot of 'https://simonwillison.net/' written to 'simonwillison.png'
Screenshot of 'https://www.example.com/' written to 'example.jpg'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;a href="https://shot-scraper.datasette.io/en/stable/screenshots.html"&gt;The documentation&lt;/a&gt; describes all of the available options you can use when taking a screenshot.&lt;/p&gt;
&lt;p&gt;Each option can be provided to the &lt;code&gt;shot-scraper&lt;/code&gt; one-off tool, or can be embedded in the YAML file for use with &lt;code&gt;shot-scraper multi&lt;/code&gt;.&lt;/p&gt;
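&lt;p&gt;To illustrate how the same options map between the two modes, here is a minimal Python sketch that turns a dictionary shaped like one entry from the YAML file into an argument list for the one-off command. The &lt;code&gt;build_command&lt;/code&gt; helper is hypothetical and not part of &lt;code&gt;shot-scraper&lt;/code&gt;. It assumes every YAML key has a matching &lt;code&gt;--key&lt;/code&gt; flag, so check the documentation before relying on any particular one.&lt;/p&gt;

```python
def build_command(shot):
    """Build a shot-scraper argument list from a YAML-style shot dictionary.

    Hypothetical helper: assumes every key other than "url" maps to a
    "--key" command-line option of the same name.
    """
    args = ["shot-scraper", shot["url"]]
    for key, value in shot.items():
        if key == "url":
            continue
        # Command-line arguments are always strings
        args.extend(["--" + key, str(value)])
    return args


# Mirrors the second entry in the YAML example above
example = {
    "url": "https://www.example.com/",
    "width": 400,
    "height": 400,
    "quality": 80,
    "output": "example.jpg",
}
command = build_command(example)
print(" ".join(command))
```

&lt;p&gt;Passing the resulting list to &lt;code&gt;subprocess.run()&lt;/code&gt; should run the one-off tool with the same settings as the second entry in the YAML file above.&lt;/p&gt;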
&lt;h4&gt;JavaScript and CSS selectors&lt;/h4&gt;
&lt;p&gt;The default behaviour for &lt;code&gt;shot-scraper&lt;/code&gt; is to take a full page screenshot, using a browser width of 1280px.&lt;/p&gt;
&lt;p&gt;For documentation screenshots you probably don't want the whole page though - you likely want to create an image of one specific part of the interface.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;--selector&lt;/code&gt; option allows you to specify an area of the page by CSS selector. The resulting image will consist just of that part of the page.&lt;/p&gt;
&lt;p&gt;What if you want to modify the page in addition to selecting a specific area?&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;--javascript&lt;/code&gt; option lets you pass in a block of JavaScript code which will be injected into the page and executed after the page has loaded, but before the screenshot is taken.&lt;/p&gt;
&lt;p&gt;The combination of these two options - also available as &lt;code&gt;javascript:&lt;/code&gt; and &lt;code&gt;selector:&lt;/code&gt; keys in the YAML file - should be flexible enough to cover the custom screenshot case for documentation.&lt;/p&gt;
&lt;h4 id="a-complex-example"&gt;A complex example&lt;/h4&gt;
&lt;p&gt;To prove to myself that the tool works, I decided to try replicating this screenshot from &lt;a href="https://datasette.io/tutorials/explore"&gt;my tutorial&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I made the original using &lt;a href="https://cleanshot.com/"&gt;CleanShot X&lt;/a&gt;, manually adding the two pink arrows:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/select-facets-original.jpg" alt="A screenshot of a portion of the table interface in Datasette, with a menu open and two pink arrows pointing to menu items" style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;p&gt;This is pretty tricky!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It's not &lt;a href="https://congress-legislators.datasettes.com/legislators/executive_terms?start__startswith=18&amp;amp;type=prez"&gt;this whole page&lt;/a&gt;, just a subset of the page&lt;/li&gt;
&lt;li&gt;The cog menu for one of the columns is open, which means the cog icon needs to be clicked before taking the screenshot&lt;/li&gt;
&lt;li&gt;There are two pink arrows superimposed on the image&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I decided to use just one arrow for the moment, which should hopefully result in a clearer image.&lt;/p&gt;
&lt;p&gt;I started by &lt;a href="https://github.com/simonw/shot-scraper/issues/9#issuecomment-1063314278"&gt;creating my own pink arrow SVG&lt;/a&gt; using Figma:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/pink-arrow.png" alt="A big pink arrow, with a drop shadow" style="width: 200px; max-width:100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I then fiddled around in the Firefox developer console for quite a while, working out the JavaScript needed to trim the page down to the bit I wanted, open the menu and position the arrow.&lt;/p&gt;
&lt;p&gt;With the JavaScript figured out, I pasted it into a YAML file called &lt;code&gt;shot.yml&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;url&lt;/span&gt;: &lt;span class="pl-s"&gt;https://congress-legislators.datasettes.com/legislators/executive_terms?start__startswith=18&amp;amp;type=prez&lt;/span&gt;
  &lt;span class="pl-ent"&gt;javascript&lt;/span&gt;: &lt;span class="pl-s"&gt;|&lt;/span&gt;
&lt;span class="pl-s"&gt;    new Promise(resolve =&amp;gt; {&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Run in a promise so we can sleep 1s at the end&lt;/span&gt;
&lt;span class="pl-s"&gt;      function remove(el) { el.parentNode.removeChild(el);}&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Remove header and footer&lt;/span&gt;
&lt;span class="pl-s"&gt;      remove(document.querySelector('header'));&lt;/span&gt;
&lt;span class="pl-s"&gt;      remove(document.querySelector('footer'));&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Remove most of the children of .content&lt;/span&gt;
&lt;span class="pl-s"&gt;      Array.from(document.querySelectorAll('.content &amp;gt; *:not(.table-wrapper,.suggested-facets)')).map(remove)&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Bit of breathing room for the screenshot&lt;/span&gt;
&lt;span class="pl-s"&gt;      document.body.style.marginTop = '10px';&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Add a bit of padding to .content&lt;/span&gt;
&lt;span class="pl-s"&gt;      var content = document.querySelector('.content');&lt;/span&gt;
&lt;span class="pl-s"&gt;      content.style.width = '820px';&lt;/span&gt;
&lt;span class="pl-s"&gt;      content.style.padding = '10px';&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Open the menu - it's an SVG so we need to use dispatchEvent here&lt;/span&gt;
&lt;span class="pl-s"&gt;      document.querySelector('th.col-executive_id svg').dispatchEvent(new Event('click'));&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Remove all but table header and first 11 rows&lt;/span&gt;
&lt;span class="pl-s"&gt;      Array.from(document.querySelectorAll('tr')).slice(12).map(remove);&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Add a pink SVG arrow&lt;/span&gt;
&lt;span class="pl-s"&gt;      let div = document.createElement('div');&lt;/span&gt;
&lt;span class="pl-s"&gt;      div.innerHTML = `&amp;lt;svg width="104" height="60" fill="none" xmlns="http://www.w3.org/2000/svg"&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;        &amp;lt;g filter="url(#a)"&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;          &amp;lt;path fill-rule="evenodd" clip-rule="evenodd" d="m76.7 1 2 2 .2-.1.1.4 20 20a3.5 3.5 0 0 1 0 5l-20 20-.1.4-.3-.1-1.9 2a3.5 3.5 0 0 1-5.4-4.4l3.2-14.4H4v-12h70.6L71.3 5.4A3.5 3.5 0 0 1 76.7 1Z" fill="#FF31A0"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;        &amp;lt;/g&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;        &amp;lt;defs&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;          &amp;lt;filter id="a" x="0" y="0" width="104" height="59.5" filterUnits="userSpaceOnUse" color-interpolation-filters="sRGB"&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;              &amp;lt;feFlood flood-opacity="0" result="BackgroundImageFix"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;              &amp;lt;feColorMatrix in="SourceAlpha" values="0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 127 0" result="hardAlpha"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;              &amp;lt;feOffset dy="4"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;              &amp;lt;feGaussianBlur stdDeviation="2"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;              &amp;lt;feComposite in2="hardAlpha" operator="out"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;              &amp;lt;feColorMatrix values="0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.25 0"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;              &amp;lt;feBlend in2="BackgroundImageFix" result="effect1_dropShadow_2_26"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;              &amp;lt;feBlend in="SourceGraphic" in2="effect1_dropShadow_2_26" result="shape"/&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;          &amp;lt;/filter&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;        &amp;lt;/defs&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;      &amp;lt;/svg&amp;gt;`;&lt;/span&gt;
&lt;span class="pl-s"&gt;      let svg = div.firstChild;&lt;/span&gt;
&lt;span class="pl-s"&gt;      content.appendChild(svg);&lt;/span&gt;
&lt;span class="pl-s"&gt;      content.style.position = 'relative';&lt;/span&gt;
&lt;span class="pl-s"&gt;      svg.style.position = 'absolute';&lt;/span&gt;
&lt;span class="pl-s"&gt;      // Give the menu time to finish fading in&lt;/span&gt;
&lt;span class="pl-s"&gt;      setTimeout(() =&amp;gt; {&lt;/span&gt;
&lt;span class="pl-s"&gt;        // Position arrow pointing to the 'facet by this' menu item&lt;/span&gt;
&lt;span class="pl-s"&gt;        var pos = document.querySelector('.dropdown-facet').getBoundingClientRect();&lt;/span&gt;
&lt;span class="pl-s"&gt;        svg.style.left = (pos.left - pos.width) + 'px';&lt;/span&gt;
&lt;span class="pl-s"&gt;        svg.style.top = (pos.top - 20) + 'px';&lt;/span&gt;
&lt;span class="pl-s"&gt;        resolve();&lt;/span&gt;
&lt;span class="pl-s"&gt;      }, 1000);&lt;/span&gt;
&lt;span class="pl-s"&gt;    });&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;  &lt;span class="pl-ent"&gt;output&lt;/span&gt;: &lt;span class="pl-s"&gt;annotated-screenshot.png&lt;/span&gt;
  &lt;span class="pl-ent"&gt;selector&lt;/span&gt;: &lt;span class="pl-s"&gt;.content&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And ran this command to generate the screenshot:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper multi shot.yml
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The generated &lt;code&gt;annotated-screenshot.png&lt;/code&gt; image looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/annotated-screenshot.png" alt="A screenshot of the table with the menu open and a single pink arrow pointing to the 'facet by this' menu item" style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I'm pretty happy with this! I think it works very well as a proof of concept for the process.&lt;/p&gt;
&lt;h4 id="how-it-works-playwright"&gt;How it works: Playwright&lt;/h4&gt;
&lt;p&gt;I built the &lt;a href="https://github.com/simonw/shot-scraper/tree/44995cd45ca6c56d34c5c3d131217f7b9170f6f7"&gt;first prototype&lt;/a&gt; of &lt;code&gt;shot-scraper&lt;/code&gt; using Puppeteer, because I had &lt;a href="https://simonwillison.net/2020/Sep/3/weeknotes-airtable-screenshots-dogsheep/"&gt;used that before&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Then I noticed that the &lt;a href="https://www.npmjs.com/package/puppeteer-cli"&gt;puppeteer-cli&lt;/a&gt; package I was using hadn't had an update in two years, which reminded me to check out Playwright.&lt;/p&gt;
&lt;p&gt;I've been looking for an excuse to learn &lt;a href="https://playwright.dev/"&gt;Playwright&lt;/a&gt; for a while now, and this project turned out to be ideal.&lt;/p&gt;
&lt;p&gt;Playwright is Microsoft's open source browser automation framework. They promote it as a testing tool, but it has plenty of applications outside of testing - screenshot automation and screen scraping being two of the most obvious.&lt;/p&gt;
&lt;p&gt;Playwright is comprehensive: it downloads its own custom browser builds, and can run tests across multiple different rendering engines.&lt;/p&gt;
&lt;p&gt;The second prototype used the &lt;a href="https://github.com/simonw/shot-scraper/tree/b3318b2f27ca1526d5a9f06de50cf9900dd4d8d0"&gt;Playwright CLI utility&lt;/a&gt; instead, &lt;a href="https://github.com/simonw/shot-scraper/blob/b3318b2f27ca1526d5a9f06de50cf9900dd4d8d0/shot_scraper/cli.py#L39-L50"&gt;executed via npx&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-en"&gt;run&lt;/span&gt;(
    [
        &lt;span class="pl-s"&gt;"npx"&lt;/span&gt;,
        &lt;span class="pl-s"&gt;"playwright"&lt;/span&gt;,
        &lt;span class="pl-s"&gt;"screenshot"&lt;/span&gt;,
        &lt;span class="pl-s"&gt;"--full-page"&lt;/span&gt;,
        &lt;span class="pl-s1"&gt;url&lt;/span&gt;,
        &lt;span class="pl-s1"&gt;output&lt;/span&gt;,
    ],
    &lt;span class="pl-s1"&gt;capture_output&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;,
)&lt;/pre&gt;
&lt;p&gt;This could take a full page screenshot, but that CLI tool wasn't flexible enough to take screenshots of specific elements. So I needed to switch to the Playwright programmatic API.&lt;/p&gt;
&lt;p&gt;I started out trying to get Python to generate and pass JavaScript to the Node.js library... and then I spotted the official &lt;a href="https://playwright.dev/python/docs/intro"&gt;Playwright for Python&lt;/a&gt; package.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install playwright
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It's amazing! It has the exact same functionality as the JavaScript library - the same classes, the same methods. Everything just works, in both languages.&lt;/p&gt;
&lt;p&gt;I was curious how they pulled this off, so I dug inside the &lt;code&gt;playwright&lt;/code&gt; Python package in my &lt;code&gt;site-packages&lt;/code&gt; folder... and found it bundles a full Node.js binary executable and uses it to bridge the two worlds! What a wild hack.&lt;/p&gt;
&lt;p&gt;Thanks to Playwright, the entire implementation of &lt;code&gt;shot-scraper&lt;/code&gt; is currently just &lt;a href="https://github.com/simonw/shot-scraper/blob/0.3/shot_scraper/cli.py"&gt;181 lines of Python code&lt;/a&gt; - it's all glue code tying together a &lt;a href="https://click.palletsprojects.com/"&gt;Click&lt;/a&gt; command-line interface with some code that calls Playwright to do the actual work.&lt;/p&gt;
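&lt;p&gt;As a rough sketch of what that kind of glue code might look like (this is not the actual &lt;code&gt;shot-scraper&lt;/code&gt; source), the following uses Playwright's Python sync API to load a page, optionally run some JavaScript, and then screenshot either one element or the full page. The &lt;code&gt;take_shot&lt;/code&gt; function and its parameters are names invented for this illustration. The page object is passed in so the logic can be exercised without launching a browser.&lt;/p&gt;

```python
def take_shot(page, url, output, selector=None, javascript=None):
    """Screenshot a URL using an already-open Playwright page.

    Illustrative only: the function name and parameters are assumptions
    made for this sketch, not shot-scraper's real implementation.
    """
    page.goto(url)
    if javascript:
        # Runs after the page has loaded; if the expression returns a
        # promise, evaluate() waits for it to resolve before returning
        page.evaluate(javascript)
    if selector:
        # Capture only the element matching the CSS selector
        page.locator(selector).screenshot(path=output)
    else:
        # Capture the whole scrollable page
        page.screenshot(path=output, full_page=True)


def main():
    # Imported here so take_shot() can be used without Playwright installed
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        take_shot(page, "https://simonwillison.net/", "simonwillison.png")
        browser.close()
```

&lt;p&gt;Calling &lt;code&gt;main()&lt;/code&gt; requires the &lt;code&gt;playwright&lt;/code&gt; package and one of its browser builds to be installed. Because a promise passed to &lt;code&gt;evaluate()&lt;/code&gt; is awaited, JavaScript like the block in the complex example above can delay the screenshot until its timeout has fired.&lt;/p&gt;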
&lt;p&gt;I couldn't be more impressed with Playwright. I'll definitely be using it for other projects - for one thing, I think I'll finally be able to add automated tests to my &lt;a href="https://datasette.io/desktop"&gt;Datasette Desktop&lt;/a&gt; Electron application.&lt;/p&gt;
&lt;h4&gt;Hooking shot-scraper up to GitHub Actions&lt;/h4&gt;
&lt;p&gt;I built &lt;code&gt;shot-scraper&lt;/code&gt; very much with GitHub Actions in mind.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://github.com/simonw/shot-scraper-demo"&gt;shot-scraper-demo&lt;/a&gt; repository is my first live demo of the tool.&lt;/p&gt;
&lt;p&gt;Once a day, it runs &lt;a href="https://github.com/simonw/shot-scraper-demo/blob/3fdd9d3e79f95d9d396aeefd5bf65e85a7700ef4/.github/workflows/shots.yml"&gt;this shots.yml&lt;/a&gt; file, generates two screenshots and commits them back to the repository.&lt;/p&gt;
&lt;p&gt;One of them is the tutorial screenshot described above.&lt;/p&gt;
&lt;p&gt;The other is a screenshot of the list of "recently spotted owls" from &lt;a href="https://www.owlsnearme.com/?place=127871"&gt;this page&lt;/a&gt; on &lt;a href="https://www.owlsnearme.com/"&gt;owlsnearme.com&lt;/a&gt;. I wanted a page that would change on an occasional basis, to demonstrate GitHub's neat image diffing interface.&lt;/p&gt;
&lt;p&gt;I may need to change that demo though! That page includes "spotted 5 hours ago" text, which means that there's almost always a tiny pixel difference, &lt;a href="https://github.com/simonw/shot-scraper-demo/commit/bc86510f49b6f8d6728c9f1880b999c83361dd5a#diff-897c3444fbbb2033cbba5840da4994d01c3f396e0cdf4b0613d7f410db9887e0"&gt;like this one&lt;/a&gt; (use the "swipe" comparison tool to watch 6 hours ago change to 7 hours ago under the top left photo).&lt;/p&gt;
&lt;p&gt;Storing image files that change frequently in a free repository on GitHub feels rude to me, so please use this tool cautiously there!&lt;/p&gt;
&lt;h4&gt;What's next?&lt;/h4&gt;
&lt;p&gt;I had ambitious plans to add utilities to the tool that would &lt;a href="https://github.com/simonw/shot-scraper/issues/9"&gt;help with annotations&lt;/a&gt;, such as adding pink arrows and drawing circles around different elements on the page.&lt;/p&gt;
&lt;p&gt;I've shelved those plans for the moment: as the demo above shows, the JavaScript hook is good enough. I may revisit this later once common patterns have started to emerge.&lt;/p&gt;
&lt;p&gt;So really, my next step is to start using this tool for my own projects - to generate screenshots for my documentation.&lt;/p&gt;
&lt;p&gt;I'm also very interested to see what kinds of things other people use this for.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/documentation"&gt;documentation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scraping"&gt;scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/puppeteer"&gt;puppeteer&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/playwright"&gt;playwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cli"/><category term="documentation"/><category term="projects"/><category term="scraping"/><category term="github-actions"/><category term="git-scraping"/><category term="puppeteer"/><category term="playwright"/><category term="shot-scraper"/></entry></feed>