<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: cors</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/cors.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-04-09T21:31:50+00:00</updated><author><name>Simon Willison</name></author><entry><title>GitHub Repo Size</title><link href="https://simonwillison.net/2026/Apr/9/github-repo-size/#atom-tag" rel="alternate"/><published>2026-04-09T21:31:50+00:00</published><updated>2026-04-09T21:31:50+00:00</updated><id>https://simonwillison.net/2026/Apr/9/github-repo-size/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/github-repo-size"&gt;GitHub Repo Size&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;GitHub doesn't tell you the repo size in the UI, but it's available in the CORS-friendly &lt;a href="https://api.github.com/repos/simonw/datasette"&gt;API&lt;/a&gt;. Paste a repo into this tool to see the size, &lt;a href="https://tools.simonwillison.net/github-repo-size?repo=simonw%2Fdatasette"&gt;for example for simonw/datasette&lt;/a&gt; (8.1MB).&lt;/p&gt;
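    &lt;p&gt;A minimal sketch of that lookup in JavaScript, assuming the &lt;code&gt;size&lt;/code&gt; field of the repository JSON is reported in kilobytes, which is how GitHub's REST API documentation describes it. The &lt;code&gt;formatRepoSize&lt;/code&gt; and &lt;code&gt;fetchRepoSize&lt;/code&gt; names are made up for this illustration and are not taken from the tool itself:&lt;/p&gt;

```javascript
// Hypothetical helper: turn the API's kilobyte count into a short label.
// Assumes 1 MB = 1024 KB; the real tool may round differently.
function formatRepoSize(sizeInKb) {
  if (sizeInKb >= 1024 * 1024) {
    return (sizeInKb / (1024 * 1024)).toFixed(1) + 'GB';
  }
  if (sizeInKb >= 1024) {
    return (sizeInKb / 1024).toFixed(1) + 'MB';
  }
  return sizeInKb + 'KB';
}

// Hypothetical wrapper around the public, CORS-friendly endpoint.
// repo is an "owner/name" string such as "simonw/datasette".
async function fetchRepoSize(repo) {
  const response = await fetch('https://api.github.com/repos/' + repo);
  if (!response.ok) {
    throw new Error('GitHub API returned ' + response.status);
  }
  const data = await response.json();
  return formatRepoSize(data.size);
}
```

    &lt;p&gt;Calling &lt;code&gt;fetchRepoSize('simonw/datasette')&lt;/code&gt; from a page on any origin should resolve to a label along these lines, since the endpoint needs no authentication for public repositories.&lt;/p&gt;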
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cors"/><category term="github"/></entry><entry><title>DNS Lookup</title><link href="https://simonwillison.net/2026/Mar/22/dns/#atom-tag" rel="alternate"/><published>2026-03-22T19:16:30+00:00</published><updated>2026-03-22T19:16:30+00:00</updated><id>https://simonwillison.net/2026/Mar/22/dns/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/dns"&gt;DNS Lookup&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;TIL that Cloudflare's 1.1.1.1 DNS service (and 1.1.1.2 and 1.1.1.3, which block malware and malware + adult content respectively) has a CORS-enabled JSON API, so I &lt;a href="https://github.com/simonw/tools/pull/258#issue-4116864108"&gt;had Claude Code build me&lt;/a&gt; a UI for running DNS queries against all three of those resolvers.&lt;/p&gt;
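    &lt;p&gt;A rough sketch of what a query against that JSON API might look like. The three hostnames below are an assumption about which DNS-over-HTTPS endpoints correspond to the three resolvers, and &lt;code&gt;buildQueryUrl&lt;/code&gt; and &lt;code&gt;lookup&lt;/code&gt; are hypothetical names, not code from the tool:&lt;/p&gt;

```javascript
// Assumed DNS-over-HTTPS endpoints for the three resolvers; check
// Cloudflare's documentation before relying on them.
const RESOLVERS = {
  '1.1.1.1': 'https://cloudflare-dns.com/dns-query',
  '1.1.1.2': 'https://security.cloudflare-dns.com/dns-query',
  '1.1.1.3': 'https://family.cloudflare-dns.com/dns-query',
};

// Build the query URL with URLSearchParams so values are escaped safely.
function buildQueryUrl(resolver, name, type) {
  const params = new URLSearchParams({ name: name, type: type });
  return RESOLVERS[resolver] + '?' + params.toString();
}

// Hypothetical helper: run one lookup and return the parsed JSON.
// The accept header asks for the JSON format rather than DNS wire format.
async function lookup(resolver, name, type) {
  const response = await fetch(buildQueryUrl(resolver, name, type), {
    headers: { accept: 'application/dns-json' },
  });
  if (!response.ok) {
    throw new Error('DNS query failed with status ' + response.status);
  }
  return response.json();
}
```

    &lt;p&gt;Running the same &lt;code&gt;lookup()&lt;/code&gt; against all three keys of &lt;code&gt;RESOLVERS&lt;/code&gt; is one way to compare how the filtering resolvers answer for a given name.&lt;/p&gt;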
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/dns"&gt;dns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cloudflare"&gt;cloudflare&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="dns"/><category term="cors"/><category term="cloudflare"/></entry><entry><title>NICAR 2026 Schedule</title><link href="https://simonwillison.net/2026/Mar/4/nicar-2026/#atom-tag" rel="alternate"/><published>2026-03-04T21:34:26+00:00</published><updated>2026-03-04T21:34:26+00:00</updated><id>https://simonwillison.net/2026/Mar/4/nicar-2026/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/nicar-2026"&gt;NICAR 2026 Schedule&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;The &lt;a href="https://www.ire.org/training/conferences/nicar-2026/"&gt;NICAR&lt;/a&gt; data journalism conference provides a copy of the schedule as CORS-enabled JSON, so I vibe coded my own mobile-friendly schedule app to help me keep track of the sessions I want to attend.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nicar"&gt;nicar&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="cors"/><category term="vibe-coding"/><category term="nicar"/></entry><entry><title>The most popular blogs of Hacker News in 2025</title><link href="https://simonwillison.net/2026/Jan/2/most-popular-blogs-of-hacker-news/#atom-tag" rel="alternate"/><published>2026-01-02T19:10:43+00:00</published><updated>2026-01-02T19:10:43+00:00</updated><id>https://simonwillison.net/2026/Jan/2/most-popular-blogs-of-hacker-news/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://refactoringenglish.com/blog/2025-hn-top-5/"&gt;The most popular blogs of Hacker News in 2025&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Michael Lynch maintains &lt;a href="https://refactoringenglish.com/tools/hn-popularity/"&gt;HN Popularity Contest&lt;/a&gt;, a site that tracks personal blogs on Hacker News and scores them based on how well they perform on that platform.&lt;/p&gt;
&lt;p&gt;The engine behind the project is the &lt;a href="https://github.com/mtlynch/hn-popularity-contest-data/blob/master/data/domains-meta.csv"&gt;domains-meta.csv&lt;/a&gt; CSV on GitHub, a hand-curated list of known personal blogs with author, bio and tag metadata, which Michael uses to separate out personal blog posts from other types of content.&lt;/p&gt;
&lt;p&gt;I came top of the rankings in 2023, 2024 and 2025 but I'm listed &lt;a href="https://refactoringenglish.com/tools/hn-popularity/"&gt;in third place&lt;/a&gt; for all time behind Paul Graham and Brian Krebs.&lt;/p&gt;
&lt;p&gt;I dug around in the browser inspector and was delighted to find that the data powering the site is served with open CORS headers, which means you can easily explore it with external services like Datasette Lite.&lt;/p&gt;
&lt;p&gt;Here's a convoluted window function query Claude Opus 4.5 &lt;a href="https://claude.ai/share/8e1cb294-0ff0-4d5b-b83f-58e4c7fdb0d2"&gt;wrote for me&lt;/a&gt; which, for a given domain, shows where that domain ranked for each year since it first appeared in the dataset:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s"&gt;with yearly_scores as (&lt;/span&gt;
&lt;span class="pl-s"&gt;  select &lt;/span&gt;
&lt;span class="pl-s"&gt;    domain,&lt;/span&gt;
&lt;span class="pl-s"&gt;    strftime('%Y', date) as year,&lt;/span&gt;
&lt;span class="pl-s"&gt;    sum(score) as total_score,&lt;/span&gt;
&lt;span class="pl-s"&gt;    count(distinct date) as days_mentioned&lt;/span&gt;
&lt;span class="pl-s"&gt;  from "hn-data"&lt;/span&gt;
&lt;span class="pl-s"&gt;  group by domain, strftime('%Y', date)&lt;/span&gt;
&lt;span class="pl-s"&gt;),&lt;/span&gt;
&lt;span class="pl-s"&gt;ranked as (&lt;/span&gt;
&lt;span class="pl-s"&gt;  select &lt;/span&gt;
&lt;span class="pl-s"&gt;    domain,&lt;/span&gt;
&lt;span class="pl-s"&gt;    year,&lt;/span&gt;
&lt;span class="pl-s"&gt;    total_score,&lt;/span&gt;
&lt;span class="pl-s"&gt;    days_mentioned,&lt;/span&gt;
&lt;span class="pl-s"&gt;    rank() over (partition by year order by total_score desc) as rank&lt;/span&gt;
&lt;span class="pl-s"&gt;  from yearly_scores&lt;/span&gt;
&lt;span class="pl-s"&gt;)&lt;/span&gt;
&lt;span class="pl-s"&gt;select &lt;/span&gt;
&lt;span class="pl-s"&gt;  r.year,&lt;/span&gt;
&lt;span class="pl-s"&gt;  r.total_score,&lt;/span&gt;
&lt;span class="pl-s"&gt;  r.rank,&lt;/span&gt;
&lt;span class="pl-s"&gt;  r.days_mentioned&lt;/span&gt;
&lt;span class="pl-s"&gt;from ranked r&lt;/span&gt;
&lt;span class="pl-s"&gt;where r.domain = :domain&lt;/span&gt;
&lt;span class="pl-s"&gt;  and r.year &amp;gt;= (&lt;/span&gt;
&lt;span class="pl-s"&gt;    select min(strftime('%Y', date)) &lt;/span&gt;
&lt;span class="pl-s"&gt;    from "hn-data"&lt;/span&gt;
&lt;span class="pl-s"&gt;    where domain = :domain&lt;/span&gt;
&lt;span class="pl-s"&gt;  )&lt;/span&gt;
&lt;span class="pl-s"&gt;order by r.year desc&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;(I just noticed that the last &lt;code&gt;and r.year &amp;gt;= (&lt;/code&gt; clause isn't actually needed here.)&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://lite.datasette.io/?csv=https://hn-popularity.cdn.refactoringenglish.com/hn-data.csv#/data?sql=with+yearly_scores+as+%28%0A++select+%0A++++domain%2C%0A++++strftime%28%27%25Y%27%2C+date%29+as+year%2C%0A++++sum%28score%29+as+total_score%2C%0A++++count%28distinct+date%29+as+days_mentioned%0A++from+%22hn-data%22%0A++group+by+domain%2C+strftime%28%27%25Y%27%2C+date%29%0A%29%2C%0Aranked+as+%28%0A++select+%0A++++domain%2C%0A++++year%2C%0A++++total_score%2C%0A++++days_mentioned%2C%0A++++rank%28%29+over+%28partition+by+year+order+by+total_score+desc%29+as+rank%0A++from+yearly_scores%0A%29%0Aselect+%0A++r.year%2C%0A++r.total_score%2C%0A++r.rank%2C%0A++r.days_mentioned%0Afrom+ranked+r%0Awhere+r.domain+%3D+%3Adomain%0A++and+r.year+%3E%3D+%28%0A++++select+min%28strftime%28%27%25Y%27%2C+date%29%29+%0A++++from+%22hn-data%22%0A++++where+domain+%3D+%3Adomain%0A++%29%0Aorder+by+r.year+desc&amp;amp;domain=simonwillison.net"&gt;simonwillison.net results&lt;/a&gt; show me ranked 3rd in 2022, 30th in 2021 and 85th back in 2007 - though I expect there are many personal blogs from that year which haven't yet been manually added to Michael's list.&lt;/p&gt;
&lt;p&gt;Also useful is that every domain gets its own CORS-enabled CSV file with details of the actual Hacker News submissions from that domain, e.g. &lt;code&gt;https://hn-popularity.cdn.refactoringenglish.com/domains/simonwillison.net.csv&lt;/code&gt;. Here's &lt;a href="https://lite.datasette.io/?csv=https://hn-popularity.cdn.refactoringenglish.com/domains/simonwillison.net.csv#/data/simonwillison"&gt;that one in Datasette Lite&lt;/a&gt;.&lt;/p&gt;
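&lt;p&gt;The URL pattern above can be sketched as two small helpers. Both function names are hypothetical, and the pattern is inferred only from the two URLs shown here, so treat it as an assumption rather than a documented scheme:&lt;/p&gt;

```javascript
// Hypothetical helper: per-domain CSV URL, following the example above.
function domainCsvUrl(domain) {
  return (
    'https://hn-popularity.cdn.refactoringenglish.com/domains/' +
    domain +
    '.csv'
  );
}

// Hypothetical helper: a Datasette Lite link that loads a CSV by URL.
// This only works because the CSV is served with open CORS headers.
function datasetteLiteUrl(csvUrl) {
  return 'https://lite.datasette.io/?csv=' + csvUrl;
}
```

&lt;p&gt;For example, &lt;code&gt;datasetteLiteUrl(domainCsvUrl('simonwillison.net'))&lt;/code&gt; builds the same base link as the one above, without the trailing &lt;code&gt;#/data/...&lt;/code&gt; fragment.&lt;/p&gt;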

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=46465819"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/hacker-news"&gt;hacker-news&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;



</summary><category term="hacker-news"/><category term="sql"/><category term="sqlite"/><category term="datasette"/><category term="datasette-lite"/><category term="cors"/></entry><entry><title>Introducing gisthost.github.io</title><link href="https://simonwillison.net/2026/Jan/1/gisthost/#atom-tag" rel="alternate"/><published>2026-01-01T22:12:20+00:00</published><updated>2026-01-01T22:12:20+00:00</updated><id>https://simonwillison.net/2026/Jan/1/gisthost/#atom-tag</id><summary type="html">
    &lt;p&gt;I am a huge fan of &lt;a href="https://gistpreview.github.io/"&gt;gistpreview.github.io&lt;/a&gt;, the site by Leon Huang that lets you append &lt;code&gt;?GIST_id&lt;/code&gt; to see a browser-rendered version of an HTML page that you have saved to a Gist. The last commit was ten years ago and I needed a couple of small changes so I've forked it and deployed an updated version at &lt;a href="https://gisthost.github.io/"&gt;gisthost.github.io&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="some-background-on-gistpreview"&gt;Some background on gistpreview&lt;/h4&gt;
&lt;p&gt;The genius thing about &lt;code&gt;gistpreview.github.io&lt;/code&gt; is that it's a core piece of GitHub infrastructure, hosted and cost-covered entirely by GitHub, that wasn't built with any involvement from GitHub at all.&lt;/p&gt;
&lt;p&gt;To understand how it works we need to first talk about Gists.&lt;/p&gt;
&lt;p&gt;Any file hosted in a &lt;a href="https://gist.github.com/"&gt;GitHub Gist&lt;/a&gt; can be accessed via a direct URL that looks like this:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;https://gist.githubusercontent.com/simonw/d168778e8e62f65886000f3f314d63e3/raw/79e58f90821aeb8b538116066311e7ca30c870c9/index.html&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;That URL is served with a few key HTTP headers:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These ensure that every file is treated by browsers as plain text, so an HTML file will not be rendered, even by older browsers that attempt to guess the content type based on the content.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Via: 1.1 varnish
Cache-Control: max-age=300
X-Served-By: cache-sjc1000085-SJC
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These confirm that the file is served via GitHub's caching CDN, which means I don't feel guilty about linking to these files in potentially high-traffic scenarios.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Access-Control-Allow-Origin: *
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is my favorite HTTP header! It means I can hit these files with a &lt;code&gt;fetch()&lt;/code&gt; call from any domain on the internet, which is fantastic for building &lt;a href="https://simonwillison.net/2025/Dec/10/html-tools/"&gt;HTML tools&lt;/a&gt; that do useful things with content hosted in a Gist.&lt;/p&gt;
&lt;p&gt;The one big catch is that Content-Type header. It means you can't use a Gist to serve HTML files that people can view.&lt;/p&gt;
&lt;p&gt;That's where &lt;code&gt;gistpreview&lt;/code&gt; comes in. The &lt;code&gt;gistpreview.github.io&lt;/code&gt; site belongs to the dedicated &lt;a href="https://github.com/gistpreview"&gt;gistpreview&lt;/a&gt; GitHub organization, and is served out of the &lt;a href="https://github.com/gistpreview/gistpreview.github.io"&gt;github.com/gistpreview/gistpreview.github.io&lt;/a&gt; repository by GitHub Pages.&lt;/p&gt;
&lt;p&gt;It's not much code. The key functionality is this snippet of JavaScript from &lt;a href="https://github.com/gistpreview/gistpreview.github.io/blob/master/main.js"&gt;main.js&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'https://api.github.com/gists/'&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;gistId&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;res&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;res&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;res&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;status&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-c1"&gt;200&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
    &lt;span class="pl-smi"&gt;console&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;log&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;res&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt; &lt;span class="pl-c"&gt;// debug&lt;/span&gt;
    &lt;span class="pl-k"&gt;throw&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;Error&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'Gist &amp;lt;strong&amp;gt;'&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;gistId&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s"&gt;'&amp;lt;/strong&amp;gt;, '&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;message&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;replace&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-pds"&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;span class="pl-cce"&gt;\(&lt;/span&gt;.&lt;span class="pl-c1"&gt;*&lt;/span&gt;&lt;span class="pl-cce"&gt;\)&lt;/span&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;''&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;info&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;fileName&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-s"&gt;''&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;var&lt;/span&gt; &lt;span class="pl-s1"&gt;file&lt;/span&gt; &lt;span class="pl-k"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;info&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;files&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c"&gt;// index.html or the first file&lt;/span&gt;
      &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;fileName&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-s"&gt;''&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt; &lt;span class="pl-s1"&gt;file&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-s"&gt;'index.html'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-s1"&gt;fileName&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;file&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;info&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;files&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;hasOwnProperty&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;fileName&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-c1"&gt;false&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;throw&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;Error&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'File &amp;lt;strong&amp;gt;'&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;fileName&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s"&gt;'&amp;lt;/strong&amp;gt; is not exist'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-k"&gt;var&lt;/span&gt; &lt;span class="pl-s1"&gt;content&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;info&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;files&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s1"&gt;fileName&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;write&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This chain of promises fetches the Gist content from the GitHub API, finds the section of that JSON corresponding to the requested file name and then outputs it to the page like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;write&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is smart. Injecting the content using &lt;code&gt;document.body.innerHTML = content&lt;/code&gt; would fail to execute inline scripts. Using &lt;code&gt;document.write()&lt;/code&gt; causes the browser to treat the HTML as if it was directly part of the parent page.&lt;/p&gt;
&lt;p&gt;That's pretty much the whole trick! Read the Gist ID from the query string, fetch the content via the JSON API and &lt;code&gt;document.write()&lt;/code&gt; it into the page.&lt;/p&gt;
&lt;p&gt;Here's a demo:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://gistpreview.github.io/?d168778e8e62f65886000f3f314d63e3"&gt;https://gistpreview.github.io/?d168778e8e62f65886000f3f314d63e3&lt;/a&gt;&lt;/p&gt;
&lt;h4 id="fixes-for-gisthost-github-io"&gt;Fixes for gisthost.github.io&lt;/h4&gt;
&lt;p&gt;I forked &lt;code&gt;gistpreview&lt;/code&gt; to add two new features:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A workaround for Substack mangling the URLs&lt;/li&gt;
&lt;li&gt;The ability to serve larger files that get truncated in the JSON API&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I also removed some dependencies (jQuery and Bootstrap and an old &lt;code&gt;fetch()&lt;/code&gt; polyfill) and inlined the JavaScript into &lt;a href="https://github.com/gisthost/gisthost.github.io/blob/main/index.html"&gt;a single index.html file&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The Substack issue was small but frustrating. If you email out a link to a &lt;code&gt;gistpreview&lt;/code&gt; page via Substack it modifies the URL to look like this:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://gistpreview.github.io/?f40971b693024fbe984a68b73cc283d2=&amp;amp;utm_source=substack&amp;amp;utm_medium=email"&gt;https://gistpreview.github.io/?f40971b693024fbe984a68b73cc283d2=&amp;amp;utm_source=substack&amp;amp;utm_medium=email&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This breaks &lt;code&gt;gistpreview&lt;/code&gt; because it treats &lt;code&gt;f40971b693024fbe984a68b73cc283d2=&amp;amp;utm_source...&lt;/code&gt; as the Gist ID.&lt;/p&gt;
&lt;p&gt;The fix is to read everything up to that equals sign. I &lt;a href="https://github.com/gistpreview/gistpreview.github.io/pull/7"&gt;submitted a PR&lt;/a&gt; for that back in November.&lt;/p&gt;
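&lt;p&gt;A minimal sketch of that idea, not the code from the pull request. &lt;code&gt;extractGistId&lt;/code&gt; is a hypothetical name, and the sketch assumes the Gist ID is always the first thing in the query string:&lt;/p&gt;

```javascript
// Hypothetical helper, not the code from the pull request.
// search is the value of location.search, starting with "?".
function extractGistId(search) {
  const query = search.startsWith('?') ? search.slice(1) : search;
  // Keep only the text before the first "=" (if there is one), which
  // drops the empty value and tracking parameters that Substack appends.
  return query.split('=')[0];
}
```

&lt;p&gt;An unmodified URL has no equals sign, so &lt;code&gt;split('=')&lt;/code&gt; returns the whole query string and the original behavior is unchanged.&lt;/p&gt;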
&lt;p&gt;The second issue around truncated files was &lt;a href="https://github.com/simonw/claude-code-transcripts/issues/26#issuecomment-3699668871"&gt;reported against my claude-code-transcripts project&lt;/a&gt; a few days ago.&lt;/p&gt;
&lt;p&gt;That project provides a CLI tool for exporting HTML rendered versions of Claude Code sessions. It includes a &lt;code&gt;--gist&lt;/code&gt; option which uses the &lt;code&gt;gh&lt;/code&gt; CLI tool to publish the resulting HTML to a Gist and returns a gistpreview URL that the user can share.&lt;/p&gt;
&lt;p&gt;These exports can get pretty big, and some of the resulting HTML was past the size limit of what comes back from the Gist API.&lt;/p&gt;
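&lt;p&gt;One way to handle that, sketched under the assumption that each file object in the Gist API response carries the &lt;code&gt;truncated&lt;/code&gt; flag and &lt;code&gt;raw_url&lt;/code&gt; field described in GitHub's API documentation. The function names are hypothetical and this is not necessarily how the fork implements it:&lt;/p&gt;

```javascript
// Hypothetical helper: does this file object need a second request?
// Assumes the "truncated" flag described in GitHub's Gist API docs.
function needsRawFetch(file) {
  return file.truncated === true;
}

// Hypothetical helper: return the full text of one file object taken from
// the "files" map of a Gist API response.
// fetchImpl is injectable so the function can be exercised without a network.
async function loadFileContent(file, fetchImpl = fetch) {
  if (!needsRawFetch(file)) {
    // Small files arrive complete in the JSON response.
    return file.content;
  }
  // Large files: fall back to the raw URL, which holds the whole file.
  const response = await fetchImpl(file.raw_url);
  if (!response.ok) {
    throw new Error('Raw fetch failed with status ' + response.status);
  }
  return response.text();
}
```

&lt;p&gt;The fallback relies on the raw file URL being served with &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt;, as described earlier, so the second request can also be made from the browser.&lt;/p&gt;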
&lt;p&gt;As of &lt;a href="https://github.com/simonw/claude-code-transcripts/releases/tag/0.5"&gt;claude-code-transcripts 0.5&lt;/a&gt; the &lt;code&gt;--gist&lt;/code&gt; option now publishes to &lt;a href="https://gisthost.github.io/"&gt;gisthost.github.io&lt;/a&gt; instead, fixing both bugs.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://gisthost.github.io/?02ced545666128ce4206103df6185536"&gt;the Claude Code transcript&lt;/a&gt; that refactored Gist Host to remove those dependencies, which I published to Gist Host using the following command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx claude-code-transcripts web --gist
&lt;/code&gt;&lt;/pre&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="github"/><category term="http"/><category term="javascript"/><category term="projects"/><category term="ai-assisted-programming"/><category term="cors"/></entry><entry><title>Using Claude in Chrome to navigate out the Cloudflare dashboard</title><link href="https://simonwillison.net/2025/Dec/22/claude-chrome-cloudflare/#atom-tag" rel="alternate"/><published>2025-12-22T16:10:30+00:00</published><updated>2025-12-22T16:10:30+00:00</updated><id>https://simonwillison.net/2025/Dec/22/claude-chrome-cloudflare/#atom-tag</id><summary type="html">
    &lt;p&gt;I just had my first success using a browser agent - in this case the &lt;a href="https://support.claude.com/en/articles/12012173-getting-started-with-claude-in-chrome"&gt;Claude in Chrome extension&lt;/a&gt; - to solve an actual problem.&lt;/p&gt;
&lt;p&gt;A while ago I set things up so anything served from the &lt;code&gt;https://static.simonwillison.net/static/cors-allow/&lt;/code&gt; directory of my S3 bucket would have open &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt; headers. This is useful for hosting files online that can be loaded into web applications hosted on other domains.&lt;/p&gt;
&lt;p&gt;Problem is I couldn't remember how I did it! I initially thought it was an S3 setting, but it turns out S3 lets you set CORS at the bucket-level but not for individual prefixes.&lt;/p&gt;
&lt;p&gt;I then suspected Cloudflare, but I find the Cloudflare dashboard really difficult to navigate.&lt;/p&gt;
&lt;p&gt;So I decided to give Claude in Chrome a go. I installed and enabled the extension (you then have to click the little puzzle icon and click "pin" next to Claude for the icon to appear; I had to ask Claude itself for help figuring that out), signed into Cloudflare, opened the Claude panel and prompted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;I'm trying to figure out how come all pages under http://static.simonwillison.net/static/cors/ have an open CORS policy, I think I set that up through Cloudflare but I can't figure out where&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Off it went. It took 1m45s to find exactly what I needed.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of Chrome - on the left is the Cloudflare dashboard. On the right a Claude chat panel which shows the final result: Summary  Found it! Your open CORS policy for the /static/cors/ directory on static.simonwillison.net is configured through Cloudflare Transform Rules, specifically a Response Header Transform Rule named static.simonwillis.net/static/cors-allow/*.  Here's exactly what's set up:  Rule Configuration:      Type: HTTP Response Header Transform Rule     Applies to: Any URL matching https://static.simonwillison.net/static/cors-allow/* (wildcard match)     Action: Modifies the response header access-control-allow-origin and sets it to *  This is why all pages under that path have an open CORS policy — the * value for Access-Control-Allow-Origin means any origin can access those resources.  Where to find/modify it in the dashboard:      Go to your domain settings (simonwillison.net)     Navigate to Rules &amp;gt; Rules Overview (in the left sidebar)     Scroll down to Response Header Transform Rules (should show 2 active)     Click on the rule named static.simonwillis.net/static/cors-allow/* to view or edit it  " src="https://static.simonwillison.net/static/2025/claude-chrome-cloudflare.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Claude's conclusion:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Found it! Your open CORS policy for the &lt;code&gt;/static/cors/&lt;/code&gt; directory on &lt;code&gt;static.simonwillison.net&lt;/code&gt; is configured through &lt;strong&gt;Cloudflare Transform Rules&lt;/strong&gt;, specifically a &lt;strong&gt;Response Header Transform Rule&lt;/strong&gt; named &lt;code&gt;static.simonwillis.net/static/cors-allow/*&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There's no "share transcript" option but I used copy and paste and two gnarly Claude Code sessions (&lt;a href="https://gistpreview.github.io/?56adf4212345d9967c22aab1362b847b"&gt;one&lt;/a&gt;, &lt;a href="https://gistpreview.github.io/?1d5f524616bef403cdde4bc92da5b0ba"&gt;two&lt;/a&gt;) to turn it into an HTML transcript which &lt;a href="https://static.simonwillison.net/static/2025/claude-chrome-transcript.html"&gt;you can take a look at here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I remain deeply skeptical of the entire browsing agent category due to my concerns about prompt injection risks—I watched what it was doing here like a &lt;em&gt;hawk&lt;/em&gt;—but I have to admit this was a very positive experience.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/chrome"&gt;chrome&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cloudflare"&gt;cloudflare&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/browser-agents"&gt;browser-agents&lt;/a&gt;&lt;/p&gt;



</summary><category term="chrome"/><category term="ai"/><category term="cloudflare"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="cors"/><category term="ai-agents"/><category term="browser-agents"/></entry><entry><title>Bluesky Thread Viewer thread by @simonwillison.net</title><link href="https://simonwillison.net/2025/Nov/28/bluesky-thread-viewer/#atom-tag" rel="alternate"/><published>2025-11-28T23:57:22+00:00</published><updated>2025-11-28T23:57:22+00:00</updated><id>https://simonwillison.net/2025/Nov/28/bluesky-thread-viewer/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tools.simonwillison.net/bluesky-thread.html?url=https%3A%2F%2Fbsky.app%2Fprofile%2Fsimonwillison.net%2Fpost%2F3m6pmebfass24&amp;amp;view=thread"&gt;Bluesky Thread Viewer thread by @simonwillison.net&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I've been having a lot of fun hacking on my Bluesky Thread Viewer JavaScript tool with Claude Code recently. Here it renders a thread (complete with &lt;a href="https://bsky.app/profile/simonwillison.net/post/3m6pmebfass24"&gt;demo video&lt;/a&gt;) talking about the latest improvements to the tool itself.&lt;/p&gt;
&lt;p&gt;&lt;img alt="This short animated GIF demo starts with the Thread by @simonwillison.net page where a URL to a Bluesky post has been entered and a Fetch Thread button clicked. The thread is shown as a nested collection of replies. A &amp;quot;Hide other replies&amp;quot; button hides the replies revealing just the top-level self-replies by the original author - and turns into a &amp;quot;Show 11 other replies&amp;quot; button when toggled. There are tabs for Thread View and Most Recent First - the latter when clicked shows a linear list of posts with the most recent at the top. There are &amp;quot;Copy&amp;quot; and &amp;quot;Copy JSON&amp;quot; green buttons at the top of the page." src="https://static.simonwillison.net/static/2025/bluesky-thread-viewer-demo.gif" /&gt;&lt;/p&gt;
&lt;p&gt;I've been mostly vibe-coding this thing since April, now spanning &lt;a href="https://github.com/simonw/tools/commits/main/bluesky-thread.html"&gt;15 commits&lt;/a&gt; with contributions from ChatGPT, Claude, Claude Code for Web and Claude Code on my laptop. Each of those commits links to the transcript that created the changes in the commit.&lt;/p&gt;
&lt;p&gt;Bluesky is a &lt;em&gt;lot&lt;/em&gt; of fun to build tools like this against because the API supports CORS (so you can talk to it from an HTML+JavaScript page hosted anywhere) and doesn't require authentication.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bluesky"&gt;bluesky&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="projects"/><category term="tools"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="cors"/><category term="bluesky"/><category term="vibe-coding"/><category term="coding-agents"/><category term="claude-code"/></entry><entry><title>ChatGPT Canvas can make API requests now, but it's complicated</title><link href="https://simonwillison.net/2024/Dec/10/chatgpt-canvas/#atom-tag" rel="alternate"/><published>2024-12-10T21:49:55+00:00</published><updated>2024-12-10T21:49:55+00:00</updated><id>https://simonwillison.net/2024/Dec/10/chatgpt-canvas/#atom-tag</id><summary type="html">
    &lt;p&gt;Today's &lt;a href="https://openai.com/12-days/?day=4"&gt;12 Days of OpenAI&lt;/a&gt; release concerned &lt;a href="https://help.openai.com/en/articles/9930697-what-is-the-canvas-feature-in-chatgpt-and-how-do-i-use-it"&gt;ChatGPT Canvas&lt;/a&gt;, a new ChatGPT feature that enables ChatGPT to pop open a side panel with a shared editor in it where you can collaborate with ChatGPT on editing a document or writing code.&lt;/p&gt;
&lt;p&gt;I'm always excited to see a new form of UI on top of LLMs, and it's great seeing OpenAI stretch out beyond pure chat for this. It's definitely worth playing around with to get a feel for how a collaborative human+LLM interface can work. The feature where you can ask ChatGPT for "comments on my document" and it will attach them Google Docs style is particularly neat.&lt;/p&gt;
&lt;p&gt;I wanted to focus in on one particular aspect of Canvas, because it illustrates a concept I've been talking about for a little while now: the increasing complexity of fully understanding the capabilities of core LLM tools.&lt;/p&gt;
&lt;h4 id="canvas-runs-python-via-pyodide"&gt;Canvas runs Python via Pyodide&lt;/h4&gt;
&lt;p&gt;If a canvas editor contains Python code, ChatGPT adds a new "Run" button at the top of the editor.&lt;/p&gt;
&lt;p&gt;ChatGPT has had the ability to run Python for a long time via the excellent &lt;a href="https://simonwillison.net/tags/code-interpreter/"&gt;Code Interpreter&lt;/a&gt; feature, which executes Python server-side in a tightly locked down Kubernetes container managed by OpenAI.&lt;/p&gt;
&lt;p&gt;The new Canvas run button is &lt;strong&gt;not the same thing&lt;/strong&gt; - it's an entirely new implementation of code execution that runs code directly in your browser using &lt;a href="https://pyodide.org/"&gt;Pyodide&lt;/a&gt; (Python compiled to WebAssembly).&lt;/p&gt;
&lt;p&gt;The first time I tried this button I got the following dialog:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/run-python-code.jpg" alt="Run Python code? Python in canvas can make network requests and interact with external systems. Please review your code carefully before proceeding." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;"Python in canvas can make network requests"‽ This is a &lt;em&gt;very new&lt;/em&gt; capability. ChatGPT Code Interpreter has all network access blocked, but apparently ChatGPT Canvas Python does not share that limitation.&lt;/p&gt;
&lt;p&gt;I tested this a little bit and it turns out it can make direct HTTP calls from your browser to anywhere online with compatible CORS headers.&lt;/p&gt;
&lt;p&gt;(Understanding CORS is &lt;a href="https://simonwillison.net/search/?q=cors&amp;amp;sort=date&amp;amp;tag=llms"&gt;a recurring theme&lt;/a&gt; in working with LLMs as a consumer, which I find deeply amusing because it remains a pretty obscure topic even among professional web developers.)&lt;/p&gt;
&lt;p&gt;&lt;a href="https://simonwillison.net/tags/claude-artifacts/"&gt;Claude Artifacts&lt;/a&gt; allow full JavaScript execution in a Canvas-like interface within Claude, but even those are severely restricted in terms of the endpoints they can access. OpenAI have apparently made the opposite decision, throwing everything wide open as far as allowed network request targets go.&lt;/p&gt;
&lt;p&gt;I prompted ChatGPT like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;use python to fetch "https://datasette.io/content.json?sql=select+*+from+stats++limit+10%0D%0A&amp;amp;_shape=array" and then display it nicely - the JSON looks like this:&lt;/code&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[
  {
    "package": "airtable-export",
    "date": "2020-12-14",
    "downloads": 2
  },
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;I often find pasting the first few lines of a larger JSON example into an LLM gives it enough information to guess the rest.&lt;/p&gt;
&lt;p&gt;Here's the result. ChatGPT wrote the code and showed it in a canvas, then I clicked "Run" and had the resulting data displayed in a neat table below:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/canvas-python.jpg" alt="Two columns. On the left is my chat with my prompt. On the right Python code, with a table below showing the results of the API call." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;What a neat and interesting thing! I can now get ChatGPT to write me Python code that fetches from external APIs and displays me the results.&lt;/p&gt;
&lt;p&gt;It's not yet as powerful as Claude Artifacts which allows for completely custom HTML+CSS+JavaScript interfaces, but it's also &lt;em&gt;more&lt;/em&gt; powerful than Artifacts because those are not allowed to make outbound HTTP requests at all.&lt;/p&gt;
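&lt;p&gt;As a rough sketch of what this kind of fetch-and-display code might look like (this is not the code ChatGPT generated, which isn't reproduced here), the hypothetical &lt;code&gt;fetch_rows&lt;/code&gt; and &lt;code&gt;rows_to_table&lt;/code&gt; helpers below load a JSON array and render it as a plain-text table. It assumes ordinary CPython with &lt;code&gt;urllib.request&lt;/code&gt;; under Pyodide the network call would likely need to go through a browser-backed fetch instead. The sample row is the one from the prompt above.&lt;/p&gt;

```python
import json
import urllib.request


def fetch_rows(url):
    """Fetch a JSON array of objects from a URL.

    Assumption: plain CPython. Inside Pyodide this call may need to be
    swapped for a browser-backed fetch, and the server must send
    compatible CORS headers.
    """
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def rows_to_table(rows):
    """Render a list of flat dicts as an aligned plain-text table."""
    if not rows:
        return "(no rows)"
    columns = list(rows[0].keys())
    # Each column is as wide as its header or its longest value.
    widths = {
        col: max(len(col), *(len(str(row.get(col, ""))) for row in rows))
        for col in columns
    }
    header = "  ".join(col.ljust(widths[col]) for col in columns)
    divider = "  ".join("-" * widths[col] for col in columns)
    body = [
        "  ".join(str(row.get(col, "")).ljust(widths[col]) for col in columns)
        for row in rows
    ]
    return "\n".join([header, divider] + body)


# Sample row copied from the JSON snippet in the prompt.
sample = [{"package": "airtable-export", "date": "2020-12-14", "downloads": 2}]
print(rows_to_table(sample))
```

&lt;p&gt;Passing the result of &lt;code&gt;fetch_rows(url)&lt;/code&gt; to &lt;code&gt;rows_to_table&lt;/code&gt; would print the live data in the same shape, provided the endpoint returns a JSON array of flat objects.&lt;/p&gt;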
&lt;h4 id="what-this-all-means"&gt;What this all means&lt;/h4&gt;
&lt;p&gt;With the introduction of Canvas, here are some new points that an expert user of ChatGPT now needs to understand:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ChatGPT can write and then execute code in Python, but there are two different ways it can do that:
&lt;ul&gt;
&lt;li&gt;If run using Code Interpreter it can access files you upload to it and &lt;a href="https://github.com/simonw/scrape-openai-code-interpreter/blob/main/packages.txt"&gt;a collection of built-in libraries&lt;/a&gt; but cannot make API requests.&lt;/li&gt;
&lt;li&gt;If run in a Canvas it uses Pyodide and can access API endpoints, but not files that you upload to it.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Code Interpreter cannot &lt;code&gt;pip install&lt;/code&gt; additional packages, though you may be able to &lt;a href="https://til.simonwillison.net/llms/code-interpreter-expansions"&gt;upload them as wheels&lt;/a&gt; and convince it to install them.&lt;/li&gt;
&lt;li&gt;Canvas Python can install extra packages using &lt;a href="https://micropip.pyodide.org/en/stable/project/usage.html"&gt;micropip&lt;/a&gt;, but this will only work for pure Python wheels that are compatible with Pyodide.&lt;/li&gt;
&lt;li&gt;Code Interpreter is locked down: it cannot make API requests or communicate with the wider internet at all. If you want it to work on data you need to upload that data to it.&lt;/li&gt;
&lt;li&gt;Canvas Python can fetch data via API requests (directly into your browser), but only from sources that implement an open CORS policy.&lt;/li&gt;
&lt;li&gt;Both Canvas and Code Interpreter remain strictly limited in terms of the custom UI they can offer - but they both have access to the Pandas ecosystem of visualization tools so they can probably show you charts or tables.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="this-is-really-really-confusing"&gt;This is really, really confusing&lt;/h4&gt;
&lt;p&gt;Do you find this all hopelessly confusing? I don't blame you. I'm a professional web developer and a Python engineer of 20+ years and I can just about understand and internalize the above set of rules.&lt;/p&gt;
&lt;p&gt;I don't really have any suggestions for where we go from here. This stuff is &lt;em&gt;hard to use&lt;/em&gt;. The more features and capabilities we pile onto these systems the harder it becomes to obtain true mastery of them and really understand what they can do and how best to put them into practice.&lt;/p&gt;
&lt;p&gt;Maybe this doesn't matter? I don't know anyone with true mastery of Excel - to the point where they could compete in &lt;a href="https://fmworldcup.com/microsoft-excel-world-championship/"&gt;last week's Microsoft Excel World Championship&lt;/a&gt; - and yet plenty of people derive enormous value from Excel despite only scratching the surface of what it can do.&lt;/p&gt;
&lt;p&gt;I do think it's worth remembering this as a general theme though. Chatbots may sound easy to use, but they really aren't - and they're getting harder to use all the time.&lt;/p&gt;
&lt;h4 id="exfiltration"&gt;A new data exfiltration vector&lt;/h4&gt;
&lt;p&gt;Thinking about this a little more, I think the most meaningful potential security impact from this could be opening up a new data exfiltration vector.&lt;/p&gt;
&lt;p&gt;Data exfiltration attacks occur when an attacker tricks someone into pasting malicious instructions into their prompt (often via a &lt;a href="https://simonwillison.net/tags/prompt-injection/"&gt;prompt injection attack&lt;/a&gt;) that cause ChatGPT to gather up any available private information from the current conversation and leak it to that attacker in some way.&lt;/p&gt;
&lt;p&gt;I imagine it may be possible to construct a pretty gnarly attack that convinces ChatGPT to open up a Canvas and then run Python that leaks any gathered private data to the attacker via an API call.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/usability"&gt;usability&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pyodide"&gt;pyodide&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-artifacts"&gt;claude-artifacts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-to-app"&gt;prompt-to-app&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="python"/><category term="security"/><category term="usability"/><category term="ai"/><category term="webassembly"/><category term="pyodide"/><category term="openai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="claude-artifacts"/><category term="cors"/><category term="prompt-to-app"/></entry><entry><title>Prompt GPT-4o audio</title><link href="https://simonwillison.net/2024/Oct/28/prompt-gpt-4o-audio/#atom-tag" rel="alternate"/><published>2024-10-28T04:38:28+00:00</published><updated>2024-10-28T04:38:28+00:00</updated><id>https://simonwillison.net/2024/Oct/28/prompt-gpt-4o-audio/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tools.simonwillison.net/openai-audio-output"&gt;Prompt GPT-4o audio&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A week and a half ago &lt;a href="https://simonwillison.net/2024/Oct/18/openai-audio/"&gt;I built a tool&lt;/a&gt; for experimenting with OpenAI's new audio input. I just put together the other side of that, for experimenting with audio output.&lt;/p&gt;
&lt;p&gt;Once you've provided an API key (which is saved in localStorage) you can use this to prompt the &lt;code&gt;gpt-4o-audio-preview&lt;/code&gt; model with a system and regular prompt and select a voice for the response.&lt;/p&gt;
&lt;p&gt;&lt;img class="blogmark-image" style="width: 90%" src="https://static.simonwillison.net/static/2024/openai-audio-output.jpg" alt="Screenshot of a text-to-speech interface showing a system prompt &amp;quot;Speak with a thick french accent, speaking fast&amp;quot;, user prompt &amp;quot;Tell me all about pelicans, in just a sentence&amp;quot;, voice dropdown set to &amp;quot;Alloy&amp;quot;, audio player at 0:13/0:13, and generated text about pelicans: &amp;quot;Pelicans are large waterbirds with a distinctive pouch under their beak, known for their impressive fishing skills as they dive into the water to catch fish, often working together in groups to herd their prey.&amp;quot; Also shows a Generate Speech button, Download Audio button, and partial API response with id &amp;quot;chatcmpl-ANBZcJi4DbN06f9i7z51Uy9SCVtZr&amp;quot; and object &amp;quot;chat.completion&amp;quot;"&gt;&lt;/p&gt;
&lt;p&gt;I built it with assistance from Claude: &lt;a href="https://gist.github.com/simonw/43bc2c59a5d1dc317076713c7f3870d0"&gt;initial app&lt;/a&gt;, &lt;a href="https://gist.github.com/simonw/9ed87231c365164d6b7328aa04a16b59"&gt;adding system prompt support&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can preview and download the resulting &lt;code&gt;wav&lt;/code&gt; file, and you can also copy out the raw JSON. If you save &lt;em&gt;that&lt;/em&gt; in a Gist you can then feed its Gist ID to &lt;code&gt;https://tools.simonwillison.net/gpt-4o-audio-player?gist=GIST_ID_HERE&lt;/code&gt; (&lt;a href="https://gist.github.com/simonw/88e8789c329a70ec5f68328f2cf60767"&gt;Claude transcript&lt;/a&gt;) to play it back again.&lt;/p&gt;
&lt;p&gt;You can try using that to listen to &lt;a href="https://tools.simonwillison.net/gpt-4o-audio-player?gist=4a982d3fe7ba8cb4c01e89c69a4a5335"&gt;my French accented pelican description&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There's something really interesting to me here about this form of application which exists entirely as HTML and JavaScript that uses CORS to talk to various APIs. GitHub's Gist API is accessible via CORS too, so it wouldn't take much more work to add a "save" button which writes out a new Gist after prompting for a personal access token. I &lt;a href="https://gist.github.com/simonw/e0a784d258925e84af2a00c98d61accc"&gt;prototyped that a bit here&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/audio"&gt;audio&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-4"&gt;gpt-4&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-artifacts"&gt;claude-artifacts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-3-5-sonnet"&gt;claude-3-5-sonnet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/multi-modal-output"&gt;multi-modal-output&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-to-app"&gt;prompt-to-app&lt;/a&gt;&lt;/p&gt;



</summary><category term="audio"/><category term="github"/><category term="javascript"/><category term="tools"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="gpt-4"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude"/><category term="claude-artifacts"/><category term="claude-3-5-sonnet"/><category term="cors"/><category term="multi-modal-output"/><category term="prompt-to-app"/></entry><entry><title>My Jina Reader tool</title><link href="https://simonwillison.net/2024/Oct/14/my-jina-reader-tool/#atom-tag" rel="alternate"/><published>2024-10-14T16:47:56+00:00</published><updated>2024-10-14T16:47:56+00:00</updated><id>https://simonwillison.net/2024/Oct/14/my-jina-reader-tool/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tools.simonwillison.net/jina-reader"&gt;My Jina Reader tool&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I wanted to feed the &lt;a href="https://developers.cloudflare.com/durable-objects/api/storage-api/"&gt;Cloudflare Durable Objects SQLite&lt;/a&gt; documentation into Claude, but I was on my iPhone so copying and pasting was inconvenient. Jina offer a &lt;a href="https://jina.ai/reader/"&gt;Reader API&lt;/a&gt; which can turn any URL into LLM-friendly Markdown and it turns out it supports CORS, so I &lt;a href="https://gist.github.com/simonw/053b271e023ed1b834529e2fbd0efc3b"&gt;got Claude to build me this tool&lt;/a&gt; (&lt;a href="https://gist.github.com/simonw/e56d55e6a87a547faac7070eb912b32d"&gt;second iteration&lt;/a&gt;, &lt;a href="https://gist.github.com/simonw/e0a841a580038d15c7bf22bd7d104ce3"&gt;third iteration&lt;/a&gt;, &lt;a href="https://github.com/simonw/tools/blob/main/jina-reader.html"&gt;final source code&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Paste in a URL to get the Jina Markdown version, along with an all-important "Copy to clipboard" button.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/jina-reader.jpg" class="blogmark-image" style="max-width: 90%"&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/markdown"&gt;markdown&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-3-5-sonnet"&gt;claude-3-5-sonnet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jina"&gt;jina&lt;/a&gt;&lt;/p&gt;



</summary><category term="projects"/><category term="markdown"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude"/><category term="claude-3-5-sonnet"/><category term="cors"/><category term="jina"/></entry><entry><title>Calling LLMs from client-side JavaScript, converting PDFs to HTML + weeknotes</title><link href="https://simonwillison.net/2024/Sep/6/weeknotes/#atom-tag" rel="alternate"/><published>2024-09-06T02:28:38+00:00</published><updated>2024-09-06T02:28:38+00:00</updated><id>https://simonwillison.net/2024/Sep/6/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;I've been having a bunch of fun taking advantage of CORS-enabled LLM APIs to build client-side JavaScript applications that access LLMs directly. I also spun up a new Datasette plugin for advanced permission management.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Sep/6/weeknotes/#llms-from-client-side-javascript"&gt;LLMs from client-side JavaScript&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Sep/6/weeknotes/#converting-pdfs-to-html-and-markdown"&gt;Converting PDFs to HTML and Markdown&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Sep/6/weeknotes/#adding-some-class-to-datasette-forms"&gt;Adding some class to Datasette forms&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Sep/6/weeknotes/#on-the-blog"&gt;On the blog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Sep/6/weeknotes/#releases"&gt;Releases&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Sep/6/weeknotes/#tils"&gt;TILs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="llms-from-client-side-javascript"&gt;LLMs from client-side JavaScript&lt;/h4&gt;
&lt;p&gt;Anthropic &lt;a href="https://simonwillison.net/2024/Aug/23/anthropic-dangerous-direct-browser-access/"&gt;recently added CORS support&lt;/a&gt; to their Claude APIs. It's a little hard to use - you have to add &lt;code&gt;anthropic-dangerous-direct-browser-access: true&lt;/code&gt; to your request headers to enable it - but once you know the trick you can start building web applications that talk to Anthropic's LLMs directly, without any additional server-side code.&lt;/p&gt;
&lt;p&gt;I later found out that both OpenAI and Google Gemini have this capability too, without needing the special header.&lt;/p&gt;
&lt;p&gt;The problem with this approach is security: it's very important not to embed an API key attached to your billing account in client-side HTML and JavaScript for anyone to see!&lt;/p&gt;
&lt;p&gt;For my purposes though that doesn't matter. I've been building tools which &lt;code&gt;prompt()&lt;/code&gt; a user for their own API key (sadly restricting their usage to the tiny portion of people who both understand API keys and have created API accounts with one of the big providers) - then I stash that key in &lt;code&gt;localStorage&lt;/code&gt; and start using it to make requests.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://github.com/simonw/tools"&gt;simonw/tools&lt;/a&gt; repository is home to a growing collection of pure HTML+JavaScript tools, hosted at &lt;a href="https://tools.simonwillison.net/"&gt;tools.simonwillison.net&lt;/a&gt; using GitHub Pages. I love not having to even think about hosting server-side code for these tools.&lt;/p&gt;
&lt;p&gt;I've published three tools there that talk to LLMs directly so far:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://tools.simonwillison.net/haiku"&gt;haiku&lt;/a&gt; is a fun demo that requests access to the user's camera and then writes a Haiku about what it sees. It uses Anthropic's Claude 3 Haiku model for this - the whole project is one terrible pun. &lt;a href="https://github.com/simonw/tools/blob/main/haiku.html"&gt;Haiku source code here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://tools.simonwillison.net/gemini-bbox"&gt;gemini-bbox&lt;/a&gt; uses the Gemini 1.5 Pro (or Flash) API to prompt those models to return bounding boxes for objects in an image, then renders those bounding boxes. Gemini Pro is the only one of the vision LLMs I've tried that has reliable support for bounding boxes. I wrote about this in &lt;a href="https://simonwillison.net/2024/Aug/26/gemini-bounding-box-visualization/"&gt;Building a tool showing how Gemini Pro can return bounding boxes for objects in images&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://tools.simonwillison.net/gemini-chat"&gt;Gemini Chat App&lt;/a&gt; is a more traditional LLM chat interface that again talks to Gemini models (including the new super-speedy &lt;code&gt;gemini-1.5-flash-8b-exp-0827&lt;/code&gt;). I built this partly to try out those new models and partly to experiment with implementing a streaming chat interface against the Gemini API directly in a browser. I wrote more about how that works &lt;a href="https://simonwillison.net/2024/Aug/27/gemini-chat-app/"&gt;in this post&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's that Gemini Bounding Box visualization tool:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/goats-bbox-fixed.jpg" alt="Gemini API Image Bounding Box Visualization - browse for file goats.jpeg, prompt is Return bounding boxes as JSON arrays [ymin, xmin, ymax, xmax] - there follows output coordinates and then a red and a green box around the goats in a photo, with grid lines showing the coordinates from 0-1000 on both axes" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;All three of these tools made heavy use of AI-assisted development: Claude 3.5 Sonnet wrote almost every line of the last two, and the Haiku one was put together a few months ago using Claude 3 Opus.&lt;/p&gt;
&lt;p&gt;My personal style of HTML and JavaScript apps turns out to be highly compatible with LLMs: I like using vanilla HTML and JavaScript and keeping everything in the same file, which makes it easy to paste the entire thing into the model and ask it to make some changes for me. This approach also works really well with &lt;a href="https://simonwillison.net/tags/claude-artifacts/"&gt;Claude Artifacts&lt;/a&gt;, though I have to tell it "no React" to make sure I get an artifact I can hack on without needing to configure a React build step.&lt;/p&gt;
&lt;h4 id="converting-pdfs-to-html-and-markdown"&gt;Converting PDFs to HTML and Markdown&lt;/h4&gt;
&lt;p&gt;I have a long standing vendetta against PDFs for sharing information. They're painful to read on a mobile phone, they have poor accessibility, and even things like copying and pasting text from them can be a pain.&lt;/p&gt;
&lt;p&gt;Complaining without doing something about it isn't really my style. Twice in the past few weeks I've taken matters into my own hands:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Google Research released &lt;a href="https://research.google/pubs/sql-has-problems-we-can-fix-them-pipe-syntax-in-sql/"&gt;a PDF paper&lt;/a&gt; describing their new pipe syntax for SQL. I ran it through Gemini 1.5 Pro to convert it to HTML (&lt;a href="https://simonwillison.net/2024/Aug/24/pipe-syntax-in-sql/"&gt;prompts here&lt;/a&gt;) and &lt;a href="https://static.simonwillison.net/static/2024/Pipe-Syntax-In-SQL.html"&gt;got this&lt;/a&gt; - a pretty great initial result for the first prompt I tried!&lt;/li&gt;
&lt;li&gt;Nous Research released &lt;a href="https://github.com/NousResearch/DisTrO/blob/main/A_Preliminary_Report_on_DisTrO.pdf"&gt;a preliminary report PDF&lt;/a&gt; about their DisTro technology for distributed training of LLMs over low-bandwidth connections. I &lt;a href="https://simonwillison.net/2024/Aug/27/distro/"&gt;ran a prompt&lt;/a&gt; to use Gemini 1.5 Pro to convert that to &lt;a href="https://gist.github.com/simonw/46a33d66e069efe5c10b63625fdabb4e"&gt;this Markdown version&lt;/a&gt;, which even handled tables.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Within six hours of posting it my Pipe Syntax in SQL conversion was ranked third on Google for the title of the paper, at which point I set it to &lt;code&gt;&amp;lt;meta name="robots" content="noindex"&amp;gt;&lt;/code&gt; to try and keep the unverified clone out of search. Yet more evidence that HTML is better than PDF!&lt;/p&gt;
&lt;p&gt;I've spent less than a total of ten minutes on using Gemini to convert PDFs in this way and the results have been very impressive. If I were to spend more time on this I'd target figures: I have a hunch that getting Gemini to return bounding boxes for figures on the PDF pages could be the key here, since then each figure could be automatically extracted as an image.&lt;/p&gt;
&lt;p&gt;I bet you could build that whole thing as a client-side app against the Gemini Pro API, too...&lt;/p&gt;
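&lt;p&gt;The coordinate arithmetic behind that figure-extraction idea is small. Here's a minimal sketch, assuming the model returns boxes as &lt;code&gt;[ymin, xmin, ymax, xmax]&lt;/code&gt; on a 0-1000 grid (as in the bounding box screenshot above) and that you know the pixel size of the rendered page. The &lt;code&gt;box_to_pixels&lt;/code&gt; helper and the page dimensions are hypothetical, made up for illustration.&lt;/p&gt;

```python
def box_to_pixels(box, width, height, scale=1000):
    """Convert [ymin, xmin, ymax, xmax] on a 0..scale grid to pixels.

    Returns (left, top, right, bottom). Assumption: the grid maps
    linearly onto the full image, with 0 at the top-left corner.
    """
    ymin, xmin, ymax, xmax = box
    left = round(xmin * width / scale)
    top = round(ymin * height / scale)
    right = round(xmax * width / scale)
    bottom = round(ymax * height / scale)
    return (left, top, right, bottom)


# Hypothetical box on a hypothetical 1200x800 pixel page image.
print(box_to_pixels([250, 100, 750, 500], width=1200, height=800))
# prints (120, 200, 600, 600)
```

&lt;p&gt;A tuple in that order could then be handed to an image library's crop function to pull each figure out as its own image, though whether the returned boxes are tight enough for that is something I'd have to test.&lt;/p&gt;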
&lt;h4 id="adding-some-class-to-datasette-forms"&gt;Adding some class to Datasette forms&lt;/h4&gt;
&lt;p&gt;I've been working on a new Datasette plugin for permissions management, &lt;a href="https://github.com/datasette/datasette-acl"&gt;datasette-acl&lt;/a&gt;, which I'll write about separately soon.&lt;/p&gt;
&lt;p&gt;I wanted to integrate &lt;a href="https://github.com/Choices-js/Choices"&gt;Choices.js&lt;/a&gt; with it, to provide a nicer interface for adding permissions to a user or group.&lt;/p&gt;
&lt;p&gt;My first attempt at integrating Choices ended up looking like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/datasette-acl-choices-bug.jpg" alt="The choices elements have big ugly blank boxes displayed where the remove icon should be. The Firefox DevTools console is open revealing CSS properties set on form button type=button, explaining the visual glitches" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The weird visual glitches are caused by Datasette's core CSS, which included &lt;a href="https://github.com/simonw/datasette/blob/92c4d41ca605e0837a2711ee52fde9cf1eea74d0/datasette/static/app.css#L553-L564"&gt;the following rule&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-css"&gt;&lt;pre&gt;&lt;span class="pl-ent"&gt;form&lt;/span&gt; &lt;span class="pl-ent"&gt;input&lt;/span&gt;[&lt;span class="pl-c1"&gt;type&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;submit&lt;/span&gt;]&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-ent"&gt;form&lt;/span&gt; &lt;span class="pl-ent"&gt;button&lt;/span&gt;[&lt;span class="pl-c1"&gt;type&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;button&lt;/span&gt;] {
    &lt;span class="pl-c1"&gt;font-weight&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-c1"&gt;400&lt;/span&gt;;
    &lt;span class="pl-c1"&gt;cursor&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; pointer;
    &lt;span class="pl-c1"&gt;text-align&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; center;
    &lt;span class="pl-c1"&gt;vertical-align&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; middle;
    &lt;span class="pl-c1"&gt;border-width&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-c1"&gt;1&lt;span class="pl-smi"&gt;px&lt;/span&gt;&lt;/span&gt;;
    &lt;span class="pl-c1"&gt;border-style&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; solid;
    &lt;span class="pl-c1"&gt;padding&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-c1"&gt;.5&lt;span class="pl-smi"&gt;em&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-c1"&gt;0.8&lt;span class="pl-smi"&gt;em&lt;/span&gt;&lt;/span&gt;;
    &lt;span class="pl-c1"&gt;font-size&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-c1"&gt;0.9&lt;span class="pl-smi"&gt;rem&lt;/span&gt;&lt;/span&gt;;
    &lt;span class="pl-c1"&gt;line-height&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-c1"&gt;1&lt;/span&gt;;
    &lt;span class="pl-c1"&gt;border-radius&lt;/span&gt;&lt;span class="pl-kos"&gt;:&lt;/span&gt; &lt;span class="pl-c1"&gt;.25&lt;span class="pl-smi"&gt;rem&lt;/span&gt;&lt;/span&gt;;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;These style rules apply to &lt;em&gt;any&lt;/em&gt; submit button or button-button that occurs inside a form!&lt;/p&gt;
&lt;p&gt;I'm glad I caught this before Datasette 1.0. I've now &lt;a href="https://github.com/simonw/datasette/issues/2415"&gt;started the process of fixing that&lt;/a&gt;, by ensuring these rules only apply to elements with &lt;code&gt;class="core"&lt;/code&gt; (or that class on a wrapping element). This ensures plugins can style these elements without being caught out by Datasette's defaults.&lt;/p&gt;
&lt;p&gt;The problem is... there are a whole bunch of existing plugins that currently rely on that behaviour. I have &lt;a href="https://github.com/simonw/datasette/issues/2417"&gt;a tracking issue&lt;/a&gt; about that, which identified 28 plugins that need updating. I've worked my way through 8 of those so far, hence the flurry of releases listed at the bottom of this post.&lt;/p&gt;
&lt;p&gt;This is also an excuse to revisit a bunch of older plugins, some of which had partially complete features that I've been finishing up.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/datasette-write"&gt;datasette-write&lt;/a&gt; for example now has &lt;a href="https://github.com/simonw/datasette-write/issues/10"&gt;a neat row action menu item&lt;/a&gt; for updating a selected row using a pre-canned UPDATE query. Here's an animated demo of my first prototype of that feature:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/datasette-write-row.gif" alt="Animated demo - on the row page for a release I click row actions and select Update using SQL, which navigates to a page with a big UPDATE SQL query and a form showing all of the existing values." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="on-the-blog"&gt;On the blog&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;anthropic&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/23/anthropic-dangerous-direct-browser-access"&gt;Claude's API now supports CORS requests, enabling client-side applications&lt;/a&gt; - 2024-08-23&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/23/explain-acls"&gt;Explain ACLs by showing me a SQLite table schema for implementing them&lt;/a&gt; - 2024-08-23&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/24/oauth-llms"&gt;Musing about OAuth and LLMs on Mastodon&lt;/a&gt; - 2024-08-24&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/26/gemini-bounding-box-visualization"&gt;Building a tool showing how Gemini Pro can return bounding boxes for objects in images&lt;/a&gt; - 2024-08-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/26/long-context-prompting-tips"&gt;Long context prompting tips&lt;/a&gt; - 2024-08-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/26/anthropic-system-prompts"&gt;Anthropic Release Notes: System Prompts&lt;/a&gt; - 2024-08-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/26/alex-albert"&gt;Alex Albert: We've read and heard that you'd appreciate more t...&lt;/a&gt; - 2024-08-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/27/gemini-chat-app"&gt;Gemini Chat App&lt;/a&gt; - 2024-08-27&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/28/system-prompt-for-townie"&gt;System prompt for val.town/townie&lt;/a&gt; - 2024-08-28&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/28/how-anthropic-built-artifacts"&gt;How Anthropic built Artifacts&lt;/a&gt; - 2024-08-28&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/30/anthropic-prompt-engineering-interactive-tutorial"&gt;Anthropic's Prompt Engineering Interactive Tutorial&lt;/a&gt; - 2024-08-30&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/30/llm-claude-3"&gt;llm-claude-3 0.4.1&lt;/a&gt; - 2024-08-30&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;ai-assisted-programming&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/24/andy-jassy-amazon-ceo"&gt;Andy Jassy, Amazon CEO: [...] here’s what we found when we integrated [Am...&lt;/a&gt; - 2024-08-24&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/26/ai-powered-git-commit-function"&gt;AI-powered Git Commit Function&lt;/a&gt; - 2024-08-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/30/openai-file-search"&gt;OpenAI: Improve file search result relevance with chunk ranking&lt;/a&gt; - 2024-08-30&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/31/forrest-brazeal"&gt;Forrest Brazeal: I think that AI has killed, or is about to kill, ...&lt;/a&gt; - 2024-08-31&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;gemini&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/24/pipe-syntax-in-sql"&gt;SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL&lt;/a&gt; - 2024-08-24&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/27/distro"&gt;NousResearch/DisTrO&lt;/a&gt; - 2024-08-27&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;python&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Sep/1/uvtrick"&gt;uvtrick&lt;/a&gt; - 2024-09-01&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Sep/2/anatomy-of-a-textual-user-interface"&gt;Anatomy of a Textual User Interface&lt;/a&gt; - 2024-09-02&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Sep/2/why-i-still-use-python-virtual-environments-in-docker"&gt;Why I Still Use Python Virtual Environments in Docker&lt;/a&gt; - 2024-09-02&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Sep/3/python-developers-survey-2023"&gt;Python Developers Survey 2023 Results&lt;/a&gt; - 2024-09-03&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;security&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/23/microsoft-copilot-data-governance"&gt;Top companies ground Microsoft Copilot over data governance concerns&lt;/a&gt; - 2024-08-23&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/26/frederik-braun"&gt;Frederik Braun: In 2021 we [the Mozilla engineering team] found “...&lt;/a&gt; - 2024-08-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Sep/5/oauth-from-first-principles"&gt;OAuth from First Principles&lt;/a&gt; - 2024-09-05&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;projects&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/25/covidsewage-alt-text"&gt;My @covidsewage bot now includes useful alt text&lt;/a&gt; - 2024-08-25&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;armin-ronacher&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/27/minijinja"&gt;MiniJinja: Learnings from Building a Template Engine in Rust&lt;/a&gt; - 2024-08-27&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;ethics&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/27/john-gruber"&gt;John Gruber: Everyone alive today has grown up in a world wher...&lt;/a&gt; - 2024-08-27&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;open-source&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/27/open-source-ai"&gt;Debate over “open source AI” term brings new push to formalize definition&lt;/a&gt; - 2024-08-27&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/29/elasticsearch-is-open-source-again"&gt;Elasticsearch is open source, again&lt;/a&gt; - 2024-08-29&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;performance&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/28/cerebras-inference"&gt;Cerebras Inference: AI at Instant Speed&lt;/a&gt; - 2024-08-28&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;sqlite&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/28/d-richard-hipp"&gt;D. Richard Hipp: My goal is to keep SQLite relevant and viable thr...&lt;/a&gt; - 2024-08-28&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;aws&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/30/leader-election-with-s3-conditional-writes"&gt;Leader Election With S3 Conditional Writes&lt;/a&gt; - 2024-08-30&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;javascript&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/31/andreas-giammarchi"&gt;Andreas Giammarchi: whenever you do this: `el.innerHTML += HTML`  ...&lt;/a&gt; - 2024-08-31&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;openai&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/31/openai-says-chatgpt-usage-has-doubled-since-last-year"&gt;OpenAI says ChatGPT usage has doubled since last year&lt;/a&gt; - 2024-08-31&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;art&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Aug/31/ted-chiang"&gt;Ted Chiang: Art is notoriously hard to define, and so are the...&lt;/a&gt; - 2024-08-31&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;llm&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Sep/3/anjor"&gt;anjor: `history | tail -n 2000 | llm -s "Write aliases f...&lt;/a&gt; - 2024-09-03&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;vision-llms&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2024/Sep/4/qwen2-vl"&gt;Qwen2-VL: To See the World More Clearly&lt;/a&gt; - 2024-09-04&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="releases"&gt;Releases&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-import/releases/tag/0.1a5"&gt;datasette-import 0.1a5&lt;/a&gt;&lt;/strong&gt; - 2024-09-04&lt;br /&gt;Tools for importing data into Datasette&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-search-all/releases/tag/1.1.3"&gt;datasette-search-all 1.1.3&lt;/a&gt;&lt;/strong&gt; - 2024-09-04&lt;br /&gt;Datasette plugin for searching all searchable tables at once&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-write/releases/tag/0.4"&gt;datasette-write 0.4&lt;/a&gt;&lt;/strong&gt; - 2024-09-04&lt;br /&gt;Datasette plugin providing a UI for executing SQL writes against the database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-debug-events/releases/tag/0.1a0"&gt;datasette-debug-events 0.1a0&lt;/a&gt;&lt;/strong&gt; - 2024-09-03&lt;br /&gt;Print Datasette events to standard error&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-auth-passwords/releases/tag/1.1.1"&gt;datasette-auth-passwords 1.1.1&lt;/a&gt;&lt;/strong&gt; - 2024-09-03&lt;br /&gt;Datasette plugin for authentication using passwords&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-enrichments/releases/tag/0.4.3"&gt;datasette-enrichments 0.4.3&lt;/a&gt;&lt;/strong&gt; - 2024-09-03&lt;br /&gt;Tools for running enrichments against data stored in Datasette&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-configure-fts/releases/tag/1.1.4"&gt;datasette-configure-fts 1.1.4&lt;/a&gt;&lt;/strong&gt; - 2024-09-03&lt;br /&gt;Datasette plugin for enabling full-text search against selected table columns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-auth-tokens/releases/tag/0.4a10"&gt;datasette-auth-tokens 0.4a10&lt;/a&gt;&lt;/strong&gt; - 2024-09-03&lt;br /&gt;Datasette plugin for authenticating access using API tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-edit-schema/releases/tag/0.8a3"&gt;datasette-edit-schema 0.8a3&lt;/a&gt;&lt;/strong&gt; - 2024-09-03&lt;br /&gt;Datasette plugin for modifying table schemas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-pins/releases/tag/0.1a4"&gt;datasette-pins 0.1a4&lt;/a&gt;&lt;/strong&gt; - 2024-09-01&lt;br /&gt;Pin databases, tables, and other items to the Datasette homepage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-acl/releases/tag/0.4a2"&gt;datasette-acl 0.4a2&lt;/a&gt;&lt;/strong&gt; - 2024-09-01&lt;br /&gt;Advanced permission management for Datasette&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-claude-3/releases/tag/0.4.1"&gt;llm-claude-3 0.4.1&lt;/a&gt;&lt;/strong&gt; - 2024-08-30&lt;br /&gt;LLM plugin for interacting with the Claude 3 family of models&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="tils"&gt;TILs&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/playwright/testing-tables"&gt;Testing HTML tables with Playwright Python&lt;/a&gt; - 2024-09-04&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/pytest/namedtuple-parameterized-tests"&gt;Using namedtuple for pytest parameterized tests&lt;/a&gt; - 2024-08-31&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/css"&gt;css&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pdf"&gt;pdf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-3-5-sonnet"&gt;claude-3-5-sonnet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="css"/><category term="javascript"/><category term="pdf"/><category term="projects"/><category term="ai"/><category term="datasette"/><category term="weeknotes"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="gemini"/><category term="claude-3-5-sonnet"/><category term="cors"/></entry><entry><title>Gemini Chat App</title><link href="https://simonwillison.net/2024/Aug/27/gemini-chat-app/#atom-tag" rel="alternate"/><published>2024-08-27T22:48:56+00:00</published><updated>2024-08-27T22:48:56+00:00</updated><id>https://simonwillison.net/2024/Aug/27/gemini-chat-app/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tools.simonwillison.net/gemini-chat"&gt;Gemini Chat App&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Google &lt;a href="https://x.com/OfficialLoganK/status/1828480081574142227"&gt;released&lt;/a&gt; three new Gemini models today: improved versions of Gemini 1.5 Pro and Gemini 1.5 Flash plus a new model, Gemini 1.5 Flash-8B, which is significantly faster (and will presumably be cheaper) than the regular Flash model.&lt;/p&gt;
&lt;p&gt;The Flash-8B model is &lt;a href="https://arxiv.org/abs/2403.05530"&gt;described in the Gemini 1.5 family of models&lt;/a&gt; paper in section 8:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;By inheriting the same core architecture, optimizations, and data mixture refinements as its larger counterpart, Flash-8B demonstrates multimodal capabilities with support for context window exceeding 1 million tokens. This unique combination of speed, quality, and capabilities represents a step function leap in the domain of single-digit billion parameter models.&lt;/p&gt;
&lt;p&gt;While Flash-8B’s smaller form factor necessarily leads to a reduction in quality compared to Flash and 1.5 Pro, it unlocks substantial benefits, particularly in terms of high throughput and extremely low latency. This translates to affordable and timely large-scale multimodal deployments, facilitating novel use cases previously deemed infeasible due to resource constraints. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The new models are available in &lt;a href="https://aistudio.google.com/"&gt;AI Studio&lt;/a&gt;, but since I built my own &lt;a href="https://simonwillison.net/2024/Aug/26/gemini-bounding-box-visualization/"&gt;custom prompting tool&lt;/a&gt; against the Gemini CORS-enabled API the other day I figured I'd build a quick UI for these new models as well.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/gemini-chat-skunk.gif" alt="Animated screenshot of Gemini Chat App. A select box allows the user to switch between four different models. I select the flash-8b model and prompt 'a poem about a skunk' - it streams out a terrible poem. At the bottom it confirms that the API call took 1.44 seconds and used 10 prompt tokens and 201 candidate tokens." class="blogmark-image" /&gt;&lt;/p&gt;
&lt;p&gt;Building this with Claude 3.5 Sonnet took literally ten minutes from start to finish - you can see that &lt;a href="https://gist.github.com/simonw/498a66c1c4b5053a6dfa2015c3675e24"&gt;from the timestamps in the conversation&lt;/a&gt;. Here's the &lt;a href="https://tools.simonwillison.net/gemini-chat"&gt;deployed app&lt;/a&gt; and the &lt;a href="https://github.com/simonw/tools/blob/2f2bfd10d2ef829273d43a95e8a86b1ae0140668/gemini-chat.html"&gt;finished code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The feature I really wanted to build was streaming support. I started with &lt;a href="https://github.com/google-gemini/generative-ai-js/blob/1ad800656dc870c1c5a60c1201baa56ad48b88ee/samples/chat.js"&gt;this example code&lt;/a&gt; showing how to run streaming prompts in a Node.js application, then told Claude to figure out what the client-side code for that should look like based on a snippet from my bounding box interface hack. My starting prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Build me a JavaScript app (no react) that I can use to chat with the Gemini model, using the above strategy for API key usage&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I still keep hearing from people who are skeptical that &lt;a href="https://simonwillison.net/tags/ai-assisted-programming/"&gt;AI-assisted programming&lt;/a&gt; like this has any value. It's honestly getting a little frustrating at this point - the gains for things like rapid prototyping are &lt;em&gt;so self-evident&lt;/em&gt; now.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-3-5-sonnet"&gt;claude-3-5-sonnet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="gemini"/><category term="claude-3-5-sonnet"/><category term="cors"/><category term="llm-release"/></entry><entry><title>Quoting Frederik Braun</title><link href="https://simonwillison.net/2024/Aug/26/frederik-braun/#atom-tag" rel="alternate"/><published>2024-08-26T20:26:31+00:00</published><updated>2024-08-26T20:26:31+00:00</updated><id>https://simonwillison.net/2024/Aug/26/frederik-braun/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://lobste.rs/s/98rp8f/cors_is_stupid#c_9dtjao"&gt;&lt;p&gt;In 2021 we [the Mozilla engineering team] found “samesite=lax by default” isn’t shippable without what you call the &lt;a href="https://simonwillison.net/2021/Aug/3/samesite/#chrome-2-minute-twist"&gt;“two minute twist”&lt;/a&gt; - you risk breaking a lot of websites. If you have that kind of two-minute exception, a lot of exploits that were supposed to be prevented remain possible.&lt;/p&gt;
&lt;p&gt;When we tried rolling it out, we had to deal with a lot of broken websites: Debugging cookie behavior in website backends is nontrivial from a browser.&lt;/p&gt;
&lt;p&gt;Firefox also had a prototype of what I believe is a better protection (including additional privacy benefits) already underway (called &lt;a href="https://blog.mozilla.org/en/mozilla/firefox-rolls-out-total-cookie-protection-by-default-to-all-users-worldwide/"&gt;total cookie protection&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Given all of this, we paused samesite lax by default development in favor of this.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://lobste.rs/s/98rp8f/cors_is_stupid#c_9dtjao"&gt;Frederik Braun&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/browsers"&gt;browsers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cookies"&gt;cookies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/firefox"&gt;firefox&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mozilla"&gt;mozilla&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/privacy"&gt;privacy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/samesite"&gt;samesite&lt;/a&gt;&lt;/p&gt;



</summary><category term="browsers"/><category term="cookies"/><category term="firefox"/><category term="mozilla"/><category term="privacy"/><category term="security"/><category term="cors"/><category term="samesite"/></entry><entry><title>Building a tool showing how Gemini Pro can return bounding boxes for objects in images</title><link href="https://simonwillison.net/2024/Aug/26/gemini-bounding-box-visualization/#atom-tag" rel="alternate"/><published>2024-08-26T04:55:28+00:00</published><updated>2024-08-26T04:55:28+00:00</updated><id>https://simonwillison.net/2024/Aug/26/gemini-bounding-box-visualization/#atom-tag</id><summary type="html">
    &lt;p&gt;I was browsing through Google's Gemini documentation while researching &lt;a href="https://github.com/simonw/llm/issues/557"&gt;how different multi-modal LLM APIs work&lt;/a&gt; when I stumbled across &lt;a href="https://ai.google.dev/gemini-api/docs/vision?lang=node#bbox"&gt;this note&lt;/a&gt; in the vision documentation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You can ask the model for the coordinates of bounding boxes for objects in images. For object detection, the Gemini model has been trained to provide these coordinates as relative widths or heights in range &lt;code&gt;[0,1]&lt;/code&gt;, scaled by 1000 and converted to an integer. Effectively, the coordinates given are for a 1000x1000 version of the original image, and need to be converted back to the dimensions of the original image.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a pretty neat capability! OpenAI's GPT-4o and Anthropic's Claude 3 and Claude 3.5 models can't do this (yet).&lt;/p&gt;
&lt;p&gt;I tried a few prompts using &lt;a href="https://pypi.org/project/google-generativeai/"&gt;Google's Python library&lt;/a&gt; and got back what looked like bounding boxes!&lt;/p&gt;
&lt;div class="highlight highlight-text-python-console"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; google.generativeai &lt;span class="pl-k"&gt;as&lt;/span&gt; genai
&amp;gt;&amp;gt;&amp;gt; genai.configure(&lt;span class="pl-v"&gt;api_key&lt;/span&gt;&lt;span class="pl-k"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;...&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)
&amp;gt;&amp;gt;&amp;gt; model &lt;span class="pl-k"&gt;=&lt;/span&gt; genai.GenerativeModel(&lt;span class="pl-v"&gt;model_name&lt;/span&gt;&lt;span class="pl-k"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;gemini-1.5-pro-latest&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)
&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-c1"&gt;PIL&lt;/span&gt;.Image
&amp;gt;&amp;gt;&amp;gt; goats &lt;span class="pl-k"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;PIL&lt;/span&gt;.Image.open(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;/tmp/goats.jpeg&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)
&amp;gt;&amp;gt;&amp;gt; prompt &lt;span class="pl-k"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Return bounding boxes around every goat, for each one return [ymin, xmin, ymax, xmax]&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
&amp;gt;&amp;gt;&amp;gt; response &lt;span class="pl-k"&gt;=&lt;/span&gt; model.generate_content([goats, prompt])
&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-c1"&gt;print&lt;/span&gt;(response.text)
- [200, 90, 745, 527]
- [300, 610, 904, 937]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;But how to verify that these were useful co-ordinates? I fired up Claude 3.5 Sonnet and started iterating on &lt;a href="https://support.anthropic.com/en/articles/9487310-what-are-artifacts-and-how-do-i-use-them"&gt;Artifacts&lt;/a&gt; there to try and visualize those co-ordinates against the original image.&lt;/p&gt;
&lt;p&gt;After some fiddling around, I built &lt;a href="https://static.simonwillison.net/static/2024/gemini-bounding-box-tool-fixed.html"&gt;an initial debug tool&lt;/a&gt; where I could select an image, paste in co-ordinates and see the boxes rendered on that image.&lt;/p&gt;
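&lt;p&gt;The conversion that the Gemini documentation describes - treating each value as a position on a 1000x1000 version of the image - can be sketched in Python like this. It's a minimal illustration rather than the code from the tool itself (which runs in the browser); the &lt;code&gt;to_pixel_box&lt;/code&gt; name and the 2000x1000 image size are assumptions made up for the example.&lt;/p&gt;

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a [ymin, xmin, ymax, xmax] box on the 0-1000 grid to pixels.

    Returns (left, top, right, bottom) in pixel coordinates.
    """
    ymin, xmin, ymax, xmax = box
    # x values scale with the image width, y values with the image height
    left = xmin * image_width / 1000
    top = ymin * image_height / 1000
    right = xmax * image_width / 1000
    bottom = ymax * image_height / 1000
    return (left, top, right, bottom)


# Hypothetical image size; the real dimensions of goats.jpeg are not given here.
print(to_pixel_box([200, 90, 745, 527], image_width=2000, image_height=1000))
```

&lt;p&gt;Note that the order Gemini returns is &lt;code&gt;[ymin, xmin, ymax, xmax]&lt;/code&gt;, so the y values come first even though most drawing APIs expect x first.&lt;/p&gt;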
&lt;h4 id="a-tool-for-prompting-with-an-image-and-rendering-the-bounding-boxes"&gt;A tool for prompting with an image and rendering the bounding boxes&lt;/h4&gt;
&lt;p&gt;I wrote the other day about Anthropic's &lt;a href="https://simonwillison.net/2024/Aug/23/anthropic-dangerous-direct-browser-access/"&gt;new support for CORS headers&lt;/a&gt;, enabling direct browser access to their APIs.&lt;/p&gt;
&lt;p&gt;Google Gemini supports CORS as well! So does OpenAI, which means that all three of the largest LLM providers can now be accessed directly from the browser.&lt;/p&gt;
&lt;p&gt;I decided to build a combined tool that could prompt Gemini 1.5 Pro with an image directly from the browser, then render the returned bounding boxes on that image.&lt;/p&gt;
&lt;p&gt;The new tool lives here: &lt;strong&gt;&lt;a href="https://tools.simonwillison.net/gemini-bbox"&gt;https://tools.simonwillison.net/gemini-bbox&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/goats-bbox-fixed.jpg" alt="Gemini API Image Bounding Box Visualization - browse for file goats.jpeg, prompt is Return bounding boxes as JSON arrays [ymin, xmin, ymax, xmax] - there follows output coordinates and then a red and a green box around the goats in a photo, with grid lines showing the coordinates from 0-1000 on both axes" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The first time you run a prompt it will ask you for a &lt;a href="https://aistudio.google.com/app/apikey"&gt;Gemini API key&lt;/a&gt;, which it stores in your browser's &lt;code&gt;localStorage&lt;/code&gt;. I promise not to add code that steals your keys in the future, but if you don't want to trust that you can &lt;a href="https://github.com/simonw/tools/blob/main/gemini-bbox.html"&gt;grab a copy of the code&lt;/a&gt;, verify it and then run it yourself.&lt;/p&gt;
&lt;h4 id="building-this-tool-with-claude-3-5-sonnet"&gt;Building this tool with Claude 3.5 Sonnet&lt;/h4&gt;
&lt;p&gt;This is yet another example of a tool that I mostly built by prompting Claude 3.5 Sonnet. Here are &lt;a href="https://simonwillison.net/search/?tag=claude-3-5-sonnet&amp;amp;tag=projects"&gt;some more&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I started out with &lt;a href="https://gist.github.com/simonw/40ff639e96d55a1df7ebfa7db1974b92"&gt;this lengthy conversation&lt;/a&gt; (transcript exported &lt;a href="https://simonwillison.net/2024/Aug/8/convert-claude-json-to-markdown/"&gt;with this tool&lt;/a&gt;) to help build the original tool for opening an image and pasting in those bounding box coordinates. That sequence started like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Build an artifact where I can open an image from my browser and paste the following style of text into a textarea:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;- [488, 945, 519, 999]
- [460, 259, 487, 307]
- [472, 574, 498, 612]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(The hyphens may not be there, so scan with a regex for [ num, num, num, num ])&lt;/p&gt;
&lt;p&gt;Each of those represent [ymin, xmin, ymax, xmax] coordinates on the image - but they are numbers between 0 and 1000 so they correspond to the image is if it had been resized to 1000x1000&lt;/p&gt;
&lt;p&gt;As soon as the coords are pasted the corresponding boxes should be drawn on the images, corrected for its actual dimensions&lt;/p&gt;
&lt;p&gt;The image should be show with a width of 80% of the page&lt;/p&gt;
&lt;p&gt;The boxes should be in different colours, and hovering over each box should show the original bounding box coordinates below the image&lt;/p&gt;
&lt;/blockquote&gt;
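&lt;p&gt;The "scan with a regex" instruction in that prompt could look something like this in Python. This is a sketch of the idea, not the JavaScript that Claude actually produced; &lt;code&gt;BOX_PATTERN&lt;/code&gt; and &lt;code&gt;parse_boxes&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import re

# Matches [ num, num, num, num ] with optional whitespace.
# Anything outside the brackets, such as a leading hyphen, is ignored.
BOX_PATTERN = re.compile(r"\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]")


def parse_boxes(text):
    """Return every [ymin, xmin, ymax, xmax] group found in text as a list of ints."""
    return [[int(n) for n in match.groups()] for match in BOX_PATTERN.finditer(text)]


pasted = """- [488, 945, 519, 999]
- [460, 259, 487, 307]
[ 472, 574, 498, 612 ]"""
print(parse_boxes(pasted))
```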
&lt;p&gt;Once that tool appeared to be doing the right thing (I had to muck around with how the coordinates were processed a bunch) I used my favourite prompting trick to build the combined tool that called the Gemini API. I found &lt;a href="https://github.com/google-gemini/generative-ai-js/blob/1ad800656dc870c1c5a60c1201baa56ad48b88ee/samples/web/utils/shared.js"&gt;this example&lt;/a&gt; that calls the &lt;a href="https://www.npmjs.com/package/@google/generative-ai"&gt;@google/generative-ai&lt;/a&gt; API from a browser, pasted the full example into Claude along with my previous bounding box visualization tool and had it combine them to achieve the desired result:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Based on that example text, build me an HTML page with Vanilla JS that loads the Gemini API from esm.run - it should have a file input and a textarea and a submit button - you attach an image, enter a prompt and then click the button and it does a Gemini prompt with that image and prompt and injects the returned result into a div on the page&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Then this follow-up prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;now incorporate the logic from this tool &lt;em&gt;(I pasted in that HTML too)&lt;/em&gt;, such that when the response is returned from the prompt the image is displayed with any rendered bounding boxes&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4 id="dealing-with-image-orientation-bugs"&gt;Dealing with image orientation bugs&lt;/h4&gt;
&lt;p&gt;Bounding boxes are fiddly things. The code I had produced above seemed to work... but in some of my testing the boxes didn't show up in quite the right place. Was this just Gemini 1.5 Pro being unreliable in how it returned the boxes? That seemed likely, but I had some nagging doubts.&lt;/p&gt;
&lt;p&gt;On a hunch, I took &lt;a href="https://static.simonwillison.net/static/2024/goats-weird-rotation.jpeg"&gt;an image&lt;/a&gt; that was behaving strangely, took a screenshot of it and tried &lt;a href="https://static.simonwillison.net/static/2024/goats-no-rotation.jpg"&gt;that screenshot as a JPEG&lt;/a&gt;. The bounding boxes that came back were different - they appeared rotated!&lt;/p&gt;
&lt;p&gt;I've seen this kind of thing before with photos taken on an iPhone. There's an obscure piece of JPEG metadata which can set the orientation on a photo, and some software fails to respect that.&lt;/p&gt;
&lt;p&gt;Was that affecting my bounding box tool? I started digging into those photos to figure that out, using a combination of ChatGPT Code Interpreter (since that can read JPEG binary data using Python) and Claude Artifacts (to build me a visible UI for exploring my photos).&lt;/p&gt;
&lt;p&gt;My hunch turned out to be correct: my iPhone photos included TIFF orientation metadata which the Gemini API appeared not to respect. As a result, some photos taken by my phone would return bounding boxes that were rotated 180 degrees.&lt;/p&gt;
&lt;p&gt;My eventual fix was to take the image provided by the user, render it to a &lt;code&gt;&amp;lt;canvas&amp;gt;&lt;/code&gt; element and then export it back out as a JPEG again - &lt;a href="https://github.com/simonw/tools/blob/66552828b1ce6f823baccfc95ccdd81d8bb5992a/gemini-bbox.html#L41-L71"&gt;code here&lt;/a&gt;. I got Claude to add that for me based on code I pasted in from my earlier &lt;a href="https://tools.simonwillison.net/image-resize-quality"&gt;image resize quality&lt;/a&gt; tool, also &lt;a href="https://simonwillison.net/2024/Jul/26/image-resize-and-quality-comparison/"&gt;built for me by Claude&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As part of this investigation I built another tool, which can read orientation TIFF data from a JPEG entirely in JavaScript and help show what's going on:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tools.simonwillison.net/tiff-orientation"&gt;https://tools.simonwillison.net/tiff-orientation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/tiff-orientation.jpg" alt="Drag &amp;amp; Drop a JPEG image here or click to select. TIFF Orientation: Rotated 180°. Orientation value: 3" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/tools/blob/main/tiff-orientation.html"&gt;the source code for that&lt;/a&gt;. It's a great example of the kind of thing that LLMs can do much more effectively than I can - here's an illustrative snippet:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;// Determine endianness&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;endian&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;view&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getUint16&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;tiffStart&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;false&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;isLittleEndian&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;endian&lt;/span&gt; &lt;span class="pl-c1"&gt;===&lt;/span&gt; &lt;span class="pl-c1"&gt;0x4949&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;  &lt;span class="pl-c"&gt;// 'II' in ASCII&lt;/span&gt;
&lt;span class="pl-s1"&gt;debugInfo&lt;/span&gt; &lt;span class="pl-c1"&gt;+=&lt;/span&gt; &lt;span class="pl-s"&gt;`Endianness: &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;isLittleEndian&lt;/span&gt; ? &lt;span class="pl-s"&gt;'Little Endian'&lt;/span&gt; : &lt;span class="pl-s"&gt;'Big Endian'&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;\n`&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// Check TIFF header validity&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;tiffMagic&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;view&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getUint16&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;tiffStart&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;2&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;isLittleEndian&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;tiffMagic&lt;/span&gt; &lt;span class="pl-c1"&gt;!==&lt;/span&gt; &lt;span class="pl-c1"&gt;42&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;throw&lt;/span&gt; &lt;span class="pl-v"&gt;Object&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;assign&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;Error&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'Not a valid TIFF header'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; debugInfo &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-s1"&gt;debugInfo&lt;/span&gt; &lt;span class="pl-c1"&gt;+=&lt;/span&gt; &lt;span class="pl-s"&gt;'Valid TIFF header\n'&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// Get offset to first IFD&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;ifdOffset&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;view&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getUint32&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;tiffStart&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;4&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;isLittleEndian&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;ifdStart&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;tiffStart&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;ifdOffset&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-s1"&gt;debugInfo&lt;/span&gt; &lt;span class="pl-c1"&gt;+=&lt;/span&gt; &lt;span class="pl-s"&gt;`IFD start: &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;ifdStart&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;\n`&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
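&lt;p&gt;For context, the step that follows a snippet like this is walking the IFD entries to find the orientation tag. The sketch below is not taken from the tool's source: &lt;code&gt;readOrientation&lt;/code&gt; is a hypothetical name, and it assumes the orientation tag (&lt;code&gt;0x0112&lt;/code&gt;) sits in the first IFD as a single SHORT value.&lt;/p&gt;

```javascript
// Sketch: read the orientation tag (0x0112) from the first IFD of a TIFF block.
// `view` is a DataView; `tiffStart` is the offset of the "II" or "MM" marker.
function readOrientation(view, tiffStart) {
  const isLittleEndian = view.getUint16(tiffStart, false) === 0x4949; // 'II'
  if (view.getUint16(tiffStart + 2, isLittleEndian) !== 42) {
    throw new Error("Not a valid TIFF header");
  }
  const ifdStart = tiffStart + view.getUint32(tiffStart + 4, isLittleEndian);
  const entryCount = view.getUint16(ifdStart, isLittleEndian);
  for (let i = 0; i !== entryCount; i++) {
    // Each IFD entry is 12 bytes: tag (2), type (2), count (4), value (4)
    const entry = ifdStart + 2 + i * 12;
    if (view.getUint16(entry, isLittleEndian) === 0x0112) {
      // A single SHORT is stored in the first two bytes of the value field
      return view.getUint16(entry + 8, isLittleEndian);
    }
  }
  return null; // no orientation tag found
}
```

&lt;p&gt;A return value of 3 corresponds to the "Rotated 180°" case shown in the screenshot above.&lt;/p&gt;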
&lt;p&gt;LLMs know their binary file formats, so I frequently find myself asking them to write me custom binary processing code like this.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/9bf1bd4cce6d113c55db0e5a03769b52"&gt;the Claude conversation&lt;/a&gt; I had to build that tool. After failing to get it to work several times I pasted in Python code that I'd built using ChatGPT Code Interpreter and prompted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Here's Python code that finds it correctly:&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Which turned out to provide the missing details to help it build me the JavaScript version I could run in my browser. Here's &lt;a href="https://gist.github.com/simonw/327a906e83efaf3e79853eec8bdd72e0"&gt;the ChatGPT conversation&lt;/a&gt; that got me that Python code.&lt;/p&gt;
&lt;h4 id="mixing-up-a-whole-bunch-of-models"&gt;Mixing up a whole bunch of models&lt;/h4&gt;
&lt;p&gt;This whole process was &lt;em&gt;very&lt;/em&gt; messy, but it's a pretty accurate representation of my workflow when using these models. I used three different tools here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Gemini 1.5 Pro and the Gemini API to take images and a prompt and return bounding boxes&lt;/li&gt;
&lt;li&gt;Claude 3.5 Sonnet and Claude Artifacts to write code for working against that API and build me interactive tools for visualizing the results&lt;/li&gt;
&lt;li&gt;GPT-4o and ChatGPT Code Interpreter to write and execute Python code to try and help me figure out what was going on with my weird JPEG image orientation bugs&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I copied code between models a bunch of times too - pasting Python code written by GPT-4o into Claude 3.5 Sonnet to help it write the correct JavaScript for example.&lt;/p&gt;
&lt;p&gt;How good is the code that I produced by the end of all this? It honestly doesn't matter very much to me: this is a very low-stakes project, where the goal was a single web page tool that can run a prompt through a model and visualize the response.&lt;/p&gt;
&lt;p&gt;If I was writing code "for production" - for a long-term project, or code that I intended to package up and release as an open source library - I would sweat the details a whole lot more. But for this kind of exploratory and prototyping work I'm increasingly comfortable hacking away at whatever the models spit out until it achieves the desired effect.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vision-llms"&gt;vision-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-artifacts"&gt;claude-artifacts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-3-5-sonnet"&gt;claude-3-5-sonnet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-to-app"&gt;prompt-to-app&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="google"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="gemini"/><category term="vision-llms"/><category term="claude-artifacts"/><category term="claude-3-5-sonnet"/><category term="cors"/><category term="prompt-to-app"/></entry><entry><title>Claude's API now supports CORS requests, enabling client-side applications</title><link href="https://simonwillison.net/2024/Aug/23/anthropic-dangerous-direct-browser-access/#atom-tag" rel="alternate"/><published>2024-08-23T02:29:08+00:00</published><updated>2024-08-23T02:29:08+00:00</updated><id>https://simonwillison.net/2024/Aug/23/anthropic-dangerous-direct-browser-access/#atom-tag</id><summary type="html">
    &lt;p&gt;Anthropic have enabled CORS support for their JSON APIs, which means it's now possible to call the Claude LLMs directly from a user's browser.&lt;/p&gt;

&lt;p&gt;This massively significant new feature is tucked away in this pull request: &lt;a href="https://github.com/anthropics/anthropic-sdk-typescript/pull/504"&gt;anthropic-sdk-typescript: add support for browser usage&lt;/a&gt;, via &lt;a href="https://github.com/anthropics/anthropic-sdk-typescript/issues/248#issuecomment-2302791227" title="Add a dangerouslyAllowBrowser option to allow running in the browser"&gt;this issue&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This change to the &lt;a href="https://github.com/anthropics/anthropic-sdk-typescript"&gt;Anthropic TypeScript SDK&lt;/a&gt; reveals the new JSON API feature, which I found &lt;a href="https://github.com/anthropics/anthropic-sdk-typescript/blob/e400d2e8a54aa736717ed849ef8b44a3490fce68/src/index.ts#L151"&gt;by digging through the code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can now add the following HTTP request header to enable CORS support for the Anthropic API, which means you can make calls to Anthropic's models directly from a browser:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;anthropic-dangerous-direct-browser-access: true
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Anthropic had been resistant to adding this feature because it can encourage a nasty anti-pattern: if you embed your API key in your client code, anyone with access to that site can steal your API key and use it to make requests on your behalf. &lt;/p&gt;
&lt;p&gt;Despite that, there are legitimate use cases for this feature. It's fine for internal tools exposed to trusted users, or you can implement a "bring your own API key" pattern where users supply their own key to use with your client-side app.&lt;/p&gt;
&lt;p&gt;As it happens, I've built one of those apps myself! My &lt;a href="https://tools.simonwillison.net/haiku"&gt;Haiku&lt;/a&gt; page is a simple client-side app that requests access to your webcam, asks for &lt;a href="https://console.anthropic.com/settings/keys"&gt;an Anthropic API key&lt;/a&gt; (which it stores in the browser’s &lt;code&gt;localStorage&lt;/code&gt;), and then lets you take a photo and turns it into a Haiku using their fast and inexpensive &lt;a href="https://www.anthropic.com/news/claude-3-haiku"&gt;Haiku model&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/cleo-haiku-card.jpg" alt="Screenshot of the app - Cleo the dog sits patiently on the floor, a haiku reads Loyal canine friend,
Gentle eyes, awaiting praise
Cherished companion - buttons are visible for taking the photo and switching the camera" /&gt;&lt;/p&gt;
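&lt;p&gt;The "bring your own API key" pattern with &lt;code&gt;localStorage&lt;/code&gt; can be sketched along these lines. This is an illustration rather than the Haiku app's actual code: the &lt;code&gt;getApiKey&lt;/code&gt; helper, its parameters and the &lt;code&gt;anthropic-api-key&lt;/code&gt; storage key are all assumptions.&lt;/p&gt;

```javascript
// Hypothetical helper: return a previously stored key, or ask the user for
// one and remember it. `storage` follows the Web Storage interface
// (getItem/setItem), for example window.localStorage. `askUser` is any
// function that returns a string, for example () => prompt("API key:").
function getApiKey(storage, askUser, storageKey = "anthropic-api-key") {
  const saved = storage.getItem(storageKey);
  if (saved) return saved;
  const entered = (askUser() || "").trim();
  if (entered) storage.setItem(storageKey, entered);
  return entered || null; // null when the user declined to provide a key
}
```

&lt;p&gt;In a browser this might be called as &lt;code&gt;getApiKey(localStorage, () =&gt; prompt("Anthropic API key:"))&lt;/code&gt;. The key then stays in the user's own browser and is never embedded in the page source.&lt;/p&gt;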
&lt;p&gt;Previously I had to run my own &lt;a href="https://github.com/simonw/tools/blob/main/vercel/anthropic-proxy/index.js"&gt;proxy on Vercel&lt;/a&gt; adding CORS support to the Anthropic API just to get my Haiku app to work.&lt;/p&gt;
&lt;p&gt;This evening I &lt;a href="https://github.com/simonw/tools/commit/0249ab83775861f549abb1aa80af0ca3614dc5ff"&gt;upgraded the app&lt;/a&gt; to send that new header, and now it can talk to Anthropic directly without needing my proxy.&lt;/p&gt;
&lt;p&gt;I actually got Claude &lt;a href="https://gist.github.com/simonw/6ff7bc0d47575a53463abc3482608f74"&gt;to modify the code for me&lt;/a&gt; (Claude built the Haiku app in the first place). Amusingly Claude first argued against it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I must strongly advise against making direct API calls from a browser, as it exposes your API key and violates best practices for API security.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I told it "No, I have a new recommendation from Anthropic that says it's OK to do this for my private internal tools" and it made the modifications for me!&lt;/p&gt;
&lt;p&gt;The full source code &lt;a href="https://github.com/simonw/tools/blob/0249ab83775861f549abb1aa80af0ca3614dc5ff/haiku.html"&gt;can be seen here&lt;/a&gt;. Here's a simplified JavaScript snippet illustrating how to call their API from the browser using the new header:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"https://api.anthropic.com/v1/messages"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s"&gt;"x-api-key"&lt;/span&gt;: &lt;span class="pl-s1"&gt;apiKey&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-s"&gt;"anthropic-version"&lt;/span&gt;: &lt;span class="pl-s"&gt;"2023-06-01"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-s"&gt;"content-type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/json"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-s"&gt;"anthropic-dangerous-direct-browser-access"&lt;/span&gt;: &lt;span class="pl-s"&gt;"true"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;"claude-3-haiku-20240307"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;max_tokens&lt;/span&gt;: &lt;span class="pl-c1"&gt;1024&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;messages&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
      &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-c1"&gt;role&lt;/span&gt;: &lt;span class="pl-s"&gt;"user"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-c1"&gt;content&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
          &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"text"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;text&lt;/span&gt;: &lt;span class="pl-s"&gt;"Return a haiku about how great pelicans are"&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;response&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;haiku&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-en"&gt;alert&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;haiku&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="javascript"/><category term="projects"/><category term="security"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="cors"/></entry><entry><title>Phanpy</title><link href="https://simonwillison.net/2024/Mar/16/phanpy/#atom-tag" rel="alternate"/><published>2024-03-16T01:34:04+00:00</published><updated>2024-03-16T01:34:04+00:00</updated><id>https://simonwillison.net/2024/Mar/16/phanpy/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://phanpy.social/"&gt;Phanpy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Phanpy is "a minimalistic opinionated Mastodon web client" &lt;a href="https://github.com/cheeaun/phanpy"&gt;by Chee Aun&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I think that description undersells it. It's beautifully crafted and designed and has a ton of innovative ideas - the way it displays threads and replies, the "Catch-up" beta feature, it's all a really thoughtful and fresh perspective on how Mastodon can work.&lt;/p&gt;
&lt;p&gt;I love that all Mastodon servers (including my own dedicated instance) offer a CORS-enabled JSON API which directly supports building these kinds of alternative clients.&lt;/p&gt;
&lt;p&gt;Building a full-featured client like this one is a huge amount of work, but building a much simpler client that just displays the user's incoming timeline could be a pretty great educational project for people who are looking to deepen their front-end development skills.&lt;/p&gt;
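&lt;p&gt;As a starting point for that kind of project, here is a minimal sketch of fetching a home timeline from the Mastodon API. The function names are made up for this example, and &lt;code&gt;instance&lt;/code&gt; and &lt;code&gt;accessToken&lt;/code&gt; are placeholders: a real client would obtain the token through the server's OAuth flow.&lt;/p&gt;

```javascript
// Sketch: build the request for the signed-in user's home timeline.
// `instance` is the server origin, e.g. "https://mastodon.example" (placeholder).
function homeTimelineRequest(instance, accessToken, limit = 20) {
  const url = new URL("/api/v1/timelines/home", instance);
  url.searchParams.set("limit", String(limit));
  return {
    url: url.toString(),
    options: { headers: { Authorization: `Bearer ${accessToken}` } },
  };
}

// Fetch the timeline; resolves to an array of status objects.
async function loadHomeTimeline(instance, accessToken) {
  const { url, options } = homeTimelineRequest(instance, accessToken);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Mastodon API returned ${response.status}`);
  }
  return response.json();
}
```

&lt;p&gt;Because the API is CORS-enabled, &lt;code&gt;loadHomeTimeline&lt;/code&gt; can run directly in a page without a proxy.&lt;/p&gt;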


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mastodon"&gt;mastodon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fediverse"&gt;fediverse&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="mastodon"/><category term="fediverse"/><category term="cors"/></entry><entry><title>urllib3 2.2.0</title><link href="https://simonwillison.net/2024/Jan/30/urllib3/#atom-tag" rel="alternate"/><published>2024-01-30T16:31:00+00:00</published><updated>2024-01-30T16:31:00+00:00</updated><id>https://simonwillison.net/2024/Jan/30/urllib3/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/urllib3/urllib3/releases/tag/2.2.0"&gt;urllib3 2.2.0&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Highlighted feature: “urllib3 now works in the browser”. The core urllib3 library now includes code that can integrate with Pyodide, using the browser’s fetch() or XMLHttpRequest APIs to make HTTP requests (to CORS-enabled endpoints).&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pyodide"&gt;pyodide&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;



</summary><category term="python"/><category term="webassembly"/><category term="pyodide"/><category term="cors"/></entry><entry><title>Semi-automating a Substack newsletter with an Observable notebook</title><link href="https://simonwillison.net/2023/Apr/4/substack-observable/#atom-tag" rel="alternate"/><published>2023-04-04T17:55:28+00:00</published><updated>2023-04-04T17:55:28+00:00</updated><id>https://simonwillison.net/2023/Apr/4/substack-observable/#atom-tag</id><summary type="html">
    &lt;p&gt;I recently started sending out &lt;a href="https://simonw.substack.com/"&gt;a weekly-ish email newsletter&lt;/a&gt; consisting of content from my blog. I've mostly automated that, using &lt;a href="https://observablehq.com/@simonw/blog-to-newsletter"&gt;an Observable Notebook&lt;/a&gt; to generate the HTML. Here's how that system works.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/substack-index.jpg" alt="Screenshot of Substack: Simon Willison's Newsletter, with a big podcast promo image next to Think of language models like ChatGPT as a calculator for words, followed by two other recent newsletter headlines." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4&gt;What goes in my newsletter&lt;/h4&gt;
&lt;p&gt;My blog has three types of content: &lt;a href="https://simonwillison.net/search/?type=entry"&gt;entries&lt;/a&gt;, &lt;a href="https://simonwillison.net/search/?type=blogmark"&gt;blogmarks&lt;/a&gt; and &lt;a href="https://simonwillison.net/search/?type=quotation"&gt;quotations&lt;/a&gt;. "Blogmarks" is a name I came up with for bookmarks &lt;a href="https://simonwillison.net/2003/Nov/24/blogmarks/"&gt;in 2003&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Blogmarks and quotations show up in my blog's sidebar, entries get the main column - but on mobile the three are combined into a single flow.&lt;/p&gt;
&lt;p&gt;These live in a PostgreSQL database managed by Django. You can see them defined &lt;a href="https://github.com/simonw/simonwillisonblog/blob/main/blog/models.py"&gt;in models.py&lt;/a&gt; in my blog's open source repo.&lt;/p&gt;
&lt;p&gt;My newsletter consists of all of the new entries, blogmarks and quotations since I last sent it out. I include the entries first in reverse chronological order, since usually the entry I've just written is the one I want to use for the email subject. The blogmarks and quotations come in chronological order afterwards.&lt;/p&gt;
&lt;p&gt;I'm including the full HTML for everything: people don't need to click through back to my blog to read it, all of the content should be right there in their email client.&lt;/p&gt;
&lt;h4&gt;The Substack API: RSS and copy-and-paste&lt;/h4&gt;
&lt;p&gt;Substack doesn't yet offer an API, and has &lt;a href="https://support.substack.com/hc/en-us/articles/360038433912-Does-Substack-have-an-API-"&gt;no public plans&lt;/a&gt; to do so.&lt;/p&gt;
&lt;p&gt;They do offer an RSS feed of each newsletter though - add &lt;code&gt;/feed&lt;/code&gt; to the newsletter subdomain to get it. Mine is at &lt;a href="https://simonw.substack.com/feed"&gt;https://simonw.substack.com/feed&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So we can get data back out again... but what about getting data in? I don't want to manually assemble a newsletter from all of these different sources of data.&lt;/p&gt;
&lt;p&gt;That's where copy-and-paste comes in.&lt;/p&gt;
&lt;p&gt;The Substack compose editor incorporates a well-built rich-text editor. You can paste content into it and it will clean it up to fit the subset of HTML that Substack supports... but that's a pretty decent subset. Headings, paragraphs, lists, links, code blocks and images are all supported.&lt;/p&gt;
&lt;p&gt;The vast majority of content on my blog fits that subset neatly.&lt;/p&gt;
&lt;p&gt;Crucially, pasting in images as part of that rich text content Just Works: Substack automatically copies any images to their &lt;code&gt;substack-post-media&lt;/code&gt; S3 bucket and embeds links to their CDN in the body of the newsletter.&lt;/p&gt;
&lt;p&gt;So... if I can generate the intended rich-text HTML for my whole newsletter, I can copy and paste it directly into the Substack editor.&lt;/p&gt;
&lt;p&gt;That's exactly what my new Observable notebook does: &lt;a href="https://observablehq.com/@simonw/blog-to-newsletter"&gt;https://observablehq.com/@simonw/blog-to-newsletter&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Generating HTML is a well-trodden path, but I also wanted a "copy to clipboard" button that would copy the rich text version of that HTML such that pasting it into Substack would do the right thing.&lt;/p&gt;
&lt;p&gt;With a bit of help from &lt;a href="https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Interact_with_the_clipboard"&gt;MDN&lt;/a&gt; and &lt;a href="https://til.simonwillison.net/javascript/copy-rich-text-to-clipboard"&gt;ChatGPT (my TIL)&lt;/a&gt; I figured out the following:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-en"&gt;copyRichText&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;html&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;htmlContent&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;html&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Create a temporary element to hold the HTML content&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;tempElement&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;createElement&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"div"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-s1"&gt;tempElement&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;innerHTML&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;htmlContent&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;appendChild&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;tempElement&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Select the HTML content&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;range&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;createRange&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-s1"&gt;range&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;selectNode&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;tempElement&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Copy the selected HTML content to the clipboard&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;selection&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;window&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getSelection&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-s1"&gt;selection&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;removeAllRanges&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-s1"&gt;selection&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;addRange&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;range&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;execCommand&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"copy"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-s1"&gt;selection&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;removeAllRanges&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;body&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;removeChild&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;tempElement&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This works great! Set up a button that triggers that function and clicking that button will copy a rich text version of the HTML to the clipboard, such that pasting it directly into the Substack editor has the desired effect.&lt;/p&gt;
&lt;h4&gt;Assembling the HTML&lt;/h4&gt;
&lt;p&gt;I love using &lt;a href="https://observablehq.com/"&gt;Observable Notebooks&lt;/a&gt; for this kind of project: quick data integration tools that need a UI and will likely be incrementally improved over time.&lt;/p&gt;
&lt;p&gt;Using Observable for these means I don't need to host anything and I can iterate my way to the right solution really quickly.&lt;/p&gt;
&lt;p&gt;First, I needed to retrieve my entries, blogmarks and quotations.&lt;/p&gt;
&lt;p&gt;I never built an API for my Django blog directly, but a while ago I set up a mechanism that &lt;a href="https://github.com/simonw/simonwillisonblog-backup/blob/main/.github/workflows/backup.yml"&gt;exports the contents&lt;/a&gt; of my blog to &lt;a href="https://github.com/simonw/simonwillisonblog-backup"&gt;my simonwillisonblog-backup&lt;/a&gt; GitHub repository for safety, and then deploys a Datasette/SQLite copy of that data to &lt;a href="https://datasette.simonwillison.net/"&gt;https://datasette.simonwillison.net/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; offers a JSON API for querying that data, and exposes open CORS headers which means JavaScript running in Observable can query it directly.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://datasette.simonwillison.net/simonwillisonblog?sql=select+*+from+blog_entry+order+by+id+desc+limit+5"&gt;Here's an example SQL query&lt;/a&gt; running against that Datasette instance - click the &lt;code&gt;.json&lt;/code&gt; link on that page to get that data back as JSON instead.&lt;/p&gt;
&lt;p&gt;My Observable notebook can then retrieve the exact data it needs to construct the HTML for the newsletter.&lt;/p&gt;
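&lt;p&gt;As a rough sketch of what such a request could look like (not the notebook's actual code): Datasette serves query results as JSON when &lt;code&gt;.json&lt;/code&gt; is added to the database path, and named parameters such as &lt;code&gt;:numdays&lt;/code&gt; are passed as extra query string arguments. The helper names &lt;code&gt;datasetteQueryUrl&lt;/code&gt; and &lt;code&gt;runQuery&lt;/code&gt; are made up for illustration, and &lt;code&gt;_shape=array&lt;/code&gt; is one possible response shape rather than something the notebook is known to use.&lt;/p&gt;

```javascript
// Build a Datasette JSON API URL for a SQL query with named parameters.
// BASE comes from the Datasette instance in the post; helper names are hypothetical.
const BASE = "https://datasette.simonwillison.net/simonwillisonblog.json";

function datasetteQueryUrl(sql, params = {}) {
  const url = new URL(BASE);
  url.searchParams.set("sql", sql);
  // _shape=array asks Datasette for a plain JSON array of row objects.
  url.searchParams.set("_shape", "array");
  // Named SQL parameters (for example :numdays) travel as ordinary query string arguments.
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, String(value));
  }
  return url.toString();
}

async function runQuery(sql, params = {}) {
  // Works cross-origin because Datasette sends open CORS headers.
  const response = await fetch(datasetteQueryUrl(sql, params));
  if (!response.ok) {
    throw new Error("Datasette returned HTTP " + response.status);
  }
  return response.json();
}
```

&lt;p&gt;Calling &lt;code&gt;runQuery(sql, {numdays: 7})&lt;/code&gt; would then resolve to an array of row objects, one per entry, blogmark or quotation.&lt;/p&gt;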
&lt;p&gt;The smart thing to do would have been to retrieve the data from the API and then use JavaScript inside Observable to compose that together into the HTML for the newsletter.&lt;/p&gt;
&lt;p&gt;I decided to challenge myself to do most of the work in SQL instead, and came up with the following absolute monster of a query:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;with content &lt;span class="pl-k"&gt;as&lt;/span&gt; (
  &lt;span class="pl-k"&gt;select&lt;/span&gt;
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;entry&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; type, title, created, slug,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;h3&amp;gt;&amp;lt;a href="&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://simonwillison.net/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%Y/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created)
      &lt;span class="pl-k"&gt;||&lt;/span&gt; substr(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;JanFebMarAprMayJunJulAugSepOctNovDec&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, (strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%m&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created) &lt;span class="pl-k"&gt;-&lt;/span&gt; &lt;span class="pl-c1"&gt;1&lt;/span&gt;) &lt;span class="pl-k"&gt;*&lt;/span&gt; &lt;span class="pl-c1"&gt;3&lt;/span&gt; &lt;span class="pl-k"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;1&lt;/span&gt;, &lt;span class="pl-c1"&gt;3&lt;/span&gt;) 
      &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; cast(strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%d&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created) &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-k"&gt;integer&lt;/span&gt;) &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; slug &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;"&amp;gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; 
      &lt;span class="pl-k"&gt;||&lt;/span&gt; title &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;/a&amp;gt; - &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-k"&gt;date&lt;/span&gt;(created) &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;/h3&amp;gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; body
      &lt;span class="pl-k"&gt;as&lt;/span&gt; html,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; external_url
  &lt;span class="pl-k"&gt;from&lt;/span&gt; blog_entry
  &lt;span class="pl-k"&gt;union all&lt;/span&gt;
  &lt;span class="pl-k"&gt;select&lt;/span&gt;
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;blogmark&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; type,
    link_title, created, slug,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;p&amp;gt;&amp;lt;strong&amp;gt;Link&amp;lt;/strong&amp;gt; &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-k"&gt;date&lt;/span&gt;(created) &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; &amp;lt;a href="&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-k"&gt;||&lt;/span&gt; link_url &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;"&amp;gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
      &lt;span class="pl-k"&gt;||&lt;/span&gt; link_title &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;/a&amp;gt;:&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; commentary &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;/p&amp;gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
      &lt;span class="pl-k"&gt;as&lt;/span&gt; html,
  link_url &lt;span class="pl-k"&gt;as&lt;/span&gt; external_url
  &lt;span class="pl-k"&gt;from&lt;/span&gt; blog_blogmark
  &lt;span class="pl-k"&gt;union all&lt;/span&gt;
  &lt;span class="pl-k"&gt;select&lt;/span&gt;
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;quotation&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; type,
    source, created, slug,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;strong&amp;gt;Quote&amp;lt;/strong&amp;gt; &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-k"&gt;date&lt;/span&gt;(created) &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;blockquote&amp;gt;&amp;lt;p&amp;gt;&amp;lt;em&amp;gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
    &lt;span class="pl-k"&gt;||&lt;/span&gt; replace(quotation, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;br&amp;gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;) &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;/em&amp;gt;&amp;lt;/p&amp;gt;&amp;lt;/blockquote&amp;gt;&amp;lt;p&amp;gt;&amp;lt;a href="&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt;
    coalesce(source_url, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;#&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;) &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;"&amp;gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; source &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&amp;lt;/a&amp;gt;&amp;lt;/p&amp;gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
    &lt;span class="pl-k"&gt;as&lt;/span&gt; html,
    source_url &lt;span class="pl-k"&gt;as&lt;/span&gt; external_url
  &lt;span class="pl-k"&gt;from&lt;/span&gt; blog_quotation
),
collected &lt;span class="pl-k"&gt;as&lt;/span&gt; (
  &lt;span class="pl-k"&gt;select&lt;/span&gt;
    type,
    title,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://simonwillison.net/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%Y/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created)
      &lt;span class="pl-k"&gt;||&lt;/span&gt; substr(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;JanFebMarAprMayJunJulAugSepOctNovDec&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, (strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%m&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created) &lt;span class="pl-k"&gt;-&lt;/span&gt; &lt;span class="pl-c1"&gt;1&lt;/span&gt;) &lt;span class="pl-k"&gt;*&lt;/span&gt; &lt;span class="pl-c1"&gt;3&lt;/span&gt; &lt;span class="pl-k"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;1&lt;/span&gt;, &lt;span class="pl-c1"&gt;3&lt;/span&gt;) &lt;span class="pl-k"&gt;||&lt;/span&gt; 
      &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; cast(strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%d&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created) &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-k"&gt;integer&lt;/span&gt;) &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; slug &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
      &lt;span class="pl-k"&gt;as&lt;/span&gt; url,
    created,
    html,
    external_url
  &lt;span class="pl-k"&gt;from&lt;/span&gt; content
  &lt;span class="pl-k"&gt;where&lt;/span&gt; created &lt;span class="pl-k"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;date&lt;/span&gt;(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;now&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;-&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; :numdays &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; days&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;)   
  &lt;span class="pl-k"&gt;order by&lt;/span&gt; created &lt;span class="pl-k"&gt;desc&lt;/span&gt;
)
&lt;span class="pl-k"&gt;select&lt;/span&gt; type, title, url, created, html, external_url
&lt;span class="pl-k"&gt;from&lt;/span&gt; collected 
&lt;span class="pl-k"&gt;order by&lt;/span&gt; 
  case type 
    when &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;entry&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; then &lt;span class="pl-c1"&gt;0&lt;/span&gt; 
    else &lt;span class="pl-c1"&gt;1&lt;/span&gt; 
  end,
  case type 
    when &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;entry&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; then created 
    else &lt;span class="pl-k"&gt;-&lt;/span&gt;strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%s&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created) 
  end &lt;span class="pl-k"&gt;desc&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This logic really should be in the JavaScript instead! You can &lt;a href="https://datasette.simonwillison.net/simonwillisonblog?sql=with+content+as+%28%0D%0A++select%0D%0A++++%27entry%27+as+type%2C+title%2C+created%2C+slug%2C%0D%0A++++%27%3Ch3%3E%3Ca+href%3D%22%27+%7C%7C+%27https%3A%2F%2Fsimonwillison.net%2F%27+%7C%7C+strftime%28%27%25Y%2F%27%2C+created%29%0D%0A++++++%7C%7C+substr%28%27JanFebMarAprMayJunJulAugSepOctNovDec%27%2C+%28strftime%28%27%25m%27%2C+created%29+-+1%29+*+3+%2B+1%2C+3%29+%0D%0A++++++%7C%7C+%27%2F%27+%7C%7C+cast%28strftime%28%27%25d%27%2C+created%29+as+integer%29+%7C%7C+%27%2F%27+%7C%7C+slug+%7C%7C+%27%2F%27+%7C%7C+%27%22%3E%27+%0D%0A++++++%7C%7C+title+%7C%7C+%27%3C%2Fa%3E+-+%27+%7C%7C+date%28created%29+%7C%7C+%27%3C%2Fh3%3E%27+%7C%7C+body%0D%0A++++++as+html%2C%0D%0A++++%27%27+as+external_url%0D%0A++from+blog_entry%0D%0A++union+all%0D%0A++select%0D%0A++++%27blogmark%27+as+type%2C%0D%0A++++link_title%2C+created%2C+slug%2C%0D%0A++++%27%3Cp%3E%3Cstrong%3ELink%3C%2Fstrong%3E+%27+%7C%7C+date%28created%29+%7C%7C+%27+%3Ca+href%3D%22%27%7C%7C+link_url+%7C%7C+%27%22%3E%27%0D%0A++++++%7C%7C+link_title+%7C%7C+%27%3C%2Fa%3E%3A%27+%7C%7C+%27+%27+%7C%7C+commentary+%7C%7C+%27%3C%2Fp%3E%27%0D%0A++++++as+html%2C%0D%0A++link_url+as+external_url%0D%0A++from+blog_blogmark%0D%0A++union+all%0D%0A++select%0D%0A++++%27quotation%27+as+type%2C%0D%0A++++source%2C+created%2C+slug%2C%0D%0A++++%27%3Cstrong%3EQuote%3C%2Fstrong%3E+%27+%7C%7C+date%28created%29+%7C%7C+%27%3Cblockquote%3E%3Cp%3E%3Cem%3E%27%0D%0A++++%7C%7C+replace%28quotation%2C+%27%0D%0A%27%2C+%27%3Cbr%3E%27%29+%7C%7C+%27%3C%2Fem%3E%3C%2Fp%3E%3C%2Fblockquote%3E%3Cp%3E%3Ca+href%3D%22%27+%7C%7C%0D%0A++++coalesce%28source_url%2C+%27%23%27%29+%7C%7C+%27%22%3E%27+%7C%7C+source+%7C%7C+%27%3C%2Fa%3E%3C%2Fp%3E%27%0D%0A++++as+html%2C%0D%0A++++source_url+as+external_url%0D%0A++from+blog_quotation%0D%0A%29%2C%0D%0Acollected+as+%28%0D%0A++select%0D%0A++++type%2C%0D%0A++++title%2C%0D%0A++++%27https%3A%2F%2Fsimonwillis
on.net%2F%27+%7C%7C+strftime%28%27%25Y%2F%27%2C+created%29%0D%0A++++++%7C%7C+substr%28%27JanFebMarAprMayJunJulAugSepOctNovDec%27%2C+%28strftime%28%27%25m%27%2C+created%29+-+1%29+*+3+%2B+1%2C+3%29+%7C%7C+%0D%0A++++++%27%2F%27+%7C%7C+cast%28strftime%28%27%25d%27%2C+created%29+as+integer%29+%7C%7C+%27%2F%27+%7C%7C+slug+%7C%7C+%27%2F%27%0D%0A++++++as+url%2C%0D%0A++++created%2C%0D%0A++++html%2C%0D%0A++++external_url%0D%0A++from+content%0D%0A++where+created+%3E%3D+date%28%27now%27%2C+%27-%27+%7C%7C+%3Anumdays+%7C%7C+%27+days%27%29+++%0D%0A++order+by+created+desc%0D%0A%29%0D%0Aselect+type%2C+title%2C+url%2C+created%2C+html%2C+external_url%0D%0Afrom+collected+%0D%0Aorder+by+%0D%0A++case+type+%0D%0A++++when+%27entry%27+then+0+%0D%0A++++else+1+%0D%0A++end%2C%0D%0A++case+type+%0D%0A++++when+%27entry%27+then+created+%0D%0A++++else+-strftime%28%27%25s%27%2C+created%29+%0D%0A++end+desc&amp;amp;numdays=7"&gt;try that query in Datasette&lt;/a&gt;.&lt;/p&gt;
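&lt;p&gt;For comparison, here is a minimal sketch of how just the final &lt;code&gt;order by&lt;/code&gt; clause might look in JavaScript. As I read the query, entries come first, newest first, followed by blogmarks and quotations, oldest first. The &lt;code&gt;compareItems&lt;/code&gt; name is hypothetical, and the code assumes &lt;code&gt;created&lt;/code&gt; is an ISO-style string so that string comparison matches chronological order.&lt;/p&gt;

```javascript
// Hypothetical comparator mirroring the ORDER BY clause:
// entries first (newest first), then everything else (oldest first).
// Assumes created is an ISO-style string, so string order is date order.
function compareItems(a, b) {
  const rank = (item) => (item.type === "entry" ? 0 : 1);
  if (rank(a) !== rank(b)) {
    return rank(a) - rank(b);
  }
  if (a.created === b.created) {
    return 0;
  }
  const ascending = a.created > b.created ? 1 : -1;
  // Entries sort descending by date; blogmarks and quotations sort ascending.
  return a.type === "entry" ? -ascending : ascending;
}
```

&lt;p&gt;Passing this to &lt;code&gt;rows.sort(compareItems)&lt;/code&gt; would let the SQL drop its two &lt;code&gt;case&lt;/code&gt; expressions.&lt;/p&gt;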
&lt;p&gt;There are a bunch of tricks in that SQL query, but my favourite is this one:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;select&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://simonwillison.net/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%Y/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created)
  &lt;span class="pl-k"&gt;||&lt;/span&gt; substr(
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;JanFebMarAprMayJunJulAugSepOctNovDec&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;,
    (strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%m&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created) &lt;span class="pl-k"&gt;-&lt;/span&gt; &lt;span class="pl-c1"&gt;1&lt;/span&gt;) &lt;span class="pl-k"&gt;*&lt;/span&gt; &lt;span class="pl-c1"&gt;3&lt;/span&gt; &lt;span class="pl-k"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;1&lt;/span&gt;, &lt;span class="pl-c1"&gt;3&lt;/span&gt;
  ) &lt;span class="pl-k"&gt;||&lt;/span&gt;  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; cast(strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%d&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, created) &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-k"&gt;integer&lt;/span&gt;) &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; slug &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  &lt;span class="pl-k"&gt;as&lt;/span&gt; url&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is the trick I'm using to generate the URL for each entry, blogmark and quotation.&lt;/p&gt;
&lt;p&gt;The creation dates are stored as datetime values in the database, but the eventual URLs look like this:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://simonwillison.net/2023/Apr/2/calculator-for-words/"&gt;https://simonwillison.net/2023/Apr/2/calculator-for-words/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;So I need to turn that date into a YYYY/Mon/D URL component, where the day has no leading zero.&lt;/p&gt;
&lt;p&gt;One problem: SQLite doesn't have a date format string that produces a three letter month abbreviation. But... with cunning application of the &lt;code&gt;substr()&lt;/code&gt; function and a string of all the month abbreviations I can get what I need.&lt;/p&gt;
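&lt;p&gt;The same lookup is easy to check outside SQL. This is a minimal JavaScript sketch of the arithmetic, with hypothetical helper names (&lt;code&gt;monthAbbreviation&lt;/code&gt;, &lt;code&gt;blogUrl&lt;/code&gt;) and the assumption that &lt;code&gt;created&lt;/code&gt; starts with a &lt;code&gt;YYYY-MM-DD&lt;/code&gt; date:&lt;/p&gt;

```javascript
// All twelve abbreviations packed into one string, three characters each.
const MONTHS = "JanFebMarAprMayJunJulAugSepOctNovDec";

function monthAbbreviation(monthNumber) {
  // SQL substr() is 1-indexed, hence (month - 1) * 3 + 1 in the query.
  // JavaScript slice() is 0-indexed, so the + 1 is dropped here.
  const start = (monthNumber - 1) * 3;
  return MONTHS.slice(start, start + 3);
}

function blogUrl(created, slug) {
  // Assumes created begins with YYYY-MM-DD, e.g. "2023-04-02T12:00:00".
  const parts = created.slice(0, 10).split("-").map(Number);
  const year = parts[0];
  const month = parts[1];
  const day = parts[2]; // Number() drops the leading zero, like cast(... as integer)
  return "https://simonwillison.net/" + year + "/" + monthAbbreviation(month) + "/" + day + "/" + slug + "/";
}
```

&lt;p&gt;Month 4 gives a start offset of 9, which selects &lt;code&gt;Apr&lt;/code&gt;, so &lt;code&gt;blogUrl("2023-04-02T12:00:00", "calculator-for-words")&lt;/code&gt; produces the example URL shown earlier.&lt;/p&gt;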
&lt;p&gt;The above SQL query plus a little bit of JavaScript provides almost everything I need to generate the HTML for my newsletter.&lt;/p&gt;
&lt;h4&gt;Excluding previously sent content&lt;/h4&gt;
&lt;p&gt;There's one last problem to solve: I want to send a newsletter containing everything that's new since my last edition - I don't want to send out the same content twice.&lt;/p&gt;
&lt;p&gt;I came up with a delightfully gnarly solution to that as well.&lt;/p&gt;
&lt;p&gt;As mentioned earlier, Substack provides an RSS feed of previous editions. I can use that data to avoid including content that's already been sent.&lt;/p&gt;
&lt;p&gt;One problem: the Substack RSS feed doesn't include CORS headers, which means I can't access it directly from my notebook.&lt;/p&gt;
&lt;p&gt;GitHub offers CORS headers for every file in every repository. I already had a repo that was backing up my blog... so why not set that to back up my RSS feed from Substack as well?&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://github.com/simonw/simonwillisonblog-backup/blob/c42b3afd6bd8cb2a4e8fa928c77426ec71552194/.github/workflows/backup.yml#L70-L74"&gt;added this&lt;/a&gt; to my existing &lt;code&gt;backup.yml&lt;/code&gt; GitHub Actions workflow:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Backup Substack&lt;/span&gt;
  &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|-&lt;/span&gt;
&lt;span class="pl-s"&gt;    curl 'https://simonw.substack.com/feed' | \&lt;/span&gt;
&lt;span class="pl-s"&gt;      python -c "import sys, xml.dom.minidom; print(xml.dom.minidom.parseString(sys.stdin.read()).toprettyxml(indent='  '))" \&lt;/span&gt;
&lt;span class="pl-s"&gt;      &amp;gt; simonw-substack-com.xml&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I'm piping it through a tiny Python script here to pretty-print the XML before saving it, because diffs of pretty-printed XML are easier to read later on.&lt;/p&gt;
&lt;p&gt;Now &lt;a href="https://github.com/simonw/simonwillisonblog-backup/blob/c42b3afd6bd8cb2a4e8fa928c77426ec71552194/simonw-substack-com.xml"&gt;simonw-substack-com.xml&lt;/a&gt; is a copy of my RSS feed in a GitHub repo, which means I can access the data directly from JavaScript running on Observable.&lt;/p&gt;
&lt;p&gt;Here's the code I wrote there to fetch that RSS feed, parse it as XML and return a string containing just the HTML of all of the posts:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;previousNewsletters&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-s1"&gt;const&lt;/span&gt; response &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
    &lt;span class="pl-s"&gt;"https://raw.githubusercontent.com/simonw/simonwillisonblog-backup/main/simonw-substack-com.xml"&lt;/span&gt;
  &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;rss&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;parser&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;DOMParser&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;xmlDoc&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;parser&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;parseFromString&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;rss&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"application/xml"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;xpathExpression&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"//content:encoded"&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-en"&gt;namespaceResolver&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;prefix&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;ns&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;content&lt;/span&gt;: &lt;span class="pl-s"&gt;"http://purl.org/rss/1.0/modules/content/"&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;ns&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s1"&gt;prefix&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt; &lt;span class="pl-c1"&gt;null&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;result&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;xmlDoc&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;evaluate&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
    &lt;span class="pl-s1"&gt;xpathExpression&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-s1"&gt;xmlDoc&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-en"&gt;namespaceResolver&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-v"&gt;XPathResult&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;ANY_TYPE&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;null&lt;/span&gt;
  &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;node&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;text&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;while&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;node&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;result&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;iterateNext&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;push&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;node&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;textContent&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;join&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"\n"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then I spun up a regular expression to extract all of the URLs from that HTML:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;previousLinks&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-s1"&gt;const&lt;/span&gt; regex &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-pds"&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;span class="pl-kos"&gt;(?:&lt;/span&gt;"&lt;span class="pl-c1"&gt;|&lt;/span&gt;&amp;amp;quot;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;https?:&lt;span class="pl-cce"&gt;\/&lt;/span&gt;&lt;span class="pl-cce"&gt;\/&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;^&lt;span class="pl-cce"&gt;\s&lt;/span&gt;"&amp;lt;&amp;gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-c1"&gt;+&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;(?:&lt;/span&gt;"&lt;span class="pl-c1"&gt;|&lt;/span&gt;&amp;amp;quot;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;g&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-v"&gt;Array&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;from&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;previousNewsletters&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;matchAll&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;regex&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;match&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;match&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;1&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Added a "skip existing" toggle checkbox to my notebook:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;viewof&lt;/span&gt; &lt;span class="pl-s1"&gt;skipExisting&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Inputs&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;toggle&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c1"&gt;label&lt;/span&gt;: &lt;span class="pl-s"&gt;"Skip content sent in prior newsletters"&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And added this code to filter the raw content based on whether or not the toggle was selected:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;content&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;skipExisting&lt;/span&gt;
  ? &lt;span class="pl-s1"&gt;raw_content&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;filter&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
      &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;e&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="pl-c1"&gt;!&lt;/span&gt;&lt;span class="pl-s1"&gt;previousLinks&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;e&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;url&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
        &lt;span class="pl-c1"&gt;!&lt;/span&gt;&lt;span class="pl-s1"&gt;previousLinks&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;includes&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;e&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;external_url&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;)&lt;/span&gt;
  : &lt;span class="pl-s1"&gt;raw_content&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;url&lt;/code&gt; is the URL to the post on my blog. &lt;code&gt;external_url&lt;/code&gt; is the URL to the original source of the blogmark or quotation. A match against either of those should exclude the content from my next newsletter.&lt;/p&gt;
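&lt;p&gt;The same extract-then-filter idea can be sketched outside of Observable. This is a minimal Python illustration rather than the notebook's code: the &lt;code&gt;extract_links&lt;/code&gt; and &lt;code&gt;skip_existing&lt;/code&gt; names and the sample data are invented for the example, and the pattern is simplified to only handle plain double quotes around each URL.&lt;/p&gt;

```python
import re

# Simplified pattern (an assumption for this sketch): a quoted http(s) URL.
URL_PATTERN = re.compile(r'"(https?://[^\s"]+)"')


def extract_links(html):
    """Return the set of quoted http(s) URLs found in the HTML."""
    return set(URL_PATTERN.findall(html))


def skip_existing(raw_content, previous_links):
    """Drop entries whose url or external_url was already sent."""
    return [
        entry
        for entry in raw_content
        if entry.get("url") not in previous_links
        and entry.get("external_url") not in previous_links
    ]


# Hypothetical data, invented for this sketch
previous_html = (
    'href="https://example.com/old-post" and href="https://example.org/source"'
)
raw_content = [
    {"url": "https://example.com/old-post", "external_url": None},
    {
        "url": "https://example.com/new-post",
        "external_url": "https://example.org/source",
    },
    {"url": "https://example.com/fresh", "external_url": "https://example.org/other"},
]

content = skip_existing(raw_content, extract_links(previous_html))
print([entry["url"] for entry in content])
```

&lt;p&gt;Collecting the previous links into a set keeps each membership check cheap, which may matter once the archive of past newsletters grows.&lt;/p&gt;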
&lt;h4&gt;My workflow for sending a newsletter&lt;/h4&gt;
&lt;p&gt;Given all of the above, sending a newsletter out is hardly any work at all:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Ensure the most recent backup of my blog has run, such that the Datasette instance contains my latest content. I do that by &lt;a href="https://github.com/simonw/simonwillisonblog-backup/actions/workflows/backup.yml"&gt;triggering this action&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Navigate to &lt;a href="https://observablehq.com/@simonw/blog-to-newsletter"&gt;https://observablehq.com/@simonw/blog-to-newsletter&lt;/a&gt; - select "Skip content sent in prior newsletters" and then click the "Copy rich text newsletter to clipboard" button.&lt;/li&gt;
&lt;li&gt;Navigate to the Substack "publish" interface and paste that content into the rich text editor.&lt;/li&gt;
&lt;li&gt;Pick a title and subheading, and maybe add a bit of introductory text.&lt;/li&gt;
&lt;li&gt;Preview it. If the preview looks good, hit "send".&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/newsletter-small.gif" alt="Animated screenshot showing the process of sending the newsletter as described above" style="max-width: 100%;" loading="lazy" /&gt;&lt;/p&gt;
&lt;h4&gt;Copy and paste APIs&lt;/h4&gt;
&lt;p&gt;I think copy and paste is under-rated as an API mechanism.&lt;/p&gt;
&lt;p&gt;There are no rate limits or API keys to worry about.&lt;/p&gt;
&lt;p&gt;It's supported by almost every application, even ones that are resistant to API integrations.&lt;/p&gt;
&lt;p&gt;It even works great on mobile phones, especially if you include a "copy to clipboard" button.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://datasette.io/plugins/datasette-copyable"&gt;datasette-copyable&lt;/a&gt; plugin for Datasette is one of my earlier explorations of this. It makes it easy to copy data out of Datasette in a variety of useful formats.&lt;/p&gt;
&lt;p&gt;This Observable newsletter project has further convinced me that the clipboard is an under-utilized mechanism for building tools to help integrate data together in creative ways.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/blogging"&gt;blogging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/newsletter"&gt;newsletter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/substack"&gt;substack&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/site-upgrades"&gt;site-upgrades&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="blogging"/><category term="projects"/><category term="datasette"/><category term="observable"/><category term="cors"/><category term="newsletter"/><category term="substack"/><category term="site-upgrades"/></entry><entry><title>Tracking Mastodon user numbers over time with a bucket of tricks</title><link href="https://simonwillison.net/2022/Nov/20/tracking-mastodon/#atom-tag" rel="alternate"/><published>2022-11-20T07:00:54+00:00</published><updated>2022-11-20T07:00:54+00:00</updated><id>https://simonwillison.net/2022/Nov/20/tracking-mastodon/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://joinmastodon.org/"&gt;Mastodon&lt;/a&gt; is definitely having a moment. User growth is skyrocketing as more and more people migrate over from Twitter.&lt;/p&gt;
&lt;p&gt;I've set up a new &lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;git scraper&lt;/a&gt; to track the number of registered user accounts on known Mastodon instances over time.&lt;/p&gt;
&lt;p&gt;It's only been running for a few hours, but it's already collected enough data to &lt;a href="https://observablehq.com/@simonw/mastodon-users-and-statuses-over-time"&gt;render this chart&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/mastodon-users-few-hours.png" alt="The chart starts at around 1am with 4,694,000 users - it climbs to 4,716,000 users by 6am in a relatively straight line" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I'm looking forward to seeing how this trend continues to develop over the next days and weeks.&lt;/p&gt;
&lt;h4&gt;Scraping the data&lt;/h4&gt;
&lt;p&gt;My scraper works by tracking &lt;a href="https://instances.social/"&gt;https://instances.social/&lt;/a&gt; - a website that lists a large number (but not all) of the Mastodon instances that are out there.&lt;/p&gt;
&lt;p&gt;That site publishes an &lt;a href="https://instances.social/instances.json"&gt;instances.json&lt;/a&gt; array which currently contains 1,830 objects representing Mastodon instances. Each of those objects looks something like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
    &lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;pleroma.otter.sh&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Otterland&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"short_description"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"description"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Otters does squeak squeak&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"uptime"&lt;/span&gt;: &lt;span class="pl-c1"&gt;0.944757&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"up"&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"https_score"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"https_rank"&lt;/span&gt;: &lt;span class="pl-c1"&gt;null&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"ipv6"&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"openRegistrations"&lt;/span&gt;: &lt;span class="pl-c1"&gt;false&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"users"&lt;/span&gt;: &lt;span class="pl-c1"&gt;5&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"statuses"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;54870&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"connections"&lt;/span&gt;: &lt;span class="pl-c1"&gt;9821&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I have &lt;a href="https://github.com/simonw/scrape-instances-social/blob/main/.github/workflows/scrape.yml"&gt;a GitHub Actions workflow&lt;/a&gt; running approximately every 20 minutes that fetches a copy of that file and commits it back to this repository:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/scrape-instances-social"&gt;https://github.com/simonw/scrape-instances-social&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Since each instance includes a &lt;code&gt;users&lt;/code&gt; count, the commit history of my &lt;code&gt;instances.json&lt;/code&gt; file tells the story of Mastodon's growth over time.&lt;/p&gt;
&lt;h4&gt;Building a database&lt;/h4&gt;
&lt;p&gt;A commit log of a JSON file is interesting, but the next step is to turn that into actionable information.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://simonwillison.net/2021/Dec/7/git-history/"&gt;git-history tool&lt;/a&gt; is designed to do exactly that.&lt;/p&gt;
&lt;p&gt;For the chart up above, the only number I care about is the total number of users listed in each snapshot of the file - the sum of that &lt;code&gt;users&lt;/code&gt; field for each instance.&lt;/p&gt;
&lt;p&gt;Here's how to run &lt;code&gt;git-history&lt;/code&gt; against that file's commit history to generate tables showing how that count has changed over time:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;git-history file counts.db instances.json \
  --convert &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;return [&lt;/span&gt;
&lt;span class="pl-s"&gt;    {&lt;/span&gt;
&lt;span class="pl-s"&gt;        'id': 'all',&lt;/span&gt;
&lt;span class="pl-s"&gt;        'users': sum(d['users'] or 0 for d in json.loads(content)),&lt;/span&gt;
&lt;span class="pl-s"&gt;        'statuses': sum(int(d['statuses'] or 0) for d in json.loads(content)),&lt;/span&gt;
&lt;span class="pl-s"&gt;    }&lt;/span&gt;
&lt;span class="pl-s"&gt;  ]&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; --id id&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I'm creating a file called &lt;code&gt;counts.db&lt;/code&gt; that shows the history of the &lt;code&gt;instances.json&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;The real trick here though is that &lt;code&gt;--convert&lt;/code&gt; argument. I'm using that to compress each snapshot down to a single row that looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
    &lt;span class="pl-ent"&gt;"id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;all&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"users"&lt;/span&gt;: &lt;span class="pl-c1"&gt;4717781&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"statuses"&lt;/span&gt;: &lt;span class="pl-c1"&gt;374217860&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt;
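&lt;p&gt;The &lt;code&gt;--convert&lt;/code&gt; snippet is ordinary Python, so the same logic can be tried on its own. Here is a minimal sketch, assuming a hypothetical &lt;code&gt;summarize()&lt;/code&gt; wrapper and made-up sample numbers. It parses the JSON once rather than twice, but otherwise mirrors the expression above.&lt;/p&gt;

```python
import json


def summarize(content):
    """Collapse one instances.json snapshot into a single summary row."""
    instances = json.loads(content)
    return [
        {
            "id": "all",
            # users can be null in the source data, so fall back to 0
            "users": sum(d["users"] or 0 for d in instances),
            # statuses arrives as a string (or null), so convert to int
            "statuses": sum(int(d["statuses"] or 0) for d in instances),
        }
    ]


# Hypothetical snapshot with invented values
snapshot = json.dumps(
    [
        {"name": "a.example", "users": 5, "statuses": "54870"},
        {"name": "b.example", "users": None, "statuses": None},
        {"name": "c.example", "users": 10, "statuses": "130"},
    ]
)
rows = summarize(snapshot)
print(rows)
```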
&lt;p&gt;Normally &lt;code&gt;git-history&lt;/code&gt; expects to work against an array of objects, tracking the history of changes to each one based on their &lt;code&gt;id&lt;/code&gt; property.&lt;/p&gt;
&lt;p&gt;Here I'm tricking it a bit - I only return a single object with the ID of &lt;code&gt;all&lt;/code&gt;. This means that &lt;code&gt;git-history&lt;/code&gt; will only track the history of changes to that single object.&lt;/p&gt;
&lt;p&gt;It works though! The result is a &lt;code&gt;counts.db&lt;/code&gt; file which is currently 52KB and has the following schema (truncated to the most interesting bits):&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;CREATE TABLE [commits] (
   [id] &lt;span class="pl-k"&gt;INTEGER&lt;/span&gt; &lt;span class="pl-k"&gt;PRIMARY KEY&lt;/span&gt;,
   [namespace] &lt;span class="pl-k"&gt;INTEGER&lt;/span&gt; &lt;span class="pl-k"&gt;REFERENCES&lt;/span&gt; [namespaces]([id]),
   [hash] &lt;span class="pl-k"&gt;TEXT&lt;/span&gt;,
   [commit_at] &lt;span class="pl-k"&gt;TEXT&lt;/span&gt;
);
CREATE TABLE [item_version] (
   [_id] &lt;span class="pl-k"&gt;INTEGER&lt;/span&gt; &lt;span class="pl-k"&gt;PRIMARY KEY&lt;/span&gt;,
   [_item] &lt;span class="pl-k"&gt;INTEGER&lt;/span&gt; &lt;span class="pl-k"&gt;REFERENCES&lt;/span&gt; [item]([_id]),
   [_version] &lt;span class="pl-k"&gt;INTEGER&lt;/span&gt;,
   [_commit] &lt;span class="pl-k"&gt;INTEGER&lt;/span&gt; &lt;span class="pl-k"&gt;REFERENCES&lt;/span&gt; [commits]([id]),
   [id] &lt;span class="pl-k"&gt;TEXT&lt;/span&gt;,
   [users] &lt;span class="pl-k"&gt;INTEGER&lt;/span&gt;,
   [statuses] &lt;span class="pl-k"&gt;INTEGER&lt;/span&gt;,
   [_item_full_hash] &lt;span class="pl-k"&gt;TEXT&lt;/span&gt;
);&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Each &lt;code&gt;item_version&lt;/code&gt; row will tell us the number of users and statuses at a particular point in time, based on a join against that &lt;code&gt;commits&lt;/code&gt; table to find the &lt;code&gt;commit_at&lt;/code&gt; date.&lt;/p&gt;
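&lt;p&gt;To illustrate that join, here is a small Python sketch using the standard library &lt;code&gt;sqlite3&lt;/code&gt; module. It builds an in-memory database with a cut-down version of the two tables and a couple of invented rows, so the hashes, dates and counts are assumptions rather than real data from &lt;code&gt;counts.db&lt;/code&gt;.&lt;/p&gt;

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE [commits] (
       [id] INTEGER PRIMARY KEY,
       [hash] TEXT,
       [commit_at] TEXT
    );
    CREATE TABLE [item_version] (
       [_id] INTEGER PRIMARY KEY,
       [_version] INTEGER,
       [_commit] INTEGER REFERENCES [commits]([id]),
       [id] TEXT,
       [users] INTEGER,
       [statuses] INTEGER
    );
    -- Invented sample rows for this sketch
    INSERT INTO commits VALUES (1, 'aaa', '2022-11-20T01:00:00+00:00');
    INSERT INTO commits VALUES (2, 'bbb', '2022-11-20T01:20:00+00:00');
    INSERT INTO item_version VALUES (1, 1, 1, 'all', 4694000, 374000000);
    INSERT INTO item_version VALUES (2, 2, 2, 'all', 4695500, 374100000);
    """
)

# Each item_version row picks up its timestamp from the matching commit
rows = db.execute(
    """
    SELECT commits.commit_at, item_version.users, item_version.statuses
    FROM item_version
    JOIN commits ON commits.id = item_version._commit
    ORDER BY commits.commit_at
    """
).fetchall()
for row in rows:
    print(row)
```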
&lt;h4&gt;Publishing the database&lt;/h4&gt;
&lt;p&gt;For this project, I decided to publish the SQLite database to an S3 bucket. I considered pushing the binary SQLite file directly to the GitHub repository but this felt rude, since a binary file that changes every 20 minutes would bloat the repository.&lt;/p&gt;
&lt;p&gt;I wanted to serve the file with open CORS headers so I could load it into Datasette Lite and Observable notebooks.&lt;/p&gt;
&lt;p&gt;I used my &lt;a href="https://s3-credentials.readthedocs.io/"&gt;s3-credentials&lt;/a&gt; tool to create a bucket for this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;~ % s3-credentials create scrape-instances-social --public --website --create-bucket
Created bucket: scrape-instances-social
Attached bucket policy allowing public access
Configured website: IndexDocument=index.html, ErrorDocument=error.html
Created  user: 's3.read-write.scrape-instances-social' with permissions boundary: 'arn:aws:iam::aws:policy/AmazonS3FullAccess'
Attached policy s3.read-write.scrape-instances-social to user s3.read-write.scrape-instances-social
Created access key for user: s3.read-write.scrape-instances-social
{
    "UserName": "s3.read-write.scrape-instances-social",
    "AccessKeyId": "AKIAWXFXAIOZI5NUS6VU",
    "Status": "Active",
    "SecretAccessKey": "...",
    "CreateDate": "2022-11-20 05:52:22+00:00"
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This created a new bucket called &lt;code&gt;scrape-instances-social&lt;/code&gt; configured to work as a website and allow public access.&lt;/p&gt;
&lt;p&gt;It also generated an access key and a secret access key with access to just that bucket. I saved these in GitHub Actions secrets called &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt; and &lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I enabled a CORS policy on the bucket like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;s3-credentials set-cors-policy scrape-instances-social
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then I added the following to my GitHub Actions workflow to build and upload the database after each run of the scraper:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;    - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Build and publish database using git-history&lt;/span&gt;
      &lt;span class="pl-ent"&gt;env&lt;/span&gt;:
        &lt;span class="pl-ent"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ secrets.AWS_ACCESS_KEY_ID }}&lt;/span&gt;
        &lt;span class="pl-ent"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ secrets.AWS_SECRET_ACCESS_KEY }}&lt;/span&gt;
      &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|-&lt;/span&gt;
&lt;span class="pl-s"&gt;        # First download previous database to save some time&lt;/span&gt;
&lt;span class="pl-s"&gt;        wget https://scrape-instances-social.s3.amazonaws.com/counts.db&lt;/span&gt;
&lt;span class="pl-s"&gt;        # Update with latest commits&lt;/span&gt;
&lt;span class="pl-s"&gt;        ./build-count-history.sh&lt;/span&gt;
&lt;span class="pl-s"&gt;        # Upload to S3&lt;/span&gt;
&lt;span class="pl-s"&gt;        s3-credentials put-object scrape-instances-social counts.db counts.db \&lt;/span&gt;
&lt;span class="pl-s"&gt;          --access-key $AWS_ACCESS_KEY_ID \&lt;/span&gt;
&lt;span class="pl-s"&gt;          --secret-key $AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;git-history&lt;/code&gt; knows how to only process commits since the last time the database was built, so downloading the previous copy saves a lot of time.&lt;/p&gt;
&lt;h4&gt;Exploring the data&lt;/h4&gt;
&lt;p&gt;Now that I have a SQLite database that's being served over CORS-enabled HTTPS I can open it in &lt;a href="https://simonwillison.net/2022/May/4/datasette-lite/"&gt;Datasette Lite&lt;/a&gt; - my implementation of Datasette compiled to WebAssembly that runs entirely in a browser.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/?url=https://scrape-instances-social.s3.amazonaws.com/counts.db"&gt;https://lite.datasette.io/?url=https://scrape-instances-social.s3.amazonaws.com/counts.db&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Any time anyone follows this link their browser will fetch the latest copy of the &lt;code&gt;counts.db&lt;/code&gt; file directly from S3.&lt;/p&gt;
&lt;p&gt;The most interesting page in there is the &lt;code&gt;item_version_detail&lt;/code&gt; SQL view, which joins against the commits table to show the date of each change:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/?url=https://scrape-instances-social.s3.amazonaws.com/counts.db#/counts/item_version_detail"&gt;https://lite.datasette.io/?url=https://scrape-instances-social.s3.amazonaws.com/counts.db#/counts/item_version_detail&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;(Datasette Lite lets you link directly to pages within Datasette itself via a &lt;code&gt;#hash&lt;/code&gt;.)&lt;/p&gt;
&lt;h4&gt;Plotting a chart&lt;/h4&gt;
&lt;p&gt;Datasette Lite doesn't have charting yet, so I decided to turn to my favourite visualization tool, an &lt;a href="https://observablehq.com/"&gt;Observable&lt;/a&gt; notebook.&lt;/p&gt;
&lt;p&gt;Observable has the ability to query SQLite databases (that are served via CORS) directly these days!&lt;/p&gt;
&lt;p&gt;Here's my notebook:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://observablehq.com/@simonw/mastodon-users-and-statuses-over-time"&gt;https://observablehq.com/@simonw/mastodon-users-and-statuses-over-time&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;There are only four cells needed to create the chart shown above.&lt;/p&gt;
&lt;p&gt;First, we need to open the SQLite database from the remote URL:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;database&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;SQLiteDatabaseClient&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;open&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-s"&gt;"https://scrape-instances-social.s3.amazonaws.com/counts.db"&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next we need to use an Observable Database query cell to execute SQL against that database and pull out the data we want to plot - and store it in a &lt;code&gt;query&lt;/code&gt; variable:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;SELECT&lt;/span&gt; _commit_at &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-k"&gt;date&lt;/span&gt;, users, statuses
&lt;span class="pl-k"&gt;FROM&lt;/span&gt; item_version_detail&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We need to make one change to that data - we need to convert the &lt;code&gt;date&lt;/code&gt; column from a string to a JavaScript date object:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;points&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;query&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c1"&gt;date&lt;/span&gt;: &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;Date&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;date&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;users&lt;/span&gt;: &lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;users&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;statuses&lt;/span&gt;: &lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;statuses&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Finally, we can plot the data using the &lt;a href="https://observablehq.com/@observablehq/plot"&gt;Observable Plot&lt;/a&gt; charting library like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-v"&gt;Plot&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;plot&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c1"&gt;y&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;grid&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;label&lt;/span&gt;: &lt;span class="pl-s"&gt;"Total users over time across all tracked instances"&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;marks&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-v"&gt;Plot&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;line&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;points&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;x&lt;/span&gt;: &lt;span class="pl-s"&gt;"date"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;y&lt;/span&gt;: &lt;span class="pl-s"&gt;"users"&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;marginLeft&lt;/span&gt;: &lt;span class="pl-c1"&gt;100&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I added 100px of margin to the left of the chart to ensure there was space for the large (4,696,000 and up) labels on the y-axis.&lt;/p&gt;
&lt;h4&gt;A bunch of tricks combined&lt;/h4&gt;
&lt;p&gt;This project combines a whole bunch of tricks I've been pulling together over the past few years:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;Git scraping&lt;/a&gt; is the technique I use to gather the initial data, turning a static listing of instances into a record of changes over time&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/tools/git-history"&gt;git-history&lt;/a&gt; is my tool for turning a scraped Git history into a SQLite database that's easier to work with&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://s3-credentials.readthedocs.io/"&gt;s3-credentials&lt;/a&gt; makes working with S3 buckets - in particular creating credentials that are restricted to just one bucket - much less frustrating&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/May/4/datasette-lite/"&gt;Datasette Lite&lt;/a&gt; means that once you have a SQLite database online somewhere you can explore it in your browser - without having to run my full server-side &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; Python application on a machine somewhere&lt;/li&gt;
&lt;li&gt;And finally, combining the above means I can take advantage of &lt;a href="https://observablehq.com/"&gt;Observable notebooks&lt;/a&gt; for ad-hoc visualization of data that's hosted online, in this case as a static SQLite database file served from S3&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-history"&gt;git-history&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/s3-credentials"&gt;s3-credentials&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mastodon"&gt;mastodon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="github"/><category term="projects"/><category term="datasette"/><category term="observable"/><category term="github-actions"/><category term="git-scraping"/><category term="git-history"/><category term="s3-credentials"/><category term="datasette-lite"/><category term="mastodon"/><category term="cors"/></entry><entry><title>Datasette Lite: Loading JSON data</title><link href="https://simonwillison.net/2022/Nov/18/datasette-lite-loading-json-data/#atom-tag" rel="alternate"/><published>2022-11-18T18:43:48+00:00</published><updated>2022-11-18T18:43:48+00:00</updated><id>https://simonwillison.net/2022/Nov/18/datasette-lite-loading-json-data/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-lite/blob/main/README.md#loading-json-data"&gt;Datasette Lite: Loading JSON data&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I added a new feature to Datasette Lite: you can now pass it the URL to a JSON file (hosted on a CORS-compatible hosting provider such as GitHub or GitHub Gists) and it will load that file into a database table for you. It expects an array of objects, but if your file has an object as the root it will search through it looking for the first key that is an array of objects and load those instead.
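&lt;p&gt;The lookup rule described above could be sketched roughly like this. This is an illustration only, not Datasette Lite's actual implementation: the &lt;code&gt;find_rows&lt;/code&gt; and &lt;code&gt;is_list_of_objects&lt;/code&gt; names are hypothetical, and it assumes that only top-level keys are searched and that an empty array does not count.&lt;/p&gt;

```python
import json


def is_list_of_objects(value):
    """True for a non-empty list whose items are all JSON objects (dicts)."""
    return (
        isinstance(value, list)
        and bool(value)
        and all(isinstance(item, dict) for item in value)
    )


def find_rows(data):
    """Return the rows to load from parsed JSON, or None if none are found."""
    if is_list_of_objects(data):
        return data
    if isinstance(data, dict):
        # Assumption: only top-level keys are checked, in document order.
        for value in data.values():
            if is_list_of_objects(value):
                return value
    return None


# Hypothetical document with an object at the root
document = json.loads(
    '{"generated": "2022-11-18", "tags": ["a", "b"], '
    '"items": [{"id": 1}, {"id": 2}]}'
)
rows = find_rows(document)
print(rows)
```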

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://github.com/simonw/datasette-lite/issues/54"&gt;Issue 54: ?json=URL parameter for loading JSON data&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="projects"/><category term="datasette"/><category term="datasette-lite"/><category term="cors"/></entry><entry><title>Plugin support for Datasette Lite</title><link href="https://simonwillison.net/2022/Aug/17/datasette-lite-plugins/#atom-tag" rel="alternate"/><published>2022-08-17T18:20:59+00:00</published><updated>2022-08-17T18:20:59+00:00</updated><id>https://simonwillison.net/2022/Aug/17/datasette-lite-plugins/#atom-tag</id><summary type="html">
    &lt;p&gt;I've added a new feature to &lt;a href="https://lite.datasette.io/"&gt;Datasette Lite&lt;/a&gt;, my distribution of &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; that runs entirely in the browser using Python and SQLite compiled to WebAssembly. You can now install additional &lt;a href="https://datasette.io/plugins"&gt;Datasette plugins&lt;/a&gt; by passing them in the URL.&lt;/p&gt;
&lt;h4&gt;Datasette Lite background&lt;/h4&gt;
&lt;p&gt;Datasette Lite runs Datasette in the browser. I initially built it as a fun technical proof of concept, but I'm increasingly finding it to be a genuinely useful tool for quick ad-hoc data analysis and publication. Not having any server-side components at all makes it effectively free to use without fear of racking up cloud computing costs for a throwaway project.&lt;/p&gt;
&lt;p&gt;You can read more about Datasette Lite in these posts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2022/May/4/datasette-lite/"&gt;Datasette Lite: a server-side Python web application running in a browser&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2022/Jun/20/datasette-lite-csvs/"&gt;Joining CSV files in your browser using Datasette Lite&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/2022/Jul/30/psf-board/#scraping-datasette-lite"&gt;Scraping data into Datasette Lite&lt;/a&gt; shows an example project where I scraped PSF board resolutions, stored the results in a CSV file in &lt;a href="https://gist.github.com/simonw/71b03ca3688c8f37fe1b35859ebc458b"&gt;a GitHub Gist&lt;/a&gt; and then constructed &lt;a href="https://lite.datasette.io/?csv=https%3A%2F%2Fgist.githubusercontent.com%2Fsimonw%2F71b03ca3688c8f37fe1b35859ebc458b%2Fraw%2Fd3ca708e5d6242848201d1d4c1c2ddbba15c4d28%2Fpsf-resolutions.csv#/data?sql=with+filtered+as+%28%0A++select+*+from%0A++++%5Bpsf-resolutions%5D%0A++where%0A++++%22dollars%22+is+not+null%0A++++and+%22text%22+like+%27%25%27+%7C%7C+%3Asearch+%7C%7C+%27%25%27%0A%29%0Aselect%0A++%27Total%3A+%24%27+%7C%7C+printf%28%27%25%2Cd%27%2C+sum%28dollars%29%29+as+text%2C%0A++null+as+date%0Afrom+filtered%0Aunion+all%0Aselect%0A++text%2C+date%0Afrom+filtered%3B&amp;amp;search=Nigeria"&gt;this URL&lt;/a&gt; to open the result in Datasette Lite and execute a SQL query.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Adding plugins to Datasette Lite&lt;/h4&gt;
&lt;p&gt;One of Datasette's key features is support for plugins. There are over 90 listed in the &lt;a href="https://datasette.io/plugins"&gt;plugin directory&lt;/a&gt; now, with more emerging all the time. They're a fantastic way to explore new feature ideas and extend the software to handle non-default use cases.&lt;/p&gt;
&lt;p&gt;Plugins are Python packages, published to &lt;a href="https://pypi.org/"&gt;PyPI&lt;/a&gt;. You can add them to Datasette Lite using the new &lt;code&gt;?install=name-of-plugin&lt;/code&gt; query string parameter.&lt;/p&gt;
&lt;p&gt;Here's an example URL that loads the &lt;a href="https://datasette.io/plugins/datasette-jellyfish"&gt;datasette-jellyfish&lt;/a&gt; plugin, which adds new SQL functions for calculating distances between strings, then executes a SQL query that demonstrates that plugin:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/?install=datasette-jellyfish#/fixtures?sql=SELECT%0A++++levenshtein_distance%28%3As1%2C+%3As2%29%2C%0A++++damerau_levenshtein_distance%28%3As1%2C+%3As2%29%2C%0A++++hamming_distance%28%3As1%2C+%3As2%29%2C%0A++++jaro_similarity%28%3As1%2C+%3As2%29%2C%0A++++jaro_winkler_similarity%28%3As1%2C+%3As2%29%2C%0A++++match_rating_comparison%28%3As1%2C+%3As2%29%3B&amp;amp;s1=barrack+obama&amp;amp;s2=barrack+h+obama"&gt;https://lite.datasette.io/?install=datasette-jellyfish#/fixtures?sql=SELECT%0A++++levenshtein_distance%28%3As1%2C+%3As2%29%2C%0A++++damerau_levenshtein_distance%28%3As1%2C+%3As2%29%2C%0A++++hamming_distance%28%3As1%2C+%3As2%29%2C%0A++++jaro_similarity%28%3As1%2C+%3As2%29%2C%0A++++jaro_winkler_similarity%28%3As1%2C+%3As2%29%2C%0A++++match_rating_comparison%28%3As1%2C+%3As2%29%3B&amp;amp;s1=barrack+obama&amp;amp;s2=barrack+h+obama&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;That URL uses &lt;code&gt;?install=datasette-jellyfish&lt;/code&gt; to install the plugin, then executes the following SQL query:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;SELECT&lt;/span&gt;
    levenshtein_distance(:s1, :s2),
    damerau_levenshtein_distance(:s1, :s2),
    hamming_distance(:s1, :s2),
    jaro_similarity(:s1, :s2),
    jaro_winkler_similarity(:s1, :s2),
    match_rating_comparison(:s1, :s2);&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It sets &lt;code&gt;s1&lt;/code&gt; to &lt;code&gt;"barrack obama"&lt;/code&gt; and &lt;code&gt;s2&lt;/code&gt; to &lt;code&gt;"barrack h obama"&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/datasette-lite-plugin.jpg" alt="Screenshot showing the results of that SQL query running in Datasette Lite. It compares the string barrack obama with the string barrack h obama and shows various different scores." style="max-width: 100%;" /&gt;&lt;/p&gt;
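&lt;p&gt;URLs like the one above can also be assembled programmatically. Here's a minimal sketch, assuming the layout seen in that example: &lt;code&gt;?install=&lt;/code&gt; goes in the regular query string, while the Datasette path and its &lt;code&gt;sql&lt;/code&gt; parameters go in the fragment. The &lt;code&gt;build_lite_url()&lt;/code&gt; helper is hypothetical and not part of Datasette Lite, and the query is shortened to a single function call.&lt;/p&gt;

```python
from urllib.parse import urlencode

BASE_URL = "https://lite.datasette.io/"


def build_lite_url(plugins, database, sql, params):
    """Return a Datasette Lite URL that installs plugins and runs a query.

    Hypothetical helper for illustration; not part of Datasette Lite.
    """
    # One install= pair per plugin, placed in the regular query string.
    query = urlencode([("install", name) for name in plugins])
    # The Datasette path and its own query string live in the URL fragment.
    fragment_query = urlencode({"sql": sql, **params})
    return BASE_URL + "?" + query + "#/" + database + "?" + fragment_query


url = build_lite_url(
    plugins=["datasette-jellyfish"],
    database="fixtures",
    sql="SELECT levenshtein_distance(:s1, :s2)",
    params={"s1": "barrack obama", "s2": "barrack h obama"},
)
print(url)
```

&lt;p&gt;&lt;code&gt;urlencode()&lt;/code&gt; takes care of the percent-encoding, so the SQL can be written as plain text.&lt;/p&gt;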
&lt;h4&gt;Plugin compatibility&lt;/h4&gt;
&lt;p&gt;Unfortunately, many existing Datasette plugins aren't yet compatible with Datasette Lite. Most importantly, visualization plugins such as &lt;a href="https://datasette.io/plugins/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt; and &lt;a href="https://datasette.io/plugins/datasette-vega"&gt;datasette-vega&lt;/a&gt; don't work.&lt;/p&gt;
&lt;p&gt;This is because I haven't yet solved the challenge of loading additional JavaScript and CSS into Datasette Lite - see &lt;a href="https://github.com/simonw/datasette-lite/issues/8"&gt;issue #8&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here's the full list of plugins that I've confirmed work with Datasette Lite so far:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-packages"&gt;datasette-packages&lt;/a&gt; - Show a list of currently installed Python packages - &lt;a href="https://lite.datasette.io/?install=datasette-packages#/-/packages"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-dateutil"&gt;datasette-dateutil&lt;/a&gt; - dateutil functions for Datasette - &lt;a href="https://lite.datasette.io/?install=datasette-dateutil#/fixtures?sql=select%0A++dateutil_parse%28%2210+october+2020+3pm%22%29%2C%0A++dateutil_parse_fuzzy%28%22This+is+due+10+september%22%29%2C%0A++dateutil_parse%28%221%2F2%2F2020%22%29%2C%0A++dateutil_parse%28%222020-03-04%22%29%2C%0A++dateutil_parse_dayfirst%28%222020-03-04%22%29%3B"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-schema-versions"&gt;datasette-schema-versions&lt;/a&gt; - Datasette plugin that shows the schema version of every attached database - &lt;a href="https://lite.datasette.io/?install=datasette-schema-versions#/-/schema-versions"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-debug-asgi"&gt;datasette-debug-asgi&lt;/a&gt; - Datasette plugin for dumping out the ASGI scope. - &lt;a href="https://lite.datasette.io/?install=datasette-debug-asgi#/-/asgi-scope"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-query-links"&gt;datasette-query-links&lt;/a&gt; - Turn SELECT queries returned by a query into links to execute them - &lt;a href="https://lite.datasette.io/?install=datasette-query-links#/fixtures?sql=select%0D%0A++'select+*+from+%5Bfacetable%5D'+as+query%0D%0Aunion%0D%0Aselect%0D%0A++'select+sqlite_version()'%0D%0Aunion%0D%0Aselect%0D%0A++'select+this+is+invalid+SQL+so+will+not+be+linked'"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-json-html"&gt;datasette-json-html&lt;/a&gt; - Datasette plugin for rendering HTML based on JSON values - &lt;a href="https://lite.datasette.io/?install=datasette-json-html#/fixtures?sql=select+%27%5B%0A++++%7B%0A++++++++%22href%22%3A+%22https%3A%2F%2Fsimonwillison.net%2F%22%2C%0A++++++++%22label%22%3A+%22Simon+Willison%22%0A++++%7D%2C%0A++++%7B%0A++++++++%22href%22%3A+%22https%3A%2F%2Fgithub.com%2Fsimonw%2Fdatasette%22%2C%0A++++++++%22label%22%3A+%22Datasette%22%0A++++%7D%0A%5D%27+as+output"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-haversine"&gt;datasette-haversine&lt;/a&gt; - Datasette plugin that adds a custom SQL function for haversine distances - &lt;a href="https://lite.datasette.io/?install=datasette-haversine#/fixtures?sql=select+haversine%280%2C+154%2C+1%2C+131%29"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-jellyfish"&gt;datasette-jellyfish&lt;/a&gt; - Datasette plugin that adds custom SQL functions for fuzzy string matching, built on top of the Jellyfish Python library - &lt;a href="https://lite.datasette.io/?install=datasette-jellyfish#/fixtures?sql=SELECT%0A++++levenshtein_distance%28%3As1%2C+%3As2%29%2C%0A++++damerau_levenshtein_distance%28%3As1%2C+%3As2%29%2C%0A++++hamming_distance%28%3As1%2C+%3As2%29%2C%0A++++jaro_similarity%28%3As1%2C+%3As2%29%2C%0A++++jaro_winkler_similarity%28%3As1%2C+%3As2%29%2C%0A++++match_rating_comparison%28%3As1%2C+%3As2%29%3B&amp;amp;s1=barrack+obama&amp;amp;s2=barrack+h+obama"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-pretty-json"&gt;datasette-pretty-json&lt;/a&gt; - Datasette plugin that pretty-prints any column values that are valid JSON objects or arrays. - &lt;a href="https://lite.datasette.io/?install=datasette-pretty-json#/fixtures?sql=select+%27%7B%22this%22%3A+%5B%22is%22%2C+%22nested%22%2C+%22json%22%5D%7D%27"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-yaml"&gt;datasette-yaml&lt;/a&gt; - Export Datasette records as YAML - &lt;a href="https://lite.datasette.io/?install=datasette-yaml#/fixtures/compound_three_primary_keys.yaml"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/plugins/datasette-copyable"&gt;datasette-copyable&lt;/a&gt; - Datasette plugin for outputting tables in formats suitable for copy and paste - &lt;a href="https://lite.datasette.io/?install=datasette-copyable#/fixtures/compound_three_primary_keys.copyable?_table_format=github"&gt;demo&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;How it works&lt;/h4&gt;
&lt;p&gt;The implementation is pretty simple - it can be seen &lt;a href="https://github.com/simonw/datasette-lite/commit/5e348e84ff2cb38e4749aa36182db78e235dd9bc"&gt;in this commit&lt;/a&gt;. The short version is that &lt;code&gt;?install=&lt;/code&gt; options are passed through to the Python web worker that powers Datasette Lite, which then runs the following:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;install_url&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;install_urls&lt;/span&gt;:
    &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;micropip&lt;/span&gt;.&lt;span class="pl-en"&gt;install&lt;/span&gt;(&lt;span class="pl-s1"&gt;install_url&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;&lt;a href="https://pyodide.org/en/stable/usage/api/micropip-api.html"&gt;micropip&lt;/a&gt; is a component of Pyodide which knows how to install pure Python wheels directly from PyPI into the browser's emulated Python environment. If you open up the browser devtools networking panel you can see that in action!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/datasette-lite-devtools.jpg" alt="The Firefox Network pane shows a flurry of traffic, some of it to PyPI to look up the JSON descriptions of packages followed by downloads of .whl files from files.pythonhosted.org" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Since the &lt;code&gt;?install=&lt;/code&gt; parameter is being passed directly to &lt;code&gt;micropip.install()&lt;/code&gt; you don't even need to provide names of packages hosted on PyPI - you could instead provide the URL to a wheel file that you're hosting elsewhere.&lt;/p&gt;
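&lt;p&gt;To illustrate the two kinds of value, here's a rough sketch in plain Python that pulls every &lt;code&gt;?install=&lt;/code&gt; value out of a page URL and labels it as a package name or a wheel URL. This is not the code Datasette Lite uses (that lives in the JavaScript linked above). The &lt;code&gt;install_targets()&lt;/code&gt; and &lt;code&gt;describe()&lt;/code&gt; helpers and the &lt;code&gt;example.com&lt;/code&gt; wheel URL are made up for illustration.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def install_targets(page_url):
    """Return every install= value from the page URL, in order."""
    query = urlsplit(page_url).query
    return parse_qs(query).get("install", [])


def describe(target):
    # Hypothetical rule: a value with an http(s) scheme is treated as a
    # wheel URL, anything else as a package name to look up on PyPI.
    if target.startswith(("http://", "https://")):
        return "wheel URL"
    return "PyPI package name"


# Example page URL with one package name and one made-up wheel URL.
page_url = "https://lite.datasette.io/?" + urlencode([
    ("install", "datasette-jellyfish"),
    ("install", "https://example.com/my_plugin-0.1-py3-none-any.whl"),
])

for target in install_targets(page_url):
    print(target, "is a", describe(target))
```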
&lt;p&gt;This means you can use &lt;code&gt;?install=&lt;/code&gt; as a code injection attack - you can install any Python code you want into the environment. I think that's fine - the only person who will be affected by this is the user who is viewing the page, and the &lt;code&gt;lite.datasette.io&lt;/code&gt; domain deliberately doesn't have any cookies set that could cause problems if someone were to steal them in some way.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pypi"&gt;pypi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pyodide"&gt;pyodide&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="pypi"/><category term="datasette"/><category term="webassembly"/><category term="pyodide"/><category term="datasette-lite"/><category term="cors"/></entry><entry><title>Joining CSV files in your browser using Datasette Lite</title><link href="https://simonwillison.net/2022/Jun/20/datasette-lite-csvs/#atom-tag" rel="alternate"/><published>2022-06-20T21:20:16+00:00</published><updated>2022-06-20T21:20:16+00:00</updated><id>https://simonwillison.net/2022/Jun/20/datasette-lite-csvs/#atom-tag</id><summary type="html">
    &lt;p&gt;I added a new feature to &lt;a href="https://lite.datasette.io/"&gt;Datasette Lite&lt;/a&gt; - my version of &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; that runs entirely in your browser using WebAssembly (&lt;a href="https://simonwillison.net/2022/May/4/datasette-lite/"&gt;previously&lt;/a&gt;): you can now use it to load one or more CSV files by URL, and then run SQL queries against them - including joins across data from multiple files.&lt;/p&gt;
&lt;p&gt;Your CSV file needs to be hosted somewhere with &lt;code&gt;access-control-allow-origin: *&lt;/code&gt; CORS headers. Any CSV file hosted on GitHub provides these, if you use the link you get by clicking on the "Raw" version.&lt;/p&gt;
&lt;h4&gt;Loading CSV data from a URL&lt;/h4&gt;
&lt;p&gt;Here's the URL to a CSV file of college fight songs collected by FiveThirtyEight &lt;a href="https://github.com/fivethirtyeight/data/tree/master/fight-songs"&gt;in their data repo&lt;/a&gt; as part of the reporting for &lt;a href="https://projects.fivethirtyeight.com/college-fight-song-lyrics/"&gt;this story&lt;/a&gt; a few years ago:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://raw.githubusercontent.com/fivethirtyeight/data/master/fight-songs/fight-songs.csv"&gt;https://raw.githubusercontent.com/fivethirtyeight/data/master/fight-songs/fight-songs.csv&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;You can pass this to Datasette Lite in two ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You can &lt;a href="https://lite.datasette.io/"&gt;load the web app&lt;/a&gt;, click the "Load data by URL to a CSV file" button and paste in the URL&lt;/li&gt;
&lt;li&gt;Or you can pass it as a &lt;code&gt;?csv=&lt;/code&gt; parameter to the application, like this: &lt;a href="https://lite.datasette.io/?csv=https://raw.githubusercontent.com/fivethirtyeight/data/master/fight-songs/fight-songs.csv"&gt;https://lite.datasette.io/?csv=https://raw.githubusercontent.com/fivethirtyeight/data/master/fight-songs/fight-songs.csv&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once Datasette has loaded, a &lt;code&gt;data&lt;/code&gt; database will be available with a single table called &lt;code&gt;fight-songs&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;As you navigate around in Datasette the URL bar will update to reflect current state - which means you can deep-link to table views with applied filters and facets:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/?csv=https://raw.githubusercontent.com/fivethirtyeight/data/master/fight-songs/fight-songs.csv#/data/fight-songs?_facet=conference&amp;amp;_facet=student_writer&amp;amp;_facet=official_song"&gt;https://lite.datasette.io/?csv=https://raw.githubusercontent.com/fivethirtyeight/data/master/fight-songs/fight-songs.csv#/data/fight-songs?_facet=conference&amp;amp;_facet=student_writer&amp;amp;_facet=official_song&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Or even link to the result of a custom SQL query:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/?csv=https://raw.githubusercontent.com/fivethirtyeight/data/master/fight-songs/fight-songs.csv#/data?sql=select+school%2C+conference%2C+song_name%2C+writers%2C+year%2C+student_writer+spotify_id+from+%5Bfight-songs%5D+order+by+rowid+limit+101"&gt;https://lite.datasette.io/?csv=https://raw.githubusercontent.com/fivethirtyeight/data/master/fight-songs/fight-songs.csv#/data?sql=select+school%2C+conference%2C+song_name%2C+writers%2C+year%2C+student_writer+spotify_id+from+%5Bfight-songs%5D+order+by+rowid+limit+101&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;Loading multiple files and joining data&lt;/h4&gt;
&lt;p&gt;You can pass the &lt;code&gt;?csv=&lt;/code&gt; parameter more than once to load data from multiple CSV files into the same virtual &lt;code&gt;data&lt;/code&gt; database. Each CSV file will result in a separate table.&lt;/p&gt;
&lt;p&gt;For this demo I'll use two CSV files.&lt;/p&gt;
&lt;p&gt;The first is &lt;a href="https://github.com/nytimes/covid-19-data/blob/master/us-counties-recent.csv"&gt;us-counties-recent.csv&lt;/a&gt; from the NY Times &lt;a href="https://github.com/nytimes/covid-19-data"&gt;covid-19-data&lt;/a&gt; repository, which lists the most recent numbers for Covid cases for every US county.&lt;/p&gt;
&lt;p&gt;The second is &lt;a href="https://github.com/simonw/covid-19-datasette/blob/main/us_census_county_populations_2019.csv"&gt;us_census_county_populations_2019.csv&lt;/a&gt;, a CSV file listing the population of each county according to the 2019 US Census which I extracted from &lt;a href="https://www.census.gov/data/datasets/time-series/demo/popest/2010s-state-total.html"&gt;this page&lt;/a&gt; on the US Census website.&lt;/p&gt;
&lt;p&gt;Both of those tables include a column called &lt;code&gt;fips&lt;/code&gt;, representing the &lt;a href="https://en.wikipedia.org/wiki/FIPS_county_code"&gt;FIPS county code&lt;/a&gt; for each county. These 4-5 digit codes are ideal for joining the two tables.&lt;/p&gt;
&lt;p&gt;Here's a SQL query which joins the two tables, filters for the data for the most recent date represented (using &lt;code&gt;where date = (select max(date) from [us-counties-recent])&lt;/code&gt;) and calculates &lt;code&gt;cases_per_million&lt;/code&gt; using the cases and the population:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;select&lt;/span&gt;
  [us&lt;span class="pl-k"&gt;-&lt;/span&gt;counties&lt;span class="pl-k"&gt;-&lt;/span&gt;recent].&lt;span class="pl-k"&gt;*&lt;/span&gt;,
  &lt;span class="pl-c1"&gt;us_census_county_populations_2019&lt;/span&gt;.&lt;span class="pl-c1"&gt;population&lt;/span&gt;,
  &lt;span class="pl-c1"&gt;1&lt;/span&gt;.&lt;span class="pl-c1"&gt;0&lt;/span&gt; &lt;span class="pl-k"&gt;*&lt;/span&gt; [us&lt;span class="pl-k"&gt;-&lt;/span&gt;counties&lt;span class="pl-k"&gt;-&lt;/span&gt;recent].cases &lt;span class="pl-k"&gt;/&lt;/span&gt; &lt;span class="pl-c1"&gt;us_census_county_populations_2019&lt;/span&gt;.&lt;span class="pl-c1"&gt;population&lt;/span&gt; &lt;span class="pl-k"&gt;*&lt;/span&gt; &lt;span class="pl-c1"&gt;1000000&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; cases_per_million
&lt;span class="pl-k"&gt;from&lt;/span&gt;
  [us&lt;span class="pl-k"&gt;-&lt;/span&gt;counties&lt;span class="pl-k"&gt;-&lt;/span&gt;recent]
  &lt;span class="pl-k"&gt;join&lt;/span&gt; us_census_county_populations_2019 &lt;span class="pl-k"&gt;on&lt;/span&gt; &lt;span class="pl-c1"&gt;us_census_county_populations_2019&lt;/span&gt;.&lt;span class="pl-c1"&gt;fips&lt;/span&gt; &lt;span class="pl-k"&gt;=&lt;/span&gt; [us&lt;span class="pl-k"&gt;-&lt;/span&gt;counties&lt;span class="pl-k"&gt;-&lt;/span&gt;recent].fips
&lt;span class="pl-k"&gt;where&lt;/span&gt;
  &lt;span class="pl-k"&gt;date&lt;/span&gt; &lt;span class="pl-k"&gt;=&lt;/span&gt; (&lt;span class="pl-k"&gt;select&lt;/span&gt; &lt;span class="pl-c1"&gt;max&lt;/span&gt;(&lt;span class="pl-k"&gt;date&lt;/span&gt;) &lt;span class="pl-k"&gt;from&lt;/span&gt; [us&lt;span class="pl-k"&gt;-&lt;/span&gt;counties&lt;span class="pl-k"&gt;-&lt;/span&gt;recent])
&lt;span class="pl-k"&gt;order by&lt;/span&gt;
  cases_per_million &lt;span class="pl-k"&gt;desc&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/datasette-lite-csv-join.png" alt="A screenshot of that query running in Datasette. Loving county Texas has the worst result - 1,289,940 cases per million - but that's because they have a population of just 169 people and 218 recorded cases." style="max-width:100%;" /&gt;&lt;/p&gt;
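&lt;p&gt;The same join can be tried outside the browser with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. This is only a sketch with a handful of hand-written rows: the Loving County case and population figures come from the screenshot, while the dates, FIPS codes and the second county are placeholders. The tables also carry fewer columns than the real CSV files.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table [us-counties-recent] (date text, county text, fips text, cases integer);
    create table us_census_county_populations_2019 (fips text, population integer);
    -- Loving County numbers are from the screenshot; everything else is a placeholder.
    insert into [us-counties-recent] values
      ('2022-06-19', 'Loving', '48301', 200),
      ('2022-06-20', 'Loving', '48301', 218),
      ('2022-06-20', 'Example', '99999', 50);
    insert into us_census_county_populations_2019 values
      ('48301', 169),
      ('99999', 1000);
    """
)

# Same shape as the query above: join on fips, keep only the latest date,
# and scale cases by population to get cases per million.
rows = conn.execute(
    """
    select
      [us-counties-recent].county,
      1.0 * [us-counties-recent].cases / us_census_county_populations_2019.population * 1000000 as cases_per_million
    from
      [us-counties-recent]
      join us_census_county_populations_2019 on us_census_county_populations_2019.fips = [us-counties-recent].fips
    where
      date = (select max(date) from [us-counties-recent])
    order by
      cases_per_million desc
    """
).fetchall()

for county, cases_per_million in rows:
    print(county, round(cases_per_million))
```

&lt;p&gt;The &lt;code&gt;1.0 *&lt;/code&gt; at the start of the expression matters: it forces floating point arithmetic, since SQLite would otherwise perform integer division on two integer columns.&lt;/p&gt;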
&lt;p&gt;And since everything in Datasette Lite can be bookmarked, here's the super long URL (&lt;a href="https://lite.datasette.io/?csv=https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties-recent.csv&amp;amp;csv=https://raw.githubusercontent.com/simonw/covid-19-datasette/main/us_census_county_populations_2019.csv#/data?sql=select%0A++%5Bus-counties-recent%5D.*%2C%0A++us_census_county_populations_2019.population%2C%0A++1.0+*+%5Bus-counties-recent%5D.cases+%2F+us_census_county_populations_2019.population+*+1000000+as+cases_per_million%0Afrom%0A++%5Bus-counties-recent%5D%0A++join+us_census_county_populations_2019+on+us_census_county_populations_2019.fips+%3D+%5Bus-counties-recent%5D.fips%0Awhere%0A++date+%3D+%28select+max%28date%29+from+%5Bus-counties-recent%5D%29%0Aorder+by%0A++cases_per_million+desc"&gt;clickable version here&lt;/a&gt;) that executes that query against those two CSV files:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;https://lite.datasette.io/?csv=https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties-recent.csv&amp;amp;csv=https://raw.githubusercontent.com/simonw/covid-19-datasette/main/us_census_county_populations_2019.csv#/data?sql=select%0A++%5Bus-counties-recent%5D.*%2C%0A++us_census_county_populations_2019.population%2C%0A++1.0+*+%5Bus-counties-recent%5D.cases+%2F+us_census_county_populations_2019.population+*+1000000+as+cases_per_million%0Afrom%0A++%5Bus-counties-recent%5D%0A++join+us_census_county_populations_2019+on+us_census_county_populations_2019.fips+%3D+%5Bus-counties-recent%5D.fips%0Awhere%0A++date+%3D+%28select+max%28date%29+from+%5Bus-counties-recent%5D%29%0Aorder+by%0A++cases_per_million+desc&lt;/code&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/csv"&gt;csv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="csv"/><category term="projects"/><category term="sql"/><category term="datasette"/><category term="webassembly"/><category term="datasette-lite"/><category term="cors"/></entry><entry><title>Datasette Lite: a server-side Python web application running in a browser</title><link href="https://simonwillison.net/2022/May/4/datasette-lite/#atom-tag" rel="alternate"/><published>2022-05-04T15:16:49+00:00</published><updated>2022-05-04T15:16:49+00:00</updated><id>https://simonwillison.net/2022/May/4/datasette-lite/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://github.com/simonw/datasette-lite"&gt;Datasette Lite&lt;/a&gt; is a new way to run &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt;: entirely in a browser, taking advantage of the incredible &lt;a href="https://pyodide.org/"&gt;Pyodide&lt;/a&gt; project which provides Python compiled to WebAssembly plus a whole suite of useful extras.&lt;/p&gt;
&lt;p&gt;You can try it out here:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/"&gt;https://lite.datasette.io/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/datasette-lite.jpg" alt="A screenshot of the pypi_packages database table running in Google Chrome in a page with the URL of lite.datasette.io/#/content/pypi_packages?_facet=author" style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update 20th June 2022&lt;/strong&gt;: Datasette Lite can now load CSV files directly by URL, see &lt;a href="https://simonwillison.net/2022/Jun/20/datasette-lite-csvs/"&gt;Joining CSV files in your browser using Datasette Lite&lt;/a&gt; for details.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update 17th August 2022&lt;/strong&gt;: It can now &lt;a href="https://simonwillison.net/2022/Aug/17/datasette-lite-plugins/"&gt;load Datasette plugins as well&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The initial example loads two databases - the classic &lt;a href="https://latest.datasette.io/fixtures"&gt;fixtures.db&lt;/a&gt; used by the Datasette test suite, and the &lt;a href="https://datasette.io/content"&gt;content.db&lt;/a&gt; database that powers the official &lt;a href="https://datasette.io/"&gt;datasette.io&lt;/a&gt; website (described in some detail in &lt;a href="https://simonwillison.net/2021/Jul/28/baked-data/"&gt;my post about Baked Data&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;You can instead use the "Load database by URL to a SQLite DB" button to paste in a URL to your own database. That file will need to be served with CORS headers that allow it to be fetched by the website (&lt;a href="https://github.com/simonw/datasette-lite/#opening-other-databases"&gt;see README&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Try this URL, for example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;https://congress-legislators.datasettes.com/legislators.db
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can &lt;a href="https://lite.datasette.io/?url=https%3A%2F%2Fcongress-legislators.datasettes.com%2Flegislators.db"&gt;follow this link&lt;/a&gt; to open that database in Datasette Lite.&lt;/p&gt;
&lt;p&gt;Datasette Lite supports almost all of Datasette's regular functionality: you can view tables, apply facets, run your own custom SQL queries and export the results as CSV or JSON.&lt;/p&gt;
&lt;p&gt;It's basically the full Datasette experience, except it's running entirely in your browser with no server (other than the static file hosting provided here by GitHub Pages) required.&lt;/p&gt;
&lt;p&gt;I’m pretty stunned that this is possible now.&lt;/p&gt;
&lt;p&gt;I had to make some small changes to Datasette to get this to work, detailed below, but really nothing extravagant - the demo is running the exact same Python code as the regular server-side Datasette application, just inside a web worker process in a browser rather than on a server.&lt;/p&gt;
&lt;p&gt;The implementation is pretty small - around 300 lines of JavaScript. You can see the code in the &lt;a href="https://github.com/simonw/datasette-lite"&gt;simonw/datasette-lite&lt;/a&gt; repository - in two files, &lt;a href="https://github.com/simonw/datasette-lite/blob/6ae4cacf140f0c7c6eafa8cf0f92a2dae44425ff/index.html"&gt;index.html&lt;/a&gt; and &lt;a href="https://github.com/simonw/datasette-lite/blob/main/webworker.js"&gt;webworker.js&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;Why build this?&lt;/h4&gt;
&lt;p&gt;I built this because I want as many people as possible to be able to use my software.&lt;/p&gt;
&lt;p&gt;I've invested a ton of effort in reducing the friction to getting started with Datasette. I've &lt;a href="https://docs.datasette.io/en/stable/installation.html"&gt;documented the install process&lt;/a&gt;, I've &lt;a href="https://formulae.brew.sh/formula/datasette"&gt;packaged it for Homebrew&lt;/a&gt;, I've written guides to &lt;a href="https://docs.datasette.io/en/stable/getting_started.html#try-datasette-without-installing-anything-using-glitch"&gt;running it on Glitch&lt;/a&gt;, I've built tools to help deploy it to &lt;a href="https://docs.datasette.io/en/stable/publish.html#publishing-to-heroku"&gt;Heroku&lt;/a&gt;, &lt;a href="https://docs.datasette.io/en/stable/publish.html#publishing-to-google-cloud-run"&gt;Cloud Run&lt;/a&gt;, &lt;a href="https://docs.datasette.io/en/stable/publish.html#publishing-to-vercel"&gt;Vercel&lt;/a&gt; and &lt;a href="https://simonwillison.net/2022/Feb/15/fly-volumes/"&gt;Fly.io&lt;/a&gt;. I even taught myself Electron and built a macOS &lt;a href="https://datasette.io/desktop"&gt;Datasette Desktop&lt;/a&gt; application, so people could install it without having to think about their Python environment.&lt;/p&gt;
&lt;p&gt;Datasette Lite is my latest attempt at this. Anyone with a browser that can run WebAssembly can now run Datasette in it - if they can afford the 10MB load (which in many places with metered internet access is way too much).&lt;/p&gt;
&lt;p&gt;I also built this because I'm fascinated by WebAssembly and I've been looking for an opportunity to really try it out.&lt;/p&gt;
&lt;p&gt;And, I find this project deeply amusing. Running a Python server-side web application in a browser still feels like an absurd thing to do. I love that it works.&lt;/p&gt;
&lt;p&gt;I'm deeply inspired by &lt;a href="https://jupyterlite.readthedocs.io/en/latest/"&gt;JupyterLite&lt;/a&gt;. Datasette Lite's name is a tribute to that project.&lt;/p&gt;
&lt;h4&gt;How it works: Python in a Web Worker&lt;/h4&gt;
&lt;p&gt;Datasette Lite does most of its work in a &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers"&gt;Web Worker&lt;/a&gt; - a separate process that can run expensive CPU operations (like an entire Python interpreter) without blocking the browser's main UI thread.&lt;/p&gt;
&lt;p&gt;The worker starts running when you load the page. It loads a WebAssembly compiled Python interpreter from a CDN, then installs Datasette and its dependencies into that interpreter using &lt;a href="https://pyodide.org/en/stable/usage/loading-packages.html#micropip"&gt;micropip&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It also downloads the specified SQLite database files using the browser's HTTP fetching mechanism and writes them to a virtual in-memory filesystem managed by Pyodide.&lt;/p&gt;
&lt;p&gt;Once everything is installed, it imports &lt;code&gt;datasette&lt;/code&gt; and creates a &lt;code&gt;Datasette()&lt;/code&gt; object called &lt;code&gt;ds&lt;/code&gt;. This object stays resident in the web worker.&lt;/p&gt;
&lt;p&gt;To render pages, the &lt;code&gt;index.html&lt;/code&gt; page sends a message to the web worker specifying which Datasette path has been requested - &lt;code&gt;/&lt;/code&gt; for the homepage, &lt;code&gt;/fixtures&lt;/code&gt; for the database index page, &lt;code&gt;/fixtures/facetable&lt;/code&gt; for a table page and so on.&lt;/p&gt;
&lt;p&gt;The web worker then simulates an HTTP GET against that path within Datasette using the following code:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;ds&lt;/span&gt;.&lt;span class="pl-s1"&gt;client&lt;/span&gt;.&lt;span class="pl-en"&gt;get&lt;/span&gt;(&lt;span class="pl-s1"&gt;path&lt;/span&gt;, &lt;span class="pl-s1"&gt;follow_redirects&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;This takes advantage of a really useful internal Datasette API: &lt;a href="https://docs.datasette.io/en/stable/internals.html#datasette-client"&gt;datasette.client&lt;/a&gt; is an &lt;a href="https://www.python-httpx.org/"&gt;HTTPX&lt;/a&gt; client object that can be used to execute HTTP requests against Datasette internally, without doing a round-trip across the network.&lt;/p&gt;
&lt;p&gt;I initially added &lt;code&gt;datasette.client&lt;/code&gt; with the goal of making any JSON APIs that Datasette provides available for internal calls by plugins as well, and to make it easier to write automated tests. It turns out to have other interesting applications too!&lt;/p&gt;
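&lt;p&gt;The general idea of calling a web application in-process, with no network socket involved, can be sketched using nothing but the standard library. The example below is not how &lt;code&gt;datasette.client&lt;/code&gt; is implemented (that is built on HTTPX). It hand-rolls a request against a tiny stand-in ASGI app, and both &lt;code&gt;app()&lt;/code&gt; and &lt;code&gt;in_process_get()&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import asyncio


async def app(scope, receive, send):
    # Hypothetical stand-in for an ASGI application such as Datasette.
    body = ("You requested " + scope["path"]).encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain; charset=utf-8")],
    })
    await send({"type": "http.response.body", "body": body})


async def in_process_get(asgi_app, path):
    """Call an ASGI app directly and collect its status, headers and body."""
    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": "GET",
        "scheme": "http",
        "path": path,
        "raw_path": path.encode("utf-8"),
        "query_string": b"",
        "headers": [],
    }
    result = {"status": None, "headers": [], "body": b""}

    async def receive():
        # A GET request has no body, so report an empty, complete request.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        # Capture the response messages instead of writing them to a socket.
        if message["type"] == "http.response.start":
            result["status"] = message["status"]
            result["headers"] = message["headers"]
        elif message["type"] == "http.response.body":
            result["body"] += message.get("body", b"")

    await asgi_app(scope, receive, send)
    return result


response = asyncio.run(in_process_get(app, "/fixtures/facetable"))
print(response["status"], response["body"].decode("utf-8"))
```

&lt;p&gt;The request is just a dictionary and the response is just a series of messages, so nothing here depends on an open port. That's why the same pattern still works inside a browser, where a web worker can't listen on a port at all.&lt;/p&gt;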
&lt;p&gt;The web worker sends a message back to &lt;code&gt;index.html&lt;/code&gt; with the status code, content type and content retrieved from Datasette. JavaScript in &lt;code&gt;index.html&lt;/code&gt; then injects that HTML into the page using &lt;code&gt;.innerHTML&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;To get internal links working, Datasette Lite uses a trick I originally learned from jQuery: it applies a capturing event listener to the area of the page displaying the content, such that any link clicks or form submissions will be intercepted by a JavaScript function. That JavaScript can then turn them into new messages to the web worker rather than navigating to another page.&lt;/p&gt;
&lt;h4&gt;Some annotated code&lt;/h4&gt;
&lt;p&gt;Here are annotated versions of the most important pieces of code. In &lt;code&gt;index.html&lt;/code&gt; this code manages the worker and updates the page when it receives messages from it:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;// Load the worker script&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;datasetteWorker&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;Worker&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"webworker.js"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// Extract the ?url= from the current page's URL&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;initialUrl&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;URLSearchParams&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;location&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;search&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;get&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'url'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// Message that to the worker: {type: 'startup', initialUrl: url}&lt;/span&gt;
&lt;span class="pl-s1"&gt;datasetteWorker&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;postMessage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;'startup'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; initialUrl&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// This function does most of the work - it responds to messages sent&lt;/span&gt;
&lt;span class="pl-c"&gt;// back from the worker to the index page:&lt;/span&gt;
&lt;span class="pl-s1"&gt;datasetteWorker&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;onmessage&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c"&gt;// {type: log, line: ...} messages are appended to a log textarea:&lt;/span&gt;
  &lt;span class="pl-k"&gt;var&lt;/span&gt; &lt;span class="pl-s1"&gt;ta&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementById&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'loading-logs'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;type&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-s"&gt;'log'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;loadingLogs&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;push&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;line&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;ta&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;value&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;loadingLogs&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;join&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"\n"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;ta&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;scrollTop&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;ta&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;scrollHeight&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-k"&gt;return&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;html&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;''&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// If it's an {error: ...} message show it in a &amp;lt;pre&amp;gt; in a &amp;lt;div&amp;gt;&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;error&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;html&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;`&amp;lt;div style="padding: 0.5em"&amp;gt;&amp;lt;h3&amp;gt;Error&amp;lt;/h3&amp;gt;&amp;lt;pre&amp;gt;&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-en"&gt;escapeHtml&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;error&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;&amp;lt;/pre&amp;gt;&amp;lt;/div&amp;gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// If contentType is text/html, show it as straight HTML&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-k"&gt;else&lt;/span&gt; &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-pds"&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;span class="pl-cce"&gt;^&lt;/span&gt;text&lt;span class="pl-cce"&gt;\/&lt;/span&gt;html&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;exec&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;contentType&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;html&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// For contentType of application/json parse and pretty-print it&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-k"&gt;else&lt;/span&gt; &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-pds"&gt;&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;span class="pl-cce"&gt;^&lt;/span&gt;application&lt;span class="pl-cce"&gt;\/&lt;/span&gt;json&lt;span class="pl-c1"&gt;/&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;exec&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;contentType&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;html&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;`&amp;lt;pre style="padding: 0.5em"&amp;gt;&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-en"&gt;escapeHtml&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;parse&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;null&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;4&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;&amp;lt;/pre&amp;gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Anything else (likely CSV data) escape it and show in a &amp;lt;pre&amp;gt;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-k"&gt;else&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;html&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;`&amp;lt;pre style="padding: 0.5em"&amp;gt;&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-en"&gt;escapeHtml&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;&amp;lt;/pre&amp;gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Add the result to &amp;lt;div id="output"&amp;gt; using innerHTML&lt;/span&gt;
  &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementById&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"output"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;innerHTML&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;html&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Update the document.title if a &amp;lt;title&amp;gt; element is present&lt;/span&gt;
  &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;title&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementById&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"output"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;querySelector&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"title"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;title&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;title&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;title&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;innerText&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Scroll to the top of the page after each new page is loaded&lt;/span&gt;
  &lt;span class="pl-smi"&gt;window&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;scrollTo&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;top&lt;/span&gt;: &lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;left&lt;/span&gt;: &lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// If we're showing the initial loading indicator, hide it&lt;/span&gt;
  &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementById&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'loading-indicator'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;style&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;display&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;'none'&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;webworker.js&lt;/code&gt; script is where the real magic happens:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;// Load Pyodide from the CDN&lt;/span&gt;
&lt;span class="pl-en"&gt;importScripts&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"https://cdn.jsdelivr.net/pyodide/dev/full/pyodide.js"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// Deliver log messages back to the index.html page&lt;/span&gt;
&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-en"&gt;log&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;line&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;postMessage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;'log'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;line&lt;/span&gt;: &lt;span class="pl-s1"&gt;line&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;

&lt;span class="pl-c"&gt;// This function initializes Pyodide and installs Datasette&lt;/span&gt;
&lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-en"&gt;startDatasette&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;initialUrl&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Mechanism for downloading and saving specified DB files&lt;/span&gt;
  &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;toLoad&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;initialUrl&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;name&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;initialUrl&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;split&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'.db'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;split&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'/'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;slice&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-c1"&gt;-&lt;/span&gt;&lt;span class="pl-c1"&gt;1&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;toLoad&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;push&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s1"&gt;name&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;initialUrl&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-k"&gt;else&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c"&gt;// If no ?url= was provided, load these two demo databases instead:&lt;/span&gt;
    &lt;span class="pl-s1"&gt;toLoad&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;push&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s"&gt;"fixtures.db"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"https://latest.datasette.io/fixtures.db"&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;toLoad&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;push&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s"&gt;"content.db"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"https://datasette.io/content.db"&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-c"&gt;// This does a LOT of work - it pulls down the WASM blob and starts it running&lt;/span&gt;
  &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;pyodide&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-en"&gt;loadPyodide&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;indexURL&lt;/span&gt;: &lt;span class="pl-s"&gt;"https://cdn.jsdelivr.net/pyodide/dev/full/"&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// We need these packages for the next bit of code to work&lt;/span&gt;
  &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;pyodide&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;loadPackage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'micropip'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;log&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;pyodide&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;loadPackage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'ssl'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;log&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;pyodide&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;loadPackage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'setuptools'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;log&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt; &lt;span class="pl-c"&gt;// For pkg_resources&lt;/span&gt;
  &lt;span class="pl-k"&gt;try&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c"&gt;// Now we switch to Python code&lt;/span&gt;
    &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;pyodide&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;runPythonAsync&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;`&lt;/span&gt;
&lt;span class="pl-s"&gt;    # Here's where we download and save those .db files - they are saved&lt;/span&gt;
&lt;span class="pl-s"&gt;    # to a virtual in-memory filesystem provided by Pyodide&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    # pyfetch is a wrapper around the JS fetch() function - calls using&lt;/span&gt;
&lt;span class="pl-s"&gt;    # it are handled by the browser's regular HTTP fetching mechanism&lt;/span&gt;
&lt;span class="pl-s"&gt;    from pyodide.http import pyfetch&lt;/span&gt;
&lt;span class="pl-s"&gt;    names = []&lt;/span&gt;
&lt;span class="pl-s"&gt;    for name, url in &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;toLoad&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;:&lt;/span&gt;
&lt;span class="pl-s"&gt;        response = await pyfetch(url)&lt;/span&gt;
&lt;span class="pl-s"&gt;        with open(name, "wb") as fp:&lt;/span&gt;
&lt;span class="pl-s"&gt;            fp.write(await response.bytes())&lt;/span&gt;
&lt;span class="pl-s"&gt;        names.append(name)&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    import micropip&lt;/span&gt;
&lt;span class="pl-s"&gt;    # Workaround for Requested 'h11&amp;lt;0.13,&amp;gt;=0.11', but h11==0.13.0 is already installed&lt;/span&gt;
&lt;span class="pl-s"&gt;    await micropip.install("h11==0.12.0")&lt;/span&gt;
&lt;span class="pl-s"&gt;    # Install Datasette itself!&lt;/span&gt;
&lt;span class="pl-s"&gt;    await micropip.install("datasette==0.62a0")&lt;/span&gt;
&lt;span class="pl-s"&gt;    # Now we can create a Datasette() object that can respond to fake requests&lt;/span&gt;
&lt;span class="pl-s"&gt;    from datasette.app import Datasette&lt;/span&gt;
&lt;span class="pl-s"&gt;    ds = Datasette(names, settings={&lt;/span&gt;
&lt;span class="pl-s"&gt;        "num_sql_threads": 0,&lt;/span&gt;
&lt;span class="pl-s"&gt;    }, metadata = {&lt;/span&gt;
&lt;span class="pl-s"&gt;        # This metadata is displayed in Datasette's footer&lt;/span&gt;
&lt;span class="pl-s"&gt;        "about": "Datasette Lite",&lt;/span&gt;
&lt;span class="pl-s"&gt;        "about_url": "https://github.com/simonw/datasette-lite"&lt;/span&gt;
&lt;span class="pl-s"&gt;    })&lt;/span&gt;
&lt;span class="pl-s"&gt;    `&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-en"&gt;datasetteLiteReady&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-k"&gt;catch&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;error&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;postMessage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;error&lt;/span&gt;: &lt;span class="pl-s1"&gt;error&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;message&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;

&lt;span class="pl-c"&gt;// Outside promise pattern&lt;/span&gt;
&lt;span class="pl-c"&gt;// https://github.com/simonw/datasette-lite/issues/25#issuecomment-1116948381&lt;/span&gt;
&lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;datasetteLiteReady&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;readyPromise&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;Promise&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;resolve&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-s1"&gt;datasetteLiteReady&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;resolve&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// This function handles messages sent from index.html to webworker.js&lt;/span&gt;
&lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;onmessage&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c"&gt;// The first message should be that startup message, carrying the URL&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;type&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-s"&gt;'startup'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-en"&gt;startDatasette&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;initialUrl&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-k"&gt;return&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-c"&gt;// This promise trick ensures that we don't run the next block until we&lt;/span&gt;
  &lt;span class="pl-c"&gt;// are certain that startDatasette() has finished and the ds.client&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Python object is ready to use&lt;/span&gt;
  &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;readyPromise&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Run the request in Python to get a status code, content type and text&lt;/span&gt;
  &lt;span class="pl-k"&gt;try&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s1"&gt;status&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;contentType&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;pyodide&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;runPythonAsync&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
      &lt;span class="pl-s"&gt;`&lt;/span&gt;
&lt;span class="pl-s"&gt;      import json&lt;/span&gt;
&lt;span class="pl-s"&gt;      # ds.client.get(path) simulates running a request through Datasette&lt;/span&gt;
&lt;span class="pl-s"&gt;      response = await ds.client.get(&lt;/span&gt;
&lt;span class="pl-s"&gt;          # Using json here is a quick way to generate a quoted string&lt;/span&gt;
&lt;span class="pl-s"&gt;          &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;path&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;,&lt;/span&gt;
&lt;span class="pl-s"&gt;          # If Datasette redirects to another page we want to follow that&lt;/span&gt;
&lt;span class="pl-s"&gt;          follow_redirects=True&lt;/span&gt;
&lt;span class="pl-s"&gt;      )&lt;/span&gt;
&lt;span class="pl-s"&gt;      [response.status_code, response.headers.get("content-type"), response.text]&lt;/span&gt;
&lt;span class="pl-s"&gt;      `&lt;/span&gt;
    &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-c"&gt;// Message the results back to index.html&lt;/span&gt;
    &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;postMessage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;status&lt;span class="pl-kos"&gt;,&lt;/span&gt; contentType&lt;span class="pl-kos"&gt;,&lt;/span&gt; text&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-k"&gt;catch&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;error&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c"&gt;// If an error occurred, send that back as a {error: ...} message&lt;/span&gt;
    &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;postMessage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;error&lt;/span&gt;: &lt;span class="pl-s1"&gt;error&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;message&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;One last bit of code: here's the JavaScript in &lt;code&gt;index.html&lt;/code&gt; which intercepts clicks on links and turns them into messages to the worker:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;output&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementById&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'output'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-c"&gt;// This captures any click on any element within &amp;lt;div id="output"&amp;gt;&lt;/span&gt;
&lt;span class="pl-s1"&gt;output&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;addEventListener&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'click'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;ev&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c"&gt;// .closest("a") traverses up the DOM to find if this is an a&lt;/span&gt;
  &lt;span class="pl-c"&gt;// or an element nested in an a. We ignore other clicks.&lt;/span&gt;
  &lt;span class="pl-k"&gt;var&lt;/span&gt; &lt;span class="pl-s1"&gt;link&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;ev&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;srcElement&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;closest&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"a"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;link&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="pl-s1"&gt;link&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;href&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c"&gt;// It was a click on a &amp;lt;a href="..."&amp;gt; link! Cancel the event:&lt;/span&gt;
    &lt;span class="pl-s1"&gt;ev&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stopPropagation&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;ev&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;preventDefault&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-c"&gt;// I want #fragment links to still work, using scrollIntoView()&lt;/span&gt;
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-en"&gt;isFragmentLink&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;link&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;href&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c"&gt;// Jump them to that element, but don't update the URL bar&lt;/span&gt;
      &lt;span class="pl-c"&gt;// since we use # in the URL to mean something else&lt;/span&gt;
      &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;fragment&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-c1"&gt;URL&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;link&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;href&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;hash&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;replace&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"#"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;""&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
      &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;fragment&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;el&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementById&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;fragment&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
        &lt;span class="pl-s1"&gt;el&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;scrollIntoView&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;
      &lt;span class="pl-k"&gt;return&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
    &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;href&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;link&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getAttribute&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"href"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-c"&gt;// Links to external sites should open in a new window&lt;/span&gt;
    &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-en"&gt;isExternal&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;href&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-smi"&gt;window&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;open&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;href&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
      &lt;span class="pl-k"&gt;return&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
    &lt;span class="pl-c"&gt;// It's an internal link navigation - send it to the worker&lt;/span&gt;
    &lt;span class="pl-en"&gt;loadPath&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;href&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;true&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-en"&gt;loadPath&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;path&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c"&gt;// We don't want anything after #, and we only want the /path&lt;/span&gt;
  &lt;span class="pl-s1"&gt;path&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;path&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;split&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"#"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;replace&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"http://localhost"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;""&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Update the URL with the new # location&lt;/span&gt;
  &lt;span class="pl-s1"&gt;history&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;pushState&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;path&lt;/span&gt;: &lt;span class="pl-s1"&gt;path&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;path&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"#"&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;path&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Plausible analytics, see:&lt;/span&gt;
  &lt;span class="pl-c"&gt;// https://github.com/simonw/datasette-lite/issues/22&lt;/span&gt;
  &lt;span class="pl-s1"&gt;useAnalytics&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="pl-en"&gt;plausible&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'pageview'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;u&lt;/span&gt;: &lt;span class="pl-s1"&gt;location&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;href&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;replace&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'?url='&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;''&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;replace&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'#'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;'/'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Send a {path: "/path"} message to the worker&lt;/span&gt;
  &lt;span class="pl-s1"&gt;datasetteWorker&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;postMessage&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;path&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h4&gt;Getting Datasette to work in Pyodide&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://pyodide.org/"&gt;Pyodide&lt;/a&gt; is the secret sauce that makes this all possible. That project provides several key components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A custom WebAssembly build of the core Python interpreter, bundling the standard library (including a compiled WASM version of SQLite)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pyodide.org/en/stable/usage/loading-packages.html#micropip"&gt;micropip&lt;/a&gt; - a package that can install additional Python dependencies by downloading them from &lt;a href="https://pypi.org/"&gt;PyPI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;A comprehensive JavaScript to Python bridge, including mechanisms for &lt;a href="https://pyodide.org/en/stable/usage/type-conversions.html"&gt;translating Python objects&lt;/a&gt; to JavaScript and vice-versa&lt;/li&gt;
&lt;li&gt;A &lt;a href="https://pyodide.org/en/stable/usage/api/js-api.html"&gt;JavaScript API&lt;/a&gt; for launching and then managing a Python interpreter process&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I found the documentation on &lt;a href="https://pyodide.org/en/stable/usage/webworker.html"&gt;Using Pyodide in a web worker&lt;/a&gt; particularly helpful.&lt;/p&gt;
&lt;p&gt;I had to make a few changes to Datasette to get it working with Pyodide. My &lt;a href="https://github.com/simonw/datasette/issues/1733"&gt;tracking issue for that&lt;/a&gt; has the full details, but the short version is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ensure each of Datasette's dependencies had a wheel package on PyPI (as opposed to just a &lt;code&gt;.tar.gz&lt;/code&gt;) - &lt;code&gt;micropip&lt;/code&gt; only works with wheels. I ended up removing &lt;code&gt;python-baseconv&lt;/code&gt; as a dependency and replacing &lt;code&gt;click-default-group&lt;/code&gt; with my own &lt;code&gt;click-default-group-wheel&lt;/code&gt; forked package (&lt;a href="https://github.com/simonw/click-default-group-wheel"&gt;repo here&lt;/a&gt;). I got &lt;code&gt;sqlite-utils&lt;/code&gt; working in Pyodide with this change too, see the &lt;a href="https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-26-1"&gt;3.26.1 release notes&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Work around an error caused by importing &lt;code&gt;uvicorn&lt;/code&gt;. Since Datasette Lite doesn't actually run its own web server that dependency wasn't necessary, so I changed my code to catch the &lt;code&gt;ImportError&lt;/code&gt; in the right place.&lt;/li&gt;
&lt;li&gt;The biggest change: WebAssembly can't run threads, which means Python can't run threads, which means any attempts to start a thread in Python cause an error. Datasette only uses threads in one place: to execute SQL queries in a thread pool where they won't block the event loop. I added a new &lt;code&gt;--setting num_sql_threads 0&lt;/code&gt; feature for disabling threading entirely, see &lt;a href="https://github.com/simonw/datasette/issues/1735"&gt;issue 1735&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Having made those changes I shipped them in a &lt;a href="https://github.com/simonw/datasette/releases/tag/0.62a0"&gt;Datasette 0.62a0&lt;/a&gt; release. It's this release that Datasette Lite installs from PyPI.&lt;/p&gt;
&lt;h4&gt;Fragment hashes for navigation&lt;/h4&gt;
&lt;p&gt;You may have noticed that as you navigate through Datasette Lite the URL bar updates with URLs that look like the following:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/#/content/pypi_packages?_facet=author"&gt;https://lite.datasette.io/#/content/pypi_packages?_facet=author&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I'm using the &lt;code&gt;#&lt;/code&gt; here to separate out the path within the virtual Datasette instance from the URL to the Datasette Lite application itself.&lt;/p&gt;
&lt;p&gt;Maintaining the state in the URL like this means that the Back and Forward browser buttons work, and also means that users can bookmark pages within the application and share links to them.&lt;/p&gt;
&lt;p&gt;I usually like to avoid &lt;code&gt;#&lt;/code&gt; URLs - the HTML history API makes it possible to use "real" URLs these days, even for JavaScript applications. But in the case of Datasette Lite those URLs wouldn't actually work - if someone attempted to refresh the page or navigate to a link, GitHub Pages wouldn't know what file to serve.&lt;/p&gt;
&lt;p&gt;I could run this on my own domain with a catch-all page handler that serves the Datasette Lite HTML and JavaScript no matter what path is requested, but I wanted to keep this as pure and simple as possible.&lt;/p&gt;
&lt;p&gt;This also means I can reserve Datasette Lite's own query string for things like specifying the database to load, and potentially other options in the future.&lt;/p&gt;
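&lt;p&gt;A minimal sketch of the reverse direction: recovering the Datasette path from the fragment when the page first loads or when the user presses Back. &lt;code&gt;pathFromHash()&lt;/code&gt; is a hypothetical name, and falling back to &lt;code&gt;/&lt;/code&gt; for an empty or malformed fragment is an assumption rather than confirmed Datasette Lite behaviour:&lt;/p&gt;

```javascript
// Hypothetical helper, not taken from the Datasette Lite source.
// Turns "#/content/pypi_packages?_facet=author" into
// "/content/pypi_packages?_facet=author".
function pathFromHash(hash) {
  // Strip a single leading "#", if present
  const path = hash.replace(/^#/, "");
  // ASSUMPTION: anything that is not a /path falls back to the homepage
  return path.startsWith("/") ? path : "/";
}

// Browser-only wiring, shown as a comment because it needs a live page.
// datasetteWorker is the worker object used by loadPath() earlier:
//
// window.addEventListener("popstate", () => {
//   datasetteWorker.postMessage({path: pathFromHash(location.hash)});
// });
```

&lt;p&gt;Because everything after the &lt;code&gt;#&lt;/code&gt; stays in the browser, the query string inside the fragment (&lt;code&gt;?_facet=author&lt;/code&gt;) never collides with the application's own query string.&lt;/p&gt;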
&lt;h4&gt;Web Workers or Service Workers?&lt;/h4&gt;
&lt;p&gt;My initial idea for this project was to build it with &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API/Using_Service_Workers"&gt;Service Workers&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Service Workers are some deep, deep browser magic: they let you install a process that can intercept browser traffic to a specific domain (or path within that domain) and run custom code to return a result. Effectively they let you run your own server-side code in the browser itself.&lt;/p&gt;
&lt;p&gt;They're mainly designed for building offline applications, but my hope was that I could use them to offer a full simulation of a server-side application instead.&lt;/p&gt;
&lt;p&gt;Here's my TIL on &lt;a href="https://til.simonwillison.net/service-workers/intercept-fetch"&gt;Intercepting fetch in a service worker&lt;/a&gt; that came out of my initial research.&lt;/p&gt;
&lt;p&gt;I managed to get a server-side JavaScript "hello world" demo working, but when I tried to add Pyodide I ran into some unavoidable roadblocks. It turns out Service Workers are very restricted in which APIs they provide - in particular, they don't allow &lt;code&gt;XMLHttpRequest&lt;/code&gt; calls. Pyodide apparently depends on &lt;code&gt;XMLHttpRequest&lt;/code&gt;, so it was unable to run in a Service Worker at all. I &lt;a href="https://github.com/pyodide/pyodide/issues/2432"&gt;filed an issue&lt;/a&gt; about it with the Pyodide project.&lt;/p&gt;
&lt;p&gt;Initially I thought this would block the whole project, but eventually I figured out a way to achieve the same goals using Web Workers instead.&lt;/p&gt;
&lt;h4&gt;Is this an SPA or an MPA?&lt;/h4&gt;
&lt;p&gt;SPAs are Single Page Applications. MPAs are Multi Page Applications. Datasette Lite is a weird hybrid of the two.&lt;/p&gt;
&lt;p&gt;This amuses me greatly.&lt;/p&gt;
&lt;p&gt;Datasette itself is very deliberately architected as a multi page application.&lt;/p&gt;
&lt;p&gt;I think SPAs, as developed over the last decade, have mostly been a mistake. In my experience they take longer to build, have more bugs and provide worse performance than a server-side, multi-page alternative implementation.&lt;/p&gt;
&lt;p&gt;Obviously if you are building Figma or VS Code then SPAs are the right way to go. But most web applications are not Figma, and don't need to be!&lt;/p&gt;
&lt;p&gt;(I used to think Gmail was a shining example of an SPA, but it's so sludgy and slow loading these days that I now see it as more of an argument against the paradigm.)&lt;/p&gt;
&lt;p&gt;Datasette Lite is an SPA wrapper around an MPA. It literally simulates the existing MPA by running it in a web worker.&lt;/p&gt;
&lt;p&gt;It's very heavy - it loads 11MB of assets before it can show you anything. But it also inherits many of the benefits of the underlying MPA: it has obvious distinctions between pages, a deeply interlinked interface, working back and forward buttons, it's bookmarkable and it's easy to maintain and add new features.&lt;/p&gt;
&lt;p&gt;I'm not sure what my conclusion here is. I'm skeptical of SPAs, and now I've built a particularly weird one. Is this even a good idea? I'm looking forward to finding that out for myself.&lt;/p&gt;
&lt;h4&gt;Coming soon: JavaScript!&lt;/h4&gt;
&lt;p&gt;Another amusing detail about Datasette Lite is that the one part of Datasette that doesn't work yet is Datasette's existing JavaScript features!&lt;/p&gt;
&lt;p&gt;Datasette currently makes very sparing use of JavaScript in the UI: it's used to add some drop-down interactive menus (including the handy "cog" menu on column headings) and for a CodeMirror-enhanced SQL editing interface.&lt;/p&gt;
&lt;p&gt;JavaScript is used much more extensively by several popular Datasette plugins, including &lt;a href="https://datasette.io/plugins/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt; and &lt;a href="https://datasette.io/plugins/datasette-vega"&gt;datasette-vega&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Unfortunately none of this works in Datasette Lite at the moment - because I don't yet have a good way to turn &lt;code&gt;&amp;lt;script src="..."&amp;gt;&lt;/code&gt; links into things that can load content from the Web Worker.&lt;/p&gt;
&lt;p&gt;This is one of the reasons I was initially hopeful about Service Workers.&lt;/p&gt;
&lt;p&gt;Thankfully, since Datasette is built on the principles of progressive enhancement this doesn't matter: the application remains usable even if none of the JavaScript enhancements are applied.&lt;/p&gt;
&lt;p&gt;I have an &lt;a href="https://github.com/simonw/datasette-lite/issues/8"&gt;open issue for this&lt;/a&gt;. I welcome suggestions as to how I can get all of Datasette's existing JavaScript working in the new environment with as little effort as possible.&lt;/p&gt;
&lt;h4 id="bonus-shot-scraper"&gt;Bonus: Testing it with shot-scraper&lt;/h4&gt;
&lt;p&gt;In building Datasette Lite, I've committed to making Pyodide a supported runtime environment for Datasette. How can I ensure that future changes I make to Datasette - accidentally introducing a new dependency that doesn't work there for example - don't break in Pyodide without me noticing?&lt;/p&gt;
&lt;p&gt;This felt like a great opportunity to exercise my &lt;a href="https://datasette.io/tools/shot-scraper"&gt;shot-scraper&lt;/a&gt; CLI tool, in particular its ability to run some JavaScript against a page and &lt;a href="https://github.com/simonw/shot-scraper/blob/0.13/README.md#handling-javascript-errors"&gt;pass or fail a CI job&lt;/a&gt; depending on whether that JavaScript throws an error.&lt;/p&gt;
&lt;p&gt;Pyodide needs you to run it from a real web server, not just an HTML file saved to disk - so I put together a &lt;a href="https://github.com/simonw/datasette/blob/280ff372ab30df244f6c54f6f3002da57334b3d7/test-in-pyodide-with-shot-scraper.sh"&gt;very scrappy shell script&lt;/a&gt; which builds a Datasette wheel package, starts a localhost file server (using &lt;code&gt;python3 -m http.server&lt;/code&gt;), then uses &lt;code&gt;shot-scraper javascript&lt;/code&gt; to execute a test against it that installs Datasette from the wheel using &lt;code&gt;micropip&lt;/code&gt; and confirms that it can execute a simple SQL query via the JSON API.&lt;/p&gt;
&lt;p&gt;Here's the script in full, with extra comments:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#!&lt;/span&gt;/bin/bash&lt;/span&gt;
&lt;span class="pl-c1"&gt;set&lt;/span&gt; -e
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; I always forget to do this in my bash scripts - without it, any&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; commands that fail in the script won't result in the script itself&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; returning a non-zero exit code. I need it for running tests in CI.&lt;/span&gt;

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Build the wheel - this generates a file with a name similar to&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; dist/datasette-0.62a0-py3-none-any.whl&lt;/span&gt;
python3 -m build

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Find the name of that wheel file, strip off the dist/&lt;/span&gt;
wheel=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;basename &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;ls dist/&lt;span class="pl-k"&gt;*&lt;/span&gt;.whl&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; $wheel is now datasette-0.62a0-py3-none-any.whl&lt;/span&gt;

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Create a blank index page that loads Pyodide&lt;/span&gt;
&lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&amp;lt;script src="https://cdn.jsdelivr.net/pyodide/v0.20.0/full/pyodide.js"&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt; dist/index.html

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Run a localhost web server for that dist/ folder, in the background&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; so we can do more stuff in this script&lt;/span&gt;
&lt;span class="pl-c1"&gt;cd&lt;/span&gt; dist
python3 -m http.server 8529 &lt;span class="pl-k"&gt;&amp;amp;&lt;/span&gt;
&lt;span class="pl-c1"&gt;cd&lt;/span&gt; ..

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Now we use shot-scraper to run a block of JavaScript against our&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; temporary web server. This will execute in the context of that&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; index.html page we created earlier, which has loaded Pyodide&lt;/span&gt;
shot-scraper javascript http://localhost:8529/ &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;async () =&amp;gt; {&lt;/span&gt;
&lt;span class="pl-s"&gt;  // Load Pyodide and all of its necessary assets&lt;/span&gt;
&lt;span class="pl-s"&gt;  let pyodide = await loadPyodide();&lt;/span&gt;
&lt;span class="pl-s"&gt;  // We also need these packages for Datasette to work&lt;/span&gt;
&lt;span class="pl-s"&gt;  await pyodide.loadPackage(['micropip', 'ssl', 'setuptools']);&lt;/span&gt;
&lt;span class="pl-s"&gt;  // We need to escape the backticks because of Bash escaping rules&lt;/span&gt;
&lt;span class="pl-s"&gt;  let output = await pyodide.runPythonAsync(&lt;span class="pl-cce"&gt;\`&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;    import micropip&lt;/span&gt;
&lt;span class="pl-s"&gt;    // This is needed to avoid a dependency conflict error&lt;/span&gt;
&lt;span class="pl-s"&gt;    await micropip.install('h11==0.12.0')&lt;/span&gt;
&lt;span class="pl-s"&gt;    // Here we install the Datasette wheel package we created earlier&lt;/span&gt;
&lt;span class="pl-s"&gt;    await micropip.install('http://localhost:8529/&lt;span class="pl-smi"&gt;$wheel&lt;/span&gt;')&lt;/span&gt;
&lt;span class="pl-s"&gt;    // These imports avoid Pyodide errors importing datasette itself&lt;/span&gt;
&lt;span class="pl-s"&gt;    import ssl&lt;/span&gt;
&lt;span class="pl-s"&gt;    import setuptools&lt;/span&gt;
&lt;span class="pl-s"&gt;    from datasette.app import Datasette&lt;/span&gt;
&lt;span class="pl-s"&gt;    // num_sql_threads=0 is essential or Datasette will crash, since&lt;/span&gt;
&lt;span class="pl-s"&gt;    // Pyodide and WebAssembly cannot start threads&lt;/span&gt;
&lt;span class="pl-s"&gt;    ds = Datasette(memory=True, settings={'num_sql_threads': 0})&lt;/span&gt;
&lt;span class="pl-s"&gt;    // Simulate a hit to execute 'select 55 as itworks' and return the text&lt;/span&gt;
&lt;span class="pl-s"&gt;    (await ds.client.get(&lt;/span&gt;
&lt;span class="pl-s"&gt;      '/_memory.json?sql=select+55+as+itworks&amp;amp;_shape=array'&lt;/span&gt;
&lt;span class="pl-s"&gt;    )).text&lt;/span&gt;
&lt;span class="pl-s"&gt;  &lt;span class="pl-cce"&gt;\`&lt;/span&gt;);&lt;/span&gt;
&lt;span class="pl-s"&gt;  // The last expression in the runPythonAsync block is returned, here&lt;/span&gt;
&lt;span class="pl-s"&gt;  // that's the text returned by the simulated HTTP response to the JSON API&lt;/span&gt;
&lt;span class="pl-s"&gt;  if (JSON.parse(output)[0].itworks != 55) {&lt;/span&gt;
&lt;span class="pl-s"&gt;    // This throws if the JSON API did not return the expected result&lt;/span&gt;
&lt;span class="pl-s"&gt;    // shot-scraper turns that into a non-zero exit code for the script&lt;/span&gt;
&lt;span class="pl-s"&gt;    // which will cause the CI task to fail&lt;/span&gt;
&lt;span class="pl-s"&gt;    throw 'Got ' + output + ', expected itworks: 55';&lt;/span&gt;
&lt;span class="pl-s"&gt;  }&lt;/span&gt;
&lt;span class="pl-s"&gt;  // This gets displayed on the console, with a 0 exit code for a pass&lt;/span&gt;
&lt;span class="pl-s"&gt;  return 'Test passed!';&lt;/span&gt;
&lt;span class="pl-s"&gt;}&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Shut down the server we started earlier, by searching for and killing&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; a process that's running on the port we selected&lt;/span&gt;
pkill -f &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;http.server 8529&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webworkers"&gt;webworkers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pyodide"&gt;pyodide&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="javascript"/><category term="projects"/><category term="python"/><category term="webworkers"/><category term="datasette"/><category term="webassembly"/><category term="pyodide"/><category term="datasette-lite"/><category term="cors"/></entry><entry><title>How to win at CORS</title><link href="https://simonwillison.net/2021/Oct/12/how-to-win-at-cors/#atom-tag" rel="alternate"/><published>2021-10-12T14:07:49+00:00</published><updated>2021-10-12T14:07:49+00:00</updated><id>https://simonwillison.net/2021/Oct/12/how-to-win-at-cors/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://jakearchibald.com/2021/cors/"&gt;How to win at CORS&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Jake Archibald’s definitive guide to CORS, including a handy CORS playground interactive tool. Also includes a useful history explaining why we need CORS in the first place.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/jaffathecake/status/1447900413609529347"&gt;@jaffathecake&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jake-archibald"&gt;jake-archibald&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="security"/><category term="jake-archibald"/><category term="cors"/></entry><entry><title>datasette-cors</title><link href="https://simonwillison.net/2019/Jul/8/datasette-cors/#atom-tag" rel="alternate"/><published>2019-07-08T04:30:53+00:00</published><updated>2019-07-08T04:30:53+00:00</updated><id>https://simonwillison.net/2019/Jul/8/datasette-cors/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-cors"&gt;datasette-cors&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
My other Datasette ASGI plugin: this one wraps my asgi-cors project and lets you configure CORS access from a list of domains (or a set of domain wildcards) so you can make JavaScript calls to a Datasette instance from a specific set of other hosts.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://twitter.com/simonw/status/1148084447687786498"&gt;@simonw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/asgi"&gt;asgi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;



</summary><category term="projects"/><category term="datasette"/><category term="asgi"/><category term="cors"/></entry><entry><title>asgi-cors</title><link href="https://simonwillison.net/2019/May/7/asgi-cors/#atom-tag" rel="alternate"/><published>2019-05-07T00:12:37+00:00</published><updated>2019-05-07T00:12:37+00:00</updated><id>https://simonwillison.net/2019/May/7/asgi-cors/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/asgi-cors"&gt;asgi-cors&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I’ve been trying out the new ASGI 3.0 spec and I just released my first piece of ASGI middleware: asgi-cors, which lets you wrap an ASGI application with Access-Control-Allow-Origin CORS headers (either “*” or dynamic headers based on an origin whitelist).

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://twitter.com/simonw/status/1125553970007568384"&gt;@simonw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/asgi"&gt;asgi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;



</summary><category term="projects"/><category term="security"/><category term="asgi"/><category term="cors"/></entry><entry><title>isomorphic-git</title><link href="https://simonwillison.net/2018/May/16/isomorphic-git/#atom-tag" rel="alternate"/><published>2018-05-16T20:54:04+00:00</published><updated>2018-05-16T20:54:04+00:00</updated><id>https://simonwillison.net/2018/May/16/isomorphic-git/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/isomorphic-git/isomorphic-git"&gt;isomorphic-git&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A pure-JavaScript implementation of the git protocol and underlying tools which works both server-side (Node.js) AND in the client, using an emulation of the fs API. Given the right CORS headers it can clone a GitHub repository over HTTPS right into your browser. Impressive.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=17083807"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;



</summary><category term="git"/><category term="javascript"/><category term="cors"/></entry></feed>