<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: covid19</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/covid19.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2024-10-11T01:45:23+00:00</updated><author><name>Simon Willison</name></author><entry><title>Quoting Ed Yong</title><link href="https://simonwillison.net/2024/Oct/11/ed-yong/#atom-tag" rel="alternate"/><published>2024-10-11T01:45:23+00:00</published><updated>2024-10-11T01:45:23+00:00</updated><id>https://simonwillison.net/2024/Oct/11/ed-yong/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://xoxofest.com/2024/videos/ed-yong/"&gt;&lt;p&gt;Providing validation, strength, and stability to people who feel gaslit and dismissed and forgotten can help them feel stronger and surer in their decisions. These pieces made me understand that journalism can be a caretaking profession, even if it is never really thought about in those terms. It is often framed in terms of antagonism. Speaking truth to power turns into being hard-nosed and removed from our subject matter, which so easily turns into be an asshole and do whatever you like.&lt;/p&gt;
&lt;p&gt;This is a viewpoint that I reject. My pillars are empathy, curiosity, and kindness. And much else flows from that. For people who feel lost and alone, we get to say through our work, you are not. For people who feel like society has abandoned them and their lives do not matter, we get to say, actually, they fucking do. We are one of the only professions that can do that through our work and that can do that at scale.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://xoxofest.com/2024/videos/ed-yong/"&gt;Ed Yong&lt;/a&gt;, at &lt;a href="https://www.youtube.com/watch?v=ddy5uMdzZB8&amp;amp;t=1187s"&gt;19:47&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/journalism"&gt;journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="journalism"/><category term="covid19"/></entry><entry><title>My @covidsewage bot now includes useful alt text</title><link href="https://simonwillison.net/2024/Aug/25/covidsewage-alt-text/#atom-tag" rel="alternate"/><published>2024-08-25T16:09:49+00:00</published><updated>2024-08-25T16:09:49+00:00</updated><id>https://simonwillison.net/2024/Aug/25/covidsewage-alt-text/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://fedi.simonwillison.net/@covidsewage/113023397159658020"&gt;My @covidsewage bot now includes useful alt text&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I've been running a &lt;a href="https://fedi.simonwillison.net/@covidsewage"&gt;@covidsewage&lt;/a&gt; Mastodon bot for a while now, posting daily screenshots (taken with &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt;) of the Santa Clara County &lt;a href="https://publichealth.santaclaracounty.gov/health-information/health-data/disease-data/covid-19/covid-19-wastewater"&gt;COVID in wastewater&lt;/a&gt; dashboard.&lt;/p&gt;
&lt;p&gt;Prior to today the screenshot was accompanied by the decidedly unhelpful alt text "Screenshot of the latest Covid charts".&lt;/p&gt;
&lt;p&gt;I finally fixed that today, closing &lt;a href="https://github.com/simonw/covidsewage-bot/issues/2"&gt;issue #2&lt;/a&gt; more than two years after I first opened it.&lt;/p&gt;
&lt;p&gt;The screenshot is of a Microsoft Power BI dashboard. I hoped I could scrape the key information out of it using JavaScript, but the weirdness of their DOM proved insurmountable.&lt;/p&gt;
&lt;p&gt;Instead, I'm using GPT-4o - specifically, this Python code (run using a &lt;code&gt;python -c&lt;/code&gt; block in the GitHub Actions YAML file):&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;base64&lt;/span&gt;, &lt;span class="pl-s1"&gt;openai&lt;/span&gt;
&lt;span class="pl-s1"&gt;client&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;openai&lt;/span&gt;.&lt;span class="pl-v"&gt;OpenAI&lt;/span&gt;()
&lt;span class="pl-k"&gt;with&lt;/span&gt; &lt;span class="pl-en"&gt;open&lt;/span&gt;(&lt;span class="pl-s"&gt;'/tmp/covid.png'&lt;/span&gt;, &lt;span class="pl-s"&gt;'rb'&lt;/span&gt;) &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-s1"&gt;image_file&lt;/span&gt;:
    &lt;span class="pl-s1"&gt;encoded_image&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;base64&lt;/span&gt;.&lt;span class="pl-en"&gt;b64encode&lt;/span&gt;(&lt;span class="pl-s1"&gt;image_file&lt;/span&gt;.&lt;span class="pl-en"&gt;read&lt;/span&gt;()).&lt;span class="pl-en"&gt;decode&lt;/span&gt;(&lt;span class="pl-s"&gt;'utf-8'&lt;/span&gt;)
&lt;span class="pl-s1"&gt;messages&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [
    {&lt;span class="pl-s"&gt;'role'&lt;/span&gt;: &lt;span class="pl-s"&gt;'system'&lt;/span&gt;,
     &lt;span class="pl-s"&gt;'content'&lt;/span&gt;: &lt;span class="pl-s"&gt;'Return the concentration levels in the sewersheds - single paragraph, no markdown'&lt;/span&gt;},
    {&lt;span class="pl-s"&gt;'role'&lt;/span&gt;: &lt;span class="pl-s"&gt;'user'&lt;/span&gt;, &lt;span class="pl-s"&gt;'content'&lt;/span&gt;: [
        {&lt;span class="pl-s"&gt;'type'&lt;/span&gt;: &lt;span class="pl-s"&gt;'image_url'&lt;/span&gt;, &lt;span class="pl-s"&gt;'image_url'&lt;/span&gt;: {
            &lt;span class="pl-s"&gt;'url'&lt;/span&gt;: &lt;span class="pl-s"&gt;'data:image/png;base64,'&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-s1"&gt;encoded_image&lt;/span&gt;
        }}
    ]}
]
&lt;span class="pl-s1"&gt;completion&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;client&lt;/span&gt;.&lt;span class="pl-s1"&gt;chat&lt;/span&gt;.&lt;span class="pl-s1"&gt;completions&lt;/span&gt;.&lt;span class="pl-en"&gt;create&lt;/span&gt;(&lt;span class="pl-s1"&gt;model&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;'gpt-4o'&lt;/span&gt;, &lt;span class="pl-s1"&gt;messages&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;messages&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;completion&lt;/span&gt;.&lt;span class="pl-s1"&gt;choices&lt;/span&gt;[&lt;span class="pl-c1"&gt;0&lt;/span&gt;].&lt;span class="pl-s1"&gt;message&lt;/span&gt;.&lt;span class="pl-s1"&gt;content&lt;/span&gt;)&lt;/pre&gt;

&lt;p&gt;I'm base64 encoding the screenshot and sending it with this system prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Return the concentration levels in the sewersheds - single paragraph, no markdown&lt;/p&gt;
&lt;/blockquote&gt;
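&lt;p&gt;The data: URL construction step can be pulled out into a standalone helper - a sketch for illustration only (the &lt;code&gt;to_data_url&lt;/code&gt; name is invented here, it isn't part of the script above):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import base64

def to_data_url(png_bytes):
    # Base64-encode raw PNG bytes and wrap them in a data: URL,
    # the inline image format accepted by the chat completions API
    encoded = base64.b64encode(png_bytes).decode("utf-8")
    return "data:image/png;base64," + encoded
&lt;/code&gt;&lt;/pre&gt;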
&lt;p&gt;Given this input image:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a Power BI dashboard showing information that is described below" src="https://static.simonwillison.net/static/2024/covid-power-bi.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Here's the text that comes back:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The concentration levels of SARS-CoV-2 in the sewersheds from collected samples are as follows: San Jose Sewershed has a high concentration, Palo Alto Sewershed has a high concentration, Sunnyvale Sewershed has a high concentration, and Gilroy Sewershed has a medium concentration.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The full implementation can be found in &lt;a href="https://github.com/simonw/covidsewage-bot/blob/main/.github/workflows/post.yml"&gt;the GitHub Actions workflow&lt;/a&gt;, which runs on a schedule at 7am Pacific time every day.&lt;/p&gt;
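&lt;p&gt;A daily schedule like that is configured with a cron trigger in the workflow YAML. A sketch of the relevant fragment (the time shown is illustrative: GitHub Actions cron runs in UTC, so 7am Pacific is 14:00 or 15:00 UTC depending on daylight saving):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;on:
  schedule:
    # 14:00 UTC is 7am Pacific during daylight saving time
    - cron: "0 14 * * *"
  # allow manual runs too
  workflow_dispatch: {}
&lt;/code&gt;&lt;/pre&gt;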


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/accessibility"&gt;accessibility&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alt-text"&gt;alt-text&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-4"&gt;gpt-4&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="accessibility"/><category term="alt-text"/><category term="projects"/><category term="ai"/><category term="covid19"/><category term="shot-scraper"/><category term="openai"/><category term="generative-ai"/><category term="gpt-4"/><category term="llms"/></entry><entry><title>Fix @covidsewage bot to handle a change to the underlying website</title><link href="https://simonwillison.net/2024/Aug/18/fix-covidsewage-bot/#atom-tag" rel="alternate"/><published>2024-08-18T17:26:32+00:00</published><updated>2024-08-18T17:26:32+00:00</updated><id>https://simonwillison.net/2024/Aug/18/fix-covidsewage-bot/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/covidsewage-bot/issues/6"&gt;Fix @covidsewage bot to handle a change to the underlying website&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I've been running &lt;a href="https://fedi.simonwillison.net/@covidsewage"&gt;@covidsewage&lt;/a&gt; on Mastodon since February last year, posting a daily screenshot of the Santa Clara County charts showing Covid levels in wastewater.&lt;/p&gt;
&lt;p&gt;A few days ago the county changed their website, breaking the bot. The chart now lives on their new &lt;a href="https://publichealth.santaclaracounty.gov/health-information/health-data/disease-data/covid-19/covid-19-wastewater"&gt;COVID in wastewater&lt;/a&gt; page.&lt;/p&gt;
&lt;p&gt;It's still a Microsoft Power BI dashboard in an &lt;code&gt;&amp;lt;iframe&amp;gt;&lt;/code&gt;, but my initial attempts to scrape it didn't quite work. Eventually I realized that Cloudflare protection was blocking my attempts to access the page, but thankfully sending a Firefox user-agent fixed that problem.&lt;/p&gt;
&lt;p&gt;The new recipe I'm using to screenshot the chart involves a delightfully messy nested set of calls to &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; - first using &lt;code&gt;shot-scraper javascript&lt;/code&gt; to extract the &lt;code&gt;src&lt;/code&gt; attribute of that &lt;code&gt;&amp;lt;iframe&amp;gt;&lt;/code&gt;, then feeding that URL to a separate &lt;code&gt;shot-scraper&lt;/code&gt; call to generate the screenshot:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper -o /tmp/covid.png $(
  shot-scraper javascript \
    'https://publichealth.santaclaracounty.gov/health-information/health-data/disease-data/covid-19/covid-19-wastewater' \
    'document.querySelector("iframe").src' \
    -b firefox \
    --user-agent 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:128.0) Gecko/20100101 Firefox/128.0' \
    --raw
) --wait 5000 -b firefox --retina
&lt;/code&gt;&lt;/pre&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/p&gt;



</summary><category term="projects"/><category term="covid19"/><category term="shot-scraper"/></entry><entry><title>Building a Covid sewage Twitter bot (and other weeknotes)</title><link href="https://simonwillison.net/2022/Apr/18/covid-sewage/#atom-tag" rel="alternate"/><published>2022-04-18T02:49:06+00:00</published><updated>2022-04-18T02:49:06+00:00</updated><id>https://simonwillison.net/2022/Apr/18/covid-sewage/#atom-tag</id><summary type="html">
    &lt;p&gt;I built a new Twitter bot today: &lt;a href="https://twitter.com/covidsewage"&gt;@covidsewage&lt;/a&gt;. It tweets a daily screenshot of the latest &lt;a href="https://covid19.sccgov.org/dashboard-wastewater"&gt;Covid sewage monitoring data&lt;/a&gt; published by Santa Clara county.&lt;/p&gt;
&lt;p&gt;I'm increasingly distrustful of Covid numbers as fewer people are tested in ways that feed into the official statistics. But the sewage numbers don't lie! As the &lt;a href="https://covid19.sccgov.org/dashboard-wastewater"&gt;Santa Clara county page&lt;/a&gt; explains:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;SARS-CoV-2 (the virus that causes COVID-19) is shed in feces by infected individuals and can be measured in wastewater. More cases of COVID-19 in the community are associated with increased levels of SARS-CoV-2 in wastewater, meaning that data from wastewater analysis can be used as an indicator of the level of transmission of COVID-19 in the community.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That page also embeds some beautiful charts of the latest numbers, powered by an embedded Observable notebook built by &lt;a href="https://www.zanarmstrong.com/"&gt;Zan Armstrong&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Once a day, my bot tweets a screenshot of those latest charts that looks &lt;a href="https://twitter.com/covidsewage/status/1515832038443544578"&gt;like this&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/covidsewage.jpg" alt="Screenshot of a tweet that says &amp;quot;Latest Covid sewage charts for the SF Bay Area&amp;quot; with an attached screenshot of some charts. The numbers are trending up in an alarming direction." style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;h4&gt;How the bot works&lt;/h4&gt;
&lt;p&gt;The bot runs once a day using &lt;a href="https://github.com/simonw/covidsewage-bot/blob/main/.github/workflows/tweet.yml"&gt;this scheduled GitHub Actions workflow&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here's the bit of the workflow that generates the screenshot:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Generate screenshot with shot-scraper&lt;/span&gt;
  &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|-&lt;/span&gt;
&lt;span class="pl-s"&gt;    shot-scraper https://covid19.sccgov.org/dashboard-wastewater \&lt;/span&gt;
&lt;span class="pl-s"&gt;      -s iframe --wait 3000 -b firefox --retina -o /tmp/covid.png&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This uses my &lt;a href="https://datasette.io/tools/shot-scraper"&gt;shot-scraper&lt;/a&gt; screenshot tool, described here &lt;a href="https://simonwillison.net/2022/Mar/10/shot-scraper/"&gt;previously&lt;/a&gt;. It takes a retina screenshot just of the embedded iframe, and uses Firefox because for some reason the default Chromium screenshot failed to load the embed.&lt;/p&gt;
&lt;p&gt;This bit sends the tweet:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Tweet the new image&lt;/span&gt;
  &lt;span class="pl-ent"&gt;env&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;TWITTER_CONSUMER_KEY&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ secrets.TWITTER_CONSUMER_KEY }}&lt;/span&gt;
    &lt;span class="pl-ent"&gt;TWITTER_CONSUMER_SECRET&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ secrets.TWITTER_CONSUMER_SECRET }}&lt;/span&gt;
    &lt;span class="pl-ent"&gt;TWITTER_ACCESS_TOKEN_KEY&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ secrets.TWITTER_ACCESS_TOKEN_KEY }}&lt;/span&gt;
    &lt;span class="pl-ent"&gt;TWITTER_ACCESS_TOKEN_SECRET&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ secrets.TWITTER_ACCESS_TOKEN_SECRET }}&lt;/span&gt;
  &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|-&lt;/span&gt;
&lt;span class="pl-s"&gt;    tweet-images "Latest Covid sewage charts for the SF Bay Area" \&lt;/span&gt;
&lt;span class="pl-s"&gt;      /tmp/covid.png --alt "Screenshot of the charts" &amp;gt; latest-tweet.md&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/tweet-images"&gt;tweet-images&lt;/a&gt; is a tiny new tool I built for this project. It uses the &lt;a href="https://github.com/bear/python-twitter"&gt;python-twitter&lt;/a&gt; library to send a tweet with one or more images attached to it.&lt;/p&gt;
&lt;p&gt;The hardest part of the project was getting the credentials for sending tweets with the bot! I had to go through Twitter's manual verification flow, presumably because I checked the "bot" option when I applied for the new developer account. I also had to figure out how to extract all four credentials (with write permissions) from the Twitter developer portal.&lt;/p&gt;
&lt;p&gt;I wrote up full notes on this in a TIL: &lt;a href="https://til.simonwillison.net/twitter/credentials-twitter-bot"&gt;How to get credentials for a new Twitter bot&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;Datasette for geospatial analysis&lt;/h4&gt;
&lt;p&gt;I stumbled across &lt;a href="https://github.com/datanews/amtrak-geojson"&gt;datanews/amtrak-geojson&lt;/a&gt;, a GitHub repository containing GeoJSON files (from 2015) showing all of the Amtrak stations and sections of track in the USA.&lt;/p&gt;
&lt;p&gt;I decided to try exploring it using my &lt;a href="https://datasette.io/tools/geojson-to-sqlite"&gt;geojson-to-sqlite&lt;/a&gt; tool, which revealed &lt;a href="https://github.com/simonw/geojson-to-sqlite/issues/30"&gt;a bug&lt;/a&gt; triggered by records with a geometry but no properties. I fixed that in version &lt;a href="https://github.com/simonw/geojson-to-sqlite/releases/tag/1.0.1"&gt;1.0.1&lt;/a&gt;, and later shipped version &lt;a href="https://github.com/simonw/geojson-to-sqlite/releases/tag/1.1"&gt;1.1&lt;/a&gt; with improvements by Chris Amico.&lt;/p&gt;
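&lt;p&gt;The failing shape is easy to reproduce: a GeoJSON feature whose &lt;code&gt;properties&lt;/code&gt; member is &lt;code&gt;null&lt;/code&gt;. A minimal sketch of the defensive handling (for illustration - this is not the tool's actual code):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-122.08, 37.39]},
    "properties": None,  # geometry but no properties - the case that triggered the bug
}
# Treat missing or null properties as an empty dict before building columns
properties = feature.get("properties") or {}
&lt;/code&gt;&lt;/pre&gt;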
&lt;p&gt;In exploring the Amtrak data I found myself needing to learn how to use the SpatiaLite &lt;code&gt;GUnion&lt;/code&gt; function to aggregate multiple geometries together. This resulted in a detailed TIL on using &lt;a href="https://til.simonwillison.net/spatialite/gunion-to-combine-geometries"&gt;GUnion to combine geometries in SpatiaLite&lt;/a&gt;, which further evolved as I used it as a chance to learn how to use Chris's &lt;a href="https://datasette.io/plugins/datasette-geojson-map"&gt;datasette-geojson-map&lt;/a&gt; and &lt;a href="https://datasette.io/plugins/sqlite-colorbrewer"&gt;sqlite-colorbrewer&lt;/a&gt; plugins.&lt;/p&gt;
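&lt;p&gt;The core of that technique is a single aggregate call. A sketch of the SQL, with table and column names invented for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Combine every track segment for a route into one geometry
select route_name, GUnion(geometry) as combined_geometry
from amtrak_track_segments
group by route_name;
&lt;/code&gt;&lt;/pre&gt;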
&lt;p&gt;This was so much fun that I was inspired to add a new "uses" page to the official Datasette website: &lt;a href="https://datasette.io/for/geospatial"&gt;Datasette for geospatial analysis&lt;/a&gt; now gathers together links to plugins, tools and tutorials for handling geospatial data.&lt;/p&gt;
&lt;h4&gt;sqlite-utils 3.26&lt;/h4&gt;
&lt;p&gt;I'll quote the release notes for &lt;a href="https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-26"&gt;sqlite-utils 3.26&lt;/a&gt; in full:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New &lt;code&gt;errors=r.IGNORE/r.SET_NULL&lt;/code&gt; parameter for the &lt;code&gt;r.parsedatetime()&lt;/code&gt; and &lt;code&gt;r.parsedate()&lt;/code&gt; &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html#cli-convert-recipes"&gt;convert recipes&lt;/a&gt;. (&lt;a href="https://github.com/simonw/sqlite-utils/issues/416"&gt;#416&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Fixed a bug where &lt;code&gt;--multi&lt;/code&gt; could not be used in combination with &lt;code&gt;--dry-run&lt;/code&gt; for the &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html#cli-convert"&gt;convert&lt;/a&gt; command. (&lt;a href="https://github.com/simonw/sqlite-utils/issues/415"&gt;#415&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;New documentation: &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html#cli-convert-complex"&gt;Using a convert() function to execute initialization&lt;/a&gt;. (&lt;a href="https://github.com/simonw/sqlite-utils/issues/420"&gt;#420&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;More robust detection for whether or not &lt;code&gt;deterministic=True&lt;/code&gt; is supported. (&lt;a href="https://github.com/simonw/sqlite-utils/issues/425"&gt;#425&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;h4&gt;shot-scraper 0.12&lt;/h4&gt;
&lt;p&gt;In addition to &lt;a href="https://github.com/simonw/shot-scraper/pull/56"&gt;support for WebKit&lt;/a&gt; contributed by Ryan Murphy, &lt;a href="https://github.com/simonw/shot-scraper/releases/tag/0.12"&gt;shot-scraper 0.12&lt;/a&gt; adds options for taking a screenshot that encompasses all of the elements on a page that match a CSS selector.&lt;/p&gt;
&lt;p&gt;It also adds a new &lt;code&gt;--js-selector&lt;/code&gt; option, &lt;a href="https://github.com/simonw/shot-scraper/issues/43"&gt;suggested by&lt;/a&gt; Tony Hirst. This covers the case where you want to take a screenshot of an element on the page that cannot be easily specified using a CSS selector. For example, this expression takes a screenshot of the first paragraph on a page that includes the text "shot-scraper":&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shot-scraper https://simonwillison.net/2022/Apr/8/weeknotes/ \
  --js-selector 'el.tagName == "P" &amp;amp;&amp;amp; el.innerText.includes("shot-scraper")' \
  --padding 15 --retina
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;And an airship museum!&lt;/h4&gt;
&lt;p&gt;I finally got to add another listing to my &lt;a href="https://www.niche-museums.com/"&gt;www.niche-museums.com&lt;/a&gt; website about small or niche museums I have visited.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://www.niche-museums.com/105"&gt;Moffett Field Historical Society&lt;/a&gt; museum in Mountain View is situated in the shadow of Hangar One, an airship hangar built in 1933 to house the mighty USS Macon.&lt;/p&gt;
&lt;p&gt;It's the absolute best kind of local history museum. Our docent was a retired pilot who had landed planes on aircraft carriers using the kind of equipment now on display in the museum. They had dioramas and models. They even had a model railway. It was superb.&lt;/p&gt;
&lt;h4&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/tweet-images"&gt;tweet-images&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/tweet-images/releases/tag/0.1.1"&gt;0.1.1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/tweet-images/releases"&gt;2 releases total&lt;/a&gt;) - 2022-04-17
&lt;br /&gt;Send tweets with images from the command line&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/asyncinject"&gt;asyncinject&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/asyncinject/releases/tag/0.3"&gt;0.3&lt;/a&gt; - (&lt;a href="https://github.com/simonw/asyncinject/releases"&gt;5 releases total&lt;/a&gt;) - 2022-04-16
&lt;br /&gt;Run async workflows using pytest-fixtures-style dependency injection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/geojson-to-sqlite"&gt;geojson-to-sqlite&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/geojson-to-sqlite/releases/tag/1.1.1"&gt;1.1.1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/geojson-to-sqlite/releases"&gt;11 releases total&lt;/a&gt;) - 2022-04-13
&lt;br /&gt;CLI tool for converting GeoJSON files to SQLite (with SpatiaLite)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.26"&gt;3.26&lt;/a&gt; - (&lt;a href="https://github.com/simonw/sqlite-utils/releases"&gt;99 releases total&lt;/a&gt;) - 2022-04-13
&lt;br /&gt;Python CLI utility and library for manipulating SQLite databases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/summarize-template"&gt;summarize-template&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/summarize-template/releases/tag/0.1"&gt;0.1&lt;/a&gt; - 2022-04-13
&lt;br /&gt;Show a summary of a Django or Jinja template&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/shot-scraper/releases/tag/0.12"&gt;0.12&lt;/a&gt; - (&lt;a href="https://github.com/simonw/shot-scraper/releases"&gt;13 releases total&lt;/a&gt;) - 2022-04-11
&lt;br /&gt;Tools for taking automated screenshots of websites&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/spatialite/gunion-to-combine-geometries"&gt;GUnion to combine geometries in SpatiaLite&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/macos/apple-photos-large-files"&gt;Trick Apple Photos into letting you access your video files&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/twitter/credentials-twitter-bot"&gt;How to get credentials for a new Twitter bot&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="twitter"/><category term="datasette"/><category term="weeknotes"/><category term="github-actions"/><category term="covid19"/><category term="sqlite-utils"/></entry><entry><title>Weeknotes: CDC vaccination history fixes, developing in GitHub Codespaces</title><link href="https://simonwillison.net/2021/Sep/28/weeknotes/#atom-tag" rel="alternate"/><published>2021-09-28T01:53:49+00:00</published><updated>2021-09-28T01:53:49+00:00</updated><id>https://simonwillison.net/2021/Sep/28/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;I spent the last week mostly surrounded by boxes: we're completing our move to the new place and life is mostly unpacking now. I did find some time to fix some issues with my &lt;a href="https://cdc-vaccination-history.datasette.io/"&gt;CDC vaccination history&lt;/a&gt; Datasette instance though.&lt;/p&gt;
&lt;h4&gt;Fixing my CDC vaccination history site&lt;/h4&gt;
&lt;p&gt;I started tracking changes made to the &lt;a href="https://covid.cdc.gov/covid-data-tracker/#vaccinations_vacc-total-admin-rate-total"&gt;CDC's COVID Data Tracker&lt;/a&gt; website back in February. I created &lt;a href="https://github.com/simonw/cdc-vaccination-history"&gt;a git scraper repository&lt;/a&gt; for it as part of my &lt;a href="https://simonwillison.net/2021/Mar/5/git-scraping/"&gt;five minute lightning talk on git scraping&lt;/a&gt; (notes and video) at this year's NICAR data journalism conference.&lt;/p&gt;
&lt;p&gt;Since then it's been quietly ticking along, recording the latest data in a git repository that now has &lt;a href="https://github.com/simonw/cdc-vaccination-history/commits/main"&gt;335 commits&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In March I &lt;a href="https://github.com/simonw/cdc-vaccination-history/commit/bf88c1e6cc3e5b6344a7dfea5d2a70dcb0552847#diff-87ee5504a3e25ac558b343724c905f2f7949e8cec3d92b9c4300bb922afa164f"&gt;added a script&lt;/a&gt; to build the collected historic data into a SQLite database and publish it to Vercel using GitHub Actions. That started breaking a few weeks ago, and it turned out that was because the database file had grown in size to the point where it was too large to deploy to Vercel (~100MB).&lt;/p&gt;
&lt;p&gt;I got a bug report about this, so I took some time to &lt;a href="https://github.com/simonw/cdc-vaccination-history/issues/8"&gt;move the deployment over&lt;/a&gt; to Google Cloud Run, which doesn't have a documented size limit (though in my experience it starts to creak once you go above about 2GB).&lt;/p&gt;
&lt;p&gt;I also started publishing the raw collected data &lt;a href="https://github.com/simonw/cdc-vaccination-history/issues/9"&gt;directly as a CSV file&lt;/a&gt;, partly as an excuse to learn &lt;a href="https://til.simonwillison.net/googlecloud/gsutil-bucket"&gt;how to publish to Google Cloud Storage&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;datasette-template-request&lt;/h4&gt;
&lt;p&gt;I released an extremely simple plugin this week called &lt;a href="https://datasette.io/plugins/datasette-template-request"&gt;datasette-template-request&lt;/a&gt; - all it does is expose Datasette's &lt;a href="https://docs.datasette.io/en/stable/internals.html#request-object"&gt;request object&lt;/a&gt; in the context passed to &lt;a href="https://docs.datasette.io/en/stable/custom_templates.html"&gt;custom templates&lt;/a&gt;, for people who want to update their custom page based on incoming request parameters.&lt;/p&gt;
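&lt;p&gt;A plugin like this is nearly a one-liner using Datasette's &lt;code&gt;extra_template_vars&lt;/code&gt; plugin hook - roughly this shape:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from datasette import hookimpl

@hookimpl
def extra_template_vars(request):
    # Expose the incoming request object as {{ request }} in custom templates
    return {"request": request}
&lt;/code&gt;&lt;/pre&gt;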
&lt;p&gt;More notable is how I built the plugin: this is the first plugin I've developed, tested and released entirely in my browser using the new &lt;a href="https://github.com/features/codespaces"&gt;GitHub Codespaces&lt;/a&gt; online development environment.&lt;/p&gt;
&lt;p&gt;I created the new repo using my &lt;a href="https://github.com/simonw/datasette-plugin-template-repository"&gt;Datasette plugin template repository&lt;/a&gt;, opened it up in Codespaces, implemented the plugin and tests, tried it out using the port forwarding feature and then published it to PyPI using the &lt;a href="https://github.com/simonw/datasette-template-request/blob/0.1/.github/workflows/publish.yml"&gt;publish.yml&lt;/a&gt; workflow.&lt;/p&gt;
&lt;p&gt;Not having to even open a text editor on my laptop (let alone get a new Python development environment up and running) felt really good. I should turn this into a tutorial.&lt;/p&gt;
&lt;h4&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-template-request"&gt;datasette-template-request&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-template-request/releases/tag/0.1"&gt;0.1&lt;/a&gt; - 2021-09-23
&lt;br /&gt;Expose the Datasette request object to custom templates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-notebook"&gt;datasette-notebook&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-notebook/releases/tag/0.1a1"&gt;0.1a1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/datasette-notebook/releases"&gt;2 releases total&lt;/a&gt;) - 2021-09-22
&lt;br /&gt;A markdown wiki and dashboarding system for Datasette&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-render-markdown"&gt;datasette-render-markdown&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-render-markdown/releases/tag/2.0"&gt;2.0&lt;/a&gt; - (&lt;a href="https://github.com/simonw/datasette-render-markdown/releases"&gt;8 releases total&lt;/a&gt;) - 2021-09-22
&lt;br /&gt;Datasette plugin for rendering Markdown&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.17.1"&gt;3.17.1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/sqlite-utils/releases"&gt;87 releases total&lt;/a&gt;) - 2021-09-22
&lt;br /&gt;Python CLI utility and library for manipulating SQLite databases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/dogsheep/twitter-to-sqlite"&gt;twitter-to-sqlite&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/dogsheep/twitter-to-sqlite/releases/tag/0.22"&gt;0.22&lt;/a&gt; - (&lt;a href="https://github.com/dogsheep/twitter-to-sqlite/releases"&gt;28 releases total&lt;/a&gt;) - 2021-09-21
&lt;br /&gt;Save data from Twitter to a SQLite database&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/googlecloud_gsutil-bucket.md"&gt;Publishing to a public Google Cloud bucket with gsutil&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/javascript_lit-with-skypack.md"&gt;Loading lit from Skypack&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-codespaces"&gt;github-codespaces&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="github"/><category term="projects"/><category term="weeknotes"/><category term="covid19"/><category term="git-scraping"/><category term="github-codespaces"/></entry><entry><title>Quoting Dan Sinker</title><link href="https://simonwillison.net/2021/Aug/23/dan-sinker/#atom-tag" rel="alternate"/><published>2021-08-23T01:59:52+00:00</published><updated>2021-08-23T01:59:52+00:00</updated><id>https://simonwillison.net/2021/Aug/23/dan-sinker/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.theatlantic.com/ideas/archive/2021/08/parents-are-not-okay/619859/"&gt;&lt;p&gt;The rapid increase of COVID-19 cases among kids has shattered last year’s oft-repeated falsehood that kids don’t get COVID-19, and if they do, it’s not that bad. It was a convenient lie that was easy to believe in part because we kept most of our kids home. With remote learning not an option now, this year we’ll find out how dangerous this virus is for children in the worst way possible.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.theatlantic.com/ideas/archive/2021/08/parents-are-not-okay/619859/"&gt;Dan Sinker&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="covid19"/></entry><entry><title>The Tyranny of Spreadsheets</title><link href="https://simonwillison.net/2021/Jul/23/the-tyranny-of-spreadsheets/#atom-tag" rel="alternate"/><published>2021-07-23T03:57:50+00:00</published><updated>2021-07-23T03:57:50+00:00</updated><id>https://simonwillison.net/2021/Jul/23/the-tyranny-of-spreadsheets/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://timharford.com/2021/07/the-tyranny-of-spreadsheets/"&gt;The Tyranny of Spreadsheets&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
In discussing the notorious Excel incident last year, when the UK lost track of 16,000 Covid cases because they exceeded the 65,536-row limit of the old .xls format, Tim Harford presents a history of the spreadsheet, dating all the way back to Francesco di Marco Datini and double-entry bookkeeping in 1396. A delightful piece of writing.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=27923998"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/history"&gt;history&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/spreadsheets"&gt;spreadsheets&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="history"/><category term="spreadsheets"/><category term="covid19"/></entry><entry><title>Trying to end the pandemic a little earlier with VaccinateCA</title><link href="https://simonwillison.net/2021/Feb/28/vaccinateca/#atom-tag" rel="alternate"/><published>2021-02-28T05:40:28+00:00</published><updated>2021-02-28T05:40:28+00:00</updated><id>https://simonwillison.net/2021/Feb/28/vaccinateca/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I got involved with the &lt;a href="https://www.vaccinateca.com/"&gt;VaccinateCA&lt;/a&gt; effort. We are trying to end the pandemic a little earlier, by building the most accurate database possible of vaccination locations and availability in California.&lt;/p&gt;

&lt;h4&gt;VaccinateCA&lt;/h4&gt;
&lt;p&gt;I’ve been following this project for a while through Twitter, mainly via &lt;a href="https://twitter.com/patio11"&gt;Patrick McKenzie&lt;/a&gt; - here’s &lt;a href="https://twitter.com/patio11/status/1351942635682816002"&gt;his tweet&lt;/a&gt; about the project from January 20th.&lt;/p&gt;

&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;&lt;a href="https://t.co/JrD5mb4TAN"&gt;https://t.co/JrD5mb4TAN&lt;/a&gt; calls medical professionals daily to ask who they could vaccinate and how to get in line. We publish this, covering the entire state of California, to help more people get their vaccines faster. Please tell your friends and networks.&lt;/p&gt;- Patrick McKenzie (@patio11) &lt;a href="https://twitter.com/patio11/status/1351942635682816002?ref_src=twsrc%5Etfw"&gt;January 20, 2021&lt;/a&gt;&lt;/blockquote&gt;

&lt;p&gt;The core idea is one of those things that sounds obviously correct the moment you hear it. The Covid vaccination roll-out is decentralized and pretty chaotic. VaccinateCA realized that the best way to find out where the vaccine is available is to call the places that are distributing it - pharmacies, hospitals, clinics - as often as possible and ask if they have any in stock, who is eligible for the shot and how people can sign up for an appointment.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://blog.vaccinateca.com/what-weve-learned-so-far/"&gt;What We've Learned (So Far)&lt;/a&gt; by Patrick talks about lessons learned in the first 42 days of the project.&lt;/p&gt;
&lt;p&gt;There are three public-facing components to VaccinateCA:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.vaccinateca.com/"&gt;www.vaccinateca.com&lt;/a&gt; is a website to help you find available vaccines near you.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;help.vaccinateca&lt;/code&gt; is the web app used by volunteers who make calls - it provides a script and buttons to submit information gleaned from the call. If you’re interested in volunteering there’s &lt;a href="https://www.vaccinateca.com/about-us"&gt;information on the website&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;api.vaccinateca&lt;/code&gt; is the public API, which is &lt;a href="https://docs.vaccinateca.com/reference"&gt;documented here&lt;/a&gt; and is also used by the end-user facing website. It provides a full dump of collected location data, plus information on county policies and large-scale providers (pharmacy chains, health care providers).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The system currently mostly runs on &lt;a href="https://airtable.com/"&gt;Airtable&lt;/a&gt;, and takes advantage of pretty much every feature of that platform.&lt;/p&gt;
&lt;h4&gt;Why I got involved&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://twitter.com/obra"&gt;Jesse Vincent&lt;/a&gt; convinced me to get involved. It turns out to be a perfect fit for both my interests and my skills and experience.&lt;/p&gt;
&lt;p&gt;I’ve built crowdsourcing platforms before - for &lt;a href="https://simonwillison.net/2009/Dec/20/crowdsourcing/"&gt;MP’s expense reports at the Guardian&lt;/a&gt;, and then for conference and event listings with our startup, Lanyrd.&lt;/p&gt;
&lt;p&gt;VaccinateCA is a very data-heavy organization: the key goal is to build a comprehensive database of vaccine locations and availability. My background in data journalism and the last three years I’ve spent working on &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; have given me a wealth of relevant experience here.&lt;/p&gt;
&lt;p&gt;And finally… VaccinateCA are quickly running up against the limits of what you can sensibly do with Airtable - especially given Airtable’s hard limit at 100,000 records. They need to port critical tables to a custom PostgreSQL database, while maintaining as much as possible the agility that Airtable has enabled for them.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.djangoproject.com/"&gt;Django&lt;/a&gt; is a great fit for this kind of challenge, and I know quite a bit about both Django and using Django to quickly build robust, scalable and maintainable applications!&lt;/p&gt;
&lt;p&gt;So I spent this week starting a Django replacement for the Airtable backend used by the volunteer calling application. I hope to get to feature parity (at least as an API backend that the application can write to) in the next few days, to demonstrate that a switch-over is both possible and a good idea.&lt;/p&gt;
&lt;h4&gt;What about Datasette?&lt;/h4&gt;
&lt;p&gt;On Monday I spun up a Datasette instance at &lt;a href="https://vaccinateca.datasette.io/"&gt;vaccinateca.datasette.io&lt;/a&gt; (&lt;a href="https://github.com/simonw/vaccinate-ca-datasette/"&gt;underlying repository&lt;/a&gt;) against data from the public VaccinateCA API. The map visualization of &lt;a href="https://vaccinateca.datasette.io/vaccinateca/locations?_facet=Affiliation&amp;amp;_facet=Latest+report+yes%3F&amp;amp;_facet_array=Availability+Info"&gt;all of the locations&lt;/a&gt; instantly proved useful in helping spot locations that had incorrectly been located with latitudes and longitudes outside of California.&lt;/p&gt;
&lt;p&gt;I hope to use Datasette for a variety of tasks like this, but it shouldn’t be the core of the solution. VaccinateCA is the perfect example of a problem that needs to be solved with &lt;a href="http://boringtechnology.club/"&gt;Boring Technology&lt;/a&gt; - it needs to Just Work, and time that could be spent learning exciting new technologies needs to be spent building what’s needed as quickly, robustly and risk-free as possible.&lt;/p&gt;
&lt;p&gt;That said, I’m already starting to experiment with the new &lt;a href="https://docs.djangoproject.com/en/3.1/ref/models/fields/#django.db.models.JSONField"&gt;JSONField&lt;/a&gt; introduced in Django 3.1 - I’m hoping that a few JSON columns can help compensate for the lack of flexibility compared to Airtable, which makes it ridiculously easy for anyone to add additional columns.&lt;/p&gt;
&lt;p&gt;(To be fair JSONField has been a feature of Django's PostgreSQL extension since &lt;a href="https://docs.djangoproject.com/en/3.1/releases/1.9/"&gt;version 1.9 in 2015&lt;/a&gt; so it's just about made it into the boring technology bucket by now.)&lt;/p&gt;
&lt;h4&gt;Also this week&lt;/h4&gt;
&lt;p&gt;Working on VaccinateCA has given me a chance to use some of my tools in new and interesting ways, so I got to ship a bunch of small fixes, detailed in &lt;a href="#releases-2021-feb-27"&gt;Releases this week&lt;/a&gt; below.&lt;/p&gt;
&lt;p&gt;On Friday I gave a talk at &lt;a href="https://speakeasyjs.com/"&gt;Speakeasy JS&lt;/a&gt;, "the JavaScript meetup for 🥼 mad science, 🧙‍♂️ hacking, and 🧪 experiments", about why "SQL in your client-side JavaScript is a great idea". The video for that &lt;a href="https://www.youtube.com/watch?v=JyOYqJGrWak"&gt;is on YouTube&lt;/a&gt; and I plan to provide a full write-up soon.&lt;/p&gt;
&lt;p&gt;I also recorded a five minute lightning talk about &lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;Git Scraping&lt;/a&gt; for next week's &lt;a href="https://www.ire.org/training/conferences/nicar-2021/"&gt;NICAR 2021&lt;/a&gt; data journalism conference.&lt;/p&gt;
&lt;p&gt;I also made a few small cosmetic upgrades to the way tags are displayed on my blog - they now show with a rounded border and purple background, and include a count of items published with that tag. My &lt;a href="https://simonwillison.net/tags/"&gt;tags page&lt;/a&gt; is one example of where I've now applied this style.&lt;/p&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/sphinx_sphinx-ext-extlinks.md"&gt;Using sphinx.ext.extlinks for issue links&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/postgresql_show-schema.md"&gt;Show the SQL schema for a PostgreSQL database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/github-actions_postgresq-service-container.md"&gt;Running tests against PostgreSQL in a service container&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/django_extra-read-only-admin-information.md"&gt;Adding extra read-only information to a Django admin change page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/postgresql_read-only-postgresql-user.md"&gt;Granting a PostgreSQL user read-only access to some tables&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="releases-2021-feb-27"&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/flatten-single-item-arrays"&gt;flatten-single-item-arrays&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/flatten-single-item-arrays/releases/tag/0.1"&gt;0.1&lt;/a&gt; - 2021-02-25
&lt;br /&gt;Given a JSON list of objects, flatten any keys which always contain single item arrays to just a single value&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-auth-github"&gt;datasette-auth-github&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-auth-github/releases/tag/0.13.1"&gt;0.13.1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/datasette-auth-github/releases"&gt;25 releases total&lt;/a&gt;) - 2021-02-25
&lt;br /&gt;Datasette plugin that authenticates users against GitHub&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-block"&gt;datasette-block&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-block/releases/tag/0.1.1"&gt;0.1.1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/datasette-block/releases"&gt;2 releases total&lt;/a&gt;) - 2021-02-25
&lt;br /&gt;Block all access to specific path prefixes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/github-contents"&gt;github-contents&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/github-contents/releases/tag/0.2"&gt;0.2&lt;/a&gt; - 2021-02-24
&lt;br /&gt;Python class for reading and writing data to a GitHub repository&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/csv-diff"&gt;csv-diff&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/csv-diff/releases/tag/1.1"&gt;1.1&lt;/a&gt; - (&lt;a href="https://github.com/simonw/csv-diff/releases"&gt;9 releases total&lt;/a&gt;) - 2021-02-23
&lt;br /&gt;Python CLI tool and library for diffing CSV and JSON files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-transform"&gt;sqlite-transform&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/sqlite-transform/releases/tag/0.4"&gt;0.4&lt;/a&gt; - (&lt;a href="https://github.com/simonw/sqlite-transform/releases"&gt;5 releases total&lt;/a&gt;) - 2021-02-22
&lt;br /&gt;Tool for running transformations on columns in a SQLite database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/airtable-export"&gt;airtable-export&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/airtable-export/releases/tag/0.5"&gt;0.5&lt;/a&gt; - (&lt;a href="https://github.com/simonw/airtable-export/releases"&gt;7 releases total&lt;/a&gt;) - 2021-02-22
&lt;br /&gt;Export Airtable data to YAML, JSON or SQLite files on disk&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/crowdsourcing"&gt;crowdsourcing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/patrick-mckenzie"&gt;patrick-mckenzie&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vaccinate-ca"&gt;vaccinate-ca&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/personal-news"&gt;personal-news&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jesse-vincent"&gt;jesse-vincent&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="crowdsourcing"/><category term="django"/><category term="postgresql"/><category term="patrick-mckenzie"/><category term="datasette"/><category term="weeknotes"/><category term="covid19"/><category term="vaccinate-ca"/><category term="personal-news"/><category term="jesse-vincent"/></entry><entry><title>CoronaFaceImpact</title><link href="https://simonwillison.net/2020/Nov/15/coronafaceimpact/#atom-tag" rel="alternate"/><published>2020-11-15T22:41:16+00:00</published><updated>2020-11-15T22:41:16+00:00</updated><id>https://simonwillison.net/2020/Nov/15/coronafaceimpact/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://v-fonts.com/fonts/coronafaceimpact"&gt;CoronaFaceImpact&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Variable fonts can be customized by passing in additional parameters, which is done in CSS using the font-variation-settings property. Here’s a variable font that shows multiple effects of Covid-19 lockdown on a bearded face, created by Friedrich Althausen.
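A rough illustration of the property in action - the axis tags here are hypothetical, since each variable font defines its own:

```css
/* Load a variable font (filename assumed for illustration) */
@font-face {
  font-family: "CoronaFaceImpact";
  src: url("CoronaFaceImpact-VF.woff2") format("woff2-variations");
}

/* Dial the custom axes up or down to change the rendered face.
   "BERD" and "HAIR" are invented axis tags - real fonts document theirs. */
.locked-down-face {
  font-family: "CoronaFaceImpact";
  font-variation-settings: "BERD" 700, "HAIR" 300;
}
```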

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://chat.indieweb.org/2020-11-15/1605479988328700"&gt;Kevin Marks&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/css"&gt;css&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fonts"&gt;fonts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/typography"&gt;typography&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="css"/><category term="fonts"/><category term="typography"/><category term="covid19"/></entry><entry><title>Quoting Wade Davis</title><link href="https://simonwillison.net/2020/Aug/8/wade-davis/#atom-tag" rel="alternate"/><published>2020-08-08T15:48:28+00:00</published><updated>2020-08-08T15:48:28+00:00</updated><id>https://simonwillison.net/2020/Aug/8/wade-davis/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.rollingstone.com/politics/political-commentary/covid-19-end-of-american-era-wade-davis-1038206/"&gt;&lt;p&gt;COVID-19 attacks our physical bodies, but also the cultural foundations of our lives, the toolbox of community and connectivity that is for the human what claws and teeth represent to the tiger.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.rollingstone.com/politics/political-commentary/covid-19-end-of-american-era-wade-davis-1038206/"&gt;Wade Davis&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="covid19"/></entry><entry><title>Weeknotes: datasette-auth-passwords, a Datasette logo and a whole lot more</title><link href="https://simonwillison.net/2020/Jul/17/weeknotes-datasette-logo/#atom-tag" rel="alternate"/><published>2020-07-17T03:41:13+00:00</published><updated>2020-07-17T03:41:13+00:00</updated><id>https://simonwillison.net/2020/Jul/17/weeknotes-datasette-logo/#atom-tag</id><summary type="html">
    &lt;p&gt;All sorts of project updates this week.&lt;/p&gt;

&lt;h4&gt;datasette-auth-passwords&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://simonwillison.net/2020/Jun/12/annotated-release-notes/"&gt;Datasette 0.44&lt;/a&gt; added authentication support as a core concept, but left the actual implementation details up to the plugins.&lt;/p&gt;

&lt;p&gt;I released &lt;a href="https://github.com/simonw/datasette-auth-passwords"&gt;datasette-auth-passwords&lt;/a&gt; on Monday. It's an implementation of the most obvious form of authentication (as opposed to &lt;a href="https://github.com/simonw/datasette-auth-github"&gt;GitHub SSO&lt;/a&gt; or &lt;a href="https://github.com/simonw/datasette-auth-tokens"&gt;bearer tokens&lt;/a&gt; or &lt;a href="https://github.com/simonw/datasette-auth-existing-cookies"&gt;existing domain cookies&lt;/a&gt;): usernames and passwords, typed into a form.&lt;/p&gt;

&lt;p&gt;Implementing passwords responsibly is actually pretty tricky, due to the need to effectively hash them. After &lt;a href="https://github.com/simonw/datasette-auth-passwords/issues/1"&gt;some research&lt;/a&gt; I ended up mostly copying how Django does it (never a bad approach): I'm using 260,000 salted pbkdf2_hmac iterations, taking advantage of the Python standard library. I wrote this up &lt;a href="https://github.com/simonw/til/blob/master/python/password-hashing-with-pbkdf2.md"&gt;in a TIL&lt;/a&gt;.&lt;/p&gt;
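&lt;p&gt;A minimal sketch of that approach using only the standard library - the stored string format here is illustrative, not the plugin's exact layout:&lt;/p&gt;

```python
import hashlib
import hmac
import os

ITERATIONS = 260_000  # salted pbkdf2_hmac iterations, matching Django's default at the time

def hash_password(password, salt=None):
    """Hash a password with a random salt; returns an algorithm$iterations$salt$hash string."""
    salt = salt or os.urandom(16).hex()
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), ITERATIONS)
    return f"pbkdf2_sha256${ITERATIONS}${salt}${digest.hex()}"

def verify_password(password, stored):
    """Re-derive the hash using the stored salt and iteration count, compare in constant time."""
    _algorithm, iterations, salt, expected = stored.split("$")
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), int(iterations))
    return hmac.compare_digest(digest.hex(), expected)
```

&lt;p&gt;Storing the iteration count alongside the hash is what lets you raise it later without invalidating existing accounts.&lt;/p&gt;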

&lt;p&gt;The plugin currently only supports hard-coded password hashes that are fed to Datasette via an environment variable - enough to set up a password-protected Datasette instance with a couple of users, but not really good for anything more complex than that. I have an &lt;a href="https://github.com/simonw/datasette-auth-passwords/issues/6"&gt;open issue&lt;/a&gt; for implementing database-backed password accounts, although again the big challenge is figuring out how to responsibly store those password hashes.&lt;/p&gt;

&lt;p&gt;I've set up a live demo of the password plugin at &lt;a href="https://datasette-auth-passwords-demo.datasette.io/"&gt;datasette-auth-passwords-demo.datasette.io&lt;/a&gt; - you can sign into it to reveal a private database that's only available to authenticated users.&lt;/p&gt;

&lt;h4&gt;Datasette website and logo&lt;/h4&gt;

&lt;p&gt;I'm finally making good progress on a website for Datasette. As part of that I've been learning to use &lt;a href="https://www.figma.com/"&gt;Figma&lt;/a&gt;, which I used to create a Datasette logo.&lt;/p&gt;

&lt;p&gt;&lt;img alt="Datasette" src="https://static.simonwillison.net/static/2020/datasette-logo.svg" style="max-width: 100%; margin: 1.5em 0" /&gt;&lt;/p&gt;

&lt;p&gt;Figma is really neat: it's an entirely web-based vector image editor, aimed at supporting the kind of design work that goes into websites and apps. It has full collaborative editing for teams but it's free for single users. Most importantly it has &lt;a href="https://www.figma.com/blog/with-figmas-new-svg-exports-less-more/"&gt;extremely competent SVG exports&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I've added the logo to &lt;a href="https://datasette.readthedocs.io/en/latest/"&gt;the latest version&lt;/a&gt; of the Datasette docs, and I have an &lt;a href="https://github.com/readthedocs/sphinx_rtd_theme/pull/978"&gt;open pull request&lt;/a&gt; to &lt;code&gt;sphinx_rtd_theme&lt;/code&gt; to add support for setting a custom link target on the logo so I can link back to the rest of the official site, when it goes live.&lt;/p&gt;

&lt;h4&gt;TIL search snippet highlighting&lt;/h4&gt;

&lt;p&gt;My &lt;a href="https://til.simonwillison.net/"&gt;TIL site&lt;/a&gt; has a search engine, but it didn't do snippet highlighting. I reused the pattern I described in &lt;a href="https://24ways.org/2018/fast-autocomplete-search-for-your-website/"&gt;Fast Autocomplete Search for Your Website&lt;/a&gt; - implemented server-side rather than client-side this time - to add that functionality. The implementation &lt;a href="https://github.com/simonw/til/commit/51f5daef61b6bbe6c5be564b8644d2bff6761ab0"&gt;is here&lt;/a&gt; - here's &lt;a href="https://til.simonwillison.net/til/search?q=asgi"&gt;a demo&lt;/a&gt; of it in action.&lt;/p&gt;
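&lt;p&gt;The core idea can be sketched with a toy substring matcher - the real implementation leans on SQLite full-text search, and &lt;code&gt;highlight_snippet&lt;/code&gt; and its parameters are invented here for illustration:&lt;/p&gt;

```python
import re

def highlight_snippet(text, query, context=30, marker="*"):
    """Return a short snippet around the first case-insensitive match,
    with the matched term wrapped in marker characters."""
    match = re.search(re.escape(query), text, re.IGNORECASE)
    if match is None:
        # No match: fall back to the start of the document
        return text[: context * 2]
    start = max(0, match.start() - context)
    end = min(len(text), match.end() + context)
    snippet = (
        text[start : match.start()]
        + marker + match.group(0) + marker
        + text[match.end() : end]
    )
    prefix = "..." if start else ""
    suffix = "..." if end != len(text) else ""
    return prefix + snippet + suffix
```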

&lt;h4&gt;SRCCON schedule&lt;/h4&gt;

&lt;p&gt;I'm attending (virtually) the &lt;a href="https://2020.srccon.org/"&gt;SRCCON 2020&lt;/a&gt; journalism conference this week, and Datasette is part of the &lt;a href="https://2020.srccon.org/projects-products-research/#datasette"&gt;Projects, Products, &amp;amp; Research&lt;/a&gt; track.&lt;/p&gt;

&lt;p&gt;As a demo, I set up a Datasette powered copy of the conference schedule at &lt;a href="https://srccon-2020.datasette.io/"&gt;srccon-2020.datasette.io&lt;/a&gt; - it's running the &lt;a href="https://github.com/simonw/datasette-ics"&gt;datasette-ics&lt;/a&gt; plugin which means it can provide a URL that can be subscribed to in Google or Apple Calendar.&lt;/p&gt;

&lt;p&gt;The site runs out of the &lt;a href="https://github.com/simonw/srccon-2020-datasette"&gt;simonw/srccon-2020-datasette&lt;/a&gt; repository, which uses a GitHub Action to download the schedule JSON, modify it a little (mainly to turn the start and end dates into ISO datestamps), save it to a SQLite database with &lt;a href="https://github.com/simonw/sqlite-utils"&gt;sqlite-utils&lt;/a&gt; and publish it to &lt;a href="https://vercel.com/"&gt;Vercel&lt;/a&gt;.&lt;/p&gt;
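&lt;p&gt;The transformation step might look something like this with just the standard library - the real pipeline uses &lt;code&gt;sqlite-utils&lt;/code&gt;, and the input keys here are hypothetical:&lt;/p&gt;

```python
import sqlite3
from datetime import datetime, timezone

def load_sessions(db_path, sessions):
    """Normalize schedule timestamps to ISO datestamps and load them into SQLite."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sessions "
        "(id TEXT PRIMARY KEY, title TEXT, start_time TEXT, end_time TEXT)"
    )
    for session in sessions:
        # Hypothetical input keys - the real SRCCON schedule JSON differs
        start = datetime.fromtimestamp(session["start_epoch"], tz=timezone.utc).isoformat()
        end = datetime.fromtimestamp(session["end_epoch"], tz=timezone.utc).isoformat()
        db.execute(
            "INSERT OR REPLACE INTO sessions VALUES (?, ?, ?, ?)",
            (session["id"], session["title"], start, end),
        )
    db.commit()
    return db
```

&lt;p&gt;ISO datestamps sort correctly as plain text, which is what makes them friendly to SQLite and to calendar-feed plugins downstream.&lt;/p&gt;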

&lt;h4&gt;Covid 19 population data&lt;/h4&gt;

&lt;p&gt;My &lt;a href="https://simonwillison.net/2020/Mar/11/covid-19/"&gt;Covid-19 tracker&lt;/a&gt; publishes updated numbers of cases and deaths from the New York Times, the LA Times and Johns Hopkins university on an hourly basis.&lt;/p&gt;

&lt;p&gt;One thing that was missing was county population data. US counties are identified in the data by their &lt;a href="https://en.wikipedia.org/wiki/FIPS_county_code"&gt;FIPS codes&lt;/a&gt;, which offers a mechanism for joining against population estimates pulled from the US Census.&lt;/p&gt;

&lt;p&gt;Thanks to &lt;a href="https://github.com/nytimes/covid-19-data/pull/155"&gt;Aaron King&lt;/a&gt; I've now incorporated that data into the site, as a new &lt;a href="https://covid-19.datasettes.com/covid/us_census_county_populations_2019"&gt;us_census_county_populations_2019&lt;/a&gt; table.&lt;/p&gt;

&lt;p&gt;I used that data to define a SQL view - &lt;a href="https://covid-19.datasettes.com/covid/latest_ny_times_counties_with_populations"&gt;latest_ny_times_counties_with_populations&lt;/a&gt; - which shows the latest New York Times county data with new derived &lt;code&gt;cases_per_million&lt;/code&gt; and &lt;code&gt;deaths_per_million&lt;/code&gt; columns.&lt;/p&gt;
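&lt;p&gt;A view like that can be defined along these lines - table and column names are simplified from the real schema, and the sample rows are invented:&lt;/p&gt;

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE counties (fips TEXT PRIMARY KEY, cases INTEGER, deaths INTEGER);
CREATE TABLE populations (fips TEXT PRIMARY KEY, population INTEGER);

-- Derive per-million rates by joining case counts against census populations
CREATE VIEW counties_with_rates AS
SELECT
  counties.fips,
  cases,
  deaths,
  population,
  1000000.0 * cases / population AS cases_per_million,
  1000000.0 * deaths / population AS deaths_per_million
FROM counties JOIN populations ON counties.fips = populations.fips;
""")
db.execute("INSERT INTO counties VALUES ('06075', 5000, 50)")
db.execute("INSERT INTO populations VALUES ('06075', 881549)")
row = db.execute("SELECT cases_per_million FROM counties_with_rates").fetchone()
```

&lt;p&gt;Because it's a view rather than a table, the derived columns stay correct as the hourly scrapes update the underlying case counts.&lt;/p&gt;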

&lt;h4&gt;Tweaks to this blog&lt;/h4&gt;

&lt;p&gt;For many years this blog's main content has sat on the left of the page - which looks increasingly strange as screens get wider and wider. As of &lt;a href="https://github.com/simonw/simonwillisonblog/commit/3d44c67a2cfee128d0168cb2e6a650f45211446a"&gt;this commit&lt;/a&gt; the main layout is centered, which I think looks much nicer.&lt;/p&gt;

&lt;p&gt;I also ran &lt;a href="https://github.com/simonw/simonwillisonblog/commit/b085679933985c44b8171b556d141cdef8f232d2"&gt;a data migration&lt;/a&gt; to fix some old internal links.&lt;/p&gt;

&lt;h4&gt;Miscellaneous&lt;/h4&gt;

&lt;p&gt;I gave a (virtual) talk at &lt;a href="https://www.djangolondon.com/"&gt;Django London&lt;/a&gt; on Monday about Datasette. I've taken to sharing a Google Doc for this kind of talk, which I prepare before the talk with notes and then update afterwards to reflect additional material from the Q&amp;amp;A. Here's &lt;a href="https://docs.google.com/document/d/17ZDlxHOqDGugKqn_Nh_Q7JER5vjKin1D3d17oPhrs9o/edit"&gt;the document&lt;/a&gt; from Monday's talk.&lt;/p&gt;

&lt;p&gt;San Francisco Public Works maintain a page of &lt;a href="https://sfpublicworks.org/tree-removal-notifications"&gt;tree removal notifications&lt;/a&gt; showing trees that are scheduled for removal. I &lt;a href="https://simonwillison.net/2019/Mar/13/tree-history/"&gt;like those trees&lt;/a&gt;. They don't provide an archive of notifications from that page, so I've set up a &lt;a href="https://simonwillison.net/tags/gitscraping/"&gt;git scraping&lt;/a&gt; &lt;a href="https://github.com/simonw/sfpublicworks-tree-removal-notifications"&gt;GitHub repository&lt;/a&gt; that scrapes the page daily and maintains a history of its contents in the commit log.&lt;/p&gt;

&lt;p&gt;I updated &lt;a href="https://github.com/simonw/datasette-publish-fly/releases/tag/1.0"&gt;datasette-publish-fly&lt;/a&gt; for compatibility with Datasette 0.44 and Python 3.6.&lt;/p&gt;

&lt;p&gt;I made a few tweaks to &lt;a href="https://simonwillison.net/2020/Jul/10/self-updating-profile-readme/"&gt;my GitHub profile README&lt;/a&gt;, which is now Apache 2 licensed so people know they can adapt it for their own purposes.&lt;/p&gt;

&lt;p&gt;I released &lt;a href="https://github.com/dogsheep/github-to-sqlite/releases/tag/2.3"&gt;github-to-sqlite 2.3&lt;/a&gt; with a new option for fetching information for just specific repositories.&lt;/p&gt;

&lt;p&gt;The Develomentor podcast published &lt;a href="https://develomentor.com/2020/07/16/simon-willison-data-journalism-the-importance-of-side-projects/"&gt;an interview with me&lt;/a&gt; about my career, and how it's been mostly defined by side-projects.&lt;/p&gt;

&lt;h4&gt;TIL this week&lt;/h4&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/heroku/pg-pull.md"&gt;Using heroku pg:pull to restore a backup to a macOS laptop&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/python/password-hashing-with-pbkdf2.md"&gt;Password hashing in Python with pbkdf2&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design"&gt;design&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/passwords"&gt;passwords&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="design"/><category term="passwords"/><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="covid19"/><category term="git-scraping"/></entry><entry><title>Weeknotes: SBA Covid-19 PPP loans, Datasette talks, Datasette plugin upgrades</title><link href="https://simonwillison.net/2020/Jul/9/sba-covid-19-ppp-loans/#atom-tag" rel="alternate"/><published>2020-07-09T22:44:49+00:00</published><updated>2020-07-09T22:44:49+00:00</updated><id>https://simonwillison.net/2020/Jul/9/sba-covid-19-ppp-loans/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I've mainly been exploring Small Business Administration Covid-19 loans data, pitching some talks and upgrading some plugins for compatibility with Datasette 0.44+.&lt;/p&gt;

&lt;h4&gt;SBA PPP Covid-19 loan data&lt;/h4&gt;

&lt;p&gt;On Monday the Small Business Administration and the Treasury Department &lt;a href="https://home.treasury.gov/news/press-releases/sm1052"&gt;released detailed loan-level data&lt;/a&gt; for loans made under the Paycheck Protection Program as part of their Covid-19 response.&lt;/p&gt;

&lt;p&gt;They released the data as &lt;a href="https://sba.app.box.com/s/wz72fqag1nd99kj3t9xlq49deoop6gzf"&gt;a zip file full of CSVs&lt;/a&gt; on their Box account (the first time I've seen Box used for this kind of government data release).&lt;/p&gt;

&lt;p&gt;The most interesting file in there was &lt;code&gt;foia_150k_plus.csv&lt;/code&gt; - a file containing 661,218 loans over $150,000. So I loaded it into Datasette and published it at &lt;a href="https://sba-loans-covid-19.datasettes.com/loans_150k_plus/foia_150k_plus"&gt;https://sba-loans-covid-19.datasettes.com/loans_150k_plus/foia_150k_plus&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I made one modification to the data: on the &lt;a href="https://twitter.com/zeeg/status/1280280556102512640"&gt;suggestion of David Cramer&lt;/a&gt; I imported a list of NAICS code descriptions &lt;a href="https://www.census.gov/eos/www/naics/downloadables/downloadables.html"&gt;from the US Census&lt;/a&gt; and set up the &lt;code&gt;NAICSCode&lt;/code&gt; column as a foreign key against that table.&lt;/p&gt;
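&lt;p&gt;The lookup-table pattern is easy to sketch with just Python's standard library - the table and column names here match the ones above, but the rows are invented for illustration:&lt;/p&gt;

```python
import sqlite3

# Sketch of the lookup-table pattern: a naics_2017 table of code
# descriptions, with the loans table's NAICSCode column pointing at it.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE naics_2017 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE foia_150k_plus (
        BusinessName TEXT,
        NAICSCode INTEGER REFERENCES naics_2017(id)
    );
    """
)
db.execute("INSERT INTO naics_2017 VALUES (621210, 'Offices of Dentists')")
db.execute("INSERT INTO foia_150k_plus VALUES ('Example Dental LLC', 621210)")

# The foreign key lets queries join loans to human-readable descriptions:
row = db.execute(
    """
    SELECT f.BusinessName, n.name
    FROM foia_150k_plus f JOIN naics_2017 n ON f.NAICSCode = n.id
    """
).fetchone()
print(row)  # ('Example Dental LLC', 'Offices of Dentists')
```

&lt;p&gt;In the real project &lt;code&gt;csvs-to-sqlite&lt;/code&gt; handled the import; this is just the shape of the resulting schema.&lt;/p&gt;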

&lt;p&gt;Here's &lt;a href="https://sba-loans-covid-19.datasettes.com/loans_150k_plus?sql=with+counts+as+%28select+NAICSCode%2C+count%28*%29+as+num_loan_recipients+from+foia_150k_plus+group+by+NAICSCode+order+by+num_loan_recipients+desc%29%0D%0Aselect+counts.NAICSCode%2C+counts.num_loan_recipients%2C+naics_2017.name%2C+%27https%3A%2F%2Fsba-loans-covid-19.datasettes.com%2Floans_150k_plus%2Ffoia_150k_plus%3F_facet%3DCity%26_facet%3DState%26_facet%3DRaceEthnicity%26_facet%3DBusinessType%26_facet%3DGender%26_facet%3DVeteran%26NAICSCode%3D%27+%7C%7C+NAICSCode+as+view_them+from+counts+join+naics_2017+on+counts.NAICSCode+%3D+naics_2017.id"&gt;a custom query&lt;/a&gt; showing the NAICS codes with the most loan claims &amp;gt; $150k - &lt;a href="https://sba-loans-covid-19.datasettes.com/loans_150k_plus/foia_150k_plus?_facet=City&amp;amp;_facet=State&amp;amp;_facet=RaceEthnicity&amp;amp;_facet=BusinessType&amp;amp;_facet=Gender&amp;amp;_facet=Veteran&amp;amp;NAICSCode=621210"&gt;Offices of Dentists&lt;/a&gt; come in 8th place with 10,627 loans!&lt;/p&gt;

&lt;p&gt;My &lt;a href="https://twitter.com/simonw/status/1280283053726691329"&gt;Twitter thread&lt;/a&gt; has more commentary on things I found exploring the data, and my &lt;a href="https://github.com/simonw/sba-loans-covid-19-datasette"&gt;sba-loans-covid-19-datasette GitHub repo&lt;/a&gt; describes the exact steps I went through to create the Datasette instance (using &lt;a href="https://github.com/simonw/csvs-to-sqlite"&gt;csvs-to-sqlite&lt;/a&gt; and &lt;a href="https://github.com/simonw/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;).&lt;/p&gt;

&lt;h4&gt;Pitching some talks&lt;/h4&gt;

&lt;p&gt;I haven't done any public speaking in a while, and the pandemic means I'm not going to be giving any in-person talks for the foreseeable future... so I spent some time pitching talks to remote events.&lt;/p&gt;

&lt;p&gt;I'll be speaking &lt;a href="https://www.meetup.com/djangolondon/events/271800940/"&gt;at Django London&lt;/a&gt; on July 14th and I have a few other submissions in the pipeline.&lt;/p&gt;

&lt;p&gt;I'm also attending (virtually) the &lt;a href="https://2020.srccon.org/"&gt;SRCCON journalism conference&lt;/a&gt; next week. They asked me to put together a short video introduction to Datasette, which I've embedded below. I'll be hanging out and talking to anyone who's interested in learning more about the project, or who can help me figure out what direction to take it next.&lt;/p&gt;

&lt;iframe src="https://player.vimeo.com/video/436903714" style="width: 100%" width="640" height="400" frameborder="0" allow="autoplay; fullscreen" allowfullscreen="allowfullscreen"&gt;&amp;#160;&lt;/iframe&gt;
&lt;p&gt;&lt;a href="https://vimeo.com/436903714"&gt;SRCCON 2020: Datasette&lt;/a&gt; from &lt;a href="https://vimeo.com/user23009240"&gt;OpenNews Source&lt;/a&gt; on &lt;a href="https://vimeo.com"&gt;Vimeo&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Upgrading plugins&lt;/h4&gt;

&lt;p&gt;Datasette 0.44 broke some of my existing plugins due to a change in how it handles ASGI lifespan events. I've upgraded the following this week:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-configure-fts/releases/tag/1.0"&gt;datasette-configure-fts 1.0&lt;/a&gt; - a plugin for configuring which columns in a table are enabled for full-text search.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-edit-tables/releases/tag/0.2a"&gt;datasette-edit-tables 0.2a&lt;/a&gt; - tools for renaming tables and adding columns. This isn't particularly useful yet but I'm excited about its potential.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-media/releases/tag/0.3"&gt;datasette-media 0.3&lt;/a&gt; - a plugin for serving media from disk based on paths served out of the SQLite database.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-search-all/releases/tag/0.3"&gt;datasette-search-all 0.3&lt;/a&gt; - a plugin providing a mechanism for searching all FTS-enabled tables at once, &lt;a href="https://simonwillison.net/2020/Mar/9/datasette-search-all/"&gt;discussed here previously&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;

&lt;h4&gt;sqlite-utils 2.11&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://sqlite-utils.readthedocs.io/en/stable/changelog.html#v2-11"&gt;sqlite-utils 2.11&lt;/a&gt; is the first release of &lt;code&gt;sqlite-utils&lt;/code&gt; that was entirely written by someone else! Thomas Sibley added a new &lt;code&gt;--truncate&lt;/code&gt; option for emptying a table (safely within a transaction) before populating it and made an improvement to how transactions work generally.&lt;/p&gt;

&lt;p&gt;Thomas inspired me to &lt;a href="https://github.com/simonw/sqlite-utils/issues/121"&gt;start thinking more carefully&lt;/a&gt; about how transactions should work with the library.&lt;/p&gt;
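&lt;p&gt;The behaviour of &lt;code&gt;--truncate&lt;/code&gt; is roughly this (a standard-library sketch of the idea, not the actual &lt;code&gt;sqlite-utils&lt;/code&gt; internals): empty the table and repopulate it inside a single transaction, so a failed import rolls back rather than leaving the table empty.&lt;/p&gt;

```python
import sqlite3

def replace_rows(db, table, rows):
    # Empty the table and repopulate it inside one transaction, so
    # readers never observe a half-empty table and a failed insert
    # rolls the whole operation back.
    with db:  # commits on success, rolls back on any exception
        db.execute(f"DELETE FROM {table}")
        db.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE counts (name TEXT, value INTEGER)")
replace_rows(db, "counts", [("a", 1), ("b", 2)])
replace_rows(db, "counts", [("c", 3)])  # old rows are gone
print(db.execute("SELECT * FROM counts").fetchall())  # [('c', 3)]
```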
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="covid19"/><category term="sqlite-utils"/></entry><entry><title>sba-loans-covid-19-datasette</title><link href="https://simonwillison.net/2020/Jul/7/sba-loans-covid-19-datasette/#atom-tag" rel="alternate"/><published>2020-07-07T02:42:40+00:00</published><updated>2020-07-07T02:42:40+00:00</updated><id>https://simonwillison.net/2020/Jul/7/sba-loans-covid-19-datasette/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/sba-loans-covid-19-datasette"&gt;sba-loans-covid-19-datasette&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The Treasury Department released a bunch of data on the Covid-19 SBA Paycheck Protection Program Loan recipients today—I’ve loaded the most interesting data (the $150,000+ loans) into a Datasette instance.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/simonw/status/1280283053726691329"&gt;@simonw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/data-journalism"&gt;data-journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="data-journalism"/><category term="projects"/><category term="datasette"/><category term="covid19"/></entry><entry><title>Quoting Tim O'Reilly</title><link href="https://simonwillison.net/2020/Jul/4/tim-oreilly/#atom-tag" rel="alternate"/><published>2020-07-04T16:06:41+00:00</published><updated>2020-07-04T16:06:41+00:00</updated><id>https://simonwillison.net/2020/Jul/4/tim-oreilly/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.oreilly.com/tim/21stcentury/"&gt;&lt;p&gt;The future will not be like the past. The comfortable Victorian and Georgian world complete with grand country houses, a globe-spanning British empire, and lords and commoners each knowing their place, was swept away by the events that began in the summer of 1914 (and that with Britain on the “winning” side of both world wars.) So too, our comfortable “American century” of conspicuous consumer consumption, global tourism, and ever-increasing stock and home prices may be gone forever.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.oreilly.com/tim/21stcentury/"&gt;Tim O&amp;#x27;Reilly&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/tim-oreilly"&gt;tim-oreilly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="tim-oreilly"/><category term="covid19"/></entry><entry><title>Weeknotes: Archiving coronavirus.data.gov.uk, custom pages and directory configuration in Datasette, photos-to-sqlite</title><link href="https://simonwillison.net/2020/Apr/29/weeknotes/#atom-tag" rel="alternate"/><published>2020-04-29T19:41:11+00:00</published><updated>2020-04-29T19:41:11+00:00</updated><id>https://simonwillison.net/2020/Apr/29/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;I mainly made progress on three projects this week: Datasette, photos-to-sqlite and a cleaner way of archiving data to a git repository.&lt;/p&gt;

&lt;h3&gt;Archiving coronavirus.data.gov.uk&lt;/h3&gt;

&lt;p&gt;The UK government have a new portal website sharing detailed Coronavirus data for regions around the country, at &lt;a href="https://coronavirus.data.gov.uk/"&gt;coronavirus.data.gov.uk&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As with everything else built in 2020, it's a big single-page JavaScript app. Matthew Somerville &lt;a href="http://dracos.co.uk/wrote/coronavirus-dashboard/"&gt;investigated&lt;/a&gt; what it would take to build a much lighter (and faster loading) site displaying the same information by moving much of the rendering to the server.&lt;/p&gt;

&lt;p&gt;One of the best things about the SPA craze is that it strongly encourages structured data to be published as JSON files. Matthew's article inspired me to take a look, and sure enough the government figures are available in an extremely comprehensive (and 3.3MB in size) JSON file, available from &lt;a href="https://c19downloads.azureedge.net/downloads/data/data_latest.json"&gt;https://c19downloads.azureedge.net/downloads/data/data_latest.json&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Any time I see a file like this my first questions are: how often does it change, and what kind of changes are being made to it?&lt;/p&gt;

&lt;p&gt;I've written about scraping to a git repository (see my new &lt;a href="https://simonwillison.net/tags/gitscraping/"&gt;gitscraping&lt;/a&gt; tag) a bunch in the past:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2017/Sep/10/scraping-irma/"&gt;Scraping hurricane Irma&lt;/a&gt; - September 2017&lt;/li&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2017/Oct/10/fires-in-the-north-bay/"&gt;Changelogs to help understand the fires in the North Bay&lt;/a&gt; - October 2017&lt;/li&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2019/Mar/13/tree-history/"&gt;Generating a commit log for San Francisco’s official list of trees&lt;/a&gt; - March 2019&lt;/li&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2019/Oct/10/pge-outages/"&gt;Tracking PG&amp;amp;E outages by scraping to a git repo&lt;/a&gt; - October 2019&lt;/li&gt;&lt;li&gt;&lt;a href="https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/"&gt;Deploying a data API using GitHub Actions and Cloud Run&lt;/a&gt; - January 2020&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;Now that I've figured out a really clean way to &lt;a href="https://github.com/simonw/til/blob/master/github-actions/commit-if-file-changed.md"&gt;Commit a file if it changed&lt;/a&gt; in a GitHub Action, knocking out new versions of this pattern is really quick.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/simonw/coronavirus-data-gov-archive"&gt;simonw/coronavirus-data-gov-archive&lt;/a&gt; is my new repo that does exactly that: it periodically fetches the latest versions of the JSON data files powering that site and commits them if they have changed. The aim is to build a &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/commits/master/data_latest.json"&gt;commit history&lt;/a&gt; of changes made to the underlying data.&lt;/p&gt;

&lt;p&gt;The first implementation was extremely simple - here's the &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/blob/c83d69e95ec6400bf77d7b0d474e868baa78841e/.github/workflows/scheduled.yml"&gt;entire action&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;name: Fetch latest data

on:
  push:
  repository_dispatch:
  schedule:
    - cron: '25 * * * *'

jobs:
  scheduled:
    runs-on: ubuntu-latest
    steps:
      - name: Check out this repo
        uses: actions/checkout@v2
      - name: Fetch latest data
        run: |-
          curl https://c19downloads.azureedge.net/downloads/data/data_latest.json | jq . &amp;gt; data_latest.json
          curl https://c19pub.azureedge.net/utlas.geojson | gunzip | jq . &amp;gt; utlas.geojson
          curl https://c19pub.azureedge.net/countries.geojson | gunzip | jq . &amp;gt; countries.geojson
          curl https://c19pub.azureedge.net/regions.geojson | gunzip | jq . &amp;gt; regions.geojson
      - name: Commit and push if it changed
        run: |-
          git config user.name "Automated"
          git config user.email "actions@users.noreply.github.com"
          git add -A
          timestamp=$(date -u)
          git commit -m "Latest data: ${timestamp}" || exit 0
          git push&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It uses a combination of &lt;code&gt;curl&lt;/code&gt; and &lt;code&gt;jq&lt;/code&gt; (both available &lt;a href="https://github.com/actions/virtual-environments/blob/master/images/linux/Ubuntu1804-README.md"&gt;in the default worker environment&lt;/a&gt;) to pull down the data and pretty-print it (better for readable diffs), then commits the result.&lt;/p&gt;
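&lt;p&gt;The pretty-printing trick matters more than it looks: one object per line means git diffs show exactly which values changed. The same &lt;code&gt;jq .&lt;/code&gt; effect in Python is a one-liner (the sample JSON here is invented):&lt;/p&gt;

```python
import json

# The same trick as "| jq ." above: pretty-print fetched JSON so that
# git diffs show one changed value per line instead of one giant line.
raw = '{"updated": "2020-04-29", "areas": [{"name": "London", "cases": 12}]}'
pretty = json.dumps(json.loads(raw), indent=2, sort_keys=True)
print(pretty)  # sort_keys also keeps key ordering stable between fetches
```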

&lt;p&gt;Matthew Somerville &lt;a href="https://twitter.com/dracos/status/1255221799085846532"&gt;pointed out&lt;/a&gt; that inefficient polling sets a bad precedent. Here I'm hitting &lt;code&gt;azureedge.net&lt;/code&gt;, the Azure CDN, so that didn't particularly worry me - but since I want this pattern to be used widely it's good to provide a best-practice example.&lt;/p&gt;

&lt;p&gt;Figuring out the best way to make &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Conditional_requests"&gt;conditional get requests&lt;/a&gt; in a GitHub Action led me down &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/issues/1"&gt;something of a rabbit hole&lt;/a&gt;. I wanted to use &lt;a href="https://daniel.haxx.se/blog/2019/12/06/curl-speaks-etag/"&gt;curl's new ETag support&lt;/a&gt; but I ran into &lt;a href="https://github.com/curl/curl/issues/5309"&gt;a curl bug&lt;/a&gt;, so I ended up rolling a simple Python CLI tool called &lt;a href="https://github.com/simonw/conditional-get"&gt;conditional-get&lt;/a&gt; to solve my problem. In the time it took me to release that tool (just a few hours) a &lt;a href="https://github.com/curl/curl/issues/5309#issuecomment-621265179"&gt;new curl release&lt;/a&gt; came out with a fix for that bug!&lt;/p&gt;
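&lt;p&gt;The core of any conditional-get implementation is tiny. Here's a sketch with the HTTP layer stubbed out - send the previously saved ETag as &lt;code&gt;If-None-Match&lt;/code&gt;, and only keep the body when the server replies with fresh content. The &lt;code&gt;fetch&lt;/code&gt; callable and &lt;code&gt;fake_server&lt;/code&gt; are stand-ins for illustration, not the tool's real API:&lt;/p&gt;

```python
def conditional_fetch(fetch, url, saved_etag):
    # Ask the server for the resource, but tell it which version we
    # already have via If-None-Match; a 304 means "unchanged".
    headers = {}
    if saved_etag:
        headers["If-None-Match"] = saved_etag
    status, etag, body = fetch(url, headers)
    if status == 304:
        return saved_etag, None  # unchanged: nothing to write or commit
    return etag, body

def fake_server(url, headers):
    # Pretends the resource has ETag "abc123" and has not changed.
    if headers.get("If-None-Match") == '"abc123"':
        return 304, '"abc123"', None
    return 200, '"abc123"', '{"data": 1}'

etag, body = conditional_fetch(fake_server, "https://example.com/data.json", None)
print(body)  # first fetch: the full body comes back
etag, body = conditional_fetch(fake_server, "https://example.com/data.json", etag)
print(body)  # second fetch: None - the 304 path, nothing downloaded
```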

&lt;p&gt;Here's &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/blob/a95d7661b236a9ee9a26a441dd948eb00308f919/.github/workflows/scheduled.yml"&gt;the workflow&lt;/a&gt; using my &lt;code&gt;conditional-get&lt;/code&gt; tool. See &lt;a href="https://github.com/simonw/coronavirus-data-gov-archive/issues/1"&gt;the issue thread&lt;/a&gt; for all of the other potential solutions, including a really neat &lt;a href="https://github.com/hubgit/curl-etag"&gt;Action shell-script solution&lt;/a&gt; by Alf Eaton.&lt;/p&gt;

&lt;p&gt;To my absolute delight, the project has already been forked once by Daniel Langer to &lt;a href="https://github.com/dlanger/coronavirus-hc-infobase-archive"&gt;capture Canadian Covid-19 cases&lt;/a&gt;!&lt;/p&gt;

&lt;h3 id="new-datasette-features"&gt;New Datasette features&lt;/h3&gt;

&lt;p&gt;I pushed two new features to &lt;a href="https://github.com/simonw/datasette"&gt;Datasette&lt;/a&gt; master, ready for release in 0.41.&lt;/p&gt;

&lt;h4&gt;Configuration directory mode&lt;/h4&gt;

&lt;p&gt;This is an idea I had while building &lt;a href="https://github.com/simonw/datasette-publish-now"&gt;datasette-publish-now&lt;/a&gt;. Datasette instances can be run with custom metadata, custom plugins and custom templates. I'm increasingly finding myself working on projects that run using something like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ datasette data1.db data2.db data3.db \
    --metadata=metadata.json \
    --template-dir=templates \
    --plugins-dir=plugins&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Directory configuration mode introduces the idea that Datasette can configure itself based on a directory layout. The above example can instead be handled by creating the following layout:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;my-project/data1.db
my-project/data2.db
my-project/data3.db
my-project/metadata.json
my-project/templates/index.html
my-project/plugins/custom_plugin.py&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then run Datasette directly targeting that directory:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ datasette my-project/&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;See &lt;a href="https://github.com/simonw/datasette/issues/731"&gt;issue #731&lt;/a&gt; for more details. Directory configuration mode &lt;a href="https://datasette.readthedocs.io/en/latest/config.html#configuration-directory-mode"&gt;is documented here&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Define custom pages using templates/pages&lt;/h4&gt;

&lt;p&gt;In &lt;a href="https://simonwillison.net/2019/Nov/25/niche-museums/"&gt;niche-museums.com, powered by Datasette&lt;/a&gt; I described how I built the &lt;a href="https://www.niche-museums.com/"&gt;www.niche-museums.com&lt;/a&gt; website as a heavily customized Datasette instance.&lt;/p&gt;

&lt;p&gt;That site has &lt;a href="https://www.niche-museums.com/about"&gt;/about&lt;/a&gt; and &lt;a href="https://www.niche-museums.com/map"&gt;/map&lt;/a&gt; pages which are served by custom templates - but I had to do some gnarly hacks with empty &lt;code&gt;about.db&lt;/code&gt; and &lt;code&gt;map.db&lt;/code&gt; files to get them to work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/simonw/datasette/issues/648"&gt;Issue #648&lt;/a&gt; introduces a new mechanism for creating this kind of page: create a &lt;code&gt;templates/pages/map.html&lt;/code&gt; template file and custom 404 handling code will ensure that any hits to &lt;code&gt;/map&lt;/code&gt; serve the rendered contents of that template.&lt;/p&gt;
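&lt;p&gt;The routing idea boils down to a path-to-template mapping: a request that would otherwise 404 falls through to checking for a matching file under &lt;code&gt;templates/pages/&lt;/code&gt;. The helper below is illustrative, not Datasette's actual implementation:&lt;/p&gt;

```python
import os.path

def custom_page_template(path, pages_dir="templates/pages"):
    # Map a URL path like /map to templates/pages/map.html; the bare
    # root path falls back to an index template.
    slug = path.strip("/")
    if not slug:
        slug = "index"
    return os.path.join(pages_dir, slug + ".html")

print(custom_page_template("/map"))    # templates/pages/map.html
print(custom_page_template("/about"))  # templates/pages/about.html
```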

&lt;p&gt;This could work really well with the &lt;a href="https://github.com/simonw/datasette-template-sql"&gt;datasette-template-sql&lt;/a&gt; plugin, which allows templates to execute arbitrary SQL queries (à la PHP or ColdFusion).&lt;/p&gt;

&lt;p&gt;Here's the new &lt;a href="https://datasette.readthedocs.io/en/latest/custom_templates.html#custom-pages"&gt;documentation on custom pages&lt;/a&gt;, including details of how to use the new &lt;code&gt;custom_status()&lt;/code&gt;, &lt;code&gt;custom_header()&lt;/code&gt; and &lt;code&gt;custom_redirect()&lt;/code&gt; template functions to go beyond just returning HTML.&lt;/p&gt;

&lt;h3&gt;photos-to-sqlite&lt;/h3&gt;

&lt;p&gt;My &lt;a href="https://dogsheep.github.io/"&gt;Dogsheep&lt;/a&gt; personal analytics project brings my &lt;a href="https://github.com/dogsheep/twitter-to-sqlite"&gt;tweets&lt;/a&gt;, &lt;a href="https://github.com/dogsheep/github-to-sqlite"&gt;GitHub activity&lt;/a&gt;, &lt;a href="https://github.com/dogsheep/swarm-to-sqlite"&gt;Swarm checkins&lt;/a&gt; and more together in one place. But the big missing feature is my photos.&lt;/p&gt;

&lt;p&gt;As of yesterday, I have 39,000 photos from Apple Photos uploaded to an S3 bucket using my new &lt;a href="https://github.com/dogsheep/photos-to-sqlite/"&gt;photos-to-sqlite&lt;/a&gt; tool. I can run the following SQL query and get back ten random photos!&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;select
  json_object(
    'img_src',
    'https://photos.simonwillison.net/i/' || 
    sha256 || '.' || ext || '?w=400'
  ),
  filepath,
  ext
from
  photos
where
  ext in ('jpeg', 'jpg', 'heic')
order by
  random()
limit
  10&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;photos.simonwillison.net&lt;/code&gt; is running a modified version of my &lt;a href="https://github.com/simonw/heic-to-jpeg"&gt;heic-to-jpeg&lt;/a&gt; image converting and resizing proxy, which I'll release at some point soon.&lt;/p&gt;

&lt;p&gt;There's still plenty of work to do - I still need to import EXIF data (including locations) into SQLite, and I plan to use &lt;a href="https://github.com/RhetTbull/osxphotos"&gt;osxphotos&lt;/a&gt; to export additional metadata from my Apple Photos library. But this week it went from a pure research project to something I can actually start using, which is exciting.&lt;/p&gt;

&lt;h3&gt;TIL this week&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/macos/fixing-compinit-insecure-directories.md"&gt;Fixing "compinit: insecure directories" error&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/tailscale/lock-down-sshd.md"&gt;Restricting SSH connections to devices within a Tailscale network&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/python/generate-nested-json-summary.md"&gt;Generated a summary of nested JSON data&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/pytest/session-scoped-tmp.md"&gt;Session-scoped temporary directories in pytest&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/pytest/mock-httpx.md"&gt;How to mock httpx using pytest-mock&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;Generated using &lt;a href="https://til.simonwillison.net/til?sql=select+json_object(%27pre%27%2C+group_concat(%27*+[%27+||+title+||+%27](%27+||+url+||+%27)%27%2C+%27%0D%0A%27))+from+til+where+%22created_utc%22+%3E%3D+%3Ap0+order+by+updated_utc+desc+limit+101&amp;amp;p0=2020-04-23"&gt;this query&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/matthew-somerville"&gt;matthew-somerville&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/photos"&gt;photos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="git"/><category term="http"/><category term="matthew-somerville"/><category term="photos"/><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="covid19"/><category term="git-scraping"/></entry><entry><title>Bill Gates’s vision for life beyond the coronavirus</title><link href="https://simonwillison.net/2020/Apr/28/bill-gates/#atom-tag" rel="alternate"/><published>2020-04-28T01:01:58+00:00</published><updated>2020-04-28T01:01:58+00:00</updated><id>https://simonwillison.net/2020/Apr/28/bill-gates/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.vox.com/coronavirus-covid19/2020/4/27/21236270/bill-gates-coronavirus-covid-19-plan-vaccines-conspiracies-podcast"&gt;Bill Gates’s vision for life beyond the coronavirus&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Fascinating interview with Bill Gates—the most interesting and informative article I’ve read about Covid-19 in quite a while.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/bill-gates"&gt;bill-gates&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="bill-gates"/><category term="covid19"/></entry><entry><title>Estimating COVID-19's Rt in Real-Time</title><link href="https://simonwillison.net/2020/Apr/20/estimating-covid-19s-rt-real-time/#atom-tag" rel="alternate"/><published>2020-04-20T15:06:53+00:00</published><updated>2020-04-20T15:06:53+00:00</updated><id>https://simonwillison.net/2020/Apr/20/estimating-covid-19s-rt-real-time/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/k-sys/covid-19/blob/master/Realtime%20R0.ipynb"&gt;Estimating COVID-19&amp;#x27;s Rt in Real-Time&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I’m not qualified to comment on the mathematical approach, but this is a really nice example of a Jupyter Notebook explanatory essay by Kevin Systrom.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/jupyter"&gt;jupyter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;



</summary><category term="jupyter"/><category term="covid19"/></entry><entry><title>Weeknotes: Covid-19, First Python Notebook, more Dogsheep, Tailscale</title><link href="https://simonwillison.net/2020/Apr/1/weeknotes/#atom-tag" rel="alternate"/><published>2020-04-01T20:29:59+00:00</published><updated>2020-04-01T20:29:59+00:00</updated><id>https://simonwillison.net/2020/Apr/1/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;My &lt;a href="https://covid-19.datasettes.com/"&gt;covid-19.datasettes.com&lt;/a&gt; project publishes information on COVID-19 cases around the world. The project started out using data &lt;a href="https://github.com/CSSEGISandData/COVID-19"&gt;from Johns Hopkins CSSE&lt;/a&gt;, but last week the New York Times &lt;a href="https://www.nytimes.com/article/coronavirus-county-data-us.html"&gt;started publishing&lt;/a&gt; high quality USA county- and state-level daily numbers to their &lt;a href="https://github.com/nytimes/covid-19-data"&gt;own repository&lt;/a&gt;. Here's &lt;a href="https://github.com/simonw/covid-19-datasette/commit/56e1644390e5d01ff67c61d6c165749093675632"&gt;the change&lt;/a&gt; that added the NY Times data.&lt;/p&gt;

&lt;p&gt;It's very easy to use this data to accidentally build misleading things. I've been &lt;a href="https://github.com/simonw/covid-19-datasette/blob/master/README.md"&gt;updating the README&lt;/a&gt; with links about this - my current favourite is &lt;a href="https://fivethirtyeight.com/features/why-its-so-freaking-hard-to-make-a-good-covid-19-model/"&gt;Why It’s So Freaking Hard To Make A Good COVID-19 Model&lt;/a&gt; by  Maggie Koerth, Laura Bronner and Jasmine Mithani at FiveThirtyEight.&lt;/p&gt;

&lt;h3 id="weeknotes-first-python-notebook"&gt;First Python Notebook&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://twitter.com/palewire"&gt;Ben Welsh&lt;/a&gt; from the LA Times teaches a course called &lt;a href="https://www.firstpythonnotebook.org/"&gt;First Python Notebook&lt;/a&gt; at journalism conferences such as NICAR. He ran a free online version of the course last weekend, and I offered to help out as a TA.&lt;/p&gt;

&lt;p&gt;Most of the help I provided came before the course: Ben asked attendees to confirm that they had working installations of Python 3 and pipenv, and if they didn't volunteers such as myself would step in to help. I had Zoom and email conversations with at least ten people to help them get their environments into shape.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://xkcd.com/1987/"&gt;This XKCD&lt;/a&gt; neatly summarizes the problem:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/python_environment_2x.png" alt="XKCD Python Environments" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;One of the most common problems I had to debug was PATH issues: people had installed the software, but due to various environmental differences &lt;code&gt;python3&lt;/code&gt; and &lt;code&gt;pipenv&lt;/code&gt; weren't available on the PATH. Talking people through the obscurities of creating a &lt;code&gt;~/.bashrc&lt;/code&gt; file and using it to define a PATH override really helps emphasize how arcane this kind of knowledge is.&lt;/p&gt;
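&lt;p&gt;For the record, the override itself is a single line in &lt;code&gt;~/.bashrc&lt;/code&gt; - the hard part is knowing which directory to add. The path below is just an example (it's where macOS puts &lt;code&gt;pip install --user&lt;/code&gt; scripts); it varies by operating system and Python install:&lt;/p&gt;

```shell
# Example ~/.bashrc PATH override: prepend the directory that
# contains the python3/pipenv executables (directory is an example,
# yours will differ depending on how Python was installed).
export PATH="$HOME/Library/Python/3.8/bin:$PATH"
```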

&lt;p&gt;I enjoyed this comment:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;"Welcome to intro to Tennis. In the first two weeks, we'll discuss how to rig a net and resurface a court." - &lt;a href="https://twitter.com/ClausWilke/status/1234941405883138048"&gt;Claus Wilke&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Ben's course itself is hands down the best introduction to Python from a Data Journalism perspective I have ever seen. Within an hour of starting the students are using Pandas in a Jupyter notebook to find interesting discrepancies in California campaign finance data.&lt;/p&gt;

&lt;p&gt;If you want to check it out yourself, the entire four hour workshop &lt;a href="https://twitter.com/palewire/status/1244410903279177728"&gt;is now on YouTube&lt;/a&gt; and closely follows the material on &lt;a href="https://www.firstpythonnotebook.org/"&gt;firstpythonnotebook.org&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id="weeknotes-coronavirus-diary"&gt;Coronavirus Diary&lt;/h3&gt;

&lt;p&gt;We are clearly living through a notable and very painful period of history right now. On the 19th of March (just under two weeks ago, but time is moving both really fast and incredibly slowly right now) I started a personal diary - something I've never done before. It lives in an Apple Note and I'm adding around a dozen paragraphs to it every day. I think it's helping. I'm sure it will be interesting to look back on in a few years' time.&lt;/p&gt;

&lt;h3 id="weeknotes-dogsheep"&gt;Dogsheep&lt;/h3&gt;

&lt;p&gt;Much of my development work this past week has gone into my &lt;a href="https://github.com/dogsheep"&gt;Dogsheep&lt;/a&gt; suite of tools for personal analytics.&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;I upgraded the entire family of tools for compatibility with &lt;a href="https://sqlite-utils.readthedocs.io/en/stable/changelog.html#v2"&gt;sqlite-utils 2.x&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/dogsheep/pocket-to-sqlite"&gt;pocket-to-sqlite&lt;/a&gt; got a major upgrade: it now fetches items using Pocket's API pagination (previously it just tried to pull in 5,000 items in one go) and has the ability to only fetch new items. As a result I'm now running it from cron in my personal Dogsheep instance, so "Save to Pocket" is now my preferred Dogsheep-compatible way of bookmarking content.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/dogsheep/twitter-to-sqlite"&gt;twitter-to-sqlite&lt;/a&gt; got a couple of important new features in &lt;a href="https://github.com/dogsheep/twitter-to-sqlite/releases/tag/0.20"&gt;release 0.20&lt;/a&gt;. I fixed &lt;a href="https://github.com/dogsheep/twitter-to-sqlite/issues/39"&gt;a nasty bug&lt;/a&gt; in the &lt;code&gt;--since&lt;/code&gt; flag where retweets from other accounts could cause new tweets from an account to be ignored. I also added a new &lt;code&gt;count_history&lt;/code&gt; table which automatically tracks changes to a Twitter user's friends, follower and listed counts over time (&lt;a href="https://github.com/dogsheep/twitter-to-sqlite/issues/40"&gt;#40&lt;/a&gt;).&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;I'm also now using Dogsheep for some journalism! I'm working with the &lt;a href="https://biglocalnews.org/"&gt;Big Local News&lt;/a&gt; team at Stanford to help track and archive tweets by a number of different US politicians and health departments relating to the ongoing pandemic. This collaboration resulted in the above improvements to &lt;code&gt;twitter-to-sqlite&lt;/code&gt;.&lt;/p&gt;

&lt;h3 id="weeknotes-tailscale"&gt;Tailscale&lt;/h3&gt;

&lt;p&gt;My personal Dogsheep is currently protected by &lt;a href="https://simonwillison.net/2019/Oct/5/client-side-certificate-authentication-nginx/"&gt;client certificates&lt;/a&gt;, so only my personal laptop and iPhone (with the right certificates installed) can connect to the web server it is running on.&lt;/p&gt;

&lt;p&gt;I spent a bit of time this week playing with &lt;a href="https://tailscale.com/"&gt;Tailscale&lt;/a&gt;, and I'm &lt;em&gt;really&lt;/em&gt; impressed by it.&lt;/p&gt;

&lt;p&gt;Tailscale is a commercial company built on top of &lt;a href="https://www.wireguard.com/"&gt;WireGuard&lt;/a&gt;, the new approach to VPN tunnels which just &lt;a href="https://arstechnica.com/gadgets/2020/03/wireguard-vpn-makes-it-to-1-0-0-and-into-the-next-linux-kernel/"&gt;got merged&lt;/a&gt; into the Linux 5.6 kernel. Tailscale first caught my attention in January when they &lt;a href="https://bradfitz.com/2020/01/30/joining-tailscale"&gt;hired Brad Fitzpatrick&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;WireGuard lets you form a private network by having individual hosts exchange public/private keys with each other. Tailscale provides software which manages those keys for you, making it trivial to set up a private network between different nodes.&lt;/p&gt;
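&lt;p&gt;For a sense of what Tailscale is automating, here's a hypothetical hand-rolled WireGuard config for one node - the keys, addresses and endpoint are placeholders. Tailscale generates and distributes the equivalent of this for every node in your network so you never have to touch it:&lt;/p&gt;

```ini
# /etc/wireguard/wg0.conf - one node in a manually managed network
[Interface]
PrivateKey = THIS_NODES_PRIVATE_KEY          ; generated with: wg genkey
Address = 10.0.0.1/24
ListenPort = 51820

[Peer]
PublicKey = OTHER_NODES_PUBLIC_KEY           ; exchanged out-of-band
AllowedIPs = 10.0.0.2/32
Endpoint = peer.example.com:51820
```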

&lt;p&gt;How trivial? It took me less than ten minutes to get a three-node private network running between my iPhone, laptop and a Linux server. I installed the &lt;a href="https://apps.apple.com/us/app/tailscale/id1470499037?ls=1"&gt;iPhone app&lt;/a&gt;, the &lt;a href="https://tailscale.com/kb/1037/install-ubuntu-1804"&gt;Ubuntu package&lt;/a&gt;, the &lt;a href="https://apps.apple.com/ca/app/tailscale/id1475387142?mt=12"&gt;OS X app&lt;/a&gt;, signed them all into my Google account and I was done.&lt;/p&gt;

&lt;p&gt;Each of those devices now has an additional IP address in the 100.x range which they can use to talk to each other. Tailscale guarantees that the IP address will stay constant for each of them.&lt;/p&gt;

&lt;p&gt;Since the network is public/private key encrypted between the nodes, Tailscale can't see any of my traffic - they're purely acting as a key management mechanism. And it's free: Tailscale charge for networks with multiple users, but a personal network like this is free of charge.&lt;/p&gt;

&lt;p&gt;I'm not running my own personal Dogsheep on it yet, but I'm tempted to switch over. I'd love other people to start running their own personal Dogsheep instances, but I'm paranoid about encouraging this when securing them is so important. Tailscale looks like it might be a great solution for making secure personal infrastructure easier and more widely available.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/brad-fitzpatrick"&gt;brad-fitzpatrick&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/data-journalism"&gt;data-journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/teaching"&gt;teaching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dogsheep"&gt;dogsheep&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tailscale"&gt;tailscale&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ben-welsh"&gt;ben-welsh&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="brad-fitzpatrick"/><category term="data-journalism"/><category term="projects"/><category term="python"/><category term="teaching"/><category term="datasette"/><category term="dogsheep"/><category term="weeknotes"/><category term="tailscale"/><category term="covid19"/><category term="ben-welsh"/></entry><entry><title>Weeknotes: COVID-19 numbers in Datasette</title><link href="https://simonwillison.net/2020/Mar/11/covid-19/#atom-tag" rel="alternate"/><published>2020-03-11T04:49:35+00:00</published><updated>2020-03-11T04:49:35+00:00</updated><id>https://simonwillison.net/2020/Mar/11/covid-19/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Coronavirus_disease_2019"&gt;COVID-19&lt;/a&gt;, the disease caused by the novel coronavirus, gets more terrifying every day. Johns Hopkins Center for Systems Science and Engineering (CSSE) have been &lt;a href="https://github.com/CSSEGISandData/COVID-19"&gt;collating data&lt;/a&gt; about the spread of the disease and publishing it as CSV files on GitHub.&lt;/p&gt;

&lt;p&gt;This morning I used the pattern described in &lt;a href="https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/"&gt;Deploying a data API using GitHub Actions and Cloud Run&lt;/a&gt; to set up a scheduled task that grabs their data once an hour and publishes it to &lt;a href="https://covid-19.datasettes.com/"&gt;https://covid-19.datasettes.com/&lt;/a&gt; as a table in Datasette.&lt;/p&gt;
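&lt;p&gt;The heart of that pattern is a scheduled GitHub Actions workflow. Here's a minimal sketch of the shape of it - the script name and service name here are placeholders, not the actual configuration from the covid-19-datasette repository:&lt;/p&gt;

```yaml
name: Fetch and deploy COVID-19 data
on:
  schedule:
    - cron: "0 * * * *"  # once an hour
jobs:
  build_and_deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Fetch the latest Johns Hopkins CSVs and build a SQLite database
        run: python build_database.py  # hypothetical build script
      - name: Deploy the database with Datasette
        run: datasette publish cloudrun covid.db --service covid-19
```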

&lt;p&gt;If you're not yet concerned about COVID-19 you clearly haven't been paying attention to what's been happening in Italy. Here's &lt;a href="https://covid-19.datasettes.com/covid/daily_reports?country_or_region=Italy&amp;amp;_sort_desc=confirmed#g.mark=bar&amp;amp;g.x_column=day&amp;amp;g.x_type=ordinal&amp;amp;g.y_column=confirmed&amp;amp;g.y_type=quantitative"&gt;a query&lt;/a&gt; which shows a graph of the number of confirmed cases in Italy over the past few weeks (using &lt;a href="https://github.com/simonw/datasette-vega"&gt;datasette-vega&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/covid-19-italy.png" alt="COVID-19 confirmed cases in Italy, spiking up to 10,149" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;155 cases 17 days ago to 10,149 cases today is really frightening. And the USA still doesn't have robust testing in place, so the numbers here are likely to really shock people once they start to become more apparent.&lt;/p&gt;

&lt;p&gt;If you're going to use the data in covid-19.datasettes.com for anything please be responsible with it and &lt;a href="https://github.com/simonw/covid-19-datasette/blob/master/README.md"&gt;read the warnings in the README file&lt;/a&gt; in detail: it's important to fully understand the sources of the data and how it is being processed before you use it to make any assertions about the spread of COVID-19.&lt;/p&gt;

&lt;p&gt;My favourite resource to understand Coronavirus and what we should be doing about it is &lt;a href="https://www.flattenthecurve.com/"&gt;flattenthecurve.com&lt;/a&gt;, compiled by &lt;a href="https://twitter.com/figgyjam"&gt;Julie McMurry&lt;/a&gt;, an assistant professor at Oregon State University College of Public Health. I strongly recommend checking it out.&lt;/p&gt;

&lt;h3&gt;Other projects&lt;/h3&gt;

&lt;p&gt;I've worked on a bunch of other projects this week, some of which were inspired by my time at &lt;a href="https://www.ire.org/events-and-training/conferences/nicar-2020"&gt;NICAR&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/fec-to-sqlite"&gt;fec-to-sqlite&lt;/a&gt; is a script for saving FEC campaign finance filings to a SQLite database. Since those filings are pulled in via HTTP and can get pretty big, it uses a neat trick to generate a progress bar with the &lt;a href="https://github.com/tqdm/tqdm"&gt;tqdm&lt;/a&gt; library - it &lt;a href="https://github.com/simonw/fec-to-sqlite/blob/d3ec100f4e9d5acbc5798d95b49e6e373c1ce778/fec_to_sqlite/cli.py#L26-L27"&gt;initiates a progress bar&lt;/a&gt; with &lt;a href="https://github.com/simonw/fec-to-sqlite/blob/d3ec100f4e9d5acbc5798d95b49e6e373c1ce778/fec_to_sqlite/utils.py#L89"&gt;the Content-Length&lt;/a&gt; of the incoming file, then as it iterates over the lines coming in over HTTP it uses the length of each line &lt;a href="https://github.com/simonw/fec-to-sqlite/blob/d3ec100f4e9d5acbc5798d95b49e6e373c1ce778/fec_to_sqlite/utils.py#L75-L78"&gt;to update that bar&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-search-all"&gt;datasette-search-all&lt;/a&gt; is a new plugin that enables search across multiple FTS-enabled SQLite tables at once. I wrote more about that in &lt;a href="https://simonwillison.net/2020/Mar/9/datasette-search-all/"&gt;this blog post&lt;/a&gt; on Monday.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-column-inspect"&gt;datasette-column-inspect&lt;/a&gt; is an extremely experimental plugin that tries out a "column inspector" tool for Datasette tables - click on a column heading and the plugin shows you interesting facts about that column, such as the min/mean/max/stdev, any outlying values, the most common values and the least common values. Screenshot below. This prototype came about as part of a JSK team project for the Designing Machine Learning course at Stanford - we were thinking about ways in which machine learning could help journalists find stories in large datasets. 
The prototype doesn't have any machine learning in it - just some simple statistics to identify outliers - but it's meant to illustrate how a tool that exposes machine learning insights against tabular data might work.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/dogsheep/github-to-sqlite"&gt;github-to-sqlite&lt;/a&gt; grew a new sub-command: &lt;code&gt;github-to-sqlite commits github.db simonw/datasette&lt;/code&gt; - which imports information about commits to a repository (just the author and commit message, not the body of the commit itself). I'm running a private version of this against all of my projects, which is really useful for seeing what I worked on over the past week when writing my weeknotes.&lt;/li&gt;&lt;/ul&gt;
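&lt;p&gt;That progress bar trick works just as well without tqdm. Here's a standard-library-only sketch of the idea: seed the total from the Content-Length, then advance by the byte length of each line as it streams in (the function and data here are illustrative, not fec-to-sqlite's actual code):&lt;/p&gt;

```python
import io

def stream_with_progress(stream, total_bytes):
    """Yield (line, fraction_complete) pairs from a byte stream.

    Mimics the fec-to-sqlite trick: the total comes from the
    Content-Length header, and progress advances by the byte
    length of each line as it arrives over HTTP."""
    seen = 0
    for line in stream:
        seen += len(line)
        yield line, seen / total_bytes

# Simulate an HTTP body with a known Content-Length
body = b"first line\nsecond line\nthird line\n"
for line, fraction in stream_with_progress(io.BytesIO(body), len(body)):
    print(f"{fraction:6.1%}  {line.decode().rstrip()}")
```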

&lt;p&gt;Here are two screenshots of &lt;code&gt;datasette-column-inspect&lt;/code&gt; in action. You can try out a live demo of the plugin &lt;a href="https://datasette-column-inspect-demo-j7hipcg4aq-uc.a.run.app/fivethirtyeight/antiquities-act%2Factions_under_antiquities_act"&gt;over here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/column-inspect-avengers.png" alt="Outliers in number of appearences in the Avengers: Iron Man, Captain America, Spider Man and Wolverine" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/column-inspect-antiquities.png" alt="Column summary for states in actions_under_antiquities_act" style="max-width: 100%" /&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coronavirus"&gt;coronavirus&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="coronavirus"/><category term="covid19"/></entry></feed>