<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: tom-macwright</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/tom-macwright.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-08-06T16:37:13+00:00</updated><author><name>Simon Willison</name></author><entry><title>Tom MacWright: Observable Notebooks 2.0</title><link href="https://simonwillison.net/2025/Aug/6/observable-notebooks-20/#atom-tag" rel="alternate"/><published>2025-08-06T16:37:13+00:00</published><updated>2025-08-06T16:37:13+00:00</updated><id>https://simonwillison.net/2025/Aug/6/observable-notebooks-20/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://macwright.com/2025/07/31/observable-notebooks-2"&gt;Tom MacWright: Observable Notebooks 2.0&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Observable announced &lt;a href="https://observablehq.com/notebook-kit/"&gt;Observable Notebooks 2.0&lt;/a&gt; last week - the latest take on their JavaScript notebook technology, this time with an &lt;a href="https://observablehq.com/notebook-kit/kit"&gt;open file format&lt;/a&gt; and a brand new &lt;a href="https://observablehq.com/notebook-kit/desktop"&gt;macOS desktop app&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Tom MacWright worked at Observable during their first iteration and here provides thoughtful commentary from an insider-to-outsider perspective on how their platform has evolved over time.&lt;/p&gt;
&lt;p&gt;I particularly appreciated this aside on the downsides of evolving your own not-quite-standard language syntax:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Notebook Kit and Desktop &lt;a href="https://observablehq.com/notebook-kit/#vanilla-java-script"&gt;support vanilla JavaScript&lt;/a&gt;, which is excellent and cool. The Observable changes to JavaScript were always tricky and meant that we struggled to use off-the-shelf parsers, and users couldn't use standard JavaScript tooling like eslint. This is stuff like the &lt;code&gt;viewof&lt;/code&gt; operator which meant that &lt;a href="https://observablehq.com/@observablehq/observable-javascript"&gt;Observable was not JavaScript&lt;/a&gt;. [...] &lt;em&gt;Sidenote&lt;/em&gt;: I now work on &lt;a href="https://www.val.town/"&gt;Val Town&lt;/a&gt;, which is also a platform based on writing JavaScript, and when I joined it &lt;em&gt;also&lt;/em&gt; had a tweaked version of JavaScript. We used the &lt;code&gt;@&lt;/code&gt; character to let you 'mention' other vals and implicitly import them. This was, like it was in Observable, not worth it and we switched to standard syntax: don't mess with language standards folks!&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/val-town"&gt;val-town&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="observable"/><category term="tom-macwright"/><category term="val-town"/></entry><entry><title>Directive prologues and JavaScript dark matter</title><link href="https://simonwillison.net/2025/Jun/2/directive-prologues-and-javascript-dark-matter/#atom-tag" rel="alternate"/><published>2025-06-02T18:30:31+00:00</published><updated>2025-06-02T18:30:31+00:00</updated><id>https://simonwillison.net/2025/Jun/2/directive-prologues-and-javascript-dark-matter/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://macwright.com/2025/04/29/directive-prologues-and-javascript-dark-matter"&gt;Directive prologues and JavaScript dark matter&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Tom MacWright does some archaeology and describes the three different magic comment formats that can affect how JavaScript/TypeScript files are processed:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;"a directive";&lt;/code&gt; is a &lt;a href="https://262.ecma-international.org/5.1/#sec-14.1"&gt;directive prologue&lt;/a&gt;, most commonly seen with &lt;code&gt;"use strict";&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;/** @aPragma */&lt;/code&gt; is a pragma for a transpiler, often used for &lt;code&gt;/** @jsx h */&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;//# aMagicComment&lt;/code&gt; is usually used for source maps - &lt;code&gt;//# sourceMappingURL=&amp;lt;url&amp;gt;&lt;/code&gt; - but also just got used by v8 for their new &lt;a href="https://v8.dev/blog/explicit-compile-hints"&gt;explicit compile hints&lt;/a&gt; feature.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://blog.jim-nielsen.com/2025/is-it-javascript/"&gt;Jim Nielsen&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/programming-languages"&gt;programming-languages&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/v8"&gt;v8&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/typescript"&gt;typescript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="programming-languages"/><category term="v8"/><category term="typescript"/><category term="tom-macwright"/></entry><entry><title>A warning about tiktoken, BPE, and OpenAI models</title><link href="https://simonwillison.net/2024/Nov/21/a-warning-about-tiktoken/#atom-tag" rel="alternate"/><published>2024-11-21T06:13:51+00:00</published><updated>2024-11-21T06:13:51+00:00</updated><id>https://simonwillison.net/2024/Nov/21/a-warning-about-tiktoken/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://macwright.com/2024/11/20/tokenization-bpe-warning.html"&gt;A warning about tiktoken, BPE, and OpenAI models&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Tom MacWright warns that OpenAI's &lt;a href="https://github.com/openai/tiktoken"&gt;tiktoken Python library&lt;/a&gt; has a surprising performance profile: it's superlinear with the length of input, meaning someone could potentially denial-of-service you by sending you a 100,000 character string if you're passing that directly to &lt;code&gt;tiktoken.encode()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There's an &lt;a href="https://github.com/openai/tiktoken/issues/195"&gt;open issue&lt;/a&gt; about this (now over a year old), so for safety today it's best to truncate on characters before attempting to count or truncate using &lt;code&gt;tiktoken&lt;/code&gt;.&lt;/p&gt;
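&lt;p&gt;A minimal sketch of that "truncate on characters first" advice. The &lt;code&gt;count_tokens_safely&lt;/code&gt; helper and the &lt;code&gt;MAX_CHARS&lt;/code&gt; cap are hypothetical names chosen for illustration, and the encoder is passed in as a plain callable so the example runs without tiktoken installed:&lt;/p&gt;

```python
from typing import Callable, Sequence

MAX_CHARS = 20_000  # hypothetical cap; pick a value that suits your inputs


def count_tokens_safely(
    text: str,
    encode: Callable[[str], Sequence],
    max_chars: int = MAX_CHARS,
) -> int:
    """Truncate on characters first, then hand the shorter string to the encoder."""
    truncated = text[:max_chars]  # cheap slice, bounded before any tokenizing
    return len(encode(truncated))


if __name__ == "__main__":
    # str.split stands in for a real tokenizer's encode method here.
    huge = "word " * 100_000
    print(count_tokens_safely(huge, str.split, max_chars=50))
```

&lt;p&gt;With tiktoken itself the callable would likely be something like &lt;code&gt;tiktoken.get_encoding("cl100k_base").encode&lt;/code&gt;. The slice keeps the encoder's input bounded regardless of what a client sends.&lt;/p&gt;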


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/denial-of-service"&gt;denial-of-service&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;&lt;/p&gt;



</summary><category term="denial-of-service"/><category term="python"/><category term="security"/><category term="tom-macwright"/><category term="openai"/></entry><entry><title>Quoting Tom MacWright</title><link href="https://simonwillison.net/2024/Nov/3/tom-macwright/#atom-tag" rel="alternate"/><published>2024-11-03T16:36:13+00:00</published><updated>2024-11-03T16:36:13+00:00</updated><id>https://simonwillison.net/2024/Nov/3/tom-macwright/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://macwright.com/2024/10/25/good-software-knip"&gt;&lt;p&gt;Building technology in startups is all about having the &lt;em&gt;right level&lt;/em&gt; of tech debt. If you have none, you’re probably going too slow and not prioritizing product-market fit and the important business stuff. If you get too much, everything grinds to a halt. Plus, tech debt is a “know it when you see it” kind of thing, and I know that my definition of “a bunch of tech debt” is, to other people, “very little tech debt.”&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://macwright.com/2024/10/25/good-software-knip"&gt;Tom MacWright&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/technical-debt"&gt;technical-debt&lt;/a&gt;&lt;/p&gt;



</summary><category term="tom-macwright"/><category term="technical-debt"/></entry><entry><title>Quoting Tom MacWright</title><link href="https://simonwillison.net/2024/Aug/12/tom-macwright/#atom-tag" rel="alternate"/><published>2024-08-12T20:17:08+00:00</published><updated>2024-08-12T20:17:08+00:00</updated><id>https://simonwillison.net/2024/Aug/12/tom-macwright/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://macwright.com/2024/07/18/llms-democratizing-coding"&gt;&lt;p&gt;But [LLM assisted programming] does make me wonder whether the adoption of these tools will lead to a form of &lt;a href="https://www.baldurbjarnason.com/2024/the-deskilling-of-web-dev-is-harming-us-all/"&gt;de-skilling&lt;/a&gt;. Not even that programmers &lt;em&gt;will be less skilled&lt;/em&gt;, but that the job will drift from the perception and dynamics of a skilled trade to an unskilled trade, with the attendant change - decrease - in pay. Instead of hiring a team of engineers who try to write something of quality and try to load the mental model of what they're building into their heads, companies will just hire a lot of prompt engineers and, who knows, generate 5 versions of the application and A/B test them all across their users.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://macwright.com/2024/07/18/llms-democratizing-coding"&gt;Tom MacWright&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="tom-macwright"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/></entry><entry><title>The first four Val Town runtimes</title><link href="https://simonwillison.net/2024/Feb/8/the-first-four-val-town-runtimes/#atom-tag" rel="alternate"/><published>2024-02-08T18:38:39+00:00</published><updated>2024-02-08T18:38:39+00:00</updated><id>https://simonwillison.net/2024/Feb/8/the-first-four-val-town-runtimes/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.val.town/blog/first-four-val-town-runtimes/"&gt;The first four Val Town runtimes&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Val Town solves one of my favourite technical problems: how to run untrusted code in a safe sandbox. They're on their fourth iteration of this now, currently using a Node.js application that launches Deno sub-processes using the &lt;a href="https://github.com/casual-simulation/node-deno-vm"&gt;node-deno-vm&lt;/a&gt; npm package and runs code in those, taking advantage of the Deno sandboxing mechanism and terminating processes that take too long in order to protect against &lt;code&gt;while(true)&lt;/code&gt; style attacks.
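&lt;p&gt;The "terminate processes that take too long" idea can be sketched in Python as a rough analogue. This is not Val Town's code (theirs is Node.js launching Deno), and a timeout on its own is not a sandbox; &lt;code&gt;run_with_timeout&lt;/code&gt; is a hypothetical helper that only illustrates the time limit:&lt;/p&gt;

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_seconds: float) -> str:
    """Run code in a child Python process; give up if it exceeds the timeout."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return "terminated"
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_with_timeout("print(1 + 1)", 10))
    print(run_with_timeout("while True: pass", 0.5))
```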

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/tmcw/status/1755616125474504960"&gt;@tmcw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nodejs"&gt;nodejs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/deno"&gt;deno&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/val-town"&gt;val-town&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="nodejs"/><category term="sandboxing"/><category term="deno"/><category term="tom-macwright"/><category term="val-town"/></entry><entry><title>Playing with ActivityPub</title><link href="https://simonwillison.net/2022/Dec/10/playing-with-activitypub/#atom-tag" rel="alternate"/><published>2022-12-10T00:58:42+00:00</published><updated>2022-12-10T00:58:42+00:00</updated><id>https://simonwillison.net/2022/Dec/10/playing-with-activitypub/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://macwright.com/2022/12/09/activitypub.html"&gt;Playing with ActivityPub&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Tom MacWright describes his attempts to build the simplest possible ActivityPub publication—for a static site powered by Jekyll, where he used Netlify functions to handle incoming subscriptions (storing them in PlanetScale via their Deno API library) and wrote a script which loops through and notifies all of his subscriptions every time he publishes something new.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/xvvjza/playing_with_activitypub"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/deno"&gt;deno&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mastodon"&gt;mastodon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/activitypub"&gt;activitypub&lt;/a&gt;&lt;/p&gt;



</summary><category term="deno"/><category term="tom-macwright"/><category term="mastodon"/><category term="activitypub"/></entry><entry><title>Quoting Tom MacWright</title><link href="https://simonwillison.net/2022/Mar/4/tom-macwright/#atom-tag" rel="alternate"/><published>2022-03-04T16:11:08+00:00</published><updated>2022-03-04T16:11:08+00:00</updated><id>https://simonwillison.net/2022/Mar/4/tom-macwright/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://macwright.com/2022/03/04/browsers-and-files.html"&gt;&lt;p&gt;Working with the web platform is dealing with history, with the accumulated matter of quirksmode and good-enough standards. In exchange for the ability to deliver instantly-updating software directly to customers with no middlemen and no installation, you have to absorb a great deal of nearly-useless information that’s entirely about dodging meaningless traps.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://macwright.com/2022/03/04/browsers-and-files.html"&gt;Tom MacWright&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/web"&gt;web&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;&lt;/p&gt;



</summary><category term="web"/><category term="tom-macwright"/></entry><entry><title>lon lat lon lat lon</title><link href="https://simonwillison.net/2022/Feb/10/lonlat/#atom-tag" rel="alternate"/><published>2022-02-10T16:32:49+00:00</published><updated>2022-02-10T16:32:49+00:00</updated><id>https://simonwillison.net/2022/Feb/10/lonlat/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://macwright.com/lonlat/"&gt;lon lat lon lat lon&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Tom MacWright’s definitive guide to the (latitude, longitude) vs. (longitude, latitude) debate. The answer is frustrating: both orders are used by significant software, so there’s no single answer that will satisfy everyone. I’ve recently been mostly won over to the longitude, latitude side, mainly because that’s a better fit for the non-geospatial x, y pattern.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;&lt;/p&gt;



</summary><category term="geospatial"/><category term="tom-macwright"/></entry><entry><title>GitHub Burndown</title><link href="https://simonwillison.net/2022/Feb/10/github-burndown/#atom-tag" rel="alternate"/><published>2022-02-10T16:29:04+00:00</published><updated>2022-02-10T16:29:04+00:00</updated><id>https://simonwillison.net/2022/Feb/10/github-burndown/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@tmcw/github-burndown"&gt;GitHub Burndown&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Neat Observable notebook by Tom MacWright—give it a GitHub access token and the name of a repo and it pulls the details of every issue and plots a burndown chart over time, showing how long issues stay open for. The code is worth spending some time with—the way it fetches data from the paginated JSON API is a really great example of using generators with Observable, and the chart itself is a lovely clear example of Observable Plot.
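&lt;p&gt;The notebook itself is JavaScript, but the generator-over-a-paginated-API pattern can be sketched in Python too. Everything here is hypothetical: &lt;code&gt;iter_all_items&lt;/code&gt; and the &lt;code&gt;fetch_page&lt;/code&gt; callable are stand-ins, and no real GitHub API call is made:&lt;/p&gt;

```python
from typing import Callable, Iterator


def iter_all_items(
    fetch_page: Callable[[int], list],
    first_page: int = 1,
) -> Iterator:
    """Yield items one page at a time until a page comes back empty."""
    page = first_page
    while True:
        items = fetch_page(page)
        if not items:
            return
        yield from items
        page += 1


if __name__ == "__main__":
    # A dictionary of fake pages stands in for the paginated JSON API.
    fake_pages = {1: ["issue-1", "issue-2"], 2: ["issue-3"]}
    print(list(iter_all_items(lambda page: fake_pages.get(page, []))))
```

&lt;p&gt;The caller just loops over the generator and never has to know where one page ends and the next begins.&lt;/p&gt;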

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/simonw/status/1491136777301929985"&gt;@tmcw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable-plot"&gt;observable-plot&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="observable"/><category term="tom-macwright"/><category term="observable-plot"/></entry><entry><title>Serving map tiles from SQLite with MBTiles and datasette-tiles</title><link href="https://simonwillison.net/2021/Feb/4/datasette-tiles/#atom-tag" rel="alternate"/><published>2021-02-04T01:09:30+00:00</published><updated>2021-02-04T01:09:30+00:00</updated><id>https://simonwillison.net/2021/Feb/4/datasette-tiles/#atom-tag</id><summary type="html">
    &lt;p&gt;Working on &lt;a href="https://simonwillison.net/2021/Jan/31/weeknotes/"&gt;datasette-leaflet&lt;/a&gt; last week re-kindled my interest in using Datasette as a GIS (Geographic Information System) platform. SQLite already has strong GIS functionality in the form of &lt;a href="https://docs.datasette.io/en/stable/spatialite.html"&gt;SpatiaLite&lt;/a&gt; and &lt;a href="https://datasette.io/plugins/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt; is currently the &lt;a href="https://datasette.io/plugins?sort=downloads-this-week"&gt;most downloaded&lt;/a&gt; plugin. Most importantly, maps are fun!&lt;/p&gt;
&lt;h4&gt;MBTiles&lt;/h4&gt;
&lt;p&gt;I was talking to &lt;a href="https://macwright.com/"&gt;Tom MacWright&lt;/a&gt; on Monday and I mentioned that I'd been thinking about how SQLite might make a good mechanism for distributing tile images for use with libraries like Leaflet. "I might be able to save you some time there" he said... and he showed me &lt;a href="https://github.com/mapbox/mbtiles-spec"&gt;MBTiles&lt;/a&gt;, a specification he started developing ten years ago at Mapbox which does exactly that - bundles tile images up in SQLite databases.&lt;/p&gt;
&lt;p&gt;(My best guess is I read about MBTiles a while ago, then managed to forget about the spec entirely while the idea of using SQLite for tile distribution wedged itself in my head somewhere.)&lt;/p&gt;
&lt;h4&gt;The new datasette-tiles plugin&lt;/h4&gt;
&lt;p&gt;I found some example MBTiles files on the internet and started playing around with them. My first prototype used the &lt;a href="https://datasette.io/plugins/datasette-media"&gt;datasette-media&lt;/a&gt; plugin, described here previously in &lt;a href="https://simonwillison.net/2020/Jul/30/fun-binary-data-and-sqlite/"&gt;Fun with binary data and SQLite&lt;/a&gt;. I used some convoluted SQL to teach it that hits to &lt;code&gt;/-/media/tiles/{z},{x},{y}&lt;/code&gt; should serve up content from the &lt;code&gt;tiles&lt;/code&gt; table in my MBTiles database - you can see details of that prototype in &lt;a href="https://til.simonwillison.net/datasette/serving-mbtiles"&gt;this TIL: Serving MBTiles with datasette-media&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The obvious next step was to write a dedicated plugin: &lt;a href="https://datasette.io/plugins/datasette-tiles"&gt;datasette-tiles&lt;/a&gt;. Install it and run Datasette against any MBTiles database file and the plugin will set up a &lt;code&gt;/-/tiles/db-name/z/x/y.png&lt;/code&gt; endpoint that serves the specified tiles.&lt;/p&gt;
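&lt;p&gt;To give a feel for what such an endpoint has to do, here is a rough sketch of looking up a tile in an MBTiles file with Python's &lt;code&gt;sqlite3&lt;/code&gt; module. It is not the plugin's actual code: &lt;code&gt;get_tile&lt;/code&gt; is a hypothetical helper, and it assumes the incoming request uses XYZ-style coordinates while the file follows the TMS row order described in the MBTiles specification:&lt;/p&gt;

```python
import sqlite3
from typing import Optional


def get_tile(conn: sqlite3.Connection, z: int, x: int, y: int) -> Optional[bytes]:
    """Return tile bytes for an XYZ-style z/x/y request, or None if absent."""
    # The MBTiles spec stores rows in the TMS scheme, so flip y before querying.
    tms_row = (2 ** z) - 1 - y
    row = conn.execute(
        "select tile_data from tiles "
        "where zoom_level = ? and tile_column = ? and tile_row = ?",
        (z, x, tms_row),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # Build a tiny in-memory stand-in for an MBTiles database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tiles (zoom_level integer, tile_column integer, "
        "tile_row integer, tile_data blob)"
    )
    conn.execute("insert into tiles values (1, 0, 1, ?)", (b"fake-png-bytes",))
    print(get_tile(conn, 1, 0, 0))
```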
&lt;p&gt;It also adds a tile explorer view with a pre-configured Leaflet map. Here's &lt;a href="https://datasette-tiles-demo.datasette.io/-/tiles/japan-toner"&gt;a live demo&lt;/a&gt; serving up a subset of Stamen's toner map - just zoom levels 6 and 7 for the country of Japan.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://datasette-tiles-demo.datasette.io/-/tiles/japan-toner"&gt;&lt;img alt="The tile explorer showing a toner map for Japan" src="https://static.simonwillison.net/static/2021/datasette-tiles-japan-toner-demo.png" style="max-width:100%;" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here's how to run this on your own computer:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Install Datasette
brew install datasette
# Install the plugin
datasette install datasette-tiles
# Download the japan-toner.db database
curl -O https://datasette-tiles-demo.datasette.io/japan-toner.db
# Launch Datasette and open a browser
datasette japan-toner.db -o
# Use the cog menu to access the tile explorer
# Or visit http://127.0.0.1:8001/-/tiles/japan-toner
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Creating MBTiles files with my download-tiles tool&lt;/h4&gt;
&lt;p&gt;A sticking point when I started playing with MBTiles was finding example files to work with.&lt;/p&gt;
&lt;p&gt;After some digging, I came across the amazing &lt;a href="https://export.hotosm.org/en/v3/"&gt;HOT Export Tool&lt;/a&gt;. It's a project by the &lt;a href="https://www.hotosm.org/"&gt;Humanitarian OpenStreetMap Team&lt;/a&gt; that allows anyone to export subsets of data from OpenStreetMap in a wide variety of formats, including MBTiles.&lt;/p&gt;
&lt;p&gt;I filed &lt;a href="https://github.com/hotosm/osm-export-tool/issues/371"&gt;a minor bug report&lt;/a&gt; against it, and in doing so took a look at the source code (it's all open source)... and found &lt;a href="https://github.com/hotosm/osm-export-tool-python/blob/8e4165a454303abbea2bd18cf5ffcdd5b9d0370d/osm_export_tool/nontabular.py#L103-L108"&gt;the code that assembles MBTiles files&lt;/a&gt;. It uses another open source library called &lt;a href="https://github.com/makinacorpus/landez"&gt;Landez&lt;/a&gt;, which provides functions for downloading tiles from existing providers and bundling those up as an MBTiles SQLite file.&lt;/p&gt;
&lt;p&gt;I prefer command-line tools for this kind of thing over using Python libraries directly, so I fired up my &lt;a href="https://github.com/simonw/click-app"&gt;click-app cookiecutter template&lt;/a&gt; and built a thin command-line interface over the top of the library.&lt;/p&gt;
&lt;p&gt;The new tool is called &lt;a href="https://datasette.io/tools/download-tiles"&gt;download-tiles&lt;/a&gt; and it does exactly that: downloads tiles from a tile server and creates an MBTiles SQLite database on disk containing those tiles.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Please use this tool responsibly&lt;/strong&gt;. Downloading large numbers of tiles is bad manners. Be sure to familiarize yourself with the &lt;a href="https://operations.osmfoundation.org/policies/tiles/"&gt;OpenStreetMap Tile Usage Policy&lt;/a&gt;, and use the tool politely when pointing it at other tile servers.&lt;/p&gt;
&lt;p&gt;Basic usage is as follows:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;download-tiles world.mbtiles
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By default the tool pulls tiles from OpenStreetMap. The above command will fetch zoom levels 0-3 of the entire world - 85 tiles total, well within acceptable usage limits.&lt;/p&gt;
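&lt;p&gt;That 85 figure follows from the tile pyramid: zoom level z is a 2&lt;sup&gt;z&lt;/sup&gt; by 2&lt;sup&gt;z&lt;/sup&gt; grid. A quick way to check the arithmetic (&lt;code&gt;tiles_in_zoom_range&lt;/code&gt; is a name made up for this example):&lt;/p&gt;

```python
def tiles_in_zoom_range(min_zoom: int, max_zoom: int) -> int:
    """Count the tiles covering the whole world for an inclusive zoom range."""
    # Each zoom level z is a 2^z by 2^z grid, so it holds 4^z tiles.
    return sum(4 ** z for z in range(min_zoom, max_zoom + 1))


if __name__ == "__main__":
    print(tiles_in_zoom_range(0, 3))  # 1 + 4 + 16 + 64
```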
&lt;p&gt;Various options (described in &lt;a href="https://datasette.io/tools/download-tiles"&gt;the README&lt;/a&gt;) can be used to customize the tiles that are downloaded. Here's how I created the &lt;a href="https://datasette-tiles-demo.datasette.io/japan-toner"&gt;japan-toner.db&lt;/a&gt; demo database, linked to above:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;download-tiles japan-toner.mbtiles \
    --zoom-levels 6-7 \
    --country Japan \
    --tiles-url "http://{s}.tile.stamen.com/toner/{z}/{x}/{y}.png" \
    --tiles-subdomains "a,b,c,d" \
    --attribution 'Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under CC BY SA.'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;--country Japan&lt;/code&gt; option here looks up the bounding box for Japan &lt;a href="https://nominatim.openstreetmap.org/ui/search.html?country=japan"&gt;using Nominatim&lt;/a&gt;. &lt;code&gt;--zoom-levels 6-7&lt;/code&gt; fetches zoom levels 6 and 7 (in this case that makes for 193 tiles total). &lt;code&gt;--tiles-url&lt;/code&gt; and &lt;code&gt;--tiles-subdomains&lt;/code&gt; configure the tile server to fetch them from. The &lt;code&gt;--attribution&lt;/code&gt; option bakes that string into the &lt;a href="https://datasette-tiles-demo.datasette.io/japan-toner/metadata"&gt;metadata table&lt;/a&gt; for the database - which is then used to display it correctly in the tile explorer (and eventually in other Datasette plugins).&lt;/p&gt;
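&lt;p&gt;For a sense of how a bounding box turns into a tile count, here is a sketch using the standard Web Mercator "slippy map" tile formula. The function names are hypothetical, the bounding box in the demo is a made-up example rather than Nominatim's box for Japan, and Landez may well compute its tile list differently:&lt;/p&gt;

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple:
    """Convert a longitude/latitude pair to slippy-map tile x, y at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def count_tiles(west: float, south: float, east: float, north: float, zoom: int) -> int:
    """Count the tiles that intersect a bounding box at one zoom level."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return (x_max - x_min + 1) * (y_max - y_min + 1)


if __name__ == "__main__":
    # A made-up box straddling the equator and the prime meridian.
    print(count_tiles(-10.0, -10.0, 10.0, 10.0, 1))
```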
&lt;h4&gt;datasette-basemap&lt;/h4&gt;
&lt;p&gt;Out of the box, Datasette's current Leaflet plugins (&lt;a href="https://datasette.io/plugins/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt;, &lt;a href="https://datasette.io/plugins/datasette-leaflet-geojson"&gt;datasette-leaflet-geojson&lt;/a&gt; and so on) serve tiles directly from the OpenStreetMap tile server.&lt;/p&gt;
&lt;p&gt;I've never felt particularly comfortable about this. Users can configure the plugins to run against other tile servers, but pointing to OpenStreetMap as a default was the easiest way to ensure these plugins would work for people who just wanted to try them out.&lt;/p&gt;
&lt;p&gt;Now that I have the tooling for bundling map subsets, maybe I can do better.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://datasette.io/plugins/datasette-basemap"&gt;datasette-basemap&lt;/a&gt; offers an alternative: it's a plugin that bundles a 22.7MB SQLite file containing zoom levels 0-6 of OpenStreetMap - &lt;a href="https://datasette-tiles-demo.datasette.io/basemap/tiles?_facet=zoom_level"&gt;5,461 tiles total&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Running &lt;code&gt;pip install datasette-basemap&lt;/code&gt; (or &lt;code&gt;datasette install datasette-basemap&lt;/code&gt;) will install the plugin, complete with that database - and register it with Datasette.&lt;/p&gt;
&lt;p&gt;Start Datasette with the plugin installed and &lt;code&gt;/basemap&lt;/code&gt; will expose &lt;a href="https://datasette-tiles-demo.datasette.io/basemap"&gt;the bundled database&lt;/a&gt;. Install &lt;code&gt;datasette-tiles&lt;/code&gt; and you'll be able to browse it as a tile server: &lt;a href="https://datasette-tiles-demo.datasette.io/-/tiles/basemap"&gt;here's a demo&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;(I recommend also installing &lt;a href="https://datasette.io/plugins/datasette-render-images"&gt;datasette-render-images&lt;/a&gt; so you can see the tile images themselves in the regular table view, &lt;a href="https://datasette-tiles-demo.datasette.io/basemap/tiles"&gt;like this&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;Zoom level 6 is close enough that major cities and the roads between them are visible, for all of the countries in the world. Not bad for 22.7MB!&lt;/p&gt;
&lt;p&gt;This is the first time I've built a Datasette plugin that bundles a full SQLite database as part of the Python package. The pattern seems to work well - I'm excited to explore it further with other projects.&lt;/p&gt;
&lt;h4&gt;Bonus feature: tile stacks&lt;/h4&gt;
&lt;p&gt;I added one last feature to &lt;code&gt;datasette-tiles&lt;/code&gt; before writing everything up for my blog. I'm calling this feature &lt;strong&gt;tile stacks&lt;/strong&gt; - it lets you serve tiles from multiple MBTiles files, falling back to other files if a tile is missing.&lt;/p&gt;
&lt;p&gt;Imagine you had a low-zoom-level world map (similar to &lt;code&gt;datasette-basemap&lt;/code&gt;) and a number of other databases providing packages of tiles for specific countries or cities. You could run Datasette like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;datasette basemap.mbtiles japan.mbtiles london.mbtiles tokyo.mbtiles
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Hitting &lt;code&gt;/-/tiles-stack/1/1/1.png&lt;/code&gt; would seek out the specified tile in the &lt;code&gt;tokyo.mbtiles&lt;/code&gt; file, then fall back to &lt;code&gt;london.mbtiles&lt;/code&gt; and then &lt;code&gt;japan.mbtiles&lt;/code&gt; and finally &lt;code&gt;basemap.mbtiles&lt;/code&gt; if it couldn't find it.&lt;/p&gt;
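&lt;p&gt;The fallback order described above could be sketched like this. It is an illustration rather than the plugin's implementation: &lt;code&gt;get_tile_from_stack&lt;/code&gt; is a hypothetical helper, and it takes the stored &lt;code&gt;tile_row&lt;/code&gt; directly instead of converting from a URL's y value:&lt;/p&gt;

```python
import sqlite3
from typing import Optional, Sequence

TILE_SQL = (
    "select tile_data from tiles "
    "where zoom_level = ? and tile_column = ? and tile_row = ?"
)


def get_tile_from_stack(
    connections: Sequence[sqlite3.Connection], z: int, x: int, row: int
) -> Optional[bytes]:
    """Try the last database first, then fall back toward the first one."""
    for conn in reversed(connections):
        found = conn.execute(TILE_SQL, (z, x, row)).fetchone()
        if found:
            return found[0]
    return None


def make_db(tiles: dict) -> sqlite3.Connection:
    """Build an in-memory stand-in for an MBTiles file (demo helper only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tiles (zoom_level integer, tile_column integer, "
        "tile_row integer, tile_data blob)"
    )
    for (z, x, row), data in tiles.items():
        conn.execute("insert into tiles values (?, ?, ?, ?)", (z, x, row, data))
    return conn


if __name__ == "__main__":
    basemap = make_db({(1, 1, 1): b"basemap", (0, 0, 0): b"world"})
    tokyo = make_db({(1, 1, 1): b"tokyo"})
    # Same order as the command line: the basemap first, the most specific file last.
    print(get_tile_from_stack([basemap, tokyo], 1, 1, 1))
    print(get_tile_from_stack([basemap, tokyo], 0, 0, 0))
```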
&lt;p&gt;For a demo, visit &lt;a href="https://datasette-tiles-demo.datasette.io/-/tiles-stack"&gt;https://datasette-tiles-demo.datasette.io/-/tiles-stack&lt;/a&gt; and zoom in on Japan. It should start to display the Stamen toner map once you get to zoom levels 6 and 7.&lt;/p&gt;
&lt;h4&gt;Next steps&lt;/h4&gt;
&lt;p&gt;I've been having a lot of fun exploring MBTiles - it's such a natural fit for Datasette, and it's exciting to be able to build new things on top of nearly a decade of innovation by other geo-hackers.&lt;/p&gt;
&lt;p&gt;There are plenty of features missing from &lt;code&gt;datasette-tiles&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It currently only handles &lt;code&gt;.png&lt;/code&gt; image data, but the &lt;a href="https://github.com/mapbox/mbtiles-spec/blob/master/1.3/spec.md"&gt;MBTiles 1.3 specification&lt;/a&gt; also defines &lt;code&gt;.jpg&lt;/code&gt; and &lt;code&gt;.webp&lt;/code&gt; tiles, plus vector tiles using Mapbox's &lt;code&gt;.pbf&lt;/code&gt; gzip-compressed protocol buffers.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/mapbox/utfgrid-spec"&gt;UTFGrid&lt;/a&gt; is a related specification for including "rasterized interaction data" in MBTiles databases - it helps efficiently provide maps &lt;a href="https://blog.mapbox.com/how-interactivity-works-with-utfgrid-3b7d437f9ca9"&gt;with millions of embedded objects&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As a newcomer to the MBTiles world I'd love to hear suggestions for new features and feedback on how I can improve what I've got so far in the &lt;a href="https://github.com/simonw/datasette-tiles/issues"&gt;datasette-tiles issues&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Being able to serve your own map tiles like this feels very much in the spirit of the OpenStreetMap project. I'm looking forward to using my own tile subsets for any future projects that fit within a sensible tile subset.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mapping"&gt;mapping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/leaflet"&gt;leaflet&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="geospatial"/><category term="mapping"/><category term="projects"/><category term="sqlite"/><category term="datasette"/><category term="tom-macwright"/><category term="leaflet"/></entry><entry><title>Quoting Tom MacWright</title><link href="https://simonwillison.net/2020/May/11/tom-macwright/#atom-tag" rel="alternate"/><published>2020-05-11T00:03:42+00:00</published><updated>2020-05-11T00:03:42+00:00</updated><id>https://simonwillison.net/2020/May/11/tom-macwright/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://macwright.org/2020/05/10/spa-fatigue.html"&gt;&lt;p&gt;And for what? Again - there is a swath of use cases which would be hard without React and which aren’t complicated enough to push beyond React’s limits. But there are also a lot of problems for which I can’t see any concrete benefit to using React. Those are things like blogs, shopping-cart-websites, mostly-CRUD-and-forms-websites. For these things, all of the fancy optimizations are optimizations to get you closer to the performance you would’ve gotten if you just hadn’t used so much technology.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://macwright.org/2020/05/10/spa-fatigue.html"&gt;Tom MacWright&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/react"&gt;react&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;&lt;/p&gt;



</summary><category term="react"/><category term="tom-macwright"/></entry><entry><title>Things I learned about shapefiles building shapefile-to-sqlite</title><link href="https://simonwillison.net/2020/Feb/19/shapefile-to-sqlite/#atom-tag" rel="alternate"/><published>2020-02-19T05:25:58+00:00</published><updated>2020-02-19T05:25:58+00:00</updated><id>https://simonwillison.net/2020/Feb/19/shapefile-to-sqlite/#atom-tag</id><summary type="html">
    &lt;p&gt;The latest in my series of &lt;a href="https://datasette.readthedocs.io/en/latest/ecosystem.html#tools-for-creating-sqlite-databases"&gt;x-to-sqlite tools&lt;/a&gt; is &lt;a href="https://github.com/simonw/shapefile-to-sqlite"&gt;shapefile-to-sqlite&lt;/a&gt;. I learned a whole bunch of things about the ESRI shapefile format while building it.&lt;/p&gt;
&lt;p&gt;Governments really love ESRI shapefiles. There is a huge amount of interesting geospatial data made available in the format - &lt;a href="https://catalog.data.gov/dataset?res_format=SHP"&gt;4,614 on Data.gov&lt;/a&gt;!&lt;/p&gt;
&lt;h3 id="shapefile-to-sqlite"&gt;shapefile-to-sqlite&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;shapefile-to-sqlite&lt;/code&gt; loads the data from these files into a SQLite database, turning geometry properties into database columns and the geometry itself into a blob of GeoJSON. Let&amp;#39;s try it out on a shapefile containing the &lt;a href="https://catalog.data.gov/dataset/national-parks"&gt;boundaries of US national parks&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ wget http:&lt;span class="hljs-comment"&gt;//nrdata.nps.gov/programs/lands/nps_boundary.zip&lt;/span&gt;
...
Saving to: ‘nps_boundary.zip’
nps_boundary.zip                           &lt;span class="hljs-number"&gt;100&lt;/span&gt;%[=====================================================================================&amp;gt;]  &lt;span class="hljs-number"&gt;12.61&lt;/span&gt;M   &lt;span class="hljs-number"&gt;705&lt;/span&gt;KB/s    &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; &lt;span class="hljs-number"&gt;22&lt;/span&gt;s     
&lt;span class="hljs-number"&gt;2020&lt;/span&gt;&lt;span class="hljs-number"&gt;-02&lt;/span&gt;&lt;span class="hljs-number"&gt;-18&lt;/span&gt; &lt;span class="hljs-number"&gt;19&lt;/span&gt;:&lt;span class="hljs-number"&gt;59&lt;/span&gt;:&lt;span class="hljs-number"&gt;22&lt;/span&gt; (&lt;span class="hljs-number"&gt;597&lt;/span&gt; KB/s) - ‘nps_boundary.zip’ saved [&lt;span class="hljs-number"&gt;13227561&lt;/span&gt;/&lt;span class="hljs-number"&gt;13227561&lt;/span&gt;]

$ unzip nps_boundary.zip 
Archive:  nps_boundary.zip
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.xml  
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.dbf  
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.prj  
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp  
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shx

$ shapefile-to-sqlite nps.db temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp
temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp
[####################################]  &lt;span class="hljs-number"&gt;100&lt;/span&gt;%

$ datasette nps.db
Serve! files=(&lt;span class="hljs-string"&gt;'nps.db'&lt;/span&gt;,) (immutables=()) on port &lt;span class="hljs-number"&gt;8003&lt;/span&gt;
INFO:     Started server process [&lt;span class="hljs-number"&gt;33534&lt;/span&gt;]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http:&lt;span class="hljs-comment"&gt;//127.0.0.1:8001 (Press CTRL+C to quit)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I recommend installing the &lt;a href="https://github.com/simonw/datasette-leaflet-geojson"&gt;datasette-leaflet-geojson&lt;/a&gt; plugin, which will turn any column containing GeoJSON into a Leaflet map.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/nps-boundaries.jpg" alt="Screenshot of National Parks in Datasette" style="max-width: 100%" /&gt;&lt;/p&gt;
&lt;p&gt;If you&amp;#39;ve installed SpatiaLite (&lt;a href="https://datasette.readthedocs.io/en/latest/spatialite.html#installation"&gt;installation instructions here&lt;/a&gt;) you can use the &lt;code&gt;--spatialite&lt;/code&gt; option to instead store the geometry in a SpatiaLite column, unlocking &lt;a href="http://www.gaia-gis.it/gaia-sins/spatialite-sql-latest.html"&gt;a bewildering array&lt;/a&gt; of SQL geometry functions.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ shapefile-to-sqlite nps.db temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp --spatialite --table=nps-spatialite
temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp
[##################################--]   &lt;span class="hljs-number"&gt;94&lt;/span&gt;%  &lt;span class="hljs-number"&gt;00&lt;/span&gt;:&lt;span class="hljs-number"&gt;00&lt;/span&gt;:&lt;span class="hljs-number"&gt;00&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I deployed a copy of the resulting database using Cloud Run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ datasette publish cloudrun nps.db \
    -&lt;span class="ruby"&gt;-service national-parks \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-title &lt;span class="hljs-string"&gt;"National Parks"&lt;/span&gt; \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-source_url=&lt;span class="hljs-string"&gt;"https://catalog.data.gov/dataset/national-parks"&lt;/span&gt; \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-source=&lt;span class="hljs-string"&gt;"data.gov"&lt;/span&gt; \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-spatialite \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-install=datasette-leaflet-geojson \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-install=datasette-render-binary \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-extra-options=&lt;span class="hljs-string"&gt;"--config max_returned_rows:5"&lt;/span&gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I used &lt;code&gt;max_returned_rows:5&lt;/code&gt; there because these geometries are pretty big - without it a page with 100 rows on it can return over 90MB of HTML!&lt;/p&gt;
&lt;p&gt;You can browse the GeoJSON version of the table &lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps/nps_boundary"&gt;here&lt;/a&gt; and the SpatiaLite version &lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps/nps-spatialite"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The SpatiaLite version defaults to rendering each geometry as an ugly binary blob. You can convert them to GeoJSON for compatibility with &lt;code&gt;datasette-leaflet-geojson&lt;/code&gt; using the SpatiaLite &lt;code&gt;AsGeoJSON()&lt;/code&gt; function:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;select&lt;/span&gt; &lt;span class="hljs-keyword"&gt;id&lt;/span&gt;, UNIT_NAME, AsGeoJSON(geometry)
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; [nps-spatialite]
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Here&amp;#39;s &lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps?sql=select+id%2C+UNIT_NAME%2C+AsGeoJSON%28geometry%29+from+%5Bnps-spatialite%5D"&gt;the result&lt;/a&gt; of that query running against the demo.&lt;/p&gt;
&lt;h3 id="understanding-shapefiles"&gt;Understanding shapefiles&lt;/h3&gt;
&lt;p&gt;The most confusing thing about shapefiles is that they aren&amp;#39;t a single file. A shapefile comes as a minimum of three files: &lt;code&gt;foo.shp&lt;/code&gt; containing geometries, &lt;code&gt;foo.shx&lt;/code&gt; containing an index into those geometries (really more of an implementation detail) and &lt;code&gt;foo.dbf&lt;/code&gt; containing key/value properties for each geometry.&lt;/p&gt;
&lt;p&gt;They often come bundled with other files too. &lt;code&gt;foo.prj&lt;/code&gt; is a WKT projection for the data for example. Wikipedia lists &lt;a href="https://en.wikipedia.org/wiki/Shapefile#Overview"&gt;a whole bunch&lt;/a&gt; of other possibilities.&lt;/p&gt;
&lt;p&gt;As a result, shapefiles are usually distributed as a zip file. Some shapefile libraries can even read directly from a zip.&lt;/p&gt;
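&lt;p&gt;Before handing a zip to one of those libraries it can be useful to check which component files it actually contains. Here is a minimal sketch using only the Python standard library. The &lt;code&gt;shapefiles_in_zip&lt;/code&gt; and &lt;code&gt;missing_parts&lt;/code&gt; helpers are hypothetical names, and the demo archive is built in memory with empty placeholder files.&lt;/p&gt;

```python
import io
import zipfile
from pathlib import PurePosixPath

# The three files every shapefile needs
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf"}


def shapefiles_in_zip(zip_file):
    """Map each shapefile base path in a zip to the set of extensions present."""
    found = {}
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            if not path.suffix:
                continue  # directory entries and extension-less files
            base = str(path.with_suffix(""))
            found.setdefault(base, set()).add(path.suffix.lower())
    # Keep only the groups that include a .shp file
    return {base: exts for base, exts in found.items() if ".shp" in exts}


def missing_parts(extensions):
    """Return the required extensions that are absent, sorted."""
    return sorted(REQUIRED_EXTENSIONS - extensions)


# Build a small in-memory archive with placeholder (empty) files
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    for ext in (".shp", ".dbf", ".prj"):
        archive.writestr("data/parks" + ext, b"")
    archive.writestr("README.txt", b"")

groups = shapefiles_in_zip(demo)
for base, extensions in groups.items():
    print(base, sorted(extensions), "missing:", missing_parts(extensions))
```

&lt;p&gt;In this demo the &lt;code&gt;.shx&lt;/code&gt; index is reported as missing, since it was deliberately left out of the archive.&lt;/p&gt;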
&lt;p&gt;The &lt;a href="https://tools.ietf.org/html/rfc7946"&gt;GeoJSON format&lt;/a&gt; was designed as a modern alternative to shapefiles, so understanding GeoJSON really helps in understanding shapefiles. In particular the GeoJSON geometry types: Point, LineString, MultiLineString, Polygon and MultiPolygon match how shapefile geometries work.&lt;/p&gt;
&lt;p&gt;An important detail in shapefiles is that data in the &lt;code&gt;.shp&lt;/code&gt; and &lt;code&gt;.dbf&lt;/code&gt; files is matched by array index - so the first geometry can be considered as having ID=0, the second ID=1 and so on.&lt;/p&gt;
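&lt;p&gt;That positional matching can be sketched in a few lines of Python. This is an illustration rather than the actual shapefile-to-sqlite code: &lt;code&gt;to_features&lt;/code&gt; is a hypothetical helper, and the two lists are made-up stand-ins for geometries parsed from &lt;code&gt;foo.shp&lt;/code&gt; and records parsed from &lt;code&gt;foo.dbf&lt;/code&gt;.&lt;/p&gt;

```python
import json


def to_features(geometries, records):
    """Pair geometries with property records by position, as GeoJSON Features."""
    if len(geometries) != len(records):
        raise ValueError("geometry and record counts differ")
    for index, (geometry, properties) in enumerate(zip(geometries, records)):
        yield {
            "type": "Feature",
            "id": index,  # the first geometry gets ID=0, the second ID=1
            "geometry": geometry,
            "properties": dict(properties),
        }


# Made-up placeholder data, not taken from a real shapefile
geometries = [
    {"type": "Point", "coordinates": [0.0, 0.0]},
    {"type": "Point", "coordinates": [1.0, 1.0]},
]
records = [{"name": "Park A"}, {"name": "Park B"}]

features = list(to_features(geometries, records))
print(json.dumps(features[1]))
```

&lt;p&gt;Raising an error when the two counts differ is a design choice made for this sketch: a mismatch would mean the positional IDs can no longer be trusted.&lt;/p&gt;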
&lt;p&gt;You can read the properties from the &lt;code&gt;.dbf&lt;/code&gt; file using the &lt;a href="https://dbfread.readthedocs.io/en/latest/"&gt;dbfread&lt;/a&gt; Python module like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ ipython
&lt;span class="hljs-keyword"&gt;In&lt;/span&gt; [&lt;span class="hljs-number"&gt;1&lt;/span&gt;]: import dbfread
&lt;span class="hljs-keyword"&gt;In&lt;/span&gt; [&lt;span class="hljs-number"&gt;2&lt;/span&gt;]: db = dbfread.DBF(&lt;span class="hljs-string"&gt;"temp/Current_Shapes/Data_Store/06-06-12_Posting/nps_boundary.dbf"&lt;/span&gt;)
&lt;span class="hljs-keyword"&gt;In&lt;/span&gt; [&lt;span class="hljs-number"&gt;3&lt;/span&gt;]: next(iter(db))
&lt;span class="hljs-keyword"&gt;Out&lt;/span&gt;[&lt;span class="hljs-number"&gt;3&lt;/span&gt;]: 
OrderedDict([(&lt;span class="hljs-string"&gt;'UNIT_TYPE'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'Park'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'STATE'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'REGION'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'NC'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'UNIT_CODE'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'NACC'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'UNIT_NAME'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'West Potomac Park'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'DATE_EDIT'&lt;/span&gt;, &lt;span class="hljs-keyword"&gt;None&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'GIS_NOTES'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'CREATED_BY'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'Legacy'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'METADATA'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'PARKNAME'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;)])
&lt;/code&gt;&lt;/pre&gt;&lt;h3 id="reading-shapefiles-in-python"&gt;Reading shapefiles in Python&lt;/h3&gt;
&lt;p&gt;I&amp;#39;m a big fan of the &lt;a href="https://shapely.readthedocs.io/"&gt;Shapely&lt;/a&gt; Python library, so I was delighted to see that Sean Gillies, creator of Shapely, also created a library for reading and writing shapefiles: &lt;a href="https://fiona.readthedocs.io/"&gt;Fiona&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://macwright.org/2012/10/31/gis-with-python-shapely-fiona.html"&gt;GIS with Python, Shapely, and Fiona&lt;/a&gt; by Tom MacWright was particularly useful for figuring this out. I like how he wrote that post in 2012 but added a note in 2017 that it&amp;#39;s still his recommended way of getting started with GIS in Python.&lt;/p&gt;
&lt;h3 id="projections"&gt;Projections&lt;/h3&gt;
&lt;p&gt;The trickiest part of working with any GIS data is always figuring out how to deal with &lt;a href="https://xkcd.com/977/"&gt;projections&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;GeoJSON &lt;a href="https://tools.ietf.org/html/rfc7946#section-4"&gt;attempts to standardize&lt;/a&gt; on WGS 84, otherwise known as the latitude/longitude model used by GPS. But... shapefiles frequently use something else. The &lt;a href="https://www.sccgov.org/sites/parks/Parks-Maps/Maps-Data/Pages/home.aspx"&gt;Santa Clara county parks&lt;/a&gt; shapefiles for example use &lt;a href="https://epsg.io/2227"&gt;EPSG:2227&lt;/a&gt;, also known as California zone 3.&lt;/p&gt;
&lt;p&gt;(Fun fact: EPSG stands for European Petroleum Survey Group, a now defunct oil industry group that today lives on only as a database of projected coordinate systems.)&lt;/p&gt;
&lt;p&gt;I spent &lt;a href="https://github.com/simonw/shapefile-to-sqlite/issues/6"&gt;quite a while&lt;/a&gt; thinking about how to best handle projections. In the end I decided that I&amp;#39;d follow GeoJSON&amp;#39;s lead and attempt to convert everything to WGS 84, but allow users to skip that behaviour using &lt;code&gt;--crs=keep&lt;/code&gt; or to specify an alternative projection to convert to with &lt;code&gt;--crs=epsg:2227&lt;/code&gt; or similar.&lt;/p&gt;
&lt;p&gt;SpatiaLite creates its geometry columns with a baked in SRID (a code which usually maps to the EPSG identifier). You can see which SRID was used for a specific geometry using the &lt;code&gt;srid()&lt;/code&gt; function:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps?sql=select+srid%28geometry%29+from+%22nps-spatialite%22+limit+1"&gt;select srid(geometry) from &amp;quot;nps-spatialite&amp;quot; limit 1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;SpatiaLite can also convert to another projection using the &lt;code&gt;Transform()&lt;/code&gt; function:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps?sql=select+%27%3A%27+%7C%7C+AsGeoJSON%28Transform%28geometry%2C+2227%29%29+from+%22nps-spatialite%22+limit+1"&gt;select &amp;#39;:&amp;#39; || AsGeoJSON(Transform(geometry, 2227)) from &amp;quot;nps-spatialite&amp;quot; limit 1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;(I&amp;#39;m using &lt;code&gt;&amp;#39;:&amp;#39; || AsGeoJSON(...)&lt;/code&gt; here to disable the &lt;code&gt;datasette-leaflet-geojson&lt;/code&gt; plugin, since it can&amp;#39;t correctly render data that has been transformed to a non-WGS-84 projection.)&lt;/p&gt;
&lt;h3 id="pulling-it-all-together"&gt;Pulling it all together&lt;/h3&gt;
&lt;p&gt;I now have two tools for importing geospatial data into SQLite (or SpatiaLite) databases: &lt;a href="https://github.com/simonw/shapefile-to-sqlite"&gt;shapefile-to-sqlite&lt;/a&gt; and &lt;a href="https://github.com/simonw/geojson-to-sqlite"&gt;geojson-to-sqlite&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I&amp;#39;m excited about Datasette&amp;#39;s potential as a tool for GIS. I started exploring this back in 2017 when I used it to &lt;a href="https://simonwillison.net/2017/Dec/12/location-time-zone-api/"&gt;build a location to timezone API&lt;/a&gt; - but adding easy shapefile imports to the toolchain should unlock all kinds of interesting new geospatial projects.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shapefiles"&gt;shapefiles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/spatialite"&gt;spatialite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geojson"&gt;geojson&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/leaflet"&gt;leaflet&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="geospatial"/><category term="projects"/><category term="shapefiles"/><category term="spatialite"/><category term="sqlite"/><category term="geojson"/><category term="weeknotes"/><category term="tom-macwright"/><category term="leaflet"/></entry><entry><title>togeojson</title><link href="https://simonwillison.net/2019/Jan/18/togeojson/#atom-tag" rel="alternate"/><published>2019-01-18T23:50:00+00:00</published><updated>2019-01-18T23:50:00+00:00</updated><id>https://simonwillison.net/2019/Jan/18/togeojson/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/tmcw/togeojson"&gt;togeojson&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Handy JavaScript library and command-line tool for converting KML and GPX to GeoJSON, by Tom MacWright.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/tmcw/status/1086407002773803008"&gt;@tmcw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kml"&gt;kml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geojson"&gt;geojson&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;&lt;/p&gt;



</summary><category term="geospatial"/><category term="kml"/><category term="geojson"/><category term="tom-macwright"/></entry><entry><title>Observable Beta</title><link href="https://simonwillison.net/2018/Jan/31/observable/#atom-tag" rel="alternate"/><published>2018-01-31T16:46:38+00:00</published><updated>2018-01-31T16:46:38+00:00</updated><id>https://simonwillison.net/2018/Jan/31/observable/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://beta.observablehq.com/"&gt;Observable Beta&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Observable just released their beta, and it’s quite something. It’s by Mike Bostock (d3), Jeremy Ashkenas (Backbone, CoffeeScript) and Tom MacWright (Mapbox Studio). The easiest way to describe it is Jupyter notebooks for JavaScript supporting reactive programming—so code is evaluated as you type and you can add interactive widgets (like sliders and canvas views) to construct explorable visualizations on the fly.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=16274686"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jeremy-ashkenas"&gt;jeremy-ashkenas&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/d3"&gt;d3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jupyter"&gt;jupyter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mike-bostock"&gt;mike-bostock&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="jeremy-ashkenas"/><category term="d3"/><category term="jupyter"/><category term="observable"/><category term="mike-bostock"/><category term="tom-macwright"/></entry></feed>