<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: yaml</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/yaml.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-06-21T17:15:21+00:00</updated><author><name>Simon Willison</name></author><entry><title>model.yaml</title><link href="https://simonwillison.net/2025/Jun/21/model-yaml/#atom-tag" rel="alternate"/><published>2025-06-21T17:15:21+00:00</published><updated>2025-06-21T17:15:21+00:00</updated><id>https://simonwillison.net/2025/Jun/21/model-yaml/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://modelyaml.org/"&gt;model.yaml&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;From their &lt;a href="https://github.com/modelyaml/modelyaml"&gt;GitHub repo&lt;/a&gt; it looks like this effort quietly launched a couple of months ago, driven by the &lt;a href="https://lmstudio.ai/"&gt;LM Studio&lt;/a&gt; team. Their goal is to specify an "open standard for defining crossplatform, composable AI models".&lt;/p&gt;
&lt;p&gt;A model can be defined using a YAML file that &lt;a href="https://lmstudio.ai/models/mistralai/mistral-small-3.2"&gt;looks like this&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-ent"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;mistralai/mistral-small-3.2&lt;/span&gt;
&lt;span class="pl-ent"&gt;base&lt;/span&gt;:
  - &lt;span class="pl-ent"&gt;key&lt;/span&gt;: &lt;span class="pl-s"&gt;lmstudio-community/mistral-small-3.2-24b-instruct-2506-gguf&lt;/span&gt;
    &lt;span class="pl-ent"&gt;sources&lt;/span&gt;:
      - &lt;span class="pl-ent"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;huggingface&lt;/span&gt;
        &lt;span class="pl-ent"&gt;user&lt;/span&gt;: &lt;span class="pl-s"&gt;lmstudio-community&lt;/span&gt;
        &lt;span class="pl-ent"&gt;repo&lt;/span&gt;: &lt;span class="pl-s"&gt;Mistral-Small-3.2-24B-Instruct-2506-GGUF&lt;/span&gt;
&lt;span class="pl-ent"&gt;metadataOverrides&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;domain&lt;/span&gt;: &lt;span class="pl-s"&gt;llm&lt;/span&gt;
  &lt;span class="pl-ent"&gt;architectures&lt;/span&gt;:
    - &lt;span class="pl-s"&gt;mistral&lt;/span&gt;
  &lt;span class="pl-ent"&gt;compatibilityTypes&lt;/span&gt;:
    - &lt;span class="pl-s"&gt;gguf&lt;/span&gt;
  &lt;span class="pl-ent"&gt;paramsStrings&lt;/span&gt;:
    - &lt;span class="pl-c1"&gt;24B&lt;/span&gt;
  &lt;span class="pl-ent"&gt;minMemoryUsageBytes&lt;/span&gt;: &lt;span class="pl-c1"&gt;14300000000&lt;/span&gt;
  &lt;span class="pl-ent"&gt;contextLengths&lt;/span&gt;:
    - &lt;span class="pl-c1"&gt;4096&lt;/span&gt;
  &lt;span class="pl-ent"&gt;vision&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This should be enough information for an LLM serving engine - such as LM Studio - to understand where to get the model weights (here that's &lt;a href="https://huggingface.co/lmstudio-community/Mistral-Small-3.2-24B-Instruct-2506-GGUF"&gt;lmstudio-community/Mistral-Small-3.2-24B-Instruct-2506-GGUF&lt;/a&gt; on Hugging Face, but it leaves space for alternative providers) plus various other configuration options and important metadata about the capabilities of the model.&lt;/p&gt;
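&lt;p&gt;To make that concrete, here's a sketch of how a serving engine might resolve the download source from that manifest, assuming the YAML has already been parsed into a Python dict (for example with PyYAML). The &lt;code&gt;huggingface_repo_url()&lt;/code&gt; helper is my own illustration, not part of the spec:&lt;/p&gt;

```python
# Parsed form of the model.yaml shown above (e.g. yaml.safe_load(...)).
manifest = {
    "model": "mistralai/mistral-small-3.2",
    "base": [
        {
            "key": "lmstudio-community/mistral-small-3.2-24b-instruct-2506-gguf",
            "sources": [
                {
                    "type": "huggingface",
                    "user": "lmstudio-community",
                    "repo": "Mistral-Small-3.2-24B-Instruct-2506-GGUF",
                }
            ],
        }
    ],
}


def huggingface_repo_url(manifest):
    """Return the URL of the first Hugging Face source, if any.

    Helper name and URL construction are my own - the spec just lists
    sources; engines decide how to resolve them.
    """
    for base in manifest.get("base", []):
        for source in base.get("sources", []):
            if source.get("type") == "huggingface":
                return f"https://huggingface.co/{source['user']}/{source['repo']}"
    return None


print(huggingface_repo_url(manifest))
# https://huggingface.co/lmstudio-community/Mistral-Small-3.2-24B-Instruct-2506-GGUF
```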
&lt;p&gt;I like this concept a lot. I've actually been considering something similar for my LLM tool - my idea was to use Markdown with a YAML frontmatter block - but now that there's an early-stage standard for it I may well build on top of this work instead.&lt;/p&gt;
&lt;p&gt;I couldn't find any evidence that anyone outside of LM Studio is using this yet, so it's effectively a one-vendor standard for the moment. All of the models in their &lt;a href="https://lmstudio.ai/models"&gt;Model Catalog&lt;/a&gt; are defined using model.yaml.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/standards"&gt;standards&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lm-studio"&gt;lm-studio&lt;/a&gt;&lt;/p&gt;



</summary><category term="standards"/><category term="yaml"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="lm-studio"/></entry><entry><title>openai/openai-openapi</title><link href="https://simonwillison.net/2024/Dec/22/openai-openapi/#atom-tag" rel="alternate"/><published>2024-12-22T22:59:25+00:00</published><updated>2024-12-22T22:59:25+00:00</updated><id>https://simonwillison.net/2024/Dec/22/openai-openapi/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/openai/openai-openapi"&gt;openai/openai-openapi&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Seeing as the LLM world has semi-standardized on imitating OpenAI's API format for a whole host of different tools, it's useful to note that OpenAI themselves maintain a dedicated repository with an &lt;a href="https://www.openapis.org/"&gt;OpenAPI&lt;/a&gt; YAML representation of their current API.&lt;/p&gt;
&lt;p&gt;(I get OpenAI and OpenAPI typo-confused all the time, so &lt;code&gt;openai-openapi&lt;/code&gt; is a delightfully fiddly repository name.)&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/openai/openai-openapi/blob/master/openapi.yaml"&gt;openapi.yaml&lt;/a&gt; file itself is over 26,000 lines long, defining 76 API endpoints ("paths" in OpenAPI terminology) and 284 "schemas" for JSON that can be sent to and from those endpoints. A much more interesting view onto it is the &lt;a href="https://github.com/openai/openai-openapi/commits/master/openapi.yaml"&gt;commit history&lt;/a&gt; for that file, showing details of when each different API feature was released.&lt;/p&gt;
&lt;p&gt;Browsing 26,000 lines of YAML isn't pleasant, so I &lt;a href="https://gist.github.com/simonw/54b4e533481cc7a686b0172c3a9ac21e"&gt;got Claude&lt;/a&gt; to build me a rudimentary YAML expand/hide exploration tool. Here's that tool running against the OpenAI schema, loaded directly from GitHub via a CORS-enabled &lt;code&gt;fetch()&lt;/code&gt; call: &lt;a href="https://tools.simonwillison.net/yaml-explorer#eyJ1cmwiOiJodHRwczovL3Jhdy5naXRodWJ1c2VyY29udGVudC5jb20vb3BlbmFpL29wZW5haS1vcGVuYXBpL3JlZnMvaGVhZHMvbWFzdGVyL29wZW5hcGkueWFtbCIsIm9wZW4iOlsiZDAiLCJkMjAiXX0="&gt;https://tools.simonwillison.net/yaml-explorer#eyJ1c...&lt;/a&gt; - the fragment after the &lt;code&gt;#&lt;/code&gt; is base64-encoded JSON capturing the current state of the tool (mostly Claude's idea).&lt;/p&gt;
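&lt;p&gt;That fragment state can be recovered with a couple of lines of Python:&lt;/p&gt;

```python
import base64
import json

# The fragment after the "#" in the yaml-explorer URL above.
fragment = "eyJ1cmwiOiJodHRwczovL3Jhdy5naXRodWJ1c2VyY29udGVudC5jb20vb3BlbmFpL29wZW5haS1vcGVuYXBpL3JlZnMvaGVhZHMvbWFzdGVyL29wZW5hcGkueWFtbCIsIm9wZW4iOlsiZDAiLCJkMjAiXX0="

# Decodes to a dict with the YAML file URL plus the IDs of the expanded nodes.
state = json.loads(base64.b64decode(fragment))
print(state["open"])  # ['d0', 'd20']
```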
&lt;p&gt;&lt;img alt="Screenshot of the YAML explorer, showing a partially expanded set of sections from the OpenAI API specification." src="https://static.simonwillison.net/static/2024/yaml-explorer.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;The tool is a little buggy - the expand-all option doesn't work quite how I want - but it's useful enough for the moment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: It turns out the &lt;a href="https://petstore.swagger.io/"&gt;petstore.swagger.io&lt;/a&gt; demo has an (as far as I can tell) undocumented &lt;code&gt;?url=&lt;/code&gt; parameter which can load external YAML files, so &lt;a href="https://petstore.swagger.io/?url=https://raw.githubusercontent.com/openai/openai-openapi/refs/heads/master/openapi.yaml"&gt;here's openai-openapi/openapi.yaml&lt;/a&gt; in an OpenAPI explorer interface.&lt;/p&gt;
&lt;p&gt;&lt;img alt="The Swagger API browser showing the OpenAI API" src="https://static.simonwillison.net/static/2024/swagger.jpg" /&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-3-5-sonnet"&gt;claude-3-5-sonnet&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="tools"/><category term="yaml"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude-3-5-sonnet"/></entry><entry><title>Weeknotes: airtable-export, generating screenshots in GitHub Actions, Dogsheep!</title><link href="https://simonwillison.net/2020/Sep/3/weeknotes-airtable-screenshots-dogsheep/#atom-tag" rel="alternate"/><published>2020-09-03T23:28:29+00:00</published><updated>2020-09-03T23:28:29+00:00</updated><id>https://simonwillison.net/2020/Sep/3/weeknotes-airtable-screenshots-dogsheep/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I figured out how to populate Datasette from Airtable, wrote code to generate social media preview card page screenshots using Puppeteer, and made a big breakthrough with my Dogsheep project.&lt;/p&gt;
&lt;h4 id="weeknotes-2020-09-03-airtable-export"&gt;airtable-export&lt;/h4&gt;
&lt;p&gt;I wrote about &lt;a href="https://www.rockybeaches.com/"&gt;Rocky Beaches&lt;/a&gt; in my weeknotes &lt;a href="https://simonwillison.net/2020/Aug/21/weeknotes-rocky-beaches/"&gt;two weeks ago&lt;/a&gt;. It's a new website built by Natalie Downe that showcases great places to go rockpooling (tidepooling in American English), mixing in tide data from NOAA and species sighting data from iNaturalist.&lt;/p&gt;
&lt;p&gt;Rocky Beaches is powered by Datasette, using a GitHub Actions workflow that builds the site's underlying SQLite database using API calls and YAML data stored in the GitHub repository.&lt;/p&gt;
&lt;p&gt;Natalie wanted to use Airtable to maintain the structured data for the site, rather than hand-editing a YAML file. So I built &lt;a href="https://github.com/simonw/airtable-export"&gt;airtable-export&lt;/a&gt;, a command-line script for sucking down all of the data from an Airtable instance and writing it to disk as YAML or JSON.&lt;/p&gt;
&lt;p&gt;You run it like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;airtable-export out/ mybaseid table1 table2 --key=key
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will create a folder called &lt;code&gt;out/&lt;/code&gt; with a &lt;code&gt;.yml&lt;/code&gt; file for each of the tables.&lt;/p&gt;
&lt;p&gt;Sadly the Airtable API doesn't yet provide a mechanism to list all of the tables in a database (a &lt;a href="https://community.airtable.com/t/list-tables-given-api-key-and-baseid/1173"&gt;long-running feature request&lt;/a&gt;) so you have to list the tables yourself.&lt;/p&gt;
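&lt;p&gt;Under the hood an exporter like this has to follow Airtable's offset-based pagination - each API response can include an &lt;code&gt;offset&lt;/code&gt; token which you pass back to fetch the next page. Here's a minimal sketch of that loop with a canned fetcher standing in for the real HTTP calls; the function and record names are mine, not airtable-export's actual internals:&lt;/p&gt;

```python
def fetch_all_records(fetch_page):
    """Follow Airtable-style offset pagination.

    fetch_page(offset) returns a dict with "records" and, while more
    pages remain, an "offset" token to pass back on the next call.
    """
    records = []
    offset = None
    while True:
        page = fetch_page(offset)
        records.extend(page["records"])
        offset = page.get("offset")
        if offset is None:
            return records


# Canned pages standing in for real GET requests to
# https://api.airtable.com/v0/{base_id}/{table} (hypothetical data):
pages = {
    None: {"records": [{"id": "rec1"}, {"id": "rec2"}], "offset": "itr1"},
    "itr1": {"records": [{"id": "rec3"}]},
}
print(len(fetch_all_records(lambda offset: pages[offset])))  # 3
```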
&lt;p&gt;We're now &lt;a href="https://github.com/natbat/rockybeaches/blob/32a010292e7c1ba47db1a86523a61c666d977074/.github/workflows/deploy.yml#L31-L44"&gt;running that command&lt;/a&gt; as part of the Rocky Beaches build script, and committing the latest version of the YAML file back to the GitHub repo (thus gaining a &lt;a href="https://github.com/natbat/rockybeaches/commits/main/airtable"&gt;full change history&lt;/a&gt; for that data).&lt;/p&gt;
&lt;h4 id="weeknotes-2020-09-03-social-media-cards-tils"&gt;Social media cards for my TILs&lt;/h4&gt;
&lt;p&gt;I really like social media cards - &lt;code&gt;og:image&lt;/code&gt; HTML meta attributes for Facebook and &lt;code&gt;twitter:image&lt;/code&gt; for Twitter. I wanted them for articles on my &lt;a href="https://til.simonwillison.net/"&gt;TIL website&lt;/a&gt; since I often share those via Twitter.&lt;/p&gt;
&lt;p&gt;One catch: my TILs aren't very image heavy. So I decided to generate screenshots of the pages and use those as the 2x1 social media card images.&lt;/p&gt;
&lt;p&gt;The best way I know of programmatically generating screenshots is to use &lt;a href="https://developers.google.com/web/tools/puppeteer"&gt;Puppeteer&lt;/a&gt;, a Node.js library maintained by the Chrome DevTools team for automating a headless instance of the Chrome browser.&lt;/p&gt;
&lt;p&gt;My first attempt was to run Puppeteer in an AWS Lambda function on &lt;a href="https://vercel.com/"&gt;Vercel&lt;/a&gt;. I remembered seeing an example of how to do this in the Vercel documentation a few years ago. The example isn't there any more, but I found the &lt;a href="https://github.com/vercel/now-examples/pull/207"&gt;original pull request&lt;/a&gt; that introduced it.&lt;/p&gt;
&lt;p&gt;Since the example was MIT licensed I created my own fork at &lt;a href="https://github.com/simonw/puppeteer-screenshot"&gt;simonw/puppeteer-screenshot&lt;/a&gt; and updated it to work with the latest Chrome.&lt;/p&gt;
&lt;p&gt;It's pretty resource intensive, so I also added a secret &lt;code&gt;?key=&lt;/code&gt; mechanism so only my own automation code could call my instance running on Vercel.&lt;/p&gt;
&lt;p&gt;I needed to store the generated screenshots somewhere. They're pretty small - on the order of 60KB each - so I decided to store them in my SQLite database itself and use my &lt;a href="https://github.com/simonw/datasette-media"&gt;datasette-media&lt;/a&gt; plugin (see &lt;a href="https://simonwillison.net/2020/Jul/30/fun-binary-data-and-sqlite/"&gt;Fun with binary data and SQLite&lt;/a&gt;) to serve them up.&lt;/p&gt;
&lt;p&gt;This worked! Until it didn't... I ran into a showstopper bug when I realized that the screenshot process relies on the page being live on the site... but when a new article is added it isn't live yet while the build process runs, so the generated screenshot &lt;a href="https://github.com/simonw/til/issues/23"&gt;is of the 404 page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So I reworked it to generate the screenshots inside the GitHub Action as part of the build script, using &lt;a href="https://github.com/JarvusInnovations/puppeteer-cli"&gt;puppeteer-cli&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://github.com/simonw/til/blob/3fca996228ad54ee433b25840fcd3682e9f7bbfd/generate_screenshots.py"&gt;generate_screenshots.py&lt;/a&gt; script handles this by first shelling out to &lt;code&gt;datasette --get&lt;/code&gt; to render the HTML for the page, then running &lt;code&gt;puppeteer&lt;/code&gt; to generate the screenshot. Relevant code:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;png_for_path&lt;/span&gt;(&lt;span class="pl-s1"&gt;path&lt;/span&gt;):
    &lt;span class="pl-c"&gt;# Path is e.g. /til/til/python_debug-click-with-pdb.md&lt;/span&gt;
    &lt;span class="pl-s1"&gt;page_html&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;str&lt;/span&gt;(&lt;span class="pl-v"&gt;TMP_PATH&lt;/span&gt; &lt;span class="pl-c1"&gt;/&lt;/span&gt; &lt;span class="pl-s"&gt;"generate-screenshots-page.html"&lt;/span&gt;)
    &lt;span class="pl-c"&gt;# Use datasette to generate HTML&lt;/span&gt;
    &lt;span class="pl-s1"&gt;proc&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-en"&gt;run&lt;/span&gt;([&lt;span class="pl-s"&gt;"datasette"&lt;/span&gt;, &lt;span class="pl-s"&gt;"."&lt;/span&gt;, &lt;span class="pl-s"&gt;"--get"&lt;/span&gt;, &lt;span class="pl-s1"&gt;path&lt;/span&gt;], &lt;span class="pl-s1"&gt;capture_output&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)
    &lt;span class="pl-en"&gt;open&lt;/span&gt;(&lt;span class="pl-s1"&gt;page_html&lt;/span&gt;, &lt;span class="pl-s"&gt;"wb"&lt;/span&gt;).&lt;span class="pl-en"&gt;write&lt;/span&gt;(&lt;span class="pl-s1"&gt;proc&lt;/span&gt;.&lt;span class="pl-s1"&gt;stdout&lt;/span&gt;)
    &lt;span class="pl-c"&gt;# Now use puppeteer screenshot to generate a PNG&lt;/span&gt;
    &lt;span class="pl-s1"&gt;proc2&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-en"&gt;run&lt;/span&gt;(
        [
            &lt;span class="pl-s"&gt;"puppeteer"&lt;/span&gt;,
            &lt;span class="pl-s"&gt;"screenshot"&lt;/span&gt;,
            &lt;span class="pl-s1"&gt;page_html&lt;/span&gt;,
            &lt;span class="pl-s"&gt;"--viewport"&lt;/span&gt;,
            &lt;span class="pl-s"&gt;"800x400"&lt;/span&gt;,
            &lt;span class="pl-s"&gt;"--full-page=false"&lt;/span&gt;,
        ],
        &lt;span class="pl-s1"&gt;capture_output&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;,
    )
    &lt;span class="pl-s1"&gt;png_bytes&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;proc2&lt;/span&gt;.&lt;span class="pl-s1"&gt;stdout&lt;/span&gt;
    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s1"&gt;png_bytes&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;This worked great! Except for one thing... the site is hosted on Vercel, and Vercel has a 5MB &lt;a href="https://vercel.com/docs/platform/limits#serverless-function-payload-size-limit"&gt;response size limit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Every time my GitHub build script runs it downloads the previous SQLite database file, so it can avoid regenerating screenshots and HTML for pages that haven't changed.&lt;/p&gt;
&lt;p&gt;The addition of the binary screenshots drove the size of the SQLite database over 5MB, so the part of my script that retrieved the previous database &lt;a href="https://github.com/simonw/til/issues/25"&gt;no longer worked&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I needed a reliable way to store that 5MB (and probably eventually 10-50MB) database file in between runs of my action.&lt;/p&gt;
&lt;p&gt;The best place to put this would be an S3 bucket, but I find the process of setting up IAM permissions for access to a new bucket so infuriating that I couldn't bring myself to do it.&lt;/p&gt;
&lt;p&gt;So... I created a new dedicated GitHub repository, &lt;a href="https://github.com/simonw/til-db"&gt;simonw/til-db&lt;/a&gt;, and updated my action to store the binary file in that repo - using &lt;a href="https://github.com/simonw/til/blob/1e29c3fe5e90c29b0e71d87dba805484ceb4393c/.github/workflows/build.yml#L80-L86"&gt;a force push&lt;/a&gt; so the repo doesn't need to maintain unnecessary version history of the binary asset.&lt;/p&gt;
&lt;p&gt;This is an abomination of a hack, and it made me cackle a lot. I &lt;a href="https://twitter.com/simonw/status/1301029346614718465"&gt;tweeted about it&lt;/a&gt; and got the suggestion to try &lt;a href="https://git-lfs.github.com/"&gt;Git LFS&lt;/a&gt; instead, which would definitely be a more appropriate way to solve this problem.&lt;/p&gt;
&lt;h4 id="weeknotes-2020-09-03-rendering-markdown"&gt;Rendering Markdown&lt;/h4&gt;
&lt;p&gt;I write my blog entries in Markdown and transform them into HTML before I post them on my blog. Some day I'll teach my blog to render Markdown itself, but so far I've got by through copying and pasting into Markdown tools.&lt;/p&gt;
&lt;p&gt;My favourite Markdown flavour is GitHub's, which adds a bunch of useful capabilities - most notably the ability to apply syntax highlighting. GitHub &lt;a href="https://docs.github.com/en/rest/reference/markdown"&gt;expose an API&lt;/a&gt; that applies their Markdown formatter and returns the resulting HTML.&lt;/p&gt;
&lt;p&gt;I built myself &lt;a href="https://til.simonwillison.net/tools/render-markdown"&gt;a quick and scrappy tool&lt;/a&gt; in JavaScript that sends Markdown through their API and then applies a few DOM manipulations to clean up what comes back. It was a nice opportunity to write some modern vanilla JavaScript using &lt;code&gt;fetch()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-en"&gt;render&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;markdown&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'https://api.github.com/markdown'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;'POST'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
            &lt;span class="pl-s"&gt;'Content-Type'&lt;/span&gt;: &lt;span class="pl-s"&gt;'application/json'&lt;/span&gt;
        &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-s"&gt;'mode'&lt;/span&gt;: &lt;span class="pl-s"&gt;'markdown'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;'text'&lt;/span&gt;: &lt;span class="pl-s1"&gt;markdown&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;

&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;button&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementsByTagName&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'button'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;output&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementById&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'output'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;preview&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;getElementById&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'preview'&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-s1"&gt;button&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;addEventListener&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;'click'&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;rendered&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;await&lt;/span&gt; &lt;span class="pl-en"&gt;render&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;input&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;value&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;output&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;value&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;rendered&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;preview&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;innerHTML&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;rendered&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h4 id="weeknotes-2020-09-03-dogsheep-beta"&gt;Dogsheep Beta&lt;/h4&gt;
&lt;p&gt;My most exciting project this week was getting out the first working version of &lt;a href="https://github.com/dogsheep/beta"&gt;Dogsheep Beta&lt;/a&gt; - the search engine that ties together results from my &lt;a href="https://dogsheep.github.io/"&gt;Dogsheep&lt;/a&gt; family of tools for personal analytics.&lt;/p&gt;
&lt;p&gt;I'm giving a talk about this tonight at PyCon Australia: &lt;a href="https://2020.pycon.org.au/program/73uk8x/"&gt;Build your own data warehouse for personal analytics with SQLite and Datasette&lt;/a&gt;. I'll be writing up detailed notes in the next few days, so watch this space.&lt;/p&gt;
&lt;h4 id="weeknotes-2020-09-03-til-this-week"&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/jq_reformatting-airtable-json.md"&gt;Converting Airtable JSON for use with sqlite-utils using jq&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/javascript_minifying-uglify-npx.md"&gt;Minifying JavaScript with npx uglify-js&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/pytest_subprocess-server.md"&gt;Start a server in a subprocess during a pytest session&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/bash_loop-over-csv.md"&gt;Looping over comma-separated values in Bash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/cloudrun_gcloud-run-services-list.md"&gt;Using the gcloud run services list command&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/python_debug-click-with-pdb.md"&gt;Debugging a Click application using pdb&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="weeknotes-2020-09-03-releases-this-week"&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.4.1"&gt;dogsheep-beta 0.4.1&lt;/a&gt; - 2020-09-03&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.4"&gt;dogsheep-beta 0.4&lt;/a&gt; - 2020-09-03&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.4a1"&gt;dogsheep-beta 0.4a1&lt;/a&gt; - 2020-09-03&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.4a0"&gt;dogsheep-beta 0.4a0&lt;/a&gt; - 2020-09-03&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.3"&gt;dogsheep-beta 0.3&lt;/a&gt; - 2020-09-02&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.2"&gt;dogsheep-beta 0.2&lt;/a&gt; - 2020-09-01&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.1"&gt;dogsheep-beta 0.1&lt;/a&gt; - 2020-09-01&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.1a2"&gt;dogsheep-beta 0.1a2&lt;/a&gt; - 2020-09-01&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dogsheep/dogsheep-beta/releases/tag/0.1a"&gt;dogsheep-beta 0.1a&lt;/a&gt; - 2020-09-01&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/airtable-export/releases/tag/0.4"&gt;airtable-export 0.4&lt;/a&gt; - 2020-08-30&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/datasette-yaml/releases/tag/0.1a"&gt;datasette-yaml 0.1a&lt;/a&gt; - 2020-08-29&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/airtable-export/releases/tag/0.3.1"&gt;airtable-export 0.3.1&lt;/a&gt; - 2020-08-29&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/airtable-export/releases/tag/0.3"&gt;airtable-export 0.3&lt;/a&gt; - 2020-08-29&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/airtable-export/releases/tag/0.2"&gt;airtable-export 0.2&lt;/a&gt; - 2020-08-29&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/airtable-export/releases/tag/0.1.1"&gt;airtable-export 0.1.1&lt;/a&gt; - 2020-08-29&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/airtable-export/releases/tag/0.1"&gt;airtable-export 0.1&lt;/a&gt; - 2020-08-29&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/datasette/releases/tag/0.49a0"&gt;datasette 0.49a0&lt;/a&gt; - 2020-08-28&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/2.16.1"&gt;sqlite-utils 2.16.1&lt;/a&gt; - 2020-08-28&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/markdown"&gt;markdown&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dogsheep"&gt;dogsheep&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/airtable"&gt;airtable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/puppeteer"&gt;puppeteer&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="yaml"/><category term="markdown"/><category term="dogsheep"/><category term="weeknotes"/><category term="github-actions"/><category term="airtable"/><category term="puppeteer"/></entry><entry><title>airtable-export</title><link href="https://simonwillison.net/2020/Aug/29/airtable-export/#atom-tag" rel="alternate"/><published>2020-08-29T21:48:37+00:00</published><updated>2020-08-29T21:48:37+00:00</updated><id>https://simonwillison.net/2020/Aug/29/airtable-export/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/airtable-export"&gt;airtable-export&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I wrote a command-line utility for exporting data from Airtable and dumping it to disk as YAML, JSON or newline-delimited JSON files. This means you can back up an Airtable database from a GitHub Action and get a commit history of changes made to your data.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/airtable"&gt;airtable&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="projects"/><category term="yaml"/><category term="airtable"/></entry><entry><title>Goodbye Zeit Now v1, hello datasette-publish-now - and talking to myself in GitHub issues</title><link href="https://simonwillison.net/2020/Apr/8/weeknotes-zeit-now-v2/#atom-tag" rel="alternate"/><published>2020-04-08T03:32:24+00:00</published><updated>2020-04-08T03:32:24+00:00</updated><id>https://simonwillison.net/2020/Apr/8/weeknotes-zeit-now-v2/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I’ve been mostly dealing with the finally announced shutdown of Zeit Now v1. And having long-winded conversations with myself in GitHub issues.&lt;/p&gt;

&lt;h3&gt;How Zeit Now inspired Datasette&lt;/h3&gt;

&lt;p&gt;I first started experimenting with Zeit’s serverless &lt;a href="https://zeit.co/home"&gt;Now&lt;/a&gt; hosting platform back &lt;a href="https://simonwillison.net/2017/Oct/14/async-python-sanic-now/"&gt;in October 2017&lt;/a&gt;, when I used it to deploy &lt;a href="https://json-head.now.sh/"&gt;json-head.now.sh&lt;/a&gt; - an updated version of an API tool I originally built for Google App Engine &lt;a href="https://simonwillison.net/2008/Jul/29/jsonhead/"&gt;in July 2008&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I liked Zeit Now, a lot. Instant, inexpensive deploys of any stateless project that could be defined using a Dockerfile? Just type &lt;code&gt;now&lt;/code&gt; to deploy the project in your current directory? Every deployment gets its own permanent URL? Amazing!&lt;/p&gt;

&lt;p&gt;There was just one catch: since Now deployments are ephemeral, applications running on them need to be stateless. If you want a database, you need to involve another (potentially costly) service. It's a limitation shared by other scalable hosting solutions - Heroku, App Engine and so on. How much interesting stuff can you build without a database?&lt;/p&gt;

&lt;p&gt;I was musing about this in the shower one day (that &lt;a href="https://lifehacker.com/science-explains-why-our-best-ideas-come-in-the-shower-5987858"&gt;old cliche&lt;/a&gt; really happened for me) when I had a thought: sure, you can't write to a database... but if your data is read-only, why not bundle the database alongside the application code as part of the Docker image?&lt;/p&gt;
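The shower thought above later became Datasette's "baked data" pattern. A hypothetical minimal Dockerfile sketch of the idea (the `fixtures.db` filename and base image are stand-ins, not taken from any real project):

```dockerfile
# Hypothetical sketch: bake a read-only SQLite file into the image, so the
# running container stays completely stateless.
FROM python:3.6-slim
RUN pip install datasette
# The database ships inside the image alongside the application code
COPY fixtures.db /app/fixtures.db
WORKDIR /app
EXPOSE 8001
CMD ["datasette", "fixtures.db", "--host", "0.0.0.0", "--port", "8001"]
```

Redeploying with a fresh copy of the data is then just a matter of rebuilding and re-shipping the image.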

&lt;p&gt;Ever since I &lt;a href="https://simonwillison.net/2009/Mar/10/openplatform/"&gt;helped launch the Datablog&lt;/a&gt; at the Guardian back in 2009 I had been interested in finding better ways to publish data journalism datasets than CSV files or Google spreadsheets - so building something that could package and bundle read-only data was of extreme interest to me.&lt;/p&gt;

&lt;p&gt;In November 2017 I released &lt;a href="https://simonwillison.net/2017/Nov/13/datasette/"&gt;the first version&lt;/a&gt; of Datasette. The original idea was very much inspired by Zeit Now.&lt;/p&gt;

&lt;p&gt;I gave &lt;a href="https://www.youtube.com/watch?v=_uwrqB--eM4"&gt;a talk about Datasette&lt;/a&gt; at the Zeit Day conference in San Francisco in April 2018. Suffice to say I was a huge fan!&lt;/p&gt;

&lt;h3&gt;Goodbye, Zeit Now v1&lt;/h3&gt;

&lt;p&gt;In November 2018, Zeit &lt;a href="https://simonwillison.net/2018/Nov/19/smaller-python-docker-images/"&gt;announced Now v2&lt;/a&gt;. And it was, &lt;em&gt;different&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;v2 is an entirely different architecture from v1. Where v1 built on Docker containers, v2 is built on top of serverless functions - AWS Lambda in particular.&lt;/p&gt;

&lt;p&gt;I can see why Zeit did this. Lambda functions can launch from cold &lt;em&gt;way faster&lt;/em&gt; - v1's Docker infrastructure had tough cold-start times. They are much cheaper to run as well - crucial for Zeit given their &lt;a href="https://zeit.co/pricing"&gt;extremely generous pricing plans&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But it was bad news for my projects. Lambdas are tightly size constrained, which is tough when you're bundling potentially large SQLite database files with your deployments.&lt;/p&gt;

&lt;p&gt;More importantly, in 2018 Amazon were deliberately excluding the Python &lt;code&gt;sqlite3&lt;/code&gt; standard library module from the Python Lambda environment! I guess they hadn't considered people who might want to work with read-only database files.&lt;/p&gt;

&lt;p&gt;So Datasette on Now v2 just wasn't going to work. Zeit kept v1 supported for the time being, but the writing was clearly on the wall.&lt;/p&gt;

&lt;p&gt;In April 2019 &lt;a href="https://cloud.google.com/blog/products/serverless/announcing-cloud-run-the-newest-member-of-our-serverless-compute-stack"&gt;Google announced Cloud Run&lt;/a&gt;, a serverless, scale-to-zero hosting environment based around Docker containers. In many ways it's Google's version of Zeit Now v1 - it has many of the characteristics I loved about v1, albeit with a clunkier developer experience and much more friction in assigning nice URLs to projects. Romain Primet &lt;a href="https://github.com/simonw/datasette/pull/434"&gt;contributed Cloud Run support to Datasette&lt;/a&gt; and it has since become my preferred hosting target for my new projects (see &lt;a href="https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/"&gt;Deploying a data API using GitHub Actions and Cloud Run&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Last week, Zeit &lt;a href="https://twitter.com/simonw/status/1246300304917680128"&gt;finally announced&lt;/a&gt; the sunset date for v1. From the 1st of May new deploys won't be allowed, and on the 7th of August they'll be turning off the old v1 infrastructure and deleting all existing Now v1 deployments.&lt;/p&gt;

&lt;p&gt;I engaged in &lt;a href="https://twitter.com/simonw/status/1246300304917680128"&gt;an extensive Twitter conversation&lt;/a&gt; about this, where I praised Zeit's handling of the shutdown while bemoaning the loss of the v1 product I had loved so much.&lt;/p&gt;

&lt;h3 id="migrating-my-projects"&gt;Migrating my projects&lt;/h3&gt;

&lt;p&gt;My newer projects have been on Cloud Run for quite some time, but I still have a bunch of old projects that I care about and want to keep running past the v1 shutdown.&lt;/p&gt;

&lt;p&gt;The first project I ported was &lt;a href="https://latest.datasette.io/"&gt;latest.datasette.io&lt;/a&gt;, a live demo of Datasette which updates with the latest code any time I push to the Datasette master branch on GitHub.&lt;/p&gt;

&lt;p&gt;Any time I do some kind of ops task like this I've gotten into the habit of meticulously documenting every single step in comments on a GitHub issue. Here's &lt;a href="https://github.com/simonw/datasette/issues/705"&gt;the issue&lt;/a&gt; for porting latest.datasette.io to Cloud Run (and switching from Circle CI to GitHub Actions at the same time).&lt;/p&gt;

&lt;p&gt;My next project was &lt;a href="https://global-power-plants.datasettes.com/global-power-plants/global-power-plants"&gt;global-power-plants-datasette&lt;/a&gt;, a small project which takes a database of global power plants &lt;a href="https://www.wri.org/publication/global-power-plant-database"&gt;published by the World Resources Institute&lt;/a&gt; and publishes it using Datasette. It checks for new updates to &lt;a href="https://github.com/wri/global-power-plant-database"&gt;their repo&lt;/a&gt; once a day. I originally built it as a demo for &lt;a href="https://github.com/simonw/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt;, since it's fun seeing 33,000 power plants on a single map. Here's &lt;a href="https://github.com/simonw/global-power-plants-datasette/issues/1"&gt;that issue&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Having warmed up with these two, my next target was the most significant: porting my &lt;a href="https://www.niche-museums.com/"&gt;Niche Museums&lt;/a&gt; website.&lt;/p&gt;

&lt;p&gt;Niche Museums is the most heavily customized Datasette instance I've run anywhere - it incorporates custom templates, CSS and plugins.&lt;/p&gt;

&lt;p&gt;Here's &lt;a href="https://github.com/simonw/museums/issues/20"&gt;the tracking issue&lt;/a&gt; for porting it to Cloud Run. I ran into a few hurdles with DNS and TLS certificates, and I had to do &lt;a href="https://github.com/simonw/museums/issues/21"&gt;some additional work&lt;/a&gt; to ensure &lt;code&gt;niche-museums.com&lt;/code&gt; redirects to &lt;code&gt;www.niche-musums.com&lt;/code&gt;, but it's now fully migrated.&lt;/p&gt;

&lt;h3 id="hello-zeit-now-v2"&gt;Hello, Zeit Now v2&lt;/h3&gt;

&lt;p&gt;In &lt;a href="https://twitter.com/simonw/status/1246302021608591360"&gt;complaining about&lt;/a&gt; the lack of that essential &lt;code&gt;sqlite3&lt;/code&gt; module I figured it would be responsible to double-check and make sure that was still true.&lt;/p&gt;

&lt;p&gt;It was not! Today Now's Python environment &lt;a href="https://twitter.com/simonw/status/1246600935289184256"&gt;includes sqlite3&lt;/a&gt; after all.&lt;/p&gt;

&lt;p&gt;Datasette's &lt;a href="https://datasette.readthedocs.io/en/0.39/plugins.html#publish-subcommand-publish"&gt;publish_subcommand() plugin hook&lt;/a&gt; lets plugins add new publishing targets to the &lt;code&gt;datasette publish&lt;/code&gt; command (I used it to build &lt;a href="https://github.com/simonw/datasette-publish-fly"&gt;datasette-publish-fly&lt;/a&gt; last month). How hard would it be to build a plugin for Zeit Now v2?&lt;/p&gt;

&lt;p&gt;I fired up a new &lt;a href="https://github.com/simonw/datasette/issues/717"&gt;lengthy talking-to-myself GitHub issue&lt;/a&gt; and started prototyping.&lt;/p&gt;

&lt;p&gt;Now v2 may not support Docker, but it does support the &lt;a href="https://asgi.readthedocs.io/en/latest/"&gt;ASGI Python standard&lt;/a&gt; (the asynchronous alternative to WSGI, shepherded by Andrew Godwin).&lt;/p&gt;

&lt;p&gt;Zeit are keen proponents of the &lt;a href="https://jamstack.org/"&gt;Jamstack&lt;/a&gt; approach, where websites are built using static pre-rendered HTML and JavaScript that calls out to APIs for dynamic data. v2 deployments are expected to consist of static HTML with "serverless functions" - standalone server-side scripts that live in an &lt;code&gt;api/&lt;/code&gt; directory by convention and are compiled into separate lambdas.&lt;/p&gt;

&lt;p&gt;Datasette works just fine without JavaScript, which means it needs to handle all of the URL routes for a site. Essentially I need to build a single function that runs the whole of Datasette, then route all incoming traffic to it.&lt;/p&gt;

&lt;p&gt;It took me a while to figure it out, but it turns out the Now v2 recipe for that is a &lt;code&gt;now.json&lt;/code&gt; file that looks like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
    "version": 2,
    "builds": [
        {
            "src": "index.py",
            "use": "@now/python"
        }
    ],
    "routes": [
        {
            "src": "(.*)",
            "dest": "index.py"
        }
    ]
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Thanks Aaron Boodman for &lt;a href="https://twitter.com/aboodman/status/1246605658067066882"&gt;the tip&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Given the above configuration, Zeit will install any Python dependencies in a &lt;code&gt;requirements.txt&lt;/code&gt; file, then treat an &lt;code&gt;app&lt;/code&gt; variable in the &lt;code&gt;index.py&lt;/code&gt; file as an ASGI application it should route all incoming traffic to. Exactly what I need to deploy Datasette!&lt;/p&gt;
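So an `index.py` only has to expose a module-level ASGI callable named `app`. Here's a minimal stand-in sketch - plain ASGI rather than Datasette itself, just to show the shape of the contract Now expects (a real deployment would assign `app` from Datasette's ASGI interface instead):

```python
# index.py -- minimal ASGI application. Now v2's @now/python builder looks
# for a module-level variable named `app` and routes all traffic to it.
# (Stand-in sketch: a real Datasette deployment would build `app` from
# Datasette's ASGI interface rather than defining it by hand.)

async def app(scope, receive, send):
    # Only HTTP requests are expected from the router configured above
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello from ASGI"})
```

Because the `routes` section sends `(.*)` to `index.py`, this one callable sees every URL on the site.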

&lt;p&gt;This was everything I needed to build the new plugin. &lt;a href="https://github.com/simonw/datasette-publish-now"&gt;datasette-publish-now&lt;/a&gt; is the result.&lt;/p&gt;

&lt;p&gt;Here's &lt;a href="https://datasette-public.now.sh/_src"&gt;the generated source code&lt;/a&gt; for a project deployed using the plugin, showing how the underlyinng ASGI application is configured.&lt;/p&gt;

&lt;p&gt;It's currently an alpha - not every feature is supported (see &lt;a href="https://github.com/simonw/datasette-publish-now/milestone/1"&gt;this milestone&lt;/a&gt;) and it relies on a minor deprecated feature (which I've &lt;a href="https://github.com/zeit/now/discussions/4021"&gt;implored Zeit to reconsider&lt;/a&gt;) but it's already full-featured enough that I can start using it to upgrade some of my smaller existing Now projects.&lt;/p&gt;

&lt;p&gt;The first one I upgraded is one of my favourites: &lt;a href="https://polar-bears.now.sh/"&gt;polar-bears.now.sh&lt;/a&gt;, which visualizes tracking data from polar bear ear tags (using &lt;a href="https://github.com/simonw/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt;) that was &lt;a href="https://alaska.usgs.gov/products/data.php?dataid=130"&gt;published by the USGS Alaska Science Center, Polar Bear Research Program&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here's the command I used to deploy the site:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ pip install datasette-publish-now
$ datasette publish now2 polar-bears.db \
    --title "Polar Bear Ear Tags, 2009-2011" \
    --source "USGS Alaska Science Center, Polar Bear Research Program" \
    --source_url "https://alaska.usgs.gov/products/data.php?dataid=130" \
    --install datasette-cluster-map \
    --project=polar-bears&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I exported a full list of my Now v1 projects from their handy &lt;a href="https://zeit.co/dashboard/active-v1-instances"&gt;active v1 instances&lt;/a&gt; page.&lt;/p&gt;

&lt;h3&gt;The rest of my projects&lt;/h3&gt;

&lt;p&gt;I scraped the page using the following JavaScript, constructed with the help of the &lt;a href="https://simonwillison.net/2020/Apr/7/new-developer-features-firefox-75/"&gt;instant evaluation&lt;/a&gt; console feature in Firefox 75:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;console.log(
  JSON.stringify(
    Array.from(
      Array.from(
        document.getElementsByTagName("table")[1].
          getElementsByTagName("tr")
      ).slice(1).map(
        (tr) =&amp;gt;
          Array.from(
            tr.getElementsByTagName("td")
        ).map((td) =&amp;gt; td.innerText)
      )
    )
  )
);&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then I loaded them into Datasette for analysis.&lt;/p&gt;
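The loading step itself is straightforward. Here's a stdlib-only sketch of it - the column names and sample rows are illustrative, not the actual Zeit dashboard headers, and a tool like sqlite-utils would do the same job in a single command:

```python
import json
import sqlite3

# Rows as scraped: a JSON array of arrays of cell text (illustrative data,
# with invented column values standing in for the real dashboard table).
rows = json.loads('[["polar-bears", "now-v1"], ["json-head", "now-v1"]]')

conn = sqlite3.connect(":memory:")  # use "projects.db" to keep a file on disk
conn.execute("create table projects (name text, platform text)")
conn.executemany("insert into projects values (?, ?)", rows)
conn.commit()
print(conn.execute("select count(*) from projects").fetchone()[0])
```

Once the rows are in SQLite, pointing Datasette at the file gives you filtering and faceting for free.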

&lt;p&gt;After filtering out the &lt;code&gt;datasette-latest-commithash.now.sh&lt;/code&gt; projects I had deployed for every push to GitHub it turns out I have 34 distinct projects running there.&lt;/p&gt;

&lt;p&gt;I won't port all of them, but given &lt;code&gt;datasette-publish-now&lt;/code&gt; I should be able to port the ones that I care about without too much trouble.&lt;/p&gt;

&lt;h3 id="git-bisect"&gt;Debugging Datasette with git bisect run&lt;/h3&gt;

&lt;p&gt;I fixed two bugs in Datasette this week using &lt;code&gt;git bisect run&lt;/code&gt; - a tool I've been meaning to figure out for years, which lets you run an automated binary search against a commit log to find the source of a bug.&lt;/p&gt;

&lt;p&gt;Since I was figuring out a new tool, I fired up another GitHub issue self-conversation: in &lt;a href="https://github.com/simonw/datasette/issues/716"&gt;issue #716&lt;/a&gt; I document my process of both learning to use &lt;code&gt;git bisect run&lt;/code&gt; and using it to find a solution to that particular bug.&lt;/p&gt;

&lt;p&gt;It worked great, so I used the same trick on &lt;a href="https://github.com/simonw/datasette/issues/689"&gt;issue 689&lt;/a&gt; as well.&lt;/p&gt;

&lt;p&gt;Watching &lt;code&gt;git bisect run&lt;/code&gt; churn through 32 revisions in a few seconds and pinpoint the exact moment a bug was introduced is pretty delightful:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ git bisect start master 0.34
Bisecting: 32 revisions left to test after this (roughly 5 steps)
[dc80e779a2e708b2685fc641df99e6aae9ad6f97] Handle scope path if it is a string
$ git bisect run python check_templates_considered.py
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 15 revisions left to test after this (roughly 4 steps)
[7c6a9c35299f251f9abfb03fd8e85143e4361709] Better tests for prepare_connection() plugin hook, refs #678
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 7 revisions left to test after this (roughly 3 steps)
[0091dfe3e5a3db94af8881038d3f1b8312bb857d] More reliable tie-break ordering for facet results
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[ce12244037b60ba0202c814871218c1dab38d729] Release notes for 0.35
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 1 revision left to test after this (roughly 1 step)
[70b915fb4bc214f9d064179f87671f8a378aa127] Datasette.render_template() method, closes #577
running python check_templates_considered.py
Traceback (most recent call last):
...
AssertionError
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[286ed286b68793532c2a38436a08343b45cfbc91] geojson-to-sqlite
running python check_templates_considered.py
70b915fb4bc214f9d064179f87671f8a378aa127 is the first bad commit
commit 70b915fb4bc214f9d064179f87671f8a378aa127
Author: Simon Willison
Date:   Tue Feb 4 12:26:17 2020 -0800

    Datasette.render_template() method, closes #577

    Pull request #664.

:040000 040000 def9e31252e056845609de36c66d4320dd0c47f8 da19b7f8c26d50a4c05e5a7f05220b968429725c M	datasette
bisect run success&lt;/code&gt;&lt;/pre&gt;
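The transcript above relies on `git bisect run`'s exit-status contract: the script exits 0 to mark a revision good, 1-127 (except 125, which means "skip this revision") to mark it bad. A hypothetical check script following that contract - not the actual `check_templates_considered.py` - looks like this:

```python
# Hypothetical bisect check script. git bisect run reads the exit status:
# 0 = this revision is good, 1-127 (except 125) = bad, 125 = cannot test.

def behaviour_is_correct():
    # Stand-in probe; a real script would import the project at the
    # checked-out revision, exercise the code path that regressed, and
    # return False when the bug is present.
    return True

status = 0 if behaviour_is_correct() else 1
print("exit status:", status)
# A real script would finish with: raise SystemExit(status)
```

`git bisect run` checks out each candidate revision and runs the script; the first revision where the status flips from 0 to non-zero is reported as the first bad commit.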

&lt;h3&gt;Supporting metadata.yaml&lt;/h3&gt;

&lt;p&gt;The other Datasette project I completed this week is a relatively small feature with hopefully a big impact: you can &lt;a href="https://github.com/simonw/datasette/issues/713"&gt;now use YAML for Datasette's metadata configuration&lt;/a&gt; as an alternative to JSON.&lt;/p&gt;

&lt;p&gt;I'm not crazy about YAML: I still don't feel like I've mastered it, and I've been &lt;a href="https://simonwillison.net/tags/yaml/"&gt;tracking it for 18 years&lt;/a&gt;! But it has one big advantage over JSON for configuration files: robust support for multi-line strings.&lt;/p&gt;

&lt;p&gt;Datasette's &lt;a href="https://datasette.readthedocs.io/en/latest/metadata.html"&gt;metadata file&lt;/a&gt; can include lengthy SQL statements and strings of HTML, both of which benefit from multi-line strings.&lt;/p&gt;
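For example, a canned SQL query reads far better as a YAML block scalar than as a JSON string full of `\n` escapes. An illustrative `metadata.yaml` fragment - the database and query names here are invented, not from a real project:

```yaml
title: Example Datasette instance
databases:
  ads:
    queries:
      search_ads:
        sql: |-
          select id, caption, url
          from display_ads
          where caption like '%' || :search || '%'
          order by id
```

The same structure in JSON would need the whole SQL statement collapsed onto one escaped line.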

&lt;p&gt;I first used YAML for metadata for my &lt;a href="https://simonwillison.net/2018/Aug/6/russian-facebook-ads/"&gt;Analyzing US Election Russian Facebook Ads&lt;/a&gt; project. The &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette/blob/336ba87ef8071e664441ad0a95e3b8d0a33f682a/russian-ads-metadata.yaml"&gt;metadata file for that&lt;/a&gt; demonstrates both embedded HTML and embedded SQL - and an accompanying &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette/blob/336ba87ef8071e664441ad0a95e3b8d0a33f682a/build_metadata.py"&gt;build_metadata.py&lt;/a&gt; script converted it to JSON at build time. I've since used the same trick for a number of other projects.&lt;/p&gt;

&lt;p&gt;The next release of Datasette (hopefully within a week) will ship the new feature, at which point those conversion scripts won't be necessary.&lt;/p&gt;

&lt;p&gt;This should work particularly well with the forthcoming &lt;a href="https://github.com/simonw/datasette/issues/698"&gt;ability for a canned query to write to a database&lt;/a&gt;. Getting that wrapped up and shipped will be my focus for the next few days.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/zeit-now"&gt;zeit-now&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-issues"&gt;github-issues&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="git"/><category term="github"/><category term="projects"/><category term="yaml"/><category term="zeit-now"/><category term="datasette"/><category term="weeknotes"/><category term="github-issues"/></entry><entry><title>niche-museums.com, powered by Datasette</title><link href="https://simonwillison.net/2019/Nov/25/niche-museums/#atom-tag" rel="alternate"/><published>2019-11-25T22:27:46+00:00</published><updated>2019-11-25T22:27:46+00:00</updated><id>https://simonwillison.net/2019/Nov/25/niche-museums/#atom-tag</id><summary type="html">
    &lt;p&gt;I just released a major upgrade to my &lt;a href="https://www.niche-museums.com/"&gt;www.niche-museums.com&lt;/a&gt; website (launched &lt;a href="https://simonwillison.net/2019/Oct/28/niche-museums-kepler/"&gt;last month&lt;/a&gt;).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The site is now rendered server-side. The previous version used &lt;a href="https://lit-html.polymer-project.org/"&gt;lit-html&lt;/a&gt; to render content using JavaScript.&lt;/li&gt;
&lt;li&gt;Each museum now has its own page. Here&amp;#39;s today&amp;#39;s new museum listing for the &lt;a href="https://www.niche-museums.com/browse/museums/46"&gt;Conservatory of Flowers&lt;/a&gt; in San Francisco. These pages have a map on them.&lt;/li&gt;
&lt;li&gt;The site has an &lt;a href="https://www.niche-museums.com/about"&gt;about page&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;You can now link to the page for a specific latitude and longitude, e.g. &lt;a href="https://www.niche-museums.com/?latitude=37.77&amp;amp;longitude=-122.458"&gt;this location in Golden Gate Park&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The source code for the site is now &lt;a href="https://github.com/simonw/museums"&gt;available on GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Notably, the site is entirely powered by &lt;a href="https://github.com/simonw/datasette"&gt;Datasette&lt;/a&gt;. It&amp;#39;s a heavily customized Datasette instance, making extensive use of &lt;a href="https://datasette.readthedocs.io/en/0.32/custom_templates.html#custom-templates"&gt;custom templates&lt;/a&gt; and &lt;a href="https://datasette.readthedocs.io/en/0.32/plugins.html"&gt;plugins&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It&amp;#39;s a really fun experiment. I&amp;#39;m essentially using Datasette as a weird twist on a static site generator - no moving parts since the database is immutable but there&amp;#39;s still stuff happening server-side to render the pages.&lt;/p&gt;
&lt;h3 id="continuous-deployment"&gt;Continuous deployment&lt;/h3&gt;
&lt;p&gt;The site is entirely stateless and is published &lt;a href="https://circleci.com/gh/simonw/museums"&gt;using Circle CI&lt;/a&gt; to a serverless hosting provider (currently Zeit Now v1, but I&amp;#39;ll probably move it to Google Cloud Run in the near future.)&lt;/p&gt;
&lt;p&gt;The site content - 46 museums and counting - lives in the &lt;a href="https://github.com/simonw/museums/blob/master/museums.yaml"&gt;museums.yaml&lt;/a&gt; file. I&amp;#39;ve been adding a new museum listing every day by editing the YAML file using &lt;a href="https://workingcopyapp.com/"&gt;Working Copy&lt;/a&gt; on my iPhone.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/simonw/museums/blob/master/.circleci/config.yml"&gt;build script&lt;/a&gt; runs automatically on every commit. It converts the YAML file into a SQLite database using my &lt;a href="https://github.com/simonw/yaml-to-sqlite"&gt;yaml-to-sqlite&lt;/a&gt; tool, then runs &lt;code&gt;datasette publish now...&lt;/code&gt; to deploy the resulting database.&lt;/p&gt;
&lt;p&gt;The full deployment command is as follows:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;datasette publish now browse.db about.db \
    --token=$NOW_TOKEN \
    --alias=www.niche-museums.com \
    --name=niche-museums \
    --install=datasette-haversine \
    --install=datasette-pretty-json \
    --install=datasette-template-sql \
    --install=datasette-json-html \
    --install=datasette-cluster-map~=0.8 \
    --metadata=metadata.json \
    --template-dir=templates \
    --plugins-dir=plugins \
    --branch=master
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;There&amp;#39;s a lot going on here.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;browse.db&lt;/code&gt; is the SQLite database file that was built by running &lt;code&gt;yaml-to-sqlite&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;about.db&lt;/code&gt; is an empty database built using &lt;code&gt;sqlite3 about.db &amp;#39;&amp;#39;&lt;/code&gt; - more on this later.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;--alias=&lt;/code&gt; option tells Zeit Now to alias that URL to the resulting deployment. This is the single biggest feature that I&amp;#39;m missing from Google Cloud Run at the moment. It&amp;#39;s possible to point domains at deployments there but it&amp;#39;s not nearly as easy to script.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;--install=&lt;/code&gt; options tell &lt;code&gt;datasette publish&lt;/code&gt; which plugins should be installed on the resulting instance.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;--metadata=&lt;/code&gt;, &lt;code&gt;--template-dir=&lt;/code&gt; and &lt;code&gt;--plugins-dir=&lt;/code&gt; are the options that customize the instance.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;--branch=master&lt;/code&gt; means we always deploy the latest master of Datasette directly from GitHub, ignoring the most recent release to PyPI. This isn&amp;#39;t strictly necessary here.&lt;/p&gt;
&lt;h3 id="customization"&gt;Customization&lt;/h3&gt;
&lt;p&gt;The site itself is built almost entirely using Datasette custom templates. I have four of them:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/museums/blob/c81e8ec9f39d87f13481608832c94b8e824fd347/templates/index.html"&gt;index.html&lt;/a&gt; is the template used for the homepage, and for the page you see when you search for museums near a specific latitude and longitude.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/museums/blob/c81e8ec9f39d87f13481608832c94b8e824fd347/templates/row-browse-museums.html"&gt;row-browse-museums.html&lt;/a&gt; is the template used for the &lt;a href="https://www.niche-museums.com/browse/museums/43"&gt;individual museum pages&lt;/a&gt;. It includes the JavaScript used for the map (which is powered by &lt;a href="https://leafletjs.com/"&gt;Leaflet&lt;/a&gt; and uses &lt;a href="https://foundation.wikimedia.org/wiki/Maps_Terms_of_Use"&gt;Wikimedia&amp;#39;s OpenStreetMap tiles&lt;/a&gt;, which I discovered thanks to &lt;a href="https://observablehq.com/@tmcw/leaflet"&gt;this Observable notebook&lt;/a&gt; by Tom MacWright).&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/museums/blob/c81e8ec9f39d87f13481608832c94b8e824fd347/templates/_museum_card.html"&gt;_museum_card.html&lt;/a&gt; is an included template rendering a card for a museum, shared by the index and museum pages.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/museums/blob/c81e8ec9f39d87f13481608832c94b8e824fd347/templates/database-about.html"&gt;database-about.html&lt;/a&gt; is the template for &lt;a href="https://www.niche-museums.com/about"&gt;the about page&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The about page uses a particularly devious hack.&lt;/p&gt;
&lt;p&gt;Datasette doesn&amp;#39;t have an easy way to create additional custom pages with URLs at the moment (without abusing the &lt;a href="https://datasette.readthedocs.io/en/stable/plugins.html#asgi-wrapper-datasette"&gt;asgi_wrapper()&lt;/a&gt; hook, which is pretty low-level).&lt;/p&gt;
&lt;p&gt;But... every attached database gets its own URL at &lt;code&gt;/database-name&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;So, to create the &lt;code&gt;/about&lt;/code&gt; page I create an empty database called &lt;code&gt;about.db&lt;/code&gt; using the &lt;code&gt;sqlite3 about.db &amp;quot;&amp;quot;&lt;/code&gt; command. I serve that using Datasette, then create a custom template for that specific database using Datasette&amp;#39;s &lt;a href="https://datasette.readthedocs.io/en/0.32/custom_templates.html#custom-templates"&gt;template naming conventions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I&amp;#39;ll probably come up with a less grotesque way of doing this and bake it into Datasette in the future. For the moment this seems to work pretty well.&lt;/p&gt;
&lt;h3 id="plugins"&gt;Plugins&lt;/h3&gt;
&lt;p&gt;The two key plugins here are &lt;code&gt;datasette-haversine&lt;/code&gt; and &lt;code&gt;datasette-template-sql&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/datasette-haversine"&gt;datasette-haversine&lt;/a&gt; adds a custom SQL function to Datasette called &lt;code&gt;haversine()&lt;/code&gt;, which calculates the haversine distance between two latitude/longitude points.&lt;/p&gt;
&lt;p&gt;It&amp;#39;s used by the SQL query which finds the nearest museums to the user.&lt;/p&gt;
&lt;p&gt;This is very inefficient - it&amp;#39;s essentially a brute-force approach which calculates that distance for every museum in the database and sorts them accordingly - but it will be years before I have enough museums listed for that to cause any kind of performance issue.&lt;/p&gt;
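That brute-force query is easy to sketch outside Datasette by registering a Python function with SQLite. The four-argument `haversine(lat1, lon1, lat2, lon2)` signature is an assumption mirroring the plugin, and the schema and coordinates are illustrative:

```python
import math
import sqlite3

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 6371 * 2 * math.asin(math.sqrt(a))

conn = sqlite3.connect(":memory:")
conn.create_function("haversine", 4, haversine)
conn.execute("create table museums (name text, latitude real, longitude real)")
conn.executemany(
    "insert into museums values (?, ?, ?)",
    [
        ("Cable Car Museum", 37.7947, -122.4117),
        ("Audium", 37.7885, -122.4239),
        ("Recoleta Cemetery", -34.5875, -58.3933),
    ],
)
# Brute force: compute the distance for every row, then sort by it
nearest = conn.execute(
    """
    select name, haversine(latitude, longitude, :lat, :lon) as distance_km
    from museums order by distance_km limit 2
    """,
    {"lat": 37.77, "lon": -122.458},
).fetchall()
print(nearest)
```

With no index to help, SQLite evaluates the function once per row - fine for dozens of museums, worth revisiting at tens of thousands.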
&lt;p&gt;&lt;a href="https://github.com/simonw/datasette-template-sql"&gt;datasette-template-sql&lt;/a&gt; is the new plugin I &lt;a href="https://simonwillison.net/2019/Nov/18/datasette-template-sql/"&gt;described last week&lt;/a&gt;, made possible by Datasette dropping Python 3.5 support. It allows SQL queries to be executed directly from templates. I&amp;#39;m using it here to &lt;a href="https://github.com/simonw/museums/blob/c81e8ec9f39d87f13481608832c94b8e824fd347/templates/index.html#L58-L69"&gt;run the queries&lt;/a&gt; that power homepage.&lt;/p&gt;
&lt;p&gt;I tried to get the site working just using code in the templates, but it got pretty messy. Instead, I took advantage of Datasette&amp;#39;s &lt;code&gt;--plugins-dir&lt;/code&gt; option, which causes Datasette to treat all Python modules in a specific directory as plugins and attempt to load them.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/museums/blob/c81e8ec9f39d87f13481608832c94b8e824fd347/plugins/index_vars.py"&gt;index_vars.py&lt;/a&gt; is a single custom plugin that I&amp;#39;m bundling with the site. It uses the &lt;a href="https://datasette.readthedocs.io/en/0.32/plugins.html#extra-template-vars-template-database-table-view-name-request-datasette"&gt;extra_template_vars()&lt;/a&gt; plugin took to detect requests to the &lt;code&gt;index&lt;/code&gt; page and inject some additional custom template variables based on values read from the querystring.&lt;/p&gt;
&lt;p&gt;This ends up acting a little bit like a custom Django view function. It&amp;#39;s a slightly weird pattern but again it does the job - and helps me further explore the potential of Datasette as a tool for powering websites in addition to just providing an API.&lt;/p&gt;
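The shape of that hook can be sketched as a plain function. In a real plugin it would be decorated with Datasette's `@hookimpl` and receive Datasette's actual request object; the dict-based request and the variable names below are stand-ins, not taken from index_vars.py:

```python
# Sketch of an extra_template_vars()-style hook as a plain function.
# A real plugin registers this with Datasette's @hookimpl decorator and
# gets a real request object; both are simulated here for illustration.

def extra_template_vars(view_name, request):
    # Only inject variables on the index page
    if view_name != "index":
        return {}
    args = request.get("args", {})
    return {
        "latitude": args.get("latitude"),
        "longitude": args.get("longitude"),
        "searching": "latitude" in args and "longitude" in args,
    }

# Simulated request for /?latitude=37.77&longitude=-122.458
fake_request = {"args": {"latitude": "37.77", "longitude": "-122.458"}}
print(extra_template_vars("index", fake_request))
```

The returned dict is merged into the template context, which is what makes the template behave a bit like a view function fed by the querystring.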
&lt;h2 id="weeknotes"&gt;Weeknotes&lt;/h2&gt;
&lt;p&gt;This post is standing in for my regular weeknotes, because it represents most of what I achieved this last week. A few other bits and pieces:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I&amp;#39;ve been exploring ways to enable CSV upload directly into a Datasette instance. I&amp;#39;m building a prototype of this on top of &lt;a href="https://www.starlette.io/"&gt;Starlette&lt;/a&gt;, because it has solid ASGI &lt;a href="https://www.starlette.io/requests/#request-files"&gt;file upload support&lt;/a&gt;. This is currently a standalone web application but I&amp;#39;ll probably make it work as a Datasette ASGI plugin once I have something I like.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://sixcolors.com/post/2019/09/13-features-of-ios-13-shortcuts/"&gt;Shortcuts in iOS 13&lt;/a&gt; got some very interesting new features, most importantly the ability to trigger shortcuts automatically on specific actions - including every time you open a specific app. I&amp;#39;ve been experimenting with using this to automatically copy data from my iPhone up to a custom web application - maybe this could help ingest notes and photos into &lt;a href="https://simonwillison.net/2019/Oct/7/dogsheep/"&gt;Dogsheep&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Posted seven new museums to niche-museums.com: &lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/39"&gt;Cable Car Museum&lt;/a&gt; in San Francisco&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/40"&gt;Audium&lt;/a&gt; in San Francisco&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/41"&gt;House of Broel Dollhouse Museum&lt;/a&gt; in New Orleans&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/43"&gt;Neptune Society Columbarium&lt;/a&gt; in San Francisco&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/44"&gt;Recoleta Cemetery&lt;/a&gt; in Buenos Aires&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/45"&gt;NASA Glenn Visitor Center&lt;/a&gt; in Cleveland&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/46"&gt;Conservatory of Flowers&lt;/a&gt; in San Francisco&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;I composed &lt;a href="https://www.niche-museums.com/browse?sql=select+json_object%28%22pre%22%2C+group_concat%28%27*+%5B%27+%7C%7C+name+%7C%7C+%27%5D%28https%3A%2F%2Fwww.niche-museums.com%2Fbrowse%2Fmuseums%2F%27+%7C%7C+id+%7C%7C++%2B+%27%29+in+%27+%7C%7C+coalesce%28osm_city%2C+osm_county%2C+osm_state%2C+osm_country%2C+%27%27%29%2C+%27%0D%0A%27%29%29+from+%28select+*+from+%28select+*+from+museums+order+by+id+desc+limit+7%29+order+by+id%29%3B"&gt;devious SQL query&lt;/a&gt; for generating the markdown for the seven most recently added museums.&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/museums"&gt;museums&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/baked-data"&gt;baked-data&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="museums"/><category term="projects"/><category term="yaml"/><category term="datasette"/><category term="weeknotes"/><category term="baked-data"/></entry><entry><title>Analyzing US Election Russian Facebook Ads</title><link href="https://simonwillison.net/2018/Aug/6/russian-facebook-ads/#atom-tag" rel="alternate"/><published>2018-08-06T16:01:18+00:00</published><updated>2018-08-06T16:01:18+00:00</updated><id>https://simonwillison.net/2018/Aug/6/russian-facebook-ads/#atom-tag</id><summary type="html">
    &lt;p&gt;Two interesting data sources have emerged in the past few weeks concerning the Russian impact on the 2016 US elections.&lt;/p&gt;
&lt;p&gt;FiveThirtyEight &lt;a href="https://fivethirtyeight.com/features/why-were-sharing-3-million-russian-troll-tweets/"&gt;published nearly 3 million tweets&lt;/a&gt; from accounts associated with the Russian “Internet Research Agency” - see &lt;a href="https://simonwillison.net/2018/Aug/6/troll-tweets/"&gt;my article and searchable tweet archive here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Separately, the House Intelligence Committee Minority &lt;a href="https://democrats-intelligence.house.gov/social-media-content/"&gt;released 3,517 Facebook ads&lt;/a&gt; that were reported to have been bought by the Russian Internet Research Agency as a set of redacted PDF files.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Exploring_the_Russian_Facebook_Ad_spend_18"&gt;&lt;/a&gt;Exploring the Russian Facebook Ad spend&lt;/h3&gt;
&lt;p&gt;The initial data was released as &lt;a href="https://democrats-intelligence.house.gov/social-media-content/social-media-advertisements.htm"&gt;zip files full of PDFs&lt;/a&gt;, one of the least friendly formats you can use to publish data.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://twitter.com/edsu"&gt;Ed Summers&lt;/a&gt; took on the intimidating task of cleaning that up. &lt;a href="https://github.com/edsu/irads"&gt;His results are incredible&lt;/a&gt;: he used the &lt;a href="https://pypi.org/project/pytesseract/"&gt;pytesseract OCR library&lt;/a&gt; and &lt;a href="https://pypi.org/project/PyPDF2/"&gt;PyPDF2&lt;/a&gt; to extract both the images and the associated metadata and convert the whole lot into a single 3.9MB JSON file.&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette"&gt;wrote some code&lt;/a&gt; to convert his JSON file to SQLite (more on the details later) and the result can be found here:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://russian-ira-facebook-ads.datasettes.com/"&gt;https://russian-ira-facebook-ads.datasettes.com/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here’s an &lt;a href="https://russian-ira-facebook-ads.datasettes.com/russian-ads-919cbfd/display_ads?_search=cops&amp;amp;_sort_desc=spend_usd"&gt;example search for “cops” ordered by the USD equivalent spent on the ad&lt;/a&gt; (some of the spends are in rubles, so I convert those to USD using today’s exchange rate of 0.016).&lt;/p&gt;
&lt;p&gt;&lt;img style="max-width: 100%" src="https://static.simonwillison.net/static/2018/ads-cops-sorted-by-usd.png" alt="Search ads for cops, order by USD descending" /&gt;&lt;/p&gt;
&lt;p&gt;One of the most interesting things about this data is that it includes the Facebook ad targeting options that were used to promote the ads. I’ve built a separate interface for browsing those - you can see &lt;a href="https://russian-ira-facebook-ads.datasettes.com/russian-ads-919cbfd/top_targets"&gt;the most frequently applied targets&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img style="max-width: 100%" src="https://static.simonwillison.net/static/2018/top-targets.png" alt="Top targets" /&gt;&lt;/p&gt;
&lt;p&gt;And by browsing &lt;a href="https://russian-ira-facebook-ads.datasettes.com/russian-ads-919cbfd/faceted-targets?targets=%5B%22d6ade%22%5D"&gt;through the different facets&lt;/a&gt; you can construct e.g. a search for all ads that targeted people interested in both &lt;code&gt;interests:Martin Luther King&lt;/code&gt; and  &lt;code&gt;interests:Police Brutality is a Crime&lt;/code&gt;: &lt;a href="https://russian-ira-facebook-ads.datasettes.com/russian-ads-919cbfd/display_ads?_targets_json=%5B%22d6ade%22%2C%2240c27%22%5D"&gt;https://russian-ira-facebook-ads.datasettes.com/russian-ads-919cbfd/display_ads?_targets_json=[&amp;quot;d6ade&amp;quot;%2C&amp;quot;40c27&amp;quot;]&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;&lt;a id="New_tooling_under_the_hood_40"&gt;&lt;/a&gt;New tooling under the hood&lt;/h3&gt;
&lt;p&gt;I ended up spinning up several new projects to help process and explore this data.&lt;/p&gt;
&lt;h4&gt;&lt;a id="sqliteutils_44"&gt;&lt;/a&gt;sqlite-utils&lt;/h4&gt;
&lt;p&gt;The first is a new library called &lt;a href="https://sqlite-utils.readthedocs.io/en/latest/"&gt;sqlite-utils&lt;/a&gt;. If data is already in CSV I tend to convert it using csvs-to-sqlite, but if data is in a less tabular format (JSON or XML for example) I have to hand-write code. Here’s &lt;a href="https://github.com/simonw/register-of-members-interests/blob/2baf75956b8b9e93a3985ebeb2259f7f2af760c8/convert_xml_to_sqlite.py"&gt;a script&lt;/a&gt; I wrote to process the XML version of &lt;a href="https://simonwillison.net/2018/Apr/25/register-members-interests/"&gt;the UK Register of Members’ Interests&lt;/a&gt;, for example.&lt;/p&gt;
&lt;p&gt;My goal with sqlite-utils is to take some of the common patterns from those scripts and make them as easy to use as possible, in particular when running inside a Jupyter notebook. It’s still very early, but &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette/blob/336ba87ef8071e664441ad0a95e3b8d0a33f682a/fetch_and_build_russian_ads.py"&gt;the script I wrote&lt;/a&gt; to process the Russian ads JSON is a good example of the kind of thing I want to do with it.&lt;/p&gt;
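&lt;p&gt;For illustration, this is the kind of hand-written JSON-to-SQLite boilerplate that sqlite-utils is meant to collapse into something closer to a single insert_all() call (the records here are invented, not taken from the ads data):&lt;/p&gt;

```python
# Hand-rolled JSON -> SQLite loading, the pattern sqlite-utils abstracts.
# The records are invented for illustration.
import json
import sqlite3

records = json.loads(
    '[{"id": 1, "text": "ad one", "spend_usd": 120.5},'
    ' {"id": 2, "text": "ad two", "spend_usd": 48.0}]'
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ads (id integer primary key, text text, spend_usd real)"
)
# executemany accepts dicts when the SQL uses :named placeholders
conn.executemany(
    "insert into ads (id, text, spend_usd) values (:id, :text, :spend_usd)",
    records,
)
total = conn.execute("select sum(spend_usd) from ads").fetchone()[0]
```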
&lt;h4&gt;&lt;a id="datasettejsonhtml_50"&gt;&lt;/a&gt;datasette-json-html&lt;/h4&gt;
&lt;p&gt;The second new tool is a new Datasette plugin (and &lt;a href="https://github.com/simonw/datasette/issues/352"&gt;corresponding plugin hook&lt;/a&gt;) called &lt;a href="https://github.com/simonw/datasette-json-html"&gt;datasette-json-html&lt;/a&gt;. I used this to solve the need to display both rendered images and customized links as part of the regular Datasette instance.&lt;/p&gt;
&lt;p&gt;It’s a pretty crazy solution (hence why it’s implemented as a plugin and not part of Datasette core) but it works surprisingly well. The basic idea is to support a mini JSON language which can be detected and rendered as HTML. A couple of examples:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &amp;quot;img_src&amp;quot;: &amp;quot;https://raw.githubusercontent.com/edsu/irads/03fb4b/site/images/0771.png&amp;quot;,
  &amp;quot;width&amp;quot;: 200
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Is rendered as an HTML &lt;code&gt;&amp;lt;img src=&amp;quot;&amp;quot;&amp;gt;&lt;/code&gt; element.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[
  {
    &amp;quot;label&amp;quot;: &amp;quot;location:United States&amp;quot;,
    &amp;quot;href&amp;quot;: &amp;quot;/russian-ads/display_ads?_target=ec3ac&amp;quot;
  },
  {
    &amp;quot;label&amp;quot;: &amp;quot;interests:Martin Luther King&amp;quot;,
    &amp;quot;href&amp;quot;: &amp;quot;/russian-ads/display_ads?_target=d6ade&amp;quot;
  },
  {
    &amp;quot;label&amp;quot;: &amp;quot;interests:Jr.&amp;quot;,
    &amp;quot;href&amp;quot;: &amp;quot;/russian-ads/display_ads?_target=8e7b3&amp;quot;
  }
]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Is rendered as a comma-separated list of HTML links.&lt;/p&gt;
&lt;p&gt;Why use JSON for this? Because SQLite has some &lt;a href="https://www.sqlite.org/json1.html"&gt;incredibly powerful JSON features&lt;/a&gt;, making it trivial to output JSON as part of the result of a SQL query. Most interesting of all, it has &lt;code&gt;json_group_array()&lt;/code&gt; which can work as an aggregation function to combine a set of related rows into a single JSON array.&lt;/p&gt;
&lt;p&gt;The display_ads page shown above is powered by a SQL view. Here’s the relevant subset of that view:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;select ads.id,
    case when image is not null then
        json_object(&amp;quot;img_src&amp;quot;, &amp;quot;https://raw.githubusercontent.com/edsu/irads/03fb4b/site/&amp;quot; || image, &amp;quot;width&amp;quot;, 200)
    else
        &amp;quot;no image&amp;quot;
    end as img,
    json_group_array(
        json_object(
            &amp;quot;label&amp;quot;, targets.name,
            &amp;quot;href&amp;quot;, &amp;quot;/russian-ads/display_ads?_target=&amp;quot;
                || urllib_quote_plus(targets.id)
        )
    ) as targeting
from ads
    join ad_targets on ads.id = ad_targets.ad_id
    join targets on ad_targets.target_id = targets.id
group by ads.id limit 10
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I’m using SQLite’s JSON functions to dynamically assemble the JSON format that datasette-json-html knows how to render. I’m delighted at how well it works.&lt;/p&gt;
&lt;p&gt;I’ve turned off arbitrary SQL querying against the main Facebook ads Datasette instance, but there’s a copy running at &lt;a href="https://russian-ira-facebook-ads-sql-allowed.now.sh/russian-ads"&gt;https://russian-ira-facebook-ads-sql-allowed.now.sh/russian-ads&lt;/a&gt; if you want to play with these queries.&lt;/p&gt;
&lt;h4&gt;&lt;a id="Weird_implementation_details_106"&gt;&lt;/a&gt;Weird implementation details&lt;/h4&gt;
&lt;p&gt;The full source code for my implementation &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette"&gt;is available on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I ended up using &lt;a href="https://github.com/simonw/datasette/commit/5116c4ec8aed5091e1f75415424b80f613518dc6"&gt;an experimental plugin hook&lt;/a&gt; to enable additional custom filtering on Datasette views in order to support showing ads against multiple m2m targets, but hopefully that will be made unnecessary as work on Datasette’s &lt;a href="https://github.com/simonw/datasette/issues/354"&gt;support for m2m relationships&lt;/a&gt; progresses.&lt;/p&gt;
&lt;p&gt;I also experimented with YAML to generate the &lt;code&gt;metadata.json&lt;/code&gt; file as JSON strings aren’t a great way of &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette/blob/336ba87ef8071e664441ad0a95e3b8d0a33f682a/russian-ads-metadata.yaml"&gt;representing multi-line HTML and SQL&lt;/a&gt;. And if you want to see some &lt;em&gt;really&lt;/em&gt; convoluted SQL have a look at how the &lt;a href="https://github.com/simonw/russian-ira-facebook-ads-datasette/blob/336ba87ef8071e664441ad0a95e3b8d0a33f682a/russian-ads-metadata.yaml#L52-L81"&gt;canned query&lt;/a&gt; for the &lt;a href="https://russian-ira-facebook-ads.datasettes.com/russian-ads-919cbfd/faceted-targets?targets=%5B%22371f0%22%2C%22cc5ed%22%5D"&gt;faceted targeting interface&lt;/a&gt; works.&lt;/p&gt;
&lt;p&gt;This was a really fun project, which further stretched my ideas about what Datasette should be capable of out of the box. I’m hoping that the &lt;a href="https://github.com/simonw/datasette/issues/354"&gt;m2m work&lt;/a&gt; will make a lot of these crazy hacks redundant.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/politics"&gt;politics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="politics"/><category term="projects"/><category term="yaml"/><category term="datasette"/><category term="sqlite-utils"/></entry><entry><title>twitter-text-conformance</title><link href="https://simonwillison.net/2010/Feb/6/twitter/#atom-tag" rel="alternate"/><published>2010-02-06T15:39:27+00:00</published><updated>2010-02-06T15:39:27+00:00</updated><id>https://simonwillison.net/2010/Feb/6/twitter/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://github.com/mzsanford/twitter-text-conformance"&gt;twitter-text-conformance&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This is a neat idea: Twitter have released open source libraries for parsing standard tweet syntax in Ruby and Java, but they’ve also released a set of YAML unit tests aimed at anyone who wants to implement the same parsing logic in other languages.
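In spirit, consuming such a suite looks something like this sketch - the test cases and the extract_mentions() parser here are invented stand-ins (the real YAML suite covers mentions, hashtags, URLs and more):

```python
# Sketch of a conformance-suite harness: each case pairs an input tweet
# with the expected extraction result, and we assert our own parser
# against every case. Cases and parser are invented for illustration;
# the real suite ships the cases as YAML files.
import re

cases = [
    {"text": "@alice hello", "expected": ["alice"]},
    {"text": "no mentions here", "expected": []},
]

def extract_mentions(text):
    # Deliberately naive @mention extraction
    return re.findall(r"@(\w+)", text)

results = [extract_mentions(c["text"]) == c["expected"] for c in cases]
```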

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://engineering.twitter.com/2010/02/introducing-open-source-twitter-text.html"&gt;Twitter Engineering Blog&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/java"&gt;java&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ruby"&gt;ruby&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/conformance-suites"&gt;conformance-suites&lt;/a&gt;&lt;/p&gt;



</summary><category term="java"/><category term="ruby"/><category term="testing"/><category term="twitter"/><category term="yaml"/><category term="conformance-suites"/></entry><entry><title>More YAML</title><link href="https://simonwillison.net/2003/Feb/5/moreYaml/#atom-tag" rel="alternate"/><published>2003-02-05T23:49:43+00:00</published><updated>2003-02-05T23:49:43+00:00</updated><id>https://simonwillison.net/2003/Feb/5/moreYaml/#atom-tag</id><summary type="html">
    &lt;p&gt;Paul Tchistopolskii's &lt;a href="http://www.pault.com/pault/pxml/xmlalternatives.html"&gt;XML Alternatives&lt;/a&gt; reminded me to take another look at &lt;a href="http://www.yaml.org/" title="YAML Ain&amp;apos;t Markup Language"&gt;YAML&lt;/a&gt;. The specification has been updated since &lt;a href="/2002/Dec/05/yaml/"&gt;I last looked&lt;/a&gt; and seems to be a bit more complicated, but it's still a very nicely designed format. Implementations are available for Perl, Python and Ruby with C and Java on the way, but strangely no one seems to be doing one for &lt;acronym title="PHP: Hypertext Preprocessor"&gt;PHP&lt;/acronym&gt; yet. I'm doing a course at Uni on compilers at the moment which includes quite a lot of stuff about writing parsers, so I'm very tempted to have a go at a YAML implementation in the next few weeks just to try stuff out. The possibility of easily swapping relatively complex data structures between &lt;acronym title="PHP: Hypertext Preprocessor"&gt;PHP&lt;/acronym&gt; and Python is pretty tempting as well.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/xml"&gt;xml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="xml"/><category term="yaml"/></entry><entry><title>YAML</title><link href="https://simonwillison.net/2002/Dec/5/yaml/#atom-tag" rel="alternate"/><published>2002-12-05T02:49:08+00:00</published><updated>2002-12-05T02:49:08+00:00</updated><id>https://simonwillison.net/2002/Dec/5/yaml/#atom-tag</id><summary type="html">
    &lt;p&gt;I forget quite how I got there, but the other day I found myself reading about &lt;acronym title="YAML Ain&amp;apos;t Markup Language"&gt;YAML&lt;/acronym&gt; - &lt;a href="http://www.yaml.org/"&gt;YAML Ain't Markup Language&lt;/a&gt;. It looks really interesting. YAML aims to be an easily human readable format for storing and transferring structured data - so far, so &lt;acronym title="eXtensible Markup Language"&gt;XML&lt;/acronym&gt;. Where it differs from the &lt;acronym title="Information Technology"&gt;IT&lt;/acronym&gt; world's favourite buzzword is that YAML is specifically designed to handle the three most common data structures - scalars (single values), lists and dictionaries. Here's a sample (taken from the &lt;a href="http://www.yaml.org/spec/" title="YAML Ain&amp;apos;t Markup Language"&gt;official specification&lt;/a&gt;):&lt;/p&gt;
&lt;pre&gt;
Time: 2001-11-23 15:01:42 -05:00
User: ed
Warning: &amp;gt;
  This is an error message
  for the log file
&lt;/pre&gt;
&lt;p&gt;YAML has a number of obvious influences, including Python and &lt;acronym title="Multipurpose Internet Mail Extensions"&gt;MIME&lt;/acronym&gt;. Implementations already exist for &lt;a href="http://wiki.yaml.org/yamlwiki/YamlPm" title="YamlPm"&gt;Perl&lt;/a&gt;, &lt;a href="http://wiki.yaml.org/yamlwiki/PurePythonParserForYaml" title="PurePythonParserForYaml"&gt;Python&lt;/a&gt; and &lt;a href="http://helide.com/g/yaml/" title="A YAML parser written in Java (work in progress)"&gt;Java&lt;/a&gt;. &lt;acronym title="eXtensible Markup Language - Remote Procedure Calls"&gt;XML-RPC&lt;/acronym&gt; aptly demonstrates how powerful the combination of lists, dictionaries and arrays can be for exchanging data between different systems and YAML looks like it offers a very nice alternative to XML based data structure syntax. I have to admit to being slightly concerned by the length of the specification - while YAML is definitely human readable it looks like it could take a while for a human to learn to write it. Then again, the actual generation of the format is meant to be handled by computers (I imagine that humans will make simple edits to YAML files more often than they create them from scratch) so the complexity of the more advanced parts of the specification is probably not too much of a problem.&lt;/p&gt;
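&lt;p&gt;To make that concrete, the sample above parses to an ordinary dictionary of scalars - the &amp;gt; folded-block indicator joins the wrapped lines with a space and keeps a trailing newline. Built by hand here rather than via a YAML library:&lt;/p&gt;

```python
# Hand-built Python equivalent of the YAML sample above (no YAML
# library used). The ">" folded style folds the two indented lines
# into one space-joined string ending in a single newline.
parsed = {
    "Time": "2001-11-23 15:01:42 -05:00",
    "User": "ed",
    "Warning": "This is an error message for the log file\n",
}
```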
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/markup"&gt;markup&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="markup"/><category term="yaml"/></entry></feed>