<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: inaturalist</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/inaturalist.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-05-15T23:53:11+00:00</updated><author><name>Simon Willison</name></author><entry><title>inaturalist-clumper 0.1</title><link href="https://simonwillison.net/2026/May/15/inaturalist-clumper/#atom-tag" rel="alternate"/><published>2026-05-15T23:53:11+00:00</published><updated>2026-05-15T23:53:11+00:00</updated><id>https://simonwillison.net/2026/May/15/inaturalist-clumper/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/inaturalist-clumper/releases/tag/0.1"&gt;inaturalist-clumper 0.1&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;Part of the infrastructure I use for &lt;a href="https://simonwillison.net/2026/May/1/inat-sightings/"&gt;publishing my iNaturalist sightings on my blog&lt;/a&gt;. I've been running this in production for a few weeks now, inspiring some iterations on how it works, so I decided to ship a 0.1 release.&lt;/p&gt;
&lt;p&gt;You can see an example of the output &lt;a href="https://github.com/simonw/inaturalist-clumps/blob/main/clumps.json"&gt;in this JSON file&lt;/a&gt;.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="inaturalist"/></entry><entry><title>Sightings</title><link href="https://simonwillison.net/2026/May/2/sightings/#atom-tag" rel="alternate"/><published>2026-05-02T17:26:40+00:00</published><updated>2026-05-02T17:26:40+00:00</updated><id>https://simonwillison.net/2026/May/2/sightings/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://simonwillison.net/elsewhere/sighting/"&gt;/elsewhere/sightings/&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I have a new camera (a Canon R6 Mark II) so I'm taking a lot more photos of birds. I share my best wildlife photos on &lt;a href="https://www.inaturalist.org/"&gt;iNaturalist&lt;/a&gt;, and based on yesterday's &lt;a href="https://simonwillison.net/2026/May/1/inat-sightings/"&gt;successful prototype&lt;/a&gt; I decided to add those to my blog.&lt;/p&gt;
&lt;p&gt;&lt;img class="blogmark-image" src="https://static.simonwillison.net/static/2026/beats-sightings.jpeg" alt="Screenshot of a &amp;quot;Sightings&amp;quot; webpage with a search bar and RSS icon, showing &amp;quot;Filters: Sorted by date&amp;quot; and &amp;quot;208 results page 1 / 7 next » last »»&amp;quot;. First entry: SIGHTING 7:51 PM — Acorn Woodpecker, with two photos labeled &amp;quot;Acorn Woodpecker&amp;quot; of black and white woodpeckers with red caps on tree branches, dated 2nd May 2026. Second entry: SIGHTING 10:08 AM – 11:17 AM — Acorn Woodpecker, Western Fence Lizard, Osprey, with three photos labeled &amp;quot;Acorn Woodpecker&amp;quot; (bird on bare branches against blue sky), &amp;quot;Wester...&amp;quot; (lizard on tree bark), and &amp;quot;Osprey&amp;quot; (nest on a utility pole), dated 1st May 2026. Third entry: SIGHTING 11:11 AM — White-crowned Sparrow, with a photo labeled &amp;quot;White-crowned Sparrow&amp;quot; of a sparrow with black and white striped head singing with open beak, dated 30th Apr 2026."&gt;&lt;/p&gt;
&lt;p&gt;I built this feature on my phone using Claude Code for web, as an extension of my &lt;a href="https://simonwillison.net/2026/Feb/20/beats/"&gt;beats system&lt;/a&gt; for syndicating external content. Here's &lt;a href="https://github.com/simonw/simonwillisonblog/pull/668"&gt;the PR&lt;/a&gt; and prompt.&lt;/p&gt;
&lt;p&gt;As with my other forms of incoming syndicated content, sightings show up on the homepage, the date archive pages, and in site search results.&lt;/p&gt;
&lt;p&gt;I back-populated over a decade of iNaturalist sightings, which means that if you &lt;a href="https://simonwillison.net/search/?q=lemur"&gt;search for lemur&lt;/a&gt; you'll see my lemur photos from Madagascar in 2019!&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/blogging"&gt;blogging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/photography"&gt;photography&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/wildlife"&gt;wildlife&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="blogging"/><category term="photography"/><category term="wildlife"/><category term="ai"/><category term="inaturalist"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude-code"/></entry><entry><title>iNaturalist Sightings</title><link href="https://simonwillison.net/2026/May/1/inat-sightings/#atom-tag" rel="alternate"/><published>2026-05-01T19:35:41+00:00</published><updated>2026-05-01T19:35:41+00:00</updated><id>https://simonwillison.net/2026/May/1/inat-sightings/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/inat-sightings"&gt;iNaturalist Sightings&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;I wanted to see my &lt;a href="https://www.inaturalist.org"&gt;iNaturalist&lt;/a&gt; observations - across two separate accounts - grouped by when they occurred. I'm camping this weekend so I built this entirely on my phone using Claude Code for web.&lt;/p&gt;
&lt;p&gt;I started by building an &lt;a href="https://github.com/simonw/inaturalist-clumper"&gt;inaturalist-clumper&lt;/a&gt; Python CLI for fetching and "clumping" observations - by default clumps use observations within 2 hours and 5km of each other.&lt;/p&gt;
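&lt;p&gt;To make the clumping idea concrete, here is a minimal Python sketch. It is not the code from &lt;code&gt;inaturalist-clumper&lt;/code&gt;: the &lt;code&gt;clump()&lt;/code&gt; and &lt;code&gt;haversine_km()&lt;/code&gt; functions, the &lt;code&gt;time&lt;/code&gt;, &lt;code&gt;lat&lt;/code&gt; and &lt;code&gt;lon&lt;/code&gt; keys, and the rule of comparing each observation with the previous one are all assumptions made for illustration. Only the 2 hour and 5km thresholds come from the description above.&lt;/p&gt;

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def clump(observations, max_gap=timedelta(hours=2), max_km=5.0):
    """Group observations (hypothetical dicts with time/lat/lon keys) into clumps.

    Observations are sorted by time; an observation joins the current clump
    only if it is close to the previous observation in both time and space.
    """
    clumps = []
    for obs in sorted(observations, key=lambda o: o["time"]):
        if clumps:
            prev = clumps[-1][-1]
            close_in_time = max_gap >= obs["time"] - prev["time"]
            close_in_space = max_km >= haversine_km(
                prev["lat"], prev["lon"], obs["lat"], obs["lon"]
            )
            if close_in_time and close_in_space:
                clumps[-1].append(obs)
                continue
        # Too far away in time or distance (or first item): start a new clump
        clumps.append([obs])
    return clumps
```

&lt;p&gt;Comparing against the previous observation means a long walk can stay in one clump as long as each step is small; comparing against the first observation in the clump would be a stricter alternative.&lt;/p&gt;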
&lt;p&gt;Then I set up &lt;a href="https://github.com/simonw/inaturalist-clumps"&gt;simonw/inaturalist-clumps&lt;/a&gt; as a &lt;a href="https://simonwillison.net/series/git-scraping/"&gt;Git scraping&lt;/a&gt; repository to run that tool and record the result to &lt;a href="https://github.com/simonw/inaturalist-clumps/blob/main/clumps.json"&gt;clumps.json&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That JSON file is hosted on GitHub, which means it can be fetched by JavaScript using CORS.&lt;/p&gt;
&lt;p&gt;Finally I ran this prompt against my &lt;a href="https://github.com/simonw/tools"&gt;simonw/tools&lt;/a&gt; repo:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Build inat-sightings.html - an app that does a fetch() against https://raw.githubusercontent.com/simonw/inaturalist-clumps/refs/heads/main/clumps.json and then displays all of the observations on one page using the https://static.inaturalist.org/photos/538073008/small.jpg small.jpg URLs for the thumbnails - with loading=lazy - but when a thumbnail is clicked showing the large.jpg in an HTML modal. Both small and large should include the common species names if available&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="tools"/><category term="ai"/><category term="inaturalist"/><category term="generative-ai"/><category term="llms"/><category term="claude-code"/></entry><entry><title>Building and deploying a custom site using GitHub Actions and GitHub Pages</title><link href="https://simonwillison.net/2025/Mar/18/actions-pages/#atom-tag" rel="alternate"/><published>2025-03-18T20:17:34+00:00</published><updated>2025-03-18T20:17:34+00:00</updated><id>https://simonwillison.net/2025/Mar/18/actions-pages/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/github-actions/github-pages"&gt;Building and deploying a custom site using GitHub Actions and GitHub Pages&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I figured out a minimal example of how to use GitHub Actions to run custom scripts to build a website and then publish that static site to GitHub Pages. I turned &lt;a href="https://github.com/simonw/minimal-github-pages-from-actions/"&gt;the example&lt;/a&gt; into a template repository, which should make getting started for a new project extremely quick.&lt;/p&gt;
&lt;p&gt;I've needed this for various projects over the years, but today I finally put these notes together while setting up &lt;a href="https://github.com/simonw/recent-california-brown-pelicans"&gt;a system&lt;/a&gt; for scraping the &lt;a href="https://www.inaturalist.org/"&gt;iNaturalist&lt;/a&gt; API for recent sightings of the California Brown Pelican and converting those into an Atom feed that I can subscribe to in &lt;a href="https://netnewswire.com/"&gt;NetNewsWire&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a Brown Pelican sighting Atom feed in NetNewsWire showing a list of entries on the left sidebar and detailed view of &amp;quot;Brown Pelican at Art Museum, Isla Vista, CA 93117, USA&amp;quot; on the right with date &amp;quot;MAR 13, 2025 AT 10:40 AM&amp;quot;, coordinates &amp;quot;34.4115542997, -119.8500448&amp;quot;, and a photo of three brown pelicans in water near a dock with copyright text &amp;quot;(c) Ery, all rights reserved&amp;quot;" src="https://static.simonwillison.net/static/2025/pelicans-netnewswire.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;I got Claude &lt;a href="https://claude.ai/share/533a1d59-60db-4686-bd50-679dd01a585e"&gt;to write&lt;/a&gt; me &lt;a href="https://github.com/simonw/recent-california-brown-pelicans/blob/81f87b378b6626e97eeca0719e89c87ace141816/to_atom.py"&gt;the script&lt;/a&gt; that converts the scraped JSON to Atom.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: I just &lt;a href="https://sfba.social/@kueda/114185945871929778"&gt;found out&lt;/a&gt; iNaturalist have their own Atom feeds! Here's their own &lt;a href="https://www.inaturalist.org/observations.atom?verifiable=true&amp;amp;taxon_id=123829"&gt;feed of recent Pelican observations&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/atom"&gt;atom&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/netnewswire"&gt;netnewswire&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;&lt;/p&gt;



</summary><category term="atom"/><category term="github"/><category term="netnewswire"/><category term="inaturalist"/><category term="github-actions"/><category term="git-scraping"/><category term="ai-assisted-programming"/></entry><entry><title>Weeknotes: Rocky Beaches, Datasette 0.48, a commit history of my database</title><link href="https://simonwillison.net/2020/Aug/21/weeknotes-rocky-beaches/#atom-tag" rel="alternate"/><published>2020-08-21T00:52:16+00:00</published><updated>2020-08-21T00:52:16+00:00</updated><id>https://simonwillison.net/2020/Aug/21/weeknotes-rocky-beaches/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I helped Natalie launch &lt;a href="https://www.rockybeaches.com/"&gt;Rocky Beaches&lt;/a&gt;, shipped Datasette 0.48 and several releases of &lt;code&gt;datasette-graphql&lt;/code&gt;, upgraded the CSRF protection for &lt;code&gt;datasette-upload-csvs&lt;/code&gt; and figured out how to get a commit log of changes to my blog by backing up its database to a GitHub repository.&lt;/p&gt;
&lt;h4 id="rocky-beaches"&gt;Rocky Beaches&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://twitter.com/natbat"&gt;Natalie&lt;/a&gt; released the first version of &lt;a href="https://www.rockybeaches.com/"&gt;rockybeaches.com&lt;/a&gt; this week. It's a site that helps you find places to go tidepooling (known as rockpooling in the UK) and figure out the best times to go based on low tide times.&lt;/p&gt;

&lt;p&gt;&lt;img style="max-width: 100%" src="https://static.simonwillison.net/static/2020/Rocky_Beaches__Pillar_Point_Harbor_CA.jpg" alt="Screenshot of the Pillar Point page for Rocky Beaches" /&gt;&lt;/p&gt;

&lt;p&gt;I helped out with the backend for the site, mainly as an excuse to further explore the idea of using Datasette to power full websites (previously explored with &lt;a href="https://simonwillison.net/2019/Nov/25/niche-museums/"&gt;Niche Museums&lt;/a&gt; and &lt;a href="https://simonwillison.net/2020/Apr/20/self-rewriting-readme/"&gt;my TILs&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The site uses a pattern I've been really enjoying: it's essentially a static dynamic site. Pages are dynamically rendered by Datasette using Jinja templates and a SQLite database, but the database itself is treated as a static asset: it's built at deploy time by &lt;a href="https://github.com/natbat/rockybeaches/blob/main/.github/workflows/deploy.yml"&gt;this GitHub Actions workflow&lt;/a&gt; and deployed (currently to &lt;a href="https://www.vercel.com/"&gt;Vercel&lt;/a&gt;) as a binary asset along with the code.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/natbat/rockybeaches/blob/main/scripts/build.sh"&gt;build script&lt;/a&gt; uses &lt;a href="https://github.com/simonw/yaml-to-sqlite"&gt;yaml-to-sqlite&lt;/a&gt; to load two YAML files - &lt;a href="https://github.com/natbat/rockybeaches/blob/4127c0f0539178664cefed4aca00db2b5c00c855/data/places.yml"&gt;places.yml&lt;/a&gt; and &lt;a href="https://github.com/natbat/rockybeaches/blob/4127c0f0539178664cefed4aca00db2b5c00c855/data/stations.yml"&gt;stations.yml&lt;/a&gt; - and create the &lt;code&gt;stations&lt;/code&gt; and &lt;code&gt;places&lt;/code&gt; database tables.&lt;/p&gt;
&lt;p&gt;It then runs two custom Python scripts to fetch relevant data for those places from &lt;a href="https://www.inaturalist.org/"&gt;iNaturalist&lt;/a&gt; and the &lt;a href="https://tidesandcurrents.noaa.gov/web_services_info.html"&gt;NOAA Tides &amp;amp; Currents API&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The data all ends up in the Datasette instance that powers the site - you can browse it at &lt;a href="http://www.rockybeaches.com/data"&gt;www.rockybeaches.com/data&lt;/a&gt; or interact with it using the GraphQL API at &lt;a href="http://www.rockybeaches.com/graphql"&gt;www.rockybeaches.com/graphql&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The code is a little convoluted at the moment - I'm still iterating towards the best patterns for building websites like this using Datasette - but I'm very pleased with the productivity and performance that this approach produced.&lt;/p&gt;
&lt;h4 id="datasette-048"&gt;Datasette 0.48&lt;/h4&gt;
&lt;p&gt;Highlights from the &lt;a href="https://docs.datasette.io/en/stable/changelog.html#v0-48"&gt;Datasette 0.48&lt;/a&gt; release notes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Datasette documentation now lives at &lt;a href="https://docs.datasette.io/"&gt;docs.datasette.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;extra_template_vars&lt;/code&gt;, &lt;code&gt;extra_css_urls&lt;/code&gt;, &lt;code&gt;extra_js_urls&lt;/code&gt; and &lt;code&gt;extra_body_script&lt;/code&gt; plugin hooks now all accept the same arguments. See &lt;a href="https://docs.datasette.io/en/stable/plugin_hooks.html#plugin-hook-extra-template-vars"&gt;extra_template_vars(template, database, table, columns, view_name, request, datasette)&lt;/a&gt; for details. (&lt;a href="https://github.com/simonw/datasette/issues/939"&gt;#939&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Those hooks now accept a new &lt;code&gt;columns&lt;/code&gt; argument detailing the table columns that will be rendered on that page. (&lt;a href="https://github.com/simonw/datasette/issues/938"&gt;#938&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I released a new version of &lt;a href="https://github.com/simonw/datasette-cluster-map"&gt;datasette-cluster-map&lt;/a&gt; that takes advantage of the new &lt;code&gt;columns&lt;/code&gt; argument to only inject Leaflet maps JavaScript onto the page if the table being rendered includes latitude and longitude columns - previously the plugin would load extra code on pages that weren't going to render a map at all. That's now running on &lt;a href="https://global-power-plants.datasettes.com/"&gt;https://global-power-plants.datasettes.com/&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="datasette-graphql"&gt;datasette-graphql&lt;/h4&gt;
&lt;p&gt;Using &lt;a href="https://github.com/simonw/datasette-graphql"&gt;datasette-graphql&lt;/a&gt; for Rocky Beaches inspired me to add two new features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A new &lt;code&gt;graphql()&lt;/code&gt; Jinja custom template function that lets you execute custom GraphQL queries inside a Datasette template page - which turns out to be a pretty elegant way for the template to load exactly the data that it needs in order to render the page. Here's &lt;a href="https://github.com/natbat/rockybeaches/blob/70039f18b3d3823a4f069deca513e950a3aaba4f/templates/row-data-places.html#L29-L46"&gt;how Rocky Beaches uses that&lt;/a&gt;. &lt;a href="https://github.com/simonw/datasette-graphql/issues/50"&gt;Issue 50&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Some of the iNaturalist data that Rocky Beaches uses is stored as JSON data in text columns in SQLite - mainly because I was too lazy to model it out as tables. This was coming out of the GraphQL API as strings-containing-JSON, so I added a &lt;code&gt;json_columns&lt;/code&gt; plugin configuration mechanism for turning those into Graphene &lt;code&gt;GenericScalar&lt;/code&gt; fields - see &lt;a href="https://github.com/simonw/datasette-graphql/issues/53"&gt;issue 53&lt;/a&gt; for details.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I also landed a big performance improvement. The plugin works by introspecting the database and generating a GraphQL schema that represents those tables, columns and views. For databases with a lot of tables this can get expensive, and the introspection was being run on every request.&lt;/p&gt;
&lt;p&gt;I didn't want to require a server restart any time the schema changed, so I didn't want to cache the schema in-memory. Ideally it would be cached but the cache would become invalid any time the schema itself changed.&lt;/p&gt;
&lt;p&gt;It turns out SQLite has a mechanism for this: the &lt;code&gt;PRAGMA schema_version&lt;/code&gt; statement, which returns an integer version number that changes any time the underlying schema is changed (e.g. a table is added or modified).&lt;/p&gt;
&lt;p&gt;I built a quick &lt;a href="https://github.com/simonw/datasette-schema-versions"&gt;datasette-schema-versions&lt;/a&gt; plugin to try this feature out (in less than twenty minutes thanks to my &lt;a href="https://simonwillison.net/2020/Jun/20/cookiecutter-plugins/"&gt;datasette-plugin cookiecutter template&lt;/a&gt;) and prove to myself that it works. Then I built a caching mechanism for &lt;code&gt;datasette-graphql&lt;/code&gt; that uses the current &lt;code&gt;schema_version&lt;/code&gt; as the cache key. See &lt;a href="https://github.com/simonw/datasette-graphql/issues/51"&gt;issue 51&lt;/a&gt; for details.&lt;/p&gt;
&lt;h4 id="asgi-csrf-and-datasette-upload-csvs"&gt;asgi-csrf and datasette-upload-csvs&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/datasette-upload-csvs"&gt;datasette-upload-csvs&lt;/a&gt; is a Datasette plugin that adds a form for uploading CSV files and converting them to SQLite tables.&lt;/p&gt;
&lt;p&gt;Datasette 0.44 &lt;a href="https://docs.datasette.io/en/latest/changelog.html#csrf-protection"&gt;added CSRF protection&lt;/a&gt;, which broke the plugin. I fixed that this week, but it took some extra work because file uploads use the &lt;code&gt;multipart/form-data&lt;/code&gt; HTTP mechanism and my &lt;a href="https://github.com/simonw/asgi-csrf"&gt;asgi-csrf&lt;/a&gt; library didn't support that.&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://github.com/simonw/asgi-csrf/issues/1"&gt;fixed that&lt;/a&gt; this week, but the code was quite complicated. Since &lt;code&gt;asgi-csrf&lt;/code&gt; is a security library I decided to aim for 100% code coverage, the first time I've done that for one of my projects.&lt;/p&gt;
&lt;p&gt;I got there with the help of codecov.io and &lt;a href="https://pypi.org/project/pytest-cov/"&gt;pytest-cov&lt;/a&gt;. I wrote up what I learned about those tools in &lt;a href="https://github.com/simonw/til/blob/main/pytest/pytest-code-coverage.md"&gt;a TIL&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="backing-up-my-blog-database-to-a-github-repository"&gt;Backing up my blog database to a GitHub repository&lt;/h4&gt;
&lt;p&gt;I really like keeping content in a git repository (see Rocky Beaches and Niche Museums). Every content management system I've ever worked on has eventually desired revision control, and modeling that in a database and adding it to an existing project is always a huge pain.&lt;/p&gt;
&lt;p&gt;I have 18 years of content on this blog. I want that backed up to git - and this week I realized I have the tools to do that already.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/db-to-sqlite"&gt;db-to-sqlite&lt;/a&gt; is my tool for taking any SQL Alchemy supported database (so far tested with MySQL and PostgreSQL) and exporting it into a SQLite database.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/sqlite-diffable"&gt;sqlite-diffable&lt;/a&gt; is a very early stage tool I built last year. The idea is to dump a SQLite database out to disk in a way that is designed to work well with git diffs. Each table is dumped out as newline-delimited JSON, one row per line.&lt;/p&gt;
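&lt;p&gt;Here is a rough sketch of why one row per line works well with git diffs. It is not the actual &lt;code&gt;sqlite-diffable&lt;/code&gt; implementation or its exact file format: &lt;code&gt;dump_table_ndjson()&lt;/code&gt; is a hypothetical helper, and writing each row as a JSON object with sorted keys is an assumption.&lt;/p&gt;

```python
import json
import sqlite3


def dump_table_ndjson(conn, table):
    """Serialize every row of a table as one JSON object per line.

    Rows are ordered by rowid and keys are sorted, so the output is stable
    between runs and a changed row shows up as a single changed line.
    """
    cursor = conn.execute(f'select * from "{table}" order by rowid')
    columns = [col[0] for col in cursor.description]
    lines = [json.dumps(dict(zip(columns, row)), sort_keys=True) for row in cursor]
    return "".join(line + "\n" for line in lines)
```

&lt;p&gt;Editing one blog entry then produces a one-line diff in the dumped file, which is what makes the resulting commit log readable.&lt;/p&gt;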
&lt;p&gt;So... how about converting my blog's PostgreSQL database to SQLite, then dumping it to disk with &lt;code&gt;sqlite-diffable&lt;/code&gt; and committing the result to a git repository? And then running that in a GitHub Action?&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/simonwillisonblog-backup/blob/main/.github/workflows/backup.yml"&gt;the workflow&lt;/a&gt;. It does exactly that, with a few extra steps: it only grabs a subset of my tables, and it redacts the &lt;code&gt;password&lt;/code&gt; column from my &lt;code&gt;auth_user&lt;/code&gt; table so that my hashed password isn't exposed in the backup.&lt;/p&gt;
&lt;p&gt;I now have &lt;a href="https://github.com/simonw/simonwillisonblog-backup/commits/main"&gt;a commit log&lt;/a&gt; of changes to my blog's database!&lt;/p&gt;
&lt;p&gt;I've set it to run nightly, but I can trigger it manually by clicking a button too.&lt;/p&gt;
&lt;h4 id="til-this-week-46"&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/main/readthedocs/custom-subdomain.md"&gt;Pointing a custom subdomain at Read the Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/main/pytest/pytest-code-coverage.md"&gt;Code coverage using pytest and codecov.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/main/readthedocs/readthedocs-search-api.md"&gt;Read the Docs Search API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/main/heroku/programatic-access-postgresql.md"&gt;Programatically accessing Heroku PostgreSQL from GitHub Actions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/main/macos/find-largest-sqlite.md"&gt;Finding the largest SQLite files on a Mac&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/main/github-actions/grep-tests.md"&gt;Using grep to write tests in CI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="releases-this-week-46"&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.14"&gt;datasette-graphql 0.14&lt;/a&gt; - 2020-08-20&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.13"&gt;datasette-graphql 0.13&lt;/a&gt; - 2020-08-19&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-schema-versions/releases/tag/0.1"&gt;datasette-schema-versions 0.1&lt;/a&gt; - 2020-08-19&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.12.3"&gt;datasette-graphql 0.12.3&lt;/a&gt; - 2020-08-19&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/dogsheep/github-to-sqlite/releases/tag/2.5"&gt;github-to-sqlite 2.5&lt;/a&gt; - 2020-08-18&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-publish-vercel/releases/tag/0.8"&gt;datasette-publish-vercel 0.8&lt;/a&gt; - 2020-08-17&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-cluster-map/releases/tag/0.12"&gt;datasette-cluster-map 0.12&lt;/a&gt; - 2020-08-16&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette/releases/tag/0.48"&gt;datasette 0.48&lt;/a&gt; - 2020-08-16&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.12.2"&gt;datasette-graphql 0.12.2&lt;/a&gt; - 2020-08-16&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-saved-queries/releases/tag/0.2.1"&gt;datasette-saved-queries 0.2.1&lt;/a&gt; - 2020-08-15&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette/releases/tag/0.47.3"&gt;datasette 0.47.3&lt;/a&gt; - 2020-08-15&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-upload-csvs/releases/tag/0.5"&gt;datasette-upload-csvs 0.5&lt;/a&gt; - 2020-08-15&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/asgi-csrf/releases/tag/0.7"&gt;asgi-csrf 0.7&lt;/a&gt; - 2020-08-15&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/asgi-csrf/releases/tag/0.7a0"&gt;asgi-csrf 0.7a0&lt;/a&gt; - 2020-08-15&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-cluster-map/releases/tag/0.11.1"&gt;datasette-cluster-map 0.11.1&lt;/a&gt; - 2020-08-14&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-cluster-map/releases/tag/0.11"&gt;datasette-cluster-map 0.11&lt;/a&gt; - 2020-08-14&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.12.1"&gt;datasette-graphql 0.12.1&lt;/a&gt; - 2020-08-13&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/csrf"&gt;csrf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/natalie-downe"&gt;natalie-downe&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/graphql"&gt;graphql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="csrf"/><category term="databases"/><category term="git"/><category term="github"/><category term="natalie-downe"/><category term="projects"/><category term="graphql"/><category term="datasette"/><category term="inaturalist"/><category term="weeknotes"/></entry><entry><title>Practical Deep Learning for Coders 2019</title><link href="https://simonwillison.net/2019/Jan/26/practical-deep-learning-coders-2019/#atom-tag" rel="alternate"/><published>2019-01-26T00:32:52+00:00</published><updated>2019-01-26T00:32:52+00:00</updated><id>https://simonwillison.net/2019/Jan/26/practical-deep-learning-coders-2019/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.fast.ai/2019/01/24/course-v3/"&gt;Practical Deep Learning for Coders 2019&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The deep learning evening course I took a few months ago has now been shared online in full, and it’s outstanding. “After the first lesson you’ll be able to train a state-of-the-art image classification model on your own data”—can confirm: after just the first lesson I built a bobcat vs. cougar classifier using photos from iNaturalist.&lt;/p&gt;

&lt;p&gt;The biggest thing I learned from the course is how powerful transfer learning is. I used to think you needed a huge amount of data to get good results from deep learning. That’s no longer true: you can take an existing model (eg ResNet for image classification) and train on top of it.&lt;/p&gt;

&lt;p&gt;ResNet can classify images into 1,000 classes (house, cat, etc)—training on a few hundred extra images of e.g. Bobcats vs Cougars only takes a couple of minutes on a GPU and can give you crazily accurate results.&lt;/p&gt;

&lt;p&gt;It works because the pre-trained model can already pick up really subtle details—fur patterns, ear shapes etc—so you only need to train a few more layers on it for it to be able to classify against the patterns in your new set of training images.&lt;/p&gt;

&lt;p&gt;And this doesn’t just work for image classification! Natural language processing benefits from transfer learning too: take an existing model trained on the entire corpus of Wikipedia (so it knows patterns from sentence structures) and you can build IMDB sentiment analysis on top. That’s in lesson 4.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/simonw/status/1088848542028881920"&gt;@simonw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/machine-learning"&gt;machine-learning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;&lt;/p&gt;



</summary><category term="machine-learning"/><category term="inaturalist"/></entry><entry><title>Develop Your Naturalist Superpowers with Observable Notebooks and iNaturalist</title><link href="https://simonwillison.net/2018/Dec/18/develop-your-naturalist-superpowers/#atom-tag" rel="alternate"/><published>2018-12-18T22:39:19+00:00</published><updated>2018-12-18T22:39:19+00:00</updated><id>https://simonwillison.net/2018/Dec/18/develop-your-naturalist-superpowers/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://24ways.org/2018/observable-notebooks-and-inaturalist/"&gt;Develop Your Naturalist Superpowers with Observable Notebooks and iNaturalist&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Natalie’s article for this year’s 24 ways advent calendar shows how you can use Observable notebooks to quickly build interactive visualizations against web APIs. She uses the iNaturalist API to show species of Nudibranchs that you might see in a given month, plus a Vega-powered graph of sightings over the course of the year. This really inspired me to think harder about how I can use Observable to solve some of my API debugging needs, and I’ve already spun up a couple of private Notebooks to exercise new APIs that I’m building at work. It’s a huge productivity boost.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/Natbat/status/1074820561509859328"&gt;@natbat&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/natalie-downe"&gt;natalie-downe&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webapis"&gt;webapis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nudibranchs"&gt;nudibranchs&lt;/a&gt;&lt;/p&gt;



</summary><category term="natalie-downe"/><category term="webapis"/><category term="inaturalist"/><category term="observable"/><category term="nudibranchs"/></entry><entry><title>Automatically playing science communication games with transfer learning and fastai</title><link href="https://simonwillison.net/2018/Oct/29/transfer-learning/#atom-tag" rel="alternate"/><published>2018-10-29T03:16:33+00:00</published><updated>2018-10-29T03:16:33+00:00</updated><id>https://simonwillison.net/2018/Oct/29/transfer-learning/#atom-tag</id><summary type="html">
    &lt;p&gt;This weekend was the 9th annual &lt;a href="https://sf.sciencehackday.org/"&gt;Science Hack Day San Francisco&lt;/a&gt;, which was also the 100th Science Hack Day held worldwide.&lt;/p&gt;
&lt;p&gt;Natalie and I decided to combine our interests and build something fun.&lt;/p&gt;
&lt;p&gt;I’m currently enrolled in Jeremy Howard’s &lt;a href="http://course.fast.ai/"&gt;Deep Learning course&lt;/a&gt; so I figured this was a great opportunity to try out some computer vision.&lt;/p&gt;
&lt;p&gt;Natalie runs the &lt;a href="https://natbat.github.io/scicomm-calendar/"&gt;SciComm Games calendar&lt;/a&gt; and accompanying &lt;a href="https://twitter.com/SciCommGames"&gt;@SciCommGames&lt;/a&gt; bot to promote and catalogue science communication hashtag games on Twitter.&lt;/p&gt;
&lt;p&gt;Hashtag games? Natalie &lt;a href="https://natbat.github.io/scicomm-calendar/"&gt;explains them here&lt;/a&gt; - essentially they are games run by scientists on Twitter to foster public engagement around an animal or topic by challenging people to identify if a photo is a #cougarOrNot or participate in a #TrickyBirdID or identify #CrowOrNo or many others.&lt;/p&gt;
&lt;p&gt;Combining the two… we decided to build a bot that automatically plays these games using computer vision. So far it’s just trying #cougarOrNot - you can see the bot in action at &lt;a href="https://twitter.com/critter_vision/with_replies"&gt;@critter_vision&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Training_data_from_iNaturalist_14"&gt;&lt;/a&gt;Training data from iNaturalist&lt;/h3&gt;
&lt;p&gt;In order to build a machine learning model, you need to start out with some training data.&lt;/p&gt;
&lt;p&gt;I’m a big fan of &lt;a href="https://www.inaturalist.org/"&gt;iNaturalist&lt;/a&gt;, a citizen science project that encourages users to upload photographs of wildlife (and plants) they have seen and have their observations verified by a community. Natalie and I used it to build &lt;a href="https://www.owlsnearme.com/"&gt;owlsnearme.com&lt;/a&gt; earlier this year - the API in particular is fantastic.&lt;/p&gt;
&lt;p&gt;iNaturalist has &lt;a href="https://www.inaturalist.org/observations?place_id=1&amp;amp;taxon_id=41944"&gt;over 5,000 verified sightings&lt;/a&gt; of felines (cougars, bobcats, domestic cats and more) in the USA.&lt;/p&gt;
&lt;p&gt;The raw data is available as &lt;a href="http://api.inaturalist.org/v1/observations?identified=true&amp;amp;photos=true&amp;amp;identifications=most_agree&amp;amp;quality_grade=research&amp;amp;order=desc&amp;amp;order_by=created_at&amp;amp;taxon_id=41944&amp;amp;place_id=1&amp;amp;per_page=200"&gt;a paginated JSON API&lt;/a&gt;. The &lt;a href="https://static.inaturalist.org/photos/27333309/medium.jpg"&gt;medium sized photos&lt;/a&gt; are just the right size for training a neural network.&lt;/p&gt;
&lt;p&gt;I started by grabbing 5,000 images and saving them to disk with a filename that reflected their identified species:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Bobcat_9005106.jpg
Domestic-Cat_10068710.jpg
Bobcat_15713672.jpg
Domestic-Cat_6755280.jpg
Mountain-Lion_9075705.jpg
&lt;/code&gt;&lt;/pre&gt;
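&lt;p&gt;A minimal sketch of how filenames in that shape could be generated. The function name is hypothetical, and treating the number as an iNaturalist ID is an assumption: the post does not say exactly how the names were built.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import re

def label_filename(common_name, numeric_id):
    # Hypothetical helper: turn 'Domestic Cat' into 'Domestic-Cat' so the
    # label contains no spaces, then append the numeric ID (assumed to be
    # an iNaturalist identifier) and the .jpg extension.
    slug = re.sub(r'[^A-Za-z0-9]+', '-', common_name.strip()).strip('-')
    return '{}_{}.jpg'.format(slug, numeric_id)

print(label_filename('Domestic Cat', 10068710))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping the label and the ID separated by a single underscore means the label can later be recovered from the filename alone.&lt;/p&gt;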
&lt;h3&gt;&lt;a id="Building_a_model_32"&gt;&lt;/a&gt;Building a model&lt;/h3&gt;
&lt;p&gt;I’m only one week into the &lt;a href="http://www.fast.ai/"&gt;fast.ai&lt;/a&gt; course so this really isn’t particularly sophisticated yet, but it was just about good enough to power our hack.&lt;/p&gt;
&lt;p&gt;The main technique we are learning in the course is called &lt;a href="https://machinelearningmastery.com/transfer-learning-for-deep-learning/"&gt;transfer learning&lt;/a&gt;, and it really is shockingly effective. Instead of training a model from scratch you start out with a pre-trained model and use some extra labelled images to train a small number of extra layers.&lt;/p&gt;
&lt;p&gt;The initial model we are using is &lt;a href="https://www.kaggle.com/pytorch/resnet34"&gt;ResNet-34&lt;/a&gt;, a 34-layer neural network trained on 1,000 labelled categories in the &lt;a href="http://www.image-net.org/"&gt;ImageNet&lt;/a&gt; corpus.&lt;/p&gt;
&lt;p&gt;In class, we learned to use this technique to get 94% accuracy against the &lt;a href="http://www.robots.ox.ac.uk/~vgg/data/pets/"&gt;Oxford-IIIT Pet Dataset&lt;/a&gt; - around 7,000 images covering 12 cat breeds and 25 dog breeds. In 2012 the researchers at Oxford were able to get 59.21% using a sophisticated model - in 2018 we can get 94% with transfer learning and just a few lines of code.&lt;/p&gt;
&lt;p&gt;I started with an example provided in class, which loads and trains images from files on disk using a regular expression that extracts the labels from the filenames.&lt;/p&gt;
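&lt;p&gt;To illustrate the label extraction on its own, here is a rough sketch using only the standard library. The helper name and the shortened example paths are hypothetical; the pattern captures everything between the last slash and the final underscore-plus-digits.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import re

# Capture the label between the last '/' and the trailing '_digits.jpg'
LABEL_PATTERN = re.compile(r'/([^/]+)_\d+\.jpg$')

def label_from_path(path):
    # Hypothetical helper: return the label, or None if the path
    # does not follow the Label_12345.jpg naming scheme.
    match = LABEL_PATTERN.search(str(path))
    return match.group(1) if match else None

print(label_from_path('/images/Domestic-Cat_10068710.jpg'))
&lt;/code&gt;&lt;/pre&gt;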
&lt;p&gt;My full Jupyter notebook is &lt;a href="https://github.com/simonw/cougar-or-not/blob/master/inaturalist-cats.ipynb"&gt;inaturalist-cats.ipynb&lt;/a&gt; - the key training code is as follows:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from fastai import *
from fastai.vision import *
# Folder of iNaturalist photos named like Bobcat_9005106.jpg
cat_images_path = Path('/home/jupyter/.fastai/data/inaturalist-usa-cats/images')
cat_fnames = get_image_files(cat_images_path)
# The regular expression captures the species label from each filename
cat_data = ImageDataBunch.from_name_re(
    cat_images_path,
    cat_fnames,
    r'/([^/]+)_\d+\.jpg$',
    ds_tfms=get_transforms(),
    size=224
)
# Normalize using the ImageNet statistics that ResNet-34 was trained with
cat_data.normalize(imagenet_stats)
# Start from the pre-trained ResNet-34 and train for four epochs
cat_learn = ConvLearner(cat_data, models.resnet34, metrics=error_rate)
cat_learn.fit_one_cycle(4)
# Save the generated model to disk
cat_learn.save(&amp;quot;usa-inaturalist-cats&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Calling &lt;code&gt;cat_learn.save(&amp;quot;usa-inaturalist-cats&amp;quot;)&lt;/code&gt; created an 84MB file on disk at &lt;code&gt;/home/jupyter/.fastai/data/inaturalist-usa-cats/images/models/usa-inaturalist-cats.pth&lt;/code&gt; - I used &lt;code&gt;scp&lt;/code&gt; to copy that model down to my laptop.&lt;/p&gt;
&lt;p&gt;This model gave me a 24% error rate which is pretty terrible - others on the course have been getting error rates less than 10% for all kinds of interesting problems. My focus was to get a model deployed as an API though so I haven’t spent any additional time fine-tuning things yet.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Deploying_the_model_as_an_API_67"&gt;&lt;/a&gt;Deploying the model as an API&lt;/h3&gt;
&lt;p&gt;The &lt;a href="https://github.com/fastai/fastai"&gt;fastai library&lt;/a&gt; strongly encourages training against a GPU, using &lt;a href="https://pytorch.org/"&gt;pytorch&lt;/a&gt; and &lt;a href="https://mathema.tician.de/software/pycuda/"&gt;PyCUDA&lt;/a&gt;. I’ve been using an n1-highmem-8 Google Cloud Platform instance with an attached Tesla P4, then running everything in a Jupyter notebook there. This costs around $0.38 an hour - fine for a few hours of training, but way too expensive to permanently host a model.&lt;/p&gt;
&lt;p&gt;Thankfully, while a GPU is essential for productively training models it’s not nearly as important for evaluating them against new data. pytorch can run in CPU mode for that just fine on standard hardware, and the &lt;a href="https://github.com/fastai/fastai/blob/master/README.md"&gt;fastai README&lt;/a&gt; includes instructions on installing it for a CPU using pip.&lt;/p&gt;
&lt;p&gt;I started out by ensuring I could execute my generated model on my own laptop (since pytorch doesn’t yet work with the GPU built into the MacBook Pro). Once I had that working, I used the resulting code to write a tiny Starlette-powered API server. The code for that can be found in &lt;a href="https://github.com/simonw/cougar-or-not/blob/8adafac571aad3385317c76bd229448b3cdaa0ac/cougar.py"&gt;cougar.py&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;fastai is under very heavy development and the latest version doesn’t quite have a clean way of loading a model from disk without also including the initial training images, so I had to hack around quite a bit to get this working using clues from &lt;a href="https://forums.fast.ai/"&gt;the fastai forums&lt;/a&gt;. I expect this to get much easier over the next few weeks as the library continues to evolve based on feedback from the current course.&lt;/p&gt;
&lt;p&gt;To deploy the API I wrote &lt;a href="https://github.com/simonw/cougar-or-not/blob/8adafac571aad3385317c76bd229448b3cdaa0ac/Dockerfile"&gt;a Dockerfile&lt;/a&gt; and shipped it to &lt;a href="https://zeit.co/now"&gt;Zeit Now&lt;/a&gt;. Now remains my go-to choice for this kind of project, though unfortunately their new (and brilliant) v2 platform imposes &lt;a href="https://github.com/zeit/now-cli/issues/1523"&gt;a 100MB image size limit&lt;/a&gt; - not nearly enough when the model file itself weighs in at 83 MB. Thankfully it’s still possible to &lt;a href="https://github.com/simonw/cougar-or-not/commit/5ad3d5b49c6419e4c2440291bc5fb204625aae83"&gt;specify their v1 cloud&lt;/a&gt; which is more forgiving for larger applications.&lt;/p&gt;
&lt;p&gt;Here’s the result: an API which can accept either the URL to an image or an uploaded image file: &lt;a href="https://cougar-or-not.now.sh/"&gt;https://cougar-or-not.now.sh/&lt;/a&gt; - try it out with &lt;a href="https://cougar-or-not.now.sh/classify-url?url=https://upload.wikimedia.org/wikipedia/commons/9/9a/Oregon_Cougar_ODFW.JPG"&gt;a cougar&lt;/a&gt; and &lt;a href="https://cougar-or-not.now.sh/classify-url?url=https://upload.wikimedia.org/wikipedia/commons/thumb/d/dc/Bobcat2.jpg/1200px-Bobcat2.jpg"&gt;a bobcat&lt;/a&gt;.&lt;/p&gt;
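&lt;p&gt;As a rough sketch, a request URL for the &lt;code&gt;classify-url&lt;/code&gt; endpoint linked above could be assembled like this. The helper name is hypothetical, and the sketch stops at building the URL because the response format is not described here.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from urllib.parse import urlencode

# Endpoint taken from the example links; the helper below is hypothetical
CLASSIFY_URL = 'https://cougar-or-not.now.sh/classify-url'

def build_classify_url(image_url):
    # urlencode percent-encodes the image URL so that any query string
    # it carries is passed through as part of the single url parameter.
    return CLASSIFY_URL + '?' + urlencode({'url': image_url})

print(build_classify_url('https://upload.wikimedia.org/wikipedia/commons/9/9a/Oregon_Cougar_ODFW.JPG'))
&lt;/code&gt;&lt;/pre&gt;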
&lt;h3&gt;&lt;a id="The_Twitter_Bot_81"&gt;&lt;/a&gt;The Twitter Bot&lt;/h3&gt;
&lt;p&gt;Natalie built &lt;a href="https://github.com/natbat/CritterVision"&gt;the Twitter bot&lt;/a&gt;. It runs as a scheduled task on Heroku and works by checking for new #cougarOrNot tweets from &lt;a href="https://twitter.com/drmichellelarue"&gt;Dr. Michelle LaRue&lt;/a&gt;, extracting any images, passing them to my API and replying with a tweet that summarizes the results. Take a look at &lt;a href="https://twitter.com/critter_vision/with_replies"&gt;its recent replies&lt;/a&gt; to get a feel for how well it is doing.&lt;/p&gt;
&lt;p&gt;Amusingly, Dr. LaRue frequently tweets memes to promote upcoming competitions and marks them with the same hashtag. The bot appears to think that most of the memes are bobcats! I should definitely spend some time tuning that model.&lt;/p&gt;
&lt;p&gt;Science Hack Day was great fun. A big thanks to the organizing team, and congrats to all of the other participants. I’m really looking forward to the next one.&lt;/p&gt;
&lt;p&gt;Plus… we won a medal!&lt;/p&gt;
&lt;blockquote class="twitter-tweet" data-lang="en"&gt;&lt;p lang="en" dir="ltr"&gt;Enjoyed &lt;a href="https://twitter.com/hashtag/scienceHackday?src=hash&amp;amp;ref_src=twsrc%5Etfw"&gt;#scienceHackday&lt;/a&gt; this weekend, made &amp;amp; launched a cool machine learning hack to process images &amp;amp; work out if they have a cougar in them or not! &lt;a href="https://twitter.com/hashtag/CougarOrNot?src=hash&amp;amp;ref_src=twsrc%5Etfw"&gt;#CougarOrNot&lt;/a&gt; &lt;a href="https://twitter.com/critter_vision?ref_src=twsrc%5Etfw"&gt;@critter_vision&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;... we won a medal!&lt;br /&gt;&lt;br /&gt;Bot code: &lt;a href="https://t.co/W2jZcGCnFr"&gt;https://t.co/W2jZcGCnFr&lt;/a&gt;&lt;br /&gt;Machine learning API: &lt;a href="https://t.co/swNiKlcTp0"&gt;https://t.co/swNiKlcTp0&lt;/a&gt; &lt;a href="https://t.co/dcdIhNZy63"&gt;pic.twitter.com/dcdIhNZy63&lt;/a&gt;&lt;/p&gt;&amp;#8212; Natbat (@Natbat) &lt;a href="https://twitter.com/Natbat/status/1056717060116369410?ref_src=twsrc%5Etfw"&gt;October 29, 2018&lt;/a&gt;&lt;/blockquote&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/computer-vision"&gt;computer-vision&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/machine-learning"&gt;machine-learning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/natalie-downe"&gt;natalie-downe&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fastai"&gt;fastai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/transferlearning"&gt;transferlearning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jeremy-howard"&gt;jeremy-howard&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/starlette"&gt;starlette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="computer-vision"/><category term="machine-learning"/><category term="natalie-downe"/><category term="inaturalist"/><category term="fastai"/><category term="transferlearning"/><category term="jeremy-howard"/><category term="starlette"/></entry><entry><title>owlsnearme source code on GitHub</title><link href="https://simonwillison.net/2018/Feb/4/owlsnearme-source/#atom-tag" rel="alternate"/><published>2018-02-04T22:33:34+00:00</published><updated>2018-02-04T22:33:34+00:00</updated><id>https://simonwillison.net/2018/Feb/4/owlsnearme-source/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/owlsnearme"&gt;owlsnearme source code on GitHub&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Here’s the source code for our new owlsnearme.com project. It’s a single-page React application that pulls all of its data from the iNaturalist API. We built it this weekend with the SuperbOwl kick-off as a hard deadline so it’s not the most beautiful React code, but it’s a nice demonstration of how React (and create-react-app in particular) can be used for rapid development.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/natalie-downe"&gt;natalie-downe&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/react"&gt;react&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="javascript"/><category term="natalie-downe"/><category term="projects"/><category term="react"/><category term="inaturalist"/></entry><entry><title>Owls Near Me</title><link href="https://simonwillison.net/2018/Feb/4/owlsnearme/#atom-tag" rel="alternate"/><published>2018-02-04T22:26:29+00:00</published><updated>2018-02-04T22:26:29+00:00</updated><id>https://simonwillison.net/2018/Feb/4/owlsnearme/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.owlsnearme.com/"&gt;Owls Near Me&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Back in 2010 Natalie and I shipped owlsnearyou.com—a website for finding your nearest owls, using data from the sadly deceased WildlifeNearYou (RIP). To celebrate #SuperbOwl Sunday we rebuilt the same concept on top of the excellent iNaturalist API. Search for a place to see which owls have been spotted there, or click the magic button to geolocate your device and see which owls have been spotted in your nearby area!


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/natalie-downe"&gt;natalie-downe&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/wildlifenearyou"&gt;wildlifenearyou&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;&lt;/p&gt;



</summary><category term="natalie-downe"/><category term="projects"/><category term="wildlifenearyou"/><category term="inaturalist"/></entry><entry><title>6M observations total! Where has iNaturalist grown in 80 days with 1 million new observations?</title><link href="https://simonwillison.net/2018/Jan/28/inaturalist/#atom-tag" rel="alternate"/><published>2018-01-28T20:18:58+00:00</published><updated>2018-01-28T20:18:58+00:00</updated><id>https://simonwillison.net/2018/Jan/28/inaturalist/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.inaturalist.org/blog/11590-6m-observations-total-where-has-inaturalist-grown-in-80-days-with-1-million-new-observations"&gt;6M observations total! Where has iNaturalist grown in 80 days with 1 million new observations?&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Citizen science app iNaturalist is seeing explosive growth at the moment—they’ve been around for nearly a decade but 1/6 of the observations posted to the site were added in just the past few months. Having tried the latest version of their iPhone app it’s easy to see why: snap a photo of some nature and upload it to the app and it will use surprisingly effective machine learning to suggest the genus or even the individual species. Submit the observation and within a few minutes other iNaturalist community members will confirm the identification or suggest a correction. It’s brilliantly well executed and an utter delight to use.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/computer-vision"&gt;computer-vision&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/crowdsourcing"&gt;crowdsourcing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/machine-learning"&gt;machine-learning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/science"&gt;science&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/citizenscience"&gt;citizenscience&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;&lt;/p&gt;



</summary><category term="computer-vision"/><category term="crowdsourcing"/><category term="machine-learning"/><category term="science"/><category term="citizenscience"/><category term="inaturalist"/></entry></feed>