<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: coronavirus</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/coronavirus.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2020-03-11T04:49:35+00:00</updated><author><name>Simon Willison</name></author><entry><title>Weeknotes: COVID-19 numbers in Datasette</title><link href="https://simonwillison.net/2020/Mar/11/covid-19/#atom-tag" rel="alternate"/><published>2020-03-11T04:49:35+00:00</published><updated>2020-03-11T04:49:35+00:00</updated><id>https://simonwillison.net/2020/Mar/11/covid-19/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Coronavirus_disease_2019"&gt;COVID-19&lt;/a&gt;, the disease caused by the novel coronavirus, gets more terrifying every day. Johns Hopkins Center for Systems Science and Engineering (CSSE) have been &lt;a href="https://github.com/CSSEGISandData/COVID-19"&gt;collating data&lt;/a&gt; about the spread of the disease and publishing it as CSV files on GitHub.&lt;/p&gt;

&lt;p&gt;This morning I used the pattern described in &lt;a href="https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/"&gt;Deploying a data API using GitHub Actions and Cloud Run&lt;/a&gt; to set up a scheduled task that grabs their data once an hour and publishes it to &lt;a href="https://covid-19.datasettes.com/"&gt;https://covid-19.datasettes.com/&lt;/a&gt; as a table in Datasette.&lt;/p&gt;

&lt;p&gt;If you're not yet concerned about COVID-19 you clearly haven't been paying attention to what's been happening in Italy. Here's &lt;a href="https://covid-19.datasettes.com/covid/daily_reports?country_or_region=Italy&amp;amp;_sort_desc=confirmed#g.mark=bar&amp;amp;g.x_column=day&amp;amp;g.x_type=ordinal&amp;amp;g.y_column=confirmed&amp;amp;g.y_type=quantitative"&gt;a query&lt;/a&gt; which shows a graph of the number of confirmed cases in Italy over the past few weeks (using &lt;a href="https://github.com/simonw/datasette-vega"&gt;datasette-vega&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/covid-19-italy.png" alt="COVID-19 confirmed cases in Italy, spiking up to 10,149" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;155 cases 17 days ago to 10,149 cases today is really frightening. And the USA still doesn't have robust testing in place, so the numbers here are likely to really shock people once they start to become more apparent.&lt;/p&gt;

&lt;p&gt;If you're going to use the data in covid-19.datasettes.com for anything please be responsible with it and &lt;a href="https://github.com/simonw/covid-19-datasette/blob/master/README.md"&gt;read the warnings in the README file&lt;/a&gt; in detail: it's important to fully understand the sources of the data and how it is being processed before you use it to make any assertions about the spread of COVID-19.&lt;/p&gt;

&lt;p&gt;My favourite resource to understand Coronavirus and what we should be doing about it is &lt;a href="https://www.flattenthecurve.com/"&gt;flattenthecurve.com&lt;/a&gt;, compiled by &lt;a href="https://twitter.com/figgyjam"&gt;Julie McMurry&lt;/a&gt;, an assistant professor at Oregon State University College of Public Health. I strongly recommend checking it out.&lt;/p&gt;

&lt;h3&gt;Other projects&lt;/h3&gt;

&lt;p&gt;I've worked on a bunch of other projects this week, some of which were inspired by my time at &lt;a href="https://www.ire.org/events-and-training/conferences/nicar-2020"&gt;NICAR&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/fec-to-sqlite"&gt;fec-to-sqlite&lt;/a&gt; is a script for saving FEC campaign finance filings to a SQLite database. Since those filings are pulled in via HTTP and can get pretty big, it uses a neat trick to generate a progress bar with the &lt;a href="https://github.com/tqdm/tqdm"&gt;tqdm&lt;/a&gt; library - it &lt;a href="https://github.com/simonw/fec-to-sqlite/blob/d3ec100f4e9d5acbc5798d95b49e6e373c1ce778/fec_to_sqlite/cli.py#L26-L27"&gt;initiates a progress bar&lt;/a&gt; with &lt;a href="https://github.com/simonw/fec-to-sqlite/blob/d3ec100f4e9d5acbc5798d95b49e6e373c1ce778/fec_to_sqlite/utils.py#L89"&gt;the Content-Length&lt;/a&gt; of the incoming file, then as it iterates over the lines coming in over HTTP it uses the length of each line &lt;a href="https://github.com/simonw/fec-to-sqlite/blob/d3ec100f4e9d5acbc5798d95b49e6e373c1ce778/fec_to_sqlite/utils.py#L75-L78"&gt;to update that bar&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-search-all"&gt;datasette-search-all&lt;/a&gt; is a new plugin that enables search across multiple FTS-enabled SQLite tables at once. I wrote more about that in &lt;a href="https://simonwillison.net/2020/Mar/9/datasette-search-all/"&gt;this blog post&lt;/a&gt; on Monday.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-column-inspect"&gt;datasette-column-inspect&lt;/a&gt; is an extremely experimental plugin that tries out a "column inspector" tool for Datasette tables - click on a column heading and the plugin shows you interesting facts about that column, such as the min/mean/max/stdev, any outlying values, the most common values and the least common values. Screenshots below. This prototype came about as part of a JSK team project for the Designing Machine Learning course at Stanford - we were thinking about ways in which machine learning could help journalists find stories in large datasets. 
The prototype doesn't have any machine learning in it - just some simple statistics to identify outliers - but it's meant to illustrate how a tool that exposes machine learning insights against tabular data might work.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/dogsheep/github-to-sqlite"&gt;github-to-sqlite&lt;/a&gt; grew a new sub-command: &lt;code&gt;github-to-sqlite commits github.db simonw/datasette&lt;/code&gt; - which imports information about commits to a repository (just the author and commit message, not the body of the commit itself). I'm running a private version of this against all of my projects, which is really useful for seeing what I worked on over the past week when writing my weeknotes.&lt;/li&gt;&lt;/ul&gt;
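
&lt;p&gt;The progress bar trick used by &lt;code&gt;fec-to-sqlite&lt;/code&gt; can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: &lt;code&gt;LineProgress&lt;/code&gt; and &lt;code&gt;iter_with_progress&lt;/code&gt; are hypothetical names, the sample body is made up, and the small class stands in for a tqdm bar created with &lt;code&gt;total=&lt;/code&gt; set to the Content-Length and advanced with &lt;code&gt;update()&lt;/code&gt;. It also assumes the body is not compressed in transit, so the line lengths add up to the Content-Length.&lt;/p&gt;

&lt;pre&gt;
```python
import io


class LineProgress:
    """Hypothetical stand-in for a tqdm bar: tracks bytes seen out of a known total."""

    def __init__(self, total):
        self.total = total  # expected size in bytes, e.g. taken from Content-Length
        self.seen = 0

    def update(self, n):
        # Advance the bar by n bytes, mirroring how a tqdm bar is updated.
        self.seen += n

    def fraction(self):
        # Share of the expected bytes seen so far (0.0 if the total is unknown).
        return self.seen / self.total if self.total else 0.0


def iter_with_progress(lines, progress):
    """Yield each raw line, advancing the progress by that line's byte length."""
    for line in lines:
        progress.update(len(line))
        yield line


# Made-up body standing in for an HTTP response streamed line by line.
body = b"a,b\n1,2\n3,4\n"
progress = LineProgress(total=len(body))
rows = [
    line.rstrip(b"\n").split(b",")
    for line in iter_with_progress(io.BytesIO(body), progress)
]
```
&lt;/pre&gt;

&lt;p&gt;Counting the bytes of each line as it arrives means the whole file never has to be buffered just to report progress.&lt;/p&gt;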

&lt;p&gt;Here are two screenshots of &lt;code&gt;datasette-column-inspect&lt;/code&gt; in action. You can try out a live demo of the plugin &lt;a href="https://datasette-column-inspect-demo-j7hipcg4aq-uc.a.run.app/fivethirtyeight/antiquities-act%2Factions_under_antiquities_act"&gt;over here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/column-inspect-avengers.png" alt="Outliers in number of appearances in the Avengers: Iron Man, Captain America, Spider Man and Wolverine" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/column-inspect-antiquities.png" alt="Column summary for states in actions_under_antiquities_act" style="max-width: 100%" /&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coronavirus"&gt;coronavirus&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/covid19"&gt;covid19&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="coronavirus"/><category term="covid19"/></entry></feed>