<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: jsonhead</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/jsonhead.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2017-10-14T21:46:38+00:00</updated><author><name>Simon Willison</name></author><entry><title>Deploying an asynchronous Python microservice with Sanic and Zeit Now</title><link href="https://simonwillison.net/2017/Oct/14/async-python-sanic-now/#atom-tag" rel="alternate"/><published>2017-10-14T21:46:38+00:00</published><updated>2017-10-14T21:46:38+00:00</updated><id>https://simonwillison.net/2017/Oct/14/async-python-sanic-now/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://simonwillison.net/tags/jsonhead/"&gt;Back in 2008&lt;/a&gt; Natalie Downe and I deployed what today we would call a microservice: &lt;a href="https://github.com/simonw/json-head"&gt;json-head&lt;/a&gt;, a tiny Google App Engine app that allowed you to make an HTTP HEAD request against a URL and get back the HTTP headers as JSON. One of our initial use-cases for this was &lt;a href="https://gist.github.com/natbat/8406b8e5a8ed22d6a2e1bbd75771bc97"&gt;Natalie’s addSizes.js&lt;/a&gt;, an unobtrusive jQuery script that could annotate links to PDFs and other large files with their corresponding file size pulled from the &lt;code&gt;Content-Length&lt;/code&gt; header. Another potential use-case is detecting broken links, since the API can be used to spot 404 status codes (&lt;a href="https://json-head.now.sh/?url=https://simonwillison.net/page-does-not-exist"&gt;as in this example&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;At some point in the following decade &lt;code&gt;json-head.appspot.com&lt;/code&gt; stopped working. Today I’m bringing it back, mainly as an excuse to try out the combination of Python 3.5 async, the &lt;a href="https://github.com/channelcat/sanic/"&gt;Sanic&lt;/a&gt; microframework and Zeit’s brilliant &lt;a href="https://zeit.co/now"&gt;Now&lt;/a&gt; deployment platform.&lt;/p&gt;
&lt;p&gt;First, a demo. &lt;a href="https://json-head.now.sh/?url=https://simonwillison.net/"&gt;https://json-head.now.sh/?url=https://simonwillison.net/&lt;/a&gt; returns the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[
    {
        &amp;quot;ok&amp;quot;: true,
        &amp;quot;headers&amp;quot;: {
            &amp;quot;Date&amp;quot;: &amp;quot;Sat, 14 Oct 2017 18:37:52 GMT&amp;quot;,
            &amp;quot;Content-Type&amp;quot;: &amp;quot;text/html; charset=utf-8&amp;quot;,
            &amp;quot;Connection&amp;quot;: &amp;quot;keep-alive&amp;quot;,
            &amp;quot;Set-Cookie&amp;quot;: &amp;quot;__cfduid=dd0b71b4e89bbaca5b27fa06c0b95af4a1508006272; expires=Sun, 14-Oct-18 18:37:52 GMT; path=/; domain=.simonwillison.net; HttpOnly; Secure&amp;quot;,
            &amp;quot;Cache-Control&amp;quot;: &amp;quot;s-maxage=200&amp;quot;,
            &amp;quot;X-Frame-Options&amp;quot;: &amp;quot;SAMEORIGIN&amp;quot;,
            &amp;quot;Via&amp;quot;: &amp;quot;1.1 vegur&amp;quot;,
            &amp;quot;CF-Cache-Status&amp;quot;: &amp;quot;HIT&amp;quot;,
            &amp;quot;Vary&amp;quot;: &amp;quot;Accept-Encoding&amp;quot;,
            &amp;quot;Server&amp;quot;: &amp;quot;cloudflare-nginx&amp;quot;,
            &amp;quot;CF-RAY&amp;quot;: &amp;quot;3adca70269a51e8f-SJC&amp;quot;,
            &amp;quot;Content-Encoding&amp;quot;: &amp;quot;gzip&amp;quot;
        },
        &amp;quot;status&amp;quot;: 200,
        &amp;quot;url&amp;quot;: &amp;quot;https://simonwillison.net/&amp;quot;
    }
]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Given a URL, &lt;code&gt;json-head.now.sh&lt;/code&gt; performs an HTTP HEAD request and returns the resulting status code and the HTTP headers. Results are returned with the &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt; header so you can call the API using &lt;code&gt;fetch()&lt;/code&gt; or &lt;code&gt;XMLHttpRequest&lt;/code&gt; from JavaScript running on any page.&lt;/p&gt;
&lt;h2&gt;&lt;a id="Sanic_and_Python_asyncawait_32"&gt;&lt;/a&gt;Sanic and Python async/await&lt;/h2&gt;
&lt;p&gt;A key new feature &lt;a href="https://docs.python.org/3/whatsnew/3.5.html"&gt;added to Python 3.5&lt;/a&gt; back in September 2015 was built-in syntactic support for coroutine control via the async/await keywords. Python now has some serious credibility as a platform for asynchronous I/O (the concept that got me &lt;a href="https://simonwillison.net/2009/Nov/23/node/"&gt;so excited about Node.js back in 2009&lt;/a&gt;). This has led to an explosion of asynchronous innovation around the Python community.&lt;/p&gt;
&lt;p&gt;json-head is the perfect application for async - it’s little more than a dumbed-down HTTP proxy, accepting incoming HTTP requests, making its own requests elsewhere and then returning the results.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/channelcat/sanic/"&gt;Sanic&lt;/a&gt; is a Flask-like web framework built specifically to take advantage of async/await in Python 3.5. It’s designed for speed - built on top of &lt;a href="https://github.com/MagicStack/uvloop"&gt;uvloop&lt;/a&gt;, a Python wrapper for &lt;a href="https://github.com/libuv/libuv"&gt;libuv&lt;/a&gt; (which itself was originally built to power Node.js). uvloop’s self-selected benchmarks are &lt;a href="https://magic.io/blog/uvloop-blazing-fast-python-networking/"&gt;extremely impressive&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;&lt;a id="Zeit_Now_40"&gt;&lt;/a&gt;Zeit Now&lt;/h2&gt;
&lt;p&gt;To host this new microservice, I chose &lt;a href="https://zeit.co/now"&gt;Zeit Now&lt;/a&gt;. It’s a truly beautiful piece of software design.&lt;/p&gt;
&lt;p&gt;Now lets you treat deployments as immutable. Every time you deploy you get a brand new URL. You can then interact with your deployment directly, or point an existing alias to it if you want a persistent URL for your project.&lt;/p&gt;
&lt;p&gt;Deployments are free, and deployed code stays available forever due to &lt;a href="https://github.com/zeit/now-cli/issues/189"&gt;some clever engineering&lt;/a&gt; behind the scenes.&lt;/p&gt;
&lt;p&gt;Best of all: deploying a project takes just a single command: type &lt;code&gt;now&lt;/code&gt; and the code in your current directory will be deployed to their cloud and assigned a unique URL.&lt;/p&gt;
&lt;p&gt;Now was originally built for Node.js projects, but last August &lt;a href="https://zeit.co/blog/now-dockerfile"&gt;Zeit added Docker support&lt;/a&gt;. If the directory you run it in contains a Dockerfile, running &lt;code&gt;now&lt;/code&gt; will upload, build and run the corresponding image.&lt;/p&gt;
&lt;p&gt;There’s just one thing missing: good examples of how to deploy Python projects to Now using Docker. I’m hoping this article can help fill that gap.&lt;/p&gt;
&lt;p&gt;Here’s the &lt;a href="https://github.com/simonw/json-head/blob/master/Dockerfile"&gt;complete Dockerfile&lt;/a&gt; I’m using for json-head:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;FROM python:3
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
EXPOSE 8006
CMD [&amp;quot;python&amp;quot;, &amp;quot;json_head.py&amp;quot;]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I’m using the &lt;a href="https://hub.docker.com/_/python/"&gt;official Docker Python image&lt;/a&gt; as a base, copying the current directory into the image, using &lt;code&gt;pip install&lt;/code&gt; to install dependencies and then exposing port 8006 (for no reason other than that’s the port I use for my local development environment) and running the &lt;a href="https://github.com/simonw/json-head/blob/master/json_head.py"&gt;json_head.py&lt;/a&gt; script. Now is smart enough to forward incoming HTTP traffic on port 80 to the port that was exposed by the container.&lt;/p&gt;
&lt;p&gt;If you set up Now yourself (&lt;code&gt;npm install -g now&lt;/code&gt; or use &lt;a href="https://zeit.co/download"&gt;one of their installers&lt;/a&gt;) you can deploy my code directly from GitHub to your own instance with a single command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ now simonw/json-head
&amp;gt; Didn't find directory. Searching on GitHub...
&amp;gt; Deploying GitHub repository &amp;quot;simonw/json-head&amp;quot; under simonw
&amp;gt; Ready! https://simonw-json-head-xqkfgorgei.now.sh (copied to clipboard) [1s]
&amp;gt; Initializing…
&amp;gt; Building
&amp;gt; ▲ docker build
Sending build context to Docker daemon 7.168 kB
&amp;gt; Step 1 : FROM python:3
&amp;gt; 3: Pulling from library/python
&amp;gt; ... lots more stuff here ...
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;&lt;a id="Initial_implementation_79"&gt;&lt;/a&gt;Initial implementation&lt;/h2&gt;
&lt;p&gt;Here’s my first working version of json-head using Sanic:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from sanic import Sanic
from sanic import response
import aiohttp

app = Sanic(__name__)

async def head(session, url):
    try:
        async with session.head(url) as response:
            return {
                'ok': True,
                'headers': dict(response.headers),
                'status': response.status,
                'url': url,
            }
    except Exception as e:
        return {
            'ok': False,
            'error': str(e),
            'url': url,
        }

@app.route('/')
async def handle_request(request):
    url = request.args.get('url')
    if url:
        async with aiohttp.ClientSession() as session:
            head_info = await head(session, url)
            return response.json(
                head_info,
                headers={
                    'Access-Control-Allow-Origin': '*'
                },
            )
    else:
        return response.html('Try /?url=xxx')

if __name__ == '__main__':
    app.run(host=&amp;quot;0.0.0.0&amp;quot;, port=8006)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This exact code is deployed at &lt;a href="https://json-head-thlbstmwfi.now.sh/"&gt;https://json-head-thlbstmwfi.now.sh/&lt;/a&gt; - since Now deployments are free, there’s no reason not to leave work-in-progress examples hosted as throwaway deployments.&lt;/p&gt;
&lt;p&gt;In addition to Sanic, I’m also using the handy &lt;a href="https://github.com/aio-libs/aiohttp"&gt;aiohttp&lt;/a&gt; asynchronous HTTP library - which features API design clearly inspired by my all-time favourite HTTP library, &lt;a href="https://github.com/requests/requests"&gt;requests&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The key new pieces of syntax to understand in the above code are the async and await statements. &lt;code&gt;async def&lt;/code&gt; is used to declare a function that acts as a coroutine. Coroutines need to be executed inside an event loop (which Sanic handles for us), but gain the ability to use the &lt;code&gt;await&lt;/code&gt; statement.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;await&lt;/code&gt; statement is the real magic here: it suspends the current coroutine until the coroutine it is calling has finished executing. It is this that allows us to write asynchronous code without descending into a messy hell of callback functions.&lt;/p&gt;
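&lt;p&gt;To make that concrete, here’s a minimal sketch that is separate from json-head. &lt;code&gt;asyncio.sleep&lt;/code&gt; stands in for a network call, and the names &lt;code&gt;fetch_value&lt;/code&gt; and &lt;code&gt;main&lt;/code&gt; are made up for illustration. It assumes Python 3.7 or later for &lt;code&gt;asyncio.run&lt;/code&gt;; on Python 3.5 you would use &lt;code&gt;loop.run_until_complete&lt;/code&gt; instead.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import asyncio

async def fetch_value(delay, value):
    # asyncio.sleep stands in for a network call. Awaiting it suspends
    # this coroutine so the event loop is free to run other work.
    await asyncio.sleep(delay)
    return value

async def main():
    # Execution pauses here until fetch_value has finished, then
    # carries on with its return value, no callbacks required.
    result = await fetch_value(0.01, 'done')
    return result

if __name__ == '__main__':
    # asyncio.run creates the event loop that Sanic would otherwise provide
    print(asyncio.run(main()))
&lt;/code&gt;&lt;/pre&gt;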
&lt;h2&gt;&lt;a id="Adding_parallel_requests_131"&gt;&lt;/a&gt;Adding parallel requests&lt;/h2&gt;
&lt;p&gt;So far we haven’t really taken advantage of what async I/O can do - if every incoming HTTP request results in a single outgoing HTTP request, then async may help us scale to serve more incoming requests at once, but it’s not really giving us any new functionality.&lt;/p&gt;
&lt;p&gt;Executing multiple outbound HTTP requests in parallel is a much more interesting use-case. Let’s add support for multiple &lt;code&gt;?url=&lt;/code&gt; parameters, such as the following:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://json-head.now.sh/?url=https://simonwillison.net/&amp;amp;url=https://www.google.com/"&gt;https://json-head.now.sh/?url=https://simonwillison.net/&amp;amp;url=https://www.google.com/&lt;/a&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[
    {
        &amp;quot;ok&amp;quot;: true,
        &amp;quot;headers&amp;quot;: {
            &amp;quot;Date&amp;quot;: &amp;quot;Sat, 14 Oct 2017 19:35:29 GMT&amp;quot;,
            &amp;quot;Content-Type&amp;quot;: &amp;quot;text/html; charset=utf-8&amp;quot;,
            &amp;quot;Connection&amp;quot;: &amp;quot;keep-alive&amp;quot;,
            &amp;quot;Set-Cookie&amp;quot;: &amp;quot;__cfduid=ded486c1faaac166e8ae72a87979c02101508009729; expires=Sun, 14-Oct-18 19:35:29 GMT; path=/; domain=.simonwillison.net; HttpOnly; Secure&amp;quot;,
            &amp;quot;Cache-Control&amp;quot;: &amp;quot;s-maxage=200&amp;quot;,
            &amp;quot;X-Frame-Options&amp;quot;: &amp;quot;SAMEORIGIN&amp;quot;,
            &amp;quot;Via&amp;quot;: &amp;quot;1.1 vegur&amp;quot;,
            &amp;quot;CF-Cache-Status&amp;quot;: &amp;quot;EXPIRED&amp;quot;,
            &amp;quot;Vary&amp;quot;: &amp;quot;Accept-Encoding&amp;quot;,
            &amp;quot;Server&amp;quot;: &amp;quot;cloudflare-nginx&amp;quot;,
            &amp;quot;CF-RAY&amp;quot;: &amp;quot;3adcfb671c862888-SJC&amp;quot;,
            &amp;quot;Content-Encoding&amp;quot;: &amp;quot;gzip&amp;quot;
        },
        &amp;quot;status&amp;quot;: 200,
        &amp;quot;url&amp;quot;: &amp;quot;https://simonwillison.net/&amp;quot;
    },
    {
        &amp;quot;ok&amp;quot;: true,
        &amp;quot;headers&amp;quot;: {
            &amp;quot;Date&amp;quot;: &amp;quot;Sat, 14 Oct 2017 19:35:29 GMT&amp;quot;,
            &amp;quot;Expires&amp;quot;: &amp;quot;-1&amp;quot;,
            &amp;quot;Cache-Control&amp;quot;: &amp;quot;private, max-age=0&amp;quot;,
            &amp;quot;Content-Type&amp;quot;: &amp;quot;text/html; charset=ISO-8859-1&amp;quot;,
            &amp;quot;P3P&amp;quot;: &amp;quot;CP=\&amp;quot;This is not a P3P policy! See g.co/p3phelp for more info.\&amp;quot;&amp;quot;,
            &amp;quot;Content-Encoding&amp;quot;: &amp;quot;gzip&amp;quot;,
            &amp;quot;Server&amp;quot;: &amp;quot;gws&amp;quot;,
            &amp;quot;X-XSS-Protection&amp;quot;: &amp;quot;1; mode=block&amp;quot;,
            &amp;quot;X-Frame-Options&amp;quot;: &amp;quot;SAMEORIGIN&amp;quot;,
            &amp;quot;Set-Cookie&amp;quot;: &amp;quot;1P_JAR=2017-10-14-19; expires=Sat, 21-Oct-2017 19:35:29 GMT; path=/; domain=.google.com&amp;quot;,
            &amp;quot;Alt-Svc&amp;quot;: &amp;quot;quic=\&amp;quot;:443\&amp;quot;; ma=2592000; v=\&amp;quot;39,38,37,35\&amp;quot;&amp;quot;,
            &amp;quot;Transfer-Encoding&amp;quot;: &amp;quot;chunked&amp;quot;
        },
        &amp;quot;status&amp;quot;: 200,
        &amp;quot;url&amp;quot;: &amp;quot;https://www.google.com/&amp;quot;
    }
]
&lt;/code&gt;&lt;/pre&gt;
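&lt;p&gt;If you want to build a query string with repeated &lt;code&gt;url&lt;/code&gt; parameters like the one above from Python, something along these lines should work using only the standard library. The helper name &lt;code&gt;build_json_head_url&lt;/code&gt; is made up for illustration and is not part of json-head.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from urllib.parse import urlencode

API_ROOT = 'https://json-head.now.sh/'

def build_json_head_url(urls):
    # Hypothetical helper: repeat the url parameter once per target URL.
    # urlencode percent-encodes each value so it survives as a query argument.
    query = urlencode([('url', url) for url in urls])
    return API_ROOT + '?' + query

if __name__ == '__main__':
    print(build_json_head_url([
        'https://simonwillison.net/',
        'https://www.google.com/',
    ]))
&lt;/code&gt;&lt;/pre&gt;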
&lt;p&gt;We’re now accepting multiple URLs and executing multiple HEAD requests… but Python 3.5 async makes it easy to do this in parallel, so our overall request time should match that of the single longest HEAD request that we triggered.&lt;/p&gt;
&lt;p&gt;Here’s an implementation that adds support for multiple, parallel outbound HTTP requests:&lt;/p&gt;
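&lt;pre&gt;&lt;code&gt;import asyncio  # needed for asyncio.gather; add alongside the other imports

@app.route('/')
async def handle_request(request):
    # getlist returns every ?url= value, not just the first
    urls = request.args.getlist('url')
    if urls:
        async with aiohttp.ClientSession() as session:
            # Start all the HEAD requests at once and wait for every result
            head_infos = await asyncio.gather(*[
                head(session, url) for url in urls
            ])
            return response.json(
                head_infos,
                headers={'Access-Control-Allow-Origin': '*'},
            )
    else:
        # INDEX is assumed to be an HTML string for the homepage, defined elsewhere
        return response.html(INDEX)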
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We’re using the &lt;code&gt;asyncio&lt;/code&gt; module from the Python 3.5 standard library here - in particular the &lt;code&gt;gather&lt;/code&gt; function. &lt;a href="https://docs.python.org/3/library/asyncio-task.html#asyncio.gather"&gt;&lt;code&gt;asyncio.gather&lt;/code&gt;&lt;/a&gt; takes any number of coroutines as positional arguments (hence the &lt;code&gt;*&lt;/code&gt; that unpacks our list) and returns a future aggregating their results, in the same order as the coroutines were passed in. This future will resolve (and return to the corresponding &lt;code&gt;await&lt;/code&gt;) as soon as all of those coroutines have returned their values.&lt;/p&gt;
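&lt;p&gt;Here’s a rough, network-free sketch of that behaviour. &lt;code&gt;fake_head&lt;/code&gt; is a made-up stand-in for the real &lt;code&gt;head()&lt;/code&gt; coroutine that just sleeps, the example.com URLs are placeholders, and &lt;code&gt;asyncio.run&lt;/code&gt; assumes Python 3.7 or later. The two coroutines wait concurrently, so the total time should be close to the slower one rather than the sum of both, and the results come back in argument order even though the second one finishes first.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import asyncio
import time

async def fake_head(url, delay):
    # Stand-in for head(session, url): sleep instead of hitting the network
    await asyncio.sleep(delay)
    return {'ok': True, 'url': url}

async def main():
    start = time.monotonic()
    # Both coroutines are scheduled together and wait at the same time
    results = await asyncio.gather(
        fake_head('https://example.com/slow', 0.3),
        fake_head('https://example.com/fast', 0.1),
    )
    elapsed = time.monotonic() - start
    return results, elapsed

if __name__ == '__main__':
    results, elapsed = asyncio.run(main())
    print([r['url'] for r in results])
    print('took %.2f seconds' % elapsed)
&lt;/code&gt;&lt;/pre&gt;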
&lt;p&gt;My final code for json-head &lt;a href="https://github.com/simonw/json-head"&gt;can be found on GitHub&lt;/a&gt;. As I hope I’ve demonstrated, the combination of Python 3.5+, Sanic and Now makes deploying asynchronous Python microservices trivially easy.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/async"&gt;async&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsonhead"&gt;jsonhead&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/natalie-downe"&gt;natalie-downe&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sanic"&gt;sanic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/zeit-now"&gt;zeit-now&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/docker"&gt;docker&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="async"/><category term="jsonhead"/><category term="natalie-downe"/><category term="python"/><category term="sanic"/><category term="zeit-now"/><category term="docker"/></entry><entry><title>addSizes.js: Snazzy automatic link file-size generation</title><link href="https://simonwillison.net/2008/Aug/30/addsizesjs/#atom-tag" rel="alternate"/><published>2008-08-30T10:39:35+00:00</published><updated>2008-08-30T10:39:35+00:00</updated><id>https://simonwillison.net/2008/Aug/30/addsizesjs/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://natbat.net/2008/Aug/27/addSizes/"&gt;addSizes.js: Snazzy automatic link file-size generation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Posted to Nat’s snazzy new blog: a script that uses my json-head API to grab the file size of linked documents on a page and insert those sizes into the document.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/addsizes"&gt;addsizes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsonhead"&gt;jsonhead&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsonp"&gt;jsonp&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/natalie-downe"&gt;natalie-downe&lt;/a&gt;&lt;/p&gt;



</summary><category term="addsizes"/><category term="javascript"/><category term="json"/><category term="jsonhead"/><category term="jsonp"/><category term="natalie-downe"/></entry><entry><title>json-head</title><link href="https://simonwillison.net/2008/Jul/29/jsonhead/#atom-tag" rel="alternate"/><published>2008-07-29T15:41:57+00:00</published><updated>2008-07-29T15:41:57+00:00</updated><id>https://simonwillison.net/2008/Jul/29/jsonhead/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://json-head.appspot.com/"&gt;json-head&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I’ve deployed another App Engine mini-app, which provides a JSON-P API for running HEAD requests against an arbitrary URL (useful for checking things like Content-Length and Content-Type headers and whether a URL returns 200). App Engine’s urlfetch limitations mean it can only deal with port 80 and 443 requests. 


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/google-app-engine"&gt;google-app-engine&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsonhead"&gt;jsonhead&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsonp"&gt;jsonp&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;&lt;/p&gt;



</summary><category term="google-app-engine"/><category term="json"/><category term="jsonhead"/><category term="jsonp"/><category term="projects"/></entry></feed>