<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: proxies</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/proxies.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-09-19T21:57:29+00:00</updated><author><name>Simon Willison</name></author><entry><title>httpjail</title><link href="https://simonwillison.net/2025/Sep/19/httpjail/#atom-tag" rel="alternate"/><published>2025-09-19T21:57:29+00:00</published><updated>2025-09-19T21:57:29+00:00</updated><id>https://simonwillison.net/2025/Sep/19/httpjail/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/coder/httpjail"&gt;httpjail&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here's a promising new (experimental) project in the sandboxing space from Ammar Bandukwala at &lt;a href="https://coder.com/"&gt;Coder&lt;/a&gt;. &lt;code&gt;httpjail&lt;/code&gt; provides a Rust CLI tool for running an individual process against a custom-configured HTTP proxy.&lt;/p&gt;
&lt;p&gt;The initial goal is to help run coding agents like Claude Code and Codex CLI with extra rules governing how they interact with outside services. From Ammar's blog post that introduces the new tool, &lt;a href="https://ammar.io/blog/httpjail"&gt;Fine-grained HTTP filtering for Claude Code&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;httpjail&lt;/code&gt; implements an HTTP(S) interceptor alongside process-level network isolation. Under default configuration, all DNS (udp:53) is permitted and all other non-HTTP(S) traffic is blocked.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;httpjail&lt;/code&gt; rules are either JavaScript expressions or custom programs. This approach makes them far more flexible than traditional rule-oriented firewalls and avoids the learning curve of a DSL.&lt;/p&gt;
&lt;p&gt;Block all HTTP requests other than the LLM API traffic itself:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ httpjail --js "r.host === 'api.anthropic.com'" -- claude "build something great"
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;I tried it out using OpenAI's Codex CLI instead and found this recipe worked:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew upgrade rust
cargo install httpjail # Drops it in `~/.cargo/bin`
httpjail --js "r.host === 'chatgpt.com'" -- codex
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Within that Codex instance the model ran fine but any attempts to access other URLs (e.g. telling it "&lt;code&gt;Use curl to fetch simonwillison.net&lt;/code&gt;") failed at the proxy layer.&lt;/p&gt;
&lt;p&gt;This is still at a really early stage but there's a lot I like about this project. Being able to use JavaScript to filter requests via the &lt;code&gt;--js&lt;/code&gt; option is neat (it's using V8 under the hood), and there's also a &lt;code&gt;--sh shellscript&lt;/code&gt; option which instead runs a shell program passing environment variables that can be used to determine if the request should be allowed.&lt;/p&gt;
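&lt;p&gt;As a sketch of what such a hook could look like - the environment variable name (&lt;code&gt;HTTPJAIL_HOST&lt;/code&gt;) and the exit-code convention (0 to allow) below are my assumptions, not confirmed against the httpjail documentation, so check the project README for the real contract:&lt;/p&gt;

```python
import os

# Hosts this hypothetical hook will let through the proxy:
ALLOWED_HOSTS = {"api.anthropic.com", "chatgpt.com"}

def decide(host):
    """Return True if a request to this host should be allowed."""
    return host in ALLOWED_HOSTS

# In the real hook the process would signal its decision via exit status,
# e.g. (assuming httpjail exports the host and treats exit 0 as "allow"):
#   raise SystemExit(0 if decide(os.environ.get("HTTPJAIL_HOST", "")) else 1)
```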
&lt;p&gt;At a basic level it works by running a proxy server and setting &lt;code&gt;HTTP_PROXY&lt;/code&gt; and &lt;code&gt;HTTPS_PROXY&lt;/code&gt; environment variables so well-behaving software knows how to route requests.&lt;/p&gt;
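&lt;p&gt;You can see that convention in action with Python's standard library, which consults those same variables when deciding how to proxy a request (the proxy address here is purely illustrative):&lt;/p&gt;

```python
import os
import urllib.request

# Point the standard proxy variables at an illustrative local address:
os.environ["HTTP_PROXY"] = "http://127.0.0.1:8080"
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:8080"

# Well-behaved clients (urllib, requests, curl, ...) read these variables:
proxies = urllib.request.getproxies_environment()
print(proxies["http"])   # http://127.0.0.1:8080
print(proxies["https"])  # http://127.0.0.1:8080
```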
&lt;p&gt;It can also add a bunch of other layers. On Linux it sets up &lt;a href="https://en.wikipedia.org/wiki/Nftables"&gt;nftables&lt;/a&gt; rules to explicitly deny additional network access. There's also a &lt;code&gt;--docker-run&lt;/code&gt; option which can launch a Docker container with the specified image but first locks that container down to only have network access to the &lt;code&gt;httpjail&lt;/code&gt; proxy server.&lt;/p&gt;
&lt;p&gt;It can intercept, filter and log HTTPS requests too by generating its own certificate and making that available to the underlying process.&lt;/p&gt;
&lt;p&gt;I'm always interested in new approaches to sandboxing, and fine-grained network access is a particularly tricky problem to solve. This looks like a very promising step in that direction - I'm looking forward to seeing how this project continues to evolve.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://ammar.io/blog/httpjail"&gt;Fine-grained HTTP filtering for Claude Code&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/v8"&gt;v8&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex-cli"&gt;codex-cli&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="javascript"/><category term="proxies"/><category term="sandboxing"/><category term="security"/><category term="v8"/><category term="rust"/><category term="claude-code"/><category term="codex-cli"/></entry><entry><title>Weeknotes: Apache proxies in Docker containers, refactoring Datasette</title><link href="https://simonwillison.net/2021/Nov/22/apache-proxies-datasette/#atom-tag" rel="alternate"/><published>2021-11-22T05:43:44+00:00</published><updated>2021-11-22T05:43:44+00:00</updated><id>https://simonwillison.net/2021/Nov/22/apache-proxies-datasette/#atom-tag</id><summary type="html">
    &lt;p&gt;Updates to six major projects this week, plus finally some concrete progress towards Datasette 1.0.&lt;/p&gt;
&lt;h4&gt;Fixing Datasette's proxy bugs&lt;/h4&gt;
&lt;p&gt;Now that Datasette has had its fourth birthday I've decided to really push towards hitting &lt;a href="https://github.com/simonw/datasette/milestone/7"&gt;the 1.0 milestone&lt;/a&gt;. The key property of that release will be a stable JSON API, stable plugin hooks and a stable, documented context for custom templates. There's quite a lot of mostly unexciting work needed to get there.&lt;/p&gt;
&lt;p&gt;As I work through the issues in that milestone I'm encountering some that I filed more than two years ago!&lt;/p&gt;
&lt;p&gt;Two of those made it into the &lt;a href="https://docs.datasette.io/en/stable/changelog.html#v0-59-3"&gt;Datasette 0.59.3&lt;/a&gt; bug fix release earlier this week.&lt;/p&gt;
&lt;p&gt;The majority of the work in that release though related to Datasette's &lt;a href="https://docs.datasette.io/en/stable/settings.html#base-url"&gt;base_url feature&lt;/a&gt;, designed to help people who run Datasette behind a proxy.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;base_url&lt;/code&gt; lets you run Datasette like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;datasette --setting base_url=/prefix/ fixtures.db
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When you do this, Datasette will change its URLs to start with that prefix - so the homepage will live at &lt;code&gt;/prefix/&lt;/code&gt;, the database index page at &lt;code&gt;/prefix/fixtures/&lt;/code&gt;, tables at &lt;code&gt;/prefix/fixtures/facetable&lt;/code&gt; etc.&lt;/p&gt;
&lt;p&gt;The reason you would want this is if you are running a larger website, and you intend to proxy traffic to &lt;code&gt;/prefix/&lt;/code&gt; to a separate Datasette instance.&lt;/p&gt;
&lt;p&gt;The Datasette documentation includes &lt;a href="https://docs.datasette.io/en/stable/deploying.html#running-datasette-behind-a-proxy"&gt;suggested nginx and Apache configurations&lt;/a&gt; for doing exactly that.&lt;/p&gt;
&lt;p&gt;This feature has been &lt;a href="https://github.com/simonw/datasette/issues?q=is%3Aissue+base_url"&gt;a magnet for bugs&lt;/a&gt; over the years! People keep finding new parts of the Datasette interface that fail to link to the correct pages when run in this mode.&lt;/p&gt;
&lt;p&gt;The principal cause of these bugs is that I don't use Datasette in this way myself, so I wasn't testing it nearly as thoroughly as it needed to be.&lt;/p&gt;
&lt;p&gt;So the first step in finally solving these issues once and for all was to get my own instance of Datasette up and running behind an Apache proxy.&lt;/p&gt;
&lt;p&gt;Since I like to deploy live demos to Cloud Run, I decided to try and run Apache and Datasette in the same container. This took a &lt;em&gt;lot&lt;/em&gt; of figuring out. You can follow my progress on this in these two issue threads:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette/issues/1521"&gt;#1521: Docker configuration for exercising Datasette behind Apache mod_proxy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/simonw/datasette/issues/1522"&gt;#1522: Deploy a live instance of demos/apache-proxy&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The short version: I got it working! My Docker implementation now lives in the &lt;a href="https://github.com/simonw/datasette/tree/0.59.3/demos/apache-proxy"&gt;demos/apache-proxy&lt;/a&gt; directory and the live demo itself is deployed to &lt;a href="https://datasette-apache-proxy-demo.fly.dev/prefix/"&gt;datasette-apache-proxy-demo.fly.dev/prefix/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;(I ended up deploying it to &lt;a href="https://fly.io/"&gt;Fly&lt;/a&gt; after running into a bug when deployed to Cloud Run that I couldn't replicate on my own laptop.)&lt;/p&gt;
&lt;p&gt;My final implementation uses a Debian base container with Supervisord to manage the two processes.&lt;/p&gt;
&lt;p&gt;With a working live environment, I was finally able to track down the root cause of the bugs. My notes on
&lt;a href="https://github.com/simonw/datasette/issues/1519"&gt;#1519: base_url is omitted in JSON and CSV views&lt;/a&gt; document how I found and solved them, and updated the associated test to hopefully avoid them ever coming back in the future.&lt;/p&gt;
&lt;h4&gt;The big Datasette table refactor&lt;/h4&gt;
&lt;p&gt;The single most complicated part of the Datasette codebase is the code behind the table view - the page that lets you browse, facet, search, filter and paginate through the contents of a table (&lt;a href="https://covid-19.datasettes.com/covid/ny_times_us_counties"&gt;this page here&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;It's got very thorough tests, but the actual implementation is mostly &lt;a href="https://github.com/simonw/datasette/blob/main/datasette/views/table.py#L303-L992"&gt;a 600 line class method&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It was already difficult to work with, but the changes I want to make for Datasette 1.0 have proven too much for it. I need to refactor.&lt;/p&gt;
&lt;p&gt;Apart from making that view easier to change and maintain, a major goal I have is for it to support a much more flexible JSON syntax. I want the JSON version to default to just returning minimal information about the table, then allow &lt;code&gt;?_extra=x&lt;/code&gt; parameters to opt into additional information - like facets, suggested facets, full counts, SQL schema information and so on.&lt;/p&gt;
&lt;p&gt;This means I want to break up that 600 line method into a bunch of separate methods, each of which can be opted-in-to by the calling code.&lt;/p&gt;
&lt;p&gt;The HTML interface should then build on top of the JSON, requesting the extras that it knows it will need and passing the resulting data through to the template. This helps solve the challenge of having a stable template context that I can document in advance of Datasette 1.0.&lt;/p&gt;
&lt;p&gt;I've been putting this off for over a year now, because it's a &lt;em&gt;lot&lt;/em&gt; of work. But no longer! This week I finally started to get stuck in.&lt;/p&gt;
&lt;p&gt;I don't know if I'll stick with it, but my initial attempt at this is a little unconventional. Inspired by how &lt;a href="https://docs.pytest.org/en/6.2.x/fixture.html#back-to-fixtures"&gt;pytest fixtures work&lt;/a&gt; I'm experimenting with a form of dependency injection, in a new (very alpha) library I've released called &lt;a href="https://github.com/simonw/asyncinject"&gt;asyncinject&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The key idea behind &lt;code&gt;asyncinject&lt;/code&gt; is to provide a way for class methods to indicate their dependencies as named parameters, in the same way as pytest fixtures do.&lt;/p&gt;
&lt;p&gt;When you call a method, the code can spot which dependencies have not yet been resolved and execute them before executing the method.&lt;/p&gt;
&lt;p&gt;Crucially, since they are all &lt;code&gt;async def&lt;/code&gt; methods they can be &lt;em&gt;executed in parallel&lt;/em&gt;. I'm cautiously excited about this - Datasette has a bunch of opportunities for parallel queries - fetching a single page of table rows, calculating a &lt;code&gt;count(*)&lt;/code&gt; for the entire table, executing requested facets and calculating suggested facets are all queries that could potentially run in parallel rather than in serial.&lt;/p&gt;
&lt;p&gt;What about the GIL, you might ask? Datasette's database queries are handled by the &lt;code&gt;sqlite3&lt;/code&gt; module, and that module releases the GIL once it gets into SQLite C code. So theoretically I should be able to use more than one core for this all.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/simonw/asyncinject/blob/0.2a0/README.md"&gt;asyncinject README&lt;/a&gt; has more details, including code examples. This may turn out to be a terrible idea! But it's really fun to explore, and I'll be able to tell for sure if this is a useful, maintainable and performant approach once I have Datasette's table view running on top of it.&lt;/p&gt;
&lt;h4&gt;git-history and sqlite-utils&lt;/h4&gt;
&lt;p&gt;I made some big improvements to my &lt;a href="https://github.com/simonw/git-history"&gt;git-history&lt;/a&gt; tool, which automates the process of turning a JSON (or other) file that has been version-tracked in a GitHub repository (see &lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;Git scraping&lt;/a&gt;) into a SQLite database that can be used to explore changes to it over time.&lt;/p&gt;
&lt;p&gt;The biggest was a major change to the database schema. Previously, the tool used full Git SHA hashes as foreign keys in the largest table.&lt;/p&gt;
&lt;p&gt;The problem here is that a SHA hash string is 40 characters long, and if they are being used as a foreign key that's a LOT of extra weight added to the largest table.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;sqlite-utils&lt;/code&gt; has a &lt;a href="https://sqlite-utils.datasette.io/en/stable/python-api.html#python-api-lookup-tables"&gt;table.lookup() method&lt;/a&gt; which is designed to make creating "lookup" tables - where a string is stored in a unique column but an integer ID can be used for things like foreign keys - as easy as possible.&lt;/p&gt;
&lt;p&gt;That method was previously quite limited, but in &lt;a href="https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-18"&gt;sqlite-utils 3.18&lt;/a&gt; and &lt;a href="https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-19"&gt;3.19&lt;/a&gt; - both released this week - I expanded it to cover the more advanced needs of my &lt;code&gt;git-history&lt;/code&gt; tool.&lt;/p&gt;
&lt;p&gt;The great thing about building stuff on top of your own libraries is that you can discover new features that you need along the way - and then ship them promptly without them blocking your progress!&lt;/p&gt;
&lt;h4&gt;Some other highlights&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/s3-credentials/releases/tag/0.6"&gt;s3-credentials 0.6&lt;/a&gt; adds a &lt;code&gt;--dry-run&lt;/code&gt; option that you can use to show what the tool would do without making any actual changes to your AWS account. I found myself wanting this while continuing to work on the ability to &lt;a href="https://github.com/simonw/s3-credentials/issues/12"&gt;specify a folder prefix&lt;/a&gt; within S3 that the bucket credentials should be limited to.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/datasette-publish-vercel/releases/tag/0.12"&gt;datasette-publish-vercel 0.12&lt;/a&gt; applies some pull requests from Romain Clement that I had left unreviewed for far too long, and adds the ability to customize the &lt;code&gt;vercel.json&lt;/code&gt; file used for the deployment - useful for things like setting up additional custom redirects.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/2.0"&gt;datasette-graphql 2.0&lt;/a&gt; updates that plugin to &lt;a href="https://github.com/graphql-python/graphene/wiki/v3-release-notes"&gt;Graphene 3.0&lt;/a&gt;, a major update to that library. I had to break backwards compatiblity in very minor ways, hence the 2.0 version number.&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/csvs-to-sqlite/releases/tag/1.3"&gt;csvs-to-sqlite 1.3&lt;/a&gt; is the first relase of that tool in just over a year. William Rowell contributed a new feature that allows you to populate "fixed" database columns on your imported records, see &lt;a href="https://github.com/simonw/csvs-to-sqlite/pull/81"&gt;PR #81&lt;/a&gt; for details.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/python/graphlib-topologicalsorter"&gt;Planning parallel downloads with TopologicalSorter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/python/cog-to-update-help-in-readme"&gt;Using cog to update --help in a Markdown README file&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/cloudrun/using-build-args-with-cloud-run"&gt;Using build-arg variables with Cloud Run deployments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/fly/custom-subdomain-fly"&gt;Assigning a custom subdomain to a Fly app&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-publish-vercel"&gt;datasette-publish-vercel&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-publish-vercel/releases/tag/0.12"&gt;0.12&lt;/a&gt; - (&lt;a href="https://github.com/simonw/datasette-publish-vercel/releases"&gt;18 releases total&lt;/a&gt;) - 2021-11-22
&lt;br /&gt;Datasette plugin for publishing data using Vercel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/git-history"&gt;git-history&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/git-history/releases/tag/0.4"&gt;0.4&lt;/a&gt; - (&lt;a href="https://github.com/simonw/git-history/releases"&gt;6 releases total&lt;/a&gt;) - 2021-11-21
&lt;br /&gt;Tools for analyzing Git history using SQLite&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.19"&gt;3.19&lt;/a&gt; - (&lt;a href="https://github.com/simonw/sqlite-utils/releases"&gt;90 releases total&lt;/a&gt;) - 2021-11-21
&lt;br /&gt;Python CLI utility and library for manipulating SQLite databases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette"&gt;datasette&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette/releases/tag/0.59.3"&gt;0.59.3&lt;/a&gt; - (&lt;a href="https://github.com/simonw/datasette/releases"&gt;101 releases total&lt;/a&gt;) - 2021-11-20
&lt;br /&gt;An open source multi-tool for exploring and publishing data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-redirect-to-https"&gt;datasette-redirect-to-https&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-redirect-to-https/releases/tag/0.1"&gt;0.1&lt;/a&gt; - 2021-11-20
&lt;br /&gt;Datasette plugin that redirects all non-https requests to https&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/s3-credentials"&gt;s3-credentials&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/s3-credentials/releases/tag/0.6"&gt;0.6&lt;/a&gt; - (&lt;a href="https://github.com/simonw/s3-credentials/releases"&gt;6 releases total&lt;/a&gt;) - 2021-11-18
&lt;br /&gt;A tool for creating credentials for accessing S3 buckets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/csvs-to-sqlite"&gt;csvs-to-sqlite&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/csvs-to-sqlite/releases/tag/1.3"&gt;1.3&lt;/a&gt; - (&lt;a href="https://github.com/simonw/csvs-to-sqlite/releases"&gt;13 releases total&lt;/a&gt;) - 2021-11-18
&lt;br /&gt;Convert CSV files into a SQLite database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-graphql"&gt;datasette-graphql&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/2.0"&gt;2.0&lt;/a&gt; - (&lt;a href="https://github.com/simonw/datasette-graphql/releases"&gt;32 releases total&lt;/a&gt;) - 2021-11-17
&lt;br /&gt;Datasette plugin providing an automatic GraphQL API for your SQLite databases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/asyncinject"&gt;asyncinject&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/asyncinject/releases/tag/0.2a0"&gt;0.2a0&lt;/a&gt; - (&lt;a href="https://github.com/simonw/asyncinject/releases"&gt;2 releases total&lt;/a&gt;) - 2021-11-17
&lt;br /&gt;Run async workflows using pytest-fixtures-style dependency injection&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apache"&gt;apache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/refactoring"&gt;refactoring&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/supervisord"&gt;supervisord&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/docker"&gt;docker&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apache"/><category term="proxies"/><category term="refactoring"/><category term="supervisord"/><category term="docker"/><category term="datasette"/><category term="weeknotes"/><category term="git-scraping"/><category term="sqlite-utils"/></entry><entry><title>Weeknotes: Fun with Unix domain sockets</title><link href="https://simonwillison.net/2021/Jul/13/unix-domain-sockets/#atom-tag" rel="alternate"/><published>2021-07-13T18:57:10+00:00</published><updated>2021-07-13T18:57:10+00:00</updated><id>https://simonwillison.net/2021/Jul/13/unix-domain-sockets/#atom-tag</id><summary type="html">
    &lt;p&gt;A small enhancement to Datasette this week: I've added support for proxying via Unix domain sockets.&lt;/p&gt;
&lt;p&gt;This started out as a feature request from Aslak Raanes: &lt;a href="https://github.com/simonw/datasette/issues/1388"&gt;#1388: Serve using UNIX domain socket&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I've not worked with these much before so it was a good opportunity to learn something new. Unix domain sockets provide a mechanism whereby different processes on a machine can communicate with each other over a mechanism similar to TCP, but via a file path instead.&lt;/p&gt;
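&lt;p&gt;A minimal demonstration using Python's standard library shows the idea - a socket that lives at a filesystem path rather than at a host/port pair (this won't run on platforms without &lt;code&gt;AF_UNIX&lt;/code&gt; support):&lt;/p&gt;

```python
import os
import socket
import tempfile
import threading

path = os.path.join(tempfile.mkdtemp(), "demo.sock")

# Server: bind to a filesystem path instead of a host and port
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(path)
server.listen(1)

def serve():
    conn, _ = server.accept()
    conn.sendall(conn.recv(1024).upper())
    conn.close()

threading.Thread(target=serve).start()

# Client: connect to the same path, just like connecting to a TCP address
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(path)
client.sendall(b"hello over a unix socket")
reply = client.recv(1024)
print(reply)  # b'HELLO OVER A UNIX SOCKET'
```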
&lt;p&gt;I've encountered these before with the Docker daemon, which listens on path &lt;code&gt;/var/run/docker.sock&lt;/code&gt; and can be communicated with using &lt;code&gt;curl&lt;/code&gt; like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl --unix-socket /var/run/docker.sock \
  http://localhost/v1.41/containers/json
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Plenty more examples &lt;a href="https://docs.docker.com/engine/api/sdk/examples/"&gt;in the Docker documentation&lt;/a&gt; if you click the 'HTTP' tab.&lt;/p&gt;
&lt;p&gt;It turns out both nginx and Apache have the ability to proxy traffic to a Unix domain socket rather than to an HTTP port, which makes this a useful mechanism for running backend servers without attaching them to TCP ports.&lt;/p&gt;
&lt;h4&gt;Implementing this in Datasette&lt;/h4&gt;
&lt;p&gt;Datasette uses the excellent &lt;a href="https://www.uvicorn.org/"&gt;Uvicorn&lt;/a&gt; Python web server to serve traffic out of the box, and Uvicorn already &lt;a href="https://www.uvicorn.org/settings/#socket-binding"&gt;includes support for UDS&lt;/a&gt; - so adding support to Datasette was pretty easy - here's &lt;a href="https://github.com/simonw/datasette/commit/180c7a5328457aefdf847ada366e296fef4744f1"&gt;the full implementation&lt;/a&gt;. I've added a new &lt;code&gt;--uds&lt;/code&gt; option, so now you can run Datasette like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;datasette --uds /tmp/datasette.sock fixtures.db
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Datasette will "listen" on &lt;code&gt;/tmp/datasette.sock&lt;/code&gt; - which means you can run requests via &lt;code&gt;curl&lt;/code&gt; like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl --unix-socket /tmp/datasette.sock \
  http://localhost/fixtures.json | jq
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;More importantly, it means you can configure nginx or Apache to proxy to the Datasette server like this (nginx):&lt;/p&gt;
&lt;div class="highlight highlight-source-nginx"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;daemon&lt;/span&gt;&lt;span class="pl-c1"&gt; off&lt;/span&gt;;
&lt;span class="pl-k"&gt;events&lt;/span&gt; {
  &lt;span class="pl-k"&gt;worker_connections&lt;/span&gt;  &lt;span class="pl-s"&gt;1024&lt;/span&gt;;
}
&lt;span class="pl-k"&gt;http&lt;/span&gt; {
  &lt;span class="pl-k"&gt;server&lt;/span&gt; {
    &lt;span class="pl-k"&gt;listen&lt;/span&gt; &lt;span class="pl-s"&gt;80&lt;/span&gt;;
    &lt;span class="pl-k"&gt;location&lt;/span&gt; &lt;span class="pl-en"&gt;/ &lt;/span&gt;{
      &lt;span class="pl-k"&gt;proxy_pass&lt;/span&gt; http://datasette;
      &lt;span class="pl-k"&gt;proxy_set_header&lt;/span&gt; Host &lt;span class="pl-smi"&gt;$host&lt;/span&gt;;
    }
  }
  &lt;span class="pl-k"&gt;upstream&lt;/span&gt; &lt;span class="pl-en"&gt;datasette &lt;/span&gt;{
    &lt;span class="pl-k"&gt;server&lt;/span&gt; unix:/tmp/datasette.sock;
  }
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or like this (Apache):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ProxyPass / unix:/tmp/datasette.sock|http://localhost/
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Writing tests&lt;/h4&gt;
&lt;p&gt;The implementation was only a few lines of code (to pass the &lt;code&gt;uds&lt;/code&gt; option to Uvicorn) but adding a test proved a little more challenging. I used this pytest fixture to spin up a server process:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;pytest&lt;/span&gt;.&lt;span class="pl-en"&gt;fixture&lt;/span&gt;(&lt;span class="pl-s1"&gt;scope&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"session"&lt;/span&gt;)&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;ds_unix_domain_socket_server&lt;/span&gt;(&lt;span class="pl-s1"&gt;tmp_path_factory&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;socket_folder&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;tmp_path_factory&lt;/span&gt;.&lt;span class="pl-en"&gt;mktemp&lt;/span&gt;(&lt;span class="pl-s"&gt;"uds"&lt;/span&gt;)
    &lt;span class="pl-s1"&gt;uds&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;str&lt;/span&gt;(&lt;span class="pl-s1"&gt;socket_folder&lt;/span&gt; &lt;span class="pl-c1"&gt;/&lt;/span&gt; &lt;span class="pl-s"&gt;"datasette.sock"&lt;/span&gt;)
    &lt;span class="pl-s1"&gt;ds_proc&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-v"&gt;Popen&lt;/span&gt;(
        [&lt;span class="pl-s"&gt;"datasette"&lt;/span&gt;, &lt;span class="pl-s"&gt;"--memory"&lt;/span&gt;, &lt;span class="pl-s"&gt;"--uds"&lt;/span&gt;, &lt;span class="pl-s1"&gt;uds&lt;/span&gt;],
        &lt;span class="pl-s1"&gt;stdout&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-v"&gt;PIPE&lt;/span&gt;,
        &lt;span class="pl-s1"&gt;stderr&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;subprocess&lt;/span&gt;.&lt;span class="pl-v"&gt;STDOUT&lt;/span&gt;,
        &lt;span class="pl-s1"&gt;cwd&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;tempfile&lt;/span&gt;.&lt;span class="pl-en"&gt;gettempdir&lt;/span&gt;(),
    )
    &lt;span class="pl-c"&gt;# Give the server time to start&lt;/span&gt;
    &lt;span class="pl-s1"&gt;time&lt;/span&gt;.&lt;span class="pl-en"&gt;sleep&lt;/span&gt;(&lt;span class="pl-c1"&gt;1.5&lt;/span&gt;)
    &lt;span class="pl-c"&gt;# Check it started successfully&lt;/span&gt;
    &lt;span class="pl-k"&gt;assert&lt;/span&gt; &lt;span class="pl-c1"&gt;not&lt;/span&gt; &lt;span class="pl-s1"&gt;ds_proc&lt;/span&gt;.&lt;span class="pl-en"&gt;poll&lt;/span&gt;(), &lt;span class="pl-s1"&gt;ds_proc&lt;/span&gt;.&lt;span class="pl-s1"&gt;stdout&lt;/span&gt;.&lt;span class="pl-en"&gt;read&lt;/span&gt;().&lt;span class="pl-en"&gt;decode&lt;/span&gt;(&lt;span class="pl-s"&gt;"utf-8"&lt;/span&gt;)
    &lt;span class="pl-k"&gt;yield&lt;/span&gt; &lt;span class="pl-s1"&gt;ds_proc&lt;/span&gt;, &lt;span class="pl-s1"&gt;uds&lt;/span&gt;
    &lt;span class="pl-c"&gt;# Shut it down at the end of the pytest session&lt;/span&gt;
    &lt;span class="pl-s1"&gt;ds_proc&lt;/span&gt;.&lt;span class="pl-en"&gt;terminate&lt;/span&gt;()&lt;/pre&gt;
&lt;p&gt;I use a similar pattern &lt;a href="https://github.com/simonw/datasette/blob/7f4c854db1ed8c15338e9cf42d2a3f0c92e3b7b2/tests/conftest.py#L104-L155"&gt;for some other tests&lt;/a&gt;, to exercise the &lt;code&gt;--ssl-keyfile&lt;/code&gt; and &lt;code&gt;--ssl-certfile&lt;/code&gt; options added in &lt;a href="https://github.com/simonw/datasette/issues/1221"&gt;#1221&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The test itself looks like this, taking advantage of HTTPX's ability to make calls against Unix domain sockets:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;pytest&lt;/span&gt;.&lt;span class="pl-s1"&gt;mark&lt;/span&gt;.&lt;span class="pl-s1"&gt;serial&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;pytest&lt;/span&gt;.&lt;span class="pl-s1"&gt;mark&lt;/span&gt;.&lt;span class="pl-en"&gt;skipif&lt;/span&gt;(&lt;span class="pl-c1"&gt;not&lt;/span&gt; &lt;span class="pl-en"&gt;hasattr&lt;/span&gt;(&lt;span class="pl-s1"&gt;socket&lt;/span&gt;, &lt;span class="pl-s"&gt;"AF_UNIX"&lt;/span&gt;), &lt;span class="pl-s1"&gt;reason&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"Requires socket.AF_UNIX support"&lt;/span&gt;)&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;test_serve_unix_domain_socket&lt;/span&gt;(&lt;span class="pl-s1"&gt;ds_unix_domain_socket_server&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;_&lt;/span&gt;, &lt;span class="pl-s1"&gt;uds&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;ds_unix_domain_socket_server&lt;/span&gt;
    &lt;span class="pl-s1"&gt;transport&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;httpx&lt;/span&gt;.&lt;span class="pl-v"&gt;HTTPTransport&lt;/span&gt;(&lt;span class="pl-s1"&gt;uds&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;uds&lt;/span&gt;)
    &lt;span class="pl-s1"&gt;client&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;httpx&lt;/span&gt;.&lt;span class="pl-v"&gt;Client&lt;/span&gt;(&lt;span class="pl-s1"&gt;transport&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;transport&lt;/span&gt;)
    &lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;client&lt;/span&gt;.&lt;span class="pl-en"&gt;get&lt;/span&gt;(&lt;span class="pl-s"&gt;"http://localhost/_memory.json"&lt;/span&gt;)
    &lt;span class="pl-k"&gt;assert&lt;/span&gt; {
        &lt;span class="pl-s"&gt;"database"&lt;/span&gt;: &lt;span class="pl-s"&gt;"_memory"&lt;/span&gt;,
        &lt;span class="pl-s"&gt;"path"&lt;/span&gt;: &lt;span class="pl-s"&gt;"/_memory"&lt;/span&gt;,
        &lt;span class="pl-s"&gt;"tables"&lt;/span&gt;: [],
    }.&lt;span class="pl-en"&gt;items&lt;/span&gt;() &lt;span class="pl-c1"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-en"&gt;json&lt;/span&gt;().&lt;span class="pl-en"&gt;items&lt;/span&gt;()&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;skipif&lt;/code&gt; decorator avoids running this test on platforms which don't support Unix domain sockets (which I think includes Windows, see &lt;a href="https://github.com/simonw/datasette/issues/1388#issuecomment-877716359"&gt;this comment&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;@pytest.mark.serial&lt;/code&gt; decorator applies a "mark" that can be used to selectively run the test. I do this because Datasette's tests run in CI using &lt;a href="https://pypi.org/project/pytest-xdist/"&gt;pytest-xdist&lt;/a&gt;, but that's not compatible with this way of spinning up a temporary server. Datasette actually runs the tests in GitHub Actions &lt;a href="https://github.com/simonw/datasette/blob/7f4c854db1ed8c15338e9cf42d2a3f0c92e3b7b2/.github/workflows/test.yml#L27-L30"&gt;like so&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;- &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Run tests&lt;/span&gt;
  &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|&lt;/span&gt;
&lt;span class="pl-s"&gt;    pytest -n auto -m "not serial"&lt;/span&gt;
&lt;span class="pl-s"&gt;    pytest -m "serial"&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;pytest -n auto -m "not serial"&lt;/code&gt; line runs almost all of the tests using &lt;code&gt;pytest-xdist&lt;/code&gt; across an automatically selected number of processes, but skips the ones marked with &lt;code&gt;@pytest.mark.serial&lt;/code&gt;. Then the second line runs the remaining serial tests without any additional concurrency.&lt;/p&gt;
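&lt;p&gt;For the mark to be recognised cleanly it needs to be registered with pytest. A minimal sketch of how that registration might look in &lt;code&gt;conftest.py&lt;/code&gt; (illustrative only, not Datasette's actual conftest):&lt;/p&gt;

```python
# Sketch: registering a custom "serial" marker in conftest.py.
# Assumption: illustrative only, not Datasette's actual code.
def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "serial: tests that spin up a real server and must not run under pytest-xdist",
    )
```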
&lt;p&gt;Documentation and example configuration for this feature can be found in the &lt;a href="https://docs.datasette.io/en/latest/deploying.html#running-datasette-behind-a-proxy"&gt;Running Datasette behind a proxy&lt;/a&gt; documentation. Thanks to Aslak for contributing the notes on Apache configuration.&lt;/p&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/javascript/preventing-double-form-submission"&gt;Preventing double form submissions with JavaScript&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/cloudrun/increase-cloud-scheduler-time-limit"&gt;Increasing the time limit for a Google Cloud Scheduler task&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/sqlite/pysqlite3-on-macos"&gt;Using pysqlite3 on macOS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/nginx/proxy-domain-sockets"&gt;Using nginx to proxy to a Unix domain socket&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="proxies"/><category term="datasette"/><category term="weeknotes"/></entry><entry><title>Building a stateless API proxy</title><link href="https://simonwillison.net/2019/May/30/building-a-stateless-api-proxy/#atom-tag" rel="alternate"/><published>2019-05-30T04:28:55+00:00</published><updated>2019-05-30T04:28:55+00:00</updated><id>https://simonwillison.net/2019/May/30/building-a-stateless-api-proxy/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.thea.codes/building-a-stateless-api-proxy/"&gt;Building a stateless API proxy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This is a really clever idea. The GitHub API is infuriatingly coarsely grained with its permissions: you often end up having to create a token with way more permissions than you actually need for your project. Thea Flowers proposes running your own proxy in front of their API that adds more finely grained permissions, using custom encrypted proxy API tokens: JWT is used to encode the original API key along with the permissions you want to grant to that particular token, as a list of regular expressions matching paths on the underlying API.
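&lt;p&gt;The stateless-token idea can be sketched with the standard library, swapping in HMAC-signed JSON for the JWT library the post describes (the names and signing scheme here are illustrative assumptions, not Thea's implementation):&lt;/p&gt;

```python
# Sketch of a stateless proxy token: the token itself carries the upstream
# API key plus the permitted path regexes, so the proxy needs no database.
# Assumption: HMAC-signed JSON stands in for the JWT described in the post.
import base64
import hashlib
import hmac
import json
import re

SECRET = b"proxy-signing-secret"  # held only by the proxy

def make_proxy_token(real_api_key, allowed_path_patterns):
    payload = base64.urlsafe_b64encode(
        json.dumps({"key": real_api_key, "paths": allowed_path_patterns}).encode()
    ).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig

def authorize(token, request_path):
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, sig):
        raise PermissionError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if not any(re.fullmatch(p, request_path) for p in claims["paths"]):
        raise PermissionError(request_path)
    return claims["key"]  # forward the request upstream with the real key
```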

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/theavalkyrie/status/1133864634178424832"&gt;@theavalkyrie&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/encryption"&gt;encryption&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jwt"&gt;jwt&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="encryption"/><category term="github"/><category term="proxies"/><category term="security"/><category term="jwt"/></entry><entry><title>Charles Proxy now available on iOS</title><link href="https://simonwillison.net/2018/Mar/28/charles-proxy-now-available-on-ios/#atom-tag" rel="alternate"/><published>2018-03-28T15:57:34+00:00</published><updated>2018-03-28T15:57:34+00:00</updated><id>https://simonwillison.net/2018/Mar/28/charles-proxy-now-available-on-ios/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://blog.xk72.com/post/172324913544/charles-proxy-now-available-on-ios/amp?__twitter_impression=true"&gt;Charles Proxy now available on iOS&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I didn’t think this was possible, but the Charles debugging proxy is now available for iOS. It works by setting itself up as a VPN such that all app traffic runs through it. You can also optionally turn on SSL decryption for specific hosts by installing a special certificate (which involves jumping through several hoops). It won’t work for apps that implement SSL certificate pinning but from playing with it for a few minutes it looks like most apps haven’t done that, even apps from Google. Well worth $8.99.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/codepo8/status/978844829277917189"&gt;@codepo8&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/charles"&gt;charles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ios"&gt;ios&lt;/a&gt;&lt;/p&gt;



</summary><category term="charles"/><category term="proxies"/><category term="ios"/></entry><entry><title>Velocity: Forcing Gzip Compression</title><link href="https://simonwillison.net/2010/Sep/30/gzip/#atom-tag" rel="alternate"/><published>2010-09-30T17:45:00+00:00</published><updated>2010-09-30T17:45:00+00:00</updated><id>https://simonwillison.net/2010/Sep/30/gzip/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.stevesouders.com/blog/2010/07/12/velocity-forcing-gzip-compression/"&gt;Velocity: Forcing Gzip Compression&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Almost every browser supports gzip these days, but 15% of web requests have had their Accept-Encoding header stripped or mangled, generally due to poorly implemented proxies or anti-virus software. Steve Souders passes on a trick used by Google Search, where an iframe is used to test the browser’s gzip support and set a cookie to force gzipping of future pages.
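&lt;p&gt;The server-side half of that trick is simple to sketch (header and cookie names here are illustrative assumptions, not Google's actual implementation):&lt;/p&gt;

```python
# Sketch: decide whether to gzip a response. Compress when the browser says
# it can, or when the iframe probe already proved it can and set a cookie.
# Assumption: "force_gzip" is an illustrative cookie name.
def should_gzip(headers, cookies):
    if "gzip" in headers.get("Accept-Encoding", ""):
        return True  # the normal case: the header survived intact
    # The probe iframe loaded a gzip-only resource successfully, so force
    # compression despite the stripped or mangled Accept-Encoding header.
    return cookies.get("force_gzip") == "1"
```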


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/browsers"&gt;browsers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gzip"&gt;gzip&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/performance"&gt;performance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/steve-souders"&gt;steve-souders&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="browsers"/><category term="gzip"/><category term="performance"/><category term="proxies"/><category term="steve-souders"/><category term="recovered"/></entry><entry><title>nodejitsu's node-http-proxy</title><link href="https://simonwillison.net/2010/Jul/28/nodejitsus/#atom-tag" rel="alternate"/><published>2010-07-28T23:34:00+00:00</published><updated>2010-07-28T23:34:00+00:00</updated><id>https://simonwillison.net/2010/Jul/28/nodejitsus/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://github.com/nodejitsu/node-http-proxy"&gt;nodejitsu&amp;#x27;s node-http-proxy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Exactly what I’ve been waiting for—a robust HTTP proxy library for Node that makes it trivial to proxy requests to a backend with custom proxy behaviour added in JavaScript. The example app adds an artificial delay to every request to simulate a slow connection, but other exciting potential use cases could include rate limiting, API key restriction, logging, load balancing, lint testing and more besides.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://thechangelog.com/post/872114581/node-http-proxy-reverse-proxy-for-node-js"&gt;The Changelog&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nodejs"&gt;nodejs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="javascript"/><category term="nodejs"/><category term="proxies"/><category term="recovered"/></entry><entry><title>A HTTP Proxy Server in 20 Lines of node.js</title><link href="https://simonwillison.net/2010/Apr/28/http/#atom-tag" rel="alternate"/><published>2010-04-28T13:24:58+00:00</published><updated>2010-04-28T13:24:58+00:00</updated><id>https://simonwillison.net/2010/Apr/28/http/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.catonmat.net/http-proxy-in-nodejs?utm_source=feedburner&amp;amp;utm_medium=feed&amp;amp;utm_campaign=Feed%3A catonmat %28good coders code%2C great reuse%29"&gt;A HTTP Proxy Server in 20 Lines of node.js&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Proxying is definitely a sweet spot for Node.js. Peteris Krummins takes it a step further, adding host blacklists and an IP whitelist as configuration files and using Node’s watchFile method to automatically reload changes to them.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nodejs"&gt;nodejs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/peteris-krummins"&gt;peteris-krummins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="javascript"/><category term="nodejs"/><category term="peteris-krummins"/><category term="proxies"/></entry><entry><title>Using Django as a Pass Through Image Proxy</title><link href="https://simonwillison.net/2010/Mar/22/passthrough/#atom-tag" rel="alternate"/><published>2010-03-22T07:18:18+00:00</published><updated>2010-03-22T07:18:18+00:00</updated><id>https://simonwillison.net/2010/Mar/22/passthrough/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://menendez.com/blog/using-django-as-pass-through-image-proxy/"&gt;Using Django as a Pass Through Image Proxy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Neat idea for running development environments against data copied from a live production site—a static file serving handler which uses a local cache but copies in user-uploaded files from the production site the first time they are requested.
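&lt;p&gt;The core of the pattern fits in a few lines even outside Django (&lt;code&gt;PRODUCTION_URL&lt;/code&gt; and &lt;code&gt;CACHE_DIR&lt;/code&gt; are illustrative assumptions, not the post's actual code):&lt;/p&gt;

```python
# Sketch of a pass-through media handler: serve from a local cache,
# fetching from the production site on the first miss.
# Assumption: PRODUCTION_URL and CACHE_DIR are illustrative names.
import pathlib
import tempfile
import urllib.request

PRODUCTION_URL = "https://example.com/media/"
CACHE_DIR = pathlib.Path(tempfile.gettempdir()) / "media-cache"

def serve_media(path):
    local = CACHE_DIR / path
    if not local.exists():
        local.parent.mkdir(parents=True, exist_ok=True)
        # First request for this file: copy it down from production.
        urllib.request.urlretrieve(PRODUCTION_URL + path, local)
    return local.read_bytes()
```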

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://www.djangosnippets.org/snippets/1967/"&gt;Django snippets&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="proxies"/></entry><entry><title>Traffic Server</title><link href="https://simonwillison.net/2009/Nov/1/trafficserver/#atom-tag" rel="alternate"/><published>2009-11-01T12:15:27+00:00</published><updated>2009-11-01T12:15:27+00:00</updated><id>https://simonwillison.net/2009/Nov/1/trafficserver/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.mnot.net/blog/2009/10/30/traffic_server"&gt;Traffic Server&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Mark Nottingham explains the release of Traffic Server, a new Apache Incubator open source project donated by Yahoo! using code originally developed at Inktomi around a decade ago. Traffic Server is an HTTP proxy/cache, similar to Squid and Varnish (though Traffic Server acts as both a forward and reverse proxy, whereas Varnish only handles reverse).


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apache"&gt;apache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cache"&gt;cache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inktomi"&gt;inktomi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mark-nottingham"&gt;mark-nottingham&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/squid"&gt;squid&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/trafficserver"&gt;trafficserver&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/varnish"&gt;varnish&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yahoo"&gt;yahoo&lt;/a&gt;&lt;/p&gt;



</summary><category term="apache"/><category term="cache"/><category term="http"/><category term="inktomi"/><category term="mark-nottingham"/><category term="open-source"/><category term="proxies"/><category term="squid"/><category term="trafficserver"/><category term="varnish"/><category term="yahoo"/></entry><entry><title>Exploring OAuth-Protected APIs</title><link href="https://simonwillison.net/2009/Aug/23/oauth/#atom-tag" rel="alternate"/><published>2009-08-23T11:06:42+00:00</published><updated>2009-08-23T11:06:42+00:00</updated><id>https://simonwillison.net/2009/Aug/23/oauth/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://mojodna.net/2009/08/21/exploring-oauth-protected-apis.html"&gt;Exploring OAuth-Protected APIs&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
One of the downsides of OAuth is that it makes debugging APIs in your browser much harder. Seth Fitzsimmons’ oauth-proxy solves this by running a Twisted-powered proxy on your local machine which OAuth-signs every request going through it using your consumer key, secret and tokens for that API. Using it with a browser risks exposing your key and token (but not secret) to sites you accidentally browse to—it would be useful if you could pass a whitelist of API domains as a command line option to the proxy.
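&lt;p&gt;The whitelist check being wished for here is small enough to sketch (an illustrative assumption, not part of oauth-proxy itself):&lt;/p&gt;

```python
# Sketch: only OAuth-sign requests whose host is on an explicit list of
# API domains, matching the domain itself or any subdomain of it.
# Assumption: illustrative, not oauth-proxy's actual behaviour.
def should_sign(host, api_domains):
    return any(
        host == domain or host.endswith("." + domain)
        for domain in api_domains
    )
```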


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/oauth"&gt;oauth&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/seth-fitzsimmons"&gt;seth-fitzsimmons&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twisted"&gt;twisted&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="oauth"/><category term="proxies"/><category term="python"/><category term="seth-fitzsimmons"/><category term="twisted"/></entry><entry><title>Yahoo! proposal to open source "Traffic Server" via the ASF</title><link href="https://simonwillison.net/2009/Jul/7/trafficserver/#atom-tag" rel="alternate"/><published>2009-07-07T12:37:02+00:00</published><updated>2009-07-07T12:37:02+00:00</updated><id>https://simonwillison.net/2009/Jul/7/trafficserver/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://wiki.apache.org/incubator/TrafficServerProposal"&gt;Yahoo! proposal to open source &amp;quot;Traffic Server&amp;quot; via the ASF&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Traffic Server is a “fast, scalable and extensible HTTP/1.1 compliant  caching proxy server” (presumably equivalent to things like Squid and Varnish) originally acquired from Inktomi and developed internally at Yahoo! for the past three years, which has been benchmarked handling 35,000 req/s on a single box. No source code yet but it looks like the release will arrive pretty soon.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apache"&gt;apache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/asf"&gt;asf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/squid"&gt;squid&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/trafficserver"&gt;trafficserver&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/varnish"&gt;varnish&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yahoo"&gt;yahoo&lt;/a&gt;&lt;/p&gt;



</summary><category term="apache"/><category term="asf"/><category term="caching"/><category term="open-source"/><category term="proxies"/><category term="squid"/><category term="trafficserver"/><category term="varnish"/><category term="yahoo"/></entry><entry><title>How to use Django with Apache and mod_wsgi</title><link href="https://simonwillison.net/2009/Apr/1/modwsgi/#atom-tag" rel="alternate"/><published>2009-04-01T00:24:04+00:00</published><updated>2009-04-01T00:24:04+00:00</updated><id>https://simonwillison.net/2009/Apr/1/modwsgi/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://docs.djangoproject.com/en/dev/howto/deployment/modwsgi/"&gt;How to use Django with Apache and mod_wsgi&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
My favourite deployment option is now included in the official Django docs, thanks to Alex Gaynor. I tend to run a stripped down Apache with mod_wsgi behind an nginx proxy, and have nginx serve static files directly. This avoids the need for a completely separate media server (although a separate media domain is still a good idea for better client-side performance).


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/alex-gaynor"&gt;alex-gaynor&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/deployment"&gt;deployment&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/modwsgi"&gt;modwsgi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/wsgi"&gt;wsgi&lt;/a&gt;&lt;/p&gt;



</summary><category term="alex-gaynor"/><category term="deployment"/><category term="django"/><category term="modwsgi"/><category term="nginx"/><category term="proxies"/><category term="python"/><category term="wsgi"/></entry><entry><title>Sloppy - the slow proxy</title><link href="https://simonwillison.net/2009/Jan/13/sloppy/#atom-tag" rel="alternate"/><published>2009-01-13T16:17:41+00:00</published><updated>2009-01-13T16:17:41+00:00</updated><id>https://simonwillison.net/2009/Jan/13/sloppy/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.dallaway.com/sloppy/"&gt;Sloppy - the slow proxy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Java Web Start GUI application which runs a proxy to the site of your choice simulating lower connection speeds—great for testing how well your ajax holds up under poor network conditions.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ajax"&gt;ajax&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/java"&gt;java&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javawebstart"&gt;javawebstart&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/performance"&gt;performance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/richard-dallaway"&gt;richard-dallaway&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sloppy"&gt;sloppy&lt;/a&gt;&lt;/p&gt;



</summary><category term="ajax"/><category term="java"/><category term="javascript"/><category term="javawebstart"/><category term="performance"/><category term="proxies"/><category term="richard-dallaway"/><category term="sloppy"/></entry><entry><title>ratproxy</title><link href="https://simonwillison.net/2008/Jul/3/ratproxy/#atom-tag" rel="alternate"/><published>2008-07-03T14:35:25+00:00</published><updated>2008-07-03T14:35:25+00:00</updated><id>https://simonwillison.net/2008/Jul/3/ratproxy/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.google.com/p/ratproxy/"&gt;ratproxy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“A semi-automated, largely passive web application security audit tool”—watches you browse and highlights potential XSS, CSRF and other vulnerabilities in your application. Created by Michal Zalewski  at Google.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/csrf"&gt;csrf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/michal-zalewski"&gt;michal-zalewski&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ratproxy"&gt;ratproxy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/xss"&gt;xss&lt;/a&gt;&lt;/p&gt;



</summary><category term="csrf"/><category term="google"/><category term="michal-zalewski"/><category term="proxies"/><category term="ratproxy"/><category term="security"/><category term="testing"/><category term="xss"/></entry><entry><title>Apache proxy auto-re-loader</title><link href="https://simonwillison.net/2008/Feb/18/ned/#atom-tag" rel="alternate"/><published>2008-02-18T09:44:02+00:00</published><updated>2008-02-18T09:44:02+00:00</updated><id>https://simonwillison.net/2008/Feb/18/ned/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://nedbatchelder.com/blog/200802/apache_proxy_autoreloader.html"&gt;Apache proxy auto-re-loader&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Neat trick: set your 502 (Bad Gateway) error document to include a meta refresh tag, automating the refresh needed should a server you are proxying to be temporarily unavailable.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apache"&gt;apache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/metarefresh"&gt;metarefresh&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ned-batchelder"&gt;ned-batchelder&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;&lt;/p&gt;



</summary><category term="apache"/><category term="metarefresh"/><category term="ned-batchelder"/><category term="proxies"/></entry><entry><title>A Fair Proxy Balancer for Nginx and Mongrel</title><link href="https://simonwillison.net/2007/Dec/9/fair/#atom-tag" rel="alternate"/><published>2007-12-09T14:57:44+00:00</published><updated>2007-12-09T14:57:44+00:00</updated><id>https://simonwillison.net/2007/Dec/9/fair/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://brainspl.at/articles/2007/11/09/a-fair-proxy-balancer-for-nginx-and-mongrel"&gt;A Fair Proxy Balancer for Nginx and Mongrel&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
nginx uses round robin for proxying by default; this extension module ensures requests are queued up and sent through to backend mongrel servers that aren’t currently busy. I don’t see any reason this wouldn’t work with servers other than mongrel.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/fair"&gt;fair&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/load-balancing"&gt;load-balancing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mongrel"&gt;mongrel&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;&lt;/p&gt;



</summary><category term="fair"/><category term="load-balancing"/><category term="mongrel"/><category term="nginx"/><category term="proxies"/></entry><entry><title>The State of Proxy Caching</title><link href="https://simonwillison.net/2007/Jun/21/mnot/#atom-tag" rel="alternate"/><published>2007-06-21T14:18:50+00:00</published><updated>2007-06-21T14:18:50+00:00</updated><id>https://simonwillison.net/2007/Jun/21/mnot/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.mnot.net/blog/2007/06/20/proxy_caching"&gt;The State of Proxy Caching&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
If you’ve always wondered exactly what intermediate proxies are going to do to your carefully constructed Web application, here’s your answer.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mark-nottingham"&gt;mark-nottingham&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="http"/><category term="mark-nottingham"/><category term="proxies"/></entry><entry><title>Online and offline development with the YUI and Charles</title><link href="https://simonwillison.net/2007/May/15/online/#atom-tag" rel="alternate"/><published>2007-05-15T14:41:39+00:00</published><updated>2007-05-15T14:41:39+00:00</updated><id>https://simonwillison.net/2007/May/15/online/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://muffinresearch.co.uk/archives/2007/04/26/online-and-offline-development-with-the-yui-and-charles/"&gt;Online and offline development with the YUI and Charles&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Stuart Colville shows how the Charles debugging proxy can be used to serve up hosted YUI files while developing offline.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://yuiblog.com/blog/2007/05/15/in-the-wild-20070515/"&gt;Yahoo! User Interface Blog&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/charles"&gt;charles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/debugging"&gt;debugging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/offline"&gt;offline&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxies"&gt;proxies&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stuart-colville"&gt;stuart-colville&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yui"&gt;yui&lt;/a&gt;&lt;/p&gt;



</summary><category term="charles"/><category term="debugging"/><category term="javascript"/><category term="offline"/><category term="proxies"/><category term="stuart-colville"/><category term="yui"/></entry></feed>