<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: foursquare</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/foursquare.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2024-11-20T05:52:38+00:00</updated><author><name>Simon Willison</name></author><entry><title>Foursquare Open Source Places: A new foundational dataset for the geospatial community</title><link href="https://simonwillison.net/2024/Nov/20/foursquare-open-source-places/#atom-tag" rel="alternate"/><published>2024-11-20T05:52:38+00:00</published><updated>2024-11-20T05:52:38+00:00</updated><id>https://simonwillison.net/2024/Nov/20/foursquare-open-source-places/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://location.foursquare.com/resources/blog/products/foursquare-open-source-places-a-new-foundational-dataset-for-the-geospatial-community/"&gt;Foursquare Open Source Places: A new foundational dataset for the geospatial community&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I did not expect this!&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[...] we are announcing today the general availability of a foundational open data set, Foursquare Open Source Places ("FSQ OS Places"). This base layer of 100mm+ global places of interest ("POI") includes 22 core attributes (see schema &lt;a href="https://docs.foursquare.com/data-products/docs/places-os-data-schema"&gt;here&lt;/a&gt;) that will be updated monthly and available for commercial use under the Apache 2.0 license framework.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The data is available &lt;a href="https://docs.foursquare.com/data-products/docs/access-fsq-os-places"&gt;as Parquet files&lt;/a&gt; hosted on Amazon S3.&lt;/p&gt;
&lt;p&gt;Here's how to list the available files:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;aws s3 ls s3://fsq-os-places-us-east-1/release/dt=2024-11-19/places/parquet/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I got back &lt;code&gt;places-00000.snappy.parquet&lt;/code&gt; through &lt;code&gt;places-00024.snappy.parquet&lt;/code&gt;, each file around 455MB for a total of 10.6GB of data.&lt;/p&gt;
&lt;p&gt;I ran &lt;code&gt;duckdb&lt;/code&gt; and then used DuckDB's ability to remotely query Parquet on S3 to explore the data a bit more without downloading it to my laptop first:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;select count(*) from 's3://fsq-os-places-us-east-1/release/dt=2024-11-19/places/parquet/places-00000.snappy.parquet';
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This got back 4,180,424 - that number is similar for each file, suggesting around 104,000,000 records total.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; DuckDB can use wildcards in S3 paths (thanks, &lt;a href="https://mas.to/@paulbailey/113520325087085448"&gt;Paul&lt;/a&gt;) so this query provides an exact count:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;select count(*) from 's3://fsq-os-places-us-east-1/release/dt=2024-11-19/places/parquet/places-*.snappy.parquet';
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That returned 104,511,073 - and Activity Monitor on my Mac confirmed that DuckDB only needed to fetch 1.2MB of data to answer that query.&lt;/p&gt;
&lt;p&gt;I ran this query to retrieve 1,000 places from that first file as newline-delimited JSON:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;copy (
    select * from 's3://fsq-os-places-us-east-1/release/dt=2024-11-19/places/parquet/places-00000.snappy.parquet'
    limit 1000
) to '/tmp/places.json';
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/53ad57ad42c7efe75e2028d975907180"&gt;that places.json file&lt;/a&gt;, and here it is &lt;a href="https://lite.datasette.io/?json=https://gist.github.com/simonw/53ad57ad42c7efe75e2028d975907180#/data/raw"&gt;imported into Datasette Lite&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Finally, I got ChatGPT Code Interpreter to &lt;a href="https://chatgpt.com/share/673d7b92-0b4c-8006-a442-c5e6c2713d9c"&gt;convert that file to GeoJSON&lt;/a&gt; and pasted the result &lt;a href="https://gist.github.com/simonw/1e2a170b7368932ebd3922cb5d234924"&gt;into this Gist&lt;/a&gt;, giving me a map of those thousand places (because Gists automatically render GeoJSON):&lt;/p&gt;
&lt;p&gt;&lt;img alt="A map of the world with 1000 markers on it. A marker in Columbia shows a dialog for Raisbeck, Bogota Dv, Cra 47 A 114 05 Second Floor" src="https://static.simonwillison.net/static/2024/places-geojson.jpg" /&gt;

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://waxy.org/2024/11/foursquare-open-sources-its-places-database/"&gt;Andy Baio&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/foursquare"&gt;foursquare&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geojson"&gt;geojson&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/parquet"&gt;parquet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/duckdb"&gt;duckdb&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/code-interpreter"&gt;code-interpreter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;&lt;/p&gt;



</summary><category term="geospatial"/><category term="open-source"/><category term="foursquare"/><category term="geojson"/><category term="parquet"/><category term="duckdb"/><category term="datasette-lite"/><category term="ai-assisted-programming"/><category term="code-interpreter"/><category term="coding-agents"/></entry><entry><title>Which is the most complete and up to date API for restaurants/nightlife?</title><link href="https://simonwillison.net/2013/Oct/19/which-is-the-most/#atom-tag" rel="alternate"/><published>2013-10-19T18:48:00+00:00</published><updated>2013-10-19T18:48:00+00:00</updated><id>https://simonwillison.net/2013/Oct/19/which-is-the-most/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/Which-is-the-most-complete-and-up-to-date-API-for-restaurants-nightlife/answer/Simon-Willison"&gt;Which is the most complete and up to date API for restaurants/nightlife?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The foursquare API is pretty great for restaurants and nightlife these days. No chance if revenue share though - how would you envisage revenue share working?&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yelp"&gt;yelp&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/foursquare"&gt;foursquare&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="yelp"/><category term="quora"/><category term="foursquare"/></entry><entry><title>Events (leisure): What platforms based on sharing/creating users' future plans with friends have been successful?</title><link href="https://simonwillison.net/2013/Jul/21/events-leisure-what-platforms/#atom-tag" rel="alternate"/><published>2013-07-21T12:35:00+00:00</published><updated>2013-07-21T12:35:00+00:00</updated><id>https://simonwillison.net/2013/Jul/21/events-leisure-what-platforms/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/Events-leisure-What-platforms-based-on-sharing-creating-users-future-plans-with-friends-have-been-successful/answer/Simon-Willison"&gt;Events (leisure): What platforms based on sharing/creating users&amp;#39; future plans with friends have been successful?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The founder of Plancast wrote an extensive post mortem discussing the challenges involved here: &lt;/p&gt;

&lt;span&gt;&lt;a href="http://www.techcrunch.com/2012/01/22/post-mortem-for-plancast/"&gt;http://www.techcrunch.com/2012/0...&lt;/a&gt;&lt;/span&gt;

&lt;p&gt;A key challenge is that people don't tend to announce their plans more than a few days in advance, which means that the information they share has a very short shelf life during which it is useful.&lt;/p&gt;

&lt;p&gt;We've managed to avoid that particular trap at &lt;span&gt;&lt;a href="http://lanyrd.com/"&gt;http://lanyrd.com/&lt;/a&gt;&lt;/span&gt; because people (in particular speakers) announce their participation in a conference months in advance - unlike social or entertainment events.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/events"&gt;events&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/social-networks"&gt;social-networks&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/foursquare"&gt;foursquare&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plancast"&gt;plancast&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="events"/><category term="social-networks"/><category term="quora"/><category term="foursquare"/><category term="plancast"/></entry></feed>