<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: kafka</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/kafka.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2024-07-31T17:34:54+00:00</updated><author><name>Simon Willison</name></author><entry><title>Build your own SQS or Kafka with Postgres</title><link href="https://simonwillison.net/2024/Jul/31/sqs-or-kafka-with-postgres/#atom-tag" rel="alternate"/><published>2024-07-31T17:34:54+00:00</published><updated>2024-07-31T17:34:54+00:00</updated><id>https://simonwillison.net/2024/Jul/31/sqs-or-kafka-with-postgres/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.sequinstream.com/build-your-own-sqs-or-kafka-with-postgres/"&gt;Build your own SQS or Kafka with Postgres&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Anthony Accomazzo works on &lt;a href="https://github.com/sequinstream/sequin"&gt;Sequin&lt;/a&gt;, an open source "message stream" (similar to Kafka) written in Elixir and Go on top of PostgreSQL.&lt;/p&gt;
&lt;p&gt;This detailed article describes how you can implement message queue patterns on PostgreSQL from scratch, including this neat example using a CTE, &lt;code&gt;returning&lt;/code&gt; and &lt;code&gt;for update skip locked&lt;/code&gt; to retrieve &lt;code&gt;$1&lt;/code&gt; messages from the &lt;code&gt;messages&lt;/code&gt; table and simultaneously mark them with &lt;code&gt;not_visible_until&lt;/code&gt; set to &lt;code&gt;$2&lt;/code&gt; in order to "lock" them for processing by a client:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;with available_messages as (
  select seq
  from messages
  where not_visible_until is null
    or (not_visible_until &amp;lt;= now())
  order by inserted_at
  limit $1
  for update skip locked
)
update messages m
set
  not_visible_until = $2,
  deliver_count = deliver_count + 1,
  last_delivered_at = now(),
  updated_at = now()
from available_messages am
where m.seq = am.seq
returning m.seq, m.data;
&lt;/code&gt;&lt;/pre&gt;
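&lt;p&gt;A consumer wrapping that query might look like this minimal Python sketch. The &lt;code&gt;receive_messages&lt;/code&gt; helper, the parameter names and the visibility-timeout arithmetic are illustrative assumptions, not code from the article; it should work with any DB-API connection to Postgres, such as one from psycopg2:&lt;/p&gt;

```python
# Hypothetical consumer for the queue pattern above: claims a batch of
# messages and hides them from other consumers for a visibility timeout.
# (The "or now() >= not_visible_until" form is equivalent to the
# "not_visible_until <= now()" comparison in the article's query.)
RECEIVE_SQL = """
with available_messages as (
  select seq
  from messages
  where not_visible_until is null
    or now() >= not_visible_until
  order by inserted_at
  limit %(batch)s
  for update skip locked
)
update messages m
set
  not_visible_until = now() + %(timeout)s * interval '1 second',
  deliver_count = deliver_count + 1,
  last_delivered_at = now(),
  updated_at = now()
from available_messages am
where m.seq = am.seq
returning m.seq, m.data;
"""

def receive_messages(conn, batch=10, visibility_timeout=30):
    """Claim up to `batch` messages; hide them for `visibility_timeout` seconds."""
    with conn.cursor() as cur:
        cur.execute(RECEIVE_SQL, {"batch": batch, "timeout": visibility_timeout})
        rows = cur.fetchall()
    # Committing releases the row locks; redelivery is now guarded only by
    # not_visible_until, so a crashed consumer's messages become visible again
    # once the timeout expires.
    conn.commit()
    return rows
```

&lt;p&gt;In this pattern a successful consumer would then delete (or otherwise acknowledge) the processed rows; if it crashes instead, the visibility timeout expires and the messages are delivered again, matching SQS-style at-least-once semantics.&lt;/p&gt;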

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/ap6qvh/build_your_own_sqs_kafka_with_postgres"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/message-queues"&gt;message-queues&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqs"&gt;sqs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="message-queues"/><category term="postgresql"/><category term="sql"/><category term="sqs"/><category term="kafka"/></entry><entry><title>RabbitMQ Streams Overview</title><link href="https://simonwillison.net/2021/Jul/13/rabbitmq-streams-overview/#atom-tag" rel="alternate"/><published>2021-07-13T23:29:18+00:00</published><updated>2021-07-13T23:29:18+00:00</updated><id>https://simonwillison.net/2021/Jul/13/rabbitmq-streams-overview/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.rabbitmq.com/posts/2021/07/rabbitmq-streams-overview/"&gt;RabbitMQ Streams Overview&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New in RabbitMQ 3.9: streams are a persisted, replicated append-only log with non-destructive consuming semantics. Sounds like it fits the same hole as Kafka and Redis Streams, an extremely useful pattern.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/message-queues"&gt;message-queues&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rabbitmq"&gt;rabbitmq&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="message-queues"/><category term="rabbitmq"/><category term="redis"/><category term="kafka"/></entry><entry><title>Get Started - Materialize</title><link href="https://simonwillison.net/2020/Jun/1/get-started-materialize/#atom-tag" rel="alternate"/><published>2020-06-01T22:11:49+00:00</published><updated>2020-06-01T22:11:49+00:00</updated><id>https://simonwillison.net/2020/Jun/1/get-started-materialize/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://materialize.io/docs/get-started/"&gt;Get Started - Materialize&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Materialize is a really interesting new database—“a streaming SQL materialized view engine”. It builds materialized views on top of streaming data sources (such as Kafka)—you define the view using a SQL query, then it figures out how to keep that view up-to-date automatically as new data streams in. It speaks the PostgreSQL protocol so you can talk to it using the psql tool or any PostgreSQL client library. The “get started” guide is particularly impressive: it uses a curl stream of the Wikipedia recent changes API, parsed using a regular expression. And it’s written in Rust, so installing it is as easy as downloading and executing a single binary (though I used Homebrew).


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;&lt;/p&gt;



</summary><category term="databases"/><category term="postgresql"/><category term="sql"/><category term="kafka"/><category term="rust"/></entry><entry><title>Introduction to Redis Streams</title><link href="https://simonwillison.net/2018/Oct/18/introduction-to-redis-streams/#atom-tag" rel="alternate"/><published>2018-10-18T08:35:03+00:00</published><updated>2018-10-18T08:35:03+00:00</updated><id>https://simonwillison.net/2018/Oct/18/introduction-to-redis-streams/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://redis.io/topics/streams-intro"&gt;Introduction to Redis Streams&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Redis 5.0 is out, introducing the first new Redis data type in several years: streams, a Kafka-like mechanism for implementing a replayable event stream that can be read by many different subscribers.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="redis"/><category term="kafka"/></entry><entry><title>Mozilla Telemetry: In-depth Data Pipeline</title><link href="https://simonwillison.net/2018/Apr/12/in-depth-data-pipeline-detail/#atom-tag" rel="alternate"/><published>2018-04-12T15:44:42+00:00</published><updated>2018-04-12T15:44:42+00:00</updated><id>https://simonwillison.net/2018/Apr/12/in-depth-data-pipeline-detail/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.telemetry.mozilla.org/concepts/pipeline/data_pipeline_detail.html#a-detailed-look-at-the-data-platform"&gt;Mozilla Telemetry: In-depth Data Pipeline&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Detailed behind-the-scenes look at an extremely sophisticated big data telemetry processing system built using open source tools. Some of this is unsurprising (S3 for storage, Spark and Kafka for streams) but the details are fascinating. They use a custom nginx module for the ingestion endpoint and have a “tee” server written in Lua and OpenResty which lets them route some traffic to alternative backends.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/reid_write/status/984412694336933889"&gt;@reid_write&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/analytics"&gt;analytics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lua"&gt;lua&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mozilla"&gt;mozilla&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/big-data"&gt;big-data&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="analytics"/><category term="lua"/><category term="mozilla"/><category term="nginx"/><category term="big-data"/><category term="kafka"/></entry><entry><title>Notes on Kafka in Python</title><link href="https://simonwillison.net/2018/Jan/13/notes-on-kafka-in-python/#atom-tag" rel="alternate"/><published>2018-01-13T19:40:01+00:00</published><updated>2018-01-13T19:40:01+00:00</updated><id>https://simonwillison.net/2018/Jan/13/notes-on-kafka-in-python/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://matthewrocklin.com/blog/work/2017/10/10/kafka-python"&gt;Notes on Kafka in Python&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Useful review by Matthew Rocklin of the three main open source Python Kafka client libraries as of October 2017.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="python"/><category term="kafka"/></entry><entry><title>Quoting Brandur Leach</title><link href="https://simonwillison.net/2017/Nov/8/redis-streams/#atom-tag" rel="alternate"/><published>2017-11-08T16:23:55+00:00</published><updated>2017-11-08T16:23:55+00:00</updated><id>https://simonwillison.net/2017/Nov/8/redis-streams/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://brandur.org/redis-streams"&gt;&lt;p&gt;Redis streams aren’t exciting for their innovativeness, but rather that they bring building a unified log architecture within reach of a small and/or inexpensive app. Kafka is infamously difficult to configure and get running, and is expensive to operate once you do. [...] Redis on the other hand is probably already in your stack.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://brandur.org/redis-streams"&gt;Brandur Leach&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/brandur-leach"&gt;brandur-leach&lt;/a&gt;&lt;/p&gt;



</summary><category term="redis"/><category term="kafka"/><category term="brandur-leach"/></entry><entry><title>Streams: a new general purpose data structure in Redis</title><link href="https://simonwillison.net/2017/Oct/3/redis-streams/#atom-tag" rel="alternate"/><published>2017-10-03T15:25:41+00:00</published><updated>2017-10-03T15:25:41+00:00</updated><id>https://simonwillison.net/2017/Oct/3/redis-streams/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://antirez.com/news/114"&gt;Streams: a new general purpose data structure in Redis&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Exciting new Redis feature inspired by Kafka: redis streams, which allow you to construct an efficient, in-memory list of messages (similar to a Kafka log) which clients can read sections of or block against and await real-time delivery of new messages. As expected from Salvatore the API design is clean, obvious and covers a wide range of exciting use-cases. Planned for release with Redis 4 by the end of the year!


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/salvatore-sanfilippo"&gt;salvatore-sanfilippo&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="redis"/><category term="salvatore-sanfilippo"/><category term="kafka"/></entry></feed>