<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: kafka</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/kafka.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2024-07-31T17:34:54+00:00</updated><author><name>Simon Willison</name></author><entry><title>Build your own SQS or Kafka with Postgres</title><link href="https://simonwillison.net/2024/Jul/31/sqs-or-kafka-with-postgres/#atom-tag" rel="alternate"/><published>2024-07-31T17:34:54+00:00</published><updated>2024-07-31T17:34:54+00:00</updated><id>https://simonwillison.net/2024/Jul/31/sqs-or-kafka-with-postgres/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.sequinstream.com/build-your-own-sqs-or-kafka-with-postgres/"&gt;Build your own SQS or Kafka with Postgres&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Anthony Accomazzo works on &lt;a href="https://github.com/sequinstream/sequin"&gt;Sequin&lt;/a&gt;, an open source "message stream" (similar to Kafka) written in Elixir and Go on top of PostgreSQL.&lt;/p&gt;
&lt;p&gt;This detailed article describes how you can implement message queue patterns on PostgreSQL from scratch, including this neat example using a CTE, &lt;code&gt;returning&lt;/code&gt; and &lt;code&gt;for update skip locked&lt;/code&gt; to retrieve &lt;code&gt;$1&lt;/code&gt; messages from the &lt;code&gt;messages&lt;/code&gt; table and simultaneously mark them with &lt;code&gt;not_visible_until&lt;/code&gt; set to &lt;code&gt;$2&lt;/code&gt; in order to "lock" them for processing by a client:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;with available_messages as (
  select seq
  from messages
  where not_visible_until is null
    or (not_visible_until &amp;lt;= now())
  order by inserted_at
  limit $1
  for update skip locked
)
update messages m
set
  not_visible_until = $2,
  deliver_count = deliver_count + 1,
  last_delivered_at = now(),
  updated_at = now()
from available_messages am
where m.seq = am.seq
returning m.seq, m.data;
&lt;/code&gt;&lt;/pre&gt;
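&lt;p&gt;A consumer wrapping that query might look like this minimal Python sketch. The &lt;code&gt;receive_messages&lt;/code&gt; helper, the parameter names and the visibility-timeout arithmetic are illustrative assumptions, not code from the article; it should work with any DB-API connection to Postgres, such as one from psycopg2:&lt;/p&gt;

```python
# Hypothetical consumer for the queue pattern above: claims a batch of
# messages and hides them from other consumers for a visibility timeout.
# (The "or now() >= not_visible_until" form is equivalent to the
# "not_visible_until <= now()" comparison in the article's query.)
RECEIVE_SQL = """
with available_messages as (
  select seq
  from messages
  where not_visible_until is null
    or now() >= not_visible_until
  order by inserted_at
  limit %(batch)s
  for update skip locked
)
update messages m
set
  not_visible_until = now() + %(timeout)s * interval '1 second',
  deliver_count = deliver_count + 1,
  last_delivered_at = now(),
  updated_at = now()
from available_messages am
where m.seq = am.seq
returning m.seq, m.data;
"""

def receive_messages(conn, batch=10, visibility_timeout=30):
    """Claim up to `batch` messages; hide them for `visibility_timeout` seconds."""
    with conn.cursor() as cur:
        cur.execute(RECEIVE_SQL, {"batch": batch, "timeout": visibility_timeout})
        rows = cur.fetchall()
    # Committing releases the row locks; redelivery is now guarded only by
    # not_visible_until, so a crashed consumer's messages become visible again
    # once the timeout expires.
    conn.commit()
    return rows
```

&lt;p&gt;In this pattern a successful consumer would then delete (or otherwise acknowledge) the processed rows; if it crashes instead, the visibility timeout expires and the messages are delivered again, matching SQS-style at-least-once semantics.&lt;/p&gt;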

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/ap6qvh/build_your_own_sqs_kafka_with_postgres"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/message-queues"&gt;message-queues&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqs"&gt;sqs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="message-queues"/><category term="postgresql"/><category term="sql"/><category term="sqs"/><category term="kafka"/></entry><entry><title>RabbitMQ Streams Overview</title><link href="https://simonwillison.net/2021/Jul/13/rabbitmq-streams-overview/#atom-tag" rel="alternate"/><published>2021-07-13T23:29:18+00:00</published><updated>2021-07-13T23:29:18+00:00</updated><id>https://simonwillison.net/2021/Jul/13/rabbitmq-streams-overview/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.rabbitmq.com/posts/2021/07/rabbitmq-streams-overview/"&gt;RabbitMQ Streams Overview&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New in RabbitMQ 3.9: streams are a persisted, replicated append-only log with non-destructive consuming semantics. Sounds like it fits the same hole as Kafka and Redis Streams, an extremely useful pattern.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/message-queues"&gt;message-queues&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rabbitmq"&gt;rabbitmq&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="message-queues"/><category term="rabbitmq"/><category term="redis"/><category term="kafka"/></entry><entry><title>Get Started - Materialize</title><link href="https://simonwillison.net/2020/Jun/1/get-started-materialize/#atom-tag" rel="alternate"/><published>2020-06-01T22:11:49+00:00</published><updated>2020-06-01T22:11:49+00:00</updated><id>https://simonwillison.net/2020/Jun/1/get-started-materialize/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://materialize.io/docs/get-started/"&gt;Get Started - Materialize&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Materialize is a really interesting new database—“a streaming SQL materialized view engine”. It builds materialized views on top of streaming data sources (such as Kafka)—you define the view using a SQL query, then it figures out how to keep that view up-to-date automatically as new data streams in. It speaks the PostgreSQL protocol so you can talk to it using the psql tool or any PostgreSQL client library. The “get started” guide is particularly impressive: it uses a curl stream of the Wikipedia recent changes API, parsed using a regular expression. And it’s written in Rust, so installing it is as easy as downloading and executing a single binary (though I used Homebrew).


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;&lt;/p&gt;



</summary><category term="databases"/><category term="postgresql"/><category term="sql"/><category term="kafka"/><category term="rust"/></entry><entry><title>Introduction to Redis Streams</title><link href="https://simonwillison.net/2018/Oct/18/introduction-to-redis-streams/#atom-tag" rel="alternate"/><published>2018-10-18T08:35:03+00:00</published><updated>2018-10-18T08:35:03+00:00</updated><id>https://simonwillison.net/2018/Oct/18/introduction-to-redis-streams/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://redis.io/topics/streams-intro"&gt;Introduction to Redis Streams&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Redis 5.0 is out, introducing the first new Redis data type in several years: streams, a Kafka-like mechanism for implementing a replayable event stream that can be read by many different subscribers.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="redis"/><category term="kafka"/></entry><entry><title>Mozilla Telemetry: In-depth Data Pipeline</title><link href="https://simonwillison.net/2018/Apr/12/in-depth-data-pipeline-detail/#atom-tag" rel="alternate"/><published>2018-04-12T15:44:42+00:00</published><updated>2018-04-12T15:44:42+00:00</updated><id>https://simonwillison.net/2018/Apr/12/in-depth-data-pipeline-detail/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.telemetry.mozilla.org/concepts/pipeline/data_pipeline_detail.html#a-detailed-look-at-the-data-platform"&gt;Mozilla Telemetry: In-depth Data Pipeline&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Detailed behind-the-scenes look at an extremely sophisticated big data telemetry processing system built using open source tools. Some of this is unsurprising (S3 for storage, Spark and Kafka for streams) but the details are fascinating. They use a custom nginx module for the ingestion endpoint and have a “tee” server written in Lua and OpenResty which lets them route some traffic to alternative backends.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/reid_write/status/984412694336933889"&gt;@reid_write&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/analytics"&gt;analytics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lua"&gt;lua&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mozilla"&gt;mozilla&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/big-data"&gt;big-data&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="analytics"/><category term="lua"/><category term="mozilla"/><category term="nginx"/><category term="big-data"/><category term="kafka"/></entry><entry><title>Notes on Kafka in Python</title><link href="https://simonwillison.net/2018/Jan/13/notes-on-kafka-in-python/#atom-tag" rel="alternate"/><published>2018-01-13T19:40:01+00:00</published><updated>2018-01-13T19:40:01+00:00</updated><id>https://simonwillison.net/2018/Jan/13/notes-on-kafka-in-python/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://matthewrocklin.com/blog/work/2017/10/10/kafka-python"&gt;Notes on Kafka in Python&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Useful review by Matthew Rocklin of the three main open source Python Kafka client libraries as of October 2017.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="python"/><category term="kafka"/></entry><entry><title>Quoting Brandur Leach</title><link href="https://simonwillison.net/2017/Nov/8/redis-streams/#atom-tag" rel="alternate"/><published>2017-11-08T16:23:55+00:00</published><updated>2017-11-08T16:23:55+00:00</updated><id>https://simonwillison.net/2017/Nov/8/redis-streams/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://brandur.org/redis-streams"&gt;&lt;p&gt;Redis streams aren’t exciting for their innovativeness, but rather that they bring building a unified log architecture within reach of a small and/or inexpensive app. Kafka is infamously difficult to configure and get running, and is expensive to operate once you do. [...] Redis on the other hand is probably already in your stack.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://brandur.org/redis-streams"&gt;Brandur Leach&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/brandur-leach"&gt;brandur-leach&lt;/a&gt;&lt;/p&gt;



</summary><category term="redis"/><category term="kafka"/><category term="brandur-leach"/></entry><entry><title>Streams: a new general purpose data structure in Redis</title><link href="https://simonwillison.net/2017/Oct/3/redis-streams/#atom-tag" rel="alternate"/><published>2017-10-03T15:25:41+00:00</published><updated>2017-10-03T15:25:41+00:00</updated><id>https://simonwillison.net/2017/Oct/3/redis-streams/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://antirez.com/news/114"&gt;Streams: a new general purpose data structure in Redis&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Exciting new Redis feature inspired by Kafka: redis streams, which allow you to construct an efficient, in-memory list of messages (similar to a Kafka log) which clients can read sections of or block against and await real-time delivery of new messages. As expected from Salvatore the API design is clean, obvious and covers a wide range of exciting use-cases. Planned for release with Redis 4 by the end of the year!


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/salvatore-sanfilippo"&gt;salvatore-sanfilippo&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kafka"&gt;kafka&lt;/a&gt;&lt;/p&gt;



</summary><category term="redis"/><category term="salvatore-sanfilippo"/><category term="kafka"/></entry></feed>