stanislavkozlovski.bsky.social
github.com/tansu-io/tansu

This is something to keep an eye on. Should be doable to add within a pg extension and bundle Kafka into Postgres?
November 11, 2025 at 6:52 PM
the data bible dropping soon
At long last, @chris.blue and I have submitted the final manuscript of Designing Data-Intensive Applications, second edition, to the publisher. There is always more that could be improved but at some point we just have to call it done. Now it goes into production; probably shipping in ~4 months.
November 8, 2025 at 2:29 PM
Just use Postgres until it breaks 🧘‍♂️
November 2, 2025 at 10:09 AM
This was posted 15 years ago

(part 1/2)
October 26, 2025 at 6:57 PM
**taps the sign**
September 29, 2025 at 9:07 PM
Reposted
when I say “storage is cheaper now” this is what I mean

topicpartition.io/definitions/...
Small Data
Small data appears to be a very exciting movement that is moving the Overton window away from Big Data onto much simpler and cheaper solutions ...
topicpartition.io
September 28, 2025 at 1:03 PM
Reposted
“KIP-1150 introduces Diskless Kafka topics that write directly to S3 instead of replicating between brokers.”

“Even using the expensive S3 Express (which a week ago lowered its prices by more than 50%) still saves 73% compared to traditional Apache Kafka.”

/ht @ananthdurai.bsky.social
KIP-1150 in Apache Kafka is a big deal (Diskless Topics)
TL;DR KIP-1150 introduces Diskless Kafka topics that write directly to S3 instead of replicating between brokers. It literally reduces costs by 97% (from $1.8M to $20K annually for a 1GiB/s cluster) a...
topicpartition.io
April 21, 2025 at 3:14 AM
a new 2 minute streaming post is sitting patiently in your inbox...

open it to learn when:
• Kafka decides what messages are visible to Consumers
• acks=all Producers receive responses
July 14, 2025 at 2:51 PM
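A rough sketch of the idea behind both bullets above, assuming the standard high-watermark model: the high watermark is the smallest log end offset among the in-sync replicas, Consumers only see records below it, and an acks=all produce request is answered once the high watermark has moved past the last record of the batch. The function and parameter names here are made up for illustration and are not Kafka's actual code.

```python
def high_watermark(isr_log_end_offsets: dict[int, int]) -> int:
    """Smallest log end offset (next offset to be written) across in-sync replicas.

    Keys are hypothetical broker ids, values are each replica's log end offset.
    """
    return min(isr_log_end_offsets.values())


def visible_to_consumers(offset: int, hw: int) -> bool:
    """A record is readable by Consumers only once every in-sync replica has it."""
    return offset < hw


def acks_all_can_respond(last_batch_offset: int, hw: int) -> bool:
    """An acks=all response can go out once the whole batch sits below the high watermark."""
    return last_batch_offset < hw


# Leader (broker 1) has written offsets 0..9, followers lag slightly behind.
isr = {1: 10, 2: 8, 3: 9}
hw = high_watermark(isr)
print(hw, visible_to_consumers(7, hw), visible_to_consumers(8, hw))
```

In this toy state the slowest in-sync replica sets the pace: offsets 8 and 9 exist on the leader but stay hidden, and a producer waiting on offset 9 with acks=all keeps waiting until broker 2 catches up or drops out of the in-sync set.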
Apache Kafka has been on a Diskless craze in the last two years:

• 2023: WarpStream launched
• 2024: Confluent bought them for $220M+
• 2025: Aiven published a KIP to the open source project to introduce the same type of leaderless, direct-to-S3 topics
May 18, 2025 at 2:21 PM
Yesterday, Cloudflare dropped a bomb that I believe may change the future of Lakehouse storage.

R2 + Iceberg should become the de-facto choice for hybrid and multi-cloud data lakehouse architectures.

Here's why it may break the cloud monopoly 🧵
April 11, 2025 at 1:44 PM
+1 to this future
- Postgres isn't going away, it's become a standard for databases.
- Iceberg is the table format that changes things and turns immutable columnar files into a living database
- S3 (or S3 compatible) is the future of storage layers
April 8, 2025 at 9:12 PM
This is the most impactful Apache Kafka mentor you’ve never heard of:

Chia-Ping Tsai.

In just 18 months, he bootstrapped a large Taiwanese open source community boasting:
• 5000 participants
• 15,000 Slack messages/month
• 10 meetings/week
• 20+ Apache committers
February 21, 2025 at 4:56 PM
why doesn't Kafka use Protobuf or another popular serialization format? why is everything custom?

afaict the decision to go custom was taken back when it was first created. I just assume we never questioned it again?

ever since, it's been a one-way street, since upgrading clients would be a pain
February 18, 2025 at 9:11 PM
What if I told you that a 1 GiB/s Kafka topic streamed directly into your S3 data lakehouse as an Iceberg table could cost you... $10/hr?

Bufstream does it. It's literally too good to be true.

I spent 20 hours researching them. Here's their story (2 minute read) 🧵
February 14, 2025 at 3:20 PM
Confluent runs tens of thousands of Kafka clusters across 93 regions in 3 clouds - AWS, GCP, and Azure.

How do they manage all of that?

Here are 13 innovative modifications they made to run Kafka smoothly at that scale. 👇
February 3, 2025 at 3:44 PM
Nobody does data infrastructure like Uber does.

Here are their numbers 👇
January 31, 2025 at 3:39 PM
a new edition of 2 minute streaming is waiting patiently in your inbox

probably the simplest and fastest explanation of AWS networking costs out there on the internet

(plus a gift announced at the end) 🎁

blog.2minutestreaming.com
January 17, 2025 at 3:04 PM
how do you read this?

you pay $0.02 when you cross AZ.

do you pay an extra $0.02 if it's going through a public IPv4?

or does it imply you pay $0 when going cross-AZ through a private IP?

ipv6: do you pay an extra charge when going cross-VPC? or is it $0 cross-AZ same-VPC? 🤷‍♂️🤷‍♂️
January 16, 2025 at 3:35 PM
Genius is making complex ideas simple, not making simple ideas complex.

Over 70% of Fortune 500 companies have used Apache Kafka.

At its core, it’s just a distributed commit log.

A log (a.k.a. {write-ahead, commit, transaction} log) is a simple but efficient data structure:
January 12, 2025 at 3:30 PM
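The log described in the post above can be sketched in a few lines. This is a minimal in-memory illustration, not how Kafka stores data: the class name `CommitLog` and its methods are hypothetical, and a real log would persist records to disk in segments.

```python
class CommitLog:
    """Append-only log: records are only added at the end and never modified."""

    def __init__(self) -> None:
        self._records: list[bytes] = []

    def append(self, record: bytes) -> int:
        """Store the record at the tail and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10) -> list[bytes]:
        """Return up to max_records records starting at offset, in write order."""
        if offset < 0 or offset > len(self._records):
            raise IndexError(f"offset {offset} out of range")
        return self._records[offset:offset + max_records]

    @property
    def end_offset(self) -> int:
        """Offset that the next appended record will receive."""
        return len(self._records)


log = CommitLog()
first = log.append(b"order-created")
second = log.append(b"order-paid")
print(first, second, log.read(0))
```

Writes only ever touch the tail and reads are sequential scans from an offset, which is what makes the structure both simple and fast. Each reader just remembers the offset it has reached.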
The largest performance improvements often come the easiest.

Here’s an Apache Kafka config tweak to increase your performance by 50% 🔥

(a 1-minute 🧵)
January 8, 2025 at 4:34 PM
man I can't get it to post here without errors, so you'll have to go on the bird app if you want to see it.

But I broke down the WarpStream story in tweet format:

x.com/BdKozlovski/...

I also explained very explicitly what my position is regarding the situation. So before you judge - 👀
December 14, 2024 at 1:37 PM
The spiciest post I’ve published yet is out now. 🌶️

"The Brutal Truth About Kafka Cost Calculators"

I also announced a new product that I’ve been heads down coding for the last two months.

Interested? 👀

bigdata.2minutestreaming.com/p/the-bruta...
December 13, 2024 at 1:35 PM
I'm exposing something about the Kafka industry tomorrow.

I usually don't hype these up but tomorrow's newsletter edition will be the most impactful one I've released yet.

It will be spicy. It will change how you think. You don't wanna miss it. 🌶️
December 12, 2024 at 2:57 PM
Every data engineer is talking about S3’s new features from this re:Invent.

But can you remember all the others?

Here is a small cheatsheet with AWS S3’s top 12 features to help you keep up👇
December 6, 2024 at 3:32 PM
Looking at the API, it wouldn't be that hard to extend Kafka's Tiered Storage plugin to write to S3 Tables in an Iceberg format directly.

The only question would be - where do you get the topic schema from?

Which makes me question... why doesn't Kafka have first-class schema support?
December 6, 2024 at 1:08 PM