Scala
from InfoQ
1 day ago
QCon London 2026: Introducing Tansu.io -- Rethinking Kafka for Lean Operations
Tansu is an open-source, stateless messaging broker that replaces Kafka's complex architecture with a simpler, durable storage model.
I wrote a book for O'Reilly on scaling machine learning with Spark specifically. My second book, the second edition on how to improve high-performance Spark, is coming out. I started my career in the machine learning space 15 years ago, moved into data infrastructure and batch processing, and a year and a half ago I moved into the data streaming space, which I think is what's going to help us pave the future of data.
Confluent connects data sources and cleans up data. It built its service on Apache Kafka, an open-source distributed event streaming platform, sparing its customers the hassle of buying and managing their own server clusters in return for a monthly fee per cluster, plus additional fees for data stored and data moved in or out. IBM expects the deal, which it valued at $11 billion, to close by the middle of next year.
This engine takes topic data schemas, metadata, and test rules as inputs to create a set of FlinkSQL-based test definitions. A Flink job then executes these tests, consuming messages from production Kafka topics and forwarding any errors to Grab's observability platform. FlinkSQL was selected because its ability to represent stream data as dynamic tables allowed the team to automatically generate data filters for rules that could be efficiently implemented.
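As a rough illustration of the idea, the sketch below turns a few declarative rules into FlinkSQL-style filter queries that select the rows violating each rule. It is not Grab's engine: the `Rule` shape, the `build_test_sql` helper, and the table and field names are all hypothetical, and the rule text is assumed to come from a trusted source since it is interpolated directly into the SQL string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical test rule: a field plus a SQL predicate that valid rows satisfy."""
    name: str
    field: str
    predicate: str  # e.g. "IS NOT NULL" or ">= 0"


def build_test_sql(table: str, rules: list[Rule]) -> dict[str, str]:
    """Map each rule name to a query that returns the rows violating that rule.

    `(...) IS NOT TRUE` also catches rows where the predicate evaluates to NULL,
    which a plain `NOT (...)` filter would silently skip.
    """
    if not rules:
        raise ValueError("at least one rule is required")
    queries = {}
    for rule in rules:
        queries[rule.name] = (
            f"SELECT * FROM `{table}` "
            f"WHERE (`{rule.field}` {rule.predicate}) IS NOT TRUE"
        )
    return queries


if __name__ == "__main__":
    rules = [
        Rule("order_id_present", "order_id", "IS NOT NULL"),
        Rule("amount_non_negative", "amount", ">= 0"),
    ]
    for name, sql in build_test_sql("orders", rules).items():
        print(f"-- {name}\n{sql}\n")
```

Because the stream is treated as a dynamic table, each rule reduces to a single `WHERE` clause, which is what makes this kind of generation mechanical.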
The main feature of the 3.0 release is a new decoupled architecture, which addresses a key limitation of earlier versions. Previously, Mimir's ingester component handled both reading and writing, so heavy query loads could hurt ingestion performance. The new design adds Apache Kafka as an asynchronous buffer between ingestion and query tasks. This allows each path to scale on its own and removes the cross-path dependencies that previously affected system stability.
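The decoupling can be pictured with a small in-memory sketch. It is not Mimir's code and does not use a Kafka client: `PartitionLog` is a hypothetical stand-in for a single Kafka partition, and `WritePath` and `ReadPath` are illustrative names. The point is only that the writer appends and returns immediately, while the reader consumes at its own pace from its own offset.

```python
class PartitionLog:
    """In-memory stand-in for one Kafka partition: an append-only list of records."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset, max_records):
        """Return up to max_records records starting at offset."""
        return self._records[offset:offset + max_records]


class WritePath:
    """Ingestion side: only appends to the log, never serves queries."""

    def __init__(self, log):
        self._log = log

    def ingest(self, metric, value):
        return self._log.append((metric, value))


class ReadPath:
    """Query side: consumes the log at its own pace and answers from its own store."""

    def __init__(self, log):
        self._log = log
        self._offset = 0   # next offset to consume
        self._store = []   # samples already consumed

    def catch_up(self, max_records=100):
        """Consume the next batch from the log; return how many records were read."""
        batch = self._log.read_from(self._offset, max_records)
        self._store.extend(batch)
        self._offset += len(batch)
        return len(batch)

    def query(self, metric):
        return [value for (name, value) in self._store if name == metric]


if __name__ == "__main__":
    log = PartitionLog()
    writer, reader = WritePath(log), ReadPath(log)
    writer.ingest("cpu", 0.5)
    writer.ingest("mem", 0.7)
    writer.ingest("cpu", 0.9)
    print(reader.query("cpu"))  # nothing consumed yet
    reader.catch_up()
    print(reader.query("cpu"))  # now reflects the ingested samples
```

The trade-off this makes visible is that queries see data only after the read side has consumed it, so the buffer buys isolation between the two paths at the cost of some lag.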