What’s a data-driven company if it doesn’t take an opportunity to drive its content with data? Instead of filling our January newsletter with things we think you should know, we turned back the clock to 2021 and looked at what other people like you (data engineers, data scientists and developers) wanted to know.
The data revealed our most popular tutorials, explainer blog posts, research reports and how-to guides. It told us that the data engineering community is ravenous to learn how to streamline data. They’ve realized that dumping raw data into a warehouse to clean up and query later makes everyone’s job harder than it needs to be.
The problem is that stream processing is complicated. Until now, only an elite handful of tech juggernauts had harnessed its power, because it took an army of engineers to do it.
We think the future of data is stream processing for everyone. Soon, you’ll see businesses of every size processing data in memory, on a message broker like Kafka, to extract its value before it gets sunk into a lake or buried under ever-increasing volumes of data in a warehouse.
Based on our reader data, you think so, too. Many of you searched us out for our tutorials on doing real-time stream processing with Kafka and Python. Many also looked for our research and head-to-head comparison on how well Python-based stream processing client libraries perform.
So, without further ado, here’s our digest of the tutorials, explainers and research reports from 2021 that our users found most helpful.
Kafka + Python = %$#&?!
Python gets the most love from data scientists and other data-friendly developers, but when it comes to Kafka, Python gets the cold shoulder. Here’s how they work together.
Why is streaming data so hard to handle?
And why aren’t these difficulties already solved? Our CTO explains in not-too-technical terms why stream processing has been out of reach for most organizations — until now.
Which Python library is best for stream processing?
When you build a product, you do research. Exhaustive competitive research. Our CEO offers a deep-deep-dive on Spark vs. Flink vs. Quix performance.
Everything you wanted to know about Kafka but were afraid to ask
Kafka isn’t just tricky technology — it’s tricky terminology. Dig into what makes Kafka different, when you should (and shouldn’t) use it, and how it works.
- According to McKinsey, the key challenges for customer-centric, data-driven growth are an organization’s ability to access customer data and the need to build engines that decide how to process it.
- Levi Strauss and John Deere are among the “old” companies pioneering stream processing. How? They’re empowering employees to take advantage of stream processing and ML.
- You got stream processing to work. Now how do you get it to scale? Insights from Alibaba, Twitter and more.
- Read more of our most popular articles from 2021, complete with key takeaways.