The Quix platform enables users to build applications on real-time data at any scale. Getting data into Quix via the library connectors, a software development kit (SDK) or APIs is the first step in the process. The tool you use depends on what kind of sources you’re using.
This blog post explains how to use the Quix APIs to connect hundreds of devices to Quix. It’s an edited version of a conversation between our CEO Michael Rosam and Head of Software Patrick Mira Pedrol on a recent Stream Session.
What’s a workspace and how does it work with external data?
Michael Rosam: I’m Mike, CEO and co-founder of Quix. I’m here again with Head of Software Patrick Mira Pedrol — we had a lot of fun on our last Stream Session. Let’s see what sort of trouble we can get up to today.
Patrick Mira Pedrol: I did a very long introduction last time. Maybe I’ll skip it this time.
Read about Patrick and find the last conversation between Mike and Patrick in “To stream, or not to stream?”
Mike: Last time, we spoke about Quix’s backstory. This time, you and I are going to get under the hood and talk a little bit about what happens inside Quix with the APIs.
Patrick: Yes, exactly. I’ll start by giving an overview. Quix has a concept called a workspace, which is like an infrastructure bundle containing everything you deploy for your real-time data processing pipeline. Topics carry data between deployments, connecting each step of a pipeline. (When we say topics, we mean Kafka topics; Kafka is our main message broker.)
Data flows from left to right in this diagram.
Mike: Deployment is where Quix processes data, and topics are how data flows in and out of processing.
Patrick: Exactly. And all of this movement happens inside the workspace.
Why use APIs and how do they interact with Kafka?
Mike: But what happens when you want to gather information or to get some sort of data from outside Quix?
Patrick: The APIs let you bring data into this pipeline. The same happens on the right side of the process when you want to deliver data out of your Quix workspace.
Mike: I notice on the left-hand side is a streaming writer API, which is called a producer API in the pub/sub world.
Patrick: Yes. And outside the workspace, on the other side of the streaming writer API, are lots of devices. These devices write data to go inside the workspace via the streaming writer API.
The API can accept data from a very large number of devices, which is the main reason for creating APIs. Quix is built on Kafka, which isn’t designed to handle a lot of connections. You cannot connect thousands or millions of devices to Kafka topics. That’s not because Kafka is bad technology but because it’s designed to communicate messages from one microservice to another. Kafka can handle a lot of data per second, but it’s not good at handling a lot of connections.
The streaming writer lets you connect devices at scale and then converts that data into the Quix SDK format and sends it into a topic. Specifically, this API converts HTTP or WebSockets to SDK format. From that point, the pipeline is working with the SDK format.
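To make the HTTP path concrete, here is a minimal sketch of how a device might package a chunk of time series data as a JSON POST. The base URL, token, URL path, and payload field names are all placeholders invented for illustration, not the documented streaming writer API; check the API reference for your workspace for the real ones. The request is built but not sent, so the sketch runs offline.

```python
import json
import urllib.request

# Hypothetical values: replace with the endpoint and token for your own workspace.
BASE_URL = "https://writer.example.invalid"
TOKEN = "YOUR_TOKEN"


def build_write_request(topic, stream_id, timestamps_ns, speed_values):
    """Build (but do not send) an HTTP POST carrying one chunk of time series data."""
    # The payload shape and URL path are assumptions made for this example.
    payload = {
        "timestamps": timestamps_ns,
        "numericValues": {"speed": speed_values},
    }
    url = f"{BASE_URL}/topics/{topic}/streams/{stream_id}/parameters/data"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {TOKEN}",
        },
        method="POST",
    )


request = build_write_request(
    "cars", "car-1", [1_000_000_000, 2_000_000_000], [101.5, 102.0]
)
# urllib.request.urlopen(request) would send it; left out so the sketch runs offline.
```

Because this is plain HTTP and JSON, the same request could be produced from almost any language or device, which is the point Patrick makes about the API.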
Mike: What kinds of data can you write to the streaming writer API?
Patrick: You can write any time series data, including events such as JSON events, irregular events, or high-frequency regular data.
Mike: We have some developers sending binary data, and others are sending audio files and video files. Are they using this service as well?
Patrick: Yes. Our SDK supports numbers, strings, or binary data so you can send any of them with our APIs as well.
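As a rough illustration of handling those three kinds of values on the client side, the sketch below groups one sample into numeric, string, and binary fields before serializing it. The function name, the bucket names, and the choice of base64 for binary data are assumptions made for this example; they are not taken from the Quix SDK or APIs.

```python
import base64
import json


def group_values_by_type(sample):
    """Split one sample into numeric, string, and binary fields."""
    grouped = {"numeric": {}, "string": {}, "binary": {}}
    for name, value in sample.items():
        if isinstance(value, (bytes, bytearray)):
            # JSON has no byte type, so binary is carried as base64 text here.
            grouped["binary"][name] = base64.b64encode(value).decode("ascii")
        elif isinstance(value, str):
            grouped["string"][name] = value
        elif isinstance(value, (int, float)):
            grouped["numeric"][name] = value
        else:
            raise TypeError(
                f"unsupported value type for {name!r}: {type(value).__name__}"
            )
    return grouped


sample = {"rpm": 9000, "driver": "car-1", "frame": b"\x00\x01"}
grouped = group_values_by_type(sample)
body = json.dumps(grouped)  # every field is now JSON-serializable
```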
It’s worth mentioning that when you create a workspace in Quix, we create APIs for each workspace. Every API is independent of the other. If you have, for instance, a developer workspace and a production workspace, you can be sure that your development workspace is not going to be affected in performance by your production one.
Mike: When would you use the streaming writer API rather than something like the Ably connector, which connects Quix to a tool solely focused on WebSockets?
Patrick: If you need to have very low latency when working with event-based data gathered from around the world, I’d use WebSockets. You could use the streaming writer API with WebSockets or set up the Ably connector. The Ably connector writes data with the SDK format using WebSockets, which are just wormholes. You can shove a lot of data into a WebSocket very quickly because it’s constantly consuming data in real time without waiting for a request. And WebSockets are easy to set up with our Ably connector.
But if your data is something like telemetry data (meaning time-series-based, very high frequency, and located in one spot in the world), I would use the streaming writer API with HTTP, a very old technology that almost any language and any device can use to send data. It’s also great for one-time requests, such as getting historical data.
How do you get data out of a workspace and automate processes with APIs?
Mike: Very cool. Data comes in, gets processed, goes into an app that might involve machine learning models and, in the end, provides some results.
Patrick: Yes. Engineers normally use Quix when they need to process data and get the results in real time.
Mike: Formula 1 engineers, for example, look at live data all day long. But I think the actual value of Quix is automation. You can drive another system using some real-time automation inside Quix.
Mike: Can you draw a special arrow from the streaming reader to those mobile phones for me?
Patrick: Yeah, of course.
Mike: This is where I see ample opportunity. Developers are trying to figure out how to send data from many devices to the cloud.
For example, when I book a car on a rideshare app, I can get a price that’s dependent on how many other people are booking and how many cars are available. You can get that result back to the client device very, very quickly with Quix. This, I think, is really where the next generation of data is happening: automating systems for an improved user experience.
Patrick: I think we have done some projects related to this. We can have a live data flow from the client. You can process the data in milliseconds. You can get that data out and do something with it, such as automating some system.
Mike: So these two APIs, the streaming writer API and the streaming reader API, give you an excellent roundtrip potential. You can send data from the client to the cloud, get it processed and moved on.
What’s the role of storage in stream processing?
Mike: The big fat elephant in the room for a streaming company is the question of storage. What is this “persisted” storage? Tell me a little bit about that.
Patrick: Quix has a switch in the UI that allows you to “persist” data in a topic to be queried later. We also have a data explorer to investigate persisted data using the query API that is here, below, in the green.
Quix provides this query ability out of the box; you can copy and paste the code to your application directly. For the same data that you are visualizing in our platform, you can press a button and have the code snippet to use in your Jupyter Notebook or another application.
The Quix portal UI uses the query API, but you can use it in your own implementations. You can create your own applications and query historical data from this workspace. That’s what you see in the diagram.
Mike: We also have an API not on this diagram. It’s called the telemetry writer API, but it’s not exposed. There’s magic in the telemetry writer. You can write data into the storage and then query it later. That gives you the ability to close the gap between streaming and batch processing.
Patrick: The API is an abstraction layer on top of this data storage. It allows Quix to store data. Actually, Quix uses three databases that are changeable based on needs and independent of the API. If we change databases, the API remains the same for users. In the future, you’ll be able to select your own database or your own Kubernetes cluster or your own Kafka.
What can I do with the portal API?
Mike: Then on the top of the diagram sits the portal API. What’s this?
Patrick: This is not an API that we deploy in each workspace, but I added it here because it’s something available across all workspaces. This is basically the API we use as our backend. Users can use it as well. It allows you to create topics, deployments, and standards of deployment, or to persist data. The portal API allows you to operate everything related to the infrastructure that you see with the arrows and the boxes in the diagram.
Mike: All in all, Quix has a complete suite of APIs around the core product. It lets you use any sources and destinations.
Patrick: Yeah, definitely. If you need real-time processing capabilities and you don’t want to deal with Kubernetes, Kafka, or all these protocols to send and receive data, Quix can help.
How do APIs and the SDK work together? When to use APIs and when to use the SDK?
Patrick: But sometimes the APIs aren’t the right choice. Although APIs can connect to a multitude of devices, they introduce a bit of latency — not much, but more than working with Kafka directly. Kafka introduces about 10 milliseconds of latency, which is very, very low. This works when you want to connect a few microservices.
Imagine connecting two cars in Formula 1. In this case, with only two devices, you want to send data directly with the SDK. If you use the SDK directly, you will get the best throughput and latency. Language is the only limitation, as the SDK is only available in Python and C#.
I’ve added the SDK and several larger devices to this next diagram.
The washing machine and computer could be any connected device; they’re simply placeholders here to demonstrate that it’s possible to integrate data from various sources based on what that specific source needs.
Patrick: APIs introduce some latency because they send chunks of values based on custom specifications. Small chunks decrease latency; large chunks increase latency but decrease the likelihood of jams. If you’re creating a game, for instance, you probably want to send small packages because you don’t want to see the game jumping around on the screen. Making decisions like this requires users to analyze system requirements.
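The chunk-size trade-off can be put into numbers with a deliberately simplified model: the first sample in a chunk has to wait until the last one arrives before anything is sent. The function names and the 100 Hz sample rate below are illustrative assumptions, and network and processing time are ignored.

```python
def buffering_delay_ms(chunk_size, sample_rate_hz):
    """Time the first sample of a chunk waits for the chunk to fill, in ms."""
    return (chunk_size - 1) * 1000.0 / sample_rate_hz


def requests_per_second(chunk_size, sample_rate_hz):
    """How many chunks (requests) per second the device has to send."""
    return sample_rate_hz / chunk_size


SAMPLE_RATE_HZ = 100  # assumed sensor rate for this example

small = (buffering_delay_ms(10, SAMPLE_RATE_HZ), requests_per_second(10, SAMPLE_RATE_HZ))
large = (buffering_delay_ms(100, SAMPLE_RATE_HZ), requests_per_second(100, SAMPLE_RATE_HZ))

print(f"chunk of 10:  {small[0]:.0f} ms added delay, {small[1]:.0f} requests/s")
print(f"chunk of 100: {large[0]:.0f} ms added delay, {large[1]:.0f} requests/s")
```

Under these assumptions, chunks of 10 samples add 90 ms of buffering delay at 10 requests per second, while chunks of 100 samples add 990 ms at a single request per second. A game would lean toward the first setting; a bulk telemetry upload could tolerate the second.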
The SDK also allows you to create a buffer if you want to send data every 100 milliseconds, every millisecond, or every 10 timestamps. Learn more about the SDK and data integrations.
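The idea of buffering by time span or by number of timestamps can be sketched in a few lines. This is not the SDK’s buffer; the class, its parameters, and the callback are hypothetical names chosen to show the mechanism.

```python
class SampleBuffer:
    """Collect samples and hand them over in chunks.

    A chunk is released when it holds `max_samples` samples or when it
    spans at least `max_span_ms` milliseconds, whichever comes first.
    """

    def __init__(self, on_flush, max_samples=None, max_span_ms=None):
        self.on_flush = on_flush
        self.max_samples = max_samples
        self.max_span_ms = max_span_ms
        self._samples = []

    def add(self, timestamp_ms, value):
        self._samples.append((timestamp_ms, value))
        first_ts = self._samples[0][0]
        full = self.max_samples is not None and len(self._samples) >= self.max_samples
        spanned = (
            self.max_span_ms is not None
            and timestamp_ms - first_ts >= self.max_span_ms
        )
        if full or spanned:
            self.flush()

    def flush(self):
        """Release whatever is buffered, for example at the end of a stream."""
        if self._samples:
            chunk, self._samples = self._samples, []
            self.on_flush(chunk)


# Flush every 10 timestamps.
by_count = []
count_buffer = SampleBuffer(by_count.append, max_samples=10)
for t in range(25):
    count_buffer.add(t, t * 2)
count_buffer.flush()  # release the 5 leftover samples

# Flush once the buffered samples span 100 milliseconds.
by_span = []
span_buffer = SampleBuffer(by_span.append, max_span_ms=100)
for t in (0, 50, 100):
    span_buffer.add(t, "tick")
```

In the first case `by_count` ends up holding chunks of 10, 10, and 5 samples; in the second, `by_span` holds a single chunk of three samples released when the third one arrives.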
Mike: At the end of the day, there’s no one-size-fits-all. That’s what the APIs and the SDK are all about. They let users integrate various tools and devices and give engineers the flexibility they need to build advanced data apps.