.NET for Apache Spark – MQTT Streaming

.NET for Apache Spark 0.4.0 was released recently. Therefore, it is now time to test, if it can be used for MQTT Streaming as well.

If you followed my series about a real-time data processing pipeline, you probably remember that I have used Apache Bahir to retrieve streaming data from Apache ActiveMQ via the MQTT protocol. The data itself was generated by IanniX and forwarded to ActiveMQ utilizing my osc2activemq docker image.

Preparing .NET for Apache Spark for MQTT Streaming

For the most parts, you can follow this quick intro tutorial, which walks … more

Real-time data processing pipeline – Part 4 – data transformation

So far, we know how to get our streaming data from ActiveMQ into Spark by using Bahir. On this basis, it is now time to implement the data transformation required to get to the desired output format.

Extract

As a quick reminder, here is the Scala code that I have used so far to retrieve the data from ActiveMQ and write it to a memory sink.

%spark

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

// create a named session
val spark = SparkSession
    .builder()
    .appName("ReadMQTTStream")
    .getOrCreate()

// read data from the OscStream topic
val mqttDf = 
more

Real-time data processing pipeline – Part 2 – OSC to ActiveMQ

Real-time data processing pipeline – Part 2 – OSC to ActiveMQ

Welcome back to the second part of my series, showcasing a real-time data processing pipeline!
In part 1, I explored visual real-time sensor data simulation, as the entry point into our pipeline.
Now it’s time to find out, how we can get the generated data into Apache ActiveMQ, by transferring it via the OSC protocol.

Apache ActiveMQ™ is the most popular open source, multi-protocol, Java-based messaging server. It supports a variety of Cross Language Clients and Protocols, and therefore makes it an excellent choice for our pipeline.

Get ActiveMQ up and running

I won’t … more

Scroll to top