.NET for Apache Spark 0.4.0 Docker image

.NET for Apache Spark 0.4.0 has been released.
If you want to test it out, you might find my Docker image useful.
Details are available at https://hub.docker.com/r/3rdman/dotnet-spark

UPDATE: A newer image is available.

Quick reference

The image is based on Ubuntu 18.04, Apache Spark 2.4.3 with Hadoop 2.7, .NET Core 2.1.801 and .NET for Apache Spark 0.4.0. It is intended for testing .NET for Apache Spark without having to install the required components manually.
By default, the container starts one Spark master instance and two worker (slave) instances. You can change the number of worker instances by setting the environment variable SPARK_WORKER_INSTANCES in your docker run command, as shown in the example below.

docker run -d --name dotnet-spark -e SPARK_WORKER_INSTANCES=1 -p 8080:8080 -p 8081:8081 3rdman/dotnet-spark:0.4.0-linux

Once the container is running, open an interactive terminal in it to play around.

docker exec -it dotnet-spark /bin/bash

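By default, the Spark master Web UI listens on port 8080 and the worker Web UIs start at port 8081. The port number increases by one for each additional worker instance, so with SPARK_WORKER_INSTANCES set to 2 the workers use ports 8081 and 8082. Each worker port you want to reach from the host needs its own -p mapping in the docker run command.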
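To make the port numbering concrete, here is a minimal sketch that generates one -p mapping per worker Web UI. The function name worker_port_flags is hypothetical and not part of the image. It assumes the worker UI ports start at 8081 and increase by one per instance, as described above. The docker run command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the image): print one "-p PORT:PORT" flag
# per worker Web UI.
# Assumption: worker UI ports start at 8081 and increase by one per instance.
worker_port_flags() {
  count="$1"
  flags=""
  i=0
  while [ "$i" -lt "$count" ]; do
    port=$((8081 + i))
    flags="$flags -p $port:$port"
    i=$((i + 1))
  done
  # Strip the leading space before printing.
  printf '%s\n' "${flags# }"
}

# Print (do not run) the docker run command for three workers.
WORKERS=3
echo "docker run -d --name dotnet-spark -e SPARK_WORKER_INSTANCES=$WORKERS -p 8080:8080 $(worker_port_flags "$WORKERS") 3rdman/dotnet-spark:0.4.0-linux"
```

Printing the command first lets you check the mappings before starting the container.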

Getting started

The HelloSpark example from https://github.com/dotnet/spark/blob/master/docs/getting-started/ubuntu-instructions.md is available in the image under /dotnet/HelloSpark.

For details, see the instructions at the URL above or the README.txt file in the /dotnet/HelloSpark folder.

To run the example against the Spark master and its workers inside the container, use the following command in the interactive terminal:

spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master spark://$HOSTNAME:$SPARK_MASTER_PORT microsoft-spark-2.4.x-0.4.0.jar dotnet HelloSpark.dll
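If you want to see what $HOSTNAME and $SPARK_MASTER_PORT expand to before submitting, a small dry-run helper can print the full command. This is only a sketch: the function name submit_cmd is hypothetical, and the fallback to port 7077 (Spark's standard master port) is an assumption in case the variable is unset; the image may set a different value.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the spark-submit command for HelloSpark.
# Assumption: fall back to "localhost" and port 7077 (Spark's standard master
# port) when HOSTNAME or SPARK_MASTER_PORT are unset.
submit_cmd() {
  host="${1:-${HOSTNAME:-localhost}}"
  port="${2:-${SPARK_MASTER_PORT:-7077}}"
  printf '%s\n' "spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master spark://$host:$port microsoft-spark-2.4.x-0.4.0.jar dotnet HelloSpark.dll"
}

# Show the command with the values from the current environment.
submit_cmd
```

Passing a host and port explicitly, for example submit_cmd myhost 7077, overrides the environment values.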

Spark’s log files are located in /spark/logs.

Enjoy playing around!
