My last article explained how you can use .NET for Apache Spark together with Entity Framework to stream data to SQL Server. There is one caveat, though: you have to build Microsoft.Spark.Worker yourself.
This time I’ll show you how you can actually build .NET for Apache Spark with VS Code in a browser yourself, including building and running the C# examples.
Setting up your own development environment to build and test .NET for Apache Spark can be tricky and time-consuming. However, as a regular reader, you are probably aware that I like to use Docker to simplify things, and this time is no different.
The dotnet-spark dev image and code-server
To get started, I use the dotnet-spark development image to fire up a related container.
docker run --name dotnet-spark-dev -d -p 127.0.0.1:8888:8080 3rdman/dotnet-spark:dev-latest
The dev image comes with code-server installed, which listens on port 8080 inside the container and is mapped to port 8888 on my host's loopback address. Therefore, if I point my browser to http://localhost:8888, I can start a VS Code session and open the dotnet.spark folder that contains a clone of the .NET for Apache Spark GitHub repository.
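Right after the container starts, code-server may need a few seconds before it answers. The following is a minimal sketch of a readiness check, assuming curl is installed on the host; the function name wait_for_code_server and the retry count are my own choices for illustration, not part of the image.

```shell
# Hypothetical helper: poll the mapped port until code-server answers.
# $1: URL to probe (defaults to the port mapping used above)
# $2: maximum number of attempts, one per second (assumed default: 30)
wait_for_code_server() {
    url="${1:-http://localhost:8888}"
    attempts="${2:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl fail on HTTP errors, -s/-S keep output quiet but show errors
        if curl -fsS -o /dev/null "$url"; then
            echo "code-server is reachable at $url"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "code-server did not respond at $url" >&2
    return 1
}
```

Calling wait_for_code_server without arguments probes http://localhost:8888 and returns a non-zero status if nothing responds in time.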
Building the Core Components
The detailed process of building from source is described in the .NET for Apache Spark GitHub repository.
As a clone of the repository is already available in the container, I just pull the most recent changes and then start by building the Scala Extensions Layer.
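In a code-server terminal, these two steps could look roughly like the sketch below. It assumes the clone lives in a dotnet.spark folder in the home directory (adjust REPO_DIR if your container differs) and that the Scala sources sit under src/scala, as in the upstream repository layout.

```shell
# Assumption: location of the repository clone inside the container.
REPO_DIR="${REPO_DIR:-$HOME/dotnet.spark}"
SCALA_DIR="$REPO_DIR/src/scala"

if [ -d "$SCALA_DIR" ]; then
    # Fetch the most recent upstream changes.
    git -C "$REPO_DIR" pull
    # Build the Scala Extensions Layer (the JVM side) with Maven.
    (cd "$SCALA_DIR" && mvn clean package)
else
    echo "Scala sources not found at $SCALA_DIR" >&2
fi
```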
After that, it’s time to build the Microsoft.Spark.Worker.
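A hedged sketch of the worker build follows. The target framework (netcoreapp3.1) and runtime identifier (linux-x64) are assumptions that depend on the version of the repository you have checked out, so use the values your checkout supports.

```shell
# Assumption: location of the repository clone inside the container.
REPO_DIR="${REPO_DIR:-$HOME/dotnet.spark}"
WORKER_SRC="$REPO_DIR/src/csharp/Microsoft.Spark.Worker"

# Assumptions: framework and runtime vary between repository versions.
TARGET_FRAMEWORK="${TARGET_FRAMEWORK:-netcoreapp3.1}"
RUNTIME_ID="${RUNTIME_ID:-linux-x64}"

if [ -d "$WORKER_SRC" ]; then
    # Publish the worker for the chosen framework and runtime.
    (cd "$WORKER_SRC" && dotnet publish -f "$TARGET_FRAMEWORK" -r "$RUNTIME_ID")
else
    echo "Worker project not found at $WORKER_SRC" >&2
fi
```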
Building and running the C# examples
And finally the C# examples.
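The examples can be published the same way. As before, the repository location, target framework, and runtime identifier in this sketch are assumptions to adapt to your checkout.

```shell
# Assumption: location of the repository clone inside the container.
REPO_DIR="${REPO_DIR:-$HOME/dotnet.spark}"
EXAMPLES_SRC="$REPO_DIR/examples/Microsoft.Spark.CSharp.Examples"

# Assumptions: framework and runtime vary between repository versions.
TARGET_FRAMEWORK="${TARGET_FRAMEWORK:-netcoreapp3.1}"
RUNTIME_ID="${RUNTIME_ID:-linux-x64}"

if [ -d "$EXAMPLES_SRC" ]; then
    # Publish the C# examples next to the worker output under artifacts/bin.
    (cd "$EXAMPLES_SRC" && dotnet publish -f "$TARGET_FRAMEWORK" -r "$RUNTIME_ID")
else
    echo "Examples project not found at $EXAMPLES_SRC" >&2
fi
```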
Before I can run one of the examples, I need to set the DOTNET_WORKER_DIR environment variable.
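The variable has to point at the folder containing the published Microsoft.Spark.Worker binaries. A minimal sketch, assuming a Debug build published for netcoreapp3.1 and linux-x64 into the repository's artifacts folder (all three are assumptions; check where your publish step actually wrote its output):

```shell
# Assumptions: clone location, framework, and runtime of the worker build.
REPO_DIR="${REPO_DIR:-$HOME/dotnet.spark}"
TARGET_FRAMEWORK="${TARGET_FRAMEWORK:-netcoreapp3.1}"
RUNTIME_ID="${RUNTIME_ID:-linux-x64}"

# Point .NET for Apache Spark at the published worker binaries.
export DOTNET_WORKER_DIR="$REPO_DIR/artifacts/bin/Microsoft.Spark.Worker/Debug/$TARGET_FRAMEWORK/$RUNTIME_ID/publish"

# Warn early if the folder is missing, since jobs would otherwise fail later.
if [ ! -d "$DOTNET_WORKER_DIR" ]; then
    echo "warning: $DOTNET_WORKER_DIR does not exist yet" >&2
fi
```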
Once that is done, I am ready to run the Sql.Batch.Basic example.
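Submitting the example could look like the sketch below. The name of the microsoft-spark jar depends on the Spark and .NET for Apache Spark versions, so I leave SPARK_JAR for you to set rather than guess it; the publish folder, the SPARK_HOME variable, and the people.json sample file that ships with Spark are likewise assumptions about the environment.

```shell
# Assumptions: clone location and publish folder of the examples.
REPO_DIR="${REPO_DIR:-$HOME/dotnet.spark}"
EXAMPLES_DIR="$REPO_DIR/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/${TARGET_FRAMEWORK:-netcoreapp3.1}/${RUNTIME_ID:-linux-x64}/publish"

# Set SPARK_JAR to the microsoft-spark jar built earlier that matches your Spark version.
SPARK_JAR="${SPARK_JAR:-}"

# Sample input that ships with the Spark distribution.
PEOPLE_JSON="${SPARK_HOME:-}/examples/src/main/resources/people.json"

if [ -f "$SPARK_JAR" ] && [ -d "$EXAMPLES_DIR" ]; then
    # DotnetRunner starts the .NET process and connects it to the JVM.
    (cd "$EXAMPLES_DIR" && spark-submit \
        --class org.apache.spark.deploy.dotnet.DotnetRunner \
        --master local \
        "$SPARK_JAR" \
        dotnet Microsoft.Spark.CSharp.Examples.dll Sql.Batch.Basic "$PEOPLE_JSON")
else
    echo "Set SPARK_JAR and check that $EXAMPLES_DIR exists" >&2
fi
```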
Getting files out of the container
I think it is fair to say that the dotnet-spark dev image can save you a lot of time if you need to build .NET for Apache Spark yourself.
One remaining question may be how to get your compiled files out of the container.
Docker provides the cp command for that, and its usage is shown below.
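As a small sketch of docker cp, run on the host rather than inside the container: the helper name copy_from_container is hypothetical, and the container name is the one passed to --name when the container was started.

```shell
# Hypothetical helper wrapping docker cp for the dev container.
# $1: path inside the container, $2: destination on the host
copy_from_container() {
    docker cp "dotnet-spark-dev:$1" "$2"
}
```

It is called with the container-side path of the publish folder (for example, the Microsoft.Spark.Worker output under artifacts/bin, wherever the clone lives in your container) and a target folder on the host.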
Thank you very much for reading/watching and have a great time!