Build .NET for Apache Spark with VS Code in a browser

My last article explained how to use .NET for Apache Spark together with Entity Framework to stream data to SQL Server. There is one caveat, though: you have to build Microsoft.Spark.Worker yourself.
This time I’ll show you how to build .NET for Apache Spark with VS Code in a browser, including building and running the C# examples.

Setting up your own development environment to build and test .NET for Apache Spark can be tricky and time-consuming. However, as a regular reader, you are probably aware that I like to use Docker to simplify things. This time is no different.

The dotnet-spark dev image and code-server

To get started, I use the dotnet-spark development image to fire up a related container.

docker run --name dotnet-spark-dev -d -p 127.0.0.1:8888:8080 3rdman/dotnet-spark:dev-latest

The dev image comes with code-server installed, which listens on port 8080 inside the container and is mapped to port 8888 on my host’s loopback address. Therefore, if I point my browser to http://localhost:8888, I can start a VS Code session and open the dotnet.spark folder that contains a clone of the .NET for Apache Spark GitHub repository.

Figure: The dotnet-spark development container running VS Code in a browser
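
Before opening the browser, it can help to confirm from the host shell that the port mapping works. The following is only a minimal sketch: the helper names show_port_mapping and check_code_server are made up for illustration, and it assumes curl is installed on the host.

```shell
# Container name and ports as used in the docker run command above.
CONTAINER="dotnet-spark-dev"
CONTAINER_PORT=8080
HOST_PORT=8888
CODE_SERVER_URL="http://localhost:${HOST_PORT}"

# Hypothetical helper: ask Docker where the internal port is published on the host.
show_port_mapping() {
    docker port "$CONTAINER" "$CONTAINER_PORT"
}

# Hypothetical helper: print the HTTP status code that code-server answers with.
check_code_server() {
    curl --silent --output /dev/null --max-time 5 \
        --write-out '%{http_code}\n' "$CODE_SERVER_URL"
}
```

Calling show_port_mapping should report the mapping to 127.0.0.1:8888, and check_code_server prints the status code of the response from code-server.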

Building the core components

The detailed build-from-source process is described on the .NET for Apache Spark GitHub page.

Since the container already includes a clone of the repository, I just pull the most recent changes and then start by building the Scala Extensions Layer.

Figure: Building the Scala extension layer
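
In a VS Code terminal, the pull-and-build step could look roughly like the sketch below. The src/scala folder and the Maven goals follow the project's build instructions; the clone location ~/dotnet.spark and the helper name build_scala_layer are assumptions for illustration.

```shell
# Assumption: the repository clone lives in ~/dotnet.spark inside the container.
REPO_DIR="$HOME/dotnet.spark"

# Hypothetical helper: update the clone, then build the Scala layer with Maven.
build_scala_layer() {
    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$REPO_DIR" &&
        git pull &&
        cd src/scala &&
        mvn clean package
    )
}
```

Running build_scala_layer stops at the first failing step, so a failed git pull never triggers a Maven build on stale sources.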

After that, it’s time to build the Microsoft.Spark.Worker.

Figure: Building Microsoft.Spark.Worker
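
A sketch of the worker build on the command line might look like this. The project folder follows the repository layout, but the target framework (netcoreapp3.1), the runtime identifier (linux-x64), the clone location, and the helper name build_worker are assumptions; check the project file for the frameworks your checkout actually supports.

```shell
# Assumptions: clone location, target framework, and runtime identifier.
REPO_DIR="$HOME/dotnet.spark"
TARGET_FRAMEWORK="netcoreapp3.1"
RUNTIME_ID="linux-x64"

# Hypothetical helper: publish Microsoft.Spark.Worker for the chosen framework and runtime.
build_worker() {
    (
        cd "$REPO_DIR/src/csharp/Microsoft.Spark.Worker" &&
        dotnet publish -f "$TARGET_FRAMEWORK" -r "$RUNTIME_ID"
    )
}
```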

Building and running the C# examples

Finally, I build the C# examples.

Figure: Building the C# examples
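
The examples can be published the same way as the worker. This sketch also prints where the output is expected to land; the artifacts/bin path, the Debug configuration, the framework, and the helper name build_examples are assumptions that may differ between repository versions.

```shell
# Assumptions: clone location, target framework, runtime identifier, output layout.
REPO_DIR="$HOME/dotnet.spark"
TARGET_FRAMEWORK="netcoreapp3.1"
RUNTIME_ID="linux-x64"
EXAMPLES_PROJECT="$REPO_DIR/examples/Microsoft.Spark.CSharp.Examples"
EXAMPLES_PUBLISH_DIR="$REPO_DIR/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/$TARGET_FRAMEWORK/$RUNTIME_ID/publish"

# Hypothetical helper: publish the C# examples and report the expected output folder.
build_examples() {
    (
        cd "$EXAMPLES_PROJECT" &&
        dotnet publish -f "$TARGET_FRAMEWORK" -r "$RUNTIME_ID"
    ) && echo "Published to: $EXAMPLES_PUBLISH_DIR"
}
```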

Before I can run one of the examples, I need to set the DOTNET_WORKER_DIR environment variable so that it points to the directory containing the Microsoft.Spark.Worker build output.

Figure: Setting the DOTNET_WORKER_DIR environment variable
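
As a minimal sketch, the variable can be exported in the terminal session that later runs the example. The publish path below is an assumption based on a Debug build for netcoreapp3.1 and linux-x64 in a clone at ~/dotnet.spark; adjust it to wherever your worker build ended up.

```shell
# Assumptions: clone location, target framework, runtime identifier, Debug configuration.
REPO_DIR="$HOME/dotnet.spark"
TARGET_FRAMEWORK="netcoreapp3.1"
RUNTIME_ID="linux-x64"

# Point DOTNET_WORKER_DIR at the folder holding the published worker.
export DOTNET_WORKER_DIR="$REPO_DIR/artifacts/bin/Microsoft.Spark.Worker/Debug/$TARGET_FRAMEWORK/$RUNTIME_ID/publish"

# Warn early if the folder is missing, for example because the worker was not built yet.
if [ -d "$DOTNET_WORKER_DIR" ]; then
    echo "DOTNET_WORKER_DIR=$DOTNET_WORKER_DIR"
else
    echo "warning: $DOTNET_WORKER_DIR does not exist yet" >&2
fi
```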

Once that is done, I am ready to run the Sql.Batch.Basic example.

Figure: Running an example
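
Running an example goes through spark-submit with the DotnetRunner class from the Scala layer. The sketch below wraps that call in a hypothetical run_example helper; the examples folder is an assumption, and the path to the microsoft-spark jar is passed in as an argument because its file name depends on the Spark and project versions you built.

```shell
# Assumptions: clone location, target framework, runtime identifier, output layout.
REPO_DIR="$HOME/dotnet.spark"
TARGET_FRAMEWORK="netcoreapp3.1"
RUNTIME_ID="linux-x64"
EXAMPLES_DIR="$REPO_DIR/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/$TARGET_FRAMEWORK/$RUNTIME_ID/publish"

# Hypothetical helper: run_example <microsoft-spark jar> <example name> [example args...]
run_example() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_example <microsoft-spark-jar> <example> [args...]" >&2
        return 1
    fi
    spark_jar="$1"
    shift
    (
        cd "$EXAMPLES_DIR" &&
        spark-submit \
            --class org.apache.spark.deploy.dotnet.DotnetRunner \
            --master local \
            "$spark_jar" \
            dotnet Microsoft.Spark.CSharp.Examples.dll "$@"
    )
}
```

For Sql.Batch.Basic, the call would be run_example followed by the jar path, the example name, and the example's input; the project's instructions use the people.json sample that ships with Spark under $SPARK_HOME/examples/src/main/resources.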

Getting files out of the container

I think it is fair to say that the dotnet-spark dev image can save you a lot of time if you need to build .NET for Apache Spark yourself.

One remaining question may be how to get your compiled files out of the container.

Docker provides the cp command for that; its usage is shown below.

Figure: Getting files out of the Docker container
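
In text form, docker cp takes a source in the form container:path and a destination on the host. A minimal sketch, with a hypothetical copy_from_container helper wrapped around it:

```shell
# Container name as used in the docker run command earlier.
CONTAINER="dotnet-spark-dev"

# Hypothetical helper: copy_from_container <path inside container> <destination on host>
copy_from_container() {
    docker cp "$CONTAINER:$1" "$2"
}
```

The first argument is the path inside the container, for example the publish folder of the worker, and the second is where the copy should land on the host. Both depend on your setup, so no concrete paths are filled in here.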

Thank you very much for reading/watching and have a great time!
