My last article explained how a .NET for Apache Spark project can be debugged in Visual Studio 2019 under Windows. I also mentioned a limitation at the end of that article. In this article I will extend the project a bit and demonstrate that limitation using version 0.8.0 of my docker image for .NET for Apache Spark. Furthermore, I will show a possible workaround that can be used if you are running Docker and Visual Studio Code under Linux (Ubuntu 18.04).
The latest version of my docker image for .NET for Apache Spark tries to support direct debugging from Visual Studio 2019 and Visual Studio Code. This is the first article of a small series that will show how this can be done in different environments (Windows and Linux) and what limitations might exist.
Test application & data
I have put together a very simple C# application, named “HelloUdf”, for demonstration purposes. It is supposed to read a JSON file (coordinates.json) that contains one coordinate string per line. Besides reading the file, the application’s task is …
Do you want to test and debug your .NET for Apache Spark application with Visual Studio, but don’t want to set up Apache Spark yourself? Then read on and find out how my docker image might be able to help.
Before we dig into the details, however, I specifically want to thank Devin Martin for sharing his idea for such a docker image with me!
As mentioned in the post related to ActiveMQ, Spark and Bahir, Spark Structured Streaming does not provide a JDBC sink out of the box. Therefore, I will have to use the foreach sink and implement a subclass of org.apache.spark.sql.ForeachWriter. It will take each individual data row and write it to PostgreSQL.
Even though I want to use PostgreSQL, I am actually