
Running nosetests for pyspark

How would one run unit tests with nose for Apache Spark applications written in Python?

With nose one would usually just call the command

nosetests

to run the tests in the tests directory of a Python package. Pyspark scripts, however, need to be run with the spark-submit command instead of the usual Python executable so that the pyspark module can be imported. How would I combine nosetests with pyspark to run tests for my Spark application?

karlson asked Sep 29 '22

1 Answer

If it helps, we use nosetests to test sparkling pandas. We do a bit of magic in our utils file to add pyspark to the path, based on the SPARK_HOME environment variable.
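The "magic" described above can be sketched roughly as follows. This is a minimal illustration of the idea, not the actual sparkling pandas code; the function name add_pyspark_path is hypothetical, and the py4j zip name is matched with a glob because it varies between Spark versions:

```python
import glob
import os
import sys


def add_pyspark_path():
    """Make the pyspark package bundled with a Spark distribution
    importable by prepending it to sys.path, based on the SPARK_HOME
    environment variable."""
    spark_home = os.environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set; point it at your Spark installation")
    # pyspark itself lives under $SPARK_HOME/python
    sys.path.insert(0, os.path.join(spark_home, "python"))
    # py4j (pyspark's JVM bridge) ships as a zip under python/lib;
    # its exact file name depends on the Spark version, so glob for it.
    for zipped in glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip")):
        sys.path.insert(0, zipped)
```

If you call something like this at the top of your tests' package (e.g. in tests/__init__.py) before anything imports pyspark, you can then export SPARK_HOME and run plain nosetests without going through spark-submit.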

Holden answered Oct 02 '22