When I try to execute this line in pyspark:

    arquivo = sc.textFile("dataset_analise_sentimento.csv")
I got the following error message:
    Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
    : org.apache.spark.SparkException: Job aborted due to stage failure:
    Task 0 in stage 0.0 failed 1 times, most recent failure:
    Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver):
    org.apache.spark.SparkException: Python worker failed to connect back.
I have tried the following steps:

- Setting sc = spark.sparkContext, a possible solution I found in another question here on Stack Overflow (sketched below); it didn't work for me.
- Changing PYSPARK_DRIVER_PYTHON from jupyter to ipython, as suggested in this link, with no success.

None of the steps above worked for me and I can't find a solution.
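For reference, this is roughly what the first attempt looked like. A minimal sketch: the count() call is my addition to force a job to run, since textFile alone is lazy and the failure only surfaces once a job executes.

    # `spark` is the SparkSession the pyspark shell creates at startup.
    sc = spark.sparkContext  # the suggested fix from the linked question
    arquivo = sc.textFile("dataset_analise_sentimento.csv")
    arquivo.count()  # the job still fails with the same Py4JJavaError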
I'm currently using the following versions:
Python 3.7.3, Java JDK 11.0.6, Windows 10, Apache Spark 2.3.4.
I just configured the following environment variables and now it's working normally:

    HADOOP_HOME = C:\Hadoop
    JAVA_HOME = C:\Java\jdk-11.0.6
    PYSPARK_DRIVER_PYTHON = jupyter
    PYSPARK_DRIVER_PYTHON_OPTS = notebook
    PYSPARK_PYTHON = python

I'm currently using the following versions: Python 3.7.3, Java JDK 11.0.6, Windows 10, Apache Spark 2.4.3, with Jupyter Notebook and pyspark.
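If you prefer not to set the Windows environment variables by hand, the same values can be applied from Python before the SparkContext is launched, since the JVM it spawns inherits the process environment. This is only a sketch under that assumption, not part of the fix above; the paths are the ones listed there, the app name is arbitrary, and PYSPARK_DRIVER_PYTHON / PYSPARK_DRIVER_PYTHON_OPTS are omitted because they only affect the pyspark launcher script:

    import os

    # Must be set before the SparkContext starts, so the launched JVM
    # and the Python workers inherit them.
    os.environ["HADOOP_HOME"] = r"C:\Hadoop"
    os.environ["JAVA_HOME"] = r"C:\Java\jdk-11.0.6"
    os.environ["PYSPARK_PYTHON"] = "python"  # interpreter the workers run

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[*]")
             .appName("analise_sentimento")  # arbitrary app name
             .getOrCreate())
    sc = spark.sparkContext

    arquivo = sc.textFile("dataset_analise_sentimento.csv")
    # count() runs a job; this is the step that previously failed with
    # "Python worker failed to connect back".
    print(arquivo.count())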