I have 128 cores across 8 nodes, with 186 GB of RAM per node.
I have a DataFrame (Df) that I load from a JDBC source. It has one partition. I then call:
c = Df.repartition(128*3).cache().count()
The application web UI shows the cached RDD as having 384 partitions, but all of them located on one node (let's call it node 1), with a size of 57 MB in RAM.
When I look at the count stages, I see 384 tasks, all executed on node 1.
Why does Spark not distribute the dataframe evenly on all the nodes?
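To confirm where the partitions actually execute, here is a minimal diagnostic sketch I can run (assuming the cached Df from above; socket.gethostname() is evaluated inside each task, so it reports the executor's host, not the driver's):

import socket

# Run one lightweight task per partition and record the executor's
# hostname; countByValue() then gives a partitions-per-host tally.
hosts = (Df.rdd
           .mapPartitions(lambda _: [socket.gethostname()])
           .countByValue())
print(hosts)  # e.g. {'node1': 384} if everything runs on node 1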
I'm running this in PyCharm. Here are the config values I set:
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .master("spark://sparkmaster:7087") \
    .appName(__SPARK_APP_NAME__) \
    .config("spark.executor.memory", "80g") \
    .config("spark.eventLog.enabled", "true") \
    .config("spark.eventLog.dir", r"C:\Temp\Athena\UAT\Logs") \
    .config("spark.cores.max", 128) \
    .config("spark.sql.crossJoin.enabled", "true") \
    .config("spark.executor.extraLibraryPath", "/net/share/grid/bin/spark/UAT/bin/vertica-jdbc-8.0.0-0.jar") \
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
    .getOrCreate()
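For completeness, the load itself looks roughly like this (the URL and table name below are placeholders, not my real values). Since the read passes none of the JDBC partitioning options (partitionColumn, lowerBound, upperBound, numPartitions), the source produces exactly one partition, which matches what I see before the repartition:

Df = (spark.read.format("jdbc")
      .option("url", "jdbc:vertica://verticahost:5433/mydb")  # placeholder URL
      .option("driver", "com.vertica.jdbc.Driver")
      .option("dbtable", "my_table")                          # placeholder table
      .load())  # no partitioning options -> a single partition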
Here are my Spark properties. Here we specify the resource and application details when submitting the application:
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar
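In standalone mode, --total-executor-cores is the command-line counterpart of the spark.cores.max property set in the builder above, and --executor-memory corresponds to spark.executor.memory.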