I am using Jupyter Notebook and have just started learning Apache Spark, but I get an error while creating a simple RDD:
sc.parallelize([2, 3, 4]).count()
The error is: parallelize() missing 1 required positional argument: 'c'
This happens with every method; for example, if I try textFile(""), I also get an error saying a positional argument is missing. I have the SparkContext as sc. Can someone please help me with this?
You have to initialize a SparkContext first. The error suggests that your sc is bound to the SparkContext class itself rather than to an instance of it, so the list you pass ends up being treated as self and Python reports the actual data argument, c, as missing.
Here is sample code from Learning Spark: Lightning-Fast Big Data Analysis:
from pyspark import SparkConf, SparkContext

# Configure and create a SparkContext instance (note the parentheses)
conf = SparkConf().setMaster("local").setAppName("My App")
sc = SparkContext(conf=conf)
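For context, you can reproduce the original error without Spark at all. This is a minimal sketch using a hypothetical stand-in class (FakeContext, not part of PySpark) whose method has the same shape as parallelize(self, c, numSlices=None); assigning the class instead of an instance produces the same "missing 1 required positional argument" message:

```python
class FakeContext:
    """Hypothetical stand-in for SparkContext, for illustration only."""
    def parallelize(self, c, numSlices=None):
        return list(c)

# Mistake: binding the class itself, not an instance.
sc = FakeContext
try:
    sc.parallelize([2, 3, 4])   # the list is consumed as `self`, so `c` is missing
except TypeError as err:
    print(err)                  # ... missing 1 required positional argument: 'c'

# Correct: instantiate first, then call the method.
sc = FakeContext()
print(sc.parallelize([2, 3, 4]))  # [2, 3, 4]
```

The same logic applies to your real session: make sure sc was created by calling SparkContext(...), not by assigning the class name.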