I googled this problem, but found no direct answer related to spark-2.2.0-bin-hadoop2.7. I am trying to read a text file from a local directory, but I always get a TypeError saying the name
argument is missing. This is the code in a Jupyter notebook with Python 3:
from pyspark import SparkContext as sc
data = sc.textFile("/home/bigdata/test.txt")
When I run the cell, I get this error:
TypeError Traceback (most recent call last)
<ipython-input-7-2a326e5b8f8c> in <module>()
1 from pyspark import SparkContext as sc
----> 2 data = sc.textFile("/home/bigdata/test.txt")
TypeError: textFile() missing 1 required positional argument: 'name'
Your help is appreciated.
You are calling the textFile()
instance method
def textFile(self, name, minPartitions=None, use_unicode=True):
as if it were a static method. As a result, the string "/home/bigdata/test.txt"
is bound to the self
parameter, leaving the name
argument unspecified, hence the error.
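The binding mistake can be reproduced without Spark at all. Here is a minimal sketch using a hypothetical Reader class (not part of pyspark) whose method has the same shape as textFile():

```python
# Hypothetical class mimicking the shape of SparkContext.textFile().
class Reader:
    def textFile(self, name, minPartitions=None):
        return f"reading {name}"

# Correct: call on an instance, so 'self' is bound automatically.
r = Reader()
r.textFile("/tmp/test.txt")  # works: name="/tmp/test.txt"

# Buggy: call on the class itself. The path string is consumed as
# 'self', so 'name' is missing, reproducing the original error:
# TypeError: textFile() missing 1 required positional argument: 'name'
try:
    Reader.textFile("/tmp/test.txt")
except TypeError as e:
    print(e)
```

The same thing happens with `from pyspark import SparkContext as sc`: that aliases the class, not an instance, so `sc.textFile(...)` is an unbound call.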
Create an instance of the SparkContext class instead:
from pyspark import SparkConf
from pyspark.context import SparkContext
sc = SparkContext.getOrCreate(SparkConf().setMaster("local[*]"))
data = sc.textFile("/home/bigdata/test.txt")
For example, with the default configuration:
from pyspark import SparkConf
from pyspark.context import SparkContext
sc = SparkContext.getOrCreate(SparkConf())
data = sc.textFile("my_file.txt")
Display some content with data.collect():
['this is text file and sc is working fine']