
mypy type checking shows error when a variable gets dynamically allocated

I have a class that takes a Spark DataFrame and does some processing on it. Here is the code:

    for column in self.sdf.columns:
        if column not in self.__columns:
            row = [column]
            row += '--' * 9
            column_table.append(row)

I have this code inside the constructor of my class:

    self.sdf: Optional[SparkDataFrame] = None

Here sdf is set dynamically while my class executes, and only then does the for loop shown above run. __columns is a dictionary that is supposed to contain all the columns of sdf. The code ran without errors, but when I type-checked it with mypy, it reported an error on the first line of the for loop:

    error: Item "None" of "Optional[Any]" has no attribute "columns"

I understand that sdf is initially None. But should I consider this a serious error? Are there any workarounds for this?

ahrooran asked Jun 10 '20 22:06


1 Answer

Yes, columns is specific to a data frame. You can find more info here. Because self.sdf can be None, you get the error you posted. Alternatively, you can guard the loop like this:

    from pyspark.sql import DataFrame

    # Only touch .columns once self.sdf is known to be a real DataFrame.
    if self.sdf is not None and isinstance(self.sdf, DataFrame):
        for column in self.sdf.columns:
            if column not in self.__columns:
                row = [column]
                row += '--' * 9
                column_table.append(row)
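
The same guard can be tried without a Spark installation. The sketch below is only an illustration: FakeFrame, Processor and missing_columns are hypothetical names, and FakeFrame stands in for the Spark DataFrame so that the file type-checks on its own. Padding each row with a list of nine '--' cells is also an assumption about the intended table layout. The idea is to bind the Optional attribute to a local variable and rule out None before using it, which should let mypy narrow the type and stop reporting the attribute error.

```python
from typing import List, Optional


class FakeFrame:
    """Hypothetical stand-in for a Spark DataFrame; only `columns` is modelled."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = columns


class Processor:
    def __init__(self) -> None:
        # Declared Optional because the frame is attached later.
        self.sdf: Optional[FakeFrame] = None
        self._columns = {"id": "int"}  # known columns (illustrative data)

    def missing_columns(self) -> List[List[str]]:
        sdf = self.sdf  # bind to a local so the None check applies to one value
        if sdf is None:
            raise ValueError("sdf has not been set yet")
        # Past this point sdf is a FakeFrame, not Optional[FakeFrame].
        table: List[List[str]] = []
        for column in sdf.columns:
            if column not in self._columns:
                # Assumed layout: the column name followed by nine '--' cells.
                table.append([column] + ["--"] * 9)
        return table
```

Raising when the frame is missing, rather than silently skipping the loop, is a design choice: it turns the "not set yet" case into an explicit failure instead of an empty result. If skipping is preferred, an early return of an empty list narrows the type in the same way.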
kites answered Sep 27 '22 23:09