How do I avoid type errors when an internal function returns a 'Union' that could be 'None'?

I've been running into a bit of weirdness with Unions (and Optionals, of course) in Python: the static type checker seems to test properties against every member of a union rather than against any one member, which feels overly strict. As an example, consider the following:

import pandas as pd

def test_dummy() -> pd.DataFrame:
   df = pd.DataFrame()
   df = df.fillna(df)
   return df

This creates a type warning, as DataFrame.fillna(..., inplace: bool = False, ...) -> Optional[pd.DataFrame] (it returns None if inplace=True). I suspect that in theory the static type checker should be able to tell that the return type depends on the arguments (which are known when the code is written), but that's beside the point.

I have the following questions:

1. What is the best way to resolve this? I can think of two solutions:

i) do nothing, which leaves ugly squiggles in my code

ii) cast the return of fillna to a pd.DataFrame; my understanding is that this is purely informative for the static type checker, so it should not cause any concerns or issues?

2. Suppose I'm writing a function f whose return type, similarly to this, varies depending on the call inputs, in a way that is determinable before runtime. To avoid such errors in the future, what is the best way to write this function? Would it be better to use something like @typing.overload?

The underlying function should really be defined as an overload -- I'd suggest submitting a patch to pandas

Here's what the type looks like right now:

    def fillna(
        self: FrameOrSeries,
        value=None,
        method=None,
        axis=None,
        inplace: bool_t = False,
        limit=None,
        downcast=None,
    ) -> Optional[FrameOrSeries]: ...

in reality, a better way to represent this is to use an @overload -- the function returns None when inplace = True:

    @overload
    def fillna(
        self: FrameOrSeries,
        value=None,
        method=None,
        axis=None,
        *,  # keyword-only here: a default of False would contradict Literal[True]
        inplace: Literal[True],
        limit=None,
        downcast=None,
    ) -> None: ...


    @overload
    def fillna(
        self: FrameOrSeries,
        value=None,
        method=None,
        axis=None,
        inplace: Literal[False] = False,
        limit=None,
        downcast=None,
    ) -> FrameOrSeries: ...


    def fillna(
        self: FrameOrSeries,
        value=None,
        method=None,
        axis=None,
        inplace: bool_t = False,
        limit=None,
        downcast=None,
    ) -> Optional[FrameOrSeries]:
        # actual implementation
        ...
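
For a function of your own, the same pattern might look like the sketch below. sort_values is a hypothetical helper invented for illustration (it is not a pandas API), and making inplace keyword-only is an assumption chosen to keep the overloads simple:

```python
from typing import List, Literal, Optional, overload


@overload
def sort_values(values: List[int], *, inplace: Literal[True]) -> None:
    ...


@overload
def sort_values(values: List[int], *, inplace: Literal[False] = ...) -> List[int]:
    ...


def sort_values(values: List[int], *, inplace: bool = False) -> Optional[List[int]]:
    """Hypothetical helper: sort in place (returns None) or return a sorted copy."""
    if inplace:
        values.sort()
        return None
    return sorted(values)


data = [3, 1, 2]
copy = sort_values(data)                   # checker picks the List[int] overload
nothing = sort_values(data, inplace=True)  # checker picks the None overload
```

Callers that pass a literal True or False (or omit the flag) get a precise return type with no narrowing. A call that passes a plain bool variable matches neither overload, so a third overload taking bool and returning Optional[List[int]] would be needed if that case matters.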

assuming you can't change the underlying library, you have several approaches to narrowing the union. I made a video about this specifically for re.match, but I'll reiterate here since it's basically the same problem (Optional[T])

option 1: an assert indicating the expected return type

the assert tells the type checker something it doesn't know: that the type is narrower than what it inferred. mypy will trust this assertion and treat the value as pd.DataFrame from that point on

def test_dummy() -> pd.DataFrame:
   df = pd.DataFrame()
   ret = df.fillna(df)
   assert ret is not None
   return ret
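
the same narrowing works for the re.match family mentioned above, which makes for a sketch that runs without pandas. first_number is a hypothetical name used only for illustration:

```python
import re


def first_number(text: str) -> str:
    match = re.search(r"\d+", text)  # typed as Optional[re.Match[str]]
    assert match is not None, f"no digits in {text!r}"
    # After the assert, the checker treats match as re.Match[str].
    return match.group(0)
```

if the assumption is ever wrong, the assert fails loudly at that line instead of producing an AttributeError on None further down.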

option 2: cast

explicitly tell the type checker that the type is what you expect, "cast"ing away the None-ness

from typing import cast

def test_dummy() -> pd.DataFrame:
   df = pd.DataFrame()
   ret = cast(pd.DataFrame, df.fillna(df))
   return ret
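
on the question of whether cast is safe: typing.cast does nothing at runtime, it simply returns its argument unchanged, so it cannot break working code but it also cannot catch a wrong assumption. A small sketch, where lookup is a hypothetical function standing in for any Optional-returning call:

```python
from typing import Optional, cast


def lookup(key: str) -> Optional[int]:
    """Hypothetical function with an Optional return."""
    return {"a": 1}.get(key)


value = cast(int, lookup("a"))      # checker treats value as int
missing = cast(int, lookup("zzz"))  # checker also believes this is an int

print(value)    # 1
print(missing)  # None: the cast performed no check at all
```

so the cast is only as trustworthy as the reasoning behind it; the assert in option 1 at least verifies the claim while the program runs.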

option 3: type: ignore

the (imo) hacky solution is to tell the type checker to ignore the incompatibility. I would not suggest this approach, but it can be helpful as a quick fix

def test_dummy() -> pd.DataFrame:
   df = pd.DataFrame()
   ret = df.fillna(df)
   return ret  # type: ignore
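
if you do reach for an ignore, mypy lets you limit it to a single error code so that unrelated errors on the same line still surface. A sketch using re.match rather than pandas; first_word is a hypothetical name, and return-value is the code mypy reports for an incompatible return type:

```python
import re


def first_word(text: str) -> re.Match:
    # Only the incompatible-return error is suppressed on this line.
    return re.match(r"\w+", text)  # type: ignore[return-value]
```

running mypy with its warn_unused_ignores option should then flag the comment once it is no longer needed, for example after the library's annotations improve.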

The pandas.DataFrame.fillna method is defined as returning either DataFrame or None.

If there is a possibility that a function will return None, then this should be documented by using an Optional type hint. It would be wrong to try to hide the fact a function could return None by using a cast or a comment to ignore the warning such as:

return df  # type: ignore

If a function could return `None`, use `Optional`

import numpy as np
import pandas as pd
from typing import Optional


def test_dummy() -> Optional[pd.DataFrame]:
    df = pd.DataFrame([np.nan, 2, np.nan, 0])
    df = df.fillna(value=0)
    return df
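
Once a function is annotated as returning Optional, each caller has to deal with the None case before using the value. A minimal sketch of that caller-side check, using hypothetical functions find_port and next_port and a made-up default of 8000:

```python
from typing import Dict, Optional


def find_port(config: Dict[str, int], name: str) -> Optional[int]:
    """Hypothetical lookup that may legitimately return None."""
    return config.get(name)


def next_port(config: Dict[str, int], name: str) -> int:
    port = find_port(config, name)
    if port is None:
        return 8000  # hypothetical default when nothing is configured
    # Inside this branch the checker has narrowed port to int.
    return port + 1
```

The `is None` check narrows the type in the same way the assertion in Option 1 does, but it handles the None case instead of treating it as impossible.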

If a function is guaranteed not to return `None`, there are these options

If you can guarantee that a function will not return None, but it cannot be statically inferred by a type checker, then there are three options.

Option 1: Use an assertion to indicate that DataFrame is not None

This is the approach recommended by the mypy documentation.

def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame([np.nan, 2, np.nan, 0])
    df = df.fillna(value=0)
    assert df is not None
    return df
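
One caveat with this option: assert statements are skipped when Python runs with the -O flag, so the runtime check disappears in optimized mode. If the check should always run, an explicit test that raises works the same way for the type checker. A sketch with a hypothetical helper named require:

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def require(value: Optional[T], message: str = "unexpected None") -> T:
    """Hypothetical helper: return value unchanged, or raise if it is None."""
    if value is None:
        raise ValueError(message)
    return value
```

With this helper the example above could be written as `df = require(df.fillna(value=0))`, and the checker would infer pd.DataFrame for the result.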

Option 2: Use a cast

from typing import cast

def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame([np.nan, 2, np.nan, 0])
    df = cast(pd.DataFrame, df.fillna(value=0))
    return df

Option 3: Tell mypy to ignore the warning (not recommended)

def test_dummy() -> pd.DataFrame:
    df = pd.DataFrame([np.nan, 2, np.nan, 0])
    df = df.fillna(value=0)
    return df  # type: ignore

Tags: python, pandas, type-hinting, python-typing

deetsb

Answers: Anthony Sottile (the overload / assert / cast / type: ignore answer), Christopher Peisert (the Optional answer)
