Keep only rows that have at least one null

Question

I am trying to do basically the opposite of drop_nulls(). I want to keep all rows that have at least one null.

I want to do something like the following, but without having to list every column by name:

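import sys

import polars as pl

# Report every name that has a null in at least one of the listed columns.
for (name,) in (
    df.filter(
        pl.col("a").is_null()
        | pl.col("b").is_null()
        | pl.col("c").is_null()
    )
    .select("name")
    .unique()
    .rows()
):
    print(
        f"Ignoring `{name}` because it has at least one null",
        file=sys.stderr,
    )
# Then drop every row that contains a null.
df = df.drop_nulls()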

Hericks · Accepted Answer

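It sounds like you are looking for pl.any_horizontal (a top-level function, not a method on pl.Expr). pl.all().is_null() produces one boolean column per input column, and any_horizontal combines them row-wise with a logical OR, so the following keeps every row that contains at least one null value in any column.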

df.filter(pl.any_horizontal(pl.all().is_null()))

Tags:

python

dataframe

python-polars

DJDuque

1 Answer

Hericks