Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Polars: How to filter using 'in' and 'not in' like in SQL

How can I achieve the equivalents of SQL's IN and NOT IN?

I have a list with the required values. Here's the scenario:

import pandas as pd
import polars as pl
exclude_fruit = ["apple", "orange"]

df = pl.DataFrame(
    {
        "A": [1, 2, 3, 4, 5, 6],
        "fruits": ["banana", "banana", "apple", "apple", "banana", "orange"],
        "B": [5, 4, 3, 2, 1, 6],
        "cars": ["beetle", "audi", "beetle", "beetle", "beetle", "frog"],
        "optional": [28, 300, None, 2, -30, 949],
    }
)
df.filter(~pl.select("fruits").str.contains(exclude_fruit))
df.filter(~pl.select("fruits").to_pandas().isin(exclude_fruit))
df.filter(~pl.select("fruits").isin(exclude_fruit))
like image 998
Daycent Avatar asked Sep 13 '25 05:09

Daycent


2 Answers

You were close.

df.filter(~pl.col('fruits').is_in(exclude_fruit))
shape: (3, 5)
┌─────┬────────┬─────┬────────┬──────────┐
│ A   ┆ fruits ┆ B   ┆ cars   ┆ optional │
│ --- ┆ ---    ┆ --- ┆ ---    ┆ ---      │
│ i64 ┆ str    ┆ i64 ┆ str    ┆ i64      │
╞═════╪════════╪═════╪════════╪══════════╡
│ 1   ┆ banana ┆ 5   ┆ beetle ┆ 28       │
│ 2   ┆ banana ┆ 4   ┆ audi   ┆ 300      │
│ 5   ┆ banana ┆ 1   ┆ beetle ┆ -30      │
└─────┴────────┴─────┴────────┴──────────┘

Alternatively, you can also add .not_()

df.filter(pl.col('fruits').is_in(exclude_fruit).not_())

which is handy for node.js, where the tilde ~ doesn't work:

df.filter(pl.col('fruits').isIn(exclude_fruit).not())
like image 26
sezanzeb Avatar answered Sep 14 '25 17:09

sezanzeb



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!