
Compute standard deviation for polars dataframe rows for set of columns

I would like to calculate the row-wise standard deviation across the columns 'foo' and 'bar' of a Polars dataframe.

I am able to find the min, max and mean, but not the std.

import polars as pl

df = pl.DataFrame(
    {
        "foo": [1, 2, 3],
        "bar": [6, 7, 8],
        "ham": ["a", "b", "c"],
    }
)

# there are _horizontal functions for sum, min, max

df = df.with_columns(
    pl.sum_horizontal('foo','bar')
      .round(2)
      .alias('sum')
)

However, there is no std_horizontal function:

df = df.with_columns(
    pl.std_horizontal('foo','bar')
      .round(2)
      .alias('std')
)

# AttributeError: module 'polars' has no attribute 'std_horizontal'

Is there a better way to compute the standard deviation in this scenario?

Rakesh Chaudhary asked Sep 15 '25 09:09


1 Answer

Until a dedicated std_horizontal is added, another way to get a "row" or "horizontal" context is to use the List API:

df.with_columns(
    sum=pl.concat_list("foo", "bar").list.sum(),
    std=pl.concat_list("foo", "bar").list.std(),
)
shape: (3, 5)
┌─────┬─────┬─────┬─────┬──────────┐
│ foo ┆ bar ┆ ham ┆ sum ┆ std      │
│ --- ┆ --- ┆ --- ┆ --- ┆ ---      │
│ i64 ┆ i64 ┆ str ┆ i64 ┆ f64      │
╞═════╪═════╪═════╪═════╪══════════╡
│ 1   ┆ 6   ┆ a   ┆ 7   ┆ 3.535534 │
│ 2   ┆ 7   ┆ b   ┆ 9   ┆ 3.535534 │
│ 3   ┆ 8   ┆ c   ┆ 11  ┆ 3.535534 │
└─────┴─────┴─────┴─────┴──────────┘
jqurious answered Sep 17 '25 20:09