Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Uncertainties in Pandas

Tags:

How to handle easily uncertainties on Series or DataFrame in Pandas (Python Data Analysis Library) ? I recently discovered the Python uncertainties package but I am wondering if there is any simpler way to manage uncertainties directly within Pandas. I didn't find anything about this in the documentation.

To be more precise, I don't want to store the uncertainties as a new column in my DataFrame because I think they are part of a data series and shouldn't be logically separated from it. For example, it doesn't make any sense deleting a column in a DataFrame but not its uncertainties, so I have to handle this case by hand.

I was looking for something like data_frame.uncertainties which could work like the data_frame.values attribute. A data_frame.units (for data units) would be great too but I think those things don't exist in Pandas (yet?)...

like image 976
Falken Avatar asked Feb 10 '14 16:02

Falken


1 Answers

If you really want it to be a built in function you can just create a class to put your dataframe in. Then you can define whatever values or functions that you want. Below I wrote a quick example but you could easily add a units definition or a more complicated uncertainty formula

import pandas as pd

data={'target_column':[100,105,110]}

class data_analysis():
    def __init__(self, data, percentage_uncertainty):
    self.df = pd.DataFrame(data)
    self.uncertainty = percentage_uncertainty*self.df['target_column'].values

When I run

example=data_analysis(data,.01)
example.uncertainty

I get out array([1. , 1.05, 1.1 ])

Hope this helps

like image 130
Novice Avatar answered Sep 26 '22 06:09

Novice