How can I easily handle uncertainties on a Series or DataFrame in Pandas (Python Data Analysis Library)? I recently discovered the Python uncertainties package, but I am wondering whether there is a simpler way to manage uncertainties directly within Pandas. I didn't find anything about this in the documentation.
To be more precise, I don't want to store the uncertainties as a new column in my DataFrame, because I think they are part of a data series and shouldn't be logically separated from it. For example, it makes no sense to delete a column in a DataFrame but keep its uncertainties, so I would have to handle this case by hand.
I was looking for something like data_frame.uncertainties, which could work like the data_frame.values attribute. A data_frame.units attribute (for data units) would be great too, but I think those things don't exist in Pandas (yet?)...
If you really want it to behave like a built-in attribute, you can create a class that wraps your DataFrame and then define whatever attributes or methods you want on it. Below is a quick example; you could easily add a units definition or a more complicated uncertainty formula:
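import pandas as pd

data = {'target_column': [100, 105, 110]}

class data_analysis:
    def __init__(self, data, percentage_uncertainty):
        # Keep the DataFrame and its uncertainties together on one object
        self.df = pd.DataFrame(data)
        # Absolute uncertainty: a fixed fraction of each value in the column
        self.uncertainty = percentage_uncertainty * self.df['target_column'].values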
When I run
example=data_analysis(data,.01)
example.uncertainty
I get array([1.  , 1.05, 1.1 ])
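The same idea can be extended so that units are stored too and deleting a column also removes its uncertainties, which is the case the question wanted to avoid handling by hand. This is only a rough sketch: the class name MeasuredFrame, the drop_column method, and the choice of one relative uncertainty shared by all columns are assumptions made for illustration, not part of Pandas.

```python
import pandas as pd


class MeasuredFrame:
    """Hypothetical wrapper keeping data, uncertainties and units together."""

    def __init__(self, data, relative_uncertainty, units=None):
        self.df = pd.DataFrame(data)
        # Assumption of this sketch: one relative uncertainty for every column.
        self.relative_uncertainty = relative_uncertainty
        # Units are stored per column name, e.g. {"length": "mm"}.
        self.units = dict(units or {})

    @property
    def uncertainties(self):
        # Computed on demand, so the result always matches the current columns.
        return self.df.abs() * self.relative_uncertainty

    def drop_column(self, name):
        # Dropping a column also drops its unit; its uncertainties disappear
        # automatically because they are derived from self.df.
        self.df = self.df.drop(columns=name)
        self.units.pop(name, None)


frame = MeasuredFrame(
    {"length": [100, 105, 110], "mass": [2.0, 4.0, 8.0]},
    relative_uncertainty=0.01,
    units={"length": "mm", "mass": "kg"},
)
frame.drop_column("mass")
print(frame.uncertainties)  # only the "length" column remains
print(frame.units)
```

Because the uncertainties are a property derived from the wrapped DataFrame instead of a separately stored array, they cannot get out of sync with the data when columns are removed.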
Hope this helps