I have the following DataFrame gathering daily stats on two measures, A and B:
                  A             B
count  17266.000000  17266.000000
std        0.179003      0.178781
75%      101.102251    101.053214
min      100.700993    100.651956
mean     101.016747    100.964003
max      101.540214    101.491178
50%      100.988465    100.938694
25%      100.885251    100.830048
Below is a piece of code that creates it:
import pandas

# Summary statistics for one day, keyed by measure, then by statistic
day1 = {
    'A': {
        'count': 17266.0,
        'std': 0.17900265293286116,
        'min': 100.70099294189714,
        'max': 101.54021448871775,
        '50%': 100.98846526697825,
        '25%': 100.88525124427971,
        '75%': 101.10225131847992,
        'mean': 101.01674677794136
    },
    'B': {
        'count': 17266.0,
        'std': 0.17878125983374854,
        'min': 100.65195609992342,
        'max': 101.49117764674403,
        '50%': 100.93869409089723,
        '25%': 100.83004837814667,
        '75%': 101.05321447650618,
        'mean': 100.96400305527138
    }
}
# orient='index' puts the measures on the rows; .T moves them to the columns
df = pandas.DataFrame.from_dict(day1, orient='index').T
The data come straight from a describe() call. I have several such results (one for each day) and I would like to gather them all into a single DataFrame that has the date as an index.
The most obvious way to obtain that would be to stack all the daily results into one DataFrame, then group it by day and run the stats on the result. However, I would like an alternative method because I run into a MemoryError with the amount of data I process.
The final outcome should look like this:
                             A             B
2014-12-24 count  15895.000000  15895.000000
           mean      99.943618     99.968860
           std        0.012468      0.011932
           min       99.877695     99.928778
           25%       99.934890     99.960445
           50%       99.943453     99.968847
           75%       99.952340     99.977571
           max       99.982930    100.002507
2014-12-25 count  16278.000000  16278.000000
           mean      99.937056     99.962203
           std        0.012395      0.012661
           min       99.884501     99.910567
           25%       99.928078     99.953758
           50%       99.936754     99.962411
           75%       99.945914     99.971473
           max       99.981512    100.003770
If you are able to make a dict of {date: describe_df_for_that_day}, then you can use pd.concat(dict).
Starting with your df:
In [14]: d = {'2014-12-24': df, '2014-12-25': df}
In [15]: pd.concat(d)
Out[15]:
                             A             B
2014-12-24 count  17266.000000  17266.000000
           std        0.179003      0.178781
           75%      101.102251    101.053214
           min      100.700993    100.651956
           mean     101.016747    100.964003
           max      101.540214    101.491178
           50%      100.988465    100.938694
           25%      100.885251    100.830048
2014-12-25 count  17266.000000  17266.000000
           std        0.179003      0.178781
           75%      101.102251    101.053214
           min      100.700993    100.651956
           mean     101.016747    100.964003
           max      101.540214    101.491178
           50%      100.988465    100.938694
           25%      100.885251    100.830048
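Since the concern in the question is memory, here is a rough sketch of how the dict might be built one day at a time, so that only the small describe() results are kept around. The load_day function, its seed argument and the synthetic numbers are made up for illustration; they stand in for however each day's raw data is actually read.

```python
import numpy as np
import pandas as pd


def load_day(seed, n_rows=1000):
    """Hypothetical loader: returns synthetic raw data for a single day."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "A": rng.normal(100.0, 0.01, n_rows),
        "B": rng.normal(100.0, 0.01, n_rows),
    })


days = ["2014-12-24", "2014-12-25"]

daily_stats = {}
for seed, day in enumerate(days):
    raw = load_day(seed)               # only one day's raw data is in memory
    daily_stats[day] = raw.describe()  # keep just the 8-row summary
    del raw                            # drop the reference to the raw data

# The dict keys become the outer level of the resulting MultiIndex
stats = pd.concat(daily_stats)
print(stats)
```

With this approach the peak memory use should be roughly that of the largest single day rather than of all days stacked together.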
You can of course make the keys real dates instead of strings.
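As a small sketch of what real date keys could look like: the keys below are pd.Timestamp objects, and the names argument of pd.concat labels the two index levels. The level names "date" and "stat" are my own choice, and df is trimmed to two rows (rounded values from the question) to keep the example short.

```python
import pandas as pd

# Trimmed stand-in for one day's describe() output
df = pd.DataFrame(
    {"A": [17266.0, 101.016747], "B": [17266.0, 100.964003]},
    index=["count", "mean"],
)

d = {pd.Timestamp("2014-12-24"): df, pd.Timestamp("2014-12-25"): df}

# names= labels the outer (date) and inner (statistic) index levels
result = pd.concat(d, names=["date", "stat"])

# All statistics for a single day
one_day = result.loc[pd.Timestamp("2014-12-25")]

# One statistic across all days, indexed by date
mean_by_day = result.xs("mean", level="stat")
print(mean_by_day)
```

With Timestamp keys the outer index level is a DatetimeIndex, so date-based selection works on the combined frame.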