I am trying to process some .csv data using pandas, and I am struggling with something that I am sure is a rookie move, but after spending a lot of time trying to make this work, I need your help.
Essentially, I am trying to find the index of a value within a dataframe I have created.
max_value = cd_gross_revenue.max()
# max value of the cd_gross_revenue dataframe
print(max_value)
# finds max value, no problem!
maxindex = cd_gross_revenue.idxmax()
print(maxindex)
# finds index of max value, what I wanted!
print(max_value.index)
# ERROR: AttributeError: 'numpy.float64' object has no attribute 'index'
The maxindex variable gets me the answer using idxmax(), but what if I am not looking for the index of the max value? What if I want the index of some arbitrary value, how would I go about it? Clearly .index does not work for me here, because max() returns a plain number rather than a pandas object.
Thanks in advance for any help!
Use a boolean mask to select the rows where the value equals the value you are looking for. Then use that mask to index the dataframe or series, and read the .index attribute of the result. An example is:
In [9]: s = pd.Series(range(10,20))
In [10]: s
Out[10]:
0 10
1 11
2 12
3 13
4 14
5 15
6 16
7 17
8 18
9 19
dtype: int64
In [11]: val_mask = s == 13
In [12]: val_mask
Out[12]:
0 False
1 False
2 False
3 True
4 False
5 False
6 False
7 False
8 False
9 False
dtype: bool
In [15]: s[val_mask]
Out[15]:
3 13
dtype: int64
In [16]: s[val_mask].index
Out[16]: Int64Index([3], dtype='int64')
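The same mask approach also covers repeated values and floating-point data like the revenue figures in the question. The following is only a sketch: the series `revenue`, its labels, and its values are made up for illustration, and `np.isclose` is swapped in for `==` as one possible way to compare floats within a tolerance.

```python
import numpy as np
import pandas as pd

# Hypothetical revenue data; names and values are illustrative only.
revenue = pd.Series([12.5, 40.0, 7.25, 40.0], index=["a", "b", "c", "d"])

# Index labels of every row equal to the target value.
target = 40.0
labels = revenue[revenue == target].index.tolist()

# For floats, exact equality can miss values that differ by rounding error;
# np.isclose builds the mask using a tolerance instead.
close_labels = revenue[np.isclose(revenue, 7.25)].index.tolist()

print(labels)
print(close_labels)
```

Note that the result is a list of labels, since more than one row can hold the same value.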
s[s==13]
For example:
from pandas import Series
s = Series(range(10,20))
s[s==13]
3 13
dtype: int64
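If only a single label is wanted rather than a filtered series, one option is to take the first entry of the filtered index. This is a rough sketch; `first_index_of` is a hypothetical helper written for this example, not a pandas function.

```python
import pandas as pd

s = pd.Series(range(10, 20))

def first_index_of(series, value):
    """Hypothetical helper: label of the first match, or None if there is none."""
    matches = series[series == value].index
    return matches[0] if len(matches) > 0 else None

print(first_index_of(s, 13))
print(first_index_of(s, 99))
```

Returning None for a missing value is just one design choice; raising an error would work as well.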
When you called idxmax, it returned the key in the index that corresponds to the max value. To get the value itself, pass that key back to the dataframe with .loc:
max_key = cd_gross_revenue.idxmax()
max_value = cd_gross_revenue.loc[max_key]
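To see the round trip from key to value end to end, here is a small self-contained sketch. The data is an assumption: the question does not show what cd_gross_revenue contains, so a short made-up series stands in for it.

```python
import pandas as pd

# Hypothetical stand-in for cd_gross_revenue; the real data is not shown in the question.
cd_gross_revenue = pd.Series([120.0, 455.5, 310.25], index=["jan", "feb", "mar"])

max_key = cd_gross_revenue.idxmax()        # label of the largest value
max_value = cd_gross_revenue.loc[max_key]  # value stored under that label

print(max_key, max_value)
```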