I've been looking around for this but I can't seem to find it (though it must be extremely trivial).
The problem that I have is that I would like to retrieve the value of a column for the first and last entries of a data frame. But if I do:
df.ix[0]['date']
I get:
datetime.datetime(2011, 1, 10, 16, 0)
but if I do:
df[-1:]['date']
I get:
myIndex
13 2011-12-20 16:00:00
Name: mydate
with a different format. Ideally, I would like to be able to access the value of the last index of the data frame, but I can't find how.
I even tried to create a column (IndexCopy) with the values of the index and try:
df.ix[df.tail(1)['IndexCopy']]['mydate']
but this also yields a different format (since df.tail(1)['IndexCopy'] does not output a simple integer).
Any ideas?
Pandas iloc retrieves data by integer position. In Python, negative indices count from the end, so position -1 refers to the last element and gives the same result as length - 1.
Using the tail() method: DataFrame.tail(n) returns the last n rows of the DataFrame. It takes one optional argument n (the number of rows to return from the end). The default is n = 5, so the last 5 rows are returned if n is not passed.
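As a minimal sketch of both approaches, assuming a small made-up frame with a date column and a non-default index (the data and variable names below are illustrative, not taken from the question):

```python
import pandas as pd

# Illustrative data only: three timestamps on a non-default integer index.
df = pd.DataFrame(
    {"date": pd.to_datetime(["2011-01-10 16:00", "2011-06-01 16:00", "2011-12-20 16:00"])},
    index=[11, 12, 13],
)

# Positional access: 0 is the first row, -1 the last, whatever the index labels are.
first_value = df["date"].iloc[0]
last_value = df["date"].iloc[-1]

# tail(1) returns a one-row DataFrame, so a scalar still has to be pulled out of it.
last_row = df.tail(1)
last_value_via_tail = last_row["date"].iloc[0]
```

Both routes give a scalar timestamp rather than a one-element Series, which is the "different format" the question runs into with df[-1:]['date'].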
The former answer is now superseded by .iloc:
>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
date
17 10
18 18
19 26
20 34
21 42
22 50
23 58
>>> df["date"].iloc[0]
10
>>> df["date"].iloc[-1]
58
The shortest way I can think of uses .iget():
>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
date
17 10
18 18
19 26
20 34
21 42
22 50
23 58
>>> df['date'].iget(0)
10
>>> df['date'].iget(-1)
58
Alternatively:
>>> df['date'][df.index[0]]
10
>>> df['date'][df.index[-1]]
58
There's also .first_valid_index() and .last_valid_index(), but depending on whether or not you want to rule out NaNs they might not be what you want.
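To illustrate the difference, here is a small sketch with NaNs at both ends of a Series (the values are made up for the example):

```python
import numpy as np
import pandas as pd

# Illustrative Series whose first and last entries are NaN.
s = pd.Series([np.nan, 18.0, 26.0, np.nan], index=[17, 18, 19, 20])

first_label = s.index[0]             # label of the first row, NaN or not
last_label = s.index[-1]             # label of the last row, NaN or not
first_valid = s.first_valid_index()  # label of the first non-NaN row
last_valid = s.last_valid_index()    # label of the last non-NaN row
```

Here the first and last labels are 17 and 20, while the first and last valid labels are 18 and 19, so the two approaches only agree when the end rows are not NaN.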
Remember that df.ix[0] doesn't give you the first, but the one indexed by 0. For example, in the above case, df.ix[0] would produce
>>> df.ix[0]
Traceback (most recent call last):
File "<ipython-input-489-494245247e87>", line 1, in <module>
df.ix[0]
[...]
KeyError: 0
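Since .ix was removed in pandas 1.0, the same label-versus-position distinction can be sketched with .loc and .iloc. This reuses the example frame above; the helper variable names are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({"date": range(10, 64, 8)})
df.index += 17  # labels now run from 17 to 23

# Label-based lookup: there is no row labelled 0, so this raises KeyError.
try:
    df.loc[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Position-based lookup: row 0 is the first row regardless of its label.
first_by_position = df.iloc[0]["date"]

# Label-based lookup with a label that does exist.
first_by_label = df.loc[17, "date"]
```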
Combining @comte's answer and dmdip's answer in "Get index of a row of a pandas dataframe as an integer":
df.tail(1).index.item()
gives you the value of the index.
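A quick sketch on the earlier example frame, comparing this with plain positional indexing on the index itself (both are existing pandas calls; the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"date": range(10, 64, 8)})
df.index += 17  # labels 17..23

# One-row tail, then its single index label converted to a Python scalar.
last_label_via_tail = df.tail(1).index.item()

# Same label, read straight from the index by position.
last_label_direct = df.index[-1]

# The label can then be used for a label-based lookup.
last_value = df.loc[last_label_direct, "date"]
```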
Note that index labels are not always unique, whether the index is a multi-index or a single index. Modifying dataframes by label might therefore result in unexpected behavior. The example below uses a multi-index, but the same holds for a single index.
Say we have
df = pd.DataFrame({'x':[1,1,3,3], 'y':[3,3,5,5]}, index=[11,11,12,12]).stack()
11  x    1
    y    3
    x    1
    y    3
12  x    3
    y    5  # the index is (12, 'y')
    x    3
    y    5  # the index is also (12, 'y')
df.tail(1).index.item()  # gives (12, 'y')
Trying to access the last element with the index df[12, "y"] yields
(12, y) 5
(12, y) 5
dtype: int64
If you attempt to modify the dataframe based on the index (12, y), you will modify two rows rather than one. Thus, even though we learned how to access the value of the last row's index, it might not be a good idea to change the values of the last row based on its index, as several rows could share the same index. Use df.iloc[-1] to access the last row in this case.
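The same pitfall can be sketched in the simpler single-index case mentioned above. This uses a made-up Series with duplicate labels rather than the stacked multi-index example:

```python
import pandas as pd

# Illustrative Series with duplicate labels: two rows share the label 12.
s = pd.Series([1, 3, 5, 5], index=[11, 11, 12, 12])

# Label-based assignment touches every row labelled 12.
by_label = s.copy()
by_label.loc[s.index[-1]] = 0

# Position-based assignment touches only the last row.
by_position = s.copy()
by_position.iloc[-1] = 0
```

After this, by_label holds 1, 3, 0, 0 while by_position holds 1, 3, 5, 0, which is why positional access is the safer choice when labels may repeat.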
Reference
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.item.html