
python pandas select both head and tail

For a DataFrame in pandas, how can I select both the first 5 rows and the last 5 rows?

For example:

In [11]: df
Out[11]: 
        A  B  C
2012-11-29  0  0  0
2012-11-30  1  1  1
2012-12-01  2  2  2
2012-12-02  3  3  3
2012-12-03  4  4  4
2012-12-04  5  5  5
2012-12-05  6  6  6
2012-12-06  7  7  7
2012-12-07  8  8  8
2012-12-08  9  9  9

How can I show the first two and the last two rows?

fu xue asked Feb 28 '17 09:02



3 Answers

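You can use iloc with numpy.r_, which concatenates the two slices into a single array of positions:

import numpy as np

print(np.r_[0:2, -2:0])
[ 0  1 -2 -1]

# Index the original df each time; reassigning df would truncate it
# before the next selection.
print(df.iloc[np.r_[0:2, -2:0]])
            A  B  C
2012-11-29  0  0  0
2012-11-30  1  1  1
2012-12-07  8  8  8
2012-12-08  9  9  9

print(df.iloc[np.r_[0:4, -4:0]])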
            A  B  C
2012-11-29  0  0  0
2012-11-30  1  1  1
2012-12-01  2  2  2
2012-12-02  3  3  3
2012-12-05  6  6  6
2012-12-06  7  7  7
2012-12-07  8  8  8
2012-12-08  9  9  9
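
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and wraps the iloc / np.r_ selection in a function. The helper name head_tail_iloc is my own invention, not a pandas API, and it assumes n is at least 1 and no larger than half the number of rows.

```python
import numpy as np
import pandas as pd

# Rebuild the question's sample frame: 10 daily rows, every column equal to the row number.
idx = pd.date_range("2012-11-29", periods=10, freq="D")
df = pd.DataFrame({"A": range(10), "B": range(10), "C": range(10)}, index=idx)


def head_tail_iloc(frame, n):
    """Return the first n and last n rows by position (hypothetical helper)."""
    # np.r_ turns the two slices into one array of positions, e.g. [0, 1, -2, -1] for n=2.
    # Assumption: 1 <= n <= len(frame) // 2, so the two ends do not overlap.
    return frame.iloc[np.r_[0:n, -n:0]]


result = head_tail_iloc(df, 2)
print(result)
```

Because the function never reassigns df, it can be called repeatedly with different values of n against the same frame.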
jezrael answered Oct 11 '22 01:10


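You can use df.head(5) and df.tail(5) to get the first five and the last five rows. Optionally, you can combine them into a new DataFrame. DataFrame.append() was deprecated in pandas 1.4 and removed in 2.0, so use pd.concat, with the head first to keep the original row order:

import pandas as pd

new_df = pd.concat([df.head(5), df.tail(5)])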
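
One thing to watch for when gluing head and tail together: if the frame has fewer than 2 * n rows, the two pieces overlap and some rows show up twice. Below is a rough sketch of a wrapper that guards against that, built on pd.concat. The name head_tail and the choice to return the whole frame for short inputs are my assumptions, not anything pandas provides.

```python
import pandas as pd


def head_tail(frame, n=5):
    """Return the first n and last n rows in original order (hypothetical helper)."""
    if len(frame) <= 2 * n:
        # head(n) and tail(n) would cover or overlap the whole frame,
        # so return every row exactly once instead of duplicating any.
        return frame.copy()
    return pd.concat([frame.head(n), frame.tail(n)])


df = pd.DataFrame({"A": range(10)})
short = head_tail(df, 2)
print(short)
```

With the 10-row frame above, head_tail(df, 2) keeps rows 0, 1, 8 and 9, while head_tail(df, 5) simply returns all ten rows.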
Linas Fx answered Oct 11 '22 00:10


Not quite the same question, but if you just want to show the top and bottom 5 rows (e.g. with display in Jupyter or a regular print), there is potentially a simpler way: use the pd.option_context context manager.

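import numpy as np
import pandas as pd

# make 100 rows of 3 random numbers
df = pd.DataFrame(np.random.randn(100,3))

# NOTE: the index of the row sums is just the original index, so this leaves the
# row order unchanged; use df.sum(axis=1).sort_values().index to sort by row sum
df = df.loc[df.sum(axis=1).index]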

with pd.option_context('display.max_rows',10):
    print(df)

Outputs:

           0         1         2
0  -0.649105 -0.413335  0.374872
1   3.390490  0.552708 -1.723864
2  -0.781308 -0.277342 -0.903127
3   0.433665 -1.125215 -0.290228
4  -2.028750 -0.083870 -0.094274
..       ...       ...       ...
95  0.443618 -1.473138  1.132161
96 -1.370215 -0.196425 -0.528401
97  1.062717 -0.997204 -1.666953
98  1.303512  0.699318 -0.863577
99 -0.109340 -1.330882 -1.455040

[100 rows x 3 columns]
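
If you want the truncated text as a string (for logging, say) rather than printed, something along these lines should work. It is only a sketch: it also sets display.min_rows, which controls how many rows pandas shows once display.max_rows is exceeded, so the 5 + 5 split does not depend on the default. The variable names are my own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 3))

# Remember the current setting so we can confirm it is restored afterwards.
before = pd.get_option("display.max_rows")

# Inside the block, frames longer than max_rows are truncated to min_rows rows,
# split between the top and the bottom with a ".." row in the middle.
with pd.option_context("display.max_rows", 10, "display.min_rows", 10):
    text = repr(df)

after = pd.get_option("display.max_rows")
print(text)
```

The options only apply inside the with block, so the rest of the session keeps its usual display settings.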
Bolster answered Oct 10 '22 23:10