Suppose I concatenate two DataFrames like so:
import numpy as np
import pandas as pd
array1 = np.random.randn(3,3)
array2 = np.random.randn(3,3)
df1 = pd.DataFrame(array1, columns=list('ABC'))
df2 = pd.DataFrame(array2, columns=list('ABC'))
df = pd.concat([df1, df2])
The resulting DataFrame df looks like this:
          A         B         C
0  1.297362  0.745510 -0.206756
1 -0.056807 -1.875149 -0.210556
2  0.310837 -1.068873  2.054006
0  1.163739 -0.678165  2.626052
1 -0.557625 -1.448195 -1.391434
2  0.222607 -0.334348  0.672643
Note that the indices are the same as in the original DataFrames. I'd like to re-index df such that the indices simply run from 0 to 5. How can I do this?
(I've tried df = df.reindex(index=range(df.shape[0])), but this gives ValueError: cannot reindex from a duplicate axis. This is because the original axis contains duplicates: two 0s, two 1s, etc.)
You want to pass ignore_index=True to concat:
In [68]:
array1 = np.random.randn(3,3)
array2 = np.random.randn(3,3)
df1 = pd.DataFrame(array1, columns=list('ABC'))
df2 = pd.DataFrame(array2, columns=list('ABC'))
df = pd.concat([df1, df2], ignore_index=True)
df
Out[68]:
          A         B         C
0 -0.091094  0.460133 -0.548937
1 -0.839469 -1.354138 -0.823666
2  0.088581 -1.142542 -1.746608
3  0.067320  1.014533 -1.294371
4  2.094135  0.622129  1.203257
5  0.415768 -0.467081 -0.740371
This ignores the existing indices, so the concatenated DataFrame gets a new index running from 0 to 5.
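If the frames have already been concatenated without ignore_index, calling reset_index(drop=True) on the result should give the same index. Here is a minimal sketch comparing the two approaches; the names df_a and df_b are only for illustration and do not appear in the question:

```python
import numpy as np
import pandas as pd

# Same setup as in the question: two 3x3 frames with columns A, B, C.
df1 = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'))
df2 = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'))

# Option 1: discard the original indices while concatenating.
df_a = pd.concat([df1, df2], ignore_index=True)

# Option 2: concatenate first, then replace the duplicated index.
# drop=True stops the old index from being kept as an extra column.
df_b = pd.concat([df1, df2]).reset_index(drop=True)

print(list(df_a.index))   # [0, 1, 2, 3, 4, 5]
print(df_a.equals(df_b))  # True
```

Unlike reindex, reset_index does not look rows up by their existing labels, so the duplicate labels are not a problem for it.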