
Will passing ignore_index=True to pd.concat preserve index succession within dataframes that I'm concatenating?

I have two dataframes:

df1 = 
    value
0     a
1     b
2     c

df2 =
    value
0     d
1     e

I need to concatenate them along the index (row-wise), keeping the index of the first dataframe and continuing it through the second dataframe, like this:

result =
    value
0     a
1     b
2     c
3     d
4     e

My guess is that pd.concat([df1, df2], ignore_index=True) will do the job. However, I'm worried that for large dataframes the order of the rows may change, and I'll end up with something like this (the first two rows have swapped places):

result =
    value
0     b
1     a
2     c
3     d
4     e

So my question is: does pd.concat with ignore_index=True preserve the row order within the dataframes being concatenated, or is there any randomness in the index assignment?

Alexandr Kapshuk asked Jun 11 '19 14:06




1 Answer

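In my experience, pd.concat stacks the rows in the order the DataFrames are passed to it and keeps the row order within each DataFrame. ignore_index=True only replaces the index labels with 0, 1, ..., n-1; it does not reorder anything.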
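A quick way to convince yourself on larger inputs is to compare the concatenated values with the original values laid end to end. This is only a sketch: the frame sizes and the random values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical larger frames with unsorted values, so any reordering would show up.
rng = np.random.default_rng(0)
df1 = pd.DataFrame({"value": rng.integers(0, 1000, size=100_000)})
df2 = pd.DataFrame({"value": rng.integers(0, 1000, size=50_000)})

result = pd.concat([df1, df2], ignore_index=True)

# Expected row order: all of df1's rows, then all of df2's rows.
expected = np.concatenate([df1["value"].to_numpy(), df2["value"].to_numpy()])
order_preserved = bool((result["value"].to_numpy() == expected).all())

# ignore_index=True only relabels the rows as 0..n-1.
index_is_range = result.index.equals(pd.RangeIndex(len(df1) + len(df2)))

print(order_preserved, index_is_range)
```

If both checks print True, the rows came out in the same order they went in and only the index labels were renumbered.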


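If you want to be safe, specify sort=False. Note that sort only controls whether the columns are sorted when they are not aligned between the DataFrames; it does not touch the row order: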

pd.concat([df1, df2], axis=0, sort=False, ignore_index=True)

    value
0     a
1     b
2     c
3     d
4     e
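To see what sort actually changes, here is a small sketch with two frames whose columns do not line up. The frames left and right and their column names are invented for this example; they are not from the question.

```python
import pandas as pd

# Hypothetical frames with deliberately misaligned, unsorted columns.
left = pd.DataFrame({"b": [1, 2], "a": [3, 4]})
right = pd.DataFrame({"a": [5], "c": [6]})

unsorted_cols = pd.concat([left, right], sort=False, ignore_index=True)
sorted_cols = pd.concat([left, right], sort=True, ignore_index=True)

# Column order differs between the two results...
print(list(unsorted_cols.columns))
print(list(sorted_cols.columns))

# ...but the row order of the shared column "a" is the same in both.
print(unsorted_cols["a"].tolist(), sorted_cols["a"].tolist())
```

With sort=False the columns should keep their order of first appearance, while sort=True sorts them by label. In both cases the rows stay in the order the frames were passed.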
cs95 answered Oct 17 '22 08:10