The concat() function can be used to concatenate two DataFrames by adding the rows of one to the other. The merge() function is the pandas counterpart of the SQL JOIN clause: 'left', 'right', 'inner' and 'outer' joins are all possible through its how parameter.
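As a rough illustration of the join types mentioned above, here is a minimal sketch. The frames and the column names (key, name, score) are made up for the example and do not come from any particular dataset.

```python
import pandas as pd

# Hypothetical example data; column names are illustrative only.
left = pd.DataFrame({"key": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "score": [20, 30, 40]})

# Keep only keys present in both frames.
inner = pd.merge(left, right, on="key", how="inner")

# Keep every row of `left`; unmatched rows get NaN in `score`.
left_join = pd.merge(left, right, on="key", how="left")

# Keep every row of `right`; unmatched rows get NaN in `name`.
right_join = pd.merge(left, right, on="key", how="right")

# Keep keys from either frame.
outer = pd.merge(left, right, on="key", how="outer")
```

The how argument plays the role of the JOIN type in SQL, and on names the column to match on.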
Use concat() to combine DataFrames across rows or columns.
Merge DataFrames Using concat()
Here are the most commonly used parameters for the concat() function:
objs is the list of DataFrame objects ([df1, df2, ...]) to be concatenated.
axis defines the direction of the concatenation, 0 for row-wise and 1 for column-wise.
join can either be inner (intersection) or outer (union ...
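To make the axis and join parameters concrete, here is a small sketch. The two frames are invented for the example and deliberately share only column b.

```python
import pandas as pd

# Hypothetical frames that overlap only on column "b".
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"b": [5, 6], "c": [7, 8]})

# Row-wise, union of columns: missing cells are filled with NaN.
rows_outer = pd.concat([df1, df2], axis=0, join="outer")

# Row-wise, intersection of columns: only "b" survives.
rows_inner = pd.concat([df1, df2], axis=0, join="inner")

# Column-wise: frames are placed side by side, aligned on the index.
cols = pd.concat([df1, df2], axis=1)
```

With join="outer" the result has four rows and columns a, b and c. With join="inner" it has four rows and only column b.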
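On pandas versions before 2.0 you can use the append method (DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so prefer pd.concat in current code):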
bigdata = data1.append(data2, ignore_index=True)
To keep the original indexes, just omit the ignore_index keyword ...
You can also use pd.concat, which is particularly helpful when you are joining more than two DataFrames:
bigdata = pd.concat([data1, data2], ignore_index=True, sort=False)
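The effect of ignore_index is easiest to see on a tiny example. This is only a sketch: data1 and data2 here are small made-up frames standing in for the real ones.

```python
import pandas as pd

# Hypothetical stand-ins for data1 and data2, each with the default 0..n-1 index.
data1 = pd.DataFrame({"x": [1, 2]})
data2 = pd.DataFrame({"x": [3, 4]})

# Without ignore_index the original row labels are kept, so labels repeat.
kept = pd.concat([data1, data2])
print(kept.index.tolist())   # [0, 1, 0, 1]

# With ignore_index=True the result gets a fresh 0..n-1 index.
reset = pd.concat([data1, data2], ignore_index=True)
print(reset.index.tolist())  # [0, 1, 2, 3]
```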
Adding this here in case someone finds it useful. @ostrokach already mentioned how you can merge the data frames across rows:
df_row_merged = pd.concat([df_a, df_b], ignore_index=True)
To merge across columns, you can use the following syntax:
df_col_merged = pd.concat([df_a, df_b], axis=1)
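One thing to watch with axis=1 is that rows are matched by index label, not by position. A minimal sketch with made-up frames whose indexes only partly overlap:

```python
import pandas as pd

# Hypothetical frames with different index labels.
df_a = pd.DataFrame({"a": [1, 2]}, index=[0, 1])
df_b = pd.DataFrame({"b": [3, 4]}, index=[1, 2])

# Aligned on index labels: the result covers labels 0, 1 and 2,
# with NaN wherever one frame has no row for a label.
aligned = pd.concat([df_a, df_b], axis=1)

# To pair rows purely by position, reset both indexes first.
positional = pd.concat(
    [df_a.reset_index(drop=True), df_b.reset_index(drop=True)],
    axis=1,
)
```

If the column-wise result has more rows than expected or unexpected NaN values, mismatched indexes are a likely cause.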
If you're working with big data and need to concatenate multiple datasets, calling concat many times can get performance-intensive.
If you don't want to create a new df each time, you can instead aggregate the changes and call concat only once:
frames = [df_A, df_B] # Or perform operations on the DFs
result = pd.concat(frames)
This is pointed out in the pandas docs, at the bottom of the section on concatenating objects:
Note: It is worth noting however, that concat (and therefore append) makes a full copy of the data, and that constantly reusing this function can create a significant performance hit. If you need to use the operation over several datasets, use a list comprehension.
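The list-comprehension pattern from the note can be sketched as follows. make_chunk is a hypothetical helper standing in for whatever loads or computes each dataset; it is not part of pandas.

```python
import pandas as pd


def make_chunk(i):
    """Hypothetical stand-in for reading or computing one dataset."""
    return pd.DataFrame({"chunk": [i, i], "value": [i * 10, i * 10 + 1]})


# Build all the pieces first ...
frames = [make_chunk(i) for i in range(3)]

# ... then copy the data once, instead of once per loop iteration.
result = pd.concat(frames, ignore_index=True)
```

Calling pd.concat inside a loop copies the accumulated data on every iteration, while collecting the frames in a list and concatenating at the end copies it a single time.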