I have the following toy code:
 import pandas as pd
 df = pd.DataFrame()
 df["foo"] = [1,2,3,4]
 df2 = pd.DataFrame()
 df2["bar"]=[4,5,6,7]  
 df = pd.concat([df,df2], ignore_index=True,axis=1)
 print(list(df))
Output: [0,1]
Expected Output: [foo,bar] (order is not important)
Is there any way to concatenate two dataframes without losing the original column headers, if I can guarantee that the headers will be unique?
Iterating through the columns and then adding them to one of the DataFrames comes to mind, but is there a pandas function, or concat parameter that I am unaware of?
Thanks!
pandas can concatenate DataFrames while keeping only their common columns if you pass the join='inner' argument to pd.concat().
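To make the join='inner' remark concrete, here is a minimal sketch. The frames `left` and `right` and their column names are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical frames that share only the "key" column
left = pd.DataFrame({"key": [1, 2], "foo": [10, 20]})
right = pd.DataFrame({"key": [3, 4], "bar": [30, 40]})

# Row-wise concat (axis=0 is the default); join="inner" keeps only
# the columns present in every input frame
common = pd.concat([left, right], join="inner", ignore_index=True)
print(list(common.columns))  # ['key']
```

With the default join='outer', the result would instead have all three columns, with NaN where a frame lacks a column.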
merge() for combining data on common columns or indices. join() for combining data on a key column or an index. concat() for combining DataFrames across rows or columns.
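A small side-by-side sketch of the three functions may help here. The frames `a` and `b` and the "id" column are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical frames with a partially overlapping "id" column
a = pd.DataFrame({"id": [1, 2, 3], "foo": [10, 20, 30]})
b = pd.DataFrame({"id": [2, 3, 4], "bar": ["x", "y", "z"]})

# merge(): match rows on a common column (inner join by default)
merged = pd.merge(a, b, on="id")

# join(): match rows on the index (left join by default)
joined = a.set_index("id").join(b.set_index("id"))

# concat(): stack frames along an axis (rows here), no key matching
stacked = pd.concat([a, b], ignore_index=True)

print(merged["id"].tolist())   # [2, 3]
print(joined.index.tolist())   # [1, 2, 3]
print(len(stacked))            # 6
```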
The concat() function in pandas is used to append either columns or rows from one DataFrame to another. The concat() function does all the heavy lifting of performing concatenation operations along an axis while performing optional set logic (union or intersection) of the indexes (if any) on the other axes.
As stated in the pandas "Merge, join, and concat" documentation, ignore_index=True discards the labels along the concatenation axis and replaces them with a range (0...n-1). With axis=1 those labels are the column names, which is why you get [0, 1]. You should get the result you want once you remove the ignore_index argument or set it to False (the default).
df = pd.concat([df, df2], axis=1)
This joins df and df2 on their indexes: rows with the same index label are placed side by side, and where one DataFrame has no row for a given label, its columns are filled with NaN for that row.
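To illustrate the index alignment, here is a sketch based on the toy data from the question. The shifted index on df2 is an assumption added to show the NaN filling.

```python
import pandas as pd

df = pd.DataFrame({"foo": [1, 2, 3, 4]})                       # index 0..3
df2 = pd.DataFrame({"bar": [4, 5, 6, 7]}, index=[2, 3, 4, 5])  # assumed shifted index

result = pd.concat([df, df2], axis=1)

print(list(result.columns))  # ['foo', 'bar']
print(result)
# Labels 0 and 1 have no "bar" value, labels 4 and 5 have no "foo" value,
# so those cells are NaN
```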
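If your DataFrames have different indexes and you want to concatenate them by position anyway, you can create a temporary index and join on that. Alternatively, you can keep concat(..., ignore_index=True) and set the new DataFrame's columns afterwards. Note that ignore_index only resets the labels along the concatenation axis, so with axis=1 the rows are still aligned on the index.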
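Both workarounds could look roughly like this. It is a sketch, not the only way to do it: reset_index(drop=True) stands in for the "temporary index", and the mismatched index on df2 is assumed for the example.

```python
import pandas as pd

df = pd.DataFrame({"foo": [1, 2, 3, 4]})
df2 = pd.DataFrame({"bar": [4, 5, 6, 7]}, index=[10, 11, 12, 13])  # assumed mismatched index

# Option 1: give both frames a temporary 0..n-1 index, then concat by column
by_position = pd.concat(
    [df.reset_index(drop=True), df2.reset_index(drop=True)], axis=1
)

# Option 2: keep ignore_index=True and restore the column names afterwards
# (the rows still align on the index, so df2 is reset here as well)
names = list(df.columns) + list(df2.columns)
renamed = pd.concat([df, df2.reset_index(drop=True)], axis=1, ignore_index=True)
renamed.columns = names

print(list(by_position.columns))  # ['foo', 'bar']
print(list(renamed.columns))      # ['foo', 'bar']
```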