I have tried several ways to horizontally concatenate DataFrame objects from the Python Data Analysis Library (pandas), but my attempts have failed so far.
I have two dataframes:
df_1:
      col2    col3
col1
str1     1  1.5728
str2     2  2.4627
str3     3  3.6143
df_2:
      col2    col3
col1
str1     4  4.5345
str2     5  5.1230
str3     6  6.1233
I want the resulting dataframe to be df_1 and df_2 side by side:
      col2    col3  col1  col2    col3
col1
str1     1  1.5728  str1     4  4.5345
str2     2  2.4627  str2     5  5.1230
str3     3  3.6143  str3     6  6.1233
Here is some code which creates the dataframes:
import pandas as pd
column_headers = ["col1", "col2", "col3"]
d_1 = dict.fromkeys(column_headers)
d_1["col1"] = ["str1", "str2", "str3"]
d_1["col2"] = [1, 2, 3]
d_1["col3"] = [1.5728, 2.4627, 3.6143]
df_1 = pd.DataFrame(d_1)
df_1 = df_1.set_index("col1")
print("df_1:")
print(df_1)
print()
d_2 = dict.fromkeys(column_headers)
d_2["col1"] = ["str1", "str2", "str3"]
d_2["col2"] = [4, 5, 6]
d_2["col3"] = [4.5345, 5.123, 6.1233]
df_2 = pd.DataFrame(d_2)
df_2 = df_2.set_index("col1")
print("df_2:")
print(df_2)
print()
An outer join fails to horizontally concatenate df_1 and df_2:
merged_df = df_1.join(df_2, how='outer')
We get the following error message:
ValueError: columns overlap but no suffix specified: Index(['col2', 'col3'], dtype='object')
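The error message hints at one way to make join itself work: both frames have columns named col2 and col3, so join needs suffixes to tell them apart. A minimal sketch, assuming the same data as above (rebuilt here so the snippet runs on its own); the suffix strings "_1" and "_2" are arbitrary choices for illustration:

```python
import pandas as pd

# Rebuild the two example frames, indexed by col1.
df_1 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}
).set_index("col1")
df_2 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}
).set_index("col1")

# lsuffix/rsuffix rename the overlapping columns so the join no longer raises.
# "_1" and "_2" are illustrative; any distinct suffixes would do.
joined_df = df_1.join(df_2, how="outer", lsuffix="_1", rsuffix="_2")
print(joined_df)
```

The column names change in this variant (col2_1, col3_1, col2_2, col3_2), which may or may not be acceptable depending on what is done with the result afterwards.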
Making a dictionary of dictionaries does not work:
# Make a dictionary of dictionaries
merged_d = dict()
merged_d[1] = d_1
merged_d[2] = d_2
merged_df = pd.DataFrame(merged_d)
print(merged_df)
The resulting DataFrame looks like this:
1 2
col1 [str1, str2, str3] [str1, str2, str3]
col2 [1, 2, 3] [4, 5, 6]
col3 [1.5728, 2.4627, 3.6143] [4.5345, 5.123, 6.1233]
Making a dictionary of DataFrames does not seem to work either:
merged_d = dict()
merged_d[1] = df_1
merged_d[2] = df_2
merged_df = pd.DataFrame(merged_d)
print(merged_df)
We get the following error message:
ValueError: If using all scalar values, you must pass an index
Passing an index into the DataFrame constructor does not help much:
merged_df = pd.DataFrame(data = merged_d, index = [1, 2])
We get the error:
ValueError: cannot copy sequence with size 2 to array axis with dimension 3
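For reference, a dictionary of DataFrames like merged_d is something pd.concat accepts directly, even though the DataFrame constructor does not. A minimal sketch, assuming the same df_1 and df_2 as above (rebuilt here so the snippet is self-contained); the dictionary keys 1 and 2 become the outer level of the column labels:

```python
import pandas as pd

# Rebuild the two example frames, indexed by col1.
df_1 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}
).set_index("col1")
df_2 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}
).set_index("col1")

# Same dictionary of DataFrames as in the failed attempt.
merged_d = {1: df_1, 2: df_2}

# Passing the mapping to pd.concat with axis=1 places the frames side by side;
# the keys (1 and 2) label which frame each column came from.
merged_df = pd.concat(merged_d, axis=1)
print(merged_df)

# A single column is addressed by a (key, column name) pair.
print(merged_df[(2, "col2")])
```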
Use pd.concat with axis=1 instead of join or merge, i.e.:
ndf = pd.concat([df_1, df_2], axis=1)
      col2    col3  col2    col3
col1
str1     1  1.5728     4  4.5345
str2     2  2.4627     5  5.1230
str3     3  3.6143     6  6.1233
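The output above does not repeat col1 as a regular column between the two frames, which the desired layout in the question does. A minimal sketch of one way to get that layout, assuming the same data as in the question (rebuilt here so the snippet runs on its own); the name df_2_with_key is made up for this example:

```python
import pandas as pd

# Rebuild the two example frames, indexed by col1.
df_1 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [1, 2, 3], "col3": [1.5728, 2.4627, 3.6143]}
).set_index("col1")
df_2 = pd.DataFrame(
    {"col1": ["str1", "str2", "str3"], "col2": [4, 5, 6], "col3": [4.5345, 5.123, 6.1233]}
).set_index("col1")

# Keep col1 both as the index (for alignment) and as an ordinary column.
# df_2_with_key is a hypothetical helper name, not part of the original code.
df_2_with_key = df_2.reset_index().set_index("col1", drop=False)

# Rows are aligned on the shared col1 index, columns are placed side by side.
ndf = pd.concat([df_1, df_2_with_key], axis=1)
print(ndf)
```

Note that the result has duplicate column labels (col2 and col3 appear twice), so selecting ndf["col2"] returns both columns; positional access with iloc or renaming the columns avoids that ambiguity.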