

Is there a way to horizontally concatenate dataframes of same length while ignoring the index?

I have dataframes I want to horizontally concatenate while ignoring the index.

I know that for arithmetic operations, ignoring the index can lead to a substantial speedup if you use the numpy array .values instead of the pandas Series. Is it possible to horizontally concatenate or merge pandas dataframes whilst ignoring the index? (To my dismay, ignore_index=True does something else.) And if so, does it give a speed gain?

import pandas as pd

df1 = pd.Series(range(10)).to_frame()

df2 = pd.Series(range(10), index=range(10, 20)).to_frame()

pd.concat([df1, df2], axis=1)
#      0    0
# 0   0.0  NaN
# 1   1.0  NaN
# 2   2.0  NaN
# 3   3.0  NaN
# 4   4.0  NaN
# 5   5.0  NaN
# 6   6.0  NaN
# 7   7.0  NaN
# 8   8.0  NaN
# 9   9.0  NaN
# 10  NaN  0.0
# 11  NaN  1.0
# 12  NaN  2.0
# 13  NaN  3.0
# 14  NaN  4.0
# 15  NaN  5.0
# 16  NaN  6.0
# 17  NaN  7.0
# 18  NaN  8.0
# 19  NaN  9.0

I know I can get the result I want by resetting the index of df2, but I wonder whether there is a faster way (perhaps a numpy method) to do this?
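For reference, here is a minimal sketch of the index-resetting approach mentioned above, next to what `ignore_index=True` does when `axis=1`. The names `aligned` and `relabelled` are only illustrative.

```python
import pandas as pd

df1 = pd.Series(range(10)).to_frame()
df2 = pd.Series(range(10), index=range(10, 20)).to_frame()

# Drop both indexes so the rows line up by position instead of by label.
aligned = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)],
    axis=1,
)

# With axis=1, ignore_index=True relabels the *columns* as 0..n-1;
# the rows are still aligned on the original index labels.
relabelled = pd.concat([df1, df2], axis=1, ignore_index=True)

print(aligned.shape)     # 10 rows, 2 columns, no NaN
print(relabelled.shape)  # 20 rows, 2 columns, half of the cells NaN
```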


`np.column_stack`

Equivalent to EdChum's `np.hstack` answer, except that the original column labels are kept.

import numpy as np

pd.DataFrame(
    np.column_stack([df1, df2]),
    columns=df1.columns.append(df2.columns)
)

   0  0
0  0  0
1  1  1
2  2  2
3  3  3
4  4  4
5  5  5
6  6  6
7  7  7
8  8  8
9  9  9
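One caveat worth checking before relying on this: stacking goes through a single numpy array, so columns of different dtypes are upcast to a common dtype. The sketch below uses two made-up frames, `ints` and `floats`, to show the effect.

```python
import numpy as np
import pandas as pd

# Hypothetical example frames with different dtypes and unrelated indexes.
ints = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])
floats = pd.DataFrame({"b": [0.5, 1.5, 2.5]}, index=[10, 11, 12])

stacked = pd.DataFrame(
    np.column_stack([ints, floats]),
    columns=ints.columns.append(floats.columns),
)

# Both columns come back as float64, because the intermediate
# numpy array can only hold one dtype.
print(stacked.dtypes)
```

If the frames mix numbers and strings, the common dtype would be `object`, which is usually slower to work with afterwards.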

Pandas Option with `assign`

You can do many things with the new columns.
I don't recommend this!

df1.assign(**df2.add_suffix('_').to_dict('list'))

   0  0_
0  0   0
1  1   1
2  2   2
3  3   3
4  4   4
5  5   5
6  6   6
7  7   7
8  8   8
9  9   9
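Another pandas-only possibility, not from the answer above, is to overwrite one frame's row labels with the other's and then concatenate normally. This is a sketch assuming a reasonably recent pandas where `set_axis` returns a new object; `left`, `right` and `combined` are illustrative names.

```python
import pandas as pd

# Hypothetical frames of equal length with unrelated indexes.
left = pd.DataFrame({"n": [1, 2, 3]})
right = pd.DataFrame({"s": ["x", "y", "z"]}, index=[10, 11, 12])

# Give `right` the same row labels as `left`, then concatenate as usual.
combined = pd.concat([left, right.set_axis(left.index)], axis=1)

# Each column keeps its own dtype because no shared numpy array is built.
print(combined.dtypes)
```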

A pure numpy method would be to use np.hstack:

In[33]:
np.hstack([df1,df2])

Out[33]: 
array([[0, 0],
       [1, 1],
       [2, 2],
       [3, 3],
       [4, 4],
       [5, 5],
       [6, 6],
       [7, 7],
       [8, 8],
       [9, 9]], dtype=int64)

This can easily be converted to a DataFrame by passing it as the data argument to the DataFrame constructor:

In[34]:
pd.DataFrame(np.hstack([df1,df2]))

Out[34]: 
   0  1
0  0  0
1  1  1
2  2  2
3  3  3
4  4  4
5  5  5
6  6  6
7  7  7
8  8  8
9  9  9

As for whether the data is contiguous: a DataFrame treats its columns as separate arrays, since it is essentially a dict of Series. Because you are passing a numpy array with a single, homogeneous dtype, the constructor should need little extra allocation or copying, so this should be fast.
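Whether this is actually faster depends on the pandas version, the dtypes and the frame sizes, so it is worth measuring. Below is a small timing sketch; the frame size, the helper names `via_reset_index` and `via_hstack`, and the repeat count are arbitrary choices, and no particular result is implied.

```python
import timeit

import numpy as np
import pandas as pd

n = 100_000
df1 = pd.DataFrame({"a": np.arange(n)})
df2 = pd.DataFrame({"b": np.arange(n)}, index=np.arange(n, 2 * n))


def via_reset_index():
    # Pure pandas: drop df2's index, then concatenate by position.
    return pd.concat([df1, df2.reset_index(drop=True)], axis=1)


def via_hstack():
    # numpy route: stack the underlying values, then rebuild a DataFrame.
    return pd.DataFrame(
        np.hstack([df1, df2]),
        columns=df1.columns.append(df2.columns),
    )


repeats = 20
for func in (via_reset_index, via_hstack):
    seconds = timeit.timeit(func, number=repeats)
    print(f"{func.__name__}: {seconds / repeats:.6f} s per call")
```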

Tags: pandas, dataframe

Question by The Unfun Cat; answers by piRSquared and EdChum.
