Pandas DataFrame column concatenation

Tags: python, merge, concatenation, pandas, numpy

I have a pandas DataFrame y with about 1 million rows and 5 columns.

np.shape(y)  
(1037889, 5)

The column values are all 0 or 1. Looks something like this:

y.head()  
a, b, c, d, e  
0, 0, 1, 0, 0  
1, 0, 0, 1, 1  
0, 1, 1, 1, 1  
0, 0, 0, 0, 0

I want a DataFrame with the same 1 million rows but only 1 column.

np.shape(y)  
(1037889, )

where the column is just the 5 columns concatenated together.

New column  
0, 0, 1, 0, 0  
1, 0, 0, 1, 1  
0, 1, 1, 1, 1  
0, 0, 0, 0, 0

I keep trying different things like merge, concat, dstack, etc., but can't seem to figure this out.


asked Oct 30 '13 06:10

Neck Beard

1 Answer

If you want the new column to hold all the values of a row concatenated into one string, this is a good case for the apply() function with axis=1:

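>>> import pandas as pd
>>> # the last key must be 'e'; a repeated 'c' key would silently overwrite the first 'c'
>>> df = pd.DataFrame({'a':[0,1,0,0], 'b':[0,0,1,0], 'c':[1,0,1,0], 'd':[0,1,1,0], 'e':[0,1,1,0]})
>>> df
   a  b  c  d  e
0  0  0  1  0  0
1  1  0  0  1  1
2  0  1  1  1  1
3  0  0  0  0  0
>>> # axis=1 passes each row to the lambda, which joins its values with commas
>>> df2 = df.apply(lambda row: ','.join(map(str, row)), axis=1)
>>> df2
0    0,0,1,0,0
1    1,0,0,1,1
2    0,1,1,1,1
3    0,0,0,0,0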
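
A row-wise apply calls a Python function once per row, which may be slow on a frame with about a million rows. As a possible alternative (a sketch, not part of the original answer), the columns can be cast to strings once and joined with Series.str.cat. The frame below reuses the sample values from the question, and the name `joined` is only illustrative:

```python
import pandas as pd

# Small stand-in for the asker's 0/1 frame (sample values from the question).
df = pd.DataFrame({
    'a': [0, 1, 0, 0],
    'b': [0, 0, 1, 0],
    'c': [1, 0, 1, 0],
    'd': [0, 1, 1, 0],
    'e': [0, 1, 1, 0],
})

# Cast every column to strings once, up front.
as_text = df.astype(str)

# Join the first column with all remaining columns, comma-separated.
# 'joined' is a hypothetical name chosen for this sketch.
joined = as_text.iloc[:, 0].str.cat(as_text.iloc[:, 1:], sep=',').rename('joined')

print(joined.tolist())
```

This should give the same strings as the apply() version. Whether it is actually faster on the full data is an assumption worth checking by timing both on a slice of the real frame.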

answered Oct 04 '22 02:10

Roman Pekar
