Python: Pandas: two columns with same values, alphabetically sorted and stored

Tags:

python

pandas

Problem
"The df has two columns but sometimes filled with the same values. We need to re-save them into two new columns but in alphabetical order"

Context
We have a pandas df like this:

df = pd.DataFrame([{"name_A": "john", "name_B": "mac"}, {"name_A": "mac", "name_B": "john"}])

Like this:

name_A | name_B
john   |  mac 
mac    |  john 
trump  |  clinton

Desired Output

name_A | name_B   | restated_A  | restated_B
john   |  mac     |  john       |  mac
mac    |  john    |  john       |  mac
trump  |  clinton |  clinton    | trump

In words: for each row, we want the values of name_A and name_B sorted alphabetically and stored in restated_A and restated_B.

Tried so far
A bunch of lambdas, but I couldn't get them to work.

Specifications
Python: 3.5.2
Pandas: 0.18.1

asked Oct 22 '16 02:10

John

2 Answers

As a vectorized solution, you can use numpy.minimum() and numpy.maximum(), which compare the two columns element-wise:

import numpy as np
df['restated_A'] = np.minimum(df['name_A'], df['name_B'])
df['restated_B'] = np.maximum(df['name_A'], df['name_B'])

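Or use the apply method on the two name columns, returning a Series from the lambda so that the sorted values expand into two columns:

df[['restated_A', 'restated_B']] = df[['name_A', 'name_B']].apply(lambda r: pd.Series(sorted(r)), axis=1)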
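
For more than two columns, or to avoid the per-row Python call that apply makes, the row-wise sort can also be sketched with numpy.sort. This is a swapped-in technique, not part of the answer above. The sample frame is rebuilt from the question so the snippet runs on its own; casting to object dtype is an assumption made so the strings are compared as plain Python strings, and DataFrame.to_numpy needs pandas 0.24 or later (on older versions, .values plays the same role).

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (three rows, two name columns).
df = pd.DataFrame({
    "name_A": ["john", "mac", "trump"],
    "name_B": ["mac", "john", "clinton"],
})

# Pull the two columns out as a 2-D object array and sort within each row.
# Assumption: object dtype, so elements are compared as plain Python strings.
pairs = df[["name_A", "name_B"]].to_numpy(dtype=object)
sorted_pairs = np.sort(pairs, axis=1)

# Assign column by column, which does not depend on how a given pandas
# version handles assigning a 2-D array to several new columns at once.
df["restated_A"] = sorted_pairs[:, 0]
df["restated_B"] = sorted_pairs[:, 1]

print(df)
```

The same pattern extends to any number of columns: sort along axis=1 and assign each resulting column to its own target name.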

answered Nov 03 '22 05:11

Psidom

Convert df.values to a list of rows and sort each row. Then assign the first and second element of each sorted pair to the new columns.

>>> df = pd.DataFrame([{"name_A": "john", "name_B": "mac"}, {"name_A": "mac", "name_B": "john"}])
>>> restated_values = [sorted(pair) for pair in df.values.tolist()]
>>> restated_values
[['john', 'mac'], ['john', 'mac']]
>>> df['restated_A'] = [pair[0] for pair in restated_values]
>>> df
  name_A name_B restated_A
0   john    mac       john
1    mac   john       john
>>> df['restated_B'] = [pair[1] for pair in restated_values]
>>> df
  name_A name_B restated_A restated_B
0   john    mac       john        mac
1    mac   john       john        mac
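
One caveat with sorted(): Python compares strings case-sensitively, so uppercase letters sort before lowercase ones. If the names can be mixed case, a key function is one way to get a case-insensitive order while keeping the original spelling. The following is a minimal sketch on plain lists; the helper name sort_pairs_case_insensitive and the mixed-case sample row are assumptions made for illustration only.

```python
def sort_pairs_case_insensitive(pairs):
    """Sort each pair alphabetically, ignoring case but keeping the spelling."""
    return [sorted(pair, key=str.lower) for pair in pairs]


# Hypothetical sample pairs; the last one mixes upper and lower case.
pairs = [["john", "mac"], ["mac", "john"], ["Trump", "clinton"]]

# Default ordering puts the capitalised name first ...
default_order = sorted(pairs[2])

# ... while the key-based sort orders by the lowercased value instead.
restated_values = sort_pairs_case_insensitive(pairs)
restated_A = [pair[0] for pair in restated_values]
restated_B = [pair[1] for pair in restated_values]

print(default_order)
print(restated_values)
```

With a DataFrame, pairs would come from df[['name_A', 'name_B']].values.tolist(), and restated_A and restated_B can be assigned as new columns as in the transcript above.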

Alternatively, build the new columns in a dict, wrap it in a new pandas.DataFrame object, and join it to the original:

>>> df = pd.DataFrame([{"name_A": "john", "name_B": "mac"}, {"name_A": "mac", "name_B": "john"}])
>>> restated_values = [sorted(pair) for pair in df.values.tolist()]
>>> restated_values
[['john', 'mac'], ['john', 'mac']]
>>> new_col_rows = {'restated_A': [pair[0] for pair in restated_values], 'restated_B': [pair[1] for pair in restated_values]}
>>> new_col_rows
{'restated_A': ['john', 'john'], 'restated_B': ['mac', 'mac']}
>>> new_df = pd.DataFrame(new_col_rows)
>>> new_df
  restated_A restated_B
0       john        mac
1       john        mac
>>> df = df.join(new_df)
>>> df
  name_A name_B restated_A restated_B
0   john    mac       john        mac
1    mac   john       john        mac
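
Because join aligns rows on the index, a new frame built with the default 0..n-1 index only lines up when df uses that same index. A minimal sketch, assuming a hypothetical non-default index (10, 20, 30) chosen purely to show the issue, passes df.index to the new frame so the rows still match:

```python
import pandas as pd

# Hypothetical non-default index, chosen only to show the alignment issue.
df = pd.DataFrame(
    {"name_A": ["john", "mac", "trump"], "name_B": ["mac", "john", "clinton"]},
    index=[10, 20, 30],
)

# Sort each (name_A, name_B) pair alphabetically.
restated_values = [
    sorted(pair) for pair in df[["name_A", "name_B"]].values.tolist()
]

# Reuse df.index so that join() matches rows by label.
new_df = pd.DataFrame(
    restated_values, columns=["restated_A", "restated_B"], index=df.index
)
df = df.join(new_df)

print(df)
```

Selecting only name_A and name_B before calling .values also keeps the snippet safe to re-run after the restated columns already exist.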

answered Nov 03 '22 07:11

blacksite
