How to swap a group of column headings with their values in Pandas

Tags:

python

pandas

dataframe

I have the following data frame:

a1  | a2  | a3  | a4 
--------------------- 
Bob | Cat | Dov | Edd 
Cat | Dov | Bob | Edd
Edd | Cat | Dov | Bob

and I want to convert it to

Bob | Cat | Dov | Edd
---------------------
a1  | a2  | a3  | a4
a3  | a1  | a2  | a4
a4  | a2  | a3  | a1

Note that the number of columns equals the number of unique values, and the number and order of rows are preserved
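To reproduce the example, the frame can be built along these lines. This is only a sketch: the name df matches what the answers below use, and the default integer index is an assumption, since the question does not show one.

```python
import pandas as pd

# Example frame from the question: each row is a permutation of the same four names.
df = pd.DataFrame(
    [["Bob", "Cat", "Dov", "Edd"],
     ["Cat", "Dov", "Bob", "Edd"],
     ["Edd", "Cat", "Dov", "Bob"]],
    columns=["a1", "a2", "a3", "a4"],
)
print(df)
```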


asked Jan 10 '17 15:01

edmondawad

2 Answers

1) Required approach:

A fast implementation sorts the values in each row and reorders the column labels by the indices that np.argsort returns. It assumes every row contains each unique value exactly once, as in the question.

pd.DataFrame(df.columns.values[np.argsort(df.values)], df.index, np.unique(df.values))

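Applying np.argsort to the values gives the column labels in the order we need. Index the label array (df.columns.values) rather than df.columns itself, because recent pandas versions no longer accept a 2-D indexer on an Index:

df.columns.values[np.argsort(df.values)]
array([['a1', 'a2', 'a3', 'a4'],
       ['a3', 'a1', 'a2', 'a4'],
       ['a4', 'a2', 'a3', 'a1']], dtype=object)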
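As a self-contained sketch of the argsort approach, the following rebuilds the question's frame and checks the precondition before swapping. The variable names (values, order, swapped) are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["Bob", "Cat", "Dov", "Edd"],
     ["Cat", "Dov", "Bob", "Edd"],
     ["Edd", "Cat", "Dov", "Bob"]],
    columns=["a1", "a2", "a3", "a4"],
)

values = df.to_numpy()
unique_values = np.unique(values)

# Precondition for this shortcut: every row holds each unique value exactly once.
assert (np.sort(values, axis=1) == unique_values).all()

# argsort gives, per row, the column positions ordered by value;
# indexing the label array with it yields the labels in value order.
order = np.argsort(values, axis=1)
swapped = pd.DataFrame(df.columns.to_numpy()[order], index=df.index, columns=unique_values)
print(swapped)
```

If a row is missing a value or repeats one, the precondition check fails and one of the generalized approaches below is the safer choice.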

2) Slow generalized approach:

A more general approach, at some cost in speed, is to use apply to build, for each row, a dict that maps each value to its column name.

Then convert the resulting Series of dicts to a list and pass it to the DataFrame constructor.

pd.DataFrame(df.apply(lambda s: dict(zip(s, s.index)), axis=1).tolist())

3) Faster generalized approach:

After obtaining a list of dictionaries from df.to_dict with orient='records', swap the key and value of each pair while iterating over the records. The orient name is spelled out because recent pandas versions no longer accept the abbreviation 'r'.

pd.DataFrame([{val: key for key, val in d.items()} for d in df.to_dict('records')])

Sample test case:

df = df.assign(a5=['Foo', 'Bar', 'Baz'])

Both generalized approaches (2 and 3) produce a frame with one column per unique value; a cell is NaN where that value does not occur in the row.
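A runnable sketch of the test case, using the dict-inversion approach, might look like this. The frame is rebuilt here with the extra a5 column so the block stands alone; the names records and swapped are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"a1": ["Bob", "Cat", "Edd"],
     "a2": ["Cat", "Dov", "Cat"],
     "a3": ["Dov", "Bob", "Dov"],
     "a4": ["Edd", "Edd", "Bob"],
     "a5": ["Foo", "Bar", "Baz"]}
)

# Invert each row's {column: value} mapping into {value: column}.
records = [{val: key for key, val in row.items()} for row in df.to_dict("records")]
swapped = pd.DataFrame(records, index=df.index)

# Values missing from a row (for example "Bar" in the first row) come out as NaN.
print(swapped.sort_index(axis=1))
```

Sorting the columns is only for display; the column order of the unsorted result depends on the pandas version.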

@piRSquared EDIT ¹

generalized solution

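def nic(df):
    v = df.values
    n, m = v.shape
    # Flatten first so that inv is 1-D regardless of the NumPy version
    u, inv = np.unique(v.ravel(), return_inverse=True)
    c = df.columns.values
    # With string column labels c.dtype is object, so unfilled cells stay None
    r = np.empty((n, len(u)), dtype=c.dtype)
    # Address rows by position, not by index label, so any index works
    r[np.arange(n).repeat(m), inv] = np.tile(c, n)
    return pd.DataFrame(r, df.index, u)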

¹ I would like to thank user @piRSquared for coming up with a really fast and generalized NumPy-based alternative solution.
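To see the NumPy route on a frame whose index is not 0..n-1, here is a hedged restatement of the same idea. The function name swap_labels and the string index x, y, z are made up for illustration; rows are addressed by position and the values are flattened before np.unique.

```python
import numpy as np
import pandas as pd

def swap_labels(df):
    """Sketch of the np.unique idea: one output column per unique value."""
    v = df.to_numpy()
    n, m = v.shape
    # u: sorted unique values; inv: position of each flattened cell within u
    u, inv = np.unique(v.ravel(), return_inverse=True)
    out = np.empty((n, len(u)), dtype=object)  # unfilled cells stay None
    out[np.arange(n).repeat(m), inv] = np.tile(df.columns.to_numpy(), n)
    return pd.DataFrame(out, index=df.index, columns=u)

df = pd.DataFrame(
    [["Bob", "Cat", "Dov", "Edd", "Foo"],
     ["Cat", "Dov", "Bob", "Edd", "Bar"],
     ["Edd", "Cat", "Dov", "Bob", "Baz"]],
    columns=["a1", "a2", "a3", "a4", "a5"],
    index=["x", "y", "z"],
)
result = swap_labels(df)
print(result)
```

Missing cells are None here rather than NaN, because the output array has object dtype and is never filled at those positions.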


answered Sep 18 '22 23:09

Nickil Maveli

You can reshape the frame with stack and unstack, swapping the values and the column labels in between:

df_swap = (df.stack()                     # reshape the data frame to long format
             .reset_index(level = 1)      # set the index(column headers) as a new column
             .set_index(0, append=True)   # set the values as index
             .unstack(level=1))           # reshape the data frame to wide format

df_swap.columns = df_swap.columns.get_level_values(1)   # drop level 0 in the column index
df_swap
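The same long-then-wide reshape can also be written with pivot, which some readers may find easier to follow. This is a sketch of an equivalent route, not the answer's own code; the names row, label, value, long, and wide are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["Bob", "Cat", "Dov", "Edd"],
     ["Cat", "Dov", "Bob", "Edd"],
     ["Edd", "Cat", "Dov", "Bob"]],
    columns=["a1", "a2", "a3", "a4"],
)

# Long format: one line per cell, holding its row, column label, and value.
long = df.stack().rename("value").rename_axis(["row", "label"]).reset_index()

# Wide format again, but with the values as columns and the labels as cells.
wide = long.pivot(index="row", columns="value", values="label")
print(wide)
```

Like unstack, pivot raises an error if a value occurs twice in the same row, since the target cell would be ambiguous.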

answered Sep 21 '22 23:09

Psidom
