Reshape a pandas dataframe

Tags:

python

pandas

dataframe

reshape

lreshape

Suppose a dataframe like this one:

import pandas as pd

df = pd.DataFrame([[1,2,3,4],[5,6,7,8],[9,10,11,12]], columns = ['A', 'B', 'A1', 'B1'])

[image: the dataframe defined above, with columns A, B, A1, B1]

I would like to have a dataframe which looks like:

[image: the desired dataframe]

What does not work:

new_rows = int(df.shape[1]/2) * df.shape[0]
new_cols = 2
df.values.reshape(new_rows, new_cols, order='F')

Of course I could loop over the data and build a new list of lists, but there must be a better way. Any ideas?


asked Mar 21 '17 13:03

Moritz

1 Answer

You can use lreshape to stack the column groups, and numpy.repeat to build the id column:

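import numpy as np

# collect the source columns that feed each output column
a = [col for col in df.columns if col.startswith('A')]
b = [col for col in df.columns if col.startswith('B')]
df1 = pd.lreshape(df, {'A': a, 'B': b})

# number each block of len(df.index) rows, starting at 1
df1['id'] = np.repeat(np.arange(len(df.columns) // 2), len(df.index)) + 1
print(df1)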
    A   B  id
0   1   2   1
1   5   6   1
2   9  10   1
3   3   4   2
4   7   8   2
5  11  12   2
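
As a side note, the reshape attempted in the question can be made to work with plain NumPy. With order='F' the values are read column by column, so A ends up paired with A1 instead of with B. A minimal sketch, assuming the columns always come in adjacent (A, B) pairs as in the example; the names n_rows, n_groups, stacked and df_np are only for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
                  columns=['A', 'B', 'A1', 'B1'])

n_rows = len(df)
n_groups = df.shape[1] // 2  # number of (A, B) column pairs (assumed adjacent)

# (rows, groups, 2) -> (groups, rows, 2) -> (groups * rows, 2)
stacked = (df.to_numpy()
             .reshape(n_rows, n_groups, 2)
             .swapaxes(0, 1)
             .reshape(-1, 2))

df_np = pd.DataFrame(stacked, columns=['A', 'B'])
# same id column as in the lreshape version: 1 for the first pair, 2 for the second
df_np['id'] = np.repeat(np.arange(1, n_groups + 1), n_rows)
print(df_np)
```

This should print the same six rows as df1 above. It relies purely on column position, so it is only safe when the column order is known in advance.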

EDIT:

lreshape is currently undocumented, and it is possible that it will be removed (along with pd.wide_to_long).

A possible solution is to merge all 3 functions into one, maybe melt, but this is not implemented yet. It may come in some new version of pandas, and then my answer will be updated.
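
For comparison, here is a rough sketch of the same reshape with pd.wide_to_long, assuming a pandas version where that function is available. With its default suffix pattern it needs a numeric suffix on every column, so the first pair is renamed to A0 and B0 first; the helper column name row and the variable names wide and df_long are just illustrative choices:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
                  columns=['A', 'B', 'A1', 'B1'])

# Give the unsuffixed pair an explicit suffix and expose the row number
# as a regular column so it can serve as the identifier (i).
wide = (df.rename(columns={'A': 'A0', 'B': 'B0'})
          .rename_axis('row')
          .reset_index())

df_long = (pd.wide_to_long(wide, stubnames=['A', 'B'], i='row', j='id')
             .reset_index()
             .sort_values(['id', 'row'])   # make the row order explicit
             .reset_index(drop=True))

df_long['id'] += 1                  # start numbering at 1, as in df1
df_long = df_long[['A', 'B', 'id']]
print(df_long)
```

The explicit sort is there because I would not rely on the row order that wide_to_long returns.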


answered Oct 05 '22 23:10

jezrael
