Pandas - reindex so I can keep values

Tags:

python

pandas

Long story short

I have a nested dictionary that I turn into a DataFrame:

import pandas
pdf = pandas.DataFrame(nested_dict)

 95     96     97     98     99    100   101   102   103    104    105  \
A  70019    102   4243   3083   3540  6311  4851  5938  4140   4659   3100   
C      0    185    427    433   1190   910  3898  3869  2861   2149   3065   
D      8      9  23463   1237   2574  4174  3640  4747  3557   4582   5934   
E    141     89   5034   1576   2303  3416  2377  1252  1204   1703    718   
F      7     12   1937   2246   1687  1154  1317  3473  1881   2221   3060   
G    343   1550  13497  10659  12343  8213  9251  7341  6354   9058   9022   
H      1   1978   1829   1394   1945  1003  1382  1489  4182    932    556   
I      5    772   1361   3914   3255  3242  2808  3765  3284   2127   3120   
K      3  10353    540   2364   1196   882  3439  2107   803    743    621   
L      6     14   1599  11759   4571  4821  3450  5071  4364   1891   3677   
M      1      6    158    211    524  2738   686   443   612    509   1721   
N      6    186    299   2971    791  1440  2028  1163  1689   4296   1535   
P     54     31    726   6208   7160  5494  6184  4282  3587   3727   3821   
Q     10     87   1228   2233   1016  1801  1768  1693  3414    515    563   
R      7  53939   3030   8904   6712  6134  5127  3223  4764   3768   6429   
S     76   5213   3676   7480   9831  7666  5410  8185  7508  11237   8298   
T   4369   1253   3087   2487   6559  4572  6863  3184  7352   6068   4756   
V    732      5   7595   4331   5216  5444  5187  6013  4245   4545   4761   
W      0      6    103   1225    598   888   601   713  1298   1323    908   
Y     12      9   1968   1085   2787  5489  5529  7840  8691   9745  10136
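For anyone reproducing this, here is a minimal sketch of how a nested dictionary maps onto that layout. The dictionary below is a hypothetical stand-in for nested_dict (the real one is not shown), using a few values from the table above and assuming the outer keys are residue numbers and the inner keys are residue letters.

```python
import pandas

# Hypothetical stand-in for nested_dict: outer keys are residue numbers,
# inner keys are residue letters (values copied from the table above).
nested_dict = {
    95: {"A": 70019, "C": 0, "D": 8},
    96: {"A": 102, "C": 185, "D": 9},
}

pdf = pandas.DataFrame(nested_dict)

# Outer keys become the columns, inner keys become the row index.
print(pdf.columns.tolist())
print(pdf.index.tolist())
```

This is why the residue letters land in the index rather than in a regular column.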

Eventually I want to melt down this data frame to look like the following.

residue residue_num count
A       95          70019
A       96          102
A       97          4243
....

The residue letters end up as the index. I don't know how to replace them with a default integer index (0, 1, 2, 3, ...) and move "A C D E F.." into a named column.

EDIT: Answered it myself, as suggested.


asked Feb 11 '14 00:02

jwillis0720

2 Answers

Answered from here and here

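import pandas

pdf = pandas.DataFrame(the_matrix)
# Move the residue letters out of the index into a column named 'index'
pdf = pdf.reset_index()
# Give that column a meaningful name
pdf.rename(columns={'index': 'aa'}, inplace=True)
# Produce one row per (aa, position) pair
pandas.melt(pdf, id_vars='aa', var_name="position", value_name="counts")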

  aa position  counts
0  A       95   70019
1  C       95       0
2  D       95       8
3  E       95     141
4  F       95       7
5  G       95     343
6  H       95       1
7  I       95       5
8  K       95       3
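A self-contained variant of the same idea, as a sketch rather than the only way to do it. It names the index with rename_axis before resetting it, uses the column names from the question (residue, residue_num, count), and sorts so the rows come out grouped by residue as in the desired output. The small the_matrix here is an assumed sample built from a few values in the question.

```python
import pandas

# Assumed sample data: outer keys are positions, inner keys are residues.
the_matrix = {
    95: {"A": 70019, "C": 0, "D": 8},
    96: {"A": 102, "C": 185, "D": 9},
}
pdf = pandas.DataFrame(the_matrix)

long_df = (
    pdf.rename_axis("residue")   # give the index a name
       .reset_index()            # turn it into a regular column
       .melt(id_vars="residue", var_name="residue_num", value_name="count")
       .sort_values(["residue", "residue_num"])  # group rows by residue
       .reset_index(drop=True)   # fresh 0, 1, 2, ... index
)
print(long_df)
```

Without the sort_values step, melt returns the rows column by column (all residues for 95, then all for 96), which is the order shown in the output above.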


answered Sep 17 '22 12:09

jwillis0720

Your pdf looks like a pivot table. Let's assume we have a dataframe with three columns. We can pivot it with a single function like this:

pivoted = df.pivot(index='col1',columns='col2',values='col3')

Unpivoting it back without losing the index requires a reset_index dance:

pivoted.reset_index().melt(id_vars=pivoted.index.name)

To get the exact original df:

pivoted.reset_index().melt(id_vars=pivoted.index.name, var_name='col2', value_name='col3')
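A small round trip to illustrate, using made-up sample data with the placeholder column names col1, col2 and col3 from above. Treat "exact" loosely: in this sketch the values come back intact, but melt returns the rows column by column, so the row order can differ from the original, and the dtype of the restored col2 may differ as well.

```python
import pandas

# Made-up sample data in long format.
df = pandas.DataFrame({
    "col1": ["A", "A", "C", "C"],
    "col2": [95, 96, 95, 96],
    "col3": [70019, 102, 0, 185],
})

# Long -> wide: col1 becomes the index, col2 values become the columns.
pivoted = df.pivot(index="col1", columns="col2", values="col3")

# Wide -> long again, keeping the index as a column.
restored = pivoted.reset_index().melt(
    id_vars=pivoted.index.name, var_name="col2", value_name="col3"
)

# Sort to line the rows up with the original before comparing.
restored = restored.sort_values(["col1", "col2"]).reset_index(drop=True)
print(restored)
```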

PS. To my surprise, melt did not accept a kwarg like keep_index=True. The enhancement request is tracked at https://github.com/pandas-dev/pandas/issues/17440
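On newer pandas (1.1.0 and later, as far as I know) melt accepts ignore_index=False, which keeps the original index on the melted rows instead of dropping it. A minimal sketch, assuming such a version and using sample values from the question:

```python
import pandas

# Sample data: residues in the index, positions as columns.
pdf = pandas.DataFrame({95: {"A": 70019, "C": 0}, 96: {"A": 102, "C": 185}})
pdf.index.name = "aa"

# ignore_index=False repeats the original index once per melted column.
melted = pdf.melt(var_name="position", value_name="counts", ignore_index=False)

# The index is still an index here; reset_index turns it into the 'aa' column.
long_df = melted.reset_index()
print(long_df)
```

On older versions this keyword raises a TypeError, and the reset_index approach above is the way to go.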

answered Sep 21 '22 12:09

tozCSS
