I have a 719 MB CSV file that looks like:

    from, to, dep, freq, arr, code, mode   (header row)
    RGBOXFD,RGBPADTON,127,0,27,99999,2
    RGBOXFD,RGBPADTON,127,0,33,99999,2
    RGBOXFD,RGBRDLEY,127,0,1425,99999,2
    RGBOXFD,RGBCHOLSEY,127,0,52,99999,2
    RGBOXFD,RGBMDNHEAD,127,0,91,99999,2
    RGBDIDCOTP,RGBPADTON,127,0,46,99999,2
    RGBDIDCOTP,RGBPADTON,127,0,3,99999,2
    RGBDIDCOTP,RGBCHOLSEY,127,0,61,99999,2
    RGBDIDCOTP,RGBRDLEY,127,0,1430,99999,2
    RGBDIDCOTP,RGBPADTON,127,0,115,99999,2
    and so on...

I want to load it into a pandas DataFrame. Now I know there is a load from csv method:

    r = pd.DataFrame.from_csv('test_data2.csv')

But I specifically want to load it as a 'MultiIndex' DataFrame where from and to are the indexes:
So ending up with:

                       dep, freq, arr, code, mode
    RGBOXFD RGBPADTON  127     0   27  99999    2
            RGBRDLEY   127     0   33  99999    2
            RGBCHOLSEY 127     0 1425  99999    2
            RGBMDNHEAD 127     0 1525  99999    2
    etc.

I'm not sure how to do that?
pandas MultiIndex to Columns: Use the pandas DataFrame.reset_index() function to convert MultiIndex (multi-level index) levels to columns. The default, drop=False, keeps the index values as columns and gives the DataFrame a new integer index starting from zero.
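For what it's worth, here is a small sketch of that reset_index() behaviour. The three rows are borrowed from the question's sample data, and the variable names (flat, dropped) are just made up for illustration:

```python
import pandas as pd

# Tiny frame with a two-level index, using values from the question's sample.
df = pd.DataFrame(
    {"arr": [27, 1425, 46]},
    index=pd.MultiIndex.from_tuples(
        [
            ("RGBOXFD", "RGBPADTON"),
            ("RGBOXFD", "RGBRDLEY"),
            ("RGBDIDCOTP", "RGBPADTON"),
        ],
        names=["from", "to"],
    ),
)

# Default (drop=False): both index levels come back as ordinary columns
# and the frame gets a fresh integer index starting at zero.
flat = df.reset_index()

# drop=True: the index levels are discarded instead of being kept as columns.
dropped = df.reset_index(drop=True)

print(flat)
print(dropped)
```

So reset_index() is roughly the reverse of loading the file with from and to as the index.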
The MultiIndex object is the hierarchical analogue of the standard Index object which typically stores the axis labels in pandas objects. You can think of MultiIndex as an array of tuples where each tuple is unique. A MultiIndex can be created from a list of arrays (using MultiIndex.
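To make the "array of tuples" picture concrete, here is a minimal sketch that builds a two-level index from two plain lists and then reads it back as tuples. The list names (origins, destinations) are hypothetical, and the station codes are taken from the question:

```python
import pandas as pd

# Hypothetical input lists; the codes come from the question's sample rows.
origins = ["RGBOXFD", "RGBOXFD", "RGBDIDCOTP"]
destinations = ["RGBPADTON", "RGBRDLEY", "RGBPADTON"]

# One array per level, in level order.
index = pd.MultiIndex.from_arrays([origins, destinations], names=["from", "to"])

# Iterating over a MultiIndex yields one tuple per row.
pairs = list(index)

# The same index can be rebuilt from those tuples.
rebuilt = pd.MultiIndex.from_tuples(pairs, names=["from", "to"])

print(pairs)
print(rebuilt.equals(index))
```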
You could use pd.read_csv:

    >>> df = pd.read_csv("test_data2.csv", index_col=[0,1], skipinitialspace=True)
    >>> df
                           dep  freq   arr   code  mode
    from       to
    RGBOXFD    RGBPADTON   127     0    27  99999     2
               RGBPADTON   127     0    33  99999     2
               RGBRDLEY    127     0  1425  99999     2
               RGBCHOLSEY  127     0    52  99999     2
               RGBMDNHEAD  127     0    91  99999     2
    RGBDIDCOTP RGBPADTON   127     0    46  99999     2
               RGBPADTON   127     0     3  99999     2
               RGBCHOLSEY  127     0    61  99999     2
               RGBRDLEY    127     0  1430  99999     2
               RGBPADTON   127     0   115  99999     2

where I've used skipinitialspace=True to get rid of those annoying spaces in the header row.
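If you want to try this without the real file, here is a self-contained sketch of the same read_csv call. It reads a few of the sample rows from an in-memory string (CSV_TEXT is a stand-in I made up for test_data2.csv) and then looks up one origin:

```python
import io

import pandas as pd

# Stand-in for test_data2.csv: header with spaces after the commas,
# followed by a few rows from the question.
CSV_TEXT = """from, to, dep, freq, arr, code, mode
RGBOXFD,RGBPADTON,127,0,27,99999,2
RGBOXFD,RGBRDLEY,127,0,1425,99999,2
RGBDIDCOTP,RGBPADTON,127,0,46,99999,2
RGBDIDCOTP,RGBCHOLSEY,127,0,61,99999,2
"""

# index_col=[0, 1] turns the first two columns into the index levels;
# skipinitialspace=True strips the space after each comma in the header.
df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=[0, 1], skipinitialspace=True)

# Selecting on the outer level returns the matching rows, indexed by "to".
from_oxford = df.loc["RGBOXFD"]

print(df.index.names)
print(from_oxford)
```

Without skipinitialspace=True the second level and the columns would be named with a leading space (' to', ' dep' and so on), which makes later lookups by name awkward.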