
pandas DataFrame pivoting issue

Tags: python, pandas

I've got some radar data that's in a bit of an odd format, and I can't figure out how to correctly pivot it using the pandas library.

My data:

    speed   time
loc     
A    63  0000
B    61  0000
C    63  0000
D    65  0000
A    73  0005
B    71  0005
C    73  0005
D    75  0005

I'd like to turn that into a DataFrame that looks like this:

    0000    0005
loc     
A    63     73
B    61     71
C    63     73
D    65     75

I've done a lot of fiddling around but can't seem to get the syntax correct. Can anyone please help?

Thanks!

asked Nov 04 '12 04:11 by Travis Leleu


2 Answers

You can use the pivot method here:

In [71]: df
Out[71]: 
     speed  time
loc             
A       63     0
B       61     0
C       63     0
D       65     0
A       73     5
B       71     5
C       73     5
D       75     5

In [72]: df.reset_index().pivot(index='loc', columns='time', values='speed')
Out[72]: 
time   0   5
loc         
A     63  73
B     61  71
C     63  73
D     65  75
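
For a self-contained version of the same idea, here is a minimal sketch. The frame is rebuilt by hand from the question's data (how the data actually gets loaded is an assumption on my part), and time is kept as strings so the column labels stay 0000 and 0005 instead of collapsing to 0 and 5. The arguments to pivot are passed by keyword, which newer pandas releases expect.

```python
import pandas as pd

# Rebuild the question's data by hand (assumed loading step).
# time is stored as strings so the leading zeros survive.
df = pd.DataFrame(
    {
        "speed": [63, 61, 63, 65, 73, 71, 73, 75],
        "time": ["0000"] * 4 + ["0005"] * 4,
    },
    index=pd.Index(list("ABCD") * 2, name="loc"),
)

# Move loc out of the index so pivot can use it as a regular column,
# then spread the time values across the columns.
wide = df.reset_index().pivot(index="loc", columns="time", values="speed")
print(wide)
```

If the data comes from a file instead, reading the time column as text (rather than letting it be parsed as an integer) should give the same zero-padded labels.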

answered Oct 21 '22 18:10 by Chang She


Assuming your data source is a CSV file:

import pandas as pd
df = pd.read_csv("radar_data.csv")

df  # shows what is in df

   loc  speed  time
0    A     63     0
1    B     61     0
2    C     63     0
3    D     65     0
4    A     73     5
5    B     73     5
6    C     75     5
7    D     75     5
8    A     67     0
9    B     68     0
10   C     68     0
11   D     70     0

Note that I have not set loc as the index yet, so the frame uses the default auto-incrementing integer index.
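
One thing that may be worth checking before reshaping a frame like this is whether every (loc, time) pair is unique, because a reshape needs exactly one value per cell. In the twelve rows as printed above, each loc appears twice at time 0. The sketch below types those rows in by hand, tests for repeats with duplicated, and falls back to pivot_table, which aggregates repeated pairs. Taking the mean is an arbitrary choice for illustration, not something the data calls for.

```python
import pandas as pd

# The twelve rows as printed above, typed in by hand.
df = pd.DataFrame(
    {
        "loc": list("ABCD") * 3,
        "speed": [63, 61, 63, 65, 73, 73, 75, 75, 67, 68, 68, 70],
        "time": [0, 0, 0, 0, 5, 5, 5, 5, 0, 0, 0, 0],
    }
)

# True when some (loc, time) pair occurs more than once.
has_repeats = df.duplicated(subset=["loc", "time"]).any()
print(has_repeats)

# pivot_table aggregates repeated pairs; mean is an illustrative choice.
wide = df.pivot_table(index="loc", columns="time", values="speed", aggfunc="mean")
print(wide)
```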

panel = df.set_index(['loc', 'time']).sortlevel(0).to_panel()

However, if your data frame already uses loc as the index, we need to append the time column to it so that we have a loc-time hierarchical index. This can be done with the append option of the set_index method, like this:

panel = df.set_index(['time'], append=True).sortlevel(0).to_panel()
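
To see what append=True does on its own, here is a small sketch that builds a frame already indexed by loc (typed in by hand from the question's data) and then adds time as a second index level. It uses sort_index(level=0) where the line above uses sortlevel(0), on the assumption that you are running a newer pandas release in which sortlevel is no longer available.

```python
import pandas as pd

# A frame that already uses loc as its index, as in the question.
df = pd.DataFrame(
    {"speed": [63, 61, 63, 65, 73, 71, 73, 75], "time": [0, 0, 0, 0, 5, 5, 5, 5]},
    index=pd.Index(list("ABCD") * 2, name="loc"),
)

# append=True keeps the existing loc level and adds time as a second level.
indexed = df.set_index("time", append=True).sort_index(level=0)
print(indexed.index.names)
print(indexed)
```

Without append=True, set_index would replace loc with time and the location labels would be lost.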

In either case, we should arrive at this result:

panel  # shows what panel is

<class 'pandas.core.panel.Panel'>
Dimensions: 1 (items) x 4 (major) x 2 (minor)
Items: speed to speed
Major axis: A to D
Minor axis: 0 to 5

panel["speed"]  # <--- This is what you are looking for.


time   0   5
loc         
A     63  67
B     73  61
C     68  73
D     63  68
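
If Panel and to_panel are not available in your pandas version (as far as I know they have been removed from recent releases), a similar loc-by-time table can be sketched with the same hierarchical index followed by unstack. This is an alternative to the Panel route above, not the same method, and the frame below is typed in by hand from the question's data.

```python
import pandas as pd

# The question's data with loc as an ordinary column.
df = pd.DataFrame(
    {
        "loc": list("ABCD") * 2,
        "speed": [63, 61, 63, 65, 73, 71, 73, 75],
        "time": [0, 0, 0, 0, 5, 5, 5, 5],
    }
)

# Build the loc-time hierarchical index, select speed,
# then move the time level from the rows into the columns.
wide = df.set_index(["loc", "time"])["speed"].unstack("time")
print(wide)
```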

Hope this helps.

answered Oct 21 '22 16:10 by Calvin Cheng