
Denormalize a CSV file with Python / pandas DataFrame

Tags:

python

pandas

I have a csv file structured as:

Location Parameter

A            10
A            20
B            14
B            16
C            15
C             9
C             6

I can easily get this into a dataframe with read_csv.

I would like to use Python / pandas to reshape the dataframe so that each location (A, B, C) becomes its own column, populated with the corresponding parameter values, e.g.

A    B     C
10   14   15
20   16    9
NA   NA    6

with the ultimate goal of doing a boxplot on the dataframe.

Thanks in advance.

rbmales asked Apr 28 '13 23:04


1 Answer

I couldn't hit on the right pivoting/stacking approach (someone else will probably come up with the right way), so I fell back on groupby:

>>> df
  Location  Parameter
0        A         10
1        A         20
2        B         14
3        B         16
4        C         15
5        C          9
6        C          6
>>> cd = {k: v.reset_index(drop=True) for k,v in df.groupby("Location")["Parameter"]}
>>> pd.DataFrame(cd)
    A   B   C
0  10  14  15
1  20  16   9
2 NaN NaN   6
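
A pivot-style route to the same wide table can be sketched as follows. It numbers the rows within each Location using groupby(...).cumcount() and then pivots on that counter. The helper column name `row` and the variable `wide` are illustrative choices, and the sample frame is rebuilt by hand from the data in the question rather than read from the CSV.

```python
import pandas as pd

# Sample data copied from the question (built inline instead of read_csv).
df = pd.DataFrame({
    "Location": ["A", "A", "B", "B", "C", "C", "C"],
    "Parameter": [10, 20, 14, 16, 15, 9, 6],
})

# Position of each row inside its Location group: 0, 1, 2, ...
# "row" is a hypothetical helper name used only for the pivot.
row = df.groupby("Location").cumcount()

# One column per Location, one line per within-group position.
# Shorter groups are padded with NaN.
wide = df.assign(row=row).pivot(index="row", columns="Location", values="Parameter")

print(wide)
```

If the boxplot is the only goal, the reshape may not be needed at all: df.boxplot(column="Parameter", by="Location") draws one box per location directly from the long-form frame (this requires matplotlib).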
DSM answered Oct 16 '22 00:10