
Mapping data to ground truth list

Tags:

pandas

I have ground truth data in the following Python list:

ground_truth = [('A', 16), ('B', 18), ('C', 36), ('A', 59), ('C', 77)]

So any value in the range:

0-16 maps to A,
17-18 maps to B,
19-36 maps to C,
37-59 maps to A,
60-77 maps to C,
and so on.

I am trying to map a time series input, for example the numbers

[9, 15, 29, 32, 49, 56, 69] to their respective classes:
[A, A, C, C, A, A, C]

Assuming my input is a pandas Series like:

series = pd.Series([9, 15, 29, 32, 49, 56, 69])

How do I get the series [A, A, C, C, A, A, C]?

bhaskarc asked Mar 04 '23 15:03



1 Answer

Here's my approach:

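import pandas as pd

gt = pd.DataFrame(ground_truth)

# bin edges for pd.cut: 0 followed by each upper bound
bins = [0] + list(gt[1])

# integer bin index for each value (intervals are right-closed, e.g. (0, 16])
cats = pd.cut(pd.Series([9, 15, 29, 32, 49, 56, 69]), bins=bins, labels=False)

# look up the label stored in column 0 for each bin index
gt.loc[cats, 0]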

gives

0    A
0    A
2    C
2    C
3    A
3    A
4    C
Name: 0, dtype: object
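
One edge case worth noting: pd.cut builds right-closed intervals by default, so the first bin above is (0, 16] and a value of exactly 0 falls outside it. If 0 should map to A, as the ranges in the question suggest, passing include_lowest=True is one way to handle it. A minimal sketch (edge_values is a made-up series used only for illustration):

```python
import pandas as pd

bins = [0, 16, 18, 36, 59, 77]

# Hypothetical values sitting on or next to the bin edges
edge_values = pd.Series([0, 16, 17, 18, 19])

# Default behaviour: 0 is not inside (0, 16], so its bin index is NaN
default_codes = pd.cut(edge_values, bins=bins, labels=False)

# include_lowest=True makes the first interval include its left edge
inclusive_codes = pd.cut(edge_values, bins=bins, labels=False, include_lowest=True)

print(default_codes.tolist())
print(inclusive_codes.tolist())
```

With the default call, the NaN would also break the label lookup that follows, so the flag only matters if 0 can actually occur in the input.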

Or, without creating a new DataFrame:

import numpy as np

labels = np.array([x for x, _ in ground_truth])
bins = [0] + [y for _, y in ground_truth]

cats = pd.cut(pd.Series([9, 15, 29, 32, 49, 56, 69]), bins=bins, labels=False)

# index the label array by bin number
labels[cats]

which gives:

array(['A', 'A', 'C', 'C', 'A', 'A', 'C'], dtype='<U1')
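
As an alternative sketch, not part of the approach above, the same lookup can probably be done with np.searchsorted on the upper bounds alone, which avoids building the bin list. It assumes the bounds are sorted in ascending order and that no value exceeds the last bound; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

ground_truth = [("A", 16), ("B", 18), ("C", 36), ("A", 59), ("C", 77)]

labels = np.array([label for label, _ in ground_truth])
upper_bounds = np.array([bound for _, bound in ground_truth])

values = pd.Series([9, 15, 29, 32, 49, 56, 69])

# side="left" returns the first position whose upper bound is >= the value,
# so 16 lands in the first segment and 17 in the second.
positions = np.searchsorted(upper_bounds, values.to_numpy(), side="left")

# Assumption: every value is <= the last upper bound (77 here). A larger
# value would give position len(labels) and raise IndexError on lookup.
result = pd.Series(labels[positions], index=values.index)

print(result.tolist())
```

Unlike the pd.cut version, this keeps the input's index and treats 0 as part of the first segment without any extra flag.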
Quang Hoang answered Mar 19 '23 16:03
