
python - how to map array of integer to another integer

I am doing a CNN project, and I need to preprocess the labels first.

Each image file is a spectrogram, and each one has a label of 250 values stored in an array. The label gives the sequence of pitch values present in that spectrogram. For example, one label file looks like this:

[ 0  0  0  0  0  0  0  0  0  0  0 57 57 57 57 57 57 57 57 58 58 57 57 57
  0  0  0  0  0 56 57 57 56 56 56 56 56 56 56 56 56 57 57 58 59 61 62 62
 63 64 64 63 64 64 64 64  0  0  0  0 64 64 64 64 63 63 63 63 63 64 63 64
 64 64 65 66 66 66 66 66 65 65 66 66 66 66 65  0  0  0  0 65 65 65 66 66
 66 66 66 65 65 65  0  0  0  0 64 64 64 64 64 64 64 64 64 64 64 64 64 64
 63  0  0  0  0  0  0  0  0  0  0  0  0  0 60 60 60 60 61 61 62 62 62 62
 62 62 62 61  0  0  0 62 62 62 62 62 62 62 62 62 62 62 62 60  0 62 61 60
 61 61 61 61 61 61 61 61 61 60  0  0  0  0  0 61 60 60 60 61 61 61 61 61
 61  0  0  0  0  0  0 59 59 59 59 58 58 59 59 59 59  0  0  0  0  0  0  0
 59 59 58 58 59 59 59 59 59 59  0  0  0  0 58 57 57 57 57 57 57 57 57 57
 57 57 58 57  0  0  0  0  0  0]

After going through all the label files, I found 51 unique values across those labels. I stored these values in an array.

y_train = # y_test also contains these values
[ 0 30 31 32 33 34 35 36 37 38 
 39 40 41 42 43 44 45 46 47 48 
 49 50 51 52 53 54 55 56 57 58 
 59 60 61 62 63 64 65 66 67 68 
 69 70 71 72 73 74 76 77 81 83 
 85]

I need to run the to_categorical method with the correct number of classes (in my case, 51) before I can train the CNN. See the to_categorical docs.

I have done that, but the resulting number of classes is 86, not 51. I assume this is because my labels are already integers, so the method assumes 86 classes covering every integer from 0 to 85, while in reality I have only 51 unique values, which lie between 0 and 85 but with gaps (see y_train).
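The assumption above can be checked with a small NumPy sketch. It relies on the documented behaviour that to_categorical, when num_classes is not passed, infers it as the largest label plus one. The names unique_values, inferred_num_classes and actual_num_classes are illustrative only and do not come from the original code.

```python
import numpy as np

# The 51 distinct label values listed above: 0, 30..74, 76, 77, 81, 83, 85.
unique_values = np.array([0, *range(30, 75), 76, 77, 81, 83, 85])

# Without an explicit num_classes, the class count is inferred as max(label) + 1.
inferred_num_classes = int(unique_values.max()) + 1

# The number of classes that actually occur in the data.
actual_num_classes = len(unique_values)

print(inferred_num_classes)  # 86
print(actual_num_classes)    # 51
```

So the gap between 86 and 51 comes from the unused integers between 0 and 85, not from the data itself.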

# convert to array first. y_train and y_test are labels for an image X_train and X_test.
y_train = np.array(y_train) # labels for X_train images
y_test = np.array(y_test) # labels for X_test images

# do to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# shape result
y_train:  (638, 250, 86) # 638 = total data, 250 = 1 data length, 86 = num_class
y_test:  (161, 250, 86) # 161 = total data, 250 = 1 data length, 86 = num_class

Then I came up with the idea of mapping each unique value to a new integer, so that to_categorical sees only 51 classes, for example:

0 -> 0
30 -> 1
31 -> 2
32 -> 3
...
85 -> 50

Is there a way in Python to build that kind of mapping from the y_train array? And if there is, can I map the results back to the original values when the computation is finished? Thank you.

Dionisius Pratama asked Feb 04 '21 04:02


1 Answer

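Yes, you can build a dictionary that maps each new class index to its original value, as below. Iterate over the unique values rather than the full label array, otherwise you get one entry per label row instead of one per class.

import numpy as np

map_dict = {}

# np.unique returns the sorted distinct values, so 0 -> 0, 1 -> 30, ..., 50 -> 85
for i, value in enumerate(np.unique(y_train)):
    map_dict[i] = value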

Your new categories are the keys of map_dict, which you can get as below

list(map_dict.keys())

Later on, whenever you need to look up the original value for a category k, just index into map_dict:

map_dict[k]

To print both the keys and the values in the dictionary, do the following:

for key, value in map_dict.items():
    print(key, ' --->', value)
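
The dictionary above goes from new category to original value. For the other direction (original value to new category), and for whole arrays at once, a vectorized round trip might look like the sketch below. It is only a sketch under stated assumptions: the tiny y_train and y_test arrays are made-up stand-ins for the real (638, 250) and (161, 250) label arrays, the names classes, y_train_idx and decoded are illustrative, and the one-hot step uses plain NumPy in place of to_categorical so that the example runs without Keras.

```python
import numpy as np

# Toy stand-ins for the real label arrays (hypothetical values).
y_train = np.array([[0, 57, 57, 85], [30, 0, 58, 57]])
y_test = np.array([[57, 0, 30, 58]])

# Sorted distinct original values over both sets; a value's position in
# this array is its new class index.
classes = np.unique(np.concatenate([y_train.ravel(), y_test.ravel()]))

# Encode: because classes is sorted and contains every label value,
# searchsorted returns each label's index in classes (same shape as input).
y_train_idx = np.searchsorted(classes, y_train)
y_test_idx = np.searchsorted(classes, y_test)

# One-hot encode with NumPy. With Keras, passing y_train_idx and
# num_classes=num_classes to to_categorical should give the same shape.
num_classes = len(classes)
y_train_onehot = np.eye(num_classes)[y_train_idx]

# Decode: argmax recovers the class index, and indexing classes recovers
# the original pitch value.
decoded = classes[y_train_onehot.argmax(axis=-1)]

print(y_train_onehot.shape)              # (2, 4, 5)
print(np.array_equal(decoded, y_train))  # True
```

Building classes from the training and test labels together keeps both sets on the same index mapping, and the same classes array can later turn the model's predicted class indices back into the original values.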

coderina answered Oct 12 '22 23:10