I am doing logistic regression on the iris dataset from sklearn. I know the math and am trying to implement it myself. At the final step I get a prediction vector, which holds the probability of each data point belonging to class 1 or class 2 (binary classification).
Now I want to turn this prediction vector into a target vector. Say that if the probability is greater than 50%, the corresponding data point belongs to class 1, otherwise class 2. I use 0 to represent class 1 and 1 to represent class 2.
I know there is a for-loop version that just loops through the whole vector. But when the vector gets large, a for loop is very expensive, so I want to do it more efficiently, the way numpy's matrix operations are faster than doing the same operations in a for loop.
Any suggestions for a faster method?
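import numpy as np
a = np.matrix('0.1 0.82')
print(a)
# Boolean-mask assignment: every entry above the threshold becomes 1 ...
a[a > 0.5] = 1
# ... and every remaining entry (at or below the threshold) becomes 0.
a[a <= 0.5] = 0
print(a)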
Output:
[[ 0.1 0.82]]
[[ 0. 1.]]
import numpy as np
a = np.matrix('0.1 0.82')
print(a)
# np.where picks 1 where the condition holds and 0 elsewhere, in one pass.
a = np.where(a > 0.5, 1, 0)
print(a)
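As a further sketch of the same idea, the comparison itself already produces a boolean array, so casting it to integers gives the labels directly. The `probs` values below are made up for illustration, and a plain `ndarray` is used instead of `np.matrix`, which NumPy's documentation no longer recommends. This assumes the probabilities refer to the class encoded as 1; if they refer to the class encoded as 0, as the question's wording suggests, flip the comparison to `probs <= 0.5`.

```python
import numpy as np

# Hypothetical predicted probabilities (illustrative values only).
probs = np.array([0.1, 0.82, 0.5, 0.49])

# The comparison yields a boolean array; astype(int) maps False -> 0, True -> 1.
labels = (probs > 0.5).astype(int)
print(labels)  # [0 1 0 0]
```

Note that a probability of exactly 0.5 falls on the "not greater" side here, matching the `a <= 0.5` branch in the answers above.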
A more general solution for a 2D array in which each row holds the probabilities of one sample over several classes:
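import numpy as np
a = np.array([[.5, .3, .2],
              [.1, .2, .7],
              [ 1,  0,  0]])
# Column index of the largest probability in each row.
idx = np.argmax(a, axis=-1)
# Start from all zeros, then set a single 1 per row at that index.
a = np.zeros(a.shape)
a[np.arange(a.shape[0]), idx] = 1
print(a)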
Output:
[[1. 0. 0.]
 [0. 0. 1.]
 [1. 0. 0.]]