I am confused by two methods for normalising an array so that it sums to 1.0:
Array to be normalised:
array([ 1.17091033, 1.13843561, 1.240346 , 1.05438719, 1.05386014,
1.15475574, 1.16127814, 1.07070739, 0.93670444, 1.20450255,
1.25644135])
Method 1:
arr = np.array(values / min(values))
array([ 1.25003179, 1.21536267, 1.32415941, 1.12563488, 1.12507221,
1.23278559, 1.23974873, 1.14305788, 1.00000000, 1.28589392,
1.34134236])
arr1 = arr / sum(arr) # Sum total to 1.0
array([ 0.09410701, 0.09149699, 0.09968761, 0.08474195, 0.08469959,
0.09280865, 0.09333286, 0.08605362, 0.07528369, 0.09680684,
0.1009812 ])
Method 2:
arr = np.array((values - min(values)) / (max(values) - min(values)))
array([ 0.73249564, 0.63092863, 0.94966065, 0.3680612, 0.3664128 ,
0.68197101, 0.70237028, 0.41910379, 0.0000000, 0.83755771,
1.00000000])
arr2 = arr / sum(arr) # Sum total to 1.0
array([ 0.10951467, 0.09432949, 0.14198279, 0.05502845, 0.054782 ,
0.10196079, 0.10501066, 0.06265978, 0.00000000, 0.12522239,
0.14950897])
Which method is correct? And why?
Both methods turn values into an array whose sum is 1, but they do it differently and give different results.
The first step of method 1 scales the array so that the minimum value becomes 1. This step isn't needed, and it fails if the minimum of values is 0, because it then divides by zero.
>>> import numpy as np
>>> values = np.array([2, 4, 6, 8])
>>> arr1 = values / values.min()
>>> arr1
array([ 1., 2., 3., 4.])
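The zero-minimum case can be checked with a small sketch. The array `with_zero` is a made-up example, not data from the question, and `np.errstate` is used only to silence the warnings NumPy would otherwise emit:

```python
import numpy as np

# Hypothetical input whose minimum is 0 (not taken from the question).
with_zero = np.array([0.0, 2.0, 4.0])

# Dividing by the minimum divides by zero: 0/0 gives nan, x/0 gives inf.
with np.errstate(divide="ignore", invalid="ignore"):
    scaled = with_zero / with_zero.min()

print(scaled)

# Dividing by the sum directly has no such problem for this input.
print(with_zero / with_zero.sum())
```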
The second step of method 1 scales the array so that its sum becomes 1. Because the first step only divides by a constant, the second step cancels it: dividing values by its sum directly gives the same result, so you don't need arr1:
>>> arr1 / arr1.sum()
array([ 0.1, 0.2, 0.3, 0.4])
>>> values / values.sum()
array([ 0.1, 0.2, 0.3, 0.4])
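The same check can be run on the array from the question. This is only a sketch; the names `question_values`, `method1` and `direct` are introduced here for illustration:

```python
import numpy as np

# The array from the question.
question_values = np.array([
    1.17091033, 1.13843561, 1.240346, 1.05438719, 1.05386014,
    1.15475574, 1.16127814, 1.07070739, 0.93670444, 1.20450255,
    1.25644135,
])

# Method 1: divide by the minimum, then divide by the sum.
method1 = question_values / question_values.min()
method1 = method1 / method1.sum()

# Dividing by the sum directly, skipping the first step.
direct = question_values / question_values.sum()

# Both routes agree up to floating-point rounding.
print(np.allclose(method1, direct))
```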
The first step of method 2 offsets and scales the array so that the minimum becomes 0 and the maximum becomes 1:
>>> arr2 = (values - values.min()) / (values.max() - values.min())
>>> arr2
array([ 0. , 0.33333333, 0.66666667, 1. ])
The second step of method 2 scales the array so that the sum becomes 1. The offset from step 1 still applies, but the scaling from step 1 is cancelled. Note that the minimum element is still 0:
>>> arr2 / arr2.sum()
array([ 0. , 0.16666667, 0.33333333, 0.5 ])
You could get this result directly from values with:
>>> (values - values.min()) / (values - values.min()).sum()
array([ 0. , 0.16666667, 0.33333333, 0.5 ])
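To compare the two normalisations side by side, they could be wrapped in small helpers. This is a minimal sketch: the function names `normalise_by_sum` and `normalise_shifted` are hypothetical, and the check for an all-equal array is an added assumption about how you might want to handle that case:

```python
import numpy as np

def normalise_by_sum(values):
    """Scale so the entries sum to 1; ratios between entries are preserved."""
    values = np.asarray(values, dtype=float)
    return values / values.sum()

def normalise_shifted(values):
    """Subtract the minimum, then scale so the entries sum to 1."""
    values = np.asarray(values, dtype=float)
    shifted = values - values.min()
    total = shifted.sum()
    if total == 0:
        # All entries are equal, so the shifted array is all zeros.
        raise ValueError("all values are equal; cannot normalise after shifting")
    return shifted / total

values = np.array([2, 4, 6, 8])
print(normalise_by_sum(values))   # equivalent to method 1
print(normalise_shifted(values))  # equivalent to method 2
```

Which one fits depends on what the result should mean: the first keeps the ratios between elements, while the second always gives the smallest element a share of 0.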