I have a huge data set from which I derive two sets of data points, which I then have to plot and compare. The two plots differ in their range, so I want both of them to be in the range [0, 1]. For the following code and one specific data set I get a constant line at 1 as the plot, although this normalization works well for other sets:
plt.plot(range(len(rvalue)),np.array(rvalue)/(max(rvalue)))
And for this code:
oldrange = max(rvalue) - min(rvalue) # NORMALIZING
newmin = 0
newrange = 1 + 0.9999999999 - newmin
normal = map(
lambda x, r=float(rvalue[-1] - rvalue[0]): ((x - rvalue[0]) / r)*1 - 0,
rvalue)
plt.plot(range(len(rvalue)), normal)
I get the error:
ZeroDivisionError: float division by zero
for all the data sets. I am unable to figure out how to get both the plots in one range for comparison.
Normalization using sklearn MinMaxScaler: In Python, scikit-learn provides a class called MinMaxScaler that normalizes data using its minimum and maximum values. Its fit_transform method scales each feature to the range 0 to 1 by default.
Standardization: Standardizing features so that they are centered at 0 with a standard deviation of 1 is important when comparing measurements that have different units. Variables measured on different scales do not contribute equally to the analysis and may introduce a bias.
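As a rough illustration of standardization, the sketch below computes z-scores with plain NumPy. The function name `standardize` and the sample values are made up for this example. Mapping a constant input to zeros is an assumption about the desired behavior, not something stated above.

```python
import numpy as np

def standardize(data):
    """Return z-scores, (x - mean) / std. Hypothetical helper name."""
    data = np.asarray(data, dtype=float)
    std = data.std()
    if std == 0:
        # Assumption: a constant input becomes all zeros instead of raising
        return np.zeros_like(data)
    return (data - data.mean()) / std

# Made-up sample measurements
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z = standardize(heights_cm)  # mean close to 0, standard deviation close to 1
```

Unlike min-max scaling, the result is not confined to [0, 1], so this is the better fit when the goal is comparing spread rather than forcing both plots into one fixed range.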
Use the following method to normalize your data in the range of 0 to 1 using min and max value from the data sequence:
import numpy as np

def NormalizeData(data):
    # Min-max scaling: the smallest value maps to 0, the largest to 1
    return (data - np.min(data)) / (np.max(data) - np.min(data))
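A possible explanation for the two symptoms in the question: dividing by max(rvalue) alone keeps any large offset, so a series such as 1000 to 1001 plots as an almost constant line near 1, and computing the range from rvalue[-1] - rvalue[0] gives zero whenever the first and last values are equal. The sketch below is one way to guard against a zero range. The name `normalize_safe` and the sample series are hypothetical, and returning zeros for constant data is an assumption.

```python
import numpy as np

def normalize_safe(data):
    """Min-max scale to [0, 1] using the true min and max of the data."""
    data = np.asarray(data, dtype=float)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # Assumption: constant data maps to zeros instead of dividing by zero
        return np.zeros_like(data)
    return (data - lo) / (hi - lo)

# Made-up series with very different ranges
series_a = [1000.0, 1000.5, 1001.0, 1000.2]
series_b = [3.0, 9.0, 6.0, 3.0]

norm_a = normalize_safe(series_a)
norm_b = normalize_safe(series_b)  # both now span exactly 0 to 1
```

Both normalized arrays can then be passed to plt.plot in turn so that the two curves share the same vertical range.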
Use scikit-learn: http://scikit-learn.org/stable/modules/preprocessing.html#scaling-features-to-a-range
It has built-in functions to scale features to a specified range. You'll find other functions to normalize and standardize on the same page.
See this example:
>>> import numpy as np
>>> from sklearn import preprocessing
>>> X_train = np.array([[ 1., -1.,  2.],
...                     [ 2.,  0.,  0.],
...                     [ 0.,  1., -1.]])
...
>>> min_max_scaler = preprocessing.MinMaxScaler()
>>> X_train_minmax = min_max_scaler.fit_transform(X_train)
>>> X_train_minmax
array([[ 0.5       ,  0.        ,  1.        ],
       [ 1.        ,  0.5       ,  0.33333333],
       [ 0.        ,  1.        ,  0.        ]])
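The example above scales a 2-D array column by column, whereas the question deals with a flat list. A minimal sketch of adapting it, assuming the data is a 1-D sequence like rvalue (the values here are made up): reshape to a single column, scale, then flatten again.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up stand-in for the asker's 1-D list
rvalue = [10.0, 20.0, 30.0]

# MinMaxScaler expects shape (n_samples, n_features), so use one column
column = np.asarray(rvalue, dtype=float).reshape(-1, 1)
scaled = MinMaxScaler().fit_transform(column).ravel()  # back to 1-D for plotting
```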