Normalize a feature in this table

This has become quite a frustrating question, but I've asked in the Coursera discussions and they won't help. Below is the question:

enter image description here

I've gotten it wrong 6 times now. How do I normalize the feature? Hints are all I'm asking for.

I'm assuming x_2^(2) is the value 5184, unless I'm supposed to add the x_0 column of 1's, which they don't mention here but which he certainly mentions in the lectures when talking about creating the design matrix X. In that case x_2^(2) would be the value 72. Assuming one or the other is right (I'm playing a guessing game), what should I use to normalize it? He talks about 3 different ways to normalize in the lectures: one using the maximum value, another using the range (the difference between the max and min), and another using the standard deviation. They want an answer correct to the hundredths. Which one am I to use? This is so confusing.

bjd2385 asked Jun 09 '15 17:06



2 Answers

...use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.

So for any individual feature f:

f_norm = (f - f_mean) / (f_max - f_min) 

e.g. for x2 = (midterm exam)^2 = {7921, 5184, 8836, 4761}:

    > x2 <- c(7921, 5184, 8836, 4761)
    > mean(x2)
    [1] 6675.5
    > max(x2) - min(x2)
    [1] 4075
    > (x2 - mean(x2)) / (max(x2) - min(x2))
    [1]  0.306 -0.366  0.530 -0.470

Hence norm(5184) = -0.366, i.e. -0.37 to the hundredths.

(using R language, which is great at vectorizing expressions like this)

I agree it's confusing that they used the notation x_2^(2) to mean x_2^(norm) or x_2'.


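EDIT: in practice people often call the builtin scale(...) function. Note that by default it divides by the standard deviation rather than the range, so it only gives the same result if you pass the range explicitly, e.g. scale(x2, center = mean(x2), scale = max(x2) - min(x2)).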
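
To get a feel for how much the choice of divisor matters, here is a small sketch in Python (my own illustration, not the R used in this answer and not part of the course material). It normalizes the value 5184 twice: once dividing by the sample standard deviation (n - 1 denominator, which is an assumption about which standard deviation is meant) and once dividing by the range.

```python
import statistics

# (midterm exam)^2 column, values as listed in the answer
x2 = [7921, 5184, 8836, 4761]

mean = statistics.mean(x2)        # 6675.5
sd = statistics.stdev(x2)         # sample standard deviation (n - 1 denominator)
value_range = max(x2) - min(x2)   # 4075

# Same numerator, two different divisors
z_score = (5184 - mean) / sd
range_scaled = (5184 - mean) / value_range

print("divided by standard deviation:", round(z_score, 3))
print("divided by range:             ", round(range_scaled, 3))
```

The two results differ noticeably, so a quiz that grades to the hundredths will only accept the divisor named in the question, which here is the range.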

smci answered Sep 24 '22 05:09


It's asking you to normalize the second value in the second feature column, using both feature scaling and mean normalization. Therefore,

(5184 - 6675.5) / 4075 = -0.366
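
If it helps, the whole column can be normalized in a few lines of plain Python. This is only a sketch: the function name mean_range_normalize is made up for this example, and it assumes x_2^(2) means feature 2 of training example 2, i.e. the second entry of the squared-midterm column.

```python
# (midterm exam)^2 column, values taken from the answers on this page
x2 = [7921, 5184, 8836, 4761]

def mean_range_normalize(values):
    """Mean normalization plus feature scaling: (v - mean) / (max - min)."""
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    return [(v - mean) / spread for v in values]

x2_norm = mean_range_normalize(x2)

# x_2^(2): second training example -> index 1 with 0-based indexing
answer = round(x2_norm[1], 2)
print(answer)  # -0.37
```

Rounding happens only at the end, so the value entered in the quiz is -0.37 rather than a truncated -0.36.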

user6552158 answered Sep 26 '22 05:09