
Weka normalizing columns

I have an ARFF file containing 14 numerical columns. I want to normalize each column separately, that is, replace each value in a column with (actual_value - min(this_column)) / (max(this_column) - min(this_column)), so that all values in a column fall in the range [0, 1]. The min and max values of one column may differ from those of another column.

How can I do this with Weka filters?

Thanks

lmsasu asked Feb 16 '10 07:02





1 Answer

This can be done using the filter

weka.filters.unsupervised.attribute.Normalize

After applying this filter with its default settings (scale 1.0, translation 0.0), all values in each numeric column will be in the range [0, 1].
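To make the per-column arithmetic concrete, here is a minimal plain-Java sketch of the formula from the question. It is not Weka's code and does not use the Weka API; the class name MinMaxSketch and the method normalizeColumns are made up for illustration. Mapping a constant column (max equal to min) to 0 is an assumption of this sketch, made to avoid dividing by zero.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Weka.
final class MinMaxSketch {

    /** Rescales each column of rows independently to [0, 1] and returns a new array. */
    static double[][] normalizeColumns(double[][] rows) {
        if (rows.length == 0) {
            return new double[0][];
        }
        int cols = rows[0].length;

        // First pass: find the min and max of every column.
        double[] min = new double[cols];
        double[] max = new double[cols];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] row : rows) {
            for (int j = 0; j < cols; j++) {
                min[j] = Math.min(min[j], row[j]);
                max[j] = Math.max(max[j], row[j]);
            }
        }

        // Second pass: apply (value - min) / (max - min) per column.
        double[][] out = new double[rows.length][cols];
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < cols; j++) {
                double range = max[j] - min[j];
                // Assumption of this sketch: a constant column maps to 0.
                out[i][j] = (range == 0.0) ? 0.0 : (rows[i][j] - min[j]) / range;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] data = {{10, 200}, {20, 400}, {30, 300}};
        for (double[] row : normalizeColumns(data)) {
            System.out.println(Arrays.toString(row));
        }
        // prints:
        // [0.0, 0.0]
        // [0.5, 1.0]
        // [1.0, 0.5]
    }
}
```

Each column uses its own min and max, so the first column is scaled over [10, 30] and the second over [200, 400], which matches the behaviour asked for in the question.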

George Dontas answered Sep 22 '22 00:09
