
How to detect and delete noise in RapidMiner?

I am new to RapidMiner 5. I want to know how to find noise in my data, show it in a chart, and delete it.

H.Ghassami asked Oct 21 '22 02:10



1 Answer

This is a complex problem because it depends on what you mean by noise.

If you mean individual attribute values that are plain wrong, you could plot a histogram of the attribute and work out limits on what constitutes a valid value. You could then enforce that rule with the Filter Examples operator to remove the examples that fall outside those limits.
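Outside RapidMiner, the same idea (look at the histogram, pick limits, filter) can be sketched in a few lines of plain Python. This is only an illustration, not how RapidMiner works internally: the `age` attribute, the sample values, the bin width, and the limits 0 to 120 are all made-up assumptions, and `histogram` and `filter_valid` are hypothetical helper names.

```python
from collections import Counter

def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin."""
    return Counter((v // bin_width) * bin_width for v in values)

def filter_valid(rows, attribute, low, high):
    """Keep only rows whose attribute lies in [low, high]."""
    return [row for row in rows if low <= row[attribute] <= high]

# Hypothetical data: ages with two obviously wrong entries (-4 and 410).
rows = [{"age": a} for a in [23, 31, 27, 45, -4, 38, 29, 410, 33]]
ages = [row["age"] for row in rows]

# Crude text histogram: the isolated bins far from the bulk stand out.
for edge, count in sorted(histogram(ages, 10).items()):
    print(f"{edge:>4} | {'#' * count}")

# Assumed validity rule chosen after looking at the histogram.
clean = filter_valid(rows, "age", 0, 120)
```

The limits are a judgment call made after inspecting the plot, which is why they are passed in as parameters and not derived automatically.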

If you mean attributes that have some sort of random jitter applied to them, this is difficult to detect. Only by knowing beforehand the expected shape of the distribution could you compare it with what you observe and do something about it. Even then, the action to take is by no means obvious.

If you mean examples within an example set that are obviously different from the other examples, you could consider the various outlier detection operators. The simplest one to get started with is Detect Outlier (Distances). It finds a set number of outliers (default 10) based on a distance calculation that uses all the attributes of the examples. It creates a new attribute called outlier that is set to true or false. You could then use the Filter Examples operator to remove the examples where outlier is true.
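To make the distance-based approach concrete, here is a minimal Python sketch of the general idea. It is not RapidMiner's implementation: it assumes the outlier score is the Euclidean distance to the k-th nearest neighbour, and the function name `flag_outliers`, the parameter `k`, and the sample points are all hypothetical.

```python
import math

def flag_outliers(points, n_outliers=10, k=3):
    """Return one boolean per point. True marks the n_outliers points
    whose distance to their k-th nearest neighbour is largest.
    Requires at least two points."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, using all attributes.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # Fall back to the farthest neighbour if fewer than k exist.
        scores.append(dists[min(k, len(dists)) - 1])
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_outliers])
    return [i in flagged for i in range(len(points))]

# Hypothetical two-attribute examples: one point sits far from the rest.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (1.1, 1.2), (0.8, 0.9), (9.0, 9.5)]

# Equivalent of the new true/false "outlier" attribute.
outlier = flag_outliers(points, n_outliers=1, k=2)

# Equivalent of filtering out the examples where outlier is true.
kept = [p for p, is_out in zip(points, outlier) if not is_out]
```

Note that a fixed outlier count always flags that many examples, even when the data contains none that are truly unusual, so the number is worth adjusting to the data set.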

Hope that helps, at least as a start.

Andrew Chisholm answered Oct 23 '22 23:10
