
What is the meaning of jitter in the Visualize tab of Weka?

In Weka, I load an ARFF file. I can view the relationships between attributes using the Visualize tab.

However, I can't understand the meaning of the Jitter slider. What is its purpose?

Xolve asked Aug 09 '09 16:08



2 Answers

You can find the answer in the mailing list archives:

The jitter function in the Visualize panel just adds artificial random noise to the coordinates of the plotted points in order to spread the data out a bit (so that you can see points that might have been obscured by others).
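To make the idea concrete, here is a minimal Python sketch of that kind of jitter. It is not Weka's actual code: the function name `jitter_points`, the `amount` parameter (a stand-in for the slider position, as a fraction between 0 and 1), and the choice of uniform noise scaled by the axis range are all assumptions made for illustration.

```python
import random

def jitter_points(points, amount, x_range, y_range, seed=None):
    """Return a copy of points with uniform random noise added to each coordinate.

    amount is a fraction in [0, 1] (hypothetical stand-in for the slider);
    the noise is scaled by the axis range so both axes are displaced comparably.
    """
    rng = random.Random(seed)
    jittered = []
    for x, y in points:
        # Each offset lies within +/- half of (amount * axis range).
        dx = rng.uniform(-0.5, 0.5) * amount * x_range
        dy = rng.uniform(-0.5, 0.5) * amount * y_range
        jittered.append((x + dx, y + dy))
    return jittered

# Five instances share the same coordinates; a sixth sits elsewhere.
points = [(1.0, 2.0)] * 5 + [(3.0, 2.0)]
spread = jitter_points(points, amount=0.1, x_range=2.0, y_range=1.0, seed=0)
```

With `amount=0` the points are returned unchanged; raising it spreads the five coincident points apart so each becomes visible. Only the plotted positions change, and the underlying data is left as it was.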

Zed answered Sep 16 '22 12:09


I don't know Weka, but in general jitter refers to the deviation of a periodic signal from its ideal timing. I'm guessing the slider either sets a range or threshold below which data points are treated as regular, or modifies the output to introduce some variation. The Wikipedia entry can give you some background.

Update: according to this PDF, the jitter slider serves this purpose:

“Jitter” option to deal with nominal attributes (and to detect “hidden” data points)

Based on the accompanying slide, it looks like it introduces some variation in the visualisation, perhaps to reveal when two data points overlap.

Update 2: This Google Books extract (from Data Mining by Ian H. Witten and Eibe Frank) seems to confirm my guess:

[jitter] is a random displacement applied to X and Y values to separate points that lie on top of one another. Without jitter, 1000 instances at the same data point would look just the same as 1 instance
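The "1000 instances look like 1" effect can be checked with a small Python sketch. The attribute values, the integer encoding of nominal values, and the noise width of 0.2 are made up for illustration and are not taken from Weka.

```python
import random

# 1000 instances share one pair of nominal values; a single instance sits elsewhere.
instances = [("sunny", "yes")] * 1000 + [("rainy", "no")]

# Assumed encoding: each nominal value is plotted at an integer position on its axis.
x_index = {"sunny": 0, "rainy": 1}
y_index = {"yes": 0, "no": 1}
plotted = [(x_index[a], y_index[b]) for a, b in instances]

# Without jitter, only two distinct positions are drawn for 1001 instances.
distinct_before = len(set(plotted))

# With jitter, each instance gets its own small random displacement.
rng = random.Random(42)
jittered = [(x + rng.uniform(-0.2, 0.2), y + rng.uniform(-0.2, 0.2))
            for x, y in plotted]
distinct_after = len(set(jittered))

print(distinct_before, distinct_after)
```

Before jitter the plot has just two distinct positions, so the dense cell and the single instance look equally populated; after jitter the dense cell shows up as a cloud of points.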

Rich Seller answered Sep 17 '22 12:09