Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does d3.scale.quantile work?

Tags:

d3.js

What is the meaning of this statement?

quantize = d3.scale.quantile().domain([0, 15]).range(d3.range(9));

I saw that the domain is:

0 - 0
1 - 15

range is from 0 to 8 and quantize.quantiles

0 - 1.6
1 - 3.3
2 - 4.9
3 - 6.6
4 - 8.3
5 - 9.9
6 -11.6
7 -13.3

How are the values to quantize.quantiles calculated ? I tried to call quantize(2) but the result was 1. How does quantile work?

like image 663
user1365697 Avatar asked May 14 '12 08:05

user1365697


2 Answers

The motivation of the quantile scale is to obtain classes which are representative of the actual distribution of the values in the dataset. Therefore, it is necessary to provide it during construction with the full list of values. The scale then splits the input domain (defined by these values) into intervals (quantiles) in such a way that about the same number of values falls into each of the intervals.

From the documentation:

To compute the quantiles, the input domain is sorted, and treated as a population of discrete values.

Hence, when specifying the domain we hand in the scale the whole list of values:

var scale = d3.scale.quantile()
  .domain([1, 1, 2, 3, 2, 3, 16])
  .range(['blue', 'white', 'red']);

If we then run:

scale.quantiles()

It will output [2, 3] which means that our population of values was split into these three subsets represented by 'blue', 'white', and 'red' respectively:

[1, 1] [2, 2] [3, 3, 16]

Note that this scale should be avoided when there are outliers in the data which you want to show. In the above example 16 is an outlier falling into the upper quantile. It is assigned the same class as 3, which is probably not the desired behavior:

scale(3)   // will output "red"
scale(16)  // will output "red"
like image 179
Ilya Boyandin Avatar answered Sep 18 '22 09:09

Ilya Boyandin


I would recommend reading over the quantile scale documentation, especially that on quantize.quantiles()

But basically, d3 sees that there are 9 values in the output range for this scale, so it creates 9 quantiles based on the 2 value data set: [0, 15].
This leads to the quantize.quantiles() values that you show in your question: [1.6, 3.3, .. ,13.3] , these represent the bounds of the quantiles - anything less than 1.6 will be mapped to the first element of the output range (in this case zero). Anything less than 3.3 and greater than 1.6 will be mapped to the second element of the output range (one). Hence quantize(2) = one, as expected.

like image 35
Josh Avatar answered Sep 18 '22 09:09

Josh