Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimising a group of dc.js line graphs

I have a group of graphs visualizing a bunch of data for me (here), based off a csv with approximately 25,000 lines of data, each having 12 parameters. However, doing any interaction (such as selecting a range with the brush on any of the graphs) is slow and unwieldy, completely unlike the dc.js demo found here, which deals with thousands of records as well but maintains smooth animations, or crossfilter's demo here which has 10 times as many records (flights) as I do.

I know the main resource hogs are the two line charts, since they have data points every 15 minutes for about 8 solid months. Removing either of them makes the charts responsive again, but they're the main feature of the visualizations, so is there any way I can make them show less fine-grained data?

The code for the two line graphs specifically is below:

        var lineZoomGraph = dc.lineChart("#chart-line-zoom")
            .width(1100)
            .height(60)
            .margins({top: 0, right: 50, bottom: 20, left: 40})
            .dimension(dateDim)
            .group(tempGroup)
            .x(d3.time.scale().domain([minDate,maxDate]));

        var tempLineGraph = dc.lineChart("#chart-line-tempPer15Min")
            .width(1100).height(240)
            .dimension(dateDim)
            .group(tempGroup)
            .mouseZoomable(true)
            .rangeChart(lineZoomGraph)
            .brushOn(false)
            .x(d3.time.scale().domain([minDate,maxDate])); 

Separate but relevant question; how do I modify the y-axis on the line charts? By default they don't encompass the highest and lowest values found in the dataset, which seems odd.

Edit: some code I wrote to try to solve the problem:

var graphWidth = 1100;
var dataPerPixel = data.length / graphWidth;

var tempGroup = dateDim.group().reduceSum(function(d) {
    if (d.pointNumber % Math.ceil(dataPerPixel) === 0) {
        return d.warmth;
    }
});

d.pointNumber is a unique point ID for each data point, cumulative from 0 to 22 thousand ish. Now however the line graph shows up blank. I checked the group's data using tempGroup.all() and now every 21st data point has a temperature value, but all the others have NaN. I haven't succeeded in reducing the group size at all; it's still at 22 thousand or so. I wonder if this is the right approach...

Edit 2: found a different approach. I create the tempGroup normally but then create another group which filters the existing tempGroup even more.

var tempGroup = dateDim.group().reduceSum(function(d) { return d.warmth; });
    var filteredTempGroup = {
        all: function () {
            return tempGroup.top(Infinity).filter( function (d) { 
                if (d.pointNumber % Math.ceil(dataPerPixel) === 0) return d.value;
            } );
        }
    };

The problem I have here is that d.pointNumber isn't accessible so I can't tell if it's the Nth data point (or a multiple of that). If I assign it to a var it'll just be a fixed value anyway, so I'm not sure how to get around that...

like image 205
IronWaffleMan Avatar asked Apr 12 '15 20:04

IronWaffleMan


1 Answers

When dealing with performance problems with d3-based charts, the usual culprit is the number of DOM elements, not the size of the data. Notice the crossfilter demo has lots of rows of data, but only a couple hundred bars.

It looks like you might be attempting to plot all the points instead of aggregating them. I guess since you are doing a time series it may be unintuitive to aggregate the points, but consider that your plot can only display 1100 points (the width), so it is pointless to overwork the SVG engine plotting 25,000.

I'd suggest bringing it down to somewhere between 100-1000 bins, e.g. by averaging each day:

var daysDim = data.dimension(function(d) { return d3.time.day(d.time); });

function reduceAddAvg(attr) {
  return function(p,v) {
    if (_.isLegitNumber(v[attr])) {
      ++p.count
      p.sums += v[attr];
      p.averages = (p.count === 0) ? 0 : p.sums/p.count; // gaurd against dividing by zero
    }
    return p;
  };
}
function reduceRemoveAvg(attr) {
  return function(p,v) {
    if (_.isLegitNumber(v[attr])) {
      --p.count
      p.sums -= v[attr];
      p.averages = (p.count === 0) ? 0 : p.sums/p.count;
    }
    return p;
  };
}
function reduceInitAvg() {
  return {count:0, sums:0, averages:0};
}
...
// average a parameter (column) named "param" 
var daysGroup = dim.group().reduce(reduceAddAvg('param'), reduceRemoveAvg('param'), reduceInitAvg);

(reusable average reduce functions from the FAQ)

Then specify your xUnits to match, and use elasticY to auto-calculate the y axis:

chart.xUnits(d3.time.days)
   .elasticY(true)
like image 114
Gordon Avatar answered Oct 21 '22 07:10

Gordon