Separate data from datahash

I am working on a Dimple/D3 chart that plots missing days' data as 0.

date                fruit   count
2013-12-08 12:12    apples  2
2013-12-08 12:12    oranges 5
2013-12-09 16:37    apples  1
                             <- oranges inserted on 12/09 as 0
2013-12-10 11:05    apples  6
2013-12-10 11:05    oranges 2
2013-12-10 20:21    oranges 1

I was able to get nrabinowitz's excellent answer to work, nearly.

My data's timestamp format is YYYY-MM-DD HH:MM, and combining the hash with the d3.extent range split into day intervals results in 0-points every day at midnight, even if there is data present from later in the same day.

An almost-solution I found was to use .setHours(0,0,0,0) to discard the hours/minutes, so that all data would appear to be from midnight:

...
var dateHash = data.reduce(function(agg, d) { 
 agg[d.date.setHours(0,0,0,0)] = true; 
 return agg; 
}, {});
...

This works as expected when there is just one entry per day, every day, BUT on days with multiple entries the values are added together. So in the data above, 12/10 comes out as apples: 6, oranges: 3.
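
The summing is consistent with how setHours behaves: it modifies the Date it is called on and returns the new timestamp as a number, so every row from the same day ends up with an identical midnight date. A minimal sketch in plain JavaScript, using two made-up rows modeled on the 12/10 oranges entries (that the chart then adds together rows sharing a date is taken from the behavior described above, not verified here):

```javascript
// Two hypothetical readings from the same day, built in local time.
var rows = [
    { date: new Date(2013, 11, 10, 11, 5), fruit: "oranges", count: 2 },
    { date: new Date(2013, 11, 10, 20, 21), fruit: "oranges", count: 1 }
];

// setHours changes the Date it is called on and returns the new
// timestamp in milliseconds, so the row itself loses its time of day.
var key = rows[0].date.setHours(0, 0, 0, 0);
rows[1].date.setHours(0, 0, 0, 0);

console.log(typeof key);                                        // "number"
console.log(rows[0].date.getHours());                           // 0
console.log(rows[0].date.getTime() === rows[1].date.getTime()); // true
```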

Ideally (in my mind) I would separate the plotting data from the datehash, and on the hash discard hours/minutes. This would compare the midnight-datehash with the D3 days interval, fill in 0s at midnight on days with missing data, and then plot the real points with hours/minutes intact.

I have tried data2 = data.slice() followed by setHours, but the graph still gets the midnight points:

...
// doesn't work, original data gets converted
var data2 = data.slice();
var dateHash = data2.reduce(function(agg, d) { 
 agg[d.date.setHours(0,0,0,0)] = true; 
 return agg; 
}, {});
...
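
The reason slice() does not help is that it makes a shallow copy: the new array holds the same row objects, and therefore the same Date objects, as the original. One way around it, sketched here with a hypothetical dayKey helper that is not part of D3 or Dimple, is to clone each Date before truncating it, so the hash is built from midnight values while the plotting data keeps its hours and minutes:

```javascript
// Hypothetical helper: midnight timestamp for a date, without touching it.
function dayKey(date) {
    var copy = new Date(date.getTime()); // clone first, then truncate the clone
    return copy.setHours(0, 0, 0, 0);
}

var data = [
    { date: new Date(2013, 11, 10, 11, 5), fruit: "apples", count: 6 }
];

// slice() copies the array, but both arrays point at the same row objects
var data2 = data.slice();
console.log(data2[0] === data[0]); // true

// Build the hash from cloned dates; the plotting data keeps its time of day
var dateHash = data.reduce(function (agg, d) {
    agg[dayKey(d.date)] = true;
    return agg;
}, {});

console.log(data[0].date.getHours()); // 11
```

Any lookup against this hash has to use the same midnight timestamp as its key, for example dateHash[dayKey(someDate)].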

Props to nrabinowitz, here is the adapted code:

// get the min/max dates
var extent = d3.extent(data, function(d) { return d.date; }),
    // hash the existing days for easy lookup
    dateHash = data.reduce(function(agg, d) {
        agg[d.date] = true;

        // arrr this almost works except if multiple entries per day
        // agg[d.date.setHours(0,0,0,0)] = true;

        return agg;
    }, {}),
    headers = ["date", "fruit", "count"];

// make even intervals
d3.time.days(extent[0], extent[1])
    // drop the existing ones
    .filter(function(date) {
        return !dateHash[date];
    })
    .forEach(function(date) {
        // fruit list grabbed from user input
        fruitlist.forEach(function(fruits) {
            var emptyRow = { date: date };
            headers.forEach(function(header) {
                // headers[1] is the fruit column, headers[2] is the count column
                if (header === headers[1]) {
                    emptyRow[header] = fruits;
                } else if (header === headers[2]) {
                    emptyRow[header] = 0;
                }
            });
            // and push them into the array
            data.push(emptyRow);
        });
    });
// re-sort the data
data.sort(function(a, b) { return d3.ascending(a.date, b.date); });

(I'm not concerned with 0-points in the hour-scale, just the dailies. If the time.interval is changed from days to hours I suspect the hash and D3 will handle it fine.)

How can I separate the datehash from the data? Is that what I should be trying to do?
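
For illustration, the separation described above could look roughly like the following plain-JavaScript sketch. It assumes d.date is already a Date in local time, keys the hash on day plus fruit so that a fruit missing on a day where another fruit has data still gets a 0, and uses hypothetical helper names (dayStart, fillMissingDays) that do not come from D3 or Dimple:

```javascript
// Hypothetical helper: a new Date at local midnight; the input is not modified.
function dayStart(date) {
    var copy = new Date(date.getTime());
    copy.setHours(0, 0, 0, 0);
    return copy;
}

// Hypothetical helper: returns a new, sorted array with a 0-count row at
// midnight for every (day, fruit) pair that has no real data.
function fillMissingDays(data, fruitlist) {
    var seen = {}, min = Infinity, max = -Infinity;

    // The hash is built from truncated copies, so the rows keep their times
    data.forEach(function (d) {
        var t = dayStart(d.date).getTime();
        seen[t + "|" + d.fruit] = true;
        if (t < min) { min = t; }
        if (t > max) { max = t; }
    });

    var filled = data.slice();
    // Walk the calendar days from the first to the last day with data
    for (var day = new Date(min); day.getTime() <= max; day.setDate(day.getDate() + 1)) {
        fruitlist.forEach(function (fruit) {
            if (!seen[day.getTime() + "|" + fruit]) {
                filled.push({ date: new Date(day.getTime()), fruit: fruit, count: 0 });
            }
        });
    }

    filled.sort(function (a, b) { return a.date - b.date; });
    return filled;
}

// The sample data from the top of the question
var sample = [
    { date: new Date(2013, 11, 8, 12, 12), fruit: "apples", count: 2 },
    { date: new Date(2013, 11, 8, 12, 12), fruit: "oranges", count: 5 },
    { date: new Date(2013, 11, 9, 16, 37), fruit: "apples", count: 1 },
    { date: new Date(2013, 11, 10, 11, 5), fruit: "apples", count: 6 },
    { date: new Date(2013, 11, 10, 11, 5), fruit: "oranges", count: 2 },
    { date: new Date(2013, 11, 10, 20, 21), fruit: "oranges", count: 1 }
];
var filledSample = fillMissingDays(sample, ["apples", "oranges"]);
console.log(filledSample.length); // 7: the six real rows plus oranges = 0 on 12/09
```

Stepping with setDate rather than adding a fixed 24 hours is meant to keep each step on local midnight across daylight-saving changes.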

williamtx asked Nov 11 '22 15:11


1 Answer

I can't think of a smooth way to do this, but I've written some custom code that works with your example and can hopefully work with your real case.

var svg = dimple.newSvg("#chartContainer", 600, 400),
    data = [
        { date : '2013-12-08 12:12', fruit : 'apples', count : 2 },
        { date : '2013-12-08 12:12', fruit : 'oranges', count : 5 },
        { date : '2013-12-09 16:37', fruit : 'apples', count : 1 },
        { date : '2013-12-10 11:05', fruit : 'apples', count : 6 },
        { date : '2013-12-10 11:05', fruit : 'oranges', count : 2 },
        { date : '2013-12-10 20:21', fruit : 'oranges', count : 1 }
    ],
    lastDate = {},
    filledData = [],
    dayLength = 86400000,
    formatter = d3.time.format("%Y-%m-%d %H:%M");

// The logic below requires the data to be ordered by date
data.sort(function(a, b) { 
    return formatter.parse(a.date) - formatter.parse(b.date); 
});

// Iterate the data to find and fill gaps
data.forEach(function (d) {

    // Work from midday (this could easily be changed to midnight)
    var noon = formatter.parse(d.date).setHours(12, 0, 0, 0);

    // If the series value is not in the dictionary add it
    if (lastDate[d.fruit] === undefined) {
        lastDate[d.fruit] = formatter.parse(data[0].date).setHours(12, 0, 0, 0);
    }

    // Calculate the days since the last occurrence of the series value and fill
    // with a line for each missing day
    for (var i = 1; i <= (noon - lastDate[d.fruit]) / dayLength - 1; i++) {
        filledData.push({ 
            date : formatter(new Date(lastDate[d.fruit] + (i * dayLength))), 
            fruit : d.fruit, 
            count : 0 });
    }

    // update the dictionary of last dates
    lastDate[d.fruit] = noon;

    // push to a new data array
    filledData.push(d);

}, this);

// Configure a dimple line chart to display the data
var chart = new dimple.chart(svg, filledData),
    x = chart.addTimeAxis("x", "date", "%Y-%m-%d %H:%M", "%Y-%m-%d"),
    y = chart.addMeasureAxis("y", "count"),
    s = chart.addSeries("fruit", dimple.plot.line);
s.lineMarkers = true;
chart.draw();

You can see this working in a fiddle here:

http://jsfiddle.net/LsvLJ/

John Kiernander answered Nov 15 '22 13:11