Large data set breaks d3 sankey diagram

I'm making a d3 Sankey diagram based on the example at https://bl.ocks.org/d3noob/5028304. The example works fine with a smaller data set, but when I switch to a larger data set the visualization breaks. The problem seems to be that the dy values become negative.

In the console, the error is:

Error: <rect> attribute height: A negative value is not valid. ("-9.02557856272838")

The code it points to is:

node.append("rect")
  .attr("height", function(d) { return d.dy; })

This is perhaps because the plots are going off screen? I looked at using d3 scales, but I'm not sure how to implement them. Maybe something like this:

d3.scale.ordinal()
.domain(data.map(function(d) { return d.name; }))
.rangeRoundBands([0, height], .2);

Or maybe there's a way to shrink the visualization as the data set gets larger so that everything will fit in the container.

Here is my code: https://plnkr.co/edit/hOjEBHhS7vfajD2wb8t9?p=preview

kimli asked Nov 13 '16 21:11


1 Answer

With 945 nodes and 2463 links, there is no way this is going to fit in a 740-pixel-high container. Not only that, you have to ask yourself how this dataviz is going to be useful to the reader with that huge amount of information. But since that's none of my business, you can do a couple of things:

The first one, of course, is filtering your data. If that's not an option, you can increase the container height:

height = 3000 - margin.top - margin.bottom;

And reduce the padding of the nodes:

var sankey = d3.sankey()
    .nodeWidth(36)
    .nodePadding(1)
    .size([width, height]);

The result is in this plunker: https://plnkr.co/edit/Idb6ZROhq1kuZatbGtqg?p=preview
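
Both changes help for the same reason. As far as I can tell from the older sankey.js plugin, the vertical scale for each column is roughly (height - (nodes in column - 1) * nodePadding) / (sum of node values), so once the padding alone is larger than the height, the scale and every dy turn negative. Here is a minimal sketch that estimates a height or a padding that keeps the scale positive. The helper names and the minNodeSpace parameter are my own and are not part of d3 or sankey.js, and the numbers are made up.

```javascript
// Hypothetical helpers, not part of d3 or sankey.js.
// Assumption: the layout scales node heights per column by
//   (height - (count - 1) * nodePadding) / sumOfValues
// so the padding must leave some pixels over for the nodes themselves.

// Smallest layout height that leaves `minNodeSpace` pixels for the nodes
// in the most crowded column.
function minHeightFor(maxNodesPerColumn, nodePadding, minNodeSpace) {
  return (maxNodesPerColumn - 1) * nodePadding + minNodeSpace;
}

// Largest padding that still leaves `minNodeSpace` pixels at a given height.
function maxPaddingFor(maxNodesPerColumn, height, minNodeSpace) {
  if (maxNodesPerColumn < 2) return Infinity; // a single node needs no padding
  return (height - minNodeSpace) / (maxNodesPerColumn - 1);
}

// Example with made-up numbers: 300 nodes in the busiest column.
var neededHeight = minHeightFor(300, 10, 500);    // 299 * 10 + 500 = 3490
var usablePadding = maxPaddingFor(300, 740, 500); // 240 / 299, about 0.80
```

You could then pass values like these to .size([width, height]) and .nodePadding(...) instead of hard-coding them.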

But if even that is not an option, you can change the sankey.js code or, as a lazy solution, clamp negative heights to zero with this:

.attr("height", function(d) { return d.dy < 0 ? 0 : d.dy; })

This being the result: https://plnkr.co/edit/tpaMpK5yXwYh9MEo8hgn?p=preview
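
Going back to the first option, filtering, here is a minimal sketch of one way to do it: drop the links below a threshold, then drop the nodes that no remaining link refers to. It assumes each link refers to its nodes by name, before any conversion to array indices. filterGraph and minValue are hypothetical names, and the right threshold depends on your data.

```javascript
// Hypothetical helper, not part of d3 or sankey.js.
// Assumes graph = { nodes: [{ name }], links: [{ source, target, value }] }
// where source and target are node names (not array indices).
function filterGraph(graph, minValue) {
  // Keep only the links that carry at least `minValue`.
  var links = graph.links.filter(function (l) {
    return l.value >= minValue;
  });

  // Collect the names still referenced by a surviving link.
  var used = Object.create(null);
  links.forEach(function (l) {
    used[l.source] = true;
    used[l.target] = true;
  });

  // Drop nodes that no longer take part in any link.
  var nodes = graph.nodes.filter(function (n) {
    return used[n.name] === true;
  });

  return { nodes: nodes, links: links };
}

// Made-up data for illustration.
var sample = {
  nodes: [{ name: "A" }, { name: "B" }, { name: "C" }, { name: "D" }],
  links: [
    { source: "A", target: "B", value: 10 },
    { source: "B", target: "C", value: 2 },
    { source: "A", target: "C", value: 7 },
    { source: "C", target: "D", value: 1 }
  ]
};
var filtered = filterGraph(sample, 5); // keeps A->B and A->C, drops node D
```

The filtered graph can then go through the same steps as the full one before being handed to the sankey layout.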

Gerardo Furtado answered Sep 28 '22 12:09