
How to visualize a large network in R?

Network visualizations have become common in science and in practice. But as networks grow in size, common visualizations become less useful: there are simply too many nodes/vertices and links/edges, and visualization efforts often end up producing "hairballs".

Some new approaches have been proposed to overcome this issue, e.g.:

  • Edge bundling:
    • http://vis.stanford.edu/papers/divided-edge-bundling or
    • https://gephi.org/tag/edge-bundling/
  • Hierarchical edge bundling:
    • http://graphics.cs.illinois.edu/sites/graphics.dev.engr.illinois.edu/files/edgebundles.pdf
  • Group Attributes Layout:
    • http://wiki.cytoscape.org/Cytoscape_3/UserManual
    • How to make grouped layout in igraph?

I am sure that there are many more approaches. Thus, my question is: how can the hairball issue be overcome, i.e., how can large networks be visualized in R?

Here is some code that simulates an example network:

    # Install packages (only needed once)
    install.packages(c("devtools", "sna", "intergraph", "igraph", "network"))
    library(devtools)
    # Current devtools expects "user/repo" rather than a separate username argument
    devtools::install_github("ggobi/ggally")

    # Load packages
    lapply(c("sna", "intergraph", "GGally", "igraph", "network"),
           require, character.only = TRUE)

    # Set up data
    set.seed(123)
    g <- barabasi.game(1000)

    # Plot data
    g.plot <- ggnet(g, mode = "fruchtermanreingold")
    g.plot


This question is related to Visualizing Undirected Graph That's Too Large for GraphViz?. However, here I am not looking for general software recommendations but for concrete examples (using the data provided above) of techniques that help produce a good visualization of a large network in R (comparable to the examples in this thread: R: Scatterplot with too many points).

Asked by majom on Mar 17 '14 at 11:03




1 Answer

Another way to visualize very large networks is with BioFabric (www.BioFabric.org), which uses horizontal lines instead of points to represent the nodes. Edges are then shown using vertical line segments. A quick D3 demo of this technique is shown at: http://www.biofabric.org/gallery/pages/SuperQuickBioFabric.html.
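As a rough illustration of the layout idea only (this is not the RBioFabric code), the following base-R sketch gives every node its own row and every edge its own column, then draws nodes as horizontal segments and edges as vertical segments. The function names `biofabric_sketch` and `plot_biofabric_sketch`, the degree-based row ordering, and the edge-column ordering are all assumptions made for this example; BioFabric's actual default ordering is breadth-first.

```r
# Minimal BioFabric-style layout in base R (illustrative sketch only).
# `edges` is a two-column matrix of node indices, one row per edge.
biofabric_sketch <- function(edges, n_nodes = max(edges)) {
  # Node rows: highest-degree nodes first (a simplification of
  # BioFabric's breadth-first default).
  deg <- tabulate(as.vector(edges), nbins = n_nodes)
  node_row <- integer(n_nodes)
  node_row[order(-deg, seq_len(n_nodes))] <- seq_len(n_nodes)

  # Edge columns: order edges by the row of their upper endpoint,
  # then by the row of their lower endpoint.
  top    <- pmin(node_row[edges[, 1]], node_row[edges[, 2]])
  bottom <- pmax(node_row[edges[, 1]], node_row[edges[, 2]])
  edge_col <- integer(nrow(edges))
  edge_col[order(top, bottom)] <- seq_len(nrow(edges))

  # Each node's horizontal line spans the columns of its incident edges;
  # isolated nodes keep NA and are simply not drawn.
  node_of <- c(edges[, 1], edges[, 2])
  col_of  <- c(edge_col, edge_col)
  seen    <- sort(unique(node_of))
  x_min <- rep(NA_integer_, n_nodes)
  x_max <- rep(NA_integer_, n_nodes)
  x_min[seen] <- tapply(col_of, node_of, min)
  x_max[seen] <- tapply(col_of, node_of, max)

  list(
    nodes = data.frame(node = seq_len(n_nodes), row = node_row,
                       x_min = x_min, x_max = x_max),
    edges = data.frame(from = edges[, 1], to = edges[, 2],
                       col = edge_col, top = top, bottom = bottom)
  )
}

plot_biofabric_sketch <- function(layout) {
  n <- layout$nodes
  e <- layout$edges
  # Reverse the y axis so that row 1 is drawn at the top.
  plot(NULL, xlim = c(1, nrow(e)), ylim = c(nrow(n), 1),
       xlab = "edge column", ylab = "node row")
  segments(n$x_min, n$row, n$x_max, n$row)   # nodes: horizontal lines
  segments(e$col, e$top, e$col, e$bottom)    # edges: vertical lines
}

# Tiny made-up example: 5 nodes, 5 edges.
edges <- rbind(c(1, 2), c(1, 3), c(1, 4), c(2, 5), c(3, 5))
layout <- biofabric_sketch(edges)
plot_biofabric_sketch(layout)
```

Because no two edges share a column and no two nodes share a row, edges never cross each other in this kind of drawing, which is what keeps it readable as the network grows.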

BioFabric is a Java application, but a simple R version is available at: https://github.com/wjrl/RBioFabric.

Here is a snippet of R code:

    # You need 'devtools':
    install.packages("devtools")
    library(devtools)

    # You need igraph:
    install.packages("igraph")
    library(igraph)

    # Install and load 'RBioFabric' from GitHub
    # (current devtools expects "user/repo" rather than a username argument)
    install_github("wjrl/RBioFabric")
    library(RBioFabric)

    #
    # This is the example provided in the question:
    #

    set.seed(123)
    bfGraph = barabasi.game(1000)

    # This example has 1000 nodes, just like the provided example, but it
    # adds 6 edges in each step, making for an interesting shape; play
    # around with different values.

    # bfGraph = barabasi.game(1000, m=6, directed=FALSE)

    # Plot it up! For best results, make the PDF in the same
    # aspect ratio as the network, though a little extra height
    # covers the top labels. Given the size of the network,
    # a PDF width of 100 gives us good resolution.

    height <- vcount(bfGraph)
    width <- ecount(bfGraph)
    aspect <- height / width
    plotWidth <- 100.0
    plotHeight <- plotWidth * (aspect * 1.2)
    pdf("myBioFabricOutput.pdf", width=plotWidth, height=plotHeight)
    bioFabric(bfGraph)
    dev.off()

Here is a shot of the BioFabric version of the data provided by the questioner, though networks created with values of m > 1 are more interesting. The inset detail shows a close-up of the upper left corner of the network; node BF4 is the highest-degree node in the network, and the default layout is a breadth-first search of the network (ignoring edge directions) starting from that node, with neighboring nodes traversed in order of decreasing node degree. Note that we can immediately see that, for example, about 60% of node BF4's neighbors are degree 1. We can also see from the strict 45-degree lower edge that this 1000-node network has 999 edges, and is therefore a tree.

BioFabric presentation of example data
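The structural observations read off the plot can also be checked numerically. The sketch below assumes a recent igraph, where `sample_pa()` is the current name for the generator the question calls `barabasi.game()`; the variable names are made up for this example. It relies on the fact that a connected graph with n nodes and n - 1 edges is a tree.

```r
library(igraph)

set.seed(123)
# With default arguments each new node adds exactly one edge.
g <- sample_pa(1000)

# A connected graph with n nodes and n - 1 edges is a tree.
n_nodes <- vcount(g)
n_edges <- ecount(g)
is_tree_like <- is_connected(g, mode = "weak") && n_edges == n_nodes - 1

# Share of the highest-degree node's neighbours that have degree 1
# (edge directions ignored, as in the default BioFabric layout).
deg <- degree(g, mode = "all")
hub <- which.max(deg)
hub_neighbours <- as.integer(neighbors(g, hub, mode = "all"))
leaf_share <- mean(deg[hub_neighbours] == 1)

cat("nodes:", n_nodes, "edges:", n_edges, "tree:", is_tree_like, "\n")
cat("share of degree-1 neighbours of the hub:", round(leaf_share, 2), "\n")
```

The exact share printed may depend on the igraph version and its random number stream, so it should be compared against the plot rather than taken as a fixed value.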

Full disclosure: BioFabric is a tool that I wrote.

Answered by wjrl on Oct 04 '22 at 03:10