
How do I merge two nodes into a single node using igraph

Tags: r, igraph

I am trying to merge two nodes (call them 'V' and 'U') in a graph (G) into a single node (V).

G is a hyperlink network of 779 nodes (websites). Each edge represents a hyperlink. V and U are actually the same website, but unfortunately the webpages from that website have become split into two separate nodes. So I want to put them back together into a single node.

I have researched the contract.vertices function, but I cannot understand how to adapt it here.

Here are the attributes of my graph (G).

> G
IGRAPH D--- 779 3544 -- 
+ attr: Image File (v/c), Ringset (v/n), Country Code TLD (v/n), Generic TLD (v/n), Number of Pages (v/n), Categorical 1 (v/n), Categorical 2 (v/n),
  Categorical 3 (v/n), id (v/c), label (v/c), Width (e/n)

I have two nodes that I want to merge together:

> V(G)$id[8]
[1] "http://www.police.uk/"

and

> V(G)$id[14]
[1] "http://police.uk/"

In total there are 779 nodes and 3544 edges in the graph.

I want these two nodes to become a single node in the graph (i.e. they will have the same "id"). All inlinks and outlinks from/to other nodes will now point only to this new single node.

All other attributes will remain the same, with the exception of Number of Pages (the value of this will be the sum of both the nodes before they are merged).

timothyjgraham asked Sep 25 '13 06:09


1 Answer

contract.vertices is indeed the right function to try, but its API is a bit complicated because it is designed to merge not just a single pair of nodes but several groups of nodes in a single pass. (It can also permute vertices.) To this end, it requires a mapping from the old vertex IDs to the new ones.

In case you are unfamiliar with vertex IDs: igraph identifies each vertex of the graph with an integer in the range 1 to N, where N is the number of vertices. The mapping that contract.vertices requires must be a numeric vector of length N whose i-th element contains the new ID of the node that had ID i before merging.
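As a minimal sketch of how vertex IDs relate to the id attribute from the question, the snippet below looks up the positions of the two nodes in a toy graph. The four-vertex graph and the two example.com-style URLs are made up for illustration; only the two police.uk URLs come from the question.

```r
library(igraph)

# Toy stand-in for G: four vertices carrying an `id` attribute.
# The a.example / b.example entries are invented for this sketch.
g <- make_empty_graph(n = 4, directed = TRUE)
V(g)$id <- c("http://a.example/", "http://www.police.uk/",
             "http://b.example/", "http://police.uk/")

# A vertex ID is simply the position of the vertex in V(g), from 1 to N.
keep <- match("http://www.police.uk/", V(g)$id)  # 2
drop <- match("http://police.uk/", V(g)$id)      # 4

# Identity mapping of length N: element i is the new ID of old vertex i.
mapping <- seq_len(vcount(g))
```

On the real graph the same lookup should return 8 and 14, matching the indices shown in the question.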

Suppose that your graph contains 10 nodes. The following mapping vector will simply map each node to the same ID that it already has, so it will not do any merging:

c(1,2,3,4,5,6,7,8,9,10)

Now, suppose that you want to merge node 7 into node 4. You have to tell igraph that the new ID of node 7 will be 4, so you have to change the 7th element in the above vector to 4:

c(1,2,3,4,5,6,4,8,9,10)

This almost does the job. The problem is that the new vertex IDs must form a contiguous range starting at 1, and the mapping above still refers to ID 10, so the result still has 10 vertices: igraph does not delete the old node 7, which is left behind without any edges. You can either delete it manually with delete.vertices after you have contracted the vertices, or you can specify a different mapping that not only merges node 7 into node 4 but also changes the ID of node 8 to 7, node 9 to 8 and node 10 to 9:

c(1,2,3,4,5,6,4,7,8,9)
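The two routes described above can be compared on a small example. This is a sketch on a toy 10-vertex ring, not on the graph from the question. It assumes igraph 1.0 or later, where contract.vertices and delete.vertices are also available under the names contract() and delete_vertices().

```r
library(igraph)

g <- make_ring(10)  # toy undirected ring 1-2-...-10-1, no vertex attributes

# Route A: merge 7 into 4 without renumbering, then remove the leftover vertex.
a <- contract(g, c(1, 2, 3, 4, 5, 6, 4, 8, 9, 10))
degree(a)[7]                 # 0: old vertex 7 is still there, but has no edges
a <- delete_vertices(a, 7)

# Route B: merge and renumber in a single mapping.
b <- contract(g, c(1, 2, 3, 4, 5, 6, 4, 7, 8, 9))

c(vcount(a), vcount(b))      # both graphs end up with 9 vertices
```

Route B avoids the extra deletion step, at the cost of having to build the renumbered mapping.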

Now, since you also want the Number of Pages attribute of the new node to be the sum of the values of the two old nodes, you must tell igraph what to do with the vertex attributes during the merge. The vertex.attr.comb parameter of contract.vertices serves this purpose. In your case, the value of vertex.attr.comb should be something like this:

list("Number of Pages"="sum", "first")

where "Number of Pages"="sum" means that the new value of the Number of Pages attribute should be calculated by summing the old attribute values, and "first" means that for all other attributes not mentioned here, the new value should be determined by the old value of the first node among the set of nodes that are merged into a single one. See ?attribute.combination in R for more details about the format of this argument.
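Putting the pieces together, here is a hedged end-to-end sketch on a made-up five-vertex graph. The edges, page counts and example.com-style URLs are invented; the variable names keep and drop are illustrative; and contract() is the name that igraph 1.0 and later give to contract.vertices.

```r
library(igraph)

# Toy directed graph standing in for G (edges and attribute values are made up).
g <- make_graph(c(1,2, 3,4, 4,5, 2,5, 5,1), directed = TRUE)
V(g)$id <- c("http://a.example/", "http://www.police.uk/",
             "http://b.example/", "http://police.uk/", "http://c.example/")
g <- set_vertex_attr(g, "Number of Pages", value = c(10, 120, 5, 30, 7))

keep <- 2  # vertex that survives the merge
drop <- 4  # vertex that is merged into `keep`

# Build the mapping: identity, redirect `drop`, then close the gap in the IDs.
mapping <- seq_len(vcount(g))
mapping[drop] <- keep
shift <- mapping > drop
mapping[shift] <- mapping[shift] - 1

merged <- contract(g, mapping,
                   vertex.attr.comb = list("Number of Pages" = "sum", "first"))

vcount(merged)                              # 4
vertex_attr(merged, "Number of Pages")[2]   # 150 = 120 + 30
```

Contraction keeps every edge, so links that both old nodes shared with a third node become parallel edges, and links between the two old nodes become self-loops. If that is not wanted, igraph's simplify() function is one way to collapse them afterwards.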

Tamás answered Nov 02 '22 16:11