
Is there an R function to get the unique edges in an undirected (not directed) network?

Tags: r, edges

I want to count the number of unique edges in an undirected network, e.g., net:

   x  y
1  A  B
2  B  A
3  A  B

There should be only one unique edge in this edge list, because A-B and B-A are the same edge in an undirected network.

For a directed network I can get the number of unique edges with:

nrow(unique(net[, c("x", "y")]))

But this doesn't work for an undirected network, because it counts A-B and B-A as different edges.

Z Qu asked Apr 09 '19 13:04


3 Answers

Given that you are working with networks, an igraph solution:

library(igraph)

as_data_frame(simplify(graph_from_data_frame(dat, directed=FALSE)))

Here dat is the edge list (net in the question). Call nrow() on the result to count the unique edges.


Explanation

dat %>% 
  graph_from_data_frame(., directed=FALSE) %>% # convert to undirected graph
  simplify %>%                                 # remove loops / multiple edges
  as_data_frame                                # return remaining edges
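
For reference, here is a minimal end-to-end sketch of the same igraph approach applied to the question's three-row edge list. The variable names g, unique_edges and n_unique are only illustrative. Note that simplify() also drops self-loops by default, so an A-A edge would not be counted.

```r
library(igraph)

# The question's edge list; `net` is the name used in the question.
net <- data.frame(x = c("A", "B", "A"), y = c("B", "A", "B"))

g <- graph_from_data_frame(net, directed = FALSE)  # build an undirected graph
g <- simplify(g)                                   # drop multi-edges and self-loops

# The igraph:: prefix avoids clashes with as_data_frame() from other packages.
unique_edges <- igraph::as_data_frame(g, what = "edges")
n_unique <- nrow(unique_edges)
```

ecount(g) on the simplified graph should give the same count without building the data frame.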
user20650 answered Oct 11 '22 11:10


Try this,

df <- data.frame(x=c("A", "B", "A"), y = c("B", "A", "B"))
unique(apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),collapse = " ")))
[1] "A B"

So how does this work?

  1. apply() calls the function on each row of the data frame, so we can look at one row at a time. Take the second row of df:

    df[2,]
      x y
    2 B A
    
  2. We then split the row with strsplit() and unlist() the result into a vector of node names (as.matrix is used here to extract the elements of the row):

    unlist(strsplit(as.matrix(df[2,]), " "))
    [1] "B" "A"
    
  3. Use sort() to put the names in alphabetical order, then paste() them back together:

    paste(sort(unlist(strsplit(as.matrix(df[2,]), " "))), collapse = " ")
    [1] "A B"
    

apply() does this for every row because its second argument (MARGIN) is set to 1, and unique() then keeps one copy of each sorted label, i.e. one per undirected edge.
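
The steps above can be wrapped into a short base R sketch that also returns the count the question asks for. edge_key is a hypothetical helper name, and the strsplit() step is left out on the assumption that each cell holds exactly one node name.

```r
# The question's example edge list
df <- data.frame(x = c("A", "B", "A"), y = c("B", "A", "B"))

# edge_key is a hypothetical helper: sort the endpoints of one row so that
# A-B and B-A produce the same label.
edge_key <- function(row) paste(sort(row), collapse = " ")

keys <- apply(df, 1, edge_key)   # one label per row of df
unique_edges <- unique(keys)     # one label per undirected edge
n_unique <- length(unique_edges) # number of unique undirected edges
```

Since the labels are built with a space separator, this sketch assumes node names do not themselves contain spaces.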

Extension

This can be extended to n variables, for example n=3,

df <- data.frame(x=c("A", "B", "A"), y = c("B", "A", "B"),  z = c("C", "D", "D"))
unique(apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),collapse = " ")))
[1] "A B C" "A B D"

Node names longer than one letter work the same way, for example:

df <- data.frame(x=c("A", "BC", "A"), y = c("B", "A", "BC"))
df
   x  y
1  A  B
2 BC  A
3  A BC
unique(apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),collapse = " ")))
[1] "A B"  "A BC"

Old version

Using the tidyverse, define a function called rev that sorts the node names within each comma-separated label (note that this name masks base::rev). Then use mutate() to create a new column that pastes x and y together in the "x, y" form rev expects, pass that column through rev, and take the unique values.

library(tidyverse)
rev <- function(x){
  unname(sapply(x, function(x) {
    paste(sort(trimws(strsplit(x[1], ',')[[1]])), collapse=',')} ))
}
df <- data.frame(x=c("A", "B", "A"), y = c("B", "A", "B"))
rows <- df %>% 
  mutate(both = c(paste(x, y, sep = ", ")))

unique(rev(rows$both))
Hector Haffenden answered Oct 11 '22 12:10


Here is a solution that does not need igraph, all inside one pipe:

library(dplyr)
df = tibble(x=c("A", "B", "A"), y = c("B", "A", "B"))

Group the rows with group_by(), then sort() the node names of each group and paste() them into a new column via mutate(). The unique() call handles "true" duplicates: repeated A-B rows fall into the same group, and unique() keeps each node name only once in the label.

df %>%
  group_by(x, y) %>%
  mutate(edge_id = paste(sort(unique(c(x,y))), collapse=" ")) 

Once the sorted edge names are in a new column, it is straightforward to count the unique values or to filter duplicates out of the data frame.
If you have additional variables for the edges, just add them to the grouping.
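
As a rough sketch of that last step, the edge_id column can be fed to n_distinct() to count edges, or to distinct() to drop duplicate rows. The names edges, n_unique and deduped are only illustrative, and the ungroup() call is an addition so that the later steps do not operate per group.

```r
library(dplyr)

df <- tibble(x = c("A", "B", "A"), y = c("B", "A", "B"))

edges <- df %>%
  group_by(x, y) %>%
  mutate(edge_id = paste(sort(unique(c(x, y))), collapse = " ")) %>%
  ungroup()                                   # drop the grouping before counting

n_unique <- n_distinct(edges$edge_id)         # number of unique undirected edges
deduped  <- distinct(edges, edge_id, .keep_all = TRUE)  # one row per edge
```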

perechen answered Oct 11 '22 11:10