I am using the R interface of igraph
to generate a random directed network (Erdös-Rényi) with a constant number of nodes, n
, and edges, m
, using the function sample_gnm
.
To make sure I understand which is the algorithm used, I checked the C source code although I have no experience in C.
As far as I understand the C code, there is an if
statement that should lead to an over-representation of nodes with index n
receiving directed edges.
This is the real code: https://github.com/igraph/igraph/blob/7d4be976481356fa673772e6e7c30b637ea8dd52/src/games.c#L734-L736 and this is how I understand the C code in pseudo-code:
# What is the maximum number of edges a network with n nodes could have
maxEdges := n*(n-1)
s := uniformly sample m integers from [1, maxEdges] without replacement
for (i = 1; i = m; i++) {
# Get IDs for nodes A and B with equal probability over n
nodeA := floor(s[i] / (n)) + 1
nodeB := s - ((nodeA - 1) * n)
# Since we do not allow loops, if nodeA = nodeB, assign n to nodeB
if (nodeA = nodeB) {
nodeB := n
}
}
However, I also run a simulation in R to make sure that this is really the case:
testFun = function(n,m) {
# Generate the network
g = sample_gnm(n, m, directed = TRUE, loops = FALSE)
# Find the "to" node IDs
toEdgename = ends(g, E(g))[, 2]
return(toEdgename)
}
# Create 1000 random networks and get the "to" node name for each edge
spam = replicate(1000, testFun(100, 9000))
# Plot the histogram
hist(sapply(1:ncol(spam),
# Count the percent of times the index 100 appeared per simulation
function(ii) sum(spam[, ii] == 100) / 9000),
100)
To my surprise, it does not lead to an observable bias.
This must mean that I do not comprehend what the C code is doing.
Could anyone please help me understand why does this piece of C code not lead to over-representation of the n
index?
The reason is that nodeB
in your pseudo-code can never be n
(or, in the C code, it can never be no_of_nodes - 1
. (However, nodeA
can be n
!)
In fact, the maximum value for nodeB
is given by maxEdges (mod n−1), and values in mod n−1 are in the range [0, n−1[; note that the upper bound is exclusive.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With