
What is the most efficient way to define a very sparse network matrix in Julia?

I have data for a very large network that is quite sparse. What is the most memory-efficient way to store it while still making it easy to check whether two nodes are connected?

Obviously, with N nodes, keeping an N*N matrix is not efficient in terms of space. So I thought of keeping an adjacency list instead, like below:

Vector{Vector{Int64}}(undef, N_tmp)  # written as Array(Vector{Int64}, N_tmp) before Julia 1.0

Where N_tmp <= N, as many nodes may not have any connections.

Could you tell me whether there are better approaches, or packages that do better in terms of memory use and access speed?

A.Yazdiha asked Oct 27 '16 15:10



1 Answer

In LightGraphs.jl, we use adjacency lists (basically, a vector of vectors) to store neighbors for each node. This provides very good memory utilization for large sparse graphs, allowing us to scale to hundreds of millions of nodes on commodity hardware, while providing fast access that beats the native sparse matrix data structure for most graph operations.
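For comparison, here is a minimal sketch of the native sparse matrix alternative mentioned above, using Julia's `SparseArrays` standard library. The edge list, the node count `n`, and the helper names `has_link` and `neighbors_of` are made up for illustration, and the graph is assumed to be undirected.

```julia
using SparseArrays

# Hypothetical undirected edge list for a 6-node network (illustration only).
edges = [(1, 2), (1, 5), (2, 3), (4, 6)]
n = 6

# Store each edge in both directions so the matrix is symmetric.
rows = vcat(first.(edges), last.(edges))
cols = vcat(last.(edges), first.(edges))
A = sparse(rows, cols, fill(true, length(rows)), n, n)

# Hypothetical helpers: an edge test and a neighbor listing.
has_link(A, u, v) = A[u, v]
# Row indices stored in column v, i.e. the neighbors of v in a symmetric matrix.
neighbors_of(A, v) = rowvals(A)[nzrange(A, v)]
```

Only the stored entries take up memory, so storage grows with the number of edges rather than with N*N.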

You might consider whether LightGraphs will meet your needs directly.

Edit with additional information: we store a sorted list of neighbors for each node. This costs some performance on edge creation, but makes subsequent lookups much faster.
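The idea of sorted neighbor lists can be sketched in plain Julia as follows. This is not LightGraphs's actual code: the helper names `add_edge_sorted!` and `has_edge_sorted` are hypothetical, and the graph is assumed to be undirected with nodes numbered `1:n`.

```julia
# Hypothetical helpers for illustration; not the LightGraphs API.

# Insert the undirected edge (u, v), keeping each neighbor list sorted.
function add_edge_sorted!(adj::Vector{Vector{Int}}, u::Int, v::Int)
    for (a, b) in ((u, v), (v, u))
        nbrs = adj[a]
        idx = searchsortedfirst(nbrs, b)   # binary search for the insertion point
        # Insert only if b is not already present, so lists stay duplicate-free.
        if idx > length(nbrs) || nbrs[idx] != b
            insert!(nbrs, idx, b)          # shifts later elements: the edge-creation cost
        end
    end
    return adj
end

# Check whether u and v are connected with a binary search over u's neighbors.
function has_edge_sorted(adj::Vector{Vector{Int}}, u::Int, v::Int)
    nbrs = adj[u]
    idx = searchsortedfirst(nbrs, v)
    return idx <= length(nbrs) && nbrs[idx] == v
end

n = 6
adj = [Int[] for _ in 1:n]   # one (initially empty) neighbor list per node
for (u, v) in [(1, 5), (1, 2), (2, 3), (4, 6)]
    add_edge_sorted!(adj, u, v)
end
```

With this layout, `has_edge_sorted(adj, 1, 5)` needs only a binary search over the neighbors of node 1, while each insertion may have to shift elements to keep the list ordered.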

sbromberger answered Sep 29 '22 09:09