Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Large Graph Representation in C++

There might be similar questions but I still have some parts that I couldn't figure out. I'm trying represent an undirected graph with no weights but just 1 for connected and 0 for not connected. I'm trying to represent a graph (reading from a file) which have 80500 nodes and over 5.5 million edges. I was wondering;

  1. Is it going to be a huge impact if I change my adjacency matrix(the one that I'm currently using) to a adjacency list. I have no problem with the implementation just asking will it worth the time to convert it to list?
  2. Since I just store 1 and 0 is there a special data type no store this. I'm using in and I guess a byte data type would save a lot of time.
  3. Any other structure than adjacency matrix or list that could be better for this typical problem?
like image 202
Ali Avatar asked Mar 28 '12 17:03

Ali


People also ask

What is considered a large graph?

'large graphs' are graphs the exploration of which require long computation times on a typical quad-core machine compared to what people judge reasonable (an hour, a day.

What is graph representation in C language?

A graph consists of a set of nodes or vertices together with a set of edges or arcs where each edge joins two vertices. Unless otherwise specified, a graph is undirected: each edge is an unordered pair {u,v} of vertices, and we don't regard either of the two vertices as having a distinct role from the other.

What are the different representations of graphs in memory?

A graph can be represented using 3 data structures- adjacency matrix, adjacency list and adjacency set.

What is adjacency matrix in C?

Adjacency Matrix is a 2D array of size V x V where V is the number of vertices in a graph. Let the 2D array be adj[][], a slot adj[i][j] = 1 indicates that there is an edge from vertex i to vertex j. Adjacency matrix for undirected graph is always symmetric. Adjacency Matrix is also used to represent weighted graphs.


2 Answers

Adjacency lists are wayyyy more better space-wise. Because then you just need to save 5.5 million * 2 numbers = 11 000 000 integers. Assuming you save short integers (2 bytes), then you need 22 000 000 bytes.

If you represent it using adjacency matrix, then you need to save 80500 * 80500 = 6 480 250 000 elements. Even is you save them as bytes, having 22 million bytes is much better than having over 6 billion of them.

EDIT: If you save eges as two 4-byte integers, then you have 44 000 000 bytes. If you save the matrix very efficiently with bit fiddling, then you can save 8 elements in one byte. But is means you still need to have 810 031 250 bytes. Not that large difference now, but it's still 20 times more.

like image 159
Jakub Zaverka Avatar answered Sep 23 '22 22:09

Jakub Zaverka


If your data is not sparse then you may not get as much space savings with an adjacency list. You could use an adjacency matrix with compressed or encoded rows (or columns, but your graph is unidirected, so compressing rows is probably more natural for lookups). With compression you will reduce space, at the time cost of decompressing rows on lookup.

like image 29
Alex Reynolds Avatar answered Sep 23 '22 22:09

Alex Reynolds