I have a graph ( adjacency_list (listS, vecS, bidirectionalS, VertexVal) ) in which I need to delete 100,000+ nodes. Each node also contains a structure of 2 64-bit integers and another 64-bit integer. The guid check that happens in the code below is checking 1st integer in the structure.
On my laptop ( i7 2.7GHz, 16GB RAM ) it takes about 88 seconds according to VTune.
Following is how I delete the nodes:
vertex_iterator vi,vi_end;
boost::tie(vi, vi_end) = boost::vertices(m_graph);
while (vi!=vi_end) {
if (m_graph[*vi].guid.part1 == 0) {
boost::remove_vertex(*vi,m_graph);
boost::tie(vi, vi_end) = boost::vertices(m_graph);
} else
++vi;
}
Vtune shows that the boost::remove_vertex() call takes 88.145 seconds. Is there a more efficient way to delete these vertices?
In your removal branch you re-tie()
the iterators:
boost::tie(vi, vi_end) = boost::vertices(m_graph);
This will cause the loop to restart every time you restart the loop. This is exactly Schlemiel The Painter.
I'll find out whether you can trust remove_vertex
not triggering a reallocation. If so, it's easily fixed. Otherwise, you'd want an indexer-based loop instead of iterator-based. Or you might be able to work on the raw container (it's a private member, though, as I remember).
Update Using vecS
as the container for vertices is going to cause bad performance here:
If the
VertexList
template parameter of theadjacency_list
wasvecS
, then all vertex descriptors, edge descriptors, and iterators for the graph are invalidated by this operation. <...> If you need to make frequent use of theremove_vertex()
function thelistS
selector is a much better choice for theVertexList
template parameter.
This small benchmark test.cpp
compares:
with -DSTABLE_IT
(listS
)
$ ./stable
Generated 100000 vertices and 5000 edges in 14954ms
The graph has a cycle? false
starting selective removal...
Done in 0ms
After: 99032 vertices and 4916 edges
without -DSTABLE_IT
(vecS
)
$ ./unstable
Generated 100000 vertices and 5000 edges in 76ms
The graph has a cycle? false
starting selective removal...
Done in 396ms
After: 99032 vertices and 4916 edges
using filtered_graph
(thanks @cv_and_he
in the comments)
Generated 100000 vertices and 5000 edges in 15ms
The graph has a cycle? false
starting selective removal...
Done in 0ms
After: 99032 vertices and 4916 edges
Done in 13ms
You can clearly see that removal is much faster for listS
but generating is much slower.
I was able to successfully serialize the graph using Boost serialization routines into a string, parse the string and remove the nodes I didn't need and de-serialize the modified string. For 200,000 total nodes in graph and 100,000 that needs to be deleted I was able to successfully finish the operation in less than 2 seconds.
For my particular use-case each vertex has 3 64bit integers. When it needs to be deleted, I mark 2 of those integers as 0s. A valid vertex would never have a 0. When the point comes to clean up the graph - to delete the "deleted" vertices, I follow the above logic.
In the code below removeDeletedNodes() does the string parsing and removing the vertices and mapping the edge numbers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With