There are at least two ways to represent a linked list:
1) An array-based representation of the linked list, where we keep an std::vector of structs of the form:

    struct Node {
        T item;       // whatever type you want
        int nextItem; // index of the next element, or -1 at the end of the list
    };

Here, inserting into the list is a push_back() on the vector plus setting nextItem to the appropriate index.
2) A pointer-based representation, in which you have a collection of node structs scattered all over RAM. Here an insert is done with the C++ operator new, as sketched below.
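For concreteness, a minimal sketch of this second representation (the Node layout and the pushFront helper are illustrative names, not taken from the question):

    #include <iostream>

    // Pointer-based singly linked list: every node is a separate heap
    // allocation, placed wherever the allocator chooses.
    struct Node {
        int item;
        Node* next;
    };

    Node* pushFront(Node* head, int value) {
        return new Node{value, head};  // one call to new per inserted element
    }

    int main() {
        Node* head = nullptr;
        for (int i = 1; i <= 3; ++i) head = pushFront(head, i);
        for (Node* p = head; p != nullptr; p = p->next) std::cout << p->item << ' ';
        std::cout << '\n';  // prints: 3 2 1
        while (head != nullptr) {  // free the nodes
            Node* next = head->next;
            delete head;
            head = next;
        }
    }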
Is it correct to say that the first method is more efficient, since all the items sit in consecutive memory locations, and that because of this one might be able to grow the linked list to a much larger size than with the second method?
In the second method a huge linked list might suffer from memory fragmentation, because of which one might run out of allocatable memory (a std::bad_alloc from new, rather than a segmentation fault) much earlier.
A related space-saving technique is the memory-efficient doubly linked list, better known as the XOR linked list, which uses the bitwise XOR operation to save the space of one address per node: instead of storing the actual addresses of its neighbors, every node stores the XOR of the addresses of the previous and next nodes.
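A minimal sketch of the XOR trick, assuming each node keeps a single link field holding prev XOR next (the XorNode layout and xorPtr helper are illustrative names):

    #include <cstdint>
    #include <iostream>

    struct XorNode {
        int item;
        XorNode* link;  // XOR of the previous and next nodes' addresses
    };

    // XOR two node addresses; going through std::uintptr_t keeps the bit
    // operation well-defined.
    XorNode* xorPtr(XorNode* a, XorNode* b) {
        return reinterpret_cast<XorNode*>(
            reinterpret_cast<std::uintptr_t>(a) ^ reinterpret_cast<std::uintptr_t>(b));
    }

    int main() {
        // Build a three-node list: a <-> b <-> c
        XorNode a{1, nullptr}, b{2, nullptr}, c{3, nullptr};
        a.link = xorPtr(nullptr, &b);
        b.link = xorPtr(&a, &c);
        c.link = xorPtr(&b, nullptr);

        // Forward traversal: next = prev XOR link. The same loop walks
        // backwards if started from the tail.
        XorNode* prev = nullptr;
        for (XorNode* cur = &a; cur != nullptr; ) {
            std::cout << cur->item << ' ';
            XorNode* next = xorPtr(prev, cur->link);
            prev = cur;
            cur = next;
        }
        std::cout << '\n';  // prints: 1 2 3
    }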
Arrays allow random access and require less memory per element (no space is needed for pointers), but they are inefficient for insertion/deletion, and their memory must be allocated up front or reallocated on growth. Linked lists, on the contrary, are dynamic and have faster insertion/deletion.
Also, the better cache locality of arrays (due to contiguous memory allocation) can significantly improve performance. As a result, some operations (such as modifying a given element) are faster on arrays, while others (such as inserting or deleting an element) are faster on linked lists.
A singly linked list is preferred when we need to save memory and traversal in one direction suffices, since each node stores only a single pointer. If we need better performance when searching back and forth through the list and memory is not a limitation, a doubly linked list is preferred.
I'll go against everyone else here and say that, yes, the first approach might end up being more efficient. In the second approach you make O(N) heap allocations, N being the number of nodes in the list. With a vector you make only O(log N) heap allocations, because the vector grows geometrically and reallocates only when its capacity is exhausted.
Also, if you're on a 64-bit machine, the overhead of storing a pointer in each node may be a bit too much if you're dealing with lots of small items. With a vector you can use a smaller nextItem, e.g. 32-bit instead of 64-bit; if you're making a list to hold 32-bit ints, that is a 1.5x improvement in memory usage (8 bytes per node instead of 12, ignoring alignment padding).
Another possible optimization: if you know up front that you'll be dealing with a lot of elements, you can reserve a big vector and get by with a single heap allocation for a pretty long time.
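Putting those points together, here is a minimal sketch of the vector-backed list with a 32-bit nextItem index and an up-front reserve(); the names IndexList, NPOS, and pushFront are illustrative, not from the answer:

    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <vector>

    template <typename T>
    class IndexList {
    public:
        static constexpr std::int32_t NPOS = -1;  // "null" index

        // One heap allocation up front instead of one per node.
        void reserve(std::size_t n) { nodes_.reserve(n); }

        // Insert a new head element; returns its index in the vector.
        std::int32_t pushFront(const T& value) {
            nodes_.push_back({value, head_});
            head_ = static_cast<std::int32_t>(nodes_.size()) - 1;
            return head_;
        }

        void print() const {
            for (std::int32_t i = head_; i != NPOS; i = nodes_[i].next)
                std::cout << nodes_[i].item << ' ';
            std::cout << '\n';
        }

    private:
        struct Node {
            T item;
            std::int32_t next;  // 32-bit index instead of a 64-bit pointer
        };
        std::vector<Node> nodes_;
        std::int32_t head_ = NPOS;
    };

    int main() {
        IndexList<int> list;
        list.reserve(1000);
        for (int i = 1; i <= 3; ++i) list.pushFront(i);
        list.print();  // prints: 3 2 1
    }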
I recently took a course on applications of automata, and the lecturer implemented some of the algorithms on pretty large data sets. One of the techniques he told us about was exactly your first approach to representing a linked list. For a piece of coursework I tried implementing it both ways (with pointers, and with a vector plus a nextItem field), and the vector version behaved much better (it did have other optimizations too, but the vector definitely had an effect).
NOTE TO OTHERS
I think what @smilingbuddha is asking about is more like a collection of linked lists - or at least that's what I've used it for. For example, when you store a graph as adjacency lists, you need a linked list (or array, or whatever) of all the neighbors of each node. So instead of keeping an array of linked lists or a vector of vectors, you keep a single pool of entries plus an array of indexes, each pointing to the last inserted neighbor of the corresponding node, as sketched below.
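A minimal sketch of that layout for a directed graph, sometimes called a linked adjacency list or forward-star representation (the names Graph, Edge, head, and addEdge are illustrative):

    #include <iostream>
    #include <vector>

    struct Graph {
        struct Edge {
            int to;    // neighbor vertex
            int next;  // index of the previously inserted edge of the same vertex, or -1
        };
        std::vector<int> head;    // head[v] = index of v's most recently added edge
        std::vector<Edge> edges;  // one shared pool backing all per-vertex lists

        explicit Graph(int n) : head(n, -1) {}

        void addEdge(int from, int to) {
            edges.push_back({to, head[from]});
            head[from] = static_cast<int>(edges.size()) - 1;
        }

        void printNeighbors(int v) const {
            for (int e = head[v]; e != -1; e = edges[e].next)
                std::cout << edges[e].to << ' ';
            std::cout << '\n';
        }
    };

    int main() {
        Graph g(3);
        g.addEdge(0, 1);
        g.addEdge(0, 2);
        g.addEdge(1, 2);
        g.printNeighbors(0);  // prints: 2 1  (most recently inserted first)
    }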