Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

STL What is Random access and Sequential access?

So I am curious to know, what is random access?

I searched a little bit, and couldn't find much. The understanding I have now is that the "blocks" in the container are placed randomly (as seen here). Random access then means I can access every block of the container no matter what position (so I can read what it says on position 5 without going through all blocks before that), while with sequential access, I have to go through 1st , 2nd, 3rd and 4th to get to the 5th block.

Am I right? Or if not, then can someone explain to me what random access is and sequential access is?

like image 200
Sumsar1812 Avatar asked Jun 10 '14 17:06

Sumsar1812


2 Answers

Sequential access means the cost of accessing the 5th element is 5 times the cost of accessing the first element, or at least that there is an increasing cost associated with an elements position in the set. This is because to access the 5th element of the set, you must first perform an operation to find the 1st, 2nd, 3rd, and 4th elements, so accessing the 5th element requires 5 operations.

Random access means that accessing any element in the set has the same cost as any other element in the set. Finding the 5th element of a set is still only a single operation.

So accessing a random element in a random access data-structure is going to have O(1) cost whereas accessing a random element in a sequential data-structure is going to have a O(n/2) -> O(n) cost. The n/2 comes from that if want to access a random element in a set 100 times, the average position of that element is going to be about halfway through the set. So for a set of n elements, that comes out to n/2 (Which in big O notation can just be approximated to n).

Something you might find cool:

Hashmaps are an example of a data structure which implements random access. A cool thing to note is that on hash collisions in a hash map, the two collided elements are stored in a sequential linked list in that bucket on the hash map. So that means that if you have 100% collisions for a hash map you actually end up with sequential storage.

Here's an image of a hashmap illustrating what I'm describing:

Hashmap

This means the worst case scenario for a hash map is actually O(n) for accessing an element, the same as average case for sequential storage, or put more correctly, finding an element in a hashmap is Ω(n), O(1), and Θ(1). Where Ω is worst case, Θ is best case, and O is average case.

So:

Sequential access: Finding a random element in a set of n elements is Ω(n), O(n/2), and Θ(1) which for very large numbers becomes Ω(n), O(n), and Θ(1).

Random access: Finding a random element in a set of n elements is Ω(n/2), O(1), and Θ(1) which for very large numbers becomes Ω(n), O(1), and Θ(1)

So random access has the benefit of giving better performance for accessing elements, however sequential storage data structures provide benefits in other areas.

Second Edit For @sumsar1812:

I want to preface with this is how I understand the advantages/use cases of sequential storage, but I am not as certain about my understanding of benefits of sequential containers as I am about my answer above. So please correct me anywhere I am mistaken.

Sequential storage is useful because the data will actually be stored sequentially in memory.

You can actually access the next member of a sequentially stored data set by offsetting a pointer to the previous element of that set by the amount of bytes it takes to store a single element of that type.

So since a signed int requires 8 bytes to store, if you have a fixed array of integers with a pointer pointing to the first integer:

int someInts[5];
someInts[1] = 5;

someInts is a pointer pointing to the first element of that array. Adding 1 to that pointer just offsets where it points to in memory by 8 bytes.

(someInts+1)* //returns 5

This means if you need to access every element in your data structure in that specific order its going to be much faster since each lookup for sequential storage is just adding a constant value to the pointer.

For random access storage, there is no guarantee that each element is stored even near the other elements. This means each lookup will be more expensive that just adding a constant amount.

Random access storage containers can still simulate what appears to be an ordered list of elements by using an iterator. However, as long as you allow random access lookups for elements there will be no guarantee that those elements are stored sequentially in memory. This means that even though a container can exhibit behavior of both a random access container and a sequential container, it will not exhibit the benefits of a sequential container.

So if the order of the elements in your container is supposed to be meaningful, or you plan on iterating and operating on every element in a data set then you might benefit from a sequential container.

In truth it still gets a little more complicated because a linked list, which is a sequential container doesn't actually store sequentially in memory whereas a vector, another sequential container, does. Here's a good article that explains use cases for each specific container better than I can.

like image 198
echochamber Avatar answered Oct 13 '22 21:10

echochamber


There are two main aspects to this, and it's unclear which of the two is more relevant to your question. One of those aspects is accessing the content of an STL container via iterators, where those iterators allow either random access or forward (sequential) access. The other aspect is that of accessing a container or even just memory itself in random or sequential orders.

Iterators - Random Access vs. Sequential Access

To start with iterators, take two examples: std::vector<T> and std::list<T>. A vector stores an array of values, whereas a list stores a linked list of values. The former is stored sequentially in memory, and this allows arbitrary random access: calculating the location of any element is just as fast as calculating the location of the next element. Thus the sequential storage gives you efficient random access, and the iterator is a random access iterator.

By contrast, a list performs a separate allocation for each node, and each node only knows where its neighbors are. Thus calculating the location of a random non-neighbor node cannot be done directly. Any attempt to do so must traverse all the intermediate nodes, and thus algorithms that attempt to skip nodes may perform badly. The non-sequential storage yields randomized locations and thus only efficient sequential access. Thus the iterator that list provides is a bidirectional iterator, one of a few different sequential iterators.

Memory - Random Access vs. Sequential Access

However there's another wrinkle in your question. The iterator parts only address the traversal of the container. Underneath that, however, the CPU will be accessing memory itself in a particular pattern. While at a high level the CPU is capable of addressing any random address with no overhead of calculating where it is (it's like a big vector), in practice reading memory involves caching and lots of subtleties that make accessing different parts of memory take different amounts of time.

For example, once you start working with a rather large data set, even if you're working with a vector, it's more efficient to access all elements in sequential order than to access all elements in some random order. By contrast a list doesn't make this possible. Since the nodes of a list aren't even necessarily located sequential memory locations, even a sequential access of the list's items may not read memory sequentially, and can be more expensive because of this.

like image 34
Michael Urman Avatar answered Oct 13 '22 21:10

Michael Urman