What kind of datastructure / algorithm should I use for looking up which section currently in, given a list of endpoints of each section?
For example, if I have a webpage with section headers and content,
and I'm currently at 130px, it should return that I'm currently in "Section 1".
Binary search through array of endpoints
from bisect import bisect_left
arr = [100, 350, 700, 1200]
pos = bisect_left(arr, 130, 0, arr[-1])
However, this could still take O(log n) for every change in position.
Hash table lookup of current location,
lookup = {0: "Introduction"
1: "Introduction"
...
10: "Section 1"
11: "Section 1"
...
}
section = lookup[130/10]
This is fast, but it wastes a lot of space
Is there a general data structure / algorithm that deals with this type of problem?
I like your first option, binary searches are very efficient for scanning and as you say the second option is not space efficient.
The traditional and very generic solution that scales in computer graphics is a 2d k-tree, which creates a tree that can be looked up by coordinates without wasting memory. Specifically its search, removal and insert complexities are all O(log n) and its space complexity is O(n).
Given that you are only doing one axis though and a web page will tend to have 1-100 sections (and is unlikely to have thousands, let alone millions, or billions of sections) then I would personally consider going with a very simple array and then move to a more complex k-tree when there is a measurable benefit/need. If you are writing this in C or another language that gives you some control on the memory layout, then an array of structures would be exceptionally fast to scan due to the design of modern cpus and memory hierarchies (specifically prefetchers and caching).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With