Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Advanced data structures in practice

In the 10 years I've been programming, I can count the number of data structures I've used on one hand: arrays, linked lists (I'm lumping stacks and queues in with this), and dictionaries. This isn't really surprising given that nearly all of the applications I've written fall into the forms-over-data / CRUD category.

I've never needed to use a red-black tree, skip list, double-ended queue, circularly linked list, priority queue, heaps, graphs, or any of the dozens of exotic data structures that have been researched in the past 50 years. I feel like I'm missing out.

This is an open-ended question, but where are these "exotic" data structures used in practice? Does anyone have any real-world experience using these data structures to solve a particular problem?

like image 579
Juliet Avatar asked Dec 23 '08 15:12

Juliet


People also ask

What are the advanced data structures?

Advanced Data structures are one of the essential branches of data science which is used for storage, organization and management of data and information for efficient, easy accessibility and modification of data. They are the basic element for creating efficient and effective software design and algorithms.


1 Answers

Some examples. They're vague because they were work for employers:

  • A heap to get the top N results in a Google-style search. (Starting from candidates in an index, go through them all linearly, sifting them through a min-heap of max size N.) This was for an image-search prototype.

  • Bloom filters cut the size of certain data about what millions of users had seen down to an amount that'd fit in existing servers (it all had to be in RAM for speed); the original design would have needed many new servers just for that database.

  • A triangular array representation halved the size of a dense symmetrical array for a recommendation engine (RAM again for the same reason).

  • Users had to be grouped according to certain associations; union-find made this easy, quick, and exact instead of slow, hacky, and approximate.

  • An app for choosing retail sites according to drive time for people in the neighborhood used Dijkstra shortest-path with priority queues. Other GIS work took advantage of quadtrees and Morton indexes.

Knowing what's out there in data-structures-land comes in handy -- "weeks in the lab can save you hours in the library". The bloom-filter case was only worthwhile because of the scale: if the problem had come up at a startup instead of Yahoo, I'd have used a plain old hashtable. The other examples I think are reasonable anywhere (though nowadays you're less likely to code them yourself).

like image 65
Darius Bacon Avatar answered Oct 04 '22 11:10

Darius Bacon