Changing data structure representation at runtime: looking for other examples

Tags: algorithm, data-structures, self-updating

Which programs or algorithms change the representation of their data structures at runtime in order to obtain better performance?

Context: Data structures "define" how real-world concepts are structured and represented in computer memory. For different kinds of computations a different data structure should/can be used to achieve acceptable performance (e.g., linked-list versus array implementation).

Self-adaptive (cf. self-updating) data structures change their internal state according to a concrete usage pattern (e.g., self-balancing trees). These changes are internal, i.e., they depend on the data. Moreover, they are anticipated by design.

Other algorithms can benefit from an external change of representation. In matrix multiplication, for instance, it is a well-known performance trick to transpose the second matrix so that caches are used more efficiently. This effectively changes the matrix representation from row-major to column-major order. Because "A" is not the same as "Transposed(A)", the second matrix is transposed again after the multiplication to keep the program semantically correct.
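To make the matrix example concrete, here is a minimal sketch. The function names `transpose` and `matmul_transposed` are my own, and plain Python lists will not show the cache effect itself; the sketch only illustrates the changed access pattern, where both operands are walked row by row.

```python
def transpose(m):
    """Return a new list-of-lists holding the transpose of m."""
    return [list(col) for col in zip(*m)]


def matmul_transposed(a, b):
    """Multiply a (n x k) by b (k x m) after transposing b."""
    bt = transpose(b)  # explicit, external change of representation
    # Each entry is a dot product of a row of a with a row of bt,
    # so both operands are read sequentially.
    return [[sum(x * y for x, y in zip(row_a, row_bt)) for row_bt in bt]
            for row_a in a]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = matmul_transposed(a, b)  # [[19, 22], [43, 50]]
```

Because this sketch transposes into a copy, `b` is left untouched and no second transposition is needed; an in-place variant would have to transpose back afterwards, as described above.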

A second example is using a linked list at program start-up to populate the data structure and switching to an array-based implementation once the content of the list becomes stable.
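A rough sketch of this second example, under some assumptions: the class `Samples` and its `freeze` method are hypothetical names, and `collections.deque` merely stands in for the linked list used while loading.

```python
from array import array
from collections import deque


class Samples:
    """Hypothetical container: a deque while loading, an array once frozen."""

    def __init__(self):
        self._items = deque()  # cheap insertion at both ends while loading
        self._frozen = False

    def add_front(self, value):
        if self._frozen:
            raise RuntimeError("container is frozen")
        self._items.appendleft(value)

    def add_back(self, value):
        if self._frozen:
            raise RuntimeError("container is frozen")
        self._items.append(value)

    def freeze(self):
        """Explicitly switch to a compact array of C doubles."""
        if not self._frozen:
            self._items = array("d", self._items)
            self._frozen = True

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


samples = Samples()
for v in (2.0, 3.0):
    samples.add_back(v)
samples.add_front(1.0)
samples.freeze()  # the content is considered stable from here on
```

The point of the sketch is that the switch is triggered explicitly by the program (the call to `freeze`), not by the data structure itself.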

I am looking for programmers who have similar experiences with other programs in which an external change of representation is performed in order to improve performance, that is, where the representation (chosen implementation) of a data structure is changed at runtime as an explicit part of the program.

asked Jul 03 '14 12:07

madewael

1 Answer

The pattern of transforming the input representation in order to enable a more efficient algorithm comes up in many situations. I would go as far as to say this is an important way to think about designing efficient algorithms in general. Some examples that come to mind:

HeapSort. It works by transforming your original input list into a binary heap (a min-heap, say) and then repeatedly calling the remove-min function to get the list elements in sorted order. Its O(n log n) worst-case running time matches the lower bound for comparison-based sorting, so asymptotically it is tied for the fastest such algorithm.
Finding duplicates in a list. Comparing every pair of elements in the unmodified list takes O(n^2) time. But if you can sort the list, or store the elements in a hash table or Bloom filter, you can find all the duplicates in O(n log n) time or better. (A Bloom filter can report false positives, so on its own it only flags candidate duplicates.)
Solving a linear program. A linear program (LP) is a certain kind of optimization problem with many applications in economics and elsewhere. One of the most important techniques in solving LPs is duality, which means converting your original LP into what is called the "dual", and then solving the dual. Depending on your situation, solving the dual problem may be much easier than solving the original ("primal") LP. This book chapter starts with a nice example of primal/dual LPs.
Multiplying very large integers or polynomials. The fastest known methods use the FFT; see here or here for some nice descriptions. The gist of the idea is to convert from the usual representation of your polynomial (a list of coefficients) to an evaluation basis (a list of evaluations of that polynomial at certain carefully chosen points). The evaluation basis makes multiplication trivial: you can just multiply each pair of evaluations. Now you have the product polynomial in an evaluation basis, and you interpolate (the opposite of evaluation) to get back the coefficients you wanted. The Fast Fourier Transform (FFT) is a very efficient way of doing the evaluation and interpolation steps, and the whole thing can be much faster than working with the coefficients directly.
Longest common substring. If you want to find the longest substring that appears in a bunch of text documents, one of the fastest ways is to create a suffix tree from each one, then merge them together (into a generalized suffix tree) and find the deepest node that is common to all documents.
Linear algebra. Various matrix computations are performed most efficiently by converting your original matrix into a canonical form such as Hermite normal form or computing a QR factorization. These alternate representations of the matrix make standard things such as finding the inverse, determinant, or eigenvalues much faster to compute.
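To illustrate the first two items, here is a minimal sketch using only the standard library. The function names are mine, and `heapq` is used as a stand-in for a hand-written binary heap.

```python
import heapq


def heap_sort(items):
    """Sort by converting the list to a min-heap and popping repeatedly."""
    heap = list(items)   # copy so the caller's list is untouched
    heapq.heapify(heap)  # O(n) change of representation: list -> min-heap
    return [heapq.heappop(heap) for _ in range(len(heap))]


def duplicates_by_sorting(items):
    """After sorting, equal elements are adjacent: O(n log n) overall."""
    ordered = sorted(items)
    dups = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev == cur and (not dups or dups[-1] != cur):
            dups.append(cur)
    return dups


def duplicates_by_hashing(items):
    """Store seen elements in a hash set: expected O(n) overall."""
    seen, dups = set(), set()
    for item in items:
        if item in seen:
            dups.add(item)
        else:
            seen.add(item)
    return dups
```

For example, `heap_sort([5, 1, 4, 1, 3])` gives `[1, 1, 3, 4, 5]`, and both duplicate finders report 1 and 3 for the input `[3, 1, 3, 2, 1, 3]`.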

There are certainly many examples besides these, but I was trying to come up with some variety.
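The polynomial multiplication item can be sketched as follows. This is a simple recursive radix-2 FFT written for clarity rather than speed; `fft` and `poly_multiply` are my own names, coefficients are assumed to be integers listed lowest degree first, and both inputs are assumed to be non-empty.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = -1 if invert else 1
    out = [0] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def poly_multiply(p, q):
    """Multiply two integer-coefficient polynomials via the FFT."""
    result_len = len(p) + len(q) - 1
    size = 1
    while size < result_len:
        size *= 2
    # Coefficients -> evaluations at the size-th roots of unity.
    p_vals = fft(list(p) + [0] * (size - len(p)))
    q_vals = fft(list(q) + [0] * (size - len(q)))
    # In the evaluation basis, multiplication is pointwise.
    prod_vals = [a * b for a, b in zip(p_vals, q_vals)]
    # Evaluations -> coefficients (inverse transform, scaled by 1/size).
    coeffs = fft(prod_vals, invert=True)
    return [round((c / size).real) for c in coeffs[:result_len]]
```

For instance, `poly_multiply([1, 2], [3, 4])` multiplies (1 + 2x) by (3 + 4x) and returns `[3, 10, 8]`. The rounding step relies on the coefficients being integers small enough for floating-point error to stay below 0.5.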

answered Oct 12 '22 23:10

Dan R
