In his seminal thesis, Chris Okasaki described the technique of data-structural bootstrapping. What work, if any, has been done to use this technique to improve locality in data structures?
For example, balanced binary trees are commonly used to create purely functional sets and dictionaries, but a hash trie of small arrays is often significantly faster due to improved locality.
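To make the "hash trie of small arrays" concrete, here is a minimal sketch of a persistent hash trie with path copying. It is an illustration only, not Okasaki's construction and not the implementation used by any particular library: the names `EMPTY`, `Leaf`, `insert`, and `lookup` are hypothetical, the nodes are fixed 32-slot tuples with no bitmap compression, and hashes are truncated to 32 bits as an assumption. CPython tuples store references, so the sketch shows the shape of the structure rather than its cache behaviour.

```python
from typing import Any, NamedTuple

BITS = 5                 # hash bits consumed per trie level (assumed)
WIDTH = 1 << BITS        # 32 slots per node: the "small array"
MASK = WIDTH - 1

class Leaf(NamedTuple):
    """Bucket of (key, value) pairs whose truncated hashes are equal."""
    hash: int
    pairs: tuple

# An interior node is a plain tuple of WIDTH slots; each slot is
# None, a Leaf, or another interior node.
EMPTY = (None,) * WIDTH

def _hash(key: Any) -> int:
    # Truncate to 32 non-negative bits (an assumption of this sketch).
    return hash(key) & 0xFFFFFFFF

def _insert(node: tuple, h: int, key: Any, value: Any, shift: int) -> tuple:
    i = (h >> shift) & MASK
    slot = node[i]
    if slot is None:
        new = Leaf(h, ((key, value),))
    elif isinstance(slot, Leaf):
        if slot.hash == h:
            # Same hash: replace the key if present, else extend the bucket.
            kept = tuple(p for p in slot.pairs if p[0] != key)
            new = Leaf(h, kept + ((key, value),))
        else:
            # Different hashes share this slot: push the old leaf one
            # level down, then insert the new entry beside it.
            j = (slot.hash >> (shift + BITS)) & MASK
            child = EMPTY[:j] + (slot,) + EMPTY[j + 1:]
            new = _insert(child, h, key, value, shift + BITS)
    else:
        new = _insert(slot, h, key, value, shift + BITS)
    # Path copying: only the small arrays along the path are rebuilt.
    return node[:i] + (new,) + node[i + 1:]

def insert(root: tuple, key: Any, value: Any) -> tuple:
    """Return a new trie containing key -> value; root is left unchanged."""
    return _insert(root, _hash(key), key, value, 0)

def lookup(root: tuple, key: Any, default: Any = None) -> Any:
    h, shift, node = _hash(key), 0, root
    while True:
        slot = node[(h >> shift) & MASK]
        if slot is None:
            return default
        if isinstance(slot, Leaf):
            if slot.hash == h:
                for k, v in slot.pairs:
                    if k == key:
                        return v
            return default
        node, shift = slot, shift + BITS
```

Each update copies at most one small array per level, so older versions stay valid, and with 32-way branching a lookup touches only a handful of nodes where a balanced binary tree of the same size would follow many more pointers.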
You could look for references to his book from Haskell or Clojure folk rather than relying only on the CMU PDF, e.g.:
http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504
There was a question here on SO:
What is the benefit of purely functional data structure?
There is also this in the Clojure area:
https://github.com/viksit/clojure-datastructures
And there was this on SE:
https://cstheory.stackexchange.com/questions/1539/whats-new-in-purely-functional-data-structures-since-okasaki
Hope something there provides a basis for a search that bears results :-)
You may have to use an academic or business reference search engine, and you may want to look at poster sessions at conferences, because the right search terms are not obvious here. For example, Mercury can generate Erlang code, so you could search for caching and locality with respect to performance in functional programming, in some hardware area that deals with latency.
Canada's National Research Council (NRC) had some work going on; you could try a search of their publications, notices, and reports.
But note: a search with
bigdata latency locality NRC 2012
gives rather different results from
bigdata functional latency locality NSF 2012
(and I would next drop the 2012 and try the Google search tool's date range option for recent results)