Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Bootstrapping collections for performance

Tags:

In his seminal thesis, Chris Okasaki described the technique of data-structural bootstrapping. What work, if any, has been done to use this technique to improve locality in data structures?

For example, balanced binary trees are commonly used to create purely functional sets and dictionaries but a hash trie of small arrays are often significantly faster due to improved locality.

like image 313
J D Avatar asked Jun 14 '12 10:06

J D


People also ask

What is the purpose of bootstrapping?

“Bootstrapping is a statistical procedure that resamples a single dataset to create many simulated samples. This process allows for the calculation of standard errors, confidence intervals, and hypothesis testing” (Forst).

Does bootstrapping increase accuracy?

Bootstrap aggregation or bagging Bootstrap aggregation, also called bagging, is a random ensemble method designed to increase the stability and accuracy of models. It involves creating a series of models from the same training data set by randomly sampling with replacement the data.

When should the bootstrapping technique be used?

Bootstrap comes in handy when there is no analytical form or normal theory to help estimate the distribution of the statistics of interest since bootstrap methods can apply to most random quantities, e.g., the ratio of variance and mean. There are at least two ways of performing case resampling.

What is bootstrap for model evaluation?

Bootstrapping is also a similar technique that helps analyze the performance of a model. In bootstrapping random data sets are generated then on each data set model is fitted on training and evaluated on the testing data.


1 Answers

You could try references to his book by Haskell or Clojure folk rather than just the CMU pdf : e.g.,

http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504

There was a question here on SO at :

What is the benefit of purely functional data structure?

There is also Clojure area this :

https://github.com/viksit/clojure-datastructures

And there was this on SE :

https://cstheory.stackexchange.com/questions/1539/whats-new-in-purely-functional-data-structures-since-okasaki

Hope something there provides a basis for a search that bears results :-)

You may have to use an academic or biz ref search engine and you may want to look at poster sessions at a conf because search is not obvious here, e.g., Mercury can generate Erlang code ... so searching caching and locality with respect to performance in functional programming in some hardware area dealing with latency.

Canada'a National Research Council (NRC) had some work going on ... you could try a search of their pub's/notices/reports

But note: a search with

bigdata latency locality NRC 2012

gives rather different result from

bigdata functional latency locality NSF 2012

( and I would next drop the 2012 and try using the google search tool date range option for recent results)

like image 150
RobertShifflet Avatar answered Sep 21 '22 06:09

RobertShifflet