
Theory of caching [closed]

Tags: caching, theory

Is there a unified theory of caching? That is, collections of theorems and algorithms for constructing caches and/or optimizing for them?

The question is deliberately broad, because the results I am looking for are also broad. Formulas for maximum achievable speedup, metrics for caching algorithms, stuff like that. A university-level textbook would probably be ideal.

asked Jun 17 '09 by Jørgen Fogh



2 Answers

The vast majority of real-world caching involves exploiting the "80-20 rule" or Pareto distribution; the figure below shows the shape.

[figure: Pareto distribution, a highly skewed curve in which a small fraction of items accounts for most accesses]

This manifests itself in applications as:

  • Much of the runtime is spent executing the same small pieces of code (making instruction caches on CPUs effective)
  • When a variable is accessed, it is often accessed again soon afterwards (making data caches on CPUs effective)
  • When a browser looks up a website's hostname once, it tends to look it up again frequently in the near future (making DNS caches effective); a minimal sketch of this pattern follows below
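As a toy illustration of that last point, here is a minimal sketch in Python using the standard library's functools.lru_cache. The slow_lookup function, its 0.1 s delay, and the returned address are made-up stand-ins for any expensive operation:

    import functools
    import time

    # Hypothetical slow operation standing in for a DNS lookup, disk read, etc.
    def slow_lookup(hostname: str) -> str:
        time.sleep(0.1)                # simulate network latency
        return "93.184.216.34"         # placeholder result

    # A small, bounded LRU cache: a scarce-but-fast resource in front of the slow path.
    @functools.lru_cache(maxsize=128)
    def cached_lookup(hostname: str) -> str:
        return slow_lookup(hostname)

    # Skewed (Pareto-like) access pattern: a few keys dominate the requests.
    for host in ["example.com"] * 8 + ["other.org", "example.com"]:
        cached_lookup(host)            # only 2 of these 10 calls reach slow_lookup

    print(cached_lookup.cache_info())  # hits=8, misses=2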

Therefore, I would say the "theory of caching" is to spend a small amount of a resource that is scarce but fast in order to absorb the most heavily repeated operations you are going to perform.

The reason you do this is the highly skewed chart above: by serving the hot items from the fast resource, you "level out" how often you must perform the slow operation.
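Since the question asks for formulas for maximum achievable speedup: the standard textbook quantity here is the average (expected) access time, hit_rate * t_fast + (1 - hit_rate) * t_slow. A quick sketch of the arithmetic, with made-up latencies and hit rate:

    def average_access_time(hit_rate: float, t_fast: float, t_slow: float) -> float:
        # Expected cost per access when a fraction hit_rate is served from the cache.
        return hit_rate * t_fast + (1.0 - hit_rate) * t_slow

    # Illustrative numbers only: 1 ns cache hit vs. 100 ns trip to main memory.
    t_fast, t_slow, hit_rate = 1.0, 100.0, 0.9
    amat = average_access_time(hit_rate, t_fast, t_slow)
    print(f"average access time: {amat:.1f} ns")           # 10.9 ns
    print(f"speedup over no cache: {t_slow / amat:.1f}x")  # ~9.2x
    # As hit_rate -> 1, the speedup approaches its ceiling of t_slow / t_fast = 100x.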

answered Sep 18 '22 by Chris Harris


I talked to one of the professors at my school, who pointed me towards online algorithms, which seems to be the topic I am looking for.

There is a great deal of overlap between caching algorithms and page replacement algorithms. I will probably edit the Wikipedia pages on these topics to clarify the connection once I have learned more about the subject.
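The textbook bridge between the two is competitive analysis: an online policy such as LRU is judged against Belady's optimal offline algorithm, which evicts the page whose next use lies farthest in the future (Sleator and Tarjan showed LRU is k-competitive for a cache of size k). A minimal sketch of both policies on a made-up request sequence:

    def lru_misses(requests, capacity):
        # Online policy: evict the least recently used page.
        cache = []                    # ordered from least to most recently used
        misses = 0
        for page in requests:
            if page in cache:
                cache.remove(page)    # refresh recency
            else:
                misses += 1
                if len(cache) == capacity:
                    cache.pop(0)      # evict least recently used
            cache.append(page)
        return misses

    def opt_misses(requests, capacity):
        # Offline policy (Belady's MIN): evict the page used farthest in the future.
        cache, misses = set(), 0
        for i, page in enumerate(requests):
            if page in cache:
                continue
            misses += 1
            if len(cache) == capacity:
                rest = requests[i + 1:]
                farthest = max(cache, key=lambda p: rest.index(p) if p in rest else float("inf"))
                cache.remove(farthest)
            cache.add(page)
        return misses

    seq = [1, 2, 3, 4, 1, 2, 3, 4]    # cyclic access: a worst case for LRU
    print(lru_misses(seq, 3))         # 8 -- every request misses
    print(opt_misses(seq, 3))         # 5 -- the clairvoyant policy does better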

answered Sep 19 '22 by Jørgen Fogh