Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Interview question: Factorials and caching

Tags:

caching

I recently came across the following interview question:

You need to design a system to provide answers to factorials for between 1 and 100. You can cache 10 numbers. How would you arrange/manage that cache, and what is the worst case for lookup on a cache miss.

What do you think would be a suitable answer and what would be the reasoning behind this? Personally, I would cache the 1st ten numbers for the first input, and then subsequently maintain an LRU cache based on the most recent input because people are more likely to repeat searches. Not really sure what the worst case for lookup on cache miss would be though. Probably O(n) if you use a dynamic programming approach in implementing the factorial function. What do you think?

like image 354
OckhamsRazor Avatar asked Aug 28 '11 15:08

OckhamsRazor


3 Answers

subsequently maintain an LRU cache based on the most recent input because people are more likely to repeat searches

That might be a reasonable design choice, but in the interview I'd phrase that more as a question to the interviewer: "would it be reasonable to assume that calls will be more like to be made with recent values, or is there some other expected grouping (large numbers will be requested more often then small or vice-versa)?"

For example, it might make sense to cache LRU, but never throw away the values for 10, 20, 30, 40, etc. That way the worst case for calculating a factorial once the cache fills with those values is to have to perform 10 multiplications.

Another consideration you might make is that computers can very easily deal with certain factorials:

  • 12! is the largest that can be held in a 32-bit word.
  • 20! is the largest that can be held in a 64-bit word.
  • 34! is the largest that can be held in a 128-bit word.

Since today's machines can easily deal with 64-bit arithmetic (maybe even 128 bit arithmetic), it might also make sense to never cache a value for 20! or below once the cache fills with values greater than that.

The worst case for a lookup on a cache miss would depend on how you store the cache. Is it an array sorted by the argument to the function? (look-up is O(log n)) Is it an array in LRU order? (look up on cache miss is O(n)). I think you'd also want to make clear that you'd want a cache lookup to always return the highest value in the cache that's less than what you're looking for - that cache value represents work that you don't have to do for this particular factorial calculation.

like image 113
Michael Burr Avatar answered Nov 03 '22 06:11

Michael Burr


The idea here is to select some numbers so it is possible to calculate others from them. Assuming your system does multiplication with constant speed and no division at all, caching (or "pre-calculating") 1!, 11!, 21!, 31!, 41!, 51!, 61!, 81!, 91! allows calculating all the remaining numbers with at most 9 multiplications for each.

In reality multiplying larger numbers is more costly, and you also could do division (e.g. 90! = 91! / 91), and you don't really need 1!, which can lead to another optimal distribution of those anchor numbers.

like image 41
Paŭlo Ebermann Avatar answered Nov 03 '22 06:11

Paŭlo Ebermann


How about using 9 of the cache slots for the factorial of 10, 20, 30, 40, 50, 60, 70, 80, and 90? The 10th slot could be LRU. The worst case lookup is then O(10) or constant time.

like image 30
Connor Doyle Avatar answered Nov 03 '22 08:11

Connor Doyle