I can't quite understand the main differences between the two caches and I was wondering if someone could help me out?
I know that with a fully associative cache an address can be stored on any line in the tag array, while a direct-mapped cache maps each address to exactly one line.
But that's about all I know.
In direct mapping, only one possible place in the cache is available for each block in the main memory. In associative mapping, any place in the cache is available for each block in the main memory.
Fully associative mapping has much less potential for collisions between blocks trying to occupy the cache. That is, two or more main memory blocks may have to compete for the same cache block with direct mapping, but could go into different cache blocks with fully (or set-) associative mapping.
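To make the direct-mapped constraint concrete, here is a hypothetical sketch of how an address is decoded into tag/index/offset fields. The geometry (64 blocks of 64 bytes) is an assumption for illustration, not something from the answer above:

```python
# Assumed (illustrative) cache geometry: 64 blocks of 64 bytes each.
NUM_BLOCKS = 64
BLOCK_SIZE = 64

def decode(address):
    offset = address % BLOCK_SIZE                 # byte within the block
    index = (address // BLOCK_SIZE) % NUM_BLOCKS  # the ONE cache block it may occupy
    tag = address // (BLOCK_SIZE * NUM_BLOCKS)    # identifies which memory block is cached
    return tag, index, offset

# Two addresses 4 KiB apart share an index, so in a direct-mapped cache
# they would compete for the same block:
print(decode(0x1040))  # (1, 1, 0)
print(decode(0x2040))  # (2, 1, 0)
```

With fully associative mapping there is no index field at all: the whole block number becomes the tag, and either address could go anywhere.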
A fully associative cache permits data to be stored in any cache block, instead of forcing each memory address into one particular block. When data is fetched from memory, it can be placed in any unused block of the cache.
A fully associative cache is specified as a cache of size N blocks. Each block can map to any portion of memory. Fully associative caches are therefore the most flexible, because they give the replacement policy the most freedom (it has more blocks to choose from).
In short, you have basically answered your own question. These are two different ways of organizing a cache (another one would be n-way set associative, which combines both and is what real-world CPUs most often use).
A Direct-Mapped Cache is simpler (it requires just one comparator and one multiplexer), and as a result it is cheaper and faster. Given any address, it is easy to identify the single cache entry where it can reside. A major drawback of a DM cache is the conflict miss, which occurs when two different addresses map to the same cache entry. Even if the cache is big and contains many stale entries, it can't simply evict those, because each address's position within the cache is predetermined.
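A conflict miss is easy to demonstrate with a toy simulation. This is a hypothetical sketch (the tiny cache size and block size are assumptions for illustration):

```python
# Assumed toy parameters: 4 blocks of 16 bytes.
NUM_BLOCKS = 4
BLOCK_SIZE = 16

class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_BLOCKS  # one tag slot per cache block

    def access(self, address):
        block = address // BLOCK_SIZE
        index = block % NUM_BLOCKS       # position is fixed by the address
        tag = block // NUM_BLOCKS
        hit = self.tags[index] == tag
        self.tags[index] = tag           # on a miss, whatever was there is evicted
        return hit

cache = DirectMappedCache()
# Addresses 0 and 64 both map to index 0, so they keep evicting each other,
# even though the other three cache blocks stay empty:
print([cache.access(a) for a in [0, 64, 0, 64]])  # [False, False, False, False]
```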
A Fully Associative Cache is much more complex, and it allows an address to be stored in any entry. There is a price for that: to check whether a particular address is in the cache, it has to compare all current entries (the tags, to be exact). Besides, to exploit temporal locality, it must have an eviction policy. Usually an approximation of LRU (least recently used) is implemented, but that also adds comparators and transistors to the design and, of course, takes time.
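Here is a hypothetical sketch of a fully associative cache with true LRU eviction (real hardware usually only approximates LRU; the sizes here are assumptions for illustration):

```python
from collections import OrderedDict

class FullyAssociativeCache:
    def __init__(self, num_blocks, block_size=16):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.blocks = OrderedDict()  # tag -> True; insertion order tracks recency

    def access(self, address):
        tag = address // self.block_size  # no index field: the whole block number is the tag
        hit = tag in self.blocks
        if hit:
            self.blocks.move_to_end(tag)            # refresh recency on a hit
        else:
            if len(self.blocks) == self.num_blocks:
                self.blocks.popitem(last=False)     # evict the least recently used entry
            self.blocks[tag] = True
        return hit

cache = FullyAssociativeCache(num_blocks=2)
# The pair of addresses that conflict in a direct-mapped layout
# now simply occupy two different entries:
print([cache.access(a) for a in [0, 64, 0, 64]])  # [False, False, True, True]
```

Note that every `access` scans all tags (`tag in self.blocks`); hardware does the same comparison in parallel, which is exactly where the extra comparators go.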
Fully associative caches are practical for small caches (for instance, the TLB caches on some Intel processors are fully associative) but those caches are small, really small. We are talking about a few dozen entries at most.
Even L1i and L1d caches are bigger and require a combined approach: the cache is divided into sets, and each set consists of "ways". Sets are direct-mapped, and each set is internally fully associative. The number of "ways" is usually small; for example, in the Intel Nehalem CPU the sets are 4-way (L1i), 8-way (L1d, L2) and 16-way (L3). An N-way set-associative cache largely solves the temporal locality problem and is not too complex to use in practice.
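The combined approach can be sketched as follows: set selection works like a direct-mapped cache, and the ways within each set work like a tiny fully associative cache with LRU. All parameters are assumptions for illustration:

```python
from collections import OrderedDict

class SetAssociativeCache:
    def __init__(self, num_sets, ways, block_size=16):
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        self.sets = [OrderedDict() for _ in range(num_sets)]  # tag -> True per set

    def access(self, address):
        block = address // self.block_size
        index = block % self.num_sets        # set selection is direct-mapped
        tag = block // self.num_sets
        ways = self.sets[index]
        hit = tag in ways
        if hit:
            ways.move_to_end(tag)            # LRU is tracked only within the set
        else:
            if len(ways) == self.ways:
                ways.popitem(last=False)     # evict the set's least recently used way
            ways[tag] = True
        return hit

cache = SetAssociativeCache(num_sets=2, ways=2)
# Two blocks with the same set index now coexist in the two ways of that set:
print([cache.access(a) for a in [0, 64, 0, 64]])  # [False, False, True, True]
```

Only `ways` tags have to be compared per lookup, rather than the whole cache, which is why even a small number of ways recovers most of the benefit of full associativity.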
I would highly recommend the 2011 UC Berkeley course "Computer Science 61C", available on Archive. Among other things, it contains three lectures on the memory hierarchy and cache implementations.
There is also a 2015 edition of this course freely available on YouTube.