The bits in the address are divided into 3 groups:
tag | set index | Block offset
t bits | s bits | b bits
If the size of the block in the cache is B
bytes, then you would need b = (log2 B) bits
to specify the block offset.
If the cache has S sets, then s = (log2 S)
bits are used for the set index. And if the cache is fully associate, there would be only one set, i.e. S = 1
and s = 0
, meaning no bits would be used for set index.
The remaining bits are used for the tag t
can be calculated using t = NUM_BITS - s - b
This would guarantee that any address can be mapped to a corresponding cache line and just looking at the valid bit
of the cache line, will confirm if we have the address in the cache or not. Note that a cache line size would usually be greater than the word size to leverage the gain from spatial data locality in programs.
When a requested address is not found in the cache, we need to calculate the starting address of the block of data that would be brought into the cache.
The size of the range is always equal to the cache block size. The start address is calculated by zeroing out the block offset bits of the address. The end address is calculated by using all 1s
for the block offset bits of the address.
Depending on the cache associativity, and the eviction scheme (LRU
vs LFU
), the line within the set where this new block of data would be stored is decided and populated.