Question

Could someone please explain the difference between a TLB (Translation lookaside buffer) miss and a cache miss?

I believe I found out that the TLB has something to do with virtual memory addresses, but I wasn't overly clear on what this actually means.

I understand that a block of memory (the size of a cache line) is loaded into the cache (L3?), and that if a required address is not held within the current cache lines, this is a cache miss.


Solution

Well, all of today's modern operating systems use something called virtual memory. Every address generated by the CPU is virtual. Page tables map these virtual addresses to physical addresses, and a TLB is just a cache of page table entries.
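
To make that concrete: with the common 4 KiB page size (an assumption for this sketch; other sizes exist), a virtual address splits into a virtual page number, which is what the page tables and the TLB translate, and a page offset, which passes through unchanged. A minimal sketch in C:

    #include <stdio.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12                      /* 4 KiB pages: an assumption for this sketch */
    #define PAGE_SIZE  (1ULL << PAGE_SHIFT)

    int main(void)
    {
        uint64_t vaddr  = 0x7f3a12345678ULL;           /* an example virtual address             */
        uint64_t vpn    = vaddr >> PAGE_SHIFT;         /* looked up in the TLB / page tables     */
        uint64_t offset = vaddr & (PAGE_SIZE - 1);     /* copied unchanged into the phys. addr.  */

        printf("VPN = %#llx, page offset = %#llx\n",
               (unsigned long long)vpn, (unsigned long long)offset);
        return 0;
    }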

The L1, L2, and L3 caches, on the other hand, cache the contents of main memory.

A TLB miss occurs when the mapping from virtual address to physical address for a CPU-requested virtual address is not in the TLB. The entry must then be fetched from the page table into the TLB.
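
Here is a minimal sketch of that lookup-then-fill behaviour, written as if the TLB were managed in software. The structures and the page_table_walk stub are invented for illustration; real TLBs are hardware and real page tables are multi-level.

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>

    #define TLB_ENTRIES 16   /* tiny, direct-mapped TLB purely for illustration */

    typedef struct { uint64_t vpn, pfn; bool valid; } tlb_entry_t;

    static tlb_entry_t tlb[TLB_ENTRIES];
    static unsigned    tlb_misses;

    /* Stand-in for a real multi-level page-table walk in main memory. */
    static uint64_t page_table_walk(uint64_t vpn)
    {
        return vpn + 0x100;   /* fake frame number, just so the sketch runs */
    }

    /* Translate a VPN to a PFN, filling the TLB on a miss. */
    static uint64_t translate(uint64_t vpn)
    {
        unsigned idx = vpn % TLB_ENTRIES;

        if (tlb[idx].valid && tlb[idx].vpn == vpn)
            return tlb[idx].pfn;                        /* TLB hit: fast path        */

        tlb_misses++;                                   /* TLB miss: walk the page   */
        uint64_t pfn = page_table_walk(vpn);            /* table and cache the entry */
        tlb[idx] = (tlb_entry_t){ .vpn = vpn, .pfn = pfn, .valid = true };
        return pfn;
    }

    int main(void)
    {
        translate(42);    /* first access: miss, fills the TLB  */
        translate(42);    /* second access: hit, no walk needed */
        printf("TLB misses: %u\n", tlb_misses);
        return 0;
    }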

A cache miss occurs when the CPU requires something that is not in the cache. The data is then looked for in main memory (RAM). If it is not resident there either, the operating system must bring it in from secondary storage (the hard disk) via a page fault.
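
The same hit-or-refill pattern, sketched as a toy direct-mapped cache (the layout is invented for illustration; real L1/L2/L3 caches are set-associative and much more sophisticated):

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    #define LINE_SIZE 64                    /* bytes per line, a typical value  */
    #define NUM_LINES 512                   /* 512 * 64 B = 32 KiB toy cache    */

    typedef struct {
        bool     valid;
        uint64_t tag;
        uint8_t  data[LINE_SIZE];
    } cache_line_t;

    static cache_line_t cache[NUM_LINES];
    static uint8_t      ram[1 << 20];       /* stand-in for main memory (1 MiB) */
    static unsigned     cache_misses;

    /* Read one byte through the toy cache. */
    static uint8_t cache_read(uint64_t paddr)
    {
        uint64_t line  = paddr / LINE_SIZE;
        uint64_t index = line % NUM_LINES;
        uint64_t tag   = line / NUM_LINES;

        cache_line_t *cl = &cache[index];
        if (!(cl->valid && cl->tag == tag)) {
            cache_misses++;                                       /* cache miss:       */
            memcpy(cl->data, &ram[line * LINE_SIZE], LINE_SIZE);  /* refill whole line */
            cl->tag   = tag;
            cl->valid = true;
        }
        return cl->data[paddr % LINE_SIZE];
    }

    int main(void)
    {
        ram[1000] = 7;
        cache_read(1000);    /* miss: the containing line is pulled in from "RAM" */
        cache_read(1001);    /* hit: same line is already resident in the cache   */
        printf("cache misses: %u\n", cache_misses);
        return 0;
    }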

OTHER TIPS

The following sequence, which starts after the first instruction's address (a virtual address) is loaded into the PC, makes the concepts of a TLB miss and a cache miss very clear. (A toy simulation of this sequence is sketched after the list.)

The first instruction: accessing it involves the following steps.

  • Take the starting PC
  • Access the iTLB with the VPN extracted from the PC: iTLB miss
  • Invoke the iTLB miss handler
  • Calculate the PTE address
  • If PTEs are cached in the L1 data and L2 caches, look them up with the PTE address: you will miss there also
  • Access the page table in main memory: the PTE is invalid: page fault
  • Invoke the page fault handler
  • Allocate a page frame, read the page in from disk, update the PTE, load the PTE into the iTLB, restart the fetch
  • Now you have the physical address
  • Access the I-cache: miss
  • Send a refill request to the higher levels: you miss everywhere
  • Send the request to the memory controller (north bridge)
  • Access main memory
  • Read the cache line
  • Refill all levels of cache as the cache line returns to the processor
  • Extract the appropriate instruction from the cache line with the block offset
  • This is the longest possible latency in an instruction/data access

Source: https://software.intel.com/en-us/articles/recap-virtual-memory-and-cache
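
As referenced above, here is a toy C simulation of that worst-case sequence: the iTLB, the page table entry, and every cache level start cold, so the fetch hits each of the listed steps in turn. All the state and addresses are made up for illustration.

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>

    /* Cold-start state for the very first fetch: everything misses. */
    static bool itlb_has(uint64_t vpn)     { (void)vpn;   return false; } /* empty iTLB      */
    static bool pte_is_valid(uint64_t vpn) { (void)vpn;   return false; } /* page not in RAM */
    static bool icache_has(uint64_t paddr) { (void)paddr; return false; } /* empty caches    */

    int main(void)
    {
        uint64_t pc  = 0x400000;        /* example starting PC (a virtual address) */
        uint64_t vpn = pc >> 12;        /* assuming 4 KiB pages                    */

        if (!itlb_has(vpn)) {
            puts("iTLB miss -> invoke the iTLB miss handler, compute the PTE address");
            puts("PTE lookups miss in the L1 data and L2 caches -> walk the page table in RAM");
            if (!pte_is_valid(vpn)) {
                puts("PTE invalid -> page fault -> allocate a frame, read the page from disk,");
                puts("update the PTE, load it into the iTLB, restart the fetch");
            }
        }

        uint64_t paddr = pc;            /* pretend translation is done at this point */
        if (!icache_has(paddr)) {
            puts("I-cache miss at every level -> memory controller -> read the line from RAM,");
            puts("refill all cache levels, extract the instruction with the block offset");
        }

        puts("instruction available: the longest possible latency for a single fetch");
        return 0;
    }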

The HOW of both processes is covered above; on the note of performance, a cache miss does not necessarily stall the CPU. A small number of cache misses can be tolerated using algorithmic prefetching techniques. A TLB miss, however, causes the CPU to stall until the TLB has been updated with the new address. In other words, prefetching can mask a cache miss but not a TLB miss.
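
To illustrate the prefetching point, GCC and Clang provide __builtin_prefetch, which asks the hardware to start pulling a cache line in before it is needed. The loop and the prefetch distance below are illustrative assumptions, not a tuned implementation:

    #include <stddef.h>

    #define PREFETCH_DISTANCE 16   /* elements ahead; workload- and hardware-dependent */

    /* Sum a large array while requesting future cache lines early. */
    long sum_with_prefetch(const long *a, size_t n)
    {
        long sum = 0;
        for (size_t i = 0; i < n; i++) {
            if (i + PREFETCH_DISTANCE < n)
                __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 1); /* 0 = read, 1 = low locality */
            sum += a[i];
        }
        return sum;
    }

Hardware prefetchers usually cover a simple sequential pattern like this on their own; explicit prefetching matters more for irregular access patterns, and, in line with the point above, it hides memory latency rather than the address-translation work.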

Licensed under: CC-BY-SA with attribution