The cache is initially empty:
When the CPU fetches instructions or variables from memory, they are stored in the cache:
A typical memory transfer time is 10-60 nsec nowadays
When the CPU fetches instructions or variables from the same memory location again:
The cache can now return the requested data to the CPU much faster than main memory can.
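The behaviour described above can be sketched with a toy model. This is only an illustration, not how cache hardware is built: `ToyCache` and its fields are hypothetical names, main memory is modelled as a Python list, and the cache is assumed to be large enough that nothing is ever evicted.

```python
# Toy model: main memory is a list, the cache is a dict keyed by address.
# ToyCache and its fields are hypothetical names used only for illustration.
class ToyCache:
    def __init__(self, memory):
        self.memory = memory   # backing store (slow)
        self.lines = {}        # address -> cached value; initially empty
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        if addr in self.lines:         # same location requested again
            self.hits += 1
            return self.lines[addr]    # served by the cache
        self.misses += 1               # first request: go to memory
        value = self.memory[addr]
        self.lines[addr] = value       # keep a copy for next time
        return value

memory = [10, 20, 30, 40]
cache = ToyCache(memory)
first = cache.read(2)    # miss: fetched from memory and stored in the cache
second = cache.read(2)   # hit: returned from the cache
print(first, second, cache.hits, cache.misses)  # prints: 30 30 1 1
```

The second read returns the same value without touching `memory`, which is the whole point of the cache.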
Consider the following program loop that consists of 100 machine instructions:
mov r0, #0
Loop: cmp r0, #1000 <----+
beq Done |
instr 1 | 100 instructions
instr 2 |
... | Each instruction is stored
add r0, r0, #1 | in 1 memory location (for simplicity)
b Loop <----+
Done: ...
The loop is executed 1000 times.
Suppose that fetching one instruction takes 50 nsec from memory and 5 nsec from the cache.
Time needed to fetch instructions to execute the loop without using cache memory:
# instructions fetched = 1000 × 100
Time needed to fetch 1 instr = 50 nsec
Total time needed to fetch instr = 1000 × 100 × 50
= 5,000,000 nsec
Avg time = 5,000,000 / 100,000 = 50 nsec per instruction
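The arithmetic for the no-cache case can be checked with a few lines of Python. The constant names are chosen here for illustration; the values (1000 iterations, 100 instructions, 50 nsec per memory fetch) are the ones used in the example.

```python
ITERATIONS = 1000        # times the loop body runs
INSTRS_PER_ITER = 100    # instructions in the loop
MEM_FETCH_NS = 50        # time to fetch one instruction from memory

fetches = ITERATIONS * INSTRS_PER_ITER   # every fetch goes to memory
total_ns = fetches * MEM_FETCH_NS
avg_ns = total_ns / fetches

print(fetches, total_ns, avg_ns)  # prints: 100000 5000000 50.0
```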
Time needed to fetch instructions to execute the loop with a cache memory:
First time through the loop:
# instructions fetched = 100
Time needed to fetch 1 instr = 50 nsec (from memory; the cache is still empty)
Total time needed to fetch instr = 100 × 50
= 5,000 nsec
Second, third, ..., 1000th time through the loop:
# instructions fetched = 100 (each time)
Time needed to fetch 1 instr = 5 nsec (from the cache)
Total time needed to fetch instr = 100 × 5
= 500 nsec
Total time to fetch all instructions:
1st time through loop = 5,000 nsec
The other 999 times = 999 × 500 = 499,500 nsec
Total time = 504,500 nsec
Avg time = 504,500 / 100,000 = 5.045 nsec per instruction
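The same calculation for the cached case, as a short Python sketch. The constant names are illustrative, and the sketch assumes, as the example does, that all 100 instructions stay in the cache after the first pass. The final speedup figure is not stated in the example; it simply follows from dividing the no-cache total (5,000,000 nsec) by the cached total.

```python
ITERATIONS = 1000
INSTRS_PER_ITER = 100
MEM_FETCH_NS = 50      # fetch from memory (first pass, cache empty)
CACHE_FETCH_NS = 5     # fetch from the cache (all later passes)

first_pass_ns = INSTRS_PER_ITER * MEM_FETCH_NS      # 5,000 nsec
later_pass_ns = INSTRS_PER_ITER * CACHE_FETCH_NS    # 500 nsec per pass
total_cached_ns = first_pass_ns + (ITERATIONS - 1) * later_pass_ns
avg_cached_ns = total_cached_ns / (ITERATIONS * INSTRS_PER_ITER)

# Derived comparison against the no-cache total of 5,000,000 nsec.
total_uncached_ns = ITERATIONS * INSTRS_PER_ITER * MEM_FETCH_NS
speedup = total_uncached_ns / total_cached_ns

print(total_cached_ns, avg_cached_ns, round(speedup, 2))
# prints: 504500 5.045 9.91
```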
Important observation from the example:
There are cache replacement techniques that ensure the contents of the cache memory are replaced by the most recently used data fetched from RAM.
We will discuss replacement policies when we discuss paging (the next topic).
Terminology:
# cache hits and cache hit ratio:
1st time through loop (100) = 0 cache hits
The other 999 times = 999 × 100 = 99,900 cache hits
Total memory requests = 100,000 memory accesses
Total cache hits = 99,900 memory accesses
Cache hit ratio = 99,900 / 100,000 = 99.9%
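The hit count can also be obtained by simulating the fetches instead of multiplying them out. This is a minimal sketch under the example's own simplification: each instruction occupies one memory location (addresses 0 to 99 here, chosen for illustration) and the cache, modelled as a Python set, is large enough to hold the whole loop.

```python
cached = set()     # addresses currently held in the cache; initially empty
hits = 0
requests = 0

for _ in range(1000):          # 1000 passes through the loop
    for addr in range(100):    # 100 instructions, one location each
        requests += 1
        if addr in cached:
            hits += 1          # cache hit: instruction already cached
        else:
            cached.add(addr)   # cache miss: fetch from memory, keep a copy

hit_ratio = hits / requests
print(hits, requests, hit_ratio)  # prints: 99900 100000 0.999
```

Only the 100 fetches of the first pass miss; every later fetch is a hit.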
Experimental data reports that a typical computer program execution has
We will now study the structure of cache memories