How is the cache memory used?

The cache is initially empty.

 

When the CPU fetches instructions or variables from memory, a copy of them is stored in the cache.

A typical memory transfer takes 10-60 nsec nowadays.

When the CPU fetches instructions or variables from the same memory location again, the cache can return the requested data to the CPU much faster than RAM can.
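The three steps above (empty cache, store on first fetch, fast return on a repeated fetch) can be sketched in a few lines of Python. This is only a toy model: the names `RAM`, `cache` and `fetch` are made up for illustration, and the cache is assumed to be large enough that nothing is ever evicted.

```python
# Hypothetical model: RAM is a dict from address to value, and the
# cache is a second dict that starts out empty.
RAM = {addr: f"instr {addr}" for addr in range(100)}
cache = {}

def fetch(addr):
    """Return (value, source) for one memory request."""
    if addr in cache:              # the value is already in the cache
        return cache[addr], "cache"
    value = RAM[addr]              # slow path: go to RAM
    cache[addr] = value            # keep a copy for the next request
    return value, "RAM"

first = fetch(7)     # first request for address 7: served by RAM
second = fetch(7)    # same address again: served by the cache
print(first)
print(second)
```

The second request for the same address never touches `RAM`, which is where the speed gain comes from.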

Tradeoff: size vs. speed

  • RAM is cheap(er) but slow(er)

    • 8 GB of RAM costs about $40

  • Cache memory is fast but very expensive

  • Therefore:

      • The cache memory is small because it is very expensive to make

      • The cache memory can therefore store only a (small) portion of the content of RAM

Demonstrating the usefulness of a cache memory

Consider the following program loop that consists of 100 machine instructions:

         mov  r0, #0

  Loop:  cmp  r0, #1000   <----+
         beq  Done             |
         instr 1               |  100 instructions
         instr 2               |
         ...                   |  Each instruction is stored
         add  r0, r0, #1       |  in 1 memory location (for simplicity)
         b    Loop        <----+

  Done:  ...
  

The loop is executed 1000 times.



Suppose that:

  • Memory speed = 50 nsec to fetch one instruction
  • Cache speed = 5 nsec to fetch one instruction         

Time needed to fetch instructions to execute the loop without using cache memory:

    # instructions fetched           = 1000 × 100
 
    Time needed to fetch 1 instr     = 50 nsec

    Total time needed to fetch instr = 1000 × 100 × 50  
                                     = 5,000,000 nsec

    Avg time = 5,000,000 / 100,000 = 50 nsec per instruction

Time needed to fetch instructions to execute the loop with a cache memory:

 First time through the loop (the cache is still empty):

    # instructions fetched           = 100
 
    Time needed to fetch 1 instr     = 50 nsec

    Total time needed to fetch instr = 100 × 50  
                                     = 5,000 nsec

 2nd, 3rd, ..., 1000th time through the loop:

    # instructions fetched           = 100 (each time)
 
    Time needed to fetch 1 instr     = 5 nsec (from the cache)

    Total time needed to fetch instr = 100 × 5  
                                     = 500 nsec

 Total time to fetch all instructions:

    1st time through loop             =   5,000 nsec
    The other 999 times   = 999 × 500 = 499,500 nsec

    Total time                        = 504,500 nsec

    Avg time = 504,500 / 100,000 = 5.045 nsec per instruction
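The arithmetic above can be checked with a short Python sketch. It uses only the numbers from the example (50 nsec per fetch from memory, 5 nsec per fetch from the cache, 100 instructions, 1000 iterations) and keeps the same simplifying assumptions: the cache holds the whole loop, and a fetch that misses costs exactly one memory fetch. The function and constant names are illustrative.

```python
MEMORY_NS = 50      # time to fetch one instruction from RAM
CACHE_NS = 5        # time to fetch one instruction from the cache
LOOP_SIZE = 100     # instructions per loop iteration
ITERATIONS = 1000   # number of times the loop runs

def fetch_time_without_cache():
    # Every instruction fetch goes to RAM.
    return ITERATIONS * LOOP_SIZE * MEMORY_NS

def fetch_time_with_cache():
    first_pass = LOOP_SIZE * MEMORY_NS                      # cache is empty
    later_passes = (ITERATIONS - 1) * LOOP_SIZE * CACHE_NS  # served by cache
    return first_pass + later_passes

total_fetches = ITERATIONS * LOOP_SIZE
no_cache = fetch_time_without_cache()
with_cache = fetch_time_with_cache()

print(no_cache, no_cache / total_fetches)      # total and average, no cache
print(with_cache, with_cache / total_fetches)  # total and average, with cache
print(no_cache / with_cache)                   # speedup factor
```

In this model the cache makes instruction fetching almost 10 times faster, because only the first pass through the loop pays the memory price.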

Refreshing the content of the cache memory

Important observation from the example:

  • The cache memory must always store (a portion of) the most "useful" data from memory:

      • The cache memory must always contain the instructions of the current program loop

    I.e.:

      • When the program exits the current loop and execution enters another loop, the content of the cache memory must be replaced by the instructions of the 2nd loop

There are cache replacement techniques that ensure the content of the cache memory is replaced by the most recently used data fetched from RAM.

We will discuss replacement policies when we discuss paging (next topic)
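As a preview, the loop-to-loop behaviour described above can be sketched in Python. The sketch assumes one particular policy, least recently used (LRU) eviction, chosen here only for illustration; the class name `LRUCache`, the capacity of 100 and the address ranges of the two loops are all hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    """Toy cache that evicts the least recently used address when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # address -> value, oldest first

    def access(self, addr):
        """Return True on a hit, False on a miss (the address is then loaded)."""
        if addr in self.lines:
            self.lines.move_to_end(addr)      # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)    # evict the least recently used
        self.lines[addr] = f"instr {addr}"
        return False

cache = LRUCache(capacity=100)
loop1 = range(0, 100)      # hypothetical addresses of the first loop
loop2 = range(200, 300)    # hypothetical addresses of the second loop

for _ in range(3):         # run the first loop a few times
    for addr in loop1:
        cache.access(addr)
for addr in loop2:         # first pass through the second loop: all misses
    cache.access(addr)

resident = set(cache.lines)
print(resident == set(loop2))
```

After one pass through the second loop, the cache holds only the instructions of that loop, so every later pass through it is served from the cache.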

Cache hit and cache hit ratio

Terminology:

  • Cache hit = a memory request that can be satisfied by the cache

    (I.e., the cache contains the requested value)

  • Cache hit ratio = the fraction of memory requests that were satisfied by the cache

# Cache hits and cache hit ratio in an example

Consider the following program loop that consists of 100 machine instructions:

         mov  r0, #0

  Loop:  cmp  r0, #1000   <----+
         beq  Done             |
         instr 1               |  100 instructions
         instr 2               |
         ...                   |  Each instruction is stored
         add  r0, r0, #1       |  in 1 memory location (for simplicity)
         b    Loop        <----+

  Done:  ...
  


# cache hits and cache hit ratio:

    1st time through loop (100)               =      0 cache hits
    The other 999 times           = 999 × 100 = 99,900 cache hits

    Total memory requests         = 100,000 memory accesses
    Total cache hits              =  99,900 memory accesses

    Cache hit ratio = 99,900 / 100,000 = 99.9 % 
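The same count can be obtained by simulating the fetches one at a time. The Python sketch below is a minimal model under the assumptions of the example: each instruction occupies one memory location, and the cache is large enough to hold the whole loop. The variable names are illustrative.

```python
LOOP_SIZE = 100      # instructions in the loop
ITERATIONS = 1000    # number of times the loop runs

cached = set()       # addresses currently held in the cache
hits = 0
requests = 0

for _ in range(ITERATIONS):
    for addr in range(LOOP_SIZE):
        requests += 1
        if addr in cached:
            hits += 1            # cache hit: the instruction is already there
        else:
            cached.add(addr)     # cache miss: load the instruction

hit_ratio = hits / requests
print(hits, requests, hit_ratio)
```

All 100 misses happen during the first pass through the loop; every later fetch is a hit.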

# Cache hits and cache hit ratio in practical programs
 

Experimental data show that a typical program execution has

  • A cache hit ratio of 90% - 95%

    (The cache hit ratio depends heavily on how often the program executes loops)
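To see what such hit ratios mean for speed, the Python sketch below reuses the timing numbers of the loop example (5 nsec for a cache fetch, 50 nsec for a memory fetch) and assumes, as that example does, that a miss costs exactly one memory fetch. The function name `average_access_ns` is made up for illustration.

```python
def average_access_ns(hit_ratio, cache_ns=5, memory_ns=50):
    """Average fetch time when a fraction hit_ratio of requests hit the cache."""
    return hit_ratio * cache_ns + (1 - hit_ratio) * memory_ns

for h in (0.90, 0.95, 0.999):
    print(h, round(average_access_ns(h), 3))
```

With a hit ratio of 99.9% this reproduces the 5.045 nsec average of the loop example. At 90% the average is about 9.5 nsec and at 95% about 7.25 nsec, still far below the 50 nsec needed without a cache.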

 

 

We will now study the structure of cache memories