How the pipeline CPU executes multiple instructions

Clock cycle 1: The IF stage fetches the 1st instruction into the IR(ID) register

Watch carefully which operation each individual stage performs. (It is exactly the same as before.)

Start of clock cycle 2: the ID stage fetches all possible source operands, and the IF stage fetches the next instruction

End of clock cycle 2: all source operands for add r1,r2,r3 have been fetched, and the next instruction has been fetched

Notice: the instruction add r1,r2,r3 is moved into the IR(EX) register

Start of clock cycle 3: the EX stage operates, the ID stage fetches operands, and the IF stage fetches an instruction

End of clock cycle 3: each stage stores its result away for the next stage

Notice: all instructions are moved forward
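
The "move forward" step can be sketched in a few lines of Python. This is only an illustrative model, not the code of the demo CPU: the function name `tick`, the dictionary layout, and the placeholder instructions I2, I3 and I4 are assumptions made for the example. It assumes the five stages IF, ID, EX, MEM, WB from the walk-through and no stalls.

```python
# Stage order follows the walk-through: IF -> ID -> EX -> MEM -> WB.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def tick(in_flight, fetched):
    """Advance every instruction one stage; `fetched` enters at IF.

    `in_flight` maps a stage name to the instruction in that stage
    (None for an empty stage). The instruction that was in WB has
    finished and drops out of the pipeline.
    """
    advanced = {"IF": fetched}
    for prev, nxt in zip(STAGES, STAGES[1:]):
        advanced[nxt] = in_flight.get(prev)
    return advanced

# State during clock cycle 3 (I2, I3, I4 are placeholder names).
cycle3 = {"IF": "I3", "ID": "I2", "EX": "add r1,r2,r3", "MEM": None, "WB": None}
cycle4 = tick(cycle3, "I4")
print(cycle4)
```

In this model add r1,r2,r3 sits in EX during cycle 3 and in MEM during cycle 4, which matches the slides.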

Start of clock cycle 4: the MEM stage forwards the result, the EX stage operates, the ID stage fetches operands, and the IF stage fetches an instruction

End of clock cycle 4: each stage stores its result away for the next stage

Notice: all instructions are moved forward

Start of clock cycle 5: the WB stage updates R1, the MEM stage forwards the result, the EX stage operates, the ID stage fetches operands, and the IF stage fetches an instruction

End of clock cycle 5: each stage stores its result away for the next stage
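
The five clock cycles walked through so far can be summarized as a stage occupancy table. The sketch below is a simplified model, assuming one instruction is fetched per clock cycle and nothing ever stalls. The helper names `stage_of` and `occupancy` are made up for this example, and instructions are identified only by their position in the program (1 = add r1,r2,r3).

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_of(instr_index, cycle):
    """Stage that instruction `instr_index` (1-based) occupies in `cycle`.

    Returns None if the instruction has not been fetched yet or has
    already left the pipeline.
    """
    pos = cycle - instr_index  # 0 -> IF, 1 -> ID, 2 -> EX, ...
    return STAGES[pos] if 0 <= pos < len(STAGES) else None

def occupancy(cycle, n_instr):
    """Map each stage to the instruction number it works on in `cycle`."""
    table = {s: None for s in STAGES}
    for i in range(1, n_instr + 1):
        s = stage_of(i, cycle)
        if s is not None:
            table[s] = i
    return table

# One row per clock cycle; '-' marks a stage that is still empty.
for c in range(1, 6):
    row = occupancy(c, 5)
    print(c, " ".join(f"{s}:{row[s] or '-'}" for s in STAGES))
```

The row for cycle 5 shows every stage busy with a different instruction, which is the situation described at the start of clock cycle 5.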

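Notice the significant speedup: after 5 clock cycles the first instruction has completed, and four more instructions are already in progress.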
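
A rough way to quantify the speedup is to count clock cycles. The sketch below rests on two assumptions that are not stated on the slides: the non-pipelined CPU needs one clock cycle per stage (5 per instruction), and the pipeline never stalls. The function names are made up for the example.

```python
def cycles_unpipelined(n_instr, n_stages=5):
    """Each instruction passes through all stages before the next one starts."""
    return n_instr * n_stages

def cycles_pipelined(n_instr, n_stages=5):
    """The first instruction needs n_stages cycles; each later one adds 1."""
    return n_stages + (n_instr - 1)

for n in (1, 3, 100):
    slow = cycles_unpipelined(n)
    fast = cycles_pipelined(n)
    print(f"{n:3d} instructions: {slow} vs {fast} cycles, speedup {slow / fast:.2f}")
```

Under these assumptions the three add instructions of the demo program take 7 cycles instead of 15, and for long programs the speedup approaches the number of stages.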

DEMO (using Aaron's pipelined CPU)

Execute this command on a lab machine:

/home/cs355001/demo/pipeline/1c-ALU-speedup

Program being executed:

10 65   // mov r1, #65      R1 = 00000000 01000001
18 4    // mov r2, #4       R2 = 00000000 00000100
26 24   // mov r3, #24      R3 = 00000000 00011000
34 2    // mov r4, #2       R4 = 00000000 00000010
42 8    // mov r5, #8       R5 = 00000000 00001000
58 3    // mov r7, #3       R7 = 00000000 00000011
0 0     // nop
0 0     // nop
0 0     // nop
0 0     // nop
0 0     // nop
8 19    // add r1,r2,r3     (R1=R2+R3)
8 37    // add r1,r4,r5     (R1=R4+R5)
8 55    // add r1,r6,r7     (R1=R6+R7)
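
The byte pairs in the listing appear to follow a simple pattern, and the sketch below decodes them under that assumption. The encoding is inferred from the listing alone and is not confirmed by the slides: the first byte is taken to be (destination register << 3) | opcode, with opcode 2 for mov and opcode 0 for add, and the second byte is taken to be either the immediate value (mov) or (source register 1 << 3) | source register 2 (add). The function name `decode` is made up for the example.

```python
def decode(b1, b2):
    """Decode one two-byte instruction under the ASSUMED encoding above."""
    if (b1, b2) == (0, 0):
        return "nop"
    rd, opcode = b1 >> 3, b1 & 0b111
    if opcode == 2:  # assumed: mov rd, #immediate
        return f"mov r{rd}, #{b2}"
    if opcode == 0:  # assumed: add rd, rs1, rs2
        return f"add r{rd},r{b2 >> 3},r{b2 & 0b111}"
    return f"unknown ({b1} {b2})"

# A few byte pairs copied from the program listing.
for pair in [(10, 65), (58, 3), (0, 0), (8, 19), (8, 37), (8, 55)]:
    print(pair, "->", decode(*pair))
```

Every pair in the listing decodes to the instruction given in its comment, which supports the inferred pattern but does not prove that it is the demo CPU's actual instruction format.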