Motivation for SIMD computers
 

  • In the 1970s (Cold War era), there was a lot of interest in developing more powerful nuclear weapons

  • There were also weather forecasting applications that required fast computations

  • Common traits of nuclear bomb (explosion) computations and weather forecast computations:

      • The algorithms need to solve (partial) differential equations that use matrix and vector operations

Review: matrix and vector

  • A vector is a (column) array of numerical values, e.g.:

             +-   -+
             |  1  |
         v = |  1  |        
             |  1  |
             +-   -+                


  • A matrix is a two-dimensional array of numerical values, e.g.:

             +-         -+
             | 1   0   0 |
         A = | 1   0   0 |
             | 1   0   0 |
             +-         -+            

Review: vectors and vector addition
 

Example of vectors in a plane (2 dimensions, high school material):

Effect of adding 2 vectors:

SISD computer algorithm used to add vectors
 

The SISD computer algorithm to add 2 vectors A and B uses multiple + instructions:

 Input:  float A[N] // vector 1 in array A 
         float B[N] // vector 2 in array B

 Output: float C[N] // Output vector

 Vector addition algorithm:

    // The SISD computer performs
    // N addition operations
  
    for (int i = 0; i < N; i++)
       C[i] = A[i] + B[i];        

  

Each + instruction adds 1 pair of numbers

SIMD computer "algorithm" used to add vectors
 

  • The SIMD computer has a "vector addition" operation, a single computer instruction that can perform N different addition operations simultaneously

  • Example: the add operation is applied simultaneously to 3 pairs of inputs:

             

The SIMD computer can add 2 vectors A and B using a single instruction:

 Input:  float A[N] // vector 1 in array A 
         float B[N] // vector 2 in array B

 Output: float C[N] // Output vector

 Vector addition algorithm:

    // The SIMD computer performs
    // N addition operations at once:
  
    C[0..N-1] = A[0..N-1] + B[0..N-1]
 

  

A single vector + instruction adds N pairs of numbers
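
On a present-day compiler, the idea of one + that works on several pairs of numbers can be sketched in C. This is only an illustration and rests on assumptions: it uses the vector extension of GCC and Clang (not standard C), the names float4 and vector_add4 are made up for this sketch, and the vector length is fixed at 4 instead of N.

```c
#include <string.h>

/* Assumption: GCC/Clang vector extension (not standard C).
   float4 is a made-up name for a vector of 4 floats (16 bytes). */
typedef float float4 __attribute__((vector_size(16)));

/* Adds 4 pairs of floats using one vector + operation. */
void vector_add4(const float A[4], const float B[4], float C[4])
{
    float4 a, b, c;

    memcpy(&a, A, sizeof a);   /* load 4 elements of A */
    memcpy(&b, B, sizeof b);   /* load 4 elements of B */

    c = a + b;                 /* one + applied to all 4 pairs */

    memcpy(C, &c, sizeof c);   /* store the 4 results in C */
}
```

Whether the line c = a + b becomes exactly one machine instruction depends on the compiler and on the CPU it compiles for.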

Vector (computer) instructions
 

  • A vector instruction takes 1 or 2 input arrays (vectors) and produces an output array of values

    Example:

             

The matrix multiplication algorithm using vector operations

Review: how to multiply 2 matrices (= row-column sum)

       +-                 -+          +-                 -+
       | A11    A12    A13 |          | B11    B12    B13 |
   A = | A21    A22    A23 |      B = | B21    B22    B23 |
       | A31    A32    A33 |          | B31    B32    B33 |
       +-                 -+          +-                 -+

 Then:

              +-                 -+
              | C11    C12    C13 |
   C =  A*B = | C21    C22    C23 |
              | C31    C32    C33 |
              +-                 -+

 where:

    Cij = Ai1*B1j + Ai2*B2j + Ai3*B3j 
          (for i = 1, 2, 3 and j = 1, 2, 3)
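
The row-column sum can be written as an ordinary (SISD) C function. This is a minimal sketch for the 3 × 3 case; the function name matmul_sisd and the constant N are chosen for this illustration only. C arrays start at index 0, so A11 in the formula is A[0][0] in the code.

```c
#define N 3   /* matrix size used in this sketch */

/* C = A*B using the row-column sum:
   C[i][j] = A[i][0]*B[0][j] + A[i][1]*B[1][j] + A[i][2]*B[2][j] */
void matmul_sisd(float A[N][N], float B[N][N], float C[N][N])
{
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            float sum = 0.0f;

            for (int k = 0; k < N; k++)
                sum += A[i][k] * B[k][j];   /* one * and one + per step */

            C[i][j] = sum;
        }
    }
}
```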
  

Consider the following matrix multiplication:

I will show you how this matrix multiplication can be performed using vector operations.

We first initialize the output to the ZERO matrix:

We now use vector multiplications and vector additions to compute the row-column sums!

Multiply A11 with the 1st row vector in B:

This is one (constant × vector) vector operation on a SIMD computer!

Add the resulting vector to the 1st row of the output matrix:

This is one (vector + vector) vector operation on a SIMD computer!

Multiply A12 with the 2nd row vector in B:

This is one (constant × vector) vector operation on a SIMD computer!

Add the resulting vector to the 1st row of the output matrix:

This is one (vector + vector) vector operation on a SIMD computer!

Multiply A13 with the 3rd row vector in B:

This is one (constant × vector) vector operation on a SIMD computer!

Add the resulting vector to the 1st row of the output matrix:

This is one (vector + vector) vector operation on a SIMD computer! The 1st row of the output matrix is now done!

Multiply A21 with the 1st row vector in B:

This is one (constant × vector) vector operation on a SIMD computer!

Add the resulting vector to the 2nd row of the output matrix:

This is one (vector + vector) vector operation on a SIMD computer! And so on for the remaining terms and rows.
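
The steps above can be collected into one C function. This is a sketch under stated assumptions, not real SIMD code: the helpers vec_scale and vec_add are made-up names that each stand in for one vector instruction, and they are written as plain loops so that the code runs on any computer. On a SIMD computer, each helper call would be a single instruction.

```c
#define N 3   /* matrix size used in this sketch */

/* Stand-in for one (constant x vector) SIMD instruction: out = s * v.
   A SIMD computer would perform all N multiplications at once. */
static void vec_scale(float s, const float v[N], float out[N])
{
    for (int j = 0; j < N; j++)
        out[j] = s * v[j];
}

/* Stand-in for one (vector + vector) SIMD instruction: acc = acc + v. */
static void vec_add(float acc[N], const float v[N])
{
    for (int j = 0; j < N; j++)
        acc[j] += v[j];
}

/* C = A*B, computed one output row at a time. */
void matmul_rows(float A[N][N], float B[N][N], float C[N][N])
{
    float tmp[N];

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++)
            C[i][j] = 0.0f;                  /* start from the ZERO matrix */

        for (int k = 0; k < N; k++) {
            vec_scale(A[i][k], B[k], tmp);   /* A[i][k] x (row k of B)  */
            vec_add(C[i], tmp);              /* add result to row i of C */
        }
    }
}
```

With i = 0, the inner loop performs exactly the 6 steps shown for the 1st row: 3 (constant × vector) operations, each followed by 1 (vector + vector) operation.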

Terminology: vector computer and data parallelism
 

  • SIMD computer = a computer that can perform a Single (= same) Instruction on Multiple Data items

  • SIMD computers are a.k.a. vector computers/processors

  • The parallel execution of the same operation on multiple data items is called:

      • data parallelism           

Pros and Cons of Vector instructions
 

  • Pros: faster processing

      • The time to execute 1 vector addition/multiplication (or any other vector operation) is about the same as the time to execute 1 ordinary addition/multiplication

      • 1 vector instruction on K-element vectors performs K times as many operations as 1 SISD instruction!

  • Cons: more expensive

      1. We need multiple ALU circuits to perform multiple simultaneous operations !!!      

      2. We need multiple system busses to bring the data from the memory to the CPU for operation.
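
The "K times" claim under Pros can be checked by counting instructions for the matrix multiplication example. The sketch below is a rough count only: the function names are made up, it assumes the vector length equals the matrix size n, and it ignores loads, stores and loop overhead.

```c
/* Instruction counts for an n x n matrix multiplication that starts
   from the zero matrix, as in the step-by-step example. */

/* SISD: each of the n*n output elements needs
   n multiplications and n additions. */
long sisd_instruction_count(long n)
{
    return n * n * (n + n);
}

/* SIMD with n-element vectors (an assumption of this sketch):
   each of the n output rows needs n (constant x vector) and
   n (vector + vector) instructions. */
long simd_instruction_count(long n)
{
    return n * (n + n);
}
```

For the 3 × 3 example this gives 54 SISD instructions against 18 vector instructions, a factor of K = 3.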