for (i = 0; i < N; i = i + 1)
   for (j = 0; j < N; j = j + 1)
      C[i*N + j] = 0.0;

for (i = 0; i < N; i = i + 1)
   for (j = 0; j < N; j = j + 1)
      for (k = 0; k < N; k = k + 1)
         C[i*N + j] = C[i*N + j] + A[i*N + k]*B[k*N + j];
HINT: the statement C[i*N + j] = 0.0 can also be part of the parallel section
HINT: you do NOT need any lock variables !!!
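One way to read the two hints: if every worker thread computes its own block of rows of C, no two threads ever write to the same element, so no locks are needed, and each thread can zero its own elements just before accumulating into them. The following is only a sketch under assumptions. The use of std::thread, the names multiplyRows and parallelMultiply, and the row-block partitioning are illustrative choices that are not part of the assignment; the course may expect a different thread library. It also assumes nThreads is at least 1.

```cpp
#include <thread>
#include <vector>

// Hypothetical worker: computes rows [rowBegin, rowEnd) of C = A * B.
// All matrices are N x N and stored row-major in flat arrays.
void multiplyRows(const double* A, const double* B, double* C,
                  int N, int rowBegin, int rowEnd)
{
    for (int i = rowBegin; i < rowEnd; i = i + 1)
        for (int j = 0; j < N; j = j + 1) {
            C[i*N + j] = 0.0;   // zeroing happens inside the parallel section
            for (int k = 0; k < N; k = k + 1)
                C[i*N + j] = C[i*N + j] + A[i*N + k]*B[k*N + j];
        }
}

// Hypothetical driver: splits the N rows into nThreads contiguous blocks.
// The blocks are disjoint, so the workers never write to the same element.
void parallelMultiply(const double* A, const double* B, double* C,
                      int N, int nThreads)
{
    std::vector<std::thread> workers;
    for (int t = 0; t < nThreads; t = t + 1) {
        int rowBegin = (int)((long long)t * N / nThreads);
        int rowEnd   = (int)((long long)(t + 1) * N / nThreads);
        workers.emplace_back(multiplyRows, A, B, C, N, rowBegin, rowEnd);
    }
    for (std::thread& w : workers)
        w.join();               // wait until every worker has finished
}
```

If nThreads is larger than N, some blocks are simply empty and those workers return immediately.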
Compile the program using CC -o pj8 pj8.C
The program is run using the command:
where N is the size of the matrices. The program constructs 2 random N x N matrices and multiplies them.
If the second argument to the program is "print", the result is printed; otherwise, the program simply exits.
The program pj8.C is intended for you to perform some performance experiments by comparing the execution times of your parallel version of matrix multiply and this sequential matrix multiply program.
Your program (name it "main" to make grading easy) must accept 3 arguments and is run using the following command:
The first argument N is the size of the matrices, the second argument NThreads is the number of worker threads. If the third argument to the program is "print", the result is printed; otherwise, the program simply exits.
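The argument handling described above could be sketched as follows. The struct Options and the function parseArgs are hypothetical names used only for illustration, and the sketch assumes that the third argument may be omitted and that N and NThreads must both be positive.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical holder for the command-line arguments.
struct Options {
    int  N;         // size of the matrices
    int  nThreads;  // number of worker threads
    bool print;     // true if the third argument is "print"
    bool ok;        // false if arguments are missing or not positive
};

Options parseArgs(int argc, char* argv[])
{
    Options opt = {0, 0, false, false};
    if (argc < 3)
        return opt;                       // need at least N and NThreads
    opt.N        = std::atoi(argv[1]);
    opt.nThreads = std::atoi(argv[2]);
    opt.print    = (argc > 3 && std::strcmp(argv[3], "print") == 0);
    opt.ok       = (opt.N > 0 && opt.nThreads > 0);
    return opt;
}
```

A main function would typically call parseArgs(argc, argv) first and exit with a usage message when ok is false.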
You need to make sure that the parallel matrix multiply program is correct; so in the testing phase, run your program as:
(Also try 4x4 and 5x5 matrices to be sure).
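Besides inspecting the printed result by eye, one optional way to check correctness is to compute the product a second time with the sequential loops on the same input matrices and compare the two results element by element. The helper below is a minimal sketch; the name maxAbsDiff is hypothetical and not part of the assignment.

```cpp
#include <cmath>

// Hypothetical helper: largest absolute element-wise difference
// between two N x N matrices stored as flat arrays.
double maxAbsDiff(const double* X, const double* Y, int N)
{
    double worst = 0.0;
    for (int i = 0; i < N*N; i = i + 1) {
        double d = std::fabs(X[i] - Y[i]);
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

If the parallel version adds the terms for each element in the same order as the sequential loops, the difference should be exactly 0.0; otherwise a small rounding difference is possible.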
After you have determined that your program is correct, run performance comparison tests using:
Compare the running times with the sequential version.
Run your performance tests on compute.mathcs.emory.edu (more CPUs) !!!
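For the timing comparison, a wall-clock measurement is the relevant one: CPU time as reported by std::clock is accumulated over all threads of the process on typical systems, so it would hide any speedup. The helper below is a sketch only; the name elapsedSeconds is hypothetical, and the assignment may intend a different way of measuring, for example an external timing command.

```cpp
#include <chrono>

// Hypothetical helper: runs work() once and returns the elapsed
// wall-clock time in seconds.
template <typename F>
double elapsedSeconds(F work)
{
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    work();
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

It can wrap either version of the multiply, passed as a lambda or function object, so that only the multiplication itself is timed and not the construction of the random matrices.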
/home/cs561000/turnin Makefile pj8
/home/cs561000/turnin yourfile.C pj8a