Another way to distribute the work load

Find the Minimum value in an array - take 2

Let's do the "Find min" example again, this time splitting the task of finding the minimum value in an array in a different manner.

Solution 2:

Split the array into 2 (approximately) equal, interleaved halves
Thread 0 finds the minimum in the even-indexed elements of the array
(i.e.: x[0], x[2], x[4], etc.)
Thread 1 finds the minimum in the odd-indexed elements of the array
(i.e.: x[1], x[3], x[5], etc.)
The main thread waits for the results and finds the actual minimum.

Pictorially:

               values handled by thread 0
 |   |   |   |   |   |   |   |   |   |   |   |   |   |
 V   V   V   V   V   V   V   V   V   V   V   V   V   V
|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^   ^
   |   |   |   |   |   |   |   |   |   |   |   |   |   |
               values handled by thread 1

          Thread 0            Thread 1
             |                   |
             |                   |
             V                   V
           min[0]              min[1]
               \               /
                \             /
                 \           /
                  \         /
                   \       /
                  main thread
                       |
                       |
                       V
                 Actual minimum

The division of labor in general is:

Main Thread: (UNCHANGED)

   // -----------------------------------
   // Create worker threads....
   // -----------------------------------
   for (i = 0; i < num_threads; i = i + 1)
   {
      start[i] = i;     // Pass ID to thread in a private variable

      if ( pthread_create(&tid[i], NULL, worker, (void *)&start[i]) )
      {
         cout << "Cannot create thread" << endl;
         exit(1);
      }
   }

   // -----------------------------------
   // Wait for worker threads to end....
   // -----------------------------------
   for (i = 0; i < num_threads; i = i + 1)
      pthread_join(tid[i], NULL);

   // ----------------------------------------
   // Post processing: Find actual minimum
   // ----------------------------------------
   my_min = min[0];

   for (i = 1; i < num_threads; i++)
      if ( min[i] < my_min )
         my_min = min[i];

Worker Thread: (CHANGED !!!)

   void *worker(void *arg)
   {
      int i, s;
      double my_min;

      s = * (int *) arg;      // Convert arg to an integer (this thread's ID)

      // --------------------------------------
      // Find min in my range
      // --------------------------------------
      my_min = x[s];

      for (i = s+num_threads; i < MAX; i += num_threads)
      {
         if ( x[i] < my_min )
            my_min = x[i];
      }

      min[s] = my_min;        // Store min in private slot

      return(NULL);           /* Thread exits (dies) */
   }

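The elements processed by thread s are: x[s], x[s + num_threads], x[s + 2*num_threads], etc. (every num_threads-th element, starting at index s)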
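To make the index pattern concrete, here is a minimal sketch that lists the indices one thread visits. The helper name indices_for_thread and its parameters are hypothetical (they do not appear in the demo program); the loop bounds mirror the worker's loop, with n playing the role of MAX.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the demo program):
// returns the indices that thread s visits when num_threads threads
// share an array of n elements in interleaved fashion.
std::vector<std::size_t> indices_for_thread(std::size_t s,
                                            std::size_t num_threads,
                                            std::size_t n)
{
    std::vector<std::size_t> idx;

    if (num_threads == 0)
        return idx;                    // no threads: nothing to visit

    // Same pattern as the worker: start at s, step by num_threads
    for (std::size_t i = s; i < n; i += num_threads)
        idx.push_back(i);

    return idx;
}
```

For instance, with 2 threads and 7 elements, thread 0 gets indices 0, 2, 4, 6 and thread 1 gets indices 1, 3, 5.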

It's much easier to code the worker thread !!!
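The pieces above can be put together in one self-contained function. This is only a sketch under assumptions, not the demo program itself: the names WorkerArg, strided_worker and find_min_interleaved are hypothetical, and each thread receives its data through a small struct instead of the global variables (x, min, num_threads, MAX) used in the notes.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical per-thread argument block (the notes use globals instead)
struct WorkerArg {
    const double *x;       // shared input array (read-only)
    std::size_t   n;       // number of elements in x
    std::size_t   id;      // this thread's ID: 0 .. num_threads-1
    std::size_t   stride;  // equals num_threads
    double        result;  // minimum found by this thread
};

// Same loop as the worker above: start at index id, step by num_threads
static void *strided_worker(void *p)
{
    WorkerArg *a = static_cast<WorkerArg *>(p);
    double my_min = a->x[a->id];

    for (std::size_t i = a->id + a->stride; i < a->n; i += a->stride)
        if (a->x[i] < my_min)
            my_min = a->x[i];

    a->result = my_min;    // store min in this thread's private slot
    return NULL;
}

// Returns the minimum of x, computed by num_threads interleaved workers
double find_min_interleaved(const std::vector<double> &x,
                            std::size_t num_threads)
{
    assert(!x.empty() && num_threads >= 1);
    if (num_threads > x.size())
        num_threads = x.size();   // every thread needs at least one element

    std::vector<pthread_t> tid(num_threads);
    std::vector<WorkerArg> arg(num_threads);

    // Create worker threads (stop early if one cannot be created)
    std::size_t started = 0;
    for (; started < num_threads; started++) {
        arg[started] = WorkerArg{ x.data(), x.size(), started, num_threads, 0.0 };
        if (pthread_create(&tid[started], NULL, strided_worker, &arg[started]) != 0)
            break;
    }

    // Wait for the worker threads that were started
    for (std::size_t t = 0; t < started; t++)
        pthread_join(tid[t], NULL);

    if (started < num_threads)
        throw std::runtime_error("Cannot create thread");

    // Post processing: find the actual minimum
    double my_min = arg[0].result;
    for (std::size_t t = 1; t < num_threads; t++)
        if (arg[t].result < my_min)
            my_min = arg[t].result;

    return my_min;
}
```

A call such as find_min_interleaved(v, 2) runs the two-thread case described above. Like the demo program, the sketch needs to be compiled with g++ -pthread.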

Example Program: (Demo above code)
- Prog file: min-mt2.C
Compile with: g++ -pthread min-mt2.C

Speed up...
- Try running the programs using different numbers of threads (the program prints the elapsed time)
- Notice that the first version has drastically improved times on multi-processors (e.g. on compute
  But the second version... not so much...
- $60,000 question:
  - Why is the second version not doing so great?
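- Answer: paging... With the interleaved split, every thread walks through the whole array and so touches every page of it, whereas a thread that scans one contiguous block touches only the pages of that block.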
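To see the difference in page usage, the following sketch counts how many distinct memory pages one thread touches under each split. It rests on several assumptions that are not in the notes: a page size of 4096 bytes, an array that starts on a page boundary, and a first version that gives thread s the contiguous range [s*n/T, (s+1)*n/T). All function names are hypothetical. The sketch only counts pages; it does not measure running time.

```cpp
#include <cstddef>
#include <set>

// ASSUMPTION: 4096-byte pages and an array that starts on a page boundary
const std::size_t PAGE_BYTES = 4096;

// Number of distinct pages touched when visiting the double elements
// with indices first, first+step, first+2*step, ... (all < last)
std::size_t pages_touched(std::size_t first, std::size_t last, std::size_t step)
{
    std::set<std::size_t> pages;

    if (step == 0)
        return 0;

    for (std::size_t i = first; i < last; i += step)
        pages.insert(i * sizeof(double) / PAGE_BYTES);   // page holding x[i]

    return pages.size();
}

// Interleaved split (this section): thread s visits s, s+T, s+2T, ...
std::size_t pages_interleaved(std::size_t s, std::size_t T, std::size_t n)
{
    return pages_touched(s, n, T);
}

// Contiguous split (ASSUMED layout of the first version):
// thread s visits the block [s*n/T, (s+1)*n/T)
std::size_t pages_block(std::size_t s, std::size_t T, std::size_t n)
{
    if (T == 0)
        return 0;
    return pages_touched(s * n / T, (s + 1) * n / T, 1);
}
```

Under these assumptions (and with 8-byte doubles), an array of 4096 doubles spans 8 pages. With 2 threads, each thread touches 4 pages under the contiguous split, but all 8 pages under the interleaved split.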