- A GPU device consists of multiple multi-processors (called "streaming multiprocessors", or SMs)
- Each multi-processor consists of a number of processors (= CUDA cores)
- Each processor (CUDA core) has a number of private (non-shared) registers
- The processors in the same multi-processor share a small amount of (fast) shared memory
- A GPU has a (large but slow) device memory that is shared by all multi-processors
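
The hierarchy listed above can be inspected and exercised from a small CUDA program. The following is a minimal sketch, not a definitive implementation: it assumes device 0 is the GPU of interest, and the names `block_sum`, `bytes_to_mib`, and `kThreadsPerBlock` are hypothetical, chosen only for illustration. It prints the multi-processor count and the register, shared memory, and device memory sizes reported by `cudaGetDeviceProperties`, then runs a kernel that touches all three levels. Note that the runtime does not report the number of CUDA cores per multi-processor directly; that figure depends on the GPU architecture.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreadsPerBlock = 256;  // illustrative choice (must be a power of two here)

// Host helper: convert a byte count to MiB for printing.
double bytes_to_mib(size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Each block sums kThreadsPerBlock elements of `in` and writes one value to `out`.
// Assumes the kernel is launched with exactly kThreadsPerBlock threads per block.
__global__ void block_sum(const float* in, float* out, int n) {
    __shared__ float tile[kThreadsPerBlock];    // shared memory: one copy per block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float x = (i < n) ? in[i] : 0.0f;           // read device memory into a private register
    tile[threadIdx.x] = x;
    __syncthreads();                            // make all writes to `tile` visible
    // Tree reduction inside the fast shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) tile[threadIdx.x] += tile[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = tile[0];  // write result back to device memory
}

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "No usable CUDA device found\n");
        return 1;
    }
    printf("Device:                      %s\n", prop.name);
    printf("Multi-processors (SMs):      %d\n", prop.multiProcessorCount);
    printf("Registers per SM:            %d\n", prop.regsPerMultiprocessor);
    printf("Shared memory per SM:        %zu bytes\n", prop.sharedMemPerMultiprocessor);
    printf("Shared memory per block:     %zu bytes\n", prop.sharedMemPerBlock);
    printf("Device (global) memory:      %.1f MiB\n", bytes_to_mib(prop.totalGlobalMem));

    const int n = 4 * kThreadsPerBlock;         // 4 blocks of input
    const int blocks = n / kThreadsPerBlock;
    std::vector<float> h_in(n, 1.0f), h_out(blocks, 0.0f);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));       // allocations live in device memory
    cudaMalloc(&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    block_sum<<<blocks, kThreadsPerBlock>>>(d_in, d_out, n);
    cudaMemcpy(h_out.data(), d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    for (int b = 0; b < blocks; ++b) printf("block %d sum = %.1f\n", b, h_out[b]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

In this sketch each thread keeps its input element in a register, the threads of one block cooperate through the `__shared__` array, and the input and output arrays live in device memory, which every multi-processor can read and write. Since the input is all ones, each block sum should come out equal to the block size.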