Fortran 90, Fortran 77, C, aC++: Exemplar Programming Guide > Chapter 6 Advanced shared-memory programming

Thread IDs and nested parallelism


As discussed in Chapter 4, "Basic shared-memory programming," you can manually parallelize nested loops and tasks to exploit up to two dimensions of parallelism. If you choose to do this, the first dimension must be node-parallel and the second must be thread-parallel. If thread-parallelism is exploited first, no dimensions are left; it is a programming error to attempt to spawn node-parallelism from within a thread-parallel construct. However, single-dimensional thread-parallel code can exploit all the threads on a system, even if they span hypernodes.

If you attempt to spawn thread-parallelism from within a thread-parallel construct and the two constructs are in the same routine, the compiler will ignore your directives on the inner thread-parallel construct. Consequently, the inner parallel construct will simply run serially. Calling a thread-parallel routine from another thread-parallel routine is considered an error but is not caught at compile-time.
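The nesting rules above can be summarized as a small decision function. This is only an illustrative sketch: the names par_level, nest_result, and classify_nesting are hypothetical and are not part of the compiler or its runtime. Node-parallelism nested inside node-parallelism is not covered by the text, so the sketch reports it as unspecified rather than guessing.

```c
#include <assert.h>

/* Hypothetical model of the nesting rules; not an HP API. */
typedef enum { LEVEL_NODE, LEVEL_THREAD } par_level;

typedef enum {
    NEST_PARALLEL,     /* inner construct spawns as requested */
    NEST_RUNS_SERIAL,  /* inner directives are ignored; inner code runs serially */
    NEST_ERROR,        /* programming error */
    NEST_UNSPECIFIED   /* case not described in this section */
} nest_result;

/* outer: level of the enclosing construct.
 * inner: level of the construct encountered inside it.
 * same_routine: nonzero if both constructs are in the same routine. */
nest_result classify_nesting(par_level outer, par_level inner, int same_routine)
{
    assert(outer == LEVEL_NODE || outer == LEVEL_THREAD);
    assert(inner == LEVEL_NODE || inner == LEVEL_THREAD);

    if (outer == LEVEL_NODE) {
        /* Node-parallel first, thread-parallel second is the supported order. */
        return (inner == LEVEL_THREAD) ? NEST_PARALLEL : NEST_UNSPECIFIED;
    }

    /* Outer construct is thread-parallel: no dimensions are left. */
    if (inner == LEVEL_NODE) {
        return NEST_ERROR;
    }

    /* Thread-in-thread: ignored by the compiler in the same routine,
     * an uncaught error when it happens across a routine call. */
    return same_routine ? NEST_RUNS_SERIAL : NEST_ERROR;
}
```

Note that the cross-routine case returns NEST_ERROR here even though the compiler does not diagnose it; the model describes what is legal, not what is detected.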

Thread ID assignments

Chapter 3, "Compiler optimizations," discusses how programs are initiated as a collection of threads, one per available processor, and how all but thread 0 are idle until parallelism is encountered. We will now discuss the details of how threads are spawned and assigned IDs.

When a process begins, the threads created to run it have unique kernel thread IDs. Thread 0, which runs all the serial code in the program, has kernel thread ID 0; the rest of the threads have unique but unspecified kernel thread IDs at this point. The num_threads() intrinsic will return the number of threads created, regardless of how many are active when it is called.

When thread 0 encounters parallelism, it spawns some or all of the threads created at program start. This means it causes these threads to go from idle to active, at which point they begin working on their share of the parallel code. All available threads are spawned by default, but this can be changed using various compiler directives.

If the parallel structure is thread-parallel, then num_threads() threads will be spawned, subject to user-specified limits. At this point, kernel thread 0 becomes spawn thread 0, and the spawned threads are assigned spawn thread IDs ranging from 0 to num_threads()-1 (the thread that was kernel thread 0 takes the first ID in this range). If you manually limit the number of spawned threads, these IDs will range from 0 to one less than your limit. If you attempt to spawn thread-parallelism within an already thread-parallel structure, the thread attempting to spawn will acquire spawn thread ID 0. If all threads attempt to spawn thread-parallelism in this manner, they will all become spawn thread 0, each in a unique context.
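As a rough illustration of the ID range described above, the following sketch computes the first and last spawn thread IDs for a thread-parallel construct. The names id_range and thread_spawn_ids are hypothetical. The available argument stands in for the value num_threads() would return, and treating a limit of zero or less as "no limit" and capping the limit at the available thread count are assumptions of this sketch, not documented behavior.

```c
#include <assert.h>

/* Hypothetical helper type; not an HP API. */
typedef struct {
    int first_id;  /* always 0: the spawning thread becomes spawn thread 0 */
    int last_id;   /* one less than the number of threads spawned */
} id_range;

/* available:  stand-in for the value num_threads() would return.
 * user_limit: user-specified limit; <= 0 means "no limit" (assumption). */
id_range thread_spawn_ids(int available, int user_limit)
{
    assert(available >= 1);

    int count = available;
    /* Assumption: a limit larger than the available count has no effect. */
    if (user_limit > 0 && user_limit < available) {
        count = user_limit;
    }

    id_range r = { 0, count - 1 };
    return r;
}
```

For example, with 16 threads created and no limit, the sketch yields IDs 0 through 15; with a limit of 4, it yields IDs 0 through 3.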

If the parallel structure is node-parallel, then num_nodes() threads will be spawned, one per available hypernode, subject to user-specified limits. Again, kernel thread 0 becomes spawn thread 0, and in this case, the spawn thread IDs range from 0 to num_nodes()-1, subject to user limits as described above.

If thread-parallelism is then encountered within this node-parallelism, num_node_threads() threads will be spawned on the hypernode or hypernodes encountering the thread-parallelism. These spawned threads will have spawn thread IDs, which are specific to the hypernode they are running on, ranging from 0 to num_node_threads()-1, with spawn thread ID 0 belonging to the initial thread that executes the spawn. num_node_threads() may return a different value on each hypernode when called from node-parallel code.

Note that, with nested parallelism, a node-parallel thread that encounters a thread-parallel construct becomes spawn thread 0 on that hypernode regardless of its previous spawn thread ID. When this thread exits the thread-parallel construct, it returns to its previous spawn thread ID. The my_thread() intrinsic function returns the caller's spawn thread ID, which depends on the level of parallelism.
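One way to picture how a thread's spawn thread ID changes on entering and leaving a nested construct is as a two-level stack. The sketch below is a model only: spawn_ctx, enter_construct, exit_construct, and model_my_thread are hypothetical names, and returning 0 outside any parallel construct is an assumption based on thread 0 running all serial code.

```c
#include <assert.h>

#define MAX_DEPTH 2  /* node-parallel first, then thread-parallel */

/* Hypothetical per-thread record of spawn thread IDs; not an HP API. */
typedef struct {
    int ids[MAX_DEPTH];  /* spawn thread ID held at each nesting level */
    int depth;           /* number of parallel constructs currently entered */
} spawn_ctx;

/* Entering a construct gives the thread a new spawn thread ID. */
void enter_construct(spawn_ctx *c, int spawn_id)
{
    assert(c->depth < MAX_DEPTH);
    c->ids[c->depth] = spawn_id;
    c->depth++;
}

/* Leaving a construct restores the ID held at the enclosing level. */
void exit_construct(spawn_ctx *c)
{
    assert(c->depth > 0);
    c->depth--;
}

/* Models what my_thread() reports: the ID at the current level.
 * Assumption: outside any parallel construct the result is 0. */
int model_my_thread(const spawn_ctx *c)
{
    return (c->depth > 0) ? c->ids[c->depth - 1] : 0;
}
```

A node-parallel thread with spawn thread ID 3 that enters a thread-parallel construct would call enter_construct with ID 0, see 0 from model_my_thread while inside, and see 3 again after exit_construct.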

© Hewlett-Packard Development Company, L.P.