Intel Extensions

The Intel® C++ Compiler implements the following groups of functions as extensions to the OpenMP* run-time library: stack size functions and memory allocation functions.

The Intel extensions described in this section can be used for low-level debugging to verify that the library code and the application are functioning as intended. Use these functions with caution: using them requires the -openmp_stubs command-line option to execute the program sequentially. In addition, other vendors' OpenMP-compliant compilers generally do not recognize these functions, so the link stage may fail with those compilers.

Note

The functions below require the pre-processor directive #include <omp.h>.

Stack Size

In most cases, directives or environment variables can be used in place of extensions. For example, the stack size of the parallel threads may be set using the KMP_STACKSIZE environment variable rather than the kmp_set_stacksize_s() function.

Note

A run-time call to an Intel extension takes precedence over the corresponding environment variable setting. See the definitions of stack size functions in the Stack Size table below.
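The following is a minimal sketch of setting the stack size at run time, not a definitive implementation. The helper name request_thread_stack and the macro USE_KMP_EXTENSIONS are hypothetical: the macro is assumed to be defined by you (for example on the command line) only when building with a compiler whose omp.h declares the kmp_ functions, so that the same source still builds with other vendors' compilers.

```c
#include <stddef.h>

/* USE_KMP_EXTENSIONS is a hypothetical, user-defined macro. Define it only
 * when the compiler's <omp.h> declares the Intel kmp_* extensions. */
#ifdef USE_KMP_EXTENSIONS
#include <omp.h>
#endif

/*
 * Hypothetical helper: request `bytes` of private stack for each parallel
 * thread and return the size the run-time library reports afterwards.
 * Returns 0 when the extensions are not compiled in.
 */
static size_t request_thread_stack(size_t bytes)
{
#ifdef USE_KMP_EXTENSIONS
    /* Must run before the first parallel region to have an effect.
     * The call takes precedence over the KMP_STACKSIZE setting. */
    kmp_set_stacksize_s(bytes);
    return kmp_get_stacksize_s();
#else
    (void)bytes;  /* nothing to configure without the extensions */
    return 0;
#endif
}
```

A program would call request_thread_stack() once at the start of main(), before any parallel region executes. The value returned is whatever the run-time library reports, which is not guaranteed to equal the requested size.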

Memory Allocation

The Intel® C++ Compiler implements a group of memory allocation functions as extensions to the OpenMP run-time library to enable threads to allocate memory from a heap local to each thread. These functions are kmp_malloc(), kmp_calloc(), and kmp_realloc(). Memory allocated by these functions must be freed with the kmp_free() function. Memory may legally be allocated by one thread and freed with kmp_free() by a different thread, but this mode of operation incurs a slight performance penalty. See the definitions of these functions in the Memory Allocation table below.
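As an illustration of the allocate-and-free pattern described above, the sketch below has each thread build, grow, and release its own scratch buffer, so every block is freed by the thread that allocated it. The names TL_CALLOC, TL_REALLOC, TL_FREE, scratch_sum, count_failures, and the macro USE_KMP_EXTENSIONS are hypothetical. The macro is assumed to be defined by you only when the compiler's omp.h declares the kmp_ functions; otherwise the sketch falls back to the standard C allocator so that it still builds elsewhere.

```c
#include <stddef.h>
#include <stdlib.h>

/* USE_KMP_EXTENSIONS is a hypothetical, user-defined macro. */
#ifdef USE_KMP_EXTENSIONS
#include <omp.h>
#define TL_CALLOC(n, sz)  kmp_calloc((n), (sz))   /* thread-local heap */
#define TL_REALLOC(p, sz) kmp_realloc((p), (sz))
#define TL_FREE(p)        kmp_free(p)
#else
#define TL_CALLOC(n, sz)  calloc((n), (sz))       /* portable fallback */
#define TL_REALLOC(p, sz) realloc((p), (sz))
#define TL_FREE(p)        free(p)
#endif

/* Allocates a zeroed buffer, grows it to n elements, fills it with
 * 0..n-1, and returns the sum. Returns -1 if an allocation fails. */
static long scratch_sum(size_t n)
{
    size_t half = n / 2;
    long *buf = TL_CALLOC(half ? half : 1, sizeof *buf);
    if (buf == NULL)
        return -1;

    long *grown = TL_REALLOC(buf, (n ? n : 1) * sizeof *buf);
    if (grown == NULL) {
        TL_FREE(buf);          /* the original block is still valid */
        return -1;
    }
    buf = grown;

    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        buf[i] = (long)i;
        sum += buf[i];
    }

    TL_FREE(buf);              /* freed by the allocating thread */
    return sum;
}

/* Runs scratch_sum() in every thread of a parallel region and counts
 * the threads that did not get the expected result. */
static int count_failures(size_t n)
{
    long expected = n ? (long)(n * (n - 1) / 2) : 0;
    int failures = 0;

    #pragma omp parallel reduction(+:failures)
    {
        if (scratch_sum(n) != expected)
            failures += 1;
    }
    return failures;
}
```

When the program is built without OpenMP support, the pragma is ignored and the body runs once on the calling thread. Keeping allocation and release in the same thread avoids the cross-thread kmp_free() penalty mentioned above.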

Stack Size

Function | Description
kmp_get_stacksize_s() | Returns the number of bytes that will be allocated for each parallel thread to use as its private stack. This value can be changed with kmp_set_stacksize_s() prior to the first parallel region, or with the KMP_STACKSIZE environment variable.
kmp_get_stacksize() | Provided for backward compatibility only. Use kmp_get_stacksize_s() for compatibility across different families of Intel processors.
kmp_set_stacksize_s(size) | Sets the number of bytes that will be allocated for each parallel thread to use as its private stack to size. This value can also be set with the KMP_STACKSIZE environment variable. To have an effect, kmp_set_stacksize_s() must be called before the beginning of the first (dynamically executed) parallel region in the program.
kmp_set_stacksize(size) | Provided for backward compatibility only. Use kmp_set_stacksize_s() for compatibility across different families of Intel processors.

Memory Allocation

Function | Description
kmp_malloc(size) | Allocates a memory block of size bytes from the thread-local heap.
kmp_calloc(nelem, elsize) | Allocates an array of nelem elements, each of size elsize, from the thread-local heap.
kmp_realloc(ptr, size) | Reallocates the memory block at address ptr to size bytes from the thread-local heap.
kmp_free(ptr) | Frees the memory block at address ptr from the thread-local heap. The memory must have been previously allocated with kmp_malloc(), kmp_calloc(), or kmp_realloc().