Setting Optimizations with -On Options

The following table details the effects of the -O0, -O1, -O2, -O3, and -fast options. For each option, the table first describes the behavior shared by IA-32 and Itanium architectures, then any behavior specific to one architecture.

Option

Effect

-O0

Disables -On optimizations. On IA-32 systems, this option sets the -fp option.

-O1

Optimizes to favor code size and code locality.
Disables loop unrolling.
May improve performance for applications with very large code size, many branches, and execution time not dominated by code within loops.
In most cases, -O2 is recommended over -O1.
IA-32 systems:
Disables intrinsics inlining to reduce code size. Enables optimizations for speed. Also disables intrinsic recognition and the -fp option. This option is the same as the -O2 option.
Itanium-based systems:
Disables software pipelining and global code scheduling. Enables optimizations for server applications (straight-line and branch-like code with a flat profile). Optimizes for speed while keeping code size in check, which is why software pipelining and loop unrolling are turned off.
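
The size-versus-speed trade behind disabling loop unrolling can be pictured with a hand-written C sketch. The function names sum_rolled and sum_unrolled4 are hypothetical, and the second function only illustrates the general shape of an unrolled loop; it is not the compiler's actual output.

```c
#include <stddef.h>

/* Rolled form: one addition and one loop branch per element.
   Smallest code; roughly what you keep when unrolling is disabled. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4 (illustrative only): fewer loop branches,
   but a larger loop body plus a remainder loop. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)          /* leftover elements when n % 4 != 0 */
        s += a[i];
    return s;
}
```

Both functions return the same sum; the unrolled version simply spends more code bytes to execute fewer branches, which is the cost that a size-oriented level such as -O1 avoids.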

-O2, -O

This option is the default for optimizations. However, if -g is specified, the default is -O0.
Optimizes for code speed.
This is the generally recommended optimization level.

On IA-32 systems, this option is the same as the -O1 option.

Itanium-based systems:
Enables optimizations for speed, including global code scheduling, software pipelining, predication, and speculation.

On these systems, the -O2 option enables inlining of intrinsics. It also enables the following capabilities for performance gain: constant propagation, copy propagation, dead-code elimination, global register allocation, global instruction scheduling and control speculation, loop unrolling, optimized code selection, partial redundancy elimination, strength reduction/induction variable simplification, variable renaming, exception handling optimizations, tail recursions, peephole optimizations, structure assignment lowering and optimizations, and dead store elimination.
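
As an illustration of one item in the list above, strength reduction with induction variable simplification can be sketched by hand in C. The function names are hypothetical, and the second function shows only the general idea of the transformation, assuming every product fits in an int; the code the compiler actually generates may differ.

```c
#include <stddef.h>

/* As written in source: a multiply is performed on every iteration. */
void fill_multiply(int *out, size_t n, int stride)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i * stride;
}

/* Strength-reduced form (illustrative): the multiply by the loop
   index is replaced with a running addition on a new variable v. */
void fill_strength_reduced(int *out, size_t n, int stride)
{
    int v = 0;                  /* always equals i * stride */
    for (size_t i = 0; i < n; i++) {
        out[i] = v;
        v += stride;
    }
}
```

The two loops store identical values; the second replaces a comparatively expensive operation with a cheaper one, which is the kind of rewrite an optimizer can apply automatically.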

-O3

Enables -O2 optimizations and, in addition, more aggressive optimizations such as prefetching, scalar replacement, and loop and memory access transformations. Enables optimizations for maximum speed, but does not guarantee higher performance unless loop and memory access transformations take place. In some cases, -O3 optimizations may slow code down compared to -O2 optimizations. Recommended for applications with loops that make heavy use of floating-point calculations and process large data sets.
IA-32 systems:
In conjunction with -ax{K|W|N|B|P} or -x{K|W|N|B|P} options, this option causes the compiler to perform more aggressive data dependency analysis than for -O2. This may result in longer compilation times.

On Itanium-based systems, enables optimizations for technical computing applications (loop-intensive code): loop optimizations and data prefetch.
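
Scalar replacement, one of the optimizations named for -O3, can be pictured with a hand-written C sketch of a matrix-vector product. The function names are hypothetical, and the second version is an illustration of the idea rather than the compiler's output; a compiler can apply it only when it can show that y does not overlap a or x.

```c
#include <stddef.h>

/* As written: y[i] is read from and written to memory on every
   inner-loop iteration. */
void matvec_direct(const double *a, const double *x, double *y,
                   size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++) {
        y[i] = 0.0;
        for (size_t j = 0; j < cols; j++)
            y[i] += a[i * cols + j] * x[j];
    }
}

/* After scalar replacement (illustrative): the repeated array
   reference y[i] is held in a local scalar and stored once per row. */
void matvec_scalar(const double *a, const double *x, double *y,
                   size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++) {
        double acc = 0.0;       /* stands in for y[i] inside the loop */
        for (size_t j = 0; j < cols; j++)
            acc += a[i * cols + j] * x[j];
        y[i] = acc;
    }
}
```

Both versions add the products in the same order, so they produce the same results; the second removes a load and a store from the inner loop, which matters most in the loop-intensive, floating-point-heavy code for which -O3 is recommended.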

-fast

This option is a single, simple way to enable a collection of optimizations for run-time performance. It sets the following options:

-O3: maximum speed and high-level optimizations (see above)

-ipo: enables interprocedural optimizations across files

-static: prevents linking with shared libraries

To override one of the options set by -fast, specify that option after the -fast option on the command line.

The options set by the -fast option may change from release to release.

IA-32 systems:

In conjunction with -ax{K|W|N|B|P} or -x{K|W|N|B|P} options, this option provides the best run-time performance.