Use -unroll[n] to specify the maximum number of times you want to unroll a loop. The following example unrolls a loop at most four times:
prompt>ifc -unroll4 a.f
To disable loop unrolling, specify n as 0. The following example disables loop unrolling:
prompt>ifc -unroll0 a.f
Omit n to let the compiler decide whether to unroll a loop.
The Itanium compiler currently honors only -unroll0 (n = 0); all other values are ignored (no-ops).
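To make the transformation concrete, the following is a minimal hand-written sketch of what unrolling a loop by a factor of four looks like. It is written in C purely for illustration. The function names axpy_rolled and axpy_unrolled4 are hypothetical, and the code the compiler actually generates for -unroll4 may differ.

```c
#include <stddef.h>

/* Rolled form: one counter update and one loop branch per element. */
void axpy_rolled(float *y, const float *x, float a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Hand-unrolled by four: one counter update and one loop branch
   per four elements, plus a cleanup loop for the leftover elements. */
void axpy_unrolled4(float *y, const float *x, float a, size_t n)
{
    size_t i = 0;

    /* Main loop: four copies of the original body per iteration. */
    for (; i + 3 < n; i += 4) {
        y[i]     = a * x[i]     + y[i];
        y[i + 1] = a * x[i + 1] + y[i + 1];
        y[i + 2] = a * x[i + 2] + y[i + 2];
        y[i + 3] = a * x[i + 3] + y[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result. The unrolled version executes roughly a quarter as many loop branches, at the price of a larger loop body and the extra remainder loop.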
The benefits are:
Unrolling eliminates branches and some of the loop-control code, such as counter updates and exit tests.
Unrolling enables you to aggressively schedule (or pipeline) the loop to hide latencies if you have enough free registers to keep variables live.
On Pentium-family processors, unroll inner loops as follows:
- Pentium 4 processor, until they have a maximum of 16 iterations
- Pentium III or Pentium II processors, until they have a maximum of 4 iterations
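The per-processor iteration targets listed above (16 for the Pentium 4, 4 for the Pentium III or Pentium II) can be turned into a rough unroll factor when the trip count is known. The sketch below is one reading of that guideline, not a documented formula. The helper name unroll_factor_for is hypothetical, and the sketch ignores loop body size, which also limits how far unrolling is worthwhile.

```c
/* Smallest unroll factor that leaves at most max_iters iterations
   in the unrolled loop (leftover iterations go to a remainder loop).
   Returns 1 when the loop already meets the target.
   Hypothetical helper: an interpretation of the guideline,
   not part of any compiler interface. */
unsigned unroll_factor_for(unsigned trip_count, unsigned max_iters)
{
    if (max_iters == 0 || trip_count <= max_iters)
        return 1;

    /* Ceiling of trip_count / max_iters, written to avoid overflow. */
    return trip_count / max_iters + (trip_count % max_iters != 0);
}
```

For example, a 64-iteration inner loop gives unroll_factor_for(64, 16) == 4 for the Pentium 4 target, which matches the -unroll4 example, and unroll_factor_for(64, 4) == 16 for the Pentium III or Pentium II target. A factor that large is only reasonable when the loop body is small.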
The potential costs are:
Excessive unrolling, or unrolling of very large loops, can lead to increased code size.
If the number of iterations of the unrolled loop is 16 or fewer, the branch predictor should be able to correctly predict branches in the loop body that alternate direction.
For more information on how to optimize with -unroll[n], refer to Intel