The Compiler supports the two IEEE standard formats (32 and 64 bits wide) for floating-point types. The following table shows the range of values for the various floating-point representations.
By default, the Compiler implements float in the IEEE 32-bit format and double in the IEEE 64-bit format. If speed matters more than the added accuracy of double arithmetic, use the -Fd: Double is IEEE32 command-line option. With this option, the Compiler implements both float and double in the IEEE 32-bit format.
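To get a feel for the accuracy given up when arithmetic is done in the 32-bit format, the following sketch accumulates the same step value once in float and once in double. The function names (`sum_as_float`, `sum_as_double`, `abs_error`, `report_accumulation_error`) are made up for this illustration and are not part of the Compiler or its library. The sketch assumes a build in which double is still IEEE64; if double is also implemented as IEEE32, both sums would be expected to show a similar error.

```c
#include <stdio.h>

/* Add `step` to an accumulator `count` times in 32-bit float arithmetic.
   `volatile` forces every partial sum to be rounded to float, even on
   hosts that would otherwise keep it in a wider register. */
static float sum_as_float(float step, long count)
{
    volatile float acc = 0.0f;
    long i;

    for (i = 0; i < count; ++i) {
        acc += step;
    }
    return acc;
}

/* Same accumulation, carried out in double arithmetic. */
static double sum_as_double(double step, long count)
{
    double acc = 0.0;
    long i;

    for (i = 0; i < count; ++i) {
        acc += step;
    }
    return acc;
}

/* Absolute difference between a computed value and the expected one. */
static double abs_error(double value, double expected)
{
    double diff = value - expected;
    return (diff < 0.0) ? -diff : diff;
}

/* Print both sums of 0.1 added one million times (exact answer: 100000). */
static void report_accumulation_error(void)
{
    double as_float = (double)sum_as_float(0.1f, 1000000L);
    double as_double = sum_as_double(0.1, 1000000L);

    printf("float  sum: %.6f (error %.6f)\n",
           as_float, abs_error(as_float, 100000.0));
    printf("double sum: %.6f (error %.6f)\n",
           as_double, abs_error(as_double, 100000.0));
}
```

Calling `report_accumulation_error()` from `main` prints both results side by side. Whether the larger float error is acceptable depends on the application; the sketch only makes the trade-off visible.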
Use the -T: Flexible Type Management option to change the default format of a float or double.
| Type | Default Format | Default Value Range: Min | Default Value Range: Max | Formats Available with the -T Option |
|---|---|---|---|---|
| float | IEEE32 | 1.17549435E-38F | 3.402823466E+38F | IEEE32, IEEE64 |
| double | IEEE64 | 2.2250738585072014E-308 | 1.7976931348623157E+308 | IEEE32, IEEE64 |
| long double | IEEE64 | 2.2250738585072014E-308 | 1.7976931348623157E+308 | IEEE32, IEEE64 |
| long long double | IEEE64 | 2.2250738585072014E-308 | 1.7976931348623157E+308 | IEEE32, IEEE64 |
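Because the format of float and double can change with the options used, it can help to check in code which widths and limits a given build actually provides. The sketch below relies only on the standard `<float.h>` and `<limits.h>` macros; the helper names (`format_bits`, `double_matches_float`, `require_ieee64_double`, `print_float_formats`) are hypothetical and chosen for this example. Note that the storage width by itself does not prove IEEE conformance; it only shows which of the two sizes is in effect.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Width in bits of a type, given its sizeof() result in bytes. */
static unsigned format_bits(size_t size_in_bytes)
{
    return (unsigned)(size_in_bytes * CHAR_BIT);
}

/* Returns 1 when double occupies the same storage as float, which is what
   one would expect when both types use the 32-bit format. */
static int double_matches_float(void)
{
    return sizeof(double) == sizeof(float);
}

/* Stop early if this build does not provide the 64-bit double that the
   calling code assumes. */
static void require_ieee64_double(void)
{
    assert(format_bits(sizeof(double)) == 64u);
}

/* Print the width and the smallest normalized and largest finite values
   of float and double, as reported by <float.h>. */
static void print_float_formats(void)
{
    printf("float:  %u bits, min %.9g, max %.9g\n",
           format_bits(sizeof(float)), (double)FLT_MIN, (double)FLT_MAX);
    printf("double: %u bits, min %.17g, max %.17g\n",
           format_bits(sizeof(double)), DBL_MIN, DBL_MAX);
    printf("double uses the float width: %s\n",
           double_matches_float() ? "yes" : "no");
}
```

Code that depends on the 64-bit range could call `require_ieee64_double()` once at start-up, and `print_float_formats()` offers a quick comparison of the build's limits with the table above.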