Abstract
Interval arithmetic has traditionally been hampered by low performance of programs written to take advantage of it. I suggest a method based on IEEE Std 754-2008, which has the potential of being much faster.
Introduction
Interval arithmetic is one attempt to answer the question of computation accuracy - given a set of floating-point operations, how accurate are the results?
Interval arithmetic replaces a single floating-point number with a range of numbers which we know contains the value. For example, measuring a 3m x 4m room with a metric tape measure might give results accurate to within 0.5cm, which would be represented as [2.995, 3.005] x [3.995, 4.005]. Any operations on intervals are performed in a manner which preserves this property. For example,
add(a, b) == add([a_inf, a_sup], [b_inf, b_sup]) == [a_inf + b_inf, a_sup + b_sup].
A basic implementation
The basic idea of interval arithmetic is to perform each operation twice – once in “round to -infinity” (RD) mode, and once in “round to +infinity” (RU) mode. While this is guaranteed to work, it has the following disadvantages:
-
Each operation incurs three changes of the rounding mode – to RU, to RD, and back to the default (typically “round to nearest” or RN).
-
On most modern processors, each of these switches is extremely expensive. For example, on late-model Intel processors reading the SSE rounding mode is expensive, while setting it is a serializing operation!
The RU approach
An approach that works nicely for some operations is to take advantage of the fact that RD( a ) == -RU( -a ). In order to do this, we store the interval as [-a, b] rather than [a, b]. It is obvious that addition/subtraction may be performed directly with only RU, while multiplication and division may be performed with some additional sign-bit flipping.
This has the advantages of:
However, it cannot be universally used. For example, sqrt(-a) == NaN, not -sqrt(a).
The “round to nearest” approach
The problem of interval arithmetic boils down to an issue of selection. Given an infinitely-precise result A for one of the bounds, we need to round down if we are calculating the lower bound, or round up if we are calculating the upper bound of the interval.
We know that RN( A ) is equal to either RD( A ) or RU( A ), but which?
It turns out that this information can be made available. Furthermore, use of “round to nearest” mode is required in order to make this work.
Exact transforms
Given two floating-point numbers a, b, it is possible to calculate exact transforms as follows:
- TwoSum() - (a, b) are transformed into (s,e) such that s = RN(a + b) and e = a + b - s
s = RN(a + b)
a' = RN(s - b)
b' = RN(s - a')
da = RN(a - a')
db = RN(b - b')
e = RN(da + db)
The TwoSum() algorithm requires 6 floating-point operations.
- TwoProd - (a, b) are transformed into (p, e) such that p = RN(a * b) and e = a * b - p
Split(x, n)
Require: C = 2^n + 1, where n = (bits in mantissa + 1)/2
t1 = RN(C * x)
t2 = RN(x - t1)
xh = RN(t1 + t2)
xl = RN(x - xh)
TwoProd(x, y)
(xh, xl) = Split(x, s)
(yh, yl) = Split(y, s)
p = RN(x * y)
t1 = RN(-p + RN(xh * yh))
t2 = RN(t1 + RN(xh * yl))
t3 = RN(t2 + RN(xl * yh))
e = RN(t3 + RN(xl * yl))
The TwoProd() algorithm requires 17 floating-point operations
If e is negative, then RN( a + b ) == RU( a + b ), and RD( a + b ) == prev( s ). If e is positive, then RN( a + b ) == RD( a + b ), and RU( a + b ) == succ( s ). Similar arguments apply to TwoProd().
Basic Functions
The TwoSum() and TwoProd() procedures are sufficient for performing interval addition/subtraction and multiplication. If an fma() operation is present (it is mandatory under IEEE 754-2008), it is possible to calculate the following functions:
p = RN(a * b)
e = fma(a, b, -p)
The TwoProdFMA() algorithm performs a TwoProd() with 2 floating-point operations, instead of 17.
- Division() - q = RN(a / b); r = -fma(q, b, -a)
- Reciprocal() - q = RN(1.0 / b); r = -fma(q, b, -1.0)
- Sqrt() - q = RN( sqrt(a) ); r = -fma(q, q, -a)
If r is negative, then the result = RN( A ) == RU( A ) and RD( A ) == prev( result ). If r is positive, then the result = RN( A ) == RD( A ) and RU( A ) == succ( result ).
Note that:
- prev(x) and succ(x) are required by IEEE Std 754, and are implemented in C++ 2011 as std::nextafter()
- If an fma() operation is not present in hardware, it may be simulated in software by using the TwoProd() and TwoSum() operations. This requires a total of 35 floating-point operations – 17 for TwoProd(), and 18 for the 3 TwoSum() operations required.
Testing
The following C++ implementations of interval arithmetic were tested:
-
“Basic” interval arithmetic – sets the rounding mode to RU, RD, and default as necessary
-
“RU” interval arithmetic – stores [a, b] as [-a, b], eliminating the requirement to set the rounding mode to RD in many cases.
-
“round to nearest” interval arithmetic – implements the algorithms defined in this paper.
In all cases, SIMD instructions were used if available and useful.
The test involves 100,000,000 operations with random operands, as follows:
- Addition
- Multiplication
- division (0.0 not included in the denominator’s interval)
- square sqr(x) = (x*x)
- square root
- hypot(x, y) = sqrt(sqr(x), sqr(y)).
This hypot() function was chosen because it includes addition (which may be optimized by the RU method), squaring (which may also optimized by the RU method), and square root (which may be optimized only by the “round to nearest” method). It is also easily calculated in straightforward C++ code.
The code was compiled using Visual Studio 2013 in x64 Release mode, using the /fp:strict compiler option. Note that use of this option is essential if accurate results are to be obtained.
The results (in nanoseconds per operation) on an Intel Core i7 2670QM processor are as follows:
Speed Test Results
| Basic
| RU
| Round to nearest
|
Addition
| 104
| 79
| 25
|
Multiplication
| 230
| 194
| 61
|
Division
| 280
| 217
| 94
|
Square
| 238
| 185
| 42
|
Square Root
| 139
| 83
| 44
|
hypot()
| 804
| 702
| 173
|
Notes:
-
The times are given after subtracting the loop overhead and the time required to generate random inputs.
-
Unlike the latest Intel processors, this 2670QM processor does not contain fma() in hardware, and must simulate it in software. This significantly reduces the advantage of the “round to nearest” method for multiplication, division, square, and square root operations.
Summary
An interval arithmetic library that uses round-to-nearest mode is practical, and on current popular hardware – significantly (3+ times) faster than the directed rounding approach.
Future directions
An interval math package that will include most of the forward functions required by the IEEE-1788 Standard for interval arithmetic is currently under development.
References
-
IEEE Std 754-2008
-
IEEE Std 1788-2015
-
Handbook of Floating-Point Arithmetic, Jean-Michel Muller et al., Birkhäuser
-
Interval Arithmetic: from Principles to Implementation, T. Hickey, Q. Ju, M.H. van Emden
-
Interval Operations in Rounding to Nearest, S.M. Rump, P. Zimmerman, S. Boldo, G. Melquiond