Introduction
Here’s a disappearing trick... Take an object, any object. Scale it to fit inside a unit cube. Now translate this so the unit cube has opposing corners at (1,1,1) and (2,2,2). Guess what! All the coordinates are now positive and have the same exponent. Strip out the sign bits and the exponents and save just a single exponent and all the mantissas. The object now takes up much less space. Was anything lost? Nothing of real value actually.
This simple trick belies a much more significant issue that can help save the planet. Read on…
Mantissa ‘Space’
Floating point numbers are made up of a sign bit, a mantissa and an exponent. To cut a long story short, floating point numbers can be thought of as a logical signed n-bit mantissa and a logical signed exponent where:
RealValue = (LogicalMantissa / (2^(n-1)) * 2^LogicalExponent
A signed n-bit number can range from -2^(n-1)
to 2^(n-1)-1
, for example -32768
to 32767
for a 16-bit number. The formula “(LogicalMantissa / (2^(n-1))
” is used as the value ranges from -1.0
to almost 1.0
for any chosen value of n
and this makes the following discussion much easier.
If the logical mantissa is less than half its range, the number can have the mantissa doubled, by left shifting, and the exponent decremented without affecting the real value in any way. This shift and decrement can be repeated until the top two bits, the sign bit and its neighbour, differ. Since a normalized mantissa’s top two bits always differ, then only one of them actually needs to be stored. Floating point hardware does this normalization automatically, i.e., programmers only ever see normalized floating point numbers.
Note: Zero doesn’t fit this technique, so it is treated as special and has all zero bits. For all the other gory floating point details, go look it up somewhere else.
A normalized mantissa means the (LogicalMantissa / (2^(n-1))
value ranges from -1.0
to almost -0.5
and 0.5
to almost 1.0
. Values between -0.5
to almost 0.5
are, by definition, not normalized and so they would have been adjusted to have a lower exponent with the mantissa scaled up to be within the two normalized ranges [-1.0,-0.5)
and [0.5,1.0)
.
Note: "[
" includes the value, ")
" means up to but not included.
In one sense, the exponent defines a one dimensional ‘space’ in which the integral mantissa exists. Adding or subtracting numbers requires the floating point hardware to denormalize the mantissa of the smaller number by right shifting it into the mantissa space of the larger number. An integer calculation is then done and the result renormalized if necessary.
Consider plotting all the real values corresponding to all the normalized (LogicalMantissa / (2^(n-1))
values for a number with a logical exponent of zero. On a number line, they will be equally spaced points from -1.0
up to but not including -0.5
and 0.5
up to but not including 1.0
. Similarly, plotting the points for a number with a logical exponent of -1
has the same number of points packed into half the range, this doubles the resolution, the range being from -0.5
up to but not including -0.25
and 0.25
up to but not including 0.5
. This pattern repeats itself up and down for all possible exponent values. See figure 1.
Figure 1. Floating point values shown on a number line
In this way, the mantissa spaces bound each other in a logarithmic fashion having an extremely high resolution with an extremely small range near zero and an extremely low resolution with an extremely large range near the negative and positive limits.
3D Coordinates
3D coordinates are most commonly transformed into some other coordinate system before being used in a computation. Since 3D transformations can be combined into one transformation matrix, there is little computational difference between the processing of a list of integer coordinates compared with a list of floating point coordinates.
So what happened with the disappearing trick? If the object originally happened to be centered on the origin, then any coordinates closer to the origin would have had a higher resolution than any coordinates further away which also happen to have a larger exponent. Since a 3D object is generally subjected to rotations, the effective resolution of any given point happens to be the worst case resolution of its x, y or z coordinate, i.e., the coordinate with the largest exponent. Conceptualizing the spatial regions of differing resolutions forms a series of concentric cubes centered on the origin that double in size and while halving resolution as the exponent increments in value. Does it actually matter that points nearer the origin have a higher resolution to points further away? No.
The increasing-resolution-with-closeness-to-zero characteristic of floating point coordinate numbers exceeds the needs of 3D objects since a single resolution for all of a given object’s coordinates is quite adequate.
So 3D coordinates are wasteful because the exponent information is irrelevant compared with appropriate integer values. For instance, for a C++ 'float' the coordinate data size is reduced by 27% (!) with no real loss of information.
Object data storage can be reduced and/or general point resolution increased by finding a bounding cube for any given object then mapping this cube to an m-bit integral 3D coordinate system where the cube’s opposing corners map to (0,0,0) and (M-1, M-1, M-1) where M=2^m.
Note: Variations on this theme could map to a signed integral cube from (-M/2, M/2, -M/2) to (M/2-1, M/2-1, M/2-1). A bounding rectangular prism could be used along with non-uniform scaling. The x, y and z transformation scale factors could be powers of two so as to use simple bit shifting and avoid multiplications. An object could be rotated and/or skewed to achieve an even smaller bounding rectangular prism but this is rarely worthwhile.
One particularly useful implementation stores three 21-bit x, y and z integral values and a flag in a 64-bit block. The flag can be used, for instance, for a move-to/line-to/triangle-fan-to purpose. For an object fitting in a roughly 1m or 3’ cube the resolution is 0.48 µm or 19 millionths-of-an-inch – plenty of accuracy for most applications.
Summarizing…
To summarize, any object defined using floating point coordinates can save storage space by transforming the coordinates into integer values and storing the integer coordinates along with the reverse scale/translate transformation. Floating point numbers (or in some cases just exponents) are used for the transformation values.
Pros and Cons
Many others have recognized the benefits of integral coordinates and they are commonly used. But there still are many 3D programs that use floating point coordinates, particularly in the Computer Aided Design/Manufacturing/Engineering (CAD/CAM/CAE) industry. Some would argue that all points need to be transformed into model space for computations and this is indeed how many programs work. The fixed model space typically has a 'coincident-point' tolerance which leads to tangible limits on the physical size that can be modelled. Dynamically selecting a model or 'computational' space would remove the physical size limitations, but that's another story.
One drawback of integral coordinates is how to handle valid new coordinates that are outside the existing integral range. In this case, floating point coordinates may be the best option and, perhaps, when editing an object its coordinates are temporarily ‘unpacked’ into floating point numbers. Another option is increasing the integral range by adjusting the transformation values and all of the existing integral coordinates. Alternatively the object’s coordinates could be stored as separate sets of coordinates with different exponents but this starts complicating things. During the life of most objects, their coordinates do not change and this issue is moot.
Other benefits of integral coordinates are increased bandwidth due to the reduced data size and, where relevant, increased frame rates since less data needs to be processed for the same object. Caching performance also improves. These aspects all reduce the amount of processing and, in so doing, reduce the carbon dioxide footprint somewhere up the energy food chain.
The Code
The C++ program in the appendix below reads in a text file of 3D coordinates and outputs a text file with a transformation matrix followed by the list of 21-bit integral 3D coordinates.
Conclusion
A little investigation into the characteristics of floating point numbers and a comparison with the needs of 3D coordinates has shown that, in general, 3D coordinates should be integer values. 3D programs benefit from the reduced storage, the increased resolution and the increased speed. Surely integer coordinates are something you need to consider.
Oh. And the disappearing trick… The saved mantissas happen to be unsigned integers ranging from 0 to 2^n-1. Can you figure out why?
References
- Computer Graphics: Principles and Practice, 2nd Edition, 1990, Foley et al.
- Mathematics for 3D Game Programming and Computer Graphics, 2nd Edition, 2004, Lengyel
- www.flipcode.com/archives/3D_Graphics_on_Mobile_Devices-Part_2_Fixed_Point_Math.shtml
- www.gamasutra.com/view/feature/1965/visualizing_floats.php
- www.gameprogrammer.com/4-fixed.html
- www.pathengine.com/Contents/Overview/FundamentalConcepts/
WhyIntegerCoordinates/page.php
Appendix - Integerize3D Program
#include <stdio.h>
#include <tchar.h>
#include <math.h>
#define INTEGER_BITS 21
#define MAXI ((1L<<(INTEGER_BITS))-1) #define MAX_COUNT 1024
#define LENGTHOF(a) (sizeof(a)/sizeof(a[0]))
typedef struct { float v3[3]; } TVec3;
TVec3 CoordList[MAX_COUNT]; TVec3 Min, Max;
void CheckLimits( TVec3& Min, TVec3& Max, TVec3& coord );
int main( int , char ** )
{
TCHAR line[256];
int index = 0;
while (_getts_s(line,LENGTHOF(line)))
{
int ScanCount = _stscanf_s( line, _T("%f %f %f"),
CoordList[index].v3+0,
CoordList[index].v3+1,
CoordList[index].v3+2 );
if (ScanCount == 3)
{
if (index == 0)
Min = Max = CoordList[0];
else
CheckLimits( Min, Max, CoordList[index] );
if (++index == LENGTHOF(CoordList))
break;
}
else
_ftprintf_s( stderr, _T("Non coordinate line...\n"
"\t\"%s\"\n"), line );
}
float Factor[3], Offset[3];
for (int i=0; i<3; i++)
{
Factor[i] = MAXI / (Max.v3[i] - Min.v3[i]);
Offset[i] = Min.v3[i];
}
_tprintf_s( _T("Transformation:\n"
"%12g %12g %12g\n"
"%12g %12g %12g\n"
"%12g %12g %12g\n"
"%12g %12g %12g\n\n"),
1/Factor[0], 0.0, 0.0,
0.0, 1/Factor[1], 0.0,
0.0, 0.0, 1/Factor[2],
Offset[0], Offset[1], Offset[2] );
_tprintf_s( _T("Points:\n") );
for (int i=0; i""0;"" /> coord.v3[i])
_tprintf_s( _T("%7d %7d %7d\n"),
(int) (Factor[0] * (CoordList[i].v3[0]-Offset[0])),
(int) (Factor[1] * (CoordList[i].v3[1]-Offset[1])),
(int) (Factor[2] * (CoordList[i].v3[2]-Offset[2])) );
}
void CheckLimits( TVec3& Min, TVec3& Max, TVec3& coord )
{
for (int i=0; i<3; i++)
if (Min.v3[i] > coord.v3[i])
Min.v3[i] = coord.v3[i];
else if (Max.v3[i] < coord.v3[i])
Max.v3[i] = coord.v3[i];
}
History
- March 5, 2010: Initial post
- March 5, 2010: Added paragraph in "3D Coordinates" section
- March 6, 2010: Fixed "-1.0" in third paragraph