# Squeezing precise numbers into fixed width types... or faking it

I have this app where we store numbers. Don't we all? Unfortunately, this app stores numbers without context. So, I don't know whether the number will be an integer or represent numbers throughout the real number space. Compounding this situation, I stand to gain particular advantage if I can store all these numbers in a fixed width datatype (each number consume the same number of bits of storage space). On current computer systems, there are two native types to each do some of what we want: the 64bit IEEE floating point type "double" and the 64bit integer types "signed/unisnged long long". However, I don't know which datatype to use before I actually see the number for the first time.

Let's take temperature as an example. In some accepted unit like C, F or K a possible value would be something like 37.23 or 49.05 which is a non-integer value where it would seem to make sense to use a float or a double. Round trip time in seconds is a great example of what one would think of as a perfect use for a IEEE floating point number as a float or a double. In that format, the most significant digits are more accurate, so if you are measuring the latency between two networked system (microseconds) or you are measuring the time to run an enormous data mining operation (days) you will get something accurate. Why is a type that looses accuracy okay here? If your operation takes months or years to complete an operation, you don't likely care if the operation took 2 years and 37 microseconds or 2 years and 12 microseconds. It may seem that a double would, generally speaking, be pretty good for just about anything. Not so.

In the real world, we frequently measure the number of octets (bytes) passed through switching equipment. These numbers are typically represented in a counter that ranges in integral value between 0 and 2^{64}-1. Now, if we store the number as a double, as we approach the upper end, the double starts to loose accuracy. Here's a quick sample so you can see it. I'll take X and X+100 as doubles and then subtract them to see the difference. As X gets very very large, 100 is less significant (the one-hundreds place), so we start to loose accuracy. I save you some reading by starting at 2^{48} as that's where the interesting things start.

#include#include #include int main() { int i; for(i=48;i<63;i++) { uint64_t i1 = 1; double d1; i1 <<= i; d1 = (double)(i1 + 100); printf("%g - %llu = 100, saw %llu (%g%%)\ ", d1, i1, (uint64_t)d1 - i1, fabs((double)((uint64_t)d1 - i1)-100.0)); } }

Producing the output:

2.81475e+14 - 281474976710656 = 100, saw 100 (0%) 5.6295e+14 - 562949953421312 = 100, saw 100 (0%) 1.1259e+15 - 1125899906842624 = 100, saw 100 (0%) 2.2518e+15 - 2251799813685248 = 100, saw 100 (0%) 4.5036e+15 - 4503599627370496 = 100, saw 100 (0%) 9.0072e+15 - 9007199254740992 = 100, saw 100 (0%) 1.80144e+16 - 18014398509481984 = 100, saw 100 (0%) 3.60288e+16 - 36028797018963968 = 100, saw 96 (4%) 7.20576e+16 - 72057594037927936 = 100, saw 96 (4%) 1.44115e+17 - 144115188075855872 = 100, saw 96 (4%) 2.8823e+17 - 288230376151711744 = 100, saw 128 (28%) 5.76461e+17 - 576460752303423488 = 100, saw 128 (28%) 1.15292e+18 - 1152921504606846976 = 100, saw 0 (100%) 2.30584e+18 - 2305843009213693952 = 100, saw 0 (100%) 4.61169e+18 - 4611686018427387904 = 100, saw 0 (100%)

You might ask, why the one-hundreds place is important in numbers that are this large. The answer is very simple. This measures octets through a switch, think of milliliters of water from a faucet. After you've used the faucet a long time, that number is staggering. However, if I want to know liters/second over the last 5 seconds it is very likely that the two samples I take (now and five seconds from now) will be both enormous and relatively close. And while the accuracy of difference matters, it is too insignificant to be represented in the IEEE floating point format. The interesting thing is, the source never reports anything but integral values, so if we were simply using an integral datatype we would have exactness.

So, what do you do when you don't know? Arbitrary precision math, or in SQL: numeric. Numerics are great because you always get the correct answer. Numerics are bad because computers are not very good at doing math outside of the IEEE floating point and integral spaces; in other words, it's painfully slow. Perhaps worse than being slow, the numeric datatype is a variable width type, which in this case means I've lost the battle.

Now back to my app: the inputs are either floating point numbers or integral numbers, but the only way we can tell is if they contain a decimal point when expressed in non-scientific notation; or, do they have any non-zero values in places to the right of the one's place? What we want to do is give accurate integral storage to apps that are courteous enough to provide integral numbers in the range of -2^{63} to 2^{64}-1 and use a double otherwise. This allows us to use 64bits of fixed space to store the value (though we need some extra bits to note that the 64bits represents a double vs. a signed integer vs. an unsigned integer). Once we've tackled this, we can accept numeric values from SQL stored them in fixed size (with expected loss of insignificant places for floating point and expected accuracy for integral values) and return them back as numerics.

My little project fronts against PostgreSQL, so I had to convert integers and doubles into numerics. Much to my surprise, the internal method of converting this is to convert from type1 to a string and then from a string to type2. This is a very inefficient process. It turns out that direct double to numeric conversion (as expected) is much faster. In our case it shaves off approximately 50% of the CPU cycles.

As a disclaimer, this is a very special purposed storage type and neither replaces the double, integer or numeric type. It is, instead, a hybrid type with special properties that are attuned to the problem at hand. Before you go implementing something similar, make sure you know what you are doing. Our goals were to retain accuracy for integral types, achieve a fixed-width storage format and still allow representation of numbers both large and arbitrarily smaller than 1.