Best way to format single & double values as strings for SimpleDB?
22-08-2019
Question
Amazon's SimpleDB stores values as strings, and I need to store numeric values so that they still compare correctly, for example:
"0001" < "0002"
I think bytes, integers and decimals will be fairly straightforward, but I'm a little unsure about the best way to handle singles and doubles, since they can be very small or very large. I would appreciate any suggestions from those more clever than I!
(I'm using C#)
Solution
If you already have a way to represent sign-magnitude numbers (like the integers
that you said wouldn't be too hard), then you're already there ;-]
From "Comparing Floating Point Numbers":
The IEEE float and double formats were designed so that the numbers are “lexicographically ordered”, which – in the words of IEEE architect William Kahan means “if two floating-point numbers in the same format are ordered ( say x < y ), then they are ordered the same way when their bits are reinterpreted as Sign-Magnitude integers.”
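    // Requires: using System; using System.Globalization;

    // Reinterprets the double's 64 bits as an integer, then encodes that integer.
    public static string DoubleToSortableString(double dbl)
    {
        long interpretAsLong = BitConverter.DoubleToInt64Bits(dbl);
        return LongToSortableString(interpretAsLong);
    }

    // Inverse of DoubleToSortableString.
    public static double DoubleFromSortableString(string str)
    {
        long interpretAsLong = LongFromSortableString(str);
        return BitConverter.Int64BitsToDouble(interpretAsLong);
    }

    // Treats the bits as a sign-magnitude value, which is what a reinterpreted
    // double is. It is not meant for ordinary two's-complement longs.
    public static string LongToSortableString(long lng)
    {
        if (lng < 0)
        {
            // Sign bit set: inverting the bits makes larger magnitudes sort first.
            return "-" + (~lng).ToString("X16", CultureInfo.InvariantCulture);
        }
        return "0" + lng.ToString("X16", CultureInfo.InvariantCulture);
    }

    // Inverse of LongToSortableString: skip the sign character, parse 16 hex digits.
    public static long LongFromSortableString(string str)
    {
        long parsed = long.Parse(str.Substring(1, 16), NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        return str.StartsWith("-", StringComparison.Ordinal) ? ~parsed : parsed;
    }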
Sample output:
    -0010000000000000 => -1.79769313486232E+308
    -3F0795FFFFFFFFFF => -100000
    -3F3C77FFFFFFFFFF => -10000
    -3F70BFFFFFFFFFFF => -1000
    -3FA6FFFFFFFFFFFF => -100
    -3FDBFFFFFFFFFFFF => -10
    -400FFFFFFFFFFFFF => -1
    00000000000000000 => 0
    03FF0000000000000 => 1
    04024000000000000 => 10
    04059000000000000 => 100
    0408F400000000000 => 1000
    040C3880000000000 => 10000
    040F86A0000000000 => 100000
    07FEFFFFFFFFFFFFF => 1.79769313486232E+308
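The encoding only works if the strings are compared character by character, because '-' has to sort before '0'. The sketch below re-declares compact versions of the two encoding helpers so that it compiles on its own, then checks that ordinal string order matches numeric order for a few sample values. The class name `SortableDoubleDemo`, the method names and the sample values are illustrative assumptions, and it assumes SimpleDB's lexicographic comparison behaves like an ordinal comparison for these ASCII characters.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class SortableDoubleDemo
{
    // Same scheme as the answer: '-' + inverted bits for negatives, '0' + bits otherwise.
    public static string Encode(double value)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        return bits < 0
            ? "-" + (~bits).ToString("X16", CultureInfo.InvariantCulture)
            : "0" + bits.ToString("X16", CultureInfo.InvariantCulture);
    }

    // Reverses Encode: parse the 16 hex digits, undo the inversion for negatives.
    public static double Decode(string text)
    {
        long bits = long.Parse(text.Substring(1, 16), NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        return BitConverter.Int64BitsToDouble(text[0] == '-' ? ~bits : bits);
    }

    // True when sorting the encoded strings ordinally gives the same order as sorting the numbers.
    public static bool OrderIsPreserved(double[] values)
    {
        double[] byNumber = values.OrderBy(v => v).ToArray();
        double[] byString = values
            .Select(Encode)
            .OrderBy(s => s, StringComparer.Ordinal)
            .Select(Decode)
            .ToArray();
        return byNumber.SequenceEqual(byString);
    }

    public static void Main()
    {
        double[] samples = { 1000, -1, 0, 1e-300, -1e300, 0.5, -0.5, double.MaxValue, double.MinValue };
        Console.WriteLine(OrderIsPreserved(samples));
        Console.WriteLine(Encode(-1));
    }
}
```

If you sort these strings on the client side in .NET, pass `StringComparer.Ordinal` explicitly, since the default culture-sensitive comparison may give the '-' character special treatment.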
OTHER TIPS
One option (if you don't require that they be human-readable) would be to store the exponent first (zero-filled), then the mantissa. Something like "(07:4.5)" for what would normally be written 4.5e7.
*smile*
Are you going to be dealing with signed values, or with positive floats less than 1? If so, you'll need offsets as well: one on the exponent and one on the mantissa, plus a change of brackets to mark the sign (e.g. [] for positive, () for negative).
If you want to be able to sort integers in with your singles, etc., you should probably just normalize everything to the largest type (e.g. your doubles) on the way in rather than trying to get too tricky.
Thus:
- 7 --> [100,17.0]
- 0.1 --> [099,11.0]
- -2 --> (100,08.0)
and so on.
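The three examples above can be reproduced with a short sketch. It assumes an exponent offset of 100, a mantissa offset of 10 and one decimal place in the mantissa, all read off the examples rather than stated in the answer. The class name `ExponentFirstDemo` and its `Encode` method are hypothetical. Zero and exponents outside the offset's range are rejected rather than handled. As far as I can tell, negative values with different exponents (say -2 and -20) would sort in the wrong order under this layout, so a fuller version would probably need to invert the exponent for negatives.

```csharp
using System;
using System.Globalization;

public static class ExponentFirstDemo
{
    // Hypothetical helper: bracket + (exponent + 100) + "," + (mantissa + 10) + bracket.
    public static string Encode(double value)
    {
        if (value == 0 || double.IsNaN(value) || double.IsInfinity(value))
            throw new ArgumentException("This sketch only covers finite, non-zero values.");

        // "E1" yields e.g. "7.0E+000" or "-2.0E+000": one mantissa decimal, signed exponent.
        string[] parts = value.ToString("E1", CultureInfo.InvariantCulture).Split('E');
        decimal mantissa = decimal.Parse(parts[0], CultureInfo.InvariantCulture);
        int exponent = int.Parse(parts[1], CultureInfo.InvariantCulture);

        // The assumed offset of 100 only keeps exponents from -100 to 899 within three digits.
        if (exponent < -100 || exponent > 899)
            throw new ArgumentOutOfRangeException(nameof(value), "Exponent does not fit the assumed offset.");

        string body = (exponent + 100).ToString("000", CultureInfo.InvariantCulture)
            + "," + (mantissa + 10m).ToString("00.0", CultureInfo.InvariantCulture);

        // '(' sorts before '[' in ASCII, so negatives come before positives.
        return value < 0 ? "(" + body + ")" : "[" + body + "]";
    }

    public static void Main()
    {
        Console.WriteLine(Encode(7));    // [100,17.0]
        Console.WriteLine(Encode(0.1));  // [099,11.0]
        Console.WriteLine(Encode(-2));   // (100,08.0)
    }
}
```

Note that doubles can have decimal exponents as low as roughly -324, so storing the full range this way would need a larger exponent offset than the 100 used in the examples.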