Game Development Community

Simple double-precision add problem

by Valador, Inc. · in Torque Game Engine · 01/20/2009 (10:29 am) · 3 replies

Hi there.


I've been trying to change my copy of the engine to support double-precision floating-point values in script. So far, so good. Then this issue came up.


Please note the below code:

const char *CodeBlock::exec(U32 ip, const char *functionName, Namespace *thisNamespace, U32 argc, const char **argv, bool noCalls, StringTableEntry packageName, S32 setFrame)
{
   ...
      case OP_ADD:
         floatStack[FLT-1] = floatStack[FLT] + floatStack[FLT-1];
         FLT--;
         break;
   ...
}


The simple add above is somehow not being processed properly.

If

floatStack[FLT]     32343.235253242525    double
floatStack[FLT-1]   4.5300000000000002    double

then

floatStack[FLT-1], after the add operation above, should be equal to 32347.765253242525. But instead, it's equal to 32347.765625.

Somehow, a significant loss of precision/error has occurred here. And I'm not sure why. After all, if I create a new executable, with just this code:

int main()
{
   double rhs = 32343.235253242525;
   double lhs = 4.53;
   double result = lhs + rhs;
   return 0;
}

lhs      4.5300000000000002    double
rhs      32343.235253242525    double
result   32347.765253242524    double

The above result is exactly what I'd normally expect.

I do know that floatStack[FLT] and floatStack[FLT-1], by default, contain single-precision values cast to double-precision. Nevertheless, this is a simple double-precision add. And I'm only seeing this strange double-precision add behavior in the TGEA 1.8 engine code, and nowhere else.


I was thinking this had something to do with intrinsic functions being enabled. On or off (/Oi), it makes no difference. /fp:strict or /fp:precise makes no difference either. I'm still thinking there's a simple answer to this problem that I'm overlooking. I just don't know what it is.


Please help :(

Thanks

Franklin

#1
01/20/2009 (10:48 am)
huh. are you compiling with optimization ?
try making an intermediate variable and checking the values ?
eg
double tmp = floatStack[FLT] + floatStack[FLT-1];
floatStack[FLT-1] = tmp;

PS,
when posting source code,
you can keep nice code-formatting by using [ code ] and [ /code ] (but without spaces).
#2
01/21/2009 (7:48 am)

Optimization = Disabled (/Od)
Enable Intrinsic functions = No (by default, this was originally on. I switched it off. But still, no effect).
Favor Size or Speed = Neither
Whole Program Optimization = No

Floating Point Model = /fp:precise (I also tried /fp:strict) (Haven't tried /fp:fast yet).

I tried

double tmp=floatStack[FLT] + floatStack[FLT-1];
floatStack[FLT-1]=tmp;

way back, when I first started trying to figure this problem out.

As expected, the results were identical to what I described previously:

The actual adding is correct: when I highlight the
floatStack[FLT] + floatStack[FLT-1]
portion of the code (not the assignment part), the debugger gives me the correct value (32347.765253242524). But when that value is assigned to tmp (or floatStack[FLT-1]), via initialization (construction) or assignment, the resulting double value in tmp is 32347.765625.

As far as I can see, I turned off all optimization. But in case I overlooked something: what specifically do I need to turn off (or switch to) in the project properties to completely disable optimization, at least as far as floating-point math is concerned? I'm figuring that with it disabled (no shortcuts), no screwy double-precision assignment problems will occur.

Please tell me about any other tricks, or ideas: specifically, tricks to take the result of a legitimate double-precision calculation and effectively assign to a double (without actually assigning it). I know there are plenty of tricks with integers, such as with bit-shifting, ORing, ANDing, and so on.

I've been reduced to converting the values into buffers by hand: visiting each digit (as accurately as I can with a double variable) and inserting it into a static char array, then performing a BCD add on each character into another buffer, then converting the result back into a double. (sprintf screws things up in the same way described above, so I can't rely on it.)

This amount of effort seems utterly ridiculous. Perhaps I'm trying to solve this wacky double-precision add/assignment problem in the wrong way?
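
One thing the project settings listed above would not rule out is runtime floating-point state rather than compiler options. On x87 targets, the FPU's precision-control field can be lowered so that every result is rounded to a 24-bit significand, even when the operands are doubles. Whether that is happening here is an assumption; nothing in the thread confirms it. A minimal probe, with hypothetical helper names, could be called at several points in the engine to see where the answer changes:

```cpp
#include <cstdio>

// Returns true if a double-precision add performed right now keeps bits that
// a single-precision (24-bit significand) result could not hold.
// The volatile qualifiers stop the compiler from folding the add at compile
// time, so the check reflects the floating-point state at the call site.
bool doubleAddKeepsPrecision()
{
    volatile double one  = 1.0;
    volatile double tiny = 1.0 / 1073741824.0; // 2^-30, exactly representable
    volatile double sum  = one + tiny;         // exact in double; rounds to 1.0 at 24-bit precision
    return sum != one;
}

// Hypothetical helper: prints the probe result with a label for the call site.
void reportPrecision(const char *where)
{
    std::printf("%s: double add %s\n", where,
                doubleAddKeepsPrecision() ? "keeps double precision"
                                          : "is rounded to single precision");
}
```

Calling reportPrecision() at the top of main, and again just before the OP_ADD case, would show whether something in between changes how doubles are rounded. If both calls report double precision, this line of investigation can be dropped.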


#3
01/21/2009 (10:40 am)
out of curiosity,
is the amount of error you're seeing when assigning the results of the double add the same as if you assign the add to a single-precision float ?

maybe try this
double tmp = (double)(floatStack[FLT]) + (double)(floatStack[FLT-1]);

or something like
double tmp1 = (double)(floatStack[FLT  ]);
double tmp2 = (double)(floatStack[FLT-1]);
double tmp  = tmp1 + tmp2;

.. not really proposing these as solutions, but more as investigation to try to find the root.
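
The question above about single-precision error can be checked directly. This is a small sketch (the function names are made up for illustration) that adds in double precision and then shows what the result becomes if it is rounded through a float:

```cpp
#include <cstdio>

// Rounds a double to the nearest single-precision value and widens it back.
double roundThroughFloat(double value)
{
    return static_cast<double>(static_cast<float>(value));
}

// Adds in double precision, then prints both the double result and the
// value it would have after being stored through a float.
void compareAdd(double lhs, double rhs)
{
    const double sum = lhs + rhs;
    std::printf("double sum:           %.17g\n", sum);
    std::printf("sum rounded to float: %.17g\n", roundThroughFloat(sum));
}
```

For the values in the first post, roundThroughFloat(32343.235253242525 + 4.53) comes out as exactly 32347.765625, the same number the debugger shows after the assignment. Floats near 32347 are spaced 2^-9 = 0.001953125 apart, and 32347.765625 is the nearest one to the true sum. That suggests the result is being rounded to single precision somewhere, though it does not say where.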