Game Development Community

Floating point to integer conversion

by Brad Schick · 05/19/2001 (9:32 am) · 2 comments

/*******************************************************************************
 During performance testing I noticed that almost 8% of my game's time was spent
 in the _ftol function converting floating point numbers to integers. While my
 game does an extreme amount of such conversions, this may affect other games
 to a lesser degree. The inline ASM functions below are designed to speed up
 various types of floating point to integral conversions.

 When a floating point number is cast to an integer, ANSI C requires that the
 fractional part be discarded (truncation toward zero). In MS VC this is done
 by the _ftol library function.
 In order to truncate, the _ftol function is forced to change the Intel FPU
 from the default rounding mode of "round to nearest" (RC field of 00B) to
 the "round toward zero" mode (RC field of 11B) then back again. This FPU
 mode switching is very slow.

 The helper functions below use rounding conversions for specific purposes
 instead of truncating conversions. This is much faster since the FPU's mode
 does not need to be changed. Since these functions are inlined, the
 conversions are faster still because they also avoid the call overhead of
 the _ftol library function.

 IMPORTANT NOTES:
 
 1) These functions do not have the same behavior as standard arithmetic
 rounding. 'fist' uses the default FPU rounding mode, which rounds to the
 nearest integer and breaks exact ties toward the even integer. So 0.5
 rounds to 0.0 while 1.5 rounds to 2.0. This may not always be desirable!
 The FPU could be set to "round up" mode and back, but that would kill
 performance. If you need halves to round up, just add 0.5 and cast instead
 (for non-negative values).

 2) Even if "round to the nearest" is OK, rounding in general is not
 always an acceptable float to integer conversion. Know your code before
 using these functions.

 3) This code has been compiled and tested using VC6. The ASSERT calls
 are placeholders and will not compile by default.
 
*******************************************************************************/
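The mode switch described above can be imitated with standard library calls. This is only a rough sketch, not the actual _ftol implementation: the helper name TruncateBySwitchingMode is made up for illustration, and std::fesetround and std::lrint come from C99/C++11, so they are not available in VC6. Some compilers may also need extra flags or pragmas before they fully respect rounding mode changes.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper, not part of the original code. It mimics the
// switch-convert-restore sequence described above using standard calls.
inline long TruncateBySwitchingMode(double v)
{
    const int saved = std::fegetround();            // remember the current mode
    const int rc = std::fesetround(FE_TOWARDZERO);  // select "round toward zero"
    assert(rc == 0);                                // 0 means the mode was accepted
    (void)rc;

    // The volatile accesses discourage the compiler from folding the
    // conversion or moving it across the mode changes.
    volatile double input = v;
    volatile long result = std::lrint(input);       // honors the active mode

    std::fesetround(saved);                         // restore the previous mode
    return result;
}
```

With this helper, 1.9 converts to 1 and -1.9 converts to -1, the same results a plain cast gives, at the cost of two rounding mode changes per conversion.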

/*******************************************************************************
 Rounding from a float to the nearest integer can be done several ways.
 Calling the ANSI C floor() routine then casting to an int is very slow.
 Manually adding 0.5 then casting to an int is also somewhat slow because
 truncation of the float is slow on Intel FPUs. The fastest choice is to
 use the FPU 'fistp' instruction which does the round and conversion
 in one instruction (not sure how many clocks). This function is almost
 10x faster than adding and casting.

 Caller is expected to range check 'v' before attempting to round.
 Valid range is INT_MIN to INT_MAX inclusive.
*******************************************************************************/
__forceinline int Round( double v )
{
    ASSERT( v >= INT_MIN && v <= INT_MAX );
    int result;
    __asm
    {
        fld    v      ; Push 'v' into st(0) of FPU stack
        fistp  result ; Convert and store st(0) to integer and pop
    }

    return result;
}
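For comparison, the conversion styles discussed in the comment above Round can be sketched in portable C++. The function names below are hypothetical and do not appear in the original code. std::lrint (C99/C++11, so not available in VC6) rounds using the current rounding mode, which under the default mode gives the same ties-to-even results as 'fistp'. Whether it is as fast as the inline ASM depends on the compiler and is not claimed here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper names; none of these appear in the original code.

// Truncating conversion: what a plain C cast does (rounds toward zero).
inline int TruncateCast(double v) { return static_cast<int>(v); }

// "Arithmetic" rounding for non-negative input: add 0.5, then truncate.
inline int RoundHalfUp(double v)
{
    assert(v >= 0.0);  // adding 0.5 and truncating is wrong for negatives
    return static_cast<int>(v + 0.5);
}

// Round to nearest using the current rounding mode. Under the default
// mode, exact ties go to the even integer.
inline int RoundHalfEven(double v) { return static_cast<int>(std::lrint(v)); }
```

The difference shows up only on exact halves: RoundHalfUp(2.5) gives 3, while RoundHalfEven(2.5) gives 2 and RoundHalfEven(1.5) also gives 2.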

/*******************************************************************************
  Same behavior as Round, except that PRound returns an unsigned value
  (and checks 'v' for being positive in debug mode). This method can
  be used for better type safety if 'v' is known to be positive.

 Caller is expected to range check 'v' before attempting to round.
 Valid range is 0 to UINT_MAX inclusive.
*******************************************************************************/
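__forceinline unsigned PRound( double v )
{
    ASSERT( v >= 0 && v <= UINT_MAX );
    // A 32-bit 'fistp' stores a signed integer, so values above INT_MAX
    // would overflow. Store 64 bits and narrow the result instead.
    __int64 wide;
    __asm
    {
        fld    v              ; Push 'v' into st(0) of FPU stack
        fistp  qword ptr wide ; Convert and store st(0) as a 64-bit integer and pop
    }

    return (unsigned)wide;
}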
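A 32-bit 'fistp' store produces a signed integer, so covering the full 0 to UINT_MAX range needs a wider intermediate value. A minimal portable sketch of that idea follows. The name PRoundPortable is hypothetical, and std::llrint is a C99/C++11 function that VC6 does not provide.

```cpp
#include <cassert>
#include <climits>
#include <cmath>

// Hypothetical portable counterpart to PRound; not from the original code.
// Rounds through a 64-bit integer so that every value up to UINT_MAX fits,
// then narrows. Uses the current rounding mode (ties to even by default).
inline unsigned PRoundPortable(double v)
{
    assert(v >= 0.0 && v <= static_cast<double>(UINT_MAX));
    const long long wide = std::llrint(v);
    return static_cast<unsigned>(wide);
}
```

For example, PRoundPortable(3000000000.0) returns 3000000000, a value that does not fit in a signed 32-bit integer.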

// Ignore warning about not returning a value
#pragma warning( disable: 4035 )

/*******************************************************************************
  To check if a double is actually an integral value you could
  cast the double to an int (which uses the slow ANSI C _ftol function),
  then subtract the int from the original double and test for zero. This
  is fairly slow. The code below produces the same result but is faster
  because a rounding float to int conversion is used. I'm actually not
  sure if the code below is optimal but I profiled several variations and
  this was the fastest...
  
  Returns true if 'v' is a valid integer in the range of INT_MIN to INT_MAX
    inclusive and fills 'i' with the integer
  Returns false if 'v' is not an integer.

  There is no need to range check 'v' before calling IsInteger
*******************************************************************************/
__forceinline bool IsInteger( double v, int *i )
{
    if( v < (double)INT_MIN || v > (double)INT_MAX )
        return false;
    
    // Using a local int to store conversions then reloading
    // it is faster than doing multiple conversions.
    int local;
    __asm
    {
        fld     v        ; Push 'v' into st(0) of FPU stack
        fist    local    ; Convert and store st(0) to integer
        fild    local    ; Push integer to st(0)
        fcompp           ; Compare st(0) and st(1) then pop twice
        fnstsw  ax       ; Copy the FPU status word to AX (condition codes land in AH)
        test    ah,40h   ; Test the C3 bit (40h), set when st(0) == st(1)
        je      SetF     ; Jump to SetF if C3 is clear (values differ)
        mov     edx,i    ; Move local to *i
        mov     eax,local
        mov     [edx],eax
        mov     al,1     ; Set return value to true 
        jmp     Bye      ; Jump to exit
    SetF:
        xor     al,al    ; Set return value to false
    Bye:
    }
}
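The round-then-compare idea behind IsInteger can also be written without inline ASM. The sketch below is an assumption-laden illustration, not a drop-in replacement: IsIntegerPortable is a hypothetical name, std::lrint requires C99/C++11, and the explicit NaN handling is a choice made for this sketch.

```cpp
#include <cassert>
#include <climits>
#include <cmath>

// Hypothetical portable counterpart to IsInteger; not from the original code.
inline bool IsIntegerPortable(double v, int* i)
{
    assert(i != nullptr);

    // Written so that NaN fails the test: any comparison with NaN is false.
    if (!(v >= static_cast<double>(INT_MIN) && v <= static_cast<double>(INT_MAX)))
        return false;

    // Round with the current rounding mode, then compare with the original.
    const long rounded = std::lrint(v);
    if (static_cast<double>(rounded) != v)
        return false;  // 'v' had a fractional part

    *i = static_cast<int>(rounded);
    return true;
}
```

As in the original, the output parameter is written only when the function returns true.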

/*******************************************************************************
  Identical to IsInteger but checks 'v' for a more restrictive range.
  Unlike PRound this method can be called with a negative value.
  
  Returns true if 'v' is an integral value in the range of 0 to USHRT_MAX
    inclusive and fills 'i' with the integer
  Returns false if 'v' is not an index.

  There is no need to range check 'v' before calling IsIndex
*******************************************************************************/
__forceinline bool IsIndex( double v, int *i )
{
    // Change the max value to suit your needs.
    if( v < 0.0 || v > (double)USHRT_MAX )
        return false;

    // Using a local int to store conversions then reloading
    // it is faster than doing multiple conversions.
    int local;
    __asm
    {
        fld     v        ; Push 'v' into st(0) of FPU stack
        fist    local    ; Convert and store st(0) to integer
        fild    local    ; Push integer to st(0)
        fcompp           ; Compare st(0) and st(1) then pop twice
        fnstsw  ax       ; Copy the FPU status word to AX (condition codes land in AH)
        test    ah,40h   ; Test the C3 bit (40h), set when st(0) == st(1)
        je      SetF     ; Jump to SetF if C3 is clear (values differ)
        mov     edx,i    ; Move local to *i
        mov     eax,local
        mov     [edx],eax
        mov     al,1     ; Set return value to true 
        jmp     Bye      ; Jump to exit
    SetF:
        xor     al,al    ; Set return value to false
    Bye:
    }
}
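Since the upper bound in IsIndex is meant to be adjusted, one possible variation passes it in as a parameter. This is a hedged sketch only: IsIndexUpTo and its maxIndex parameter are hypothetical, and the body uses the C99/C++11 std::lrint rather than the inline ASM above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical variant of IsIndex with the upper bound passed in;
// not from the original code.
inline bool IsIndexUpTo(double v, int maxIndex, int* i)
{
    assert(i != nullptr && maxIndex >= 0);

    if (!(v >= 0.0 && v <= static_cast<double>(maxIndex)))
        return false;                    // out of range, or NaN

    const long rounded = std::lrint(v);  // rounding conversion, no truncation
    if (static_cast<double>(rounded) != v)
        return false;                    // 'v' had a fractional part

    *i = static_cast<int>(rounded);
    return true;
}
```

Calling IsIndexUpTo(v, USHRT_MAX, &i) would cover the same range as the original function.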

/*******************************************************************************
  Used to avoid errors from precision limitations when converting a double to
  a boolean. Takes the absolute value of the argument then compares the
  result to a small double that is the maximum allowed value that is considered
  false. This technique is about 8 times faster than using the C runtime
  fabs() function because like _ftol fabs() changes FPU control flags.
  
  Returns false if 'v' is within or at +- 1.0e-10 of zero
  Returns true if 'v' is outside +- 1.0e-10 of zero

  There is no need to range check 'v' before calling ToBool
*******************************************************************************/
const double g_MaxBool = 1.0e-10;
__forceinline bool ToBool( double v )
{
    __asm
    {
        fld     v        ; Push 'v' into st(0) of FPU stack
        fabs             ; Drop the sign of st(0)
        fcomp   g_MaxBool; Compare st(0) to g_MaxBool and pop
        fnstsw  ax       ; Copy the FPU status word to AX (condition codes land in AH)
        test    ah,41h   ; Test C3 (40h) and C0 (01h); either set means st(0) <= g_MaxBool
        je      SetT     ; Jump to SetT if both are clear (st(0) > g_MaxBool)
        xor     al,al    ; Set return value to false (since 'v' is 0 or very small)
        jmp     Bye      ; Jump to exit
    SetT:
        mov     al,1     ; Set return value to true 
    Bye:
    }
}

#pragma warning( default: 4035 )
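The threshold test in ToBool reads more easily when written in plain C++. The sketch below is for illustration only: ToBoolPortable and kMaxFalse are hypothetical names, with kMaxFalse standing in for g_MaxBool. No claim is made about how its speed compares with the inline ASM version.

```cpp
#include <cmath>

// Hypothetical portable counterpart to ToBool; not from the original code.
// kMaxFalse plays the role of g_MaxBool: the largest magnitude treated as false.
const double kMaxFalse = 1.0e-10;

inline bool ToBoolPortable(double v)
{
    // True only when |v| is strictly greater than the threshold.
    // A NaN input compares false here, so it maps to false.
    return std::fabs(v) > kMaxFalse;
}
```

So ToBoolPortable(0.0) and ToBoolPortable(1.0e-10) are both false, while ToBoolPortable(-0.001) is true.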

#1
07/07/2001 (12:34 pm)
Hmm, excuse me, every half-decent compiler does that just as well, and otherwise, it's common knowledge anyway. Besides, having to switch data types is not at all advised.

Otherwise, neat functions...
#2
07/07/2001 (2:48 pm)
Perhaps you could share your "common knowledge" and describe how to take advantage of these optimizations in pure C/C++ code?

No matter how good (or bad) your compiler is, the Intel FPU's default rounding mode does not match ANSI C. This is the root performance issue, not the compiler's optimizing abilities. As I mentioned in the comments, the C standard requires truncating conversions from float to integer. And since the Intel FPU normally operates in round-to-nearest mode, the FPU control flags must be changed during a conversion.

The code I wrote allows a developer to say: "Forget the ANSI C standard, I can live with rounding conversions because they are much faster on Intel."

It is true that there could be a C library (or compiler specific) function for rounding conversions, but I am not aware of any such functions in either the VC6 or ANSI C libraries. If you know something otherwise please share it. I would gladly nuke the ASM code in my game because it is harder to understand and maintain than C code.

Also, while eliminating such conversions is a good strategy it is not always possible. For example, try implementing a typeless scripting language without any such conversions (and without a custom math package).

-Brad