Game Development Community

Powers of 2 Tutorial? Rant?

by David Stewart Zink · in Torque Game Engine · 02/11/2002 (9:12 am) · 10 replies

First, the code--

From platform/types.h:
inline bool isPow2(const U32 in_num)
{
   return (in_num == getNextPow2(in_num));
}
Assuming "isPow2" returns true if in_num is a power of two, then getNextPow2 must return the smallest power of two at least as large as its input.

From core/nTypes.cc:
U32 getNextPow2(U32 io_num)
{
   S32 oneCount   = 0;
   S32 shiftCount = -1;
   while (io_num) {
      if(io_num & 1)
         oneCount++;
      shiftCount++;
      io_num >>= 1;
   }
   if(oneCount > 1)
      shiftCount++;

   return U32(1 << shiftCount);
}

//--------------------------------------
U32 getBinLog2(U32 io_num)
{
   AssertFatal(io_num != 0 && isPow2(io_num) == true,
               "Error, this only works on powers of 2 > 0");

   S32 shiftCount = 0;
   while (io_num) {
      shiftCount++;
      io_num >>= 1;
   }

   return U32(shiftCount - 1);
}
These two functions should be inline. After being rewritten, of course.

Typical usage, core/bitStream.h:
inline U32 BitStream::readRangedU32(U32 rangeStart, U32 rangeEnd)
{
   AssertFatal(rangeEnd >= rangeStart, "error, end of range less than start");

   U32 rangeSize = rangeEnd - rangeStart + 1;
   U32 rangeBits = getBinLog2(getNextPow2(rangeSize));

   U32 val = U32(readInt(S32(rangeBits)));
   return val + rangeStart;
}
Note that the usage here "getBinLog2(getNextPow2(X))" is dreadfully inefficient; what is actually needed is "getNextBinLog2(X)" or somesuch, which would just return shift-count from getNextPow2.

So compare the code generated for isPow2(in_num) above with the canonical code, visible in millions of texts on the topic:
inline bool isPow2(U32 in_num)
{
   return (in_num & (in_num - 1)) == 0;
}
Much simpler, eh? Of course, it does return "true" for isPow2(0), but then the old version returns "0 == (1 << -1)" which is the same thing on many architectures.

For the other functions there can be more debate as to appropriate code, but try this:
inline U32 getBinLog2(U32 io_num)
{
   register U32 t_io_num = io_num;
   register U32 shiftCount = 0;
   while (t_io_num) { // There is an asm for this, if needed.
      shiftCount++;
      t_io_num >>= 1;
   }

   return isPow2(io_num) ? shiftCount - 1 : shiftCount;
}

inline U32 getNextPow2(U32 io_num)
{
	return isPow2(io_num) ? io_num : (U32(1) << getBinLog2(io_num));
}

inline U32 BitStream::readRangedU32(U32 rangeStart, U32 rangeEnd)
{
   AssertFatal(rangeEnd >= rangeStart, "error, end of range less than start");

   U32 rangeSize = rangeEnd - rangeStart + 1;
   U32 rangeBits = getBinLog2(rangeSize);

   U32 val = U32(readInt(S32(rangeBits)));
   return val + rangeStart;
}
and presto the networking code becomes that little bit faster. I would guess that the getBinLog2(getNextLog2(X)) is called tens of thousands of times a second when in full networking mode. So this is probably a win.

#1
02/11/2002 (8:09 pm)
Geez, nice work :)
You're really diving through the bowles of all this stuff. I'm impressed.
#2
02/11/2002 (9:43 pm)
Make sure you check witk tim/rick and maybe get that checked into the main code. They'd know how useful it is, and whether it'd cause any issues.

d
#3
02/20/2003 (12:27 pm)
hey! it had a birthday! Not in head, though.
#4
02/21/2003 (12:02 am)
hrm, i'm going to try this when I get to that stage in my game and see what difference it makes. It sure looks like it makes sense. I wonder if Tim/Rick saw this?
#5
12/17/2003 (4:33 pm)
Ok, I came up with an even faster getBinLog2!

I'm definitely going to merge these changes into TNL - and then later into Torque

U32 getBinLog2(U32 number)
{
    U32 result = 0;
    if(number & 0xFFFF0000)
    {
        number >>= 16;
        result += 16;
    }
    if(number & 0xFF00)
    {
        number >>= 8;
        result += 8;
    }
    if(number & 0xF0)
    {
        number >>= 4;
        result += 4;
    }
    if(number & 0xC)
    {
        number >>= 2;
        result += 2;
    }
    if(number & 2)
    {
        number >>= 1;
        result += 1;
    }
    return result;
}
#6
12/17/2003 (5:37 pm)
I've only had 2 hours sleep (RotK), so I may be silly, but you can probably drop the last "number >>= 1;".

If you do getBinLog2() that way, you should code getNextPow2 to use it. Perhaps:
U32 getNextPow2(U32 number)
{
    U32 result = ((U32)1) << getBinLog2(number);
    return (result == number) ? result : (result << 1);
}
#7
12/17/2003 (5:54 pm)
Good call!

here's how I ended up doing the implementation of the functions:

U32 getBinLog2(U32 number)
{
   U32 result = 0;
   if(number & 0xFFFF0000)
   {
      number >>= 16;
      result += 16;
   }
   if(number & 0xFF00)
   {
      number >>= 8;
      result += 8;
   }
   if(number & 0xF0)
   {
      number >>= 4;
      result += 4;
   }
   if(number & 0xC)
   {
      number >>= 2;
      result += 2;
   }
   if(number & 2)
      result ++;
   return result;
}           

U32 getNextBinLog2(U32 number)
{
   return getBinLog2(number) + (isPow2(number) ? 0 : 1);
}

U32 getNextPow2(U32 number)
{
   return isPow2(number) ? number : (1 << (getBinLog2(number) + 1));
}
#8
12/18/2003 (6:41 am)
Why write the function "getNextBinLog2"? I dont see it being used anywhere.
#9
12/18/2003 (7:53 am)
This is a REAL Bunch of geeks!

this is from someone else in the thread listed above.

supposedly only takes 6 cycles!

inline uint32 getNextPowerOfTwo(uint32 value)
#pragma warning (disable:4035) 
// don't whine about lacking return value (as it's returned in eax)
{
    __asm
    {
        mov ecx,-1
        mov eax,value
        dec eax
        bsr ecx,eax
        mov eax,1
        inc ecx
        shl eax,cl              
    }
}
#10
12/18/2003 (10:27 am)
Hmm...

nice to have that Bit Scan Reverse instruction (BSR) - not a very platform independent way to go though...

maybe do the float to int, normalize exponent trick (that was a good flipCode thread!):
inline U32 getBinLog2(U32 value)
{
   F32 floatValue = F32(value);
   return (*((U32 *) &floatValue) >> 23) - 127);
}

inline U32 getNextPow2(U32 value)
{
   return isPow2(value) ? value : (1 << (getBinLog2(value) + 1));
}