GetBinRange() in sceneObject.cc
by asmaloney (Andy) · in Torque Game Engine · 01/17/2007 (9:27 pm) · 5 replies
I have a question about 'bins' in sceneObject.cc. In my profiling, I saw that Container::castRay() was quite expensive [3.4%] and that its calls getBinRange() were taking a relatively significant amount of time for what its doing. It came down to the fmods and casting from float to unsigned int.
How do the bins work for ray casting? I've rewritten the getBinRange() function to use ints [see below] and reduced
Container::castRay() to 2.0%, _however_ the resulting minBin and maxBin are not always exactly what they were in the float version due to rounding. Does this matter? Maybe I need to use ceil() on the min and max args instead of casting [and therefore flooring] the float? This would make sure that I err on the side of too many bins I suppose.
----------
How do the bins work for ray casting? I've rewritten the getBinRange() function to use ints [see below] and reduced
Container::castRay() to 2.0%, _however_ the resulting minBin and maxBin are not always exactly what they were in the float version due to rounding. Does this matter? Maybe I need to use ceil() on the min and max args instead of casting [and therefore flooring] the float? This would make sure that I err on the side of too many bins I suppose.
----------
// Utility method for bin insertion
void getBinRange(const F32 min,
const F32 max,
U32& minBin,
U32& maxBin)
{
static const U32 csmBinSizeU32 = Container::csmBinSize;
static const U32 csmTotalBinSizeU32 = Container::csmTotalBinSize;
AssertFatal(max >= min, "Error, bad range! in getBinRange");
S32 maxI = ( max );
S32 minI = ( min );
S32 minC = minI % csmTotalBinSizeU32;
if ( minC < 0 )
minC += csmTotalBinSizeU32;
AssertFatal(minC >= 0 && minC < csmTotalBinSizeU32, "Bad minCoord");
minBin = minC / csmBinSizeU32;
if ((maxI - minI) >= csmTotalBinSizeU32)
{
maxBin = minBin + (Container::csmNumBins - 1);
}
else
{
S32 maxC = maxI % csmTotalBinSizeU32;
if ( maxC < 0 )
maxC += csmTotalBinSizeU32;
AssertFatal(maxC >= 0 && maxC < csmTotalBinSizeU32, "Bad maxCoord");
maxBin = maxC / csmBinSizeU32;
if ( minI != maxI && minC >= maxC)
maxBin += Container::csmNumBins;
}
AssertFatal(maxBin >= minBin, "Error, min should always be less than max!");
}
#2
You're right - it is close! I had 478 differences out of almost 1,000,000. One little special case:
Took that it to 77 diffs [see the end of the post] out of almost 1,000,000 calls and I suspect most of those are rounding issues since they're all wrapped around or off-by-one.
But I'm still unclear how these bins work - could you give a bit of overview on what they are and how they're used? Maybe give a requirement for the function? :-) What happens if they're 'wrong'?
Changing the bin size of course changes the results - when would you do this and why is it a float?
01/18/2007 (7:24 am)
Here's a quick test I did. Using my journal again, I ran the two functions each time, comparing results, and sending differences to the console.You're right - it is close! I had 478 differences out of almost 1,000,000. One little special case:
S32 maxI = ( max );
S32 minI = ( min );
[b]
if ( maxI == 0 && minI == 0 )
{
minBin = (Container::csmNumBins - 1);
maxBin = Container::csmNumBins;
return;
}
[/b]
S32 minC = minI % csmTotalBinSizeU32;Took that it to 77 diffs [see the end of the post] out of almost 1,000,000 calls and I suspect most of those are rounding issues since they're all wrapped around or off-by-one.
But I'm still unclear how these bins work - could you give a bit of overview on what they are and how they're used? Maybe give a requirement for the function? :-) What happens if they're 'wrong'?
Changing the bin size of course changes the results - when would you do this and why is it a float?
[counter] | min max | minBin maxBin | newMinBin newMaxBin -------------------------------------------------------------------------- binCheck 110714 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 110715 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 144629 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 144630 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 156860 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 156861 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 166434 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 166435 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 177226 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 177227 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 187100 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 187101 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 196334 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 196335 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 205890 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 205891 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 215242 | -516.052307 -192.712875 | 7 12 | 7 13 binCheck 215243 | -192.712875 380.174805 | 12 21 | 13 21 binCheck 363967 | -832.643127 910.515503 | 2 17 | 3 18 binCheck 399661 | -1024.920654 776.143860 | 15 30 | 0 15 binCheck 746428 | -640.306519 1162.082642 | 5 20 | 6 21 binCheck 748839 | -640.389343 1161.999756 | 5 20 | 6 21 binCheck 750836 | 394.997314 1418.566528 | 6 22 | 6 21 binCheck 843853 | -64.916992 1771.111328 | 14 29 | 15 30 binCheck 1046833 | -448.361816 1459.447632 | 8 23 | 9 24 binCheck 1177544 | -896.864441 424.512512 | 1 16 | 2 17 binCheck 1186817 | -896.908936 421.904266 | 1 16 | 2 17 binCheck 1220982 | -896.048096 417.510498 | 1 16 | 2 17 binCheck 1254794 | -511.597778 -256.654968 | 8 11 | 8 12 binCheck 1254795 | -256.654968 -1.712153 | 11 15 | 12 15 binCheck 1628984 | -1152.533081 325.868469 | 13 28 | 14 29 binCheck 1674650 | -128.364685 258.806183 | 13 20 | 14 20 binCheck 1674652 | -128.364685 -109.760086 | 13 14 | 14 14 binCheck 1688779 | -98.595238 -64.025963 | 14 14 | 14 15 binCheck 1688780 | -64.025963 -29.456697 | 14 15 | 15 15 binCheck 1757958 | -576.719910 320.626221 | 6 21 | 7 21 binCheck 1758436 | -0.468959 436.796814 | 15 22 | 0 6 binCheck 1758438 | -0.468959 436.796814 | 15 22 | 0 6 binCheck 1775077 | -0.118295 460.742126 | 15 23 | 0 7 binCheck 1775079 | -0.118295 460.742126 | 15 23 | 0 7 binCheck 1795937 | -0.953003 398.458405 | 15 22 | 0 6 binCheck 1795939 | -0.953003 398.458405 | 15 22 | 0 6 binCheck 1803582 | -4.840283 -0.364081 | 15 15 | 15 16 binCheck 1803583 | -0.364081 12.435919 | 15 16 | 0 0 binCheck 1806025 | -0.234639 85.100548 | 15 17 | 0 1 binCheck 1806027 | -0.234639 10.915523 | 15 16 | 0 0 binCheck 1818866 | -4.180336 -0.533844 | 15 15 | 15 16 binCheck 1818867 | -0.533844 12.266157 | 15 16 | 0 0 binCheck 1819680 | -448.912201 241.875504 | 8 19 | 9 19 binCheck 1819682 | -448.912201 -376.768555 | 8 10 | 9 10 binCheck 1831039 | -13.508532 -0.708530 | 15 15 | 15 16 binCheck 1831040 | -0.708530 12.091469 | 15 16 | 0 0 binCheck 1832176 | -0.518476 98.826111 | 15 17 | 0 1 binCheck 1832178 | -0.518476 9.971036 | 15 16 | 0 0 binCheck 1833061 | -7.154755 -0.205092 | 15 15 | 15 16 binCheck 1833062 | -0.205092 12.594909 | 15 16 | 0 0 binCheck 1849652 | -0.139053 85.736076 | 15 17 | 0 1 binCheck 1849654 | -0.139053 10.001350 | 15 16 | 0 0 binCheck 1850139 | 0.286976 0.812093 | 0 0 | 15 16 binCheck 1851296 | -0.688253 67.985390 | 15 17 | 0 1 binCheck 1851298 | -0.688253 9.153960 | 15 16 | 0 0 binCheck 1851756 | -7.904757 -0.034291 | 15 15 | 15 16 binCheck 1851757 | -0.034291 12.765708 | 15 16 | 0 0 binCheck 1853038 | -8.886495 -0.688759 | 15 15 | 15 16 binCheck 1853039 | -0.688759 12.111241 | 15 16 | 0 0 binCheck 1853577 | -0.867231 75.642365 | 15 17 | 0 1 binCheck 1853579 | -0.867231 3.398440 | 15 16 | 0 0 binCheck 1861981 | -4.955020 -0.971613 | 15 15 | 15 16 binCheck 1861982 | -0.971613 11.828386 | 15 16 | 0 0 binCheck 1870529 | -3.255590 -0.105453 | 15 15 | 15 16 binCheck 1870530 | -0.105453 12.694547 | 15 16 | 0 0 binCheck 1871066 | -0.434523 70.007599 | 15 17 | 0 1 binCheck 1871068 | -0.434523 5.322051 | 15 16 | 0 0 binCheck 1881556 | -0.122970 70.608200 | 15 17 | 0 1 binCheck 1881558 | -0.122970 4.375703 | 15 16 | 0 0 binCheck 1909770 | -320.289978 251.195099 | 10 19 | 11 19 binCheck 1909782 | -320.290009 -273.448395 | 10 11 | 11 11
#3
For fast spatial queries (like finding the objects that overlap with a given box, which is a frequently done query in collision detection code, or determining what's visible, ie, overlapping with the view frustum), the scenegraph implements a binning system.
Basically, imagine a 2d grid. The number of squares in the grid and the size of each square is what those constants control. Each square contains a list of all the objects that overlap that square.
This 2d grid is repeated infinitely on the XY plane. So if I make a new object at (0,0), then move exactly one grid's-length out in any direction, and make a second object, they'll get stored in the same bin (aka square) in the bin.
It turns out that most game worlds are fairly spread out, XY-wise (since you are usually running, flying, or driving, not riding an elevator), so this system works reasonably well. But based on the object density and size of the world, you might want to tweak the grid's bin size and bin count.
Bin size is important because EVERY bin an object overlaps will contain a reference to it. So if you move a mouse through the world, you might overlap 2 bins briefly (or very rarely 4) but most of the time just one. Linking/unlinking yourself from the bin's lists will therefore be cheap, and things are goo.
But if you move an elephant through the world, it might _always_ be overlapping a 3x3 area of bins - that means best case you have to update 9 bins when it moves, and worst cast 16 bins. If your game level is a 10 story building full of roving elephants, suddenly you're spending a whole lot of time traversing lists on bins, and linking and unlinking things.
Furthermore, the average number of objects in a bin have a big effect on collision performance. Imagine you have a grid of 1x1 - just one big bin that covers the whole world. Since the bin system is used to speed up spatial queries, this means that all of a sudden you'll be checking every object in the world for overlap, every time you make any query (ray cast, convex query, build polylist, etc.). Slow!
On the other hand, if you had a single object in the world, and a very fine grid that was about the same size (say your elephant is 100m square, and you have your bins set up to be 128 on a side, with 1 meter bins), you also run into problems. Specifically, in that case, to do collision for the elephant, you'd end up checking approximately 10,000 bins - all of which contain the exact same object, which you have to identify as already-processed, and reject. The reject test is fast, but there's still an overhead for all that processing!
The other thing you'll run into is due the fact that the grid repeats. Like I said above, if you have an object at (0,0) and another at (100,100), and your grid is 100m on a side, they both get put in the same bins. (The spatial queries all check the real bounds of the objects the bins return against your query volume, in order to avoid false positives with stuff way far away.) So if you have a game world with objects spread out over a big area, like a typical MMO level might be set up, you'll get some performance loss due to looking at objects that couldn't possibly be in the query volume, but that happen to be in the right bin.
That is where the bin count comes into play. By increasing the bin count you use up more memory, but you can reduce "false potentials" coming up due to the grid repeating over space.
So that's an overview of the container bin system and how it works. There are lots of other spatial-query frameworks out there, but this one has served us well for many years. :)
In any case - the important thing about the bin range calculator is that it needs to give conservative results (every bin that the given number range overlaps). It also has to deal with the fact that the bins repeat (it doesn't have to do the modulo/wrap itself - the other functions that call it deal with that - but it can save processor time by only ever returning a range up to the bin count), and any weird conditions with objects that overlap the origin and so forth.
Phew - I hope that explains the context of this code a bit more. :) The current implementation isn't 100% bug free although it works well enough, as evidenced by the number of games that have shipped with it! So the fact you have some differences isn't bad... but I'm not sure from the data log if the differences are safe or not. It looks like it might be rounding negative numbers towards zero rather than away, which I don't believe is what is wanted... although I'm not going to swear to it, either - it could be ok!
Especially lines like the following seem worrisome to me:
Seems like it might be wrapping prematurely?
The if you added also seems odd to me, though it could be correct. It seems to say that if the range specified is in [0,1), then return the furthest-out bins? It seems like if I gave 0.1, 0.2 as my input range, then I should get a min and max of zero as the overlapping bins (in most situations). Does that seem right? This stuff is a bit tricky, so a second opinion is always good. :)
01/18/2007 (10:57 am)
Nice analysis work, sir! :)Quote:
But I'm still unclear how these bins work - could you give a bit of overview on what they are and how they're used? Maybe give a requirement for the function? :-) What happens if they're 'wrong'?
Changing the bin size of course changes the results - when would you do this and why is it a float?
For fast spatial queries (like finding the objects that overlap with a given box, which is a frequently done query in collision detection code, or determining what's visible, ie, overlapping with the view frustum), the scenegraph implements a binning system.
Basically, imagine a 2d grid. The number of squares in the grid and the size of each square is what those constants control. Each square contains a list of all the objects that overlap that square.
This 2d grid is repeated infinitely on the XY plane. So if I make a new object at (0,0), then move exactly one grid's-length out in any direction, and make a second object, they'll get stored in the same bin (aka square) in the bin.
It turns out that most game worlds are fairly spread out, XY-wise (since you are usually running, flying, or driving, not riding an elevator), so this system works reasonably well. But based on the object density and size of the world, you might want to tweak the grid's bin size and bin count.
Bin size is important because EVERY bin an object overlaps will contain a reference to it. So if you move a mouse through the world, you might overlap 2 bins briefly (or very rarely 4) but most of the time just one. Linking/unlinking yourself from the bin's lists will therefore be cheap, and things are goo.
But if you move an elephant through the world, it might _always_ be overlapping a 3x3 area of bins - that means best case you have to update 9 bins when it moves, and worst cast 16 bins. If your game level is a 10 story building full of roving elephants, suddenly you're spending a whole lot of time traversing lists on bins, and linking and unlinking things.
Furthermore, the average number of objects in a bin have a big effect on collision performance. Imagine you have a grid of 1x1 - just one big bin that covers the whole world. Since the bin system is used to speed up spatial queries, this means that all of a sudden you'll be checking every object in the world for overlap, every time you make any query (ray cast, convex query, build polylist, etc.). Slow!
On the other hand, if you had a single object in the world, and a very fine grid that was about the same size (say your elephant is 100m square, and you have your bins set up to be 128 on a side, with 1 meter bins), you also run into problems. Specifically, in that case, to do collision for the elephant, you'd end up checking approximately 10,000 bins - all of which contain the exact same object, which you have to identify as already-processed, and reject. The reject test is fast, but there's still an overhead for all that processing!
The other thing you'll run into is due the fact that the grid repeats. Like I said above, if you have an object at (0,0) and another at (100,100), and your grid is 100m on a side, they both get put in the same bins. (The spatial queries all check the real bounds of the objects the bins return against your query volume, in order to avoid false positives with stuff way far away.) So if you have a game world with objects spread out over a big area, like a typical MMO level might be set up, you'll get some performance loss due to looking at objects that couldn't possibly be in the query volume, but that happen to be in the right bin.
That is where the bin count comes into play. By increasing the bin count you use up more memory, but you can reduce "false potentials" coming up due to the grid repeating over space.
So that's an overview of the container bin system and how it works. There are lots of other spatial-query frameworks out there, but this one has served us well for many years. :)
In any case - the important thing about the bin range calculator is that it needs to give conservative results (every bin that the given number range overlaps). It also has to deal with the fact that the bins repeat (it doesn't have to do the modulo/wrap itself - the other functions that call it deal with that - but it can save processor time by only ever returning a range up to the bin count), and any weird conditions with objects that overlap the origin and so forth.
Phew - I hope that explains the context of this code a bit more. :) The current implementation isn't 100% bug free although it works well enough, as evidenced by the number of games that have shipped with it! So the fact you have some differences isn't bad... but I'm not sure from the data log if the differences are safe or not. It looks like it might be rounding negative numbers towards zero rather than away, which I don't believe is what is wanted... although I'm not going to swear to it, either - it could be ok!
Especially lines like the following seem worrisome to me:
Quote:
binCheck 1853577 | -0.867231 75.642365 | 15 17 | 0 1
binCheck 1853579 | -0.867231 3.398440 | 15 16 | 0 0
binCheck 1861981 | -4.955020 -0.971613 | 15 15 | 15 16
binCheck 1861982 | -0.971613 11.828386 | 15 16 | 0 0
binCheck 1870529 | -3.255590 -0.105453 | 15 15 | 15 16
binCheck 1870530 | -0.105453 12.694547 | 15 16 | 0 0
binCheck 1871066 | -0.434523 70.007599 | 15 17 | 0 1
binCheck 1871068 | -0.434523 5.322051 | 15 16 | 0 0
binCheck 1881556 | -0.122970 70.608200 | 15 17 | 0 1
binCheck 1881558 | -0.122970 4.375703 | 15 16 | 0 0
Seems like it might be wrapping prematurely?
The if you added also seems odd to me, though it could be correct. It seems to say that if the range specified is in [0,1), then return the furthest-out bins? It seems like if I gave 0.1, 0.2 as my input range, then I should get a min and max of zero as the overlapping bins (in most situations). Does that seem right? This stuff is a bit tricky, so a second opinion is always good. :)
#4
01/18/2007 (11:06 am)
Someone should get the above into TDN. Wonderful post.
#5
w.r.t. the if - you are correct - I'm reacting to a rounding/truncation error:
This is probably not the correct way to handle this.
[Oh, and I can't count - 1,000,000 in my post above should be 2,000,000 :-)]
01/18/2007 (12:17 pm)
@Ben: Thanks a lot for explaining that - it gives me a much better idea of what I'm dealing with here. Let me digest that and look at what I'm doing again. I need to figure out what the problem conditions are with the old solution so I don't just force them into the new one :-)w.r.t. the if - you are correct - I'm reacting to a rounding/truncation error:
binCheck 457827 | -0.433071 0.433071 | 15 16 | 0 0
This is probably not the correct way to handle this.
[Oh, and I can't count - 1,000,000 in my post above should be 2,000,000 :-)]
Associate Kyle Carter
The main thing that needs to be true about this function is that it be conservative - it should always return all the bins the given float range possibly overlaps. If it's too conservative, you take a performance hit checking bins you don't need to.
What I'd suggest for enabling us to consider a replacement with confidence is writing a simple unit test - a function that runs some different inputs and checks the outputs against "known good" values. Make sure you check negative values and corner cases - not just whole positive numbers. :P That way also if it breaks for a case it's easy to add it to the test and make sure that it doesn't happen again.
Modulo (%) tends to act wonky with negative numbers, so check that that works as you expect.
I think that what you've posted works about as well as the existing code, but it also has some of its problems (with negative numbers), and might have issues with corner cases. For instance, what if I change the bin size to 2.37? Or if I have a range that's -0.9 to 23.999 - will it work correctly?
(To be clear, I think some of the issues are OK to not fix, in exchange for speed - provided we document them. So we might say "bins must be whole numbers that are powers of 2 in size," which is totally fine, provided we say so up front and hopefully have an assert to catch if it's not. The key point is we noticed this limitation and made sure that it won't cath unawares.)