Performance issue
by Chris Labombard (TGEA License) · in Torque Game Engine Advanced · 04/28/2006 (1:39 pm) · 22 replies
I added AIGuard to my TSE project and when I add a guard to the mission I get a massive performance hit.
The guard and my player both use the spaceOrc.
I've used the profiler to narrow down what's causing the issue, and it is
void Player::updateWorkingCollisionSet()
specifically, this line:
mConvex.updateWorkingList(mWorkingQueryBox,
isGhost() ? sClientCollisionContactMask : sServerCollisionContactMask);
This is what the profiler says when I start a profiler block around just that one line of code.
Quote:
Ordered by non-sub total time -
%NSTime % Time Invoke # Name
120.412 120.820 218 Player_updateworkingCol
Can someone please suggest how I might fix this problem?
#2
04/28/2006 (2:07 pm)
I can understand having to do that for an MMO, but this is just one additional AI player.
#3
04/29/2006 (12:01 pm)
The standard collision is CPU consuming, but it shouldn't be enough to cause noticeable performance losses with only 2 players in a mission.
How is your mission? Is your terrain Atlas or standard? Do you have collidable static shapes in the mission?
I suggest you go further down into updateWorkingList() and add more profiler blocks along the way. It's easy to do, but just make sure a PROFILE_END() call is made for every PROFILE_START() before leaving a function. That way you can narrow down further what is causing the bottleneck, since updateWorkingList does a whole bunch of things.
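One way to make sure every opened block is closed on every exit path is a small scope guard. The following is a minimal sketch, not engine code: `profileStart`, `profileEnd`, `gOpenBlocks`, `ScopedProfile` and `countHits` are hypothetical stand-ins so that the example compiles on its own. In the engine, the constructor and destructor would wrap PROFILE_START() and PROFILE_END() instead.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the engine's profiler calls, so the sketch
// compiles on its own. gOpenBlocks tracks the currently open blocks.
static std::vector<std::string> gOpenBlocks;

inline void profileStart(const char* name) { gOpenBlocks.push_back(name); }
inline void profileEnd()                   { gOpenBlocks.pop_back(); }

// Scope guard: opens a block on construction and closes it on destruction,
// so every return path out of the enclosing scope closes the block.
class ScopedProfile
{
public:
    explicit ScopedProfile(const char* name) { profileStart(name); }
    ~ScopedProfile() { profileEnd(); }
    ScopedProfile(const ScopedProfile&) = delete;
    ScopedProfile& operator=(const ScopedProfile&) = delete;
};

// Example function with an early return; neither path leaves a block open.
inline int countHits(bool skip)
{
    ScopedProfile whole("countHits");
    if (skip)
        return 0;                      // guard closes "countHits" here

    ScopedProfile inner("countHits_loop");
    int hits = 0;
    for (int i = 0; i < 10; ++i)
        hits += (i % 2 == 0) ? 1 : 0;  // counts the even values of i
    return hits;                       // guards close in reverse order
}
```

With a guard like this, an early `return` cannot leave a block open, which is the kind of mismatch that makes profiler totals look wrong.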
#4
04/29/2006 (12:04 pm)
Ya, that's how I figured out that that was the offending line.
I have static collidable shapes... They aren't near the players though. Not that that helps, as I don't know how large the bins are.
It's Atlas terrain.
I'll rip through updateWorkingCollisionSet and find out exactly what's doing it.
Oh... I've tried removing all the statics from the mission. No difference.
#5
04/29/2006 (1:51 pm)
Is mWorkingQueryBox a normal size?
Your problem sounds very strange - Manoel's suggestion to do more detailed profiling seems reasonable to me.
#7
04/30/2006 (12:10 pm)
The query box is scaled by the velocity as well as the bounds box, so the model matters less than how it is being used (though big boxes make things harder at lower velocities). But looking at the code, I can't see how any error could get introduced. I think I'm just leading you down a dead end, but if you ever put a breakpoint in there, you could check whether the newLen variable is ever larger than you think it should be (roughly velocity / 25, I think).
#8
05/01/2006 (6:15 am)
Alright, I broke up updateWorkingCollisionSet a bit.
The boxes are 3.5x3.5x4 for guards (velocity 2) and 7x7x7 for patrollers (velocity 13-20).
Basically what's going on is buildConvex is being called on 4 - 6 separate entities... None of them do anything but Atlas.
Here is basically how it's being called:
if(mAtlasResource)
   mAtlasResource->buildCollisionInfo(localBox, convex, NULL);
// which calls
gotData |= e->mGeom->buildCollisionInfo(&chunkMat, localBox, c, poly);
// That is this:
AtlasResourceInfo::buildCollisionInfo
// which calls
registerConvex(localMat, offset, c);
0.002 0.002 357 AtlasResourceInfo_timesCalled
104.309 104.639 31474 AtlasResourceInfo_registerConvex
So, buildCollisionInfo was called 357 times and registered 31474 convexes.
So ~88 convexes are being registered every time we build the collision info. One convex appears to be 1 terrain triangle.
That seems like a lot to me. What do you guys think? Is it some kinda bug in Atlas, or is it from large boxes?
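The per-call figure comes straight from the two profiler counters quoted in this post. As a small worked check (the function name `convexesPerCall` is made up for illustration):

```cpp
// Average number of convexes registered per buildCollisionInfo call,
// computed from two profiler invoke counts.
inline double convexesPerCall(long registered, long calls)
{
    // Guard against a zero call count so the division is always defined.
    return calls > 0 ? static_cast<double>(registered) / calls : 0.0;
}
```

`convexesPerCall(31474, 357)` comes out at roughly 88, which matches the estimate above.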
#9
05/01/2006 (11:08 am)
I can concur with what you're saying, Brainiac. I recently ported over some old AI code from TGE last week; I basically just took it out of an old project and placed it into TSE. The AI worked fine at first, then I ramped up my threshold to give 3 AI players, and when they all get going my frame rate drops from 70 to 15. The same code running in TGE doesn't show much of a drop at all.
#10
05/02/2006 (8:29 am)
An update on this... It's definitely something TSE related. As a test (and out of curiosity) I ported our client over to TGE in the past day. It's basically the same project, merged and diffed: same scripts, same models... In TSE if I play with 0 bots I get 70-100 fps; if I push my minimum player setting up to 4 (so 1 player + 3 bots) then I drop to 13-17 fps. Huge performance decrease.
With the same player models and code in TGE... In TGE my frame rate is a lot more spiky than in TSE, with a frame rate of anywhere from 50-100 fps on average. With 1 player + 3 bots it dips to around 30-45 fps; with 1 player + 15 bots I still maintain a reasonable 20-30 fps.
It is definitely a TSE or Atlas based issue.
#11
05/04/2006 (11:12 am)
Would anyone care to offer a suggestion of how this might be fixed?
#12
05/04/2006 (11:27 am)
It sounds like an Atlas problem to me. I don't have the problem, so I can't really try fixing it, but Vincent had a different problem with Atlas that he resolved by increasing the tree depth when he was generating the .chu file (at least that's what it sounds like he did):
www.garagegames.com/mg/forums/result.thread.php?qt=43540
#13
05/04/2006 (12:00 pm)
We had this issue too and it's not Atlas, as we tried with only interiors present.
Porting back to TGE solved the performance issues.
#14
05/04/2006 (1:43 pm)
Stefan - According to my debugging it is Atlas that has the performance issue.
You may have been having issues because your interiors and TSStatics were trying to collide with the terrain as well ;)
#15
05/04/2006 (4:00 pm)
Does the performance remain low if the players aren't rendering? Skin mesh rendering in TSE CVS is pretty slow atm; Brian has been spending most of the last week optimizing this aspect of the engine, as it's a bit of a pain. ;)
Right now it's so slow that bringing 4-5 skinned players into an empty scene can drop you from well over 200FPS to around 90 or even less.
Something a bit odd about the profiler blocks quoted here is that they're showing well over 100% time spent on just that block - that suggests that there might be a mismatched block or other oddity happening. Maybe you're running on a system that tends to give broken QueryPerformanceCounter/RDTSC results? (Pretty much all timing methods "break" on some system out there; the QPC/RDTSC approaches tend to die on multiprocessor systems.)
How dense is your Atlas geometry? The system will bog down if you're doing frequent queries that give results over maybe 30 polygons. If you have a very finely tessellated terrain, things will run quite slowly. Similarly, if your players are moving very fast relative to the ground they'll query more geometry and you'll get the same effect.
As Max mentions, having a proper tree depth can have a huge impact on performance; ideally you want no more than a few thousand polygons in a chunk, or else performance queries will in fact run very slowly. Turn on bounding box rendering to see how big your chunks are, and wireframe to get an idea of how many polygons are in each chunk. The export tools also dump some information about this.
Interiors and TSStatics do not issue collision queries at runtime, so they would not be a performance issue. What happens if you try putting the same scene on a very large, simple interior (like a big box)?
#16
05/04/2006 (6:05 pm)
Yes, the performance remains low, even when the players are not being rendered.
I noticed that the profiler gives numbers over 100% if you don't profile for very long. I profiled for a good length of time before that and the numbers were normal. I went into the game very quickly so I could grab something to show.
The players aren't moving fast, or at all. The geometry doesn't look dense... Actually it kinda looks like TGE terrain.
I've been told the tree depth is 4 and it was made with a raw file of 2048x2048... I honestly don't know what that means. I'll have to research it.
And I don't know how to turn on wireframe or bounding box rendering. I'll google the site to try and find the info. If it's not something commonly known, please let me know how to do it :)
Thanks for the responses.
#17
05/04/2006 (6:37 pm)
The TSEDoc.chm file that ships with TSE has documentation on how to change the parameters I mentioned, as well as information on what the various creation parameters are about.
I believe with the parameters described you'll end up with 64 triangles per collision bin, which may be a bit more than you really want. The chunks end up being about 128 samples on a side as well, so they have 16384 verts and slightly more polygons, which might be a performance issue... unless my math is off; I'm about to leave for dinner so it might be. ;)
I'd like to see a full profiler dump of 30 seconds to a minute of gameplay, if it isn't too much trouble for you.
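The chunk figures above can be reproduced with a little arithmetic. This is a sketch under one assumption, inferred only from the numbers in this post: each tree level halves the chunk side, so the leaf chunk side is the heightfield side divided by 2^depth. The function names are made up for illustration, and the 64-triangles-per-bin figure is not derived here.

```cpp
// Samples along one side of a leaf chunk, ASSUMING each tree level
// halves the side (side / 2^depth). This is inferred from the figures
// quoted in the thread, not taken from the Atlas exporter.
inline unsigned chunkSide(unsigned heightfieldSide, unsigned treeDepth)
{
    return heightfieldSide >> treeDepth;
}

// Vertex count of one leaf chunk: a square grid of samples.
inline unsigned chunkVerts(unsigned heightfieldSide, unsigned treeDepth)
{
    const unsigned side = chunkSide(heightfieldSide, treeDepth);
    return side * side;
}
```

For a 2048x2048 source at depth 4 this gives 128 samples per side and 16384 verts, matching the estimate. Under the same assumption, depth 6 would give 32 samples per side and 1024 verts per chunk.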
#18
05/05/2006 (7:02 am)
Here ya go Ben ... and thanks for the tip on where to find the info.
Profiler Dump
#19
05/05/2006 (10:38 am)
Let me know if changing the tree depth helps. It looks like you're trying to register approximately 100 polygons per collision query, which most of the time will involve iterating over a list 100 elements long for each polygon, and doing this 10 times a second.
So you're basically doing an extra 100,000 collision ops, per tick. As an optimization you might want to try adding a bounding box check before the call to registerConvex; if that clears it up it might be worth keeping in. Better is to tune the collision structures so the check isn't needed. ;)
BTW - your profiler data seems kinda messed, most notably the negative sub-times.
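The bounding box check suggested above could look roughly like the following. This is a self-contained sketch, not the Atlas code: `Point3`, `Box3`, `Triangle`, `triangleBounds`, `overlaps` and `cullToQueryBox` are all hypothetical names, and in the engine the test would sit in front of the existing registerConvex call rather than build a new list.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in types; the engine has its own point and box classes.
struct Point3   { float x, y, z; };
struct Box3     { Point3 min, max; };
struct Triangle { Point3 a, b, c; };

// Axis-aligned bounds of a single triangle.
inline Box3 triangleBounds(const Triangle& t)
{
    Box3 b;
    b.min = { std::min({t.a.x, t.b.x, t.c.x}),
              std::min({t.a.y, t.b.y, t.c.y}),
              std::min({t.a.z, t.b.z, t.c.z}) };
    b.max = { std::max({t.a.x, t.b.x, t.c.x}),
              std::max({t.a.y, t.b.y, t.c.y}),
              std::max({t.a.z, t.b.z, t.c.z}) };
    return b;
}

// True when two axis-aligned boxes overlap or touch on every axis.
inline bool overlaps(const Box3& p, const Box3& q)
{
    return p.min.x <= q.max.x && p.max.x >= q.min.x &&
           p.min.y <= q.max.y && p.max.y >= q.min.y &&
           p.min.z <= q.max.z && p.max.z >= q.min.z;
}

// Keeps only the triangles whose bounds reach the query box. These are
// the ones that would go on to be registered as convexes.
inline std::vector<Triangle> cullToQueryBox(const std::vector<Triangle>& tris,
                                            const Box3& queryBox)
{
    std::vector<Triangle> kept;
    for (const Triangle& t : tris)
        if (overlaps(triangleBounds(t), queryBox))
            kept.push_back(t);
    return kept;
}
```

The box test is only a few comparisons per triangle, so it is cheap next to walking the working list for each polygon.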
#20
05/09/2006 (7:59 am)
Ok, I changed the tree depth to 6. I have no performance issues coming from Atlas now :) Actually it surprised me how little CPU it took once I changed the tree depth.
Now my performance issues are from TSShape rendering... Which is odd because all I've got is my player and 8 or so AI players.
Quote:
28.368 28.465 3076 TSShapeInstanceRender
What kinda time frame is expected on Brian's optimizations? Is this a next Milestone thing, or are they going to be in the CVS soon?
I have always found my profiler blocks get messed up if I'm rendering at < 1 FPS.
Torque Owner Prairie Games
Prairie Games, Inc.
-JR