Optimization tips for a large number of sprites?
by Bob Dobbs · in Torque X 2D · 12/02/2010 (2:25 pm) · 41 replies
Hey community, care to lend a helping hand? I'm at the end of my tether trying to optimize my project.
I thought I had fixed this issue a long time ago, but it seems to be rearing its ugly head again. I'm almost at the giving-up-and-binning-it stage T_T
The main problem seems to be that I have too many sprites on screen. Basically I'm attempting to spawn 100+ enemies. I've searched the forums and tried the "FarDistance" trick, setting UseLayerSorting to false, and tinkering with the core as other posts have suggested.
I'm just wondering: can Torque, or XNA for that matter, handle more than 100-300 moving animated sprites in a scene at one time? (Maybe I'm asking or expecting too much.)
Here's the lowdown:
Each of my enemies is made up of 3 sprite layers: the head, the torso and the legs.
Each of these parts has 30 frames of animation as a 326x334 animated sprite, and there are 5 variations on each of those animations; walk, lunge, attack and death.
I attempt to spawn 100 enemies, so quick calculation time: 3 * 100 = 300 sprites. They appear on screen from the spawner at intervals of one second in a large spawn area, then are activated with a simple AI player-chase component.
With just 100 "enemies" (made up of 3 sprites each, remember) there are no problems on PC; on Xbox about 40 is my limit before the game's FPS drops and the scrolling gets jerky.
Following the golden rules of optimization and bug fixing, I thought: OK, break it down...
Too many sprites?
Too much animation data?
Sprites too big?
Too much collision data?
Frequency of AI updates?
So I set about trying the forum suggestions with layer sorting and FarDistance: a little bit of improvement.
Next I thought: how about setting the sprites' visibility, animation data, velocity and unneeded collisions to null or false while out of camera view, i.e. almost "disabling" the 3 sprites that make up an enemy while they are outwith the camera view. See www.torquepowered.com/community/forums/viewthread/122951
with a bit of:
sceneObject.AnimationPaused = true;
sceneObjectHead.Visible = false;
sceneObjectTorso.Visible = false;
sceneObjectLegs.Visible = false;
sceneObject.Collision.CollidesWith -= TorqueObjectDatabase.Instance.GetObjectType("objPlayer"); // only other collision is fellow enemies and walls
_active = false; // the AI chase
_speed = 0; // the velocity
Again a slight improvement, but no great shakes.
I then dismantled my spawner so that it wouldn't generate the enemies as 3 layers of mounted sprites, just a single unanimated sprite. I got up to about 200 before lag started happening.
Here's a video
Next step is to reduce the animation data from 30fps to 10fps AND reduce the sprite size by half.
I'm worried about doing the above as it really degrades the graphic quality of my game. I was setting out to make something that looked really polished, so it's a bit disheartening to do so.
Ideally I would want a vast number of enemies "present" and then activate them while they are on camera. If I could hit the 200 mark I would be happy (mind, that's 600 sprites, because my enemies are made up of threes).
Does anyone have any tips for debugging or improving performance?
Or, just by looking at my sizes and amount of animation data, am I expecting too much? Looking at other posts, lots of people seem to run into the 300 mark before lag kicks in. Is this the limit?
I've been tackling this for a week and a bit now. Previously I had generated 200+ sprites in a very small play area, but now that I'm building my level to finalise the game I'm hitting this hurdle with lag, and it's heartbreaking.
Any help or debug suggestions no matter how small would be really appreciated, thanks in advance.
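As a rough sanity check on the "sprites too big / too much animation data" suspects above, here is a back-of-envelope memory estimate. It is only a sketch: it assumes uncompressed 32-bit frames and ignores padding, mipmaps and sheet packing, none of which are stated in the post, and `SpriteMemoryEstimate` is a made-up helper, not a Torque X class.

```csharp
using System;

// Back-of-envelope estimate only. Assumes uncompressed 32-bit RGBA frames and
// ignores padding, mipmaps and sheet packing (none are specified in the post).
public static class SpriteMemoryEstimate
{
    public const int BytesPerPixel = 4; // assumption: 32-bit colour

    // Bytes for one animation of one body part.
    public static long BytesPerAnimation(int width, int height, int frames)
    {
        return (long)width * height * BytesPerPixel * frames;
    }

    // Bytes for every animation of every body part of one enemy type.
    public static long BytesForEnemy(int width, int height, int frames,
                                     int parts, int animations)
    {
        return BytesPerAnimation(width, height, frames) * parts * animations;
    }
}
```

Under those assumptions, `BytesPerAnimation(326, 334, 30)` comes to 13,066,080 bytes (about 12.5 MB) for a single animation of a single part, and `BytesForEnemy(326, 334, 30, 3, 4)` to roughly 150 MB for the three parts and the four animations actually named (the post says five variations but lists four). Halving the width, height and frame count cuts the figure to one eighth, which suggests why that step is so tempting despite the quality loss.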
#2
12/27/2010 (9:57 pm)
Hey guys, this topic is also something I'm really interested in. If you have any tips, Pino, I'd love to hear them.
Hoddie,
I've recently spent a lot of time trying to optimize our game. We are running into similar issues... too many sprites on the screen and the FPS becomes crap. Our game runs fine on PC, but on the Xbox it was unplayable. There was far too much stuttering. After looking into it for weeks, I have optimized it to be playable. I wouldn't have been able to do it if I didn't have access to the engine source code. Our problem was mainly garbage collection. If you haven't profiled your game to see what the garbage collector is doing, then I'd start there first:
http://blogs.msdn.com/b/shawnhar/archive/2007/06/29/how-to-tell-if-your-xbox-garbage-collection-is-too-slow.aspx
http://blogs.msdn.com/b/shawnhar/archive/2007/07/02/twin-paths-to-garbage-collector-nirvana.aspx
I found that it would be too difficult to try to keep the number of objects allocated on the heap to a small number, so I focused on not making run-time allocations. When I started this process, our game was making between 500,000 and 1,000,000 bytes of allocations on the heap each second; it was ridiculous and led to the game stuttering all the time. Now we're down to somewhere around 15,000 bytes on average. It's not perfect, but it's good enough for now and plays at 60 fps most of the time.
This seems to be just one side of the problem though. As you have noted, the more sprites there are on the screen, the lower the FPS. We still have this problem as well. Right now we're limited to probably around 130-150 sprites on the screen at once. After that it starts slowing down. If the sprites have particle effects attached to them, then it's going to be closer to 50 sprites. (We mount particle effects to some of our weapons being fired, makes it look really cool, but we have to watch how much we do it).
Anyway, I haven't made much progress on this second part of the problem, and if anyone has any tips I'd love to read them as well.
Thanks!
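To illustrate what "not making run-time allocations" can look like in practice, here is a minimal sketch of one common pattern: allocate a results list once and refill it every frame instead of creating a new one. `Enemy` and `VisibleEnemyQuery` are hypothetical names for illustration only; they are not Torque X or XNA types, and this is not necessarily what was changed in the engine source.

```csharp
using System.Collections.Generic;

// Hypothetical types for illustration; not part of Torque X or XNA.
public struct Enemy
{
    public float X;
    public float Y;
}

public sealed class VisibleEnemyQuery
{
    // Allocated once up front and reused on every call.
    private readonly List<Enemy> _results;

    public VisibleEnemyQuery(int capacity)
    {
        _results = new List<Enemy>(capacity);
    }

    // Refills the shared list instead of returning a new list each frame.
    // A plain for loop over the array is used rather than foreach or LINQ.
    public List<Enemy> Find(Enemy[] all, float minX, float maxX)
    {
        _results.Clear(); // empties the list but keeps its backing array
        for (int i = 0; i < all.Length; i++)
        {
            if (all[i].X >= minX && all[i].X <= maxX)
            {
                _results.Add(all[i]);
            }
        }
        return _results;
    }
}
```

The trade-off is that callers must not hold on to the returned list across frames, since the next call overwrites it. As long as the initial capacity is large enough, the per-frame query itself should not need to grow the list.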
#3
12/28/2010 (12:12 am)
Hey Jacob, thanks for your insights. I will look into those links later and have a proper looky see.
As for me, I can get up to 145 enemies on screen, which due to my enemies' design is actually 3 x 145 = 435 sprites, before I hit a reduction in frame rate. Pino did some great tweaks, as I was overusing findObject, and I had some other suggestions from the community as well.
It's still not perfect and I'm still trying to optimise to get a perfect "200" enemies on screen at one time without lag.
Here are a few things I tried:
1) Removing the overlaps and collision polys for mounted sprites to cut down on collisions, as well as removing almost ALL components other than the strictly essential.
2) Reduced the sprite sheet sizes by half, reduced the number of frames by half and enabled DXT compression. This improves performance, but the smoothness of the animations and the quality of the graphics have taken a big knock that I’m not 100% satisfied with. I guess in optimisation those sacrifices have to be made.
3) Experimented with the pool and pool-with-components options, but without much success; it didn’t seem to make any radical difference. Then again, I’m perhaps not using them correctly, i.e. implementing a proper pooling strategy. I was reading up on it in Jon Kanalakis’ book: “only requires that you implement CopyTo() method for all of your objects”. Hmmm, not exactly sure what he means, but I'm digging around the Torque forums for clues and hints on pooling.
4) Try delaying your processTick for your enemies' AI or movement process, like so:
elapsedSinceAiProcessing += elapsed;
if (elapsedSinceAiProcessing < 0.50f)
{
    // Not yet time for the next AI update, so skip this tick.
    return;
}
elapsedSinceAiProcessing = 0;
// Do whatever you want to do
There is a slight drawback to this: the enemies' rotation looks a little bit “tick tock”.
I also implemented setting the enemies' velocity to 0 if they are outwith the circle of light round the player or off camera (see video).
The way I see it, the aim is to get any enemies that are off screen doing as little as possible, i.e. remove non-essential collision polys and cease movement.
Try to get your game spawning enemies without movement and/or other components, then gradually add velocity, collisions, other components, etc. to see what's causing the lag. It's a pain in the balls having to redeploy to Xbox each time when it works OK on PC, but I guess it's the only way to find out.
I'm still tweaking and fiddling with other ideas. If you want a look at Pino's fixes, just mail me at hoddie.campbell@gmail.com and I can give you SVN access to what Pino has done. And vice versa: if you can give me more detail on your project, I can see if I can make any other suggestions, as this issue is a major "clog" for my own project.
All the best
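On the pooling question in point 3, here is a minimal, generic sketch of the idea: create every enemy up front and recycle instances rather than constructing new ones during play. This is not Torque X's built-in pool or the CopyTo() mechanism from the book; `PooledEnemy` and `EnemyPool` are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical enemy type for illustration; not a Torque X class.
public sealed class PooledEnemy
{
    public bool Active;
    public float X, Y;

    // Restore a recycled instance to a clean starting state.
    public void Reset(float x, float y)
    {
        X = x;
        Y = y;
        Active = true;
    }
}

public sealed class EnemyPool
{
    private readonly Stack<PooledEnemy> _free;

    // Create every instance up front so spawning allocates nothing later.
    public EnemyPool(int size)
    {
        _free = new Stack<PooledEnemy>(size);
        for (int i = 0; i < size; i++)
        {
            _free.Push(new PooledEnemy());
        }
    }

    public int Available { get { return _free.Count; } }

    // Returns null when the pool is exhausted rather than allocating.
    public PooledEnemy Spawn(float x, float y)
    {
        if (_free.Count == 0) return null;
        PooledEnemy e = _free.Pop();
        e.Reset(x, y);
        return e;
    }

    // Hand an instance back so a later Spawn can reuse it.
    public void Despawn(PooledEnemy e)
    {
        e.Active = false;
        _free.Push(e);
    }
}
```

A pool like this mainly helps with garbage collection pressure from spawning and despawning. It would not be expected to raise the number of sprites that can be drawn or updated per frame, which may be why pooling on its own made little visible difference here.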
#4
01/05/2011 (2:08 am)
I have most of my animation using schedules. Taking the frame-by-frame animation out of the update functions speeds up the update functions considerably. I can have some sprites updating every 50ms, others every 150ms. There can be a tick-tock sort of animation with skipped frames but for my purposes, it works just fine. For background animations, it's fine.
Simplifying the collision poly has worked well for me. You can expand the collision poly of the torso to encompass the head which will make collision detection simpler. The head becomes a simpler object.
There's always the option to move code into C++. That's always my last resort.
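The torso-covers-the-head idea can be sketched with plain axis-aligned boxes. This is only an illustration: Torque X collision polys are more general than boxes, and `Box` is a hypothetical type, not an engine class.

```csharp
using System;

// Hypothetical axis-aligned box; real Torque X collision polys are more general.
public struct Box
{
    public float MinX, MinY, MaxX, MaxY;

    public Box(float minX, float minY, float maxX, float maxY)
    {
        MinX = minX; MinY = minY; MaxX = maxX; MaxY = maxY;
    }

    // Smallest box that contains both inputs.
    public static Box Union(Box a, Box b)
    {
        return new Box(
            Math.Min(a.MinX, b.MinX), Math.Min(a.MinY, b.MinY),
            Math.Max(a.MaxX, b.MaxX), Math.Max(a.MaxY, b.MaxY));
    }

    // True when the two boxes touch or overlap.
    public bool Overlaps(Box other)
    {
        return MinX <= other.MaxX && MaxX >= other.MinX
            && MinY <= other.MaxY && MaxY >= other.MinY;
    }
}
```

With one merged box per enemy there is a single overlap test instead of one per body part. The cost is precision: the merged box also covers the empty space beside the head, so some hits will register that the separate shapes would have missed.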
#5
01/05/2011 (3:41 am)
Cheers Nikos, that gives me some more food for thought. I don't really use much background animation. I've been experimenting with lots of options, but I don't think my current endeavours will get me up to the magic 200-300 I'm after without a rebuild of sorts. I think I need to implement a proper finite state model to streamline processing of my enemies' and player's motions; so far it has been a little "slapdash". Out of interest, what kind of sprite count were you aiming for? What kind of sizes, etc.? Just curious for reference.
All the best
#6
01/19/2011 (9:35 am)
Maybe jon quil has some tips, I remember playing his game Burn1420 in playtest, it had hundreds of enemies moving around on screen. He's a member here.
#7
01/23/2011 (3:00 pm)
I'm also having this problem currently, but in a much different setup. I have two grids on my board, which are basically a much simplified version of the tilemap that Torque uses. They hold a grid position and an object. I place sprites into the grid, all of which have no collision components or data. The grid size is 12x6, so I have a potential for 72 total sprites to fill this grid. Each of these sprites is 50x50. If two similarly colored sprites are touching, I also introduce an animation effect which is 4 frames and runs at 8 fps with a size of 100x100. When either of the boards begins to fill up, the framerate drops, down to almost 6fps. There has to be some sort of optimization I'm missing here. On the PC, I run at 30fps consistently.
#8
01/23/2011 (3:42 pm)
Hey guys,
I've been quite busy lately between my day job, my games and trying to finalize the 4.0 CEV, so I couldn't spare time for this issue of yours. As soon as the CEV reaches GA I'll look into this problem and try to produce some sample code. Basically, from what I can understand, your code needs attention and some refactoring, as I can't really see this as an engine issue.
#9
01/23/2011 (9:25 pm)
As a follow up to this, I've turned off all effects, including my particle effects and animations, and now only have the base sprites, which are 50x50. If I fill the board up (so 72 + 72 = 144), the framerate drops to 7-10 fps. If I pause the game, which pauses all of my game's processTicks (not affecting my GameTimeScale, I just check for a boolean isGamePaused), the framerate remains the same. I don't believe it's a problem with my GC since I've run the profiler and the GC is rarely running, so I'm not sure what is going on. When I restart, which clears the board, the framerate jumps back up to 60 fps. Will continue to research this.
#10
01/24/2011 (1:05 am)
@Michael: The best approach would be to produce a repro project and send it over to me.
#11
01/24/2011 (3:52 am)
@Pino - Hey, no probs, I understand you're busy. Have you had any other ideas in addition to the stuff you implemented in my SVN? I gained employment myself recently, so back to busy, busy schedules =( I'm kinda starting from scratch again by using the PSK with its Actors and FSM as a means to see if I can make more efficient code, though I'm a wee bit worried I will end up back at square one. I guess having 600 moving sprites on screen is a little over-ambitious, but I will wait and see. Will PM you sometime when I make progress. Thanks for all the hard work so far!
@Aaron - Yep, I have spoken to Jon quite a bit. He gave me a few suggestions, such as delaying the process tick.
#12
01/24/2011 (4:37 am)
If you are using CEV4, but you don't need to render particles from offscreen emitters, then you can set T2DSceneGraph.RenderOffScreenParticles to false to free some more CPU.
#13
01/24/2011 (8:53 am)
@Hoddie: yep, I did some more refactoring but I had to stop for lack of time :( As soon as I get a workable timeslot I'll finish that and make a commit.
#14
01/24/2011 (9:55 pm)
I've modified my code to see how many unique Texture2Ds I could draw before beginning to take a performance hit. Using the same texture that my T2DStaticSprite uses, I was able to render 35,000 textures before it hit the framerate I was getting when rendering 140 T2DStaticSprites. With UseLayerSorting on the scenegraph set to false, I can get to 400 sprites before the framerate drops to 10. I know Torque renders differently and T2DStaticSprites hold more information, but why is there such a large discrepancy in the speed at which the objects render?
#15
01/24/2011 (11:26 pm)
I think Torque X isn't doing any batching: it renders like the XNA SpriteBatch Immediate mode, drawing all the objects right away, which causes the drop in performance when too many objects are in the scene. I think performance could be improved by batching T2DSprites to reduce the number of draw calls, similar to what SpriteBatch's Deferred mode does, or by rewriting the renderer to use SpriteBatch while maintaining compatibility with the material system, of course. I plan on doing this, but I won't be able to get to it until I finish the art for my game.
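To make the batching idea concrete, here is a small sketch that counts how many batches a list of sprites would need if consecutive sprites sharing a texture are submitted together, before and after sorting. It does not touch Torque X or SpriteBatch at all; `DrawCommand` and `BatchCounter` are hypothetical, and `TextureId` merely stands in for a real texture reference.

```csharp
using System.Collections.Generic;

// Hypothetical draw record; TextureId stands in for a real texture reference.
public struct DrawCommand
{
    public int TextureId;
    public int Layer;
}

public static class BatchCounter
{
    // Counts the batches needed when consecutive commands that share a
    // texture are submitted together.
    public static int CountBatches(List<DrawCommand> commands)
    {
        int batches = 0;
        for (int i = 0; i < commands.Count; i++)
        {
            if (i == 0 || commands[i].TextureId != commands[i - 1].TextureId)
            {
                batches++;
            }
        }
        return batches;
    }

    // Orders by layer first (so draw order between layers is preserved), then
    // by texture so same-texture sprites within a layer become adjacent.
    public static void SortForBatching(List<DrawCommand> commands)
    {
        commands.Sort(delegate(DrawCommand a, DrawCommand b)
        {
            int byLayer = a.Layer.CompareTo(b.Layer);
            return byLayer != 0 ? byLayer : a.TextureId.CompareTo(b.TextureId);
        });
    }
}
```

For example, six sprites on one layer whose textures alternate 1, 2, 1, 2, 1, 2 need six batches as submitted and two after sorting. Whether draw-call count is really the bottleneck in this thread is a separate question that only profiling can answer.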
#16
01/25/2011 (2:28 am)
@Michael: well... that's quite obvious :) I've got your repro and actually can't understand how that can be... looking into the issue.
@Alex: agreed. Actually, I tried to use the SpriteBatch when I started optimizing the engine, but its implementation breaks the whole thing :( It's true that I didn't have a lot of inner TX knowledge when I did that (it was at the very beginning of the CEV), so maybe I missed something obvious, but that made me put the idea away as non-viable... it might be a good idea to look into it again with the TX knowledge built up since :)
#17
01/25/2011 (11:57 am)
@Pino - Are you not experiencing the same framerate issues on the 360 with the repro project I sent you?
#18
01/25/2011 (1:48 pm)
Yep, that's why I said that I can't understand how that can be... and mostly how I didn't see it in the first place!
#19
01/25/2011 (2:27 pm)
Hmm, and no changes were made on your end?... I'll double check tonight. I was running a debug instance from my wireless laptop; I don't know if that would have anything to do with it. I'll report back with what I find.
#20
01/25/2011 (2:43 pm)
Maybe I'm not making myself clear: I do see the problem on my Xbox and I wonder how I didn't see it before.
BTW: I've just tried using the XNA SpriteBatch instead of the RenderQuad and... no good... the problem doesn't lie with the rendering part of the code. Still investigating...
Associate Giuseppe De Francesco
DFT Games Ltd