iTorque 2D engine efficiency boost
by Daniel Liverance · in iTorque 2D · 08/10/2011 (6:30 am) · 31 replies
Hello everyone.
-- note: what follows is more of a story. If you are just here to get the performance boost,
scroll all the way to the bottom and find the code (it's simple, I promise). Otherwise, enjoy the story :)
--------------- STORY ---------------
I recently finished work on a game for the iPad called Space Fart (you can find a blog about it on GarageGames here -> Blinker Studios - Space Fart).
I began this project by running a simple test project on an iPad (1st Generation) and printing out the frame rate to see how well it did.
I was mortified when I saw that with ONLY a background of 1024 x 1024 pixels it ran at about 30 - 35 fps (I forget the exact number now, and don't have the iPad with me to reproduce it). That was ONLY A BACKGROUND!!! Obviously, when we added ANYTHING else to the scene the FPS dropped even further, until our game was sporting a nice 15 - 20 fps. This was simply not acceptable.
So I ran some profiling tools that I have on my Mac. One of them is called OpenGL ES Performance Detective (pretty awesome name). It shows the FPS of a running app on a device that is using OpenGL ES (which iTorque does use), and when the FPS drops to a level that I find unacceptable I click the Collect Evidence button. It immediately starts checking the app to see what is going on and tells me the problem.
When I ran the tool on Space Fart for the iPad, this is what it told me:

As you can see, it says that the FPS is limited by the graphics pipeline, and it offers some suggestions such as reducing the number of pixels pushed to the screen and so forth. BUT the most notable one is at the bottom, where it says that the app is using alpha tests. I thought this was normal, because obviously you need to take out the alpha pixels in an image that uses alpha, or else you get a big white spot wherever the alpha pixels were.
Well, after this I did some research on OpenGL ES for iOS devices and I came across this page -> Open GL ES Tuning .
That page offers a lot of tweaks that will improve the graphics performance of an app, but at the very bottom it mentions that you need to avoid alpha testing using GL_ALPHA_TEST. I took a look through the engine and, sure enough, I found it. Now, I thought that commenting out GL_ALPHA_TEST would make white pixels appear on every image I drew that had alpha enabled, but I soon found out that blending is the fix for that. If you turn off blending AND comment out GL_ALPHA_TEST, the white pixels will appear. But if you turn on blending and leave out the alpha test, the alpha pixels are blended with the actual coloured pixels behind them, and the result is as if the alpha pixels were ignored.
This is good news because in every iTorque game, blending is on by default for every scene object. So if you have a background of any type, and blending is enabled on it, you do not need to do an alpha test.
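To make the three cases above concrete, here is a small CPU-side sketch of what happens to a single fully transparent pixel. It is only an illustration, not engine code: the `Pixel` type and the function names are made up for this example, it assumes the usual (GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) blend function, and it assumes the transparent texels store white in their colour channels.

```cpp
#include <cstdint>

// One RGBA pixel with 8-bit channels (hypothetical helper type for this sketch).
struct Pixel {
    std::uint8_t r, g, b, a;
};

// Conceptually what GL_ALPHA_TEST with glAlphaFunc(GL_GREATER, 0.0f) checks:
// a fragment passes only if its alpha is greater than the reference value.
bool passesAlphaTest(const Pixel& src) {
    return src.a > 0;
}

// Conceptually what blending with (GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) computes
// per colour channel: result = src * alpha + dst * (1 - alpha), rounded to nearest.
std::uint8_t blendChannel(std::uint8_t src, std::uint8_t dst, std::uint8_t alpha) {
    return static_cast<std::uint8_t>((src * alpha + dst * (255 - alpha) + 127) / 255);
}

// Simulates drawing one source pixel over one framebuffer pixel.
// The framebuffer alpha is simply set to 255 to keep the sketch short.
Pixel drawPixel(const Pixel& src, const Pixel& dst, bool alphaTest, bool blending) {
    if (alphaTest && !passesAlphaTest(src)) {
        return dst;  // fragment discarded, framebuffer untouched
    }
    if (!blending) {
        return Pixel{src.r, src.g, src.b, 255};  // source colour overwrites the framebuffer
    }
    return Pixel{blendChannel(src.r, dst.r, src.a),
                 blendChannel(src.g, dst.g, src.a),
                 blendChannel(src.b, dst.b, src.a),
                 255};
}
```

Drawing a transparent white pixel `{255, 255, 255, 0}` over a background pixel leaves the background unchanged both with the alpha test on and with blending on, which is why the test is redundant once blending is enabled. Only the "both off" case writes the white pixel.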
So I commented out the glEnable(GL_ALPHA_TEST) and glAlphaFunc(.....) calls and ran my projects again, to find that Space Fart, which had a measly 23 FPS, now had a decent 45 - 50 fps. As well, the test project with just the background went from 30 - 35 FPS to a solid 60 FPS.
Incredible, it's roughly a 2x speed boost with no catch. At least I believe there is no catch (I would appreciate any comments below; tell me if there is any trouble). The reason, apparently, is that GL_ALPHA_TEST disables some of the GPU's hardware optimizations for fragment processing.
So I think that EVERYONE using iTorque needs to read this code fix, because I also saw a performance boost on the iPhone 3G AND the iPhone 4. (I am also 100% convinced that this fix will allow for at LEAST 30, up to 40, fps using Retina Display, seeing as the iPad 1 and iPhone 4 have the exact same CPU and the iPad has a higher resolution. A good tutorial to get Retina going can be found here -> Retina Display by Pedro Vicente .)
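The resolution comparison behind that guess can be checked quickly. The sketch below uses the native screen resolutions of the two devices (1024 x 768 for the iPad 1, 960 x 640 for the iPhone 4 Retina Display). The type and constant names are made up for this example, and pixel count is only a rough stand-in for fill cost, not a measured result.

```cpp
#include <cstdint>

// Screen size in pixels (hypothetical helper type for this sketch).
struct Resolution {
    std::uint32_t width;
    std::uint32_t height;
};

// Total number of pixels the GPU has to fill for one full-screen layer.
constexpr std::uint64_t pixelCount(Resolution r) {
    return static_cast<std::uint64_t>(r.width) * r.height;
}

constexpr Resolution kIPad1{1024, 768};
constexpr Resolution kIPhone4Retina{960, 640};
```

`pixelCount(kIPad1)` is 786,432 and `pixelCount(kIPhone4Retina)` is 614,400, so a full-screen layer on the iPad 1 covers about 1.28 times as many pixels as on the iPhone 4 in Retina mode.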
So thank you for following along. It was a long story, I know, but treat yourself to the code down below. It seems like it's a small amount of code, but enjoy anyways :)
--------------- Story End, Time for the Code ----------------
Comment out lines 2918 and 2919 inside of "iTorque2D1_5_Preview2/engine/source/T2D/t2dSceneWindow.cc" (change iTorque2D1_5_Preview2 to whatever version you are currently using).
This is what it should look like before and after:
Before:
// Setup new viewport.
dglSetViewport(updateRect);
// Set ModelView.
glMatrixMode(GL_MODELVIEW);
glPushMatrix();
glLoadIdentity();
// Enable Alpha Test.
glEnable ( GL_ALPHA_TEST ); //<------------------------------- you want to comment out this line
glAlphaFunc ( GL_GREATER, 0.0f ); //<------------------------------- and this line
glDisable ( GL_DEPTH_TEST );
//glEnable ( GL_DEPTH_TEST );
//glDepthFunc ( GL_LEQUAL );
// implement "Don't Render Object" functionality. Hide it before we render, unhide it afterwards.
bool previousDontRenderObjectVisibility = false;
if ((mpDontRenderObject != NULL) && mpDontRenderObject->mVisible)
{
    previousDontRenderObjectVisibility = true;
    mpDontRenderObject->mVisible = false;
}

And after:
// Setup new viewport.
dglSetViewport(updateRect);
// Set ModelView.
glMatrixMode(GL_MODELVIEW);
glPushMatrix();
glLoadIdentity();
// Enable Alpha Test.
//glEnable ( GL_ALPHA_TEST ); <-------------------------- Commented out
//glAlphaFunc ( GL_GREATER, 0.0f ); <-------------------------- Commented out
glDisable ( GL_DEPTH_TEST );
//glEnable ( GL_DEPTH_TEST );
//glDepthFunc ( GL_LEQUAL );
// implement "Don't Render Object" functionality. Hide it before we render, unhide it afterwards.
bool previousDontRenderObjectVisibility = false;
if ((mpDontRenderObject != NULL) && mpDontRenderObject->mVisible)
{
    previousDontRenderObjectVisibility = true;
    mpDontRenderObject->mVisible = false;
}

Thanks again and keep an eye out for any more apps from Blinker Studios ;)
About the author
I am 21 years old, and have just recently graduated from College for Game Programming (with honours) and I am now the Lead Programmer at Blinker Studios.
#2
08/10/2011 (9:56 am)
@Daniel
It looks impressive. And thanks for the credit regarding my resource. I adapted that post from several earlier posts on the site, credited at the end. I have to give "OpenGL ES Performance Detective" a try :-)
#3
08/10/2011 (11:08 am)
Alright, I had a discussion with the team about this. We are actually going to extend this to provide a little more control. Some users are probably disabling Blending and could be making use of alpha testing. If that's the case, directly disabling the alpha testing could break their project.
Here is what we will do:
1. Leave the Blending rollout, with the default values set as they are now
2. Add a new rollout for Alpha Testing, disabled by default. This gets us to where you are at. With this disabled, the GL_ALPHA_TEST will not occur
3. The user can choose to disable blending and enable alpha testing. Additionally, they can set the value of the alpha test instead of using the globally set value
I'll work on a porting guide before release to better explain this.
#4
08/10/2011 (11:20 am)
@Michael
wow that was fast :)
That is a good idea actually, with the alpha test disabled by default.
Thanks for the support, I'm glad I could help.
@Pedro
NP for the credit, I actually used that to get Retina Display working on some of my projects.
#5
08/10/2011 (11:57 am)
Quote: wow that was fast :)
I guess I forgot to mention that the iT2D team grew by three people recently, one of whom is a very senior programmer. That helps get things done quickly. Eric (CEO) also got involved with the evaluation.
This also opened up another can of worms in the object serialization macros. The change we are about to make will break backwards compatibility if we do not make any modifications. Anyone who has used iT2D for a while is familiar with the constant breakage of particle effects when a new engine update is released. This is due to the changes to t2dSceneObject's property list.
What we are about to do affects every single game object, which makes me worry that no one's objects will show up properly (if at all). This has motivated me to inject an automatic serialization upgrade routine, which will check to see if the project being loaded is older. If it is older, then all the objects will use a previous version of the serialization loader. Then everything will be saved out using the new header.
I will make sure you get credit for both the discovery of the speed boost and opening up the can of worms that led up to this =)
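The upgrade routine described above (detect an older project, load it with the previous loader, then save with the new header) can be sketched roughly as follows. This is not the engine's actual serialization code. The version numbers, the `SerializedObject` and `SceneObject` types, and the field layout are all hypothetical and only show the general idea.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical format versions; the engine's real header values are not given in this thread.
constexpr int kLegacyVersion = 1;   // before the alpha test property existed
constexpr int kCurrentVersion = 2;  // adds the alpha test property

// Hypothetical on-disk form: a version header followed by the property list.
struct SerializedObject {
    int version;
    std::vector<int> fields;  // v1: {blending}; v2: {blending, alphaTestValue}
};

// Hypothetical in-memory object with only the two properties discussed here.
struct SceneObject {
    bool blending = true;
    int alphaTestValue = -1;  // -1 means alpha testing is disabled
};

// Previous-version loader: reads the old property list and leaves new properties at defaults.
SceneObject loadLegacy(const std::vector<int>& fields) {
    if (fields.size() != 1) throw std::runtime_error("malformed legacy object");
    SceneObject object;
    object.blending = fields[0] != 0;
    return object;
}

// Current-version loader: reads the full property list.
SceneObject loadCurrent(const std::vector<int>& fields) {
    if (fields.size() != 2) throw std::runtime_error("malformed object");
    SceneObject object;
    object.blending = fields[0] != 0;
    object.alphaTestValue = fields[1];
    return object;
}

// Picks the loader that matches the version found in the header.
SceneObject load(const SerializedObject& in) {
    switch (in.version) {
        case kLegacyVersion:  return loadLegacy(in.fields);
        case kCurrentVersion: return loadCurrent(in.fields);
        default: throw std::runtime_error("unknown serialization version");
    }
}

// Saving always writes the newest header, so an old project is upgraded on its next save.
SerializedObject save(const SceneObject& object) {
    return SerializedObject{kCurrentVersion, {object.blending ? 1 : 0, object.alphaTestValue}};
}
```

Loading a legacy object and saving it again, `save(load(oldObject))`, produces a current-version record in which the new alpha test property holds its disabled default, so older objects keep rendering the way they did before.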
#6
08/10/2011 (4:31 pm)
Alright, we have something worked in. As far as presentation goes, this is a WIP. However, this achieves the performance boost by default. Additionally, this does not break previous projects/objects and provides control over this optimization (for those of you who are actually using GL_ALPHA_TEST):
#7
08/10/2011 (4:32 pm)
Notice the new rollout called Alpha Testing under Blending. Also notice blending has been disabled, which would normally result in a white/garbled background if Alpha Test Value is set to -1. You can enable alpha testing by setting it to 0 - 255.
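One way to read the Alpha Test Value setting described above is as a small mapping from the editor value to GL state. The sketch below is a guess at that mapping, not the engine's implementation. The `AlphaTestState` type and function name are made up, and dividing by 255 to get the glAlphaFunc reference value is an assumption.

```cpp
// Hypothetical description of the alpha test state one object would request.
struct AlphaTestState {
    bool enabled;     // whether GL_ALPHA_TEST should be turned on for this object
    float reference;  // reference value for glAlphaFunc, in the range 0.0 to 1.0
};

// Maps the editor's Alpha Test Value (-1 = disabled, 0 - 255 = enabled) to GL state.
// Assumption: the 0 - 255 value is scaled to the 0.0 - 1.0 range that glAlphaFunc expects.
AlphaTestState alphaTestStateFor(int alphaTestValue) {
    if (alphaTestValue < 0) {
        return AlphaTestState{false, 0.0f};
    }
    const int clamped = alphaTestValue > 255 ? 255 : alphaTestValue;
    return AlphaTestState{true, static_cast<float>(clamped) / 255.0f};
}

// A renderer could then apply it per object, roughly like this (needs a GL context,
// so it is shown as a comment only):
//
//   const AlphaTestState state = alphaTestStateFor(object.alphaTestValue);
//   if (state.enabled) {
//       glEnable(GL_ALPHA_TEST);
//       glAlphaFunc(GL_GREATER, state.reference);
//   } else {
//       glDisable(GL_ALPHA_TEST);
//   }
```

With the default of -1 the test stays off, which gives the performance boost, while a value of 0 reproduces the original `glAlphaFunc(GL_GREATER, 0.0f)` behaviour for objects that have blending disabled.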
#9
08/10/2011 (7:09 pm)
Nice additions, it looks good.
Also I am truly honoured to have been given credit for this :)
Thanks again.
#10
08/10/2011 (7:49 pm)
Awesome! Finally itorque can have decent performance on the iPad1 with this optimization. Thanks Daniel. It's a happy day. Now..if only itorque had built-in support for box2d :)
@Michael: are you saying that if we use this optimization and have blending off, the app will crash?
*update* just put in this optimization in Are You Quick Enough and everything ran fine. All the moving graphics are so much smoother! I have blending on and off on my objects and nothing weird happened. Can't wait till all the torque devs see this, they will go bonkers. I'm so happy I'm going to go call my mom.
#11
08/11/2011 (1:51 am)
Excellent, big thanks Daniel! - will try this out today :)
#12
08/11/2011 (3:26 am)
Early indications are very positive, we now get a solid 60FPS on the iPad1 with Cannibal Cookout. Previous FPS was in mid 30s.
#13
08/11/2011 (6:01 am)
Awesomeness. I'm passing the link to this thread around to iPad devs.
#14
08/11/2011 (7:03 am)
Quote:
Awesome! Finally itorque can have decent performance on the iPad1 with this optimization. Thanks Daniel. It's a happy day. Now..if only itorque had built-in support for box2d :)
YES!!!!
#15
08/11/2011 (8:00 am)
@Johnny
The only time something weird will happen is when blending is turned off on whatever background you have AND the alpha test is commented out. It won't crash; there will just be a white opaque pixel wherever you expected a clear pixel.
Also, when I was searching for bottlenecks in my app, the next MAJOR bottleneck was the built-in physics for iTorque.
25 circles colliding brought the CPU to 100% on the iPad 1. Fix the physics and iTorque should be much more efficient. (I tested it using some tools to check the CPU usage at given times) :)
#16
08/11/2011 (11:20 am)
@Johnny
There is a resource about Box2D integration with Torque
Box2D Integration on Google Code
#17
08/12/2011 (12:09 am)
Any news on the next update with this added feature?
#18
08/12/2011 (6:20 am)
@George - In addition to this post I made earlier?
Considering the next feature in 1.5 will be retina and universal app support, this post is a great help. I was worried about performance when rendering in high resolution mode, but the speed boost has alleviated this worry.
#19
08/14/2011 (1:38 pm)
@Scott - 60 fps is music to my ears. I knew something was holding back the engine (and it wasn't us!).
@Michael - Can you give us a high-level technical overview of how you are planning to do universal and retina support?
#20
08/14/2011 (9:51 pm)
@Daniel
I tried the game. Fun to play. I am making a game and my main issue now is level interface objects, like changing level by pressing a button, etc.
I am using this approach, but there are issues
wrong level loaded
What was your approach to change levels and interface?
Employee Michael Perry
ZombieShortbus
I'm going to have a couple guys look at this and verify the approach. If they cannot point out any "gotchas", this will go into 1.5 final.
Again, I appreciate you posting this finding and also want to say congrats on the game release! =)