Optimization Kit
by asmaloney (Andy) · 01/31/2007 (2:43 pm) · 16 comments
Optimization Kit
----------
Version History
----------
v1.1.0 - 03 Feb 2007
- improved rendering of precipitation and splashes
- improved rendering of decals
- fixed a bug I introduced in Precipitation::getWindVelocity() [VectorF() is not the same as VectorF( 0.0f, 0.0f, 0.0f ) - the constructor does not initialize... oops]
- changed files:
-> game/fx/precipitation.cc
- new files:
-> sim/decalManager.cc
- This release includes a slightly modified version of Alex Scarborough's resource "Batched Decal Rendering"
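The getWindVelocity() fix above comes down to a C++ detail: a user-written default constructor with an empty body leaves plain float members uninitialized. Here is a minimal sketch of the difference. It assumes a made-up Vec3 type rather than TGE's actual VectorF, and the function name noWind() is hypothetical as well.

```cpp
// Hypothetical stand-in for a vector class whose default constructor
// does no initialization (the behaviour the changelog describes for VectorF).
struct Vec3
{
    float x, y, z;

    Vec3() {}   // members are left with indeterminate values
    Vec3(float ix, float iy, float iz) : x(ix), y(iy), z(iz) {}
};

// Risky version: the caller gets whatever happened to be in memory.
//   Vec3 noWind() { return Vec3(); }

// Safe version: every component is set explicitly.
Vec3 noWind()
{
    return Vec3(0.0f, 0.0f, 0.0f);
}
```

The risky variant is left commented out on purpose, since reading its result would be undefined behaviour.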
v1.0.1 - 01 Feb 2007
- some fixes for Windows compiling [const fixes]
- new file included for const fix - dgl/gTexManager.h
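For readers unfamiliar with this kind of const problem, here is a rough sketch of one way an error such as "cannot convert from 'const LightManager *' to 'LightManager *'" (reported in the comments below) can arise. The types LightSet and Scene are hypothetical and are not the kit's or the engine's code. The sketch only shows the general rule, not the actual change made in v1.0.1.

```cpp
// Hypothetical types; they only mimic the shape of the reported error.
struct LightSet
{
    int count;
    LightSet() : count(0) {}
};

class Scene
{
public:
    // Inside a const member function, mLights is treated as const,
    // so &mLights has type 'const LightSet *'. Returning it as a plain
    // 'LightSet *' would not compile:
    //   LightSet* getLights() const { return &mLights; }   // error

    // Option 1: the const accessor returns a pointer-to-const.
    const LightSet* getLights() const { return &mLights; }

    // Option 2: a non-const overload for callers that need to modify.
    LightSet* getLights() { return &mLights; }

private:
    LightSet mLights;
};
```

Providing both overloads keeps read-only callers const-correct while still letting other code modify the object. Dropping the const qualifier altogether is the blunter alternative.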
v1.0 - 25 Jan 2007
- initial version
What is it?
----------
Simply put, it's a series of changes to TGE 1.5 to make it faster. When I purchased TGE 1.5 and tried the demo on my PowerBook G4, I was surprised at how slow the Stronghold mission was. In late December 2006, I offered to port & package Illumina for the Mac, and found that it had performance issues too, though not quite as bad.
I decided to take a look at the TGE code to see where I could optimize it - specifically for PPC Macs, but hopefully for other builds as well. My goal is to make the Stronghold mission run at acceptable speeds on a PPC Mac.
I'm trying to keep the changes local and simple - no major re-architecting - so that they're easy to apply to the stock TGE code and the changes are obvious and understandable. Hopefully this way the changes might make their way into the main codebase.
So far, with these changes, I have decreased the time my Stronghold profile journal takes from ~37s to ~31s.
I didn't get quite as much done in the first version as I'd hoped - this is just the tip of the iceberg... I'd like this to be an ongoing project - adding in optimizations from others and removing ones that GG eventually incorporate into the main code base - kind of a testing ground for optimizations.
What isn't it?
----------
It's not magic. While I've been working with the Stronghold mission, your project could have quite a different usage profile. These changes may gain you nothing - nada - zero. Almost all the changes, however, are conservative and should not negatively impact your project's performance - even on Windows. [If you find this not to be the case, please let me know what's not working properly ASAP so I can look into it.]
The intent is not that you just put this in blindly, but that you review the changes and check the profile of your project to see if the changes are suitable.
Methodology
----------
Platform: Mac OS X 10.4.8 on PowerBook G4
Video card: ATI Mobility Radeon 9700
Compiler: gcc 3.3 for PPC and gcc 4.0 for Intel [this is how the Torque Demo project is set up]
I have recorded a journal of a runthrough of Stronghold, trying to cover all sorts of activity - firing, going into and looking around structures, entering and leaving water, and so on. I use the same journal for all my runs and start the profiler after the assets have loaded.
I use Apple's Shark for profiling and asm optimization suggestions. I pick a target function based on this, run a profile and save it, then iterate changes to that function, checking the asm and profile times. [Man I'm getting tired of seeing the orcs do the same thing over and over!]
Note: I am applying my changes on top of each other - not completely independently - so it is possible that an individual change will not gain as much on its own as it does in combination with the others. For example, if I've optimized functions A and B, where A is inlined in B, and you only apply the changes for B, you won't see as much of an improvement as if you applied both.
Installation
----------
You can put in all the optimizations at the same time or pick which ones you want. If you do the latter, read them carefully to make sure you aren't leaving anything out.
Instead of making a patch, which I know some people find difficult, I've included the changed files in their entirety. If you haven't changed the files, simply copy over the versions included with this kit and include the new file in the 'ok' dir in your build process. Otherwise, you'll want to search for the markers '[ok] Start' to see what's changed and integrate it by hand.
Since there are a lot of changes in header files and inline functions, I'd recommend a clean rebuild.
[If you really want a diff you can use with patch on the command line, please contact me.]
What's in it?
----------
The majority of the changes are designed to improve performance on a Mac PPC using gcc. All changes, even if they are marked as just Mac PPC or Mac Intel, should be valid for Windows - you just won't see any gains. Changes marked with a question mark are ones where I don't yet know whether there is an improvement. [If someone can let me know by checking the asm code generated and profiling, I'll update this.]
Changes to the code are marked like this:
// [ok] Start - description
... code ...
// [ok] End
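As an illustration of how a marked region might look inside a function, here is a small made-up example. The function scaleDown() and the optimization it shows are assumptions for the sake of the sketch and are not taken from the kit. Only the marker layout follows the convention above.

```cpp
// Hypothetical function, not from the kit; it only shows the marker layout.
inline void scaleDown(float& x, float& y, float divisor)
{
    // [ok] Start - one divide plus two multiplies instead of two divides
    const float inv = 1.0f / divisor;
    x *= inv;
    y *= inv;
    // [ok] End
}
```

Searching a merged file for "[ok] Start" then finds each changed region, and the description on that line says what was changed there.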
Files Changed
[The package contains descriptions of each change in these files.]
dgl/dgl.cc
dgl/gTexManager.h
game/fx/precipitation.h
game/fx/precipitation.cc
interior/interior.h
interior/interior.cc
interior/interiorRender.cc
math/mBox.h
math/mBox.cc
math/mMath_C.cc
math/mMathFn.h
math/mMatrix.h
math/mPlane.h
math/mPoint.h
math/mRandom.h
math/mRect.h
math/mathUtils.h
sceneGraph/sceneState.h
sceneGraph/sceneGraph.h
sceneGraph/sceneState.cc
sim/decalManager.cc
terrain/blender.cc
terrain/terrRender.cc
New File: ok/okFogCalc.h
Summary
----------
So that's it! I hope you find it useful. If you try any of these changes, please let me know how they worked/didn't work for you. If you have any other suggestions, corrections, or optimizations you'd like included, please contact me [my address is included in the package or you can find it through my profile].
[Edit: New version - v1.1.0]
#2
01/31/2007 (3:39 pm)
You rock Andy!
#3
01/31/2007 (6:59 pm)
Oh - just noticed this made it up - cool!
@Rubes: I added my machine [PowerBook G4] and video card [ATI Mobility Radeon 9700] to the description. On the main profile journal I was using, I got about a 20% performance increase. On another test, it was more like 10%. Again it will depend heavily on your project - I know yours seems to be special :-) I hope this resource will get the ball rolling and others will have suggestions/contributions as well.
@Alex: No, you rock!
#4
02/01/2007 (5:49 am)
Finally! I've been following your opt. posts for a while now Andy, and I'm glad this got compiled into a resource.
I'm going to try and drop as much of this in my engine to see what kind of PC performance I can get out of it and will post my results later.
This is definitely a MUST resource for any Mac developers.
#5
02/01/2007 (7:14 am)
Getting a lot of conversion errors, that I'm trying to work through. I'm integrating into a fresh build of the AFX-1.5 TGE. Shouldn't take long to make the right fixes, but I don't want to nullify the optimizations:
engine\sceneGraph\sceneGraph.h 236 'return' : cannot convert from 'const LightManager *' to 'LightManager *'
engine\sceneGraph\sceneGraph.h line 271 'TextureHandle::getBitmap' : cannot convert 'this' pointer from 'const TextureHandle' to 'TextureHandle &'
engine\sceneGraph\sceneGraph.h line 316: '=' : cannot convert from 'const FogVolume [3]' to 'FogVolume *'
engine\sceneGraph\sceneState.cc 846 '=' : cannot convert from 'const Vector<T> *' to 'Vector<T> *'
These are repeated throughout the build, 114 errors before I canceled the build to see if I screwed up the merge. Checking it now =)
#6
02/01/2007 (7:19 am)
Michael, I got the same on AFX. Just finished merge and got lots of conversion errors too.
P.S. have put in a fresh SDK15 and have the same errors.
#8
02/01/2007 (7:25 am)
Andy - Did a complete rebuild of the Torque Demo, but didn't rebuild the various linked projects (ljpeg, opengl2d3D, etc.).
I'm going to clean all and rebuild all now to see what happens.
*EDIT* - PC developer btw.
#9
02/01/2007 (7:28 am)
@Andy: I'm on win, I did clean-rebuild and got a fresh install of SDK.
Could be problem of VS? (tried in both - VC7 and VS2005Express)
#10
02/01/2007 (7:30 am)
OK - that's good. Both PCs... :-) Could you send me the complete build log so I can take a look at all the problems? I imagine VC++ is going to complain about different things here. I will look them over, make fixes, and post a new one for you.
#11
02/01/2007 (7:40 am)
Ok, I've sent you the log after running a full clean and rebuild.
#12
02/01/2007 (8:47 am)
I have put a new version up [1.0.1] to fix the Windows build. I had to back off from some of the const-ing I did to keep the VC++ compiler happy.
Thanks bank and Michael for trying it and the reports!
#13
02/01/2007 (10:42 am)
The latest build integrates perfectly. AFX users need only pay close attention to terrain.cc, where AFX code is interlaced with Andy's code. Other than that, merging was a no-brainer.
Summary so far:
- 0 Build errors
- No new warnings
- Run in release mode, stronghold level (TGE 1.5 w/ AFX)
- Small FPS increase
- FPS increase is not ground breaking, but any improvement is GOOD
- I'm going to throw my own math optimizations in next to see what kind of FPS increase I get
- Finally, I'm going to build with compiler optimizations turned on to the max to see what the final result is.
#15
04/29/2007 (3:35 pm)
Thought I should just point out here that the majority of the code from this kit was integrated into TGE 1.5.1, so I don't anticipate releasing any updates to this.
#16
07/03/2007 (11:47 am)
Thanks a lot! 
Torque 3D Owner Rubes
Edit: Can you summarize what kind of performance improvement you have seen, and what machine you're seeing this on? I'll let you know what I get as soon as I can get this installed.