Game Development Community

Performance Tests: Part 2

by Chris W Hill · in iTorque 2D · 04/11/2011 (2:28 pm) · 30 replies

blog.kihongames.com/2011/04/11/labs-itorque2d-vs-cocos2d-performance-part-2/

It is with a heavy heart that I acknowledge that the amount of effort required to fix the bugs in Torque has now tipped over in favor of using other technology. Fixing the graphics performance problems and also the (near-insurmountable) TorqueScript performance issues is too much for us. Thanks everyone for your help.

C
#1
04/11/2011 (2:38 pm)
Thanks Chris, useful information. You can get reasonable performance in Torque, but you do have to move some of the core game loop into C++ / Objective-C.
#2
04/11/2011 (2:56 pm)
Invaluable data there Chris, I plan on starting to improve my knowledge of Torque to better understand possible enhancements. However I wouldn't turn down Torque for these reasons.

Your tests all compare Torque using Torque Script against Cocos2D. However by default as I understand it Cocos2D is not a scripting API and so any meaningful comparisons would have to be against a Torque program with C++ implemented behaviours/classes (I do however believe documentation in this area is lacking).

Even if I take the Lua integration potential of Cocos2D, I come to the following conclusions:

1. I'm not a big fan of Lua and find C++ just as easy to use, so I may as well use that, as on average it's 32 times faster than Lua.

2. Torque Script works primarily by exposing the C++ API, and Lua's popularity comes from how easily it integrates with and exposes C++ APIs. As Torque Script already provides the appropriate points to expose the API, integrating Lua does not seem like a monumental task if you wished to adopt it.

Adding to what Scott says, for significant performance it's normal practice to place CPU-heavy code into C++. I have access to most AAA game engines, such as CryENGINE 3.3 and Gamebryo Lightspeed 3.2, and both of these recommend you place ALL per-frame code into C++, and these are both based on Lua scripting.

Would you consider posting your benchmark Xcode project so that we can all have a common platform with which to test improvements/changes and their effectiveness?

Cheers
#3
04/11/2011 (3:33 pm)
So yes, I'm familiar with the concept that you should move your core loops out of TorqueScript, and have been doing exactly that in our game code. I expect that any performance-critical process should happen in C, and with that in mind we have all kinds of special functions, such as getting an AABB of a set of sprites, special collision processing, Box2D, etc., all in tight C loops.

But iterating over 100 items and doing some minor logic? That's not a huge number. I think that should be feasible in a scripting language, without requiring a developer to retool a designer's code to make it production ready.

The project I used was not added for a couple reasons. I will fix it up, commit, and export for you here shortly.
#4
04/11/2011 (4:35 pm)
Here is the Torque Project for the tests. Note that you will need to figure out how to fix the getRealTime bug for these to give useful numbers. These tests were run on a first gen iPad. The numbers may be slightly different, your mileage may vary!
#5
04/11/2011 (5:02 pm)
I'll agree this isn't as relevant of a blog post as the previous one. It's well known TS should only be used for simple calls to the engine, UI, menus, etc. Anything anywhere related to performance should be in C++. It's by no means insurmountable to move things from script into C++. In fact, it's really easy.

I'm not debating the numbers. Hey, I know Torque is a bit bloated / slow out of the box. But with some trimming and some smart planning, it's a more viable option than presented in these blogs. :)
#6
04/11/2011 (6:27 pm)
I just ran some additional tests. I cracked open the RainyDay tutorial. Did not modify anything in the code besides adding a showFPS call.

Then I dropped a bunch of Flowers on the stage. At 30 Flowers I started to see a performance hit. At 40 Flowers I was crawling at 12fps on the iPad. Note: I was not moving the cloud or interacting with the iPad in any manner. This is just the game running on its own.

I feel as though the update call on the flowers is a very simple call. One that the torquescript would be expected to handle. Here is the code so you do not have to look it up:

function PlantGrowthBehavior::onUpdate(%this)
{
   // If the current moisture is lower than the threshold, start dying
   // Otherwise, we have enough moisture to grow
   if (%this.moisture < %this.lowMoistureThreshold)
      %this.RunAnimation(1.0);
   else
      %this.RunAnimation(-1.0);
   
   // Modify amount of moisture based on whether we're being watered or not
   if (%this.watering)
      %this.moisture += %this.wateringRate;
   else
      %this.moisture -= %this.dryRate;

   // Clamp moisture value between minMoisture and maxMoisture
   if (%this.moisture < %this.minMoisture)
      %this.moisture = %this.minMoisture;
   if (%this.moisture > %this.maxMoisture)
      %this.moisture = %this.maxMoisture;
}

I also feel that 30 sprites on the stage is a very unreasonable number for a ceiling. It is very limiting to game design and art.

Therefore I do not see Torque as a viable option.
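
For comparison with the "move it into C++" advice given earlier in the thread, here is a minimal sketch of the same update logic in plain C++. It is not engine code: `PlantState` and `updatePlant` are hypothetical names, the fields simply mirror the script's dynamic fields, and the function returns the value the script passes to `RunAnimation` instead of calling into the engine.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical plain-C++ stand-in for the behavior's script fields.
struct PlantState {
    float moisture;
    float lowMoistureThreshold;
    float wateringRate;
    float dryRate;
    float minMoisture;
    float maxMoisture;
    bool  watering;
};

// Mirrors PlantGrowthBehavior::onUpdate. Returns the animation direction
// the script would pass to RunAnimation: +1.0 below the threshold, else -1.0.
float updatePlant(PlantState& p) {
    assert(p.minMoisture <= p.maxMoisture);

    // Direction is decided from the moisture value before this update.
    const float direction =
        (p.moisture < p.lowMoistureThreshold) ? 1.0f : -1.0f;

    // Gain moisture while being watered, otherwise dry out.
    p.moisture += p.watering ? p.wateringRate : -p.dryRate;

    // Clamp moisture between minMoisture and maxMoisture.
    p.moisture = std::max(p.minMoisture, std::min(p.moisture, p.maxMoisture));
    return direction;
}
```

How this would be wired into the engine (registering the class, exposing fields to the editor) is outside the sketch and depends on the engine version.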

#7
04/11/2011 (6:29 pm)
@Chris - I get you completely, it should be better, but you really should put ALL onUpdate/per-frame code into C++, as you just know you're throwing away frames otherwise, regardless of engine. The code you ran is fairly simple, but x100 in script I'd expect to drop some frames in most engines when you realise you're just running on an iPad (1 GHz Cortex-A8).
Imagine C++ takes 10µs x 100 = 1ms, Lua would be more like 320µs x 100 = 32ms; it very quickly adds up to something noticeable and blows past the per-frame limit.
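
To make that arithmetic concrete, here is a small sketch that compares a per-frame script cost against the frame budget. It takes the per-object costs as 10 µs and 320 µs, which is what makes the 1 ms and 32 ms totals work out; these are illustrative figures, not measurements, and the function names are made up for the example.

```cpp
#include <cassert>

// Total per-frame cost in milliseconds for `count` objects, each costing
// `perObjectMicros` microseconds to update.
double frameCostMs(double perObjectMicros, int count) {
    assert(perObjectMicros >= 0.0 && count >= 0);
    return perObjectMicros * count / 1000.0;
}

// Time available per frame, in milliseconds, at a target frame rate.
double frameBudgetMs(double targetFps) {
    assert(targetFps > 0.0);
    return 1000.0 / targetFps;
}

// True when the update cost alone still fits inside one frame.
bool fitsInFrame(double perObjectMicros, int count, double targetFps) {
    return frameCostMs(perObjectMicros, count) <= frameBudgetMs(targetFps);
}
```

With these numbers, 100 objects at 10 µs each cost 1 ms, well inside the roughly 16.7 ms available at 60fps, while 100 objects at 320 µs each cost 32 ms, which misses the 60fps budget and only just fits the roughly 33.3 ms available at 30fps, before any rendering is done.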

I think this does point out the importance of what Scott was saying regarding the way Torque Script stores and converts data. I think it's also worth taking a look at the hash table and seeing if there's a faster implementation anywhere; given how frequently it is referenced, saving any time there would prove very effective.

I'll also note that all major engines, even with complex PhysX, do indeed adopt a variable onUpdate/tick delta. When you're short on frames, Torque's current implementation pretty much guarantees that you'll not recover your frames any time soon, which would explain a sudden performance fall-off. I understand very well this would likely affect the physics system, but if GG are looking at a Box2D integration, this could be the time to sneak in the change.
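
A generic sketch of the difference, not Torque's actual tick code: with a fixed tick, a slow frame owes several ticks, so the per-tick script work runs several times in the next frame and makes that frame slow too. With a variable delta, there is always one update per frame, scaled by the elapsed time. The function names and the accumulator approach are assumptions made for illustration.

```cpp
#include <cassert>

// Fixed-step scheme: returns how many fixed-size ticks must run to consume
// the elapsed frame time. Leftover time is carried over in accumulatorMs.
int fixedTicksForFrame(double frameMs, double tickMs, double& accumulatorMs) {
    assert(tickMs > 0.0 && frameMs >= 0.0);
    accumulatorMs += frameMs;
    int ticks = 0;
    while (accumulatorMs >= tickMs) {
        accumulatorMs -= tickMs;
        ++ticks;  // each tick repeats the full per-object update cost
    }
    return ticks;
}

// Variable-step scheme: exactly one update per frame, scaled by elapsed time.
double advance(double position, double velocityPerSecond, double frameMs) {
    return position + velocityPerSecond * (frameMs / 1000.0);
}
```

For example, with a 32 ms tick, a frame that took 100 ms owes three ticks under the fixed scheme, whereas the variable scheme still performs a single update with a larger delta. The trade-off, as noted above, is that physics generally behaves better with a fixed step.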
#8
04/11/2011 (6:34 pm)
@alistair Well, I'm currently screwing with Corona. I'll let you know what my Lua performance is like in Corona with my next blog post!
#9
04/11/2011 (8:49 pm)
@Chris - I really appreciate these posts. This has spurred an effort I had been hoping would happen before the next engine release. I cannot post details, but we are bringing in a senior-level programmer with extensive Torque knowledge. He is going to take your tests and some custom ones, put them through the wringer, and discover/fix bottlenecks. Even if you go to experiment with Cocos2D and Corona, be sure to stay tuned with iT2D. We have turned a corner and are now pushing harder than ever on the engine.
#10
04/11/2011 (9:21 pm)
@Michael: No problem, Michael. Thanks to you and everyone else for your hard work on Torque. It's a great product in many regards. Of course anything could be better. It helps to have some concrete numbers. I'm a fan of those, mainly because of this book I got called Video Game Optimization from some guys called Ben Garney and Eric Preisz. ;)
#11
04/11/2011 (10:06 pm)
@Chris - Yeah. Really wish I could have one of those guys help out with. . .oh wait =)
#12
04/12/2011 (10:23 am)
Ok, my latest performance numbers are up, regarding Lua performance in Corona. I did some comparisons to TorqueScript as an aside. I also heavily referenced this conversation.

blog.kihongames.com/2011/04/12/labs-corona-lua-and-scripting/

So my studio just had a debate regarding our current project, performance, etc. We are going to complete our current game using Torque, but we are scaling back. We also debated the point of re-writing TorqueScript for optimization. Even internally we can't seem to agree on the best approach for using a scripting language in a game engine! :)
#13
04/12/2011 (12:21 pm)
Wow, I expected TS to be slower than Lua by some factor, but not 16-50. On the plus side, this means that there must be some serious wiggle room to improve the language.

I think it does go to show what has been reiterated in these posts regarding per-frame optimisation: even iterating over just 100 objects, Lua still takes 3ms (yes, TS is much worse), but throw in a few random scripting events or a more complex onUpdate and you'll potentially drop frames.

Ideally you'd be able to fully test your game on a powerful PC and then port the script after the level designers are done but multi-touch makes that one a little difficult!

Again thanks for the research!
#14
04/14/2011 (2:02 pm)
@Chris - I'm using your sample project from the second blog post about performance. I switched to the pong function for the behavior. What FPS are you getting on the iPad when you run that? I'd like to compare it to mine.
#15
04/14/2011 (2:31 pm)
Hi Michael,

I'm getting around 1.1 fps. Just so you know, my engine has been modified to run at 60fps (among other things). I would assume you will get better performance at 30fps due to the implementation of iTickable.
#16
04/14/2011 (2:34 pm)
Very interesting. Out of the box it is running at 49fps for me. This is using your project, your art, your code and iT2D 1.4.1. When I move just a few things into C++, I jump up to 60fps.
#17
04/14/2011 (2:39 pm)
Wow! Mark has independently tested my results, and saw the same thing I did. I'm wondering what the difference is. Let me check out a stock 1.4.1 and test against it.
#18
04/14/2011 (2:41 pm)
Forgive me for this question...did you switch to release mode and test? Because it was running at 1fps in Debug. When I switched to release, BAM, 60fps. Same results with your first project you provided too.
#19
04/14/2011 (2:50 pm)
Wow, that is a huge difference! Chris, are there any other changes you've made to the engine that explain this? Are you both using the same version of the iPad?

Also, on a side rant, this apparent need for huge frame rates always bugs me just a little bit. Film is 24 frames per second and it looks really good to me. I've never gone to a movie and thought to myself, "boy that would be better if the motion wasn't so jumpy". That said, I can see that there are benefits to calculating collision more often than 24 times per second. Am I missing the point? Is there some real reason that 60 fps is important? Really, I want to know. I'm sure there is a reason, that's why I labeled this a rant.
#20
04/14/2011 (3:02 pm)
Film is interlaced, which tricks the human brain/eye into thinking that there is much smoother motion than there actually is (it is perceptually the equivalent of twice the 24 fps).

With digital media (like on a computer or with video games) each frame is distinct, and in order to have it look as smooth as an interlaced film you have to double the number of frames shown. Combine that with the fact that framerates fluctuate in a video game (they don't in film), and it has been found that having an average of 60 fps is the best way to make sure you don't drop below a "good" framerate.