Game Development Community

Strange Behaviour of TGE 1.4 and FPS

by Bruno Grieco · in Torque Game Engine · 01/17/2006 (8:22 am) · 16 replies

Guys,

Just started working with TGE 1.4 using XCode 2.2.1 ( iMac G4 1.25Mhz, OSX 10.4.4 ).
Compiled a Torque Demo Debug target that works fine.
Compiled a Torque Demo release target that has some strange problems :

It works fine with racing and tutorial missions, but it hangs on FPS when the mission is loaded.
It yields several "onNeedRelight: Unknown command." errors, but the debug version does the same and works.
The last lines of the log are :
onNeedRelight: Unknown command.
onNeedRelight: Unknown command.
onNeedRelight: Unknown command.
*** Mission loaded
Connect request from: 
Connection established 1488
CADD: 1489 local

I've already rounded up the usual suspects and deleted the .dso and lit mission files, but couldn't find the culprit.
Racing example runs fine, that's what is really bugging me.

#1
01/27/2006 (5:27 pm)
I tracked the onNeedRelight bug to the editor files in the creator directory.

onNeedRelight resides in ./editor/EditorGui.cs, but that file isn't loaded until the editor is invoked. The fix was to load it in creator/main.cs.

just add
exec("./editor/EditorGui.cs");
#2
01/27/2006 (5:30 pm)
The deployment version not working is still a mystery !!

I actually made it work by commenting out the lines that start the AIManager in the startGame() function in /server/scripts/game.cs.

It seems the bug has something to do with schedules. But I'm not sure.
#3
01/27/2006 (5:34 pm)
Quote:iMac G4 1.25Mhz
Always knew PCs had a little speed advantage ;)
#4
01/28/2006 (6:35 pm)
Bruno,
When you say that FPS hangs, which FPS are we talking about? The starter.fps mod, or the FPS in the demo?
Also, how does it hang? What is on the screen when the app stops responding? Can you hit the cancel button & get back to the main menu? Do the editors work, does anything work, or does the app itself just totally hang?
#5
01/29/2006 (6:21 am)
Hi Paul,

Both hang: the starter.fps mod and the FPS in the demo. But that only happens with the deployment version. The Torque Demo Debug OSX build works great.

The whole program crashes. There is nothing that can be done.

Some suspects that I'm looking into :

scheduled events : I was able to postpone the crash by commenting out the lines that create Kork in startGame(). AIManager.think, which schedules itself for later execution, is certainly a suspect. But there are also scheduled functions on the client GUI side.

3D rendering : XCode debugger catches an exception on ExtrudePolyList::extrude(), after it inverts a poly.
#6
02/08/2006 (5:00 am)
Quote:
Quote:
iMac G4 1.25Mhz

Always knew PCs had a little speed advantage ;)

The G4s are older CPUs. The dual- and quad-core G5s rock compared to any of today's Intel(-like) CPUs. :-p
#7
02/08/2006 (5:05 am)
Um, doesn't Intel manufacture chips for Apple now? Hmmm, interesting point, isn't it?
#8
02/08/2006 (8:07 am)
Yes, Intel does in fact now manufacture Apple's chips. Jobs switched to Intel because the G* line was NOT performing on par with the latest Intel/AMD chips for PCs. It's good to finally be able to tell my Mac friend to stfu ;).
#9
02/08/2006 (8:36 am)
Uh...I think he was poking fun at the typo (MHz instead of GHz).
#10
02/08/2006 (10:54 am)
Guys,

I still haven't fixed this problem. Could we leave the Intel vs. Motorola discussion for another thread?

Thanks
#11
02/08/2006 (11:18 am)
Wish I could help...I remember having a similar crash at the same point, but it was back in October and I can't remember what the solution was. For some reason I think it had something to do with the compiler options that I was using, but IIRC the 1.4 build doesn't need those anymore.

In any case, one way to pinpoint the problem a little more is to go into starter.fps/common/server/clientConnection.cs and look for the function: GameConnection::onConnect.

In there, you should see the line where the CADD line gets printed:

echo("CADD: " @ %client @ " " @ %client.getAddress());

Just put a few more "echo" lines in the code to let you know how far after that it gets, then perhaps we can know more.
#12
02/08/2006 (12:37 pm)
Compiler directives seem to be one of the suspects. But one of the biggest problems with Xcode is that I just can't find them in the project. I can open a text editor and look at the project with it where I can see them ( it's in XML ). But that's certainly not the way to go.

The last line that gets printed is the last line in onConnect. After that the game begins. As I posted before, commenting out the AIManager/AIPlayer startup helps, but doesn't work 100% of the time.

The exception occurs inside the ExtrudePoly method. But I don't think that's where the problem is.
#13
02/08/2006 (2:29 pm)
Hopefully I understand you correctly...The compiler directives, in Xcode, are in the Project menu under "Edit Active Target" (assuming you've first set the active target with the "Set Active Target" command under the same menu). Once you select that, the Target settings window appears, and just select "GCC Compiler Settings". The compiler flags are listed there.

One thing I've noted is that my 1.4 build uses GCC4.0, while my 1.3 build used GCC3.3. A number of compiler flags have changed as well, so perhaps that's it.
#14
02/08/2006 (4:44 pm)
Ok, so...
@Bruno: I was finally able to reproduce the bug you're talking about.
That is... my release builds decided to explode too.
In extrudePolyList.cc .

As it turns out, GCC is generating bad code in a particular partially unrolled loop. The loop is supposed to run 4 times, it runs only 3 times, resulting in garbage values in the extruded box 3d polygon, which causes problems. The code itself is not to blame. It's really very stable, and I've no idea why GCC craps out when it hits it.
The bug is very hard to reproduce, and since it's the compiler spitting out bad code, in an optimized build, very hard to debug. I'm sure that it's bad codegen, because I read the asm dump as I stepped through.

So, here's the code that's tripping GCC up:
In files engine/collision/polyhedron.cc:14 and engine/interior/interiorCollision.cc:165
In functions Polyhedron::buildBox() and InteriorPolytope::buildBox():
void Polyhedron::buildBox(const MatrixF& transform,const Box3F& box)
{
   ... <snip>...
   // The edges are constructed so that the vertices
   // are oriented clockwise for face[0]
   edgeList.setSize(12);
   Edge* edge = edgeList.begin();
   S32 nextEdge = 0;
   for (int i = 0; i < 4; i++) {
      S32 n = (i == 3)? 0: i + 1;
      S32 p = (i == 0)? 3: i - 1;
      edge->vertex[0] = i;
      edge->vertex[1] = n;
      edge->face[0] = i;
      edge->face[1] = 4;
      edge++;
      edge->vertex[0] = 4 + i;
      edge->vertex[1] = 4 + n;
      edge->face[0] = 5;
      edge->face[1] = i;
      edge++;
      edge->vertex[0] = i;
      edge->vertex[1] = 4 + i;
      edge->face[0] = p;
      edge->face[1] = i;
      edge++;
   }
}

GCC's optimizer has 3 branches to predict & track in this loop. It's getting confused, and ending the loop before the 4th run through. So... the fix is really dumb. I mean, really dumb. We remove 2 of the branches, so that it won't get tripped up.

We change:
S32 n = (i == 3)? 0: i + 1;
S32 p = (i == 0)? 3: i - 1;
to
S32 n = ( i + 1 ) & 0x3;
S32 p = ( i + 3 ) & 0x3;
Which calculates exactly the same thing for every value the loop uses (i = 0 to 3), but without branching. In theory this 'fixed' code is faster, but most modern optimizers & chips will unroll or branch-predict this loop well enough that there is little measurable difference.
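
For anyone who wants to convince themselves that the two forms agree, here is a minimal standalone sketch. It is not engine code: the helper names (nextBranch, prevBranch, nextMask, prevMask, sameAt) are made up for illustration, and S32 is assumed to be a plain 32-bit signed integer. The checks run at compile time, so if the file compiles, the forms match for the four loop indices.

```cpp
#include <cstdint>

// Stand-in for Torque's S32 typedef (assumed here to be a 32-bit signed int).
typedef std::int32_t S32;

// Original, branching forms taken from the loop body.
constexpr S32 nextBranch(S32 i) { return (i == 3) ? 0 : i + 1; }
constexpr S32 prevBranch(S32 i) { return (i == 0) ? 3 : i - 1; }

// Branchless replacements: add, then keep the low two bits (i.e. mod 4).
constexpr S32 nextMask(S32 i) { return (i + 1) & 0x3; }
constexpr S32 prevMask(S32 i) { return (i + 3) & 0x3; }

// True when both forms agree for a given loop index.
constexpr bool sameAt(S32 i)
{
   return nextBranch(i) == nextMask(i) && prevBranch(i) == prevMask(i);
}

// The loop only ever uses i = 0, 1, 2, 3, so four checks cover every case.
static_assert(sameAt(0), "forms differ at i = 0");
static_assert(sameAt(1), "forms differ at i = 1");
static_assert(sameAt(2), "forms differ at i = 2");
static_assert(sameAt(3), "forms differ at i = 3");
```

Note that the equivalence only holds inside that range. For i = 4 the branching form gives 5 while the masked form wraps to 1, so the swap is safe here only because the loop bound is fixed at 4.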

So, a bit long-winded, but there's the fix. Already rolled into the next 1.4 point release, submitted here for your approval.

Share and Enjoy

/Paul
#15
02/08/2006 (5:14 pm)
Very cool. Thanks, Paul.
#16
02/08/2006 (5:45 pm)
@Paul

IT WORKS !!. Thanks a lot, Paul. Now that was a tricky bug, wasn't it ?

@Rubes
Thanks for the Xcode tip. I hate those compiler IDEs. It's very hard to find what you are looking for.