Game Development Community

Putting TGE renderer in another thread?

by Igor G · in Torque Game Engine · 07/16/2007 (5:31 pm) · 18 replies

Using Torque's profiler, it seems that 70% of my game loop is spent in rendering, so maybe I can multithread it and have the other threads do other things?

I'm not sure if it's possible to put the OpenGL rendering on another thread, and whether the benefits would justify doing so.

Any thoughts?

#1
07/16/2007 (7:29 pm)
It's possible to put rendering on a separate thread. I wouldn't suggest it though.

70% in rendering is awfully vague. If you put up your profiler dump, I (or someone else) might be able to identify a specific problem area, which will be much easier to optimize.
#2
07/16/2007 (7:34 pm)
Also keep in mind that it's a "100% sum" game--something has to take up 100% of your profile dump.

What FPS are you getting with this profile dump?

If it's 90+, then you can be confident that you aren't overloading things like processTick() with unoptimized AI, etc., but if it's 10 or below, then you should be concerned, and you should be adding PROFILE_START/PROFILE_END blocks to determine what specifically might be an issue.
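To illustrate the idea of wrapping suspect code in extra profile blocks, here is a minimal standalone sketch. It is not Torque's profiler: ProfileTable, ScopedProfile and fractionOf are hypothetical names made up for this example, and it assumes the timed blocks do not nest.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an engine profiler: wall time and call count per label.
struct ProfileTable {
    struct Entry { double seconds = 0.0; int calls = 0; };
    std::map<std::string, Entry> entries;
};

// RAII block: starts timing when constructed, records the result when destroyed.
class ScopedProfile {
public:
    ScopedProfile(ProfileTable& table, std::string label)
        : table_(table), label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedProfile() {
        std::chrono::duration<double> dt = std::chrono::steady_clock::now() - start_;
        assert(dt.count() >= 0.0);  // steady_clock never runs backwards
        ProfileTable::Entry& e = table_.entries[label_];
        e.seconds += dt.count();
        e.calls += 1;
    }

private:
    ProfileTable& table_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Share of all recorded time spent under one label (assumes non-nested blocks).
// The shares of all labels add up to 1, which is the "100% sum" point above.
double fractionOf(const ProfileTable& table, const std::string& label) {
    double total = 0.0;
    for (const auto& kv : table.entries) total += kv.second.seconds;
    auto it = table.entries.find(label);
    if (total <= 0.0 || it == table.entries.end()) return 0.0;
    return it->second.seconds / total;
}
```

Wrapping one suspect function at a time in a block like this narrows a vague "70% in rendering" down to a specific call.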
#3
07/17/2007 (6:04 am)
My world's constantly running in the low teens (10~15), and I'm just exploring paths to take to optimize our world/city.
The problem is the density of the city, which produces a huge amount of work for the CPU and video card.

In my profile dump, I notice that TSShapeInstance::MeshObjectInstance::render takes up 75% of my game loop, and I was wondering if the rendering could be offloaded to another thread to free the CPU to do other tasks.
Of course, I'm not sure if there's a huge benefit from doing this, so that's why I'm asking.

Thanks.
#4
07/17/2007 (7:53 am)
Instead of moving rendering off to another thread, I suggest you make the renderer more efficient. The main slowdown in the TGE rendering subsystem is that it isn't really render-state aware. Fixing this would require adding a scene manager that intercepts all of the rendering calls, keeps them sorted by material and render state, and then draws them en masse. This would also give you the concurrency that you are looking for (the driver will batch all of the drawing calls so they happen in the background).
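A rough sketch of the sorting step, just to show the shape of it. DrawCall, countStateChanges and sortByState are hypothetical names for this example, not TGE code, and a real scene manager would also have to handle transparency ordering, which is ignored here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical queued draw: which material and render state it needs, and what to draw.
struct DrawCall {
    unsigned material;
    unsigned renderState;
    unsigned meshId;
};

static bool sameState(const DrawCall& a, const DrawCall& b) {
    return a.material == b.material && a.renderState == b.renderState;
}

// Number of times the material or render state would have to be (re)bound
// if the calls were submitted in the given order. The first call counts as one.
std::size_t countStateChanges(const std::vector<DrawCall>& calls) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < calls.size(); ++i) {
        if (i == 0 || !sameState(calls[i - 1], calls[i])) ++changes;
    }
    return changes;
}

// Groups calls that share material and render state so they can be drawn back to back.
// stable_sort keeps the original submission order inside each group.
void sortByState(std::vector<DrawCall>& calls) {
    auto byState = [](const DrawCall& a, const DrawCall& b) {
        return std::tie(a.material, a.renderState) < std::tie(b.material, b.renderState);
    };
    std::stable_sort(calls.begin(), calls.end(), byState);
    assert(std::is_sorted(calls.begin(), calls.end(), byState));
}
```

With five queued calls that alternate between two materials, the unsorted order needs a state change on every call, while the sorted order needs one per distinct material/state pair.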
#5
07/17/2007 (10:32 am)
Hey Jaimi - that would be a good idea in general....but for my case, I don't think it's the material change that's so slow.

glDrawElements is the reason the rendering is so slow. My scene/city averages between 1 million and 3.5 million polygons, so maybe there's a trick or optimization for glDrawElements?
#6
07/17/2007 (11:10 am)
3.5 million polygons is a lot of polygons - you may want to find ways to decrease the count (using LOD, adding portals, etc). It's possible that the render state changes are also affecting you more than you realize, since much of that happens in the background (textures being swizzled, etc), so you won't see it on the call to glBindTexture.
#7
07/17/2007 (11:33 am)
With proper LoD you should never have 3 mill polys in your pipeline. 500k is more than enough. I wonder if you are using DTS where you should be using DIFs? Lots of LoD tricks to incorporate too.. Just for giggles you can set your last LoD to be an invisible cube. Really cuts down on the polys and at max LoD distance it should make zero difference to playability.
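For anyone wondering how much an empty last LoD saves, here is a small sketch of distance-based detail selection. The DetailLevel struct, the switch distances and the poly counts are all made-up numbers for illustration, and this is not how the DTS code itself picks a detail level.

```cpp
#include <cassert>
#include <vector>

// One detail level: used while the camera is closer than maxDistance.
struct DetailLevel {
    float maxDistance;  // hypothetical switch distance in world units
    int   polyCount;    // polys submitted when this level is active
};

// Levels must be ordered from nearest (most detailed) to farthest.
// Anything beyond the last level is treated as the invisible LoD: zero polys.
int polysAtDistance(const std::vector<DetailLevel>& levels, float distance) {
    assert(distance >= 0.0f);
    for (const DetailLevel& level : levels) {
        if (distance < level.maxDistance) return level.polyCount;
    }
    return 0;
}

// Total polys submitted for many instances of the same shape at different distances.
long totalPolys(const std::vector<DetailLevel>& levels,
                const std::vector<float>& distances) {
    long total = 0;
    for (float d : distances) total += polysAtDistance(levels, d);
    return total;
}
```

With levels of 20000, 4000 and 500 polys switching at 50, 200 and 600 units, four buildings at 10, 100, 300 and 1000 units add up to 24500 polys instead of 80000 at full detail.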

Good luck
#8
07/17/2007 (12:54 pm)
Actually, our DTS building LoDs are about as optimized as they can be. We even have the max LoD level be invisible.

3.5 million is probably the max you'll ever see in the city, and 1.5-2 million is the average.

Like I've mentioned before - my project is a very very dense city and unfortunately, it's confidential, so I can't disclose more details. I am trying to optimize it as much as possible, and I'm trying to see what paths I can go down to do this optimization. But I will look into render states, and see whether that helps or not.
#9
07/17/2007 (1:06 pm)
Good luck Igor. Sounds ambitious.

Just an FYI though if you have 3 mill + poly count in your pipeline there is simply no way the models are using LoD. It's impossible. If you are using DIF portals and LoD you could build the most detailed city in the world and not even hit 1 mill.

Please recheck your LoD on your models. I would suggest to you that is your first problem.

A simple test would be to take your most complex polycount item and toss it in a standalone zone. Take the profiler dump at the different LoD levels and try to get poly counts on the different levels as well. I have built an entire fantasy city including massive bridges and finely detailed housing structures and my count was way under 250k.

I apologize if I am mistaken about your work, but even 1 million polys screams out to me that something isn't right. You can make the most detailed city rendering in the world and it will still have less than 500k polys if it's LoD'd correctly :)

Good luck
#10
07/17/2007 (1:34 pm)
Flybynight - I've even employed a simple occlusion culling scheme that cuts out most of the occluded buildings, and I still get a huge number of polys per frame (1-2 million+).
The buildings in my world are extremely complex, and you're right - I'll probably have to optimize the LOD levels in my building as much as possible, but I'm not sure whether or not there's a huge gain there, since buildings far away or occluded aren't rendered anyways.
#11
07/17/2007 (2:08 pm)
Interesting. Large Unreal 3 engine scenes are around 500,000 to 1.5m tris according to the Unreal Tech site. That's a full environment. You should really check out your meshes and portaling to ensure occlusion. Or optimize your details with normal maps or texture work. Because it sounds like your artists were working in a procedural rendering mindset rather than a real-time rendering one.
#12
07/17/2007 (9:49 pm)
Igor I just noticed above where you stated:

"Actually, our DTS building LoDs are about as optimized as they can be."

That is your problem right there, I think. DTS meshes do not use portals.

Do your buildings use interiors? Actually, regardless of whether or not your buildings are hollow, in an environment where you are talking about a "heavily detailed" world you are going to have huge rendering problems using DTS over DIF.

I am not an expert on this but that one statement stuck out at me and if I am thinking clearly (which is not a given btw ;) ) you need to go back to the table and look at which objects SHOULD be DTS and which should be DIFs.

Sorting out the proper meshes with the proper portals is going to assist you a bit but you are still way overbudget on polycount somewhere and you are going to have to figure out what needs to be trimmed.

Someone once submitted a 20k poly sword to me for use in my fantasy game. After a lengthy discussion that was going nowhere I redid the sword myself, put a nicer skin on it, and in a voting contest people preferred my sword, which was less than 300 polys, to the 20k poly sword. I am a firm believer that there is no replacement for skill. More polys doesn't always mean better detail. Sometimes it just means less skill. I'd go back and beat up on the 3D asset guys ;)

PS: No offence meant in the above btw. Your art assets are probably light years beyond my skills, but the basic premise of poly budgeting still prevails.

I wish you luck on your quest.
#13
07/18/2007 (4:56 am)
Hey guys, thanks for all the responses. We do use DTS shapes to represent all our buildings, and we do know that they are not optimized for rendering (and collisions). However, we aren't building a 'game', so we need our city as detailed as possible and DTS shapes offer what DIFs don't. Also, DTS shapes are a reusable asset, as they are exported from Maya, and DIFs are GarageGames/Torque only, so there's a business case against DIFs in that part.

Maybe eventually, there will be a way to normal map DTS shapes so we can reduce their complexity, but at the moment, we will still go with DTS shapes, and we've already gone through many evaluations of DTS vs. DIFs.

Perhaps there's a way to integrate invisible DIFs with DTS shapes so the engine can take advantage of DIF portals. How about baking a DTS shape into an invisible DIF shape that approximates the DTS? Would that be possible?
#14
07/18/2007 (7:19 am)
The DTS objects are killing your frame rates. Torque (like any real-time sim engine) tries to use BSP as often as possible, and Torque provides ways to bake DTS objects right into DIFs for details. Unless the view distance is up close and personal, DIF detail is much easier (and faster) to get working at the 2D texture level than in 3D vertex modeling, on top of rendering faster.

BSP is pretty widely supported as well, and in many real-time render engines it is used as the standard. You might look into making all your ground level objects DTS based and the higher ones DIFs if you're really concerned about high detail. There is also the option of billboarding anything that won't be viewed up close, like anything above the 3rd or 4th floor. I think you'll find that BSP models will save you a ton of rendering headaches.

Here are some BSP (DIF) based detail examples, but these are without shaders, normal maps or bump maps, which you can add to DIFs in TGEA and using the MK in TGE. Most are not using DTS objects either for various details.

www.garagegames.com/mg/forums/result.thread.php?qt=63636
www.garagegames.com/mg/forums/result.thread.php?qt=60540
www.garagegames.com/mg/forums/result.thread.php?qt=61152
#15
07/18/2007 (8:19 am)
This seems on topic to the original post...

Quote: Q. How will Unreal Tournament 3 use multiple cores on a CPU? Does it take advantage of Quad Core CPU's? If so, how/what task is assigned to each core?

A. Unreal Engine 3 is a transitional multithreaded architecture. It runs two heavyweight threads, and a pool of helper threads.

The primary thread is responsible for running UnrealScript AI and gameplay logic and networking. The secondary thread is responsible for all rendering work. The pool of helper threads accelerate additional modular tasks such as physics, data decompression, and streaming.
The article.

To make that work in TGE would be tough as rendering can easily use state from the simulation. You would need something more like TGEA's RenderManager which collects the entire scene state before rendering it.
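To show what "collect the scene state first" means in practice, here is a tiny sketch. SimObject, RenderItem, collectSnapshot and farthestX are hypothetical names for this example and have nothing to do with TGEA's actual RenderManager; the hand-off to a second thread is left out on purpose.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simulation-side object. The sim is free to change it at any time.
struct SimObject {
    int   id;
    float x, y, z;
    bool  visible;
};

// Hypothetical render-side copy: only what the draw pass needs.
struct RenderItem {
    int   id;
    float x, y, z;
};

using RenderSnapshot = std::vector<RenderItem>;

// Copies everything the renderer needs out of the simulation in one pass.
// After this returns, rendering never has to touch a SimObject again.
RenderSnapshot collectSnapshot(const std::vector<SimObject>& objects) {
    RenderSnapshot snapshot;
    snapshot.reserve(objects.size());
    for (const SimObject& o : objects) {
        if (o.visible) snapshot.push_back({o.id, o.x, o.y, o.z});
    }
    assert(snapshot.size() <= objects.size());
    return snapshot;
}

// Stand-in for a draw pass: reads the snapshot only, never the live objects.
float farthestX(const RenderSnapshot& snapshot) {
    float best = 0.0f;
    for (const RenderItem& item : snapshot) {
        if (item.x > best) best = item.x;
    }
    return best;
}
```

Because the draw pass only reads the snapshot, the simulation can move on to the next tick while the previous frame is still being drawn, and that separation is what a render thread needs.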
#16
07/18/2007 (9:19 am)
Well, to be fair Tom, the original topic was about separating the rendering into a thread to boost his performance. I think what the responses have been about is helping him find the real performance issues.

Bottom line, it's about using the right tool for the job. It doesn't matter if it's a game you are making, a city model for concept presentation or architectural promotions: you still need to use the right tools for the job. In Torque your tools are DIFs and DTSs. They are both very different and have very different uses.

As far as DTS exporting from Maya... It's a non-issue. If you wish to do all your modelling in Maya do so.. Keep your original assets in maya format and export to DIF just like you are exporting to DTS.. I dont see the issue.

If your buildings have interiors to them, whether you need collision or not, you MUST use DIFs if you want to get proper framerates. The portal system in Torque isn't even working if all of your major detail models are DTS.

I guess if I had to put it into an IT perspective it's like saying you want to run a mail server on your network and you are using Unix servers but you want to run MS Exchange on your unix boxes.. Using all kinds of crazy EMUs and such sure it can be done.. but the performance hits are terrible and Unix comes with its own fantastic mail servers built in.. So the question is Why.

Igor, if it's as simple as you just wanting to separate the rendering into its own thread and that's that, Tom's info is bang on. If you just want to increase your performance marginally by beating on it with more hardware, you will improve your framerate slightly by increasing your system specs significantly. If you are already running a 70% rendering profile, threading it out can only hide the other 30% of the frame time (tops). So if you are getting 10-15 FPS now, you will only boost that by roughly 4-6 FPS after multi-threading (under ideal circumstances). By increasing the core CPU performance (i.e. upgrading to a more powerful system) you are going to increase your FPS more substantially. Even so, I doubt you are looking at more than 25 FPS total even on a more powerful CPU. If that number will get you over a hump and do the job you need to do, then go ahead and multi-thread and change to a newer test PC.
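A quick back-of-envelope check of that ceiling. This is only a sketch: it assumes rendering and everything else overlap perfectly on two cores with zero synchronization cost, which real code will not reach, and the function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>

// Best-case speedup when rendering (a fraction of the frame) runs on one core and
// everything else runs on another with perfect overlap. The frame can never be
// shorter than the longer of the two parts.
double idealOverlapSpeedup(double renderFraction) {
    assert(renderFraction > 0.0 && renderFraction < 1.0);
    double longestPart = std::max(renderFraction, 1.0 - renderFraction);
    return 1.0 / longestPart;
}

// Upper bound on the frame rate after moving rendering to its own thread.
double idealFps(double currentFps, double renderFraction) {
    return currentFps * idealOverlapSpeedup(renderFraction);
}
```

With rendering at 70% of the frame, 10 FPS becomes at most about 14.3 and 15 FPS at most about 21.4, so the gain is only a few FPS even in the ideal case.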

IMO it's an awful lot of work and expense for a very minimal and capped gain. By utilizing the tools of the Torque engine the way they are meant to be used you could explode your FPS by 200-300% just by using DIFs where they are supposed to be used. The reason is portals. DIFs will use portals to not render any of the massive amount of detail that you might have on the interiors of the buildings unless it can be seen through a portal. If you have ultra detailed interiors this could drop your poly count by upwards of 75%. It will take your 3 million tris and drop them down into a far more manageable 200-400k range. This obviously assumes there is massive detail on the inside of your buildings.

Anyways Igor, only you know what's best for your project. I wouldn't dream of arm twisting you one way or the other, and anything that I say should be taken with a grain of salt as I am not an expert at this. I just wanted to look at your original query and try to determine what the true underlying problem was, because Torque definitely is capable of far more than you were getting out of it. We have gotten to the crux of the issue and you now know what is causing the framerate issues. Deal with it however it will work best for you. If you just need a marginal FPS boost then a new CPU upgrade and multi-threading your rendering out will give you that. IMO exporting your data into a DIF would be a far more effective FPS increase (potentially doubling or tripling your FPS) for far less expense and effort.

Good luck Igor :) I think you've gotten the answers you need. Cheers
#17
07/18/2007 (9:27 am)
Quote:
As far as DTS exporting from Maya... It's a non-issue. If you wish to do all your modelling in Maya do so.. Keep your original assets in maya format and export to DIF just like you are exporting to DTS.. I dont see the issue.

I have to take exception to this (and only this, the rest of the post was pretty spot on)--Maya is a "free poly" modeling application, and DIF requires CSG based datasets for proper export.

It is almost never a productive pipeline to try to force models made in a free poly editor into a CSG based format via export. Even the various "mods" like Game Level Builder that attempt to enforce CSG requirements onto a free poly editor most commonly cause many more issues than they are worth.

Now, it would be possible to bring shapes modeled in Maya in as static meshes into Constructor, and then export to DIF that way, but given the complexity and detail of the scenes he's talking about, it might take effort to break the buildings down into appropriate sizes for static mesh utilization.

As Fly's general post indicates, it's a matter of using the right tool for the right job--a hammer is great for putting nails into wood, but doesn't work very well for trying to cut a hole through a board. DTS is also definitely not designed for the level of complexity you are using it for.
#18
07/18/2007 (10:53 am)
DOH. Thank you for pointing that out Stephen. I knew my lack of expertise would bite me in the butt at some point.

I knew it could be done but didn't realize the pitfalls of it. Thanks for clarifying that. I'll keep that on file.