TGEA and Multi-thread.
by Eric Preisz · in Torque Game Engine Advanced · 08/19/2007 (6:14 pm) · 4 replies
I found something in GFXD3DDevice::init that I thought was interesting. If TORQUE_MULTITHREAD is defined then DirectX is set to multi-threaded mode. If you look at the DirectX docs you will find the following about the flag...
... makes a Direct3D thread take ownership of its global critical section more frequently, which can degrade performance...
From what I believe (and I can't remember my source), there is no good reason to use this flag. If you want to add multi-threading to your app in DX9, you should look for opportunities in places other than the D3D rendering code. If you keep all of your DirectX calls on the main thread, you won't have any D3D threading problems.
Since the reason to multi-thread is performance, I doubt that you want to run DirectX 9 in multi-threaded mode.
I'm guessing that TORQUE_MULTITHREAD doesn't actually cause any rendering across multiple threads anyway, so if the critical section isn't being contended, the performance issue is not really a big deal.
Intel has a tool called Thread Profiler that would answer some of these questions better than the DirectX docs. Does anyone else have input?
About the author
Manager, Programmer, Author, Professor, Small Business Owner, and Marketer.
#2
08/20/2007 (1:47 am)
It's been about two years since I played with the inner workings of Atlas. When you say streaming, you don't mean that it's paging terrain on and off the hard drive, right? Like Oblivion does? Atlas didn't support streaming when I last worked with it.
If that's the case, then the bigger problem is creating a resource on the fly. I actually think the create calls are probably the slowest in DirectX. I would think that type of streaming would cause a hitch. I noticed TGEA was doing this once to create a render target (RT) for the glow effect.
The best way to do streaming terrain is to allocate all the geometry up front and then either modify it on the CPU via a dynamic buffer or use an SM 3.0 displacement mapping technique.
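To make the "allocate up front, modify on the CPU" idea concrete, here is a rough sketch. It is not TGEA or Atlas code: `TerrainGrid` and `TerrainVertex` are made-up names, and the `std::vector` only stands in for a dynamic vertex buffer. In D3D9 terms the write would happen between `Lock` and `Unlock` on a buffer created with `D3DUSAGE_DYNAMIC`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct TerrainVertex { float x, y, z; };

// Hypothetical fixed-size terrain grid. All storage is allocated once in the
// constructor; the std::vector stands in for a dynamic vertex buffer.
class TerrainGrid {
public:
    explicit TerrainGrid(std::size_t size)
        : mVerts(size * size)
    {
        for (std::size_t z = 0; z < size; ++z)
            for (std::size_t x = 0; x < size; ++x)
                mVerts[z * size + x] =
                    TerrainVertex{static_cast<float>(x), 0.0f, static_cast<float>(z)};
    }

    // Overwrite the heights in place. Nothing is created or reallocated here,
    // which is the point of allocating up front.
    // heights must hold one value per vertex, row-major.
    void setHeights(const std::vector<float>& heights)
    {
        assert(heights.size() == mVerts.size());
        for (std::size_t i = 0; i < mVerts.size(); ++i)
            mVerts[i].y = heights[i];
    }

    const TerrainVertex* data() const { return mVerts.data(); }
    std::size_t vertexCount() const { return mVerts.size(); }

private:
    std::vector<TerrainVertex> mVerts;
};
```

Streaming in a new chunk then becomes a `setHeights` call on storage that already exists, rather than a create call in the middle of a frame.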
#3
08/20/2007 (2:55 am)
@Eric - Yes it does... both texture data and geometry... take a look at AtlasResource.h.
DX resource creation is "slow", but not so slow that it can't be done infrequently at runtime. Creating a texture or VB every couple of seconds is OK... doing it 10 times a frame is bad. Still, Atlas doesn't do that... it keeps a cache of freed textures and reuses them, and I suspect it does the same for geometry.
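For anyone who hasn't seen the pattern, a freed-texture cache can be sketched roughly like this. This is not the Atlas implementation: `TextureCache` and `Texture` are hypothetical names, and `Texture` is just a stand-in for whatever wraps the real D3D texture.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for a GPU texture; a real one would wrap the D3D texture object.
struct Texture {
    int width;
    int height;
};

// Hypothetical cache: release() parks a texture instead of destroying it,
// and acquire() reuses a parked texture of the same size before creating one.
class TextureCache {
public:
    std::unique_ptr<Texture> acquire(int width, int height)
    {
        for (auto it = mFree.begin(); it != mFree.end(); ++it) {
            if ((*it)->width == width && (*it)->height == height) {
                std::unique_ptr<Texture> tex = std::move(*it);
                mFree.erase(it);
                return tex; // cheap path: no create call
            }
        }
        ++mCreateCount; // expensive path: this is where the create call would go
        return std::unique_ptr<Texture>(new Texture{width, height});
    }

    void release(std::unique_ptr<Texture> tex)
    {
        assert(tex != nullptr);
        mFree.push_back(std::move(tex));
    }

    int createCount() const { return mCreateCount; }

private:
    std::vector<std::unique_ptr<Texture>> mFree;
    int mCreateCount = 0;
};
```

Once the cache is warm, paging a chunk in and out only hits the create path when no freed texture of the right size is available.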
Anyway, looking closer, it seems that Atlas does not create resources in the loader thread... it only does the disk access work.
So it should be safe to change the default behavior to not set the DX multithread flag. Adding a second flag for controlling GFX multithreading is probably a good approach to take.
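A second flag could look something like the sketch below. `TORQUE_GFX_MULTITHREAD` is a made-up name for illustration, not an existing TGEA define, and the constant is a local stand-in for `D3DCREATE_MULTITHREADED` so the snippet compiles without the DirectX SDK (real code would take it from d3d9.h).

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for D3DCREATE_MULTITHREADED from d3d9.h.
constexpr std::uint32_t kCreateMultithreaded = 0x00000004;

// Hypothetical second switch. With this, defining TORQUE_MULTITHREAD for the
// engine no longer implies a thread-safe D3D device.
// #define TORQUE_GFX_MULTITHREAD

// Adds the multithreaded bit to whatever creation flags the caller already has.
inline std::uint32_t deviceCreateFlags(std::uint32_t baseFlags, bool gfxMultithread)
{
    return gfxMultithread ? (baseFlags | kCreateMultithreaded) : baseFlags;
}

// Picks the behavior from the hypothetical define; off by default.
inline std::uint32_t defaultDeviceCreateFlags(std::uint32_t baseFlags)
{
#ifdef TORQUE_GFX_MULTITHREAD
    return deviceCreateFlags(baseFlags, true);
#else
    return deviceCreateFlags(baseFlags, false);
#endif
}
```

Anyone who really does call into D3D from more than one thread can opt back in by defining the second flag.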
#4
08/20/2007 (8:21 am)
@Tom
Thanks for the research. I presume that I will be diving into TGEA some more on a contract I'm working on. For now... no time for research. Too many milestones. I'm glad we put this out here.
Yeah, I agree... creating one every couple of seconds wouldn't be that bad... except I would still expect a hitch caused by the massive driver work.
I wrote terrain streaming about 5 years ago, and on a single-threaded machine I found that it was faster to load entire terrain chunks synchronously. It surprised me. Chances are I had a massive critical section, though, so it was effectively running serially with the overhead of threading on top. I'm glad to hear it's only loading in the second thread. If I were to do it now, I would use I/O completion ports or event synchronization.
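Roughly what I mean by the event approach, as a portable sketch. It uses a standard C++ condition variable as the "event" rather than Win32 events or I/O completion ports, and every name in it is made up. The loader thread only produces raw data; the main thread is the only one that would touch the device.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Raw bytes for one terrain chunk, as the loader thread would read them from disk.
struct ChunkData {
    int id;
    std::vector<unsigned char> bytes;
};

// Hypothetical handoff queue: push() signals, the main thread waits and drains.
class LoadedChunkQueue {
public:
    void push(ChunkData chunk)
    {
        {
            std::lock_guard<std::mutex> lock(mMutex); // held only for the push
            mChunks.push(std::move(chunk));
        }
        mReady.notify_one();
    }

    // Blocks until a chunk is available. A render loop would more likely
    // poll once per frame instead of blocking.
    ChunkData waitPop()
    {
        std::unique_lock<std::mutex> lock(mMutex);
        mReady.wait(lock, [this] { return !mChunks.empty(); });
        ChunkData chunk = std::move(mChunks.front());
        mChunks.pop();
        return chunk;
    }

private:
    std::mutex mMutex;
    std::condition_variable mReady;
    std::queue<ChunkData> mChunks;
};

// Loader thread body: disk and CPU work only, no graphics device calls.
inline void loadChunks(LoadedChunkQueue& out, int count)
{
    for (int id = 0; id < count; ++id) {
        // Stands in for a file read.
        ChunkData chunk{id, std::vector<unsigned char>(64, static_cast<unsigned char>(id))};
        out.push(std::move(chunk));
    }
}

// Main thread: the only place where device resources would be created or refilled.
inline int consumeOnMainThread(LoadedChunkQueue& in, int count)
{
    int uploaded = 0;
    for (int i = 0; i < count; ++i) {
        ChunkData chunk = in.waitPop();
        assert(!chunk.bytes.empty());
        ++uploaded; // a real engine would fill a texture or VB here
    }
    return uploaded;
}
```

The lock is held only long enough to push or pop one chunk, which avoids the kind of massive critical section I suspect I had back then.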
Thanks again.
Associate Tom Spilman
Sickhead Games