Game Development Community

T3D 1.1BETA 1, 2 and 3 BLUE SCREEN OF DOOM! Possibly reflections - RESOLVED

by Steve Acaster · in Torque 3D Professional · 02/11/2010 (11:54 am) · 111 replies

WTF? Bad crash of some kind. Right-click and select "view image" for its full bizarreness.

img138.imageshack.us/img138/6032/screenissue.jpg
And yes, that clock does say 05:15, which is why I'm running a trace now.

All the "speckles" are flashing and it's all over the desktop. New_Project.exe stops responding. Flashing "speckles" remain on the screen after closing the application. Requires a reboot to clear them.

GTS250 (Galaxy), latest driver, Win7 with Aero.

Obviously you can see a rocket, a cloud layer and thus lots of reflection - which is what I'm thinking/wildly stabbing in the dark at is the issue here. That's a big old lump of water plane to reflect stuff, that is.

Problem is, when running a Debug build I get a full-blown BSOD, so it's not possible for me to get a call stack.

If I turn FullReflect off, it doesn't crash or glitch when I do the same thing.
---

Still an issue in T3D 1.1 Beta 2.

Update 27Jan2011
RESOLVED!
One line fix here
#21
02/13/2010 (6:09 pm)
If the entire PC locks up then there's not a lot you can do other than using a process watcher to keep track of the program and dump logs that you can pick up after you restart the PC. If it's just the program that freezes you should still be able to get a trace from the point you had to kill the app. Since this is predominantly a graphics card bug (whether it's caused by engine code locking up the GPU, or the driver not being able to do what it's supposed to) a GPU debugger would be better than Visual Studio, since you should be able to see the call to the graphics card that is crashing it and then find where in the code that call is made. Unfortunately the only GPU debuggers I know of are OpenGL and GLSL only (I'm sure DirectX debuggers exist, but I've never even heard of one).
#22
02/13/2010 (6:17 pm)
Thanks for the info Bryan. I've got it down to just an app freeze, but as you say Visual Studio isn't much help.

"Waiting for d3d9xwhatever" isn't exactly a fountain of wisdom. ;)
#23
02/13/2010 (6:25 pm)
I've never used it, but I've heard of it constantly and after looking at it some it appears that PerfHUD is THE Direct3D debugger: developer.nvidia.com/object/nvperfhud_home.html

I seem to remember some talk about T3D already having the necessary code exposed to plug into PerfHUD so maybe you can get something useful out of it. I would do it but I've been pretty busy preparing for boot camp next week.
#24
02/13/2010 (6:48 pm)
I'll give it a go.

Bootcamp? Sounds like exercise!
#25
02/14/2010 (8:25 am)
@Steve can you not just do a Debug > Break All when it freezes? That's all I have been doing anyway...
#26
02/14/2010 (11:50 am)
Getting task manager up actually unfreezes T3D, but I've managed to get interesting speckles on the screen so the GPU's freaked.

>	ntdll.dll!77d064f4() 	
 	[Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]	
 	ntdll.dll!77d05e4c() 	
 	KernelBase.dll!760a6872() 	
 	KernelBase.dll!760a691e() 	
 	user32.dll!768d90be() 	
 	MMDevAPI.dll!746d24b8() 	
 	MMDevAPI.dll!746d2f76() 	
 	kernel32.dll!769e1194() 	
 	ntdll.dll!77d1b3f5() 	
 	ntdll.dll!77d1b3c8()
#27
02/14/2010 (4:34 pm)
Did you try running Direct3D in debug mode, and cranking the debug spew all the way up? It will kick and scream at any GPU misuse that way.
#28
02/14/2010 (6:09 pm)
Doh! Completely forgot about the Dx Control Panel.

EDIT:
wtf?

With debug set rather than retail, T3D fails to start with the warning:
D3DERR_INVALIDCALL
Invalid call
GFXPCD3D9Device::intit - CreateDevice failed!
retry - cancel

//and the stack starts with:
>	New Project_DEBUG.dll!dRealloc_r(void * in_pResize=0x04de4ed0, unsigned int in_size=960, const char * fileName=0x11204948, const unsigned int line=223)  Line 1727 + 0xd bytes	C++

//and then there's just tonnes of it


[edit2]

When I get it to work, and it corrupts/freezes I tab out and record nothing more than
//lots of these
GFXPCD3D9Device::beginScene - Device needs to be reset, resetting device...
GFXPCD3D9Device::reset - depthstencil 3f0f1e0 has 2 ref's
--- Resetting D3D Device ---
The thread 'Win32 Thread' (0x237c) has exited with code 0 (0x0).
The thread 'Win32 Thread' (0x26c0) has exited with code 0 (0x0).
The thread 'Win32 Thread' (0x2448) has exited with code 0 (0x0).
The thread 'Win32 Thread' (0x1650) has exited with code 0 (0x0).

... think I'll give this up for a bad job for the moment...
#29
02/19/2010 (12:46 am)
I can reproduce this quite handily now. Give yourself plenty of ammo in player.cs. Face a large waterblock. Hold down the right mouse button to get triple fire going, and get several missiles going in the air towards the waterblock. Pivot slightly to the left.

First, it appears all normals collide. Then what must be the texture of the moon shows up screen-wide in blood red. At this point I can no longer take a screenshot on my PC: nothing shows up in Paint, if I can manage to get to Paint by hitting the Windows key. Alt-tabbing is less than completely functional, so the Windows key and arrow keys are the only option, but the clipboard buffer which should have the capture is blank.

I was able to get a screen cap off my laptop with a 9400.

dmz.worldofantra.com/public/torquesite/9400.png

Getting one from my desktop with an 8600 was much harder, but I crashed it out several more times (just to be SURE it happened reproducibly), and was finally able to get one screenshot off the 8600 too.

dmz.worldofantra.com/public/torquesite/8600.png

The first picture is extremely typical, if the esc key is still working. It's like the normal buffer, or buffers, turned inside out--and then they exploded.

This might not be the same bug Steve has, but it happens consistently and often, and can crash a client out in network play, as well as have the serving client host crash out but the clients keep being able to play.

This is a Bad Crash, as it requires a complete restart of the machine.
#30
02/19/2010 (1:19 pm)
Okay, still at this ...
On a freeze and a break I get
d3d9.dll!63d08333() 	
 	[Frames below may be incorrect and/or missing, no symbols loaded for d3d9.dll]	
 	d3d9.dll!63d08ff6() 	
 	d3d9.dll!63d0a154() 	
 	d3d9.dll!63c268cb() 	
 	d3d9.dll!63c51fed() 	
 	d3d9.dll!63c269d3() 	
>	New Project_DEBUG.dll!GFXD3D9OcclusionQuery::getStatus(bool block=true, unsigned int * data=0x0012b5f4)  Line 61 + 0x1c bytes	C++
 	New Project_DEBUG.dll!LightFlareData::prepRender(SceneState * state=0x12869010, LightFlareState * flareState=0x09121fc0)  Line 235 + 0x19 bytes	C++
 	New Project_DEBUG.dll!Sun::prepRenderImage(SceneState * state=0x12869010, const unsigned int stateKey=79703, const unsigned int __formal=4294967295, const unsigned int __formal=4294967295)  Line 334	C++
 	New Project_DEBUG.dll!SceneGraph::treeTraverseVisit(SceneObject * obj=0x09121c70, SceneState * state=0x12869010, const unsigned int stateKey=79703)  Line 323	C++
 	New Project_DEBUG.dll!SceneGraph::_buildSceneTree(SceneState * state=0x12869010, unsigned int objectMask=4294967295, SceneObject * baseObject=0x0324ef50, unsigned int baseZone=0, unsigned int currDepth=1)  Line 186	C++
 	New Project_DEBUG.dll!SceneGraph::renderScene(SceneState * sceneState=0x12869010, unsigned int objectMask=4294967295)  Line 232	C++
 	New Project_DEBUG.dll!SceneGraph::renderScene(ScenePassType passType=SPT_Diffuse, unsigned int objectMask=4294967295)  Line 207	C++
 	New Project_DEBUG.dll!GameRenderWorld()  Line 345	C++
 	New Project_DEBUG.dll!GameTSCtrl::renderWorld(const RectI & updateRect={...})  Line 53	C++
 	New Project_DEBUG.dll!GuiTSCtrl::onRender(Point2I offset={...}, const RectI & updateRect={...})  Line 347	C++
 	New Project_DEBUG.dll!GameTSCtrl::onRender(Point2I offset={...}, const RectI & updateRect={...})  Line 156	C++
 	New Project_DEBUG.dll!GuiCanvas::renderFrame(bool preRenderOnly=false, bool bufferSwap=true)  Line 1604	C++
 	New Project_DEBUG.dll!GuiCanvas::handlePaintEvent(unsigned int did=0)  Line 247	C++
 	New Project_DEBUG.dll!fastdelegate::FastDelegate1<unsigned int,void>::operator()(unsigned int p1=0)  Line 990 + 0x1a bytes	C++
 	New Project_DEBUG.dll!Signal<void __cdecl(unsigned int)>::trigger(unsigned int a=0)  Line 326 + 0x17 bytes	C++
 	New Project_DEBUG.dll!Journal::Call<Signal<void __cdecl(unsigned int)>,unsigned int>(Signal<void __cdecl(unsigned int)> * obj=0x04d4df30, void (unsigned int)* method=0x105e0039, unsigned int a=0)  Line 541 + 0xa8 bytes	C++
 	New Project_DEBUG.dll!JournaledSignal<void __cdecl(unsigned int)>::trigger(unsigned int a=0)  Line 64 + 0x12 bytes	C++
 	New Project_DEBUG.dll!GuiCanvas::paint()  Line 1423	C++
 	New Project_DEBUG.dll!fastdelegate::FastDelegate0<void>::operator()()  Line 905 + 0x16 bytes	C++
 	New Project_DEBUG.dll!Signal<void __cdecl(void)>::trigger()  Line 315 + 0x13 bytes	C++
 	New Project_DEBUG.dll!Process::processEvents()  Line 78	C++
 	New Project_DEBUG.dll!StandardMainLoop::doMainLoop()  Line 543 + 0x5 bytes	C++
 	New Project_DEBUG.dll!torque_enginetick()  Line 103 + 0x5 bytes	C++
 	New Project_DEBUG.dll!TorqueMain(int argc=2, const char * * argv=0x0079c028)  Line 369 + 0x5 bytes	C++
 	New Project_DEBUG.dll!torque_winmain(HINSTANCE__ * hInstance=0x00400000, HINSTANCE__ * __formal=0x00000000, char * lpszCmdLine=0x0026346d, HINSTANCE__ * __formal=0x00000000)  Line 423 + 0x17 bytes	C++
 	New Project_DEBUG.exe!WinMain(HINSTANCE__ * hInstance=0x00400000, HINSTANCE__ * hPrevInstance=0x00000000, char * lpszCmdLine=0x0026346d, int nCommandShow=1)  Line 47 + 0x16 bytes	C++
 	New Project_DEBUG.exe!__tmainCRTStartup()  Line 263 + 0x2c bytes	C
 	New Project_DEBUG.exe!WinMainCRTStartup()  Line 182	C
 	kernel32.dll!77761194() 	
 	ntdll.dll!779eb3f5() 	
 	ntdll.dll!779eb3c8()

Which I think is the same as Leon's report.
#31
02/19/2010 (6:36 pm)
Well yeah it looks the same, at the time I really thought it was getting stuck in the GFXD3D9OcclusionQuery::getStatus() WHILE loop but I tried various methods (even one to let it count to 1000 then exit) and it still didn't fix the bug...

I didn't / don't really know how to go about debugging something like this... I just don't have enough EXP with Torque / C++ right now...

I may try to have another look tomorrow. I see we may have a fix for PerfHud (I haven't tried it), but I am not even sure it's going to help in this case??
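
For reference, the "count to 1000 then exit" idea mentioned above can be sketched as a bounded polling loop. This is only an illustration of the technique, not the actual GFXD3D9OcclusionQuery::getStatus() code: `waitBounded`, `PollResult` and the `poll` callback are hypothetical names, and the callback stands in for whatever non-blocking status check the engine makes. As noted in the post, capping the loop this way did not cure the lockup.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical result type for this sketch; not a Torque enum.
enum class PollResult { Ready, TimedOut };

// Calls `poll` until it returns true or until `maxAttempts` calls
// have been made, whichever comes first. `poll` is an assumed
// stand-in for a non-blocking "is the query result ready?" check.
inline PollResult waitBounded(const std::function<bool()>& poll,
                              std::uint32_t maxAttempts)
{
    for (std::uint32_t i = 0; i < maxAttempts; ++i)
    {
        if (poll())
            return PollResult::Ready;
    }
    // Give up instead of spinning forever on a wedged GPU.
    return PollResult::TimedOut;
}
```

A caller would treat TimedOut as "result unknown" and carry on with the frame rather than blocking the render thread.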
#32
02/19/2010 (7:33 pm)
Yes, I can get PerfHud running (barely know what all the stuff means though!) - but it black screens on the lockup ...
#33
02/25/2010 (5:34 am)
Okay, this one got me now, but in a completely different situation. We're updating our project from beta 5 to 1.1b, and our character model is completely mangled when loaded from the cached DTS files. Quickly looking at the code, it looks like somewhere down the road T3D is changing the vertex buffer vertex format to a smaller one, while copying from a larger sized buffer while updating skin, which obviously causes data overwriting.

Trying to figure out if this is caused by our own source code modifications, I took the model and its TSShapeConstructor script files and tried to load them on vanilla 1.1b. The result? The model shows up mangled, T3D crashes and my Windows 7 desktop goes ultra sparkly.

I cannot post the model here, but I'll try to get in touch with someone from TP who can deal with this.

-- EDIT --
I fixed my issue by correcting a problem that happens when TSShape::initVertexFeatures() is called more than once for shapes that have vertex colors or 2nd texture coordinates: the second call computes the wrong vertex size. I doubt it is related to the crash you guys are seeing, though.
#35
03/11/2010 (10:22 am)
Just upgraded to an AMD Quad with a Radeon 5870 (latest drivers), I now can't reproduce the error...

It will sometimes give me a 1 second pause (thinking maybe that's where the Nvidia cards implode?) but never does what it used to.
#36
03/11/2010 (11:16 am)
*Reboots PC*
Nah, still here for Nvidia!
#37
03/11/2010 (2:43 pm)
@Steve: since I experienced the exact same symptoms on something entirely different, it's very likely that the cause is the same: copying data to a vertex buffer with the wrong size. Maybe you should try the TSShape fix to see if it's the problem you have (I doubt it, but it doesn't hurt).

In TSShape::initVertexFeatures(), find these lines:

hasColors |= !mesh->colors.empty();
hasTexcoord2 |= !mesh->tverts2.empty();

And replace them with:
hasColors |= !mesh->colors.empty() || mesh->mHasColor;
hasTexcoord2 |= !mesh->tverts2.empty() || mesh->mHasTVert2;
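
Why the extra flags matter can be sketched roughly as follows. This is a toy model, not the real TSMesh/TSShape code: `MeshSketch`, `upload`, the byte sizes and the two detect functions are all hypothetical, and it assumes (this is a guess, not something confirmed in the thread) that the source arrays are emptied after the first pass, so a second pass that only checks `empty()` sees no colors and picks a smaller vertex size.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a mesh; not the real TSMesh.
struct MeshSketch
{
    std::vector<unsigned> colors;  // per-vertex colors (source data)
    bool hasColorFlag = false;     // remembers that colors existed
};

// Assumed sizes for this sketch: 32 bytes for the base vertex,
// plus 4 bytes when a packed color is present.
inline std::size_t vertexSize(bool hasColors) { return hasColors ? 36 : 32; }

// Original-style check: looks only at the source array.
inline bool detectColorsNaive(const MeshSketch& m)
{
    return !m.colors.empty();
}

// Fixed-style check: also honors the remembered flag.
inline bool detectColorsFixed(const MeshSketch& m)
{
    return !m.colors.empty() || m.hasColorFlag;
}

// Simulates the first pass: record the feature, then free the source array
// (the freeing is the assumption this sketch rests on).
inline void upload(MeshSketch& m)
{
    m.hasColorFlag = m.hasColorFlag || !m.colors.empty();
    m.colors.clear();
}
```

With the naive check, the size computed before and after `upload` differs, which is the kind of mismatch that lets a copy sized for the larger format run past a buffer allocated for the smaller one.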

If that doesn't help, be on the lookout for anything in the scene when the glitch happens that might be updating vertex buffers directly. The issue I had was caused by copying data from a larger buffer to a smaller one, misaligning the data and probably overwriting system-memory GPU resources with bad data due to the out-of-bounds write.
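
As a rough illustration of guarding against that kind of overrun, a size-checked copy might look like the sketch below. `copyVerticesChecked` and its parameters are hypothetical and are not part of Torque; the point is only that the byte count is validated against both buffers before anything is written.

```cpp
#include <cstddef>
#include <cstring>

// Copies `vertexCount` vertices of `stride` bytes each from `src` to
// `dst`, but only if both buffers can hold that many bytes.
// Returns false and copies nothing if the copy would overrun either one.
inline bool copyVerticesChecked(void* dst, std::size_t dstBytes,
                                const void* src, std::size_t srcBytes,
                                std::size_t vertexCount, std::size_t stride)
{
    // Guard against overflow in vertexCount * stride.
    if (stride != 0 && vertexCount > static_cast<std::size_t>(-1) / stride)
        return false;

    const std::size_t needed = vertexCount * stride;
    if (needed > dstBytes || needed > srcBytes)
        return false;

    std::memcpy(dst, src, needed);
    return true;
}
```

In a debug build, a failed check is a good place for an assert or a log line naming the mesh, since the symptom otherwise only shows up later as GPU corruption.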
#38
03/11/2010 (5:10 pm)
@Manoel
quote: I doubt it
nah, that gave me a no warning BSOD.

Alas I'd be hard pressed to define the exact coding circumstances, as I am no coder (so terms like vertex buffers are somewhat alien to me). However I have noted various out-of-bounds crashes when using debug (very different circumstances), but of course don't really understand what that means either ....

There is one thing I can assert - crashes occur with large bodies of reflective water with projectiles using very dense particle trails - info that probably is not that much use to a programmer ...

(the good news being all my custom stuff doesn't use this so I don't get the crashes ... but that is rather "localized" good news)
#39
03/22/2010 (9:56 am)
Just to add my experiences here.

I've just upgraded to Win 7 Ultimate and get the exact same problem as Steve with my GTX260.

In Vista I had a similar problem in that when the missiles were fired over the water I'd get what appeared to be a hard lock, but it would resolve itself after about 30 seconds.

Now I'm not a programming guru here, but it appears that under Vista there's some kind of 'clean-up' going on that clears the problem after a certain time-out, whilst under Win7 something is getting pretty annoyed before that happens, and the system blue screens.
#40
03/29/2010 (8:20 am)

Seeing the pixel snow problem that Steve describes here too, though not with Torque but rather when playing Arx Fatalis. System is Windows 7 and GTX 275.

Overall, it seems to me like there's still quite a good deal of issues in the graphics driver chain there. Not sure if this is the fault of the NVIDIA drivers or not but anyways, I'm seeing a lot of instabilities and graphics problems. Frequent BSODs and driver crashes ("The graphics drivers has recovered from a...").