System crash
by Tim Doty · in Torque Game Builder · 03/03/2005 (6:19 pm) · 22 replies
Three times tonight I've had a total system lockup closing T2D. Makes a mess of what I'm working on ;^) I'm running SuSE 9.1 with the NVidia 5336 driver on a dual Athlon MP2400/MX4000Twin. If I had to guess it'd be the SMP. Is T2D thread-safe? Possibly another problem?
#2
03/03/2005 (6:50 pm)
I could have asked that better ;^) Thread-safety really only applies to multi-threaded applications; the OS takes care of multiple processes, threads and SMP. So, to be a bit more clear: when you say that none of the TGE applications are thread-safe, are you saying they are multi-threaded but don't exclusively use thread-safe functions?
Because it's a total system lockup (no response on the network, even), it must be the kernel dying. I'm just trying to hazard a guess at where the fault is so I can figure out what to do about it. A likely culprit would be the nvidia (kernel) driver, but T2D *could* be aggravating the problem. Anyone else with an SMP system?
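For anyone following along, here is a minimal Python sketch of what "thread-safe" means in this sense: it only matters once a single process runs several threads that touch the same data. The `Counter` class and `count_with_threads` function are made up for illustration and have nothing to do with Torque's actual code.

```python
import threading


class Counter:
    """Hypothetical shared counter; the lock is what makes increment() thread-safe."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write must happen as one step, so guard it with the lock.
        with self._lock:
            self.value += 1


def count_with_threads(num_threads, increments_per_thread):
    """Run several threads against one Counter and return the final value."""
    counter = Counter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, every increment from every thread is counted. Without it, updates from different threads can be lost. A single-threaded program never hits this case, whether it runs on one CPU or two.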
#3
03/04/2005 (1:20 am)
@Tim: A preemptive multitasking OS won't cause problems with TGE apps. Stephen is correct that the app is not thread-safe, but this shouldn't affect it running (at least on Windows; I'm not sure about SuSE 9.1), as we don't do threaded calls anywhere.
Never had T2D lock the system up, ever. That's not to say there isn't a problem. T2D doesn't do anything fancy with the graphics card and I'm not aware of any stability problems on that hardware.
Sorry I couldn't provide any more insight and sorry this is your first impression of T2D, not a very good one I'm afraid.
If you get any more information or you find/resolve the problem, then I'd appreciate you passing the info along if you would.
Thanks in advance,
- Melv.
#4
03/04/2005 (5:47 am)
I could have answered it better as well! As Melv says, none of the out-of-the-box Torque products even use threads, although Torque does provide support for them. Also as Melv says, there shouldn't be any issues related to multi-CPU, since it's still one process.
If there is any one thing that's common to most issues with Torque products, it's OpenGL video card drivers, especially nVidia ones. There have been multiple reports on multiple platforms of issues with several of the drivers for several cards, and I'd suggest this might be your best place to start looking.
#5
03/04/2005 (9:26 am)
SMP doesn't affect anything Torque-wise. Try newer nvidia drivers; you're a version behind.
#6
03/04/2005 (10:01 am)
Or in the case of nvidia, sometimes you have to try older drivers, as in my experience at least, they tend to break OpenGL every time they release a new driver. It's a true pain, because it almost always looks like it's Torque's fault, but it's very often the driver.
Would be wonderful if somehow we could build a list of "known stable" drivers per video card, per platform, but that's a lot of data collection.
#7
03/04/2005 (10:27 am)
Might be worth also trying the TGE demo or some of the other product demos to see if the same thing happens.
I think everyone has suggested about as much as anyone can for the moment.
I hope you can resolve the problem.
- Melv.
#8
03/04/2005 (3:05 pm)
Thanks for the replies!
I am using an older version of the nvidia driver because the latest version is incompatible with the 2.6 kernel series (perhaps only certain versions, but definitely the version I'm using). There was just a kernel update from SuSE to address the nvidia driver problem. Looks like I'll give that a whirl.
The only other OpenGL program I really use is Celestia. It hasn't locked up this computer, but that could be luck. Also, the lockup only happens when I close T2D. As I don't run Celestia *that* often, and certainly don't start and stop it as frequently as I was doing with T2D, it could just be something to do with closing out OpenGL on this driver.
As to my impression of T2D: I'm very impressed, and in a good way. I may not use T2D for the project I bought it for, but it makes development so quick that I've started another.
If I were to make a suggestion, it would be to point out the console very up front. Although this may be old hat to other game developers, my programming background is not in games and my gaming background (while long and extensive -- 25+ years) is paper & pencil roleplaying.
Thanks again!
(I just want to make clear that in nearly every respect T2D is far superior to what I've been working with for the aforementioned project. There are a few deep design niggles that kind of bug me for this particular project. I *want* to use T2D for what it can do, but I'm going to get my feet wet with a simpler project and see if I can bend it the way I want.)
#9
03/04/2005 (3:41 pm)
Thanks Tim. And if you feel like talking about the design stuff, feel free to open a thread about it. We are always interested in ideas / feedback, and we're still early in the dev process here, if you look at it with a long-term perspective, so adaptations are possible. I'll just note here that the animation and imagemap systems will be significantly extended and refined in coming releases, in case that's an area you were talking about. (Let's open up a separate thread if the discussion is to continue down this path! :)
#10
03/04/2005 (8:40 pm)
@Josh: Well, I took you up on it and opened up a thread where I spell out a bit more thoroughly what I'm doing.
re: this thread. I just got the kernel update, found that the latest nvidia driver still has the same problems (it has problems with screen updates), and reverted to the previous nvidia driver. Here's hoping that it was coincidence last night and my system is crash free...
#11
03/05/2005 (8:41 am)
I have figured out a pattern: the lockups occur when I use the window manager close gadget for T2D, but so far (knock on wood) if I select File/Quit it doesn't crash. I don't think it actually crashed every time I used the window manager close gadget, but it has only crashed when I did.
#12
03/06/2005 (1:38 am)
"Window Manager Close Gadget".
Could you elaborate on that one Tim?
- Melv.
#13
03/06/2005 (3:42 am)
I think he's talking about the window-close button (e.g. the upper-right X close button in Windows). Crashes when closing that way, as opposed to quitting from T2D itself. Right Tim?
#14
03/06/2005 (12:32 pm)
@Josh: That is correct. I've been careful to close from the menu as I'm not ready for system lockup yet -- and so far no lockups. One thing *has* changed, I've got a slightly updated kernel. My suspicion at present is that for some reason there is a code or timing difference with the shutdown code depending on how it is closed. I *will* give a shot at making it crash again to see if I can reproduce it.
#15
03/06/2005 (1:11 pm)
Argh. Okay, I got my first crash closing from the menu so scratch that idea. Thanks for trying to help but it'll probably only get fixed with a new driver from nvidia.
#16
03/10/2005 (3:26 pm)
There is some randomness as to whether or not it occurs, but it *only* occurs when closing T2D. Happened again the other night. It happens either with the window manager close gadget or with the menu quit. I haven't tried directly issuing a quit() -- but if the devs think it will do any good I'll try that (I just hate killing the system; it leaves things in a mess, with file locks and such left open).
I haven't kept strict track, but it appears to be about 10 to 20% of the time (at least when closing from the menu -- I'm still avoiding the window manager close gadget. Three kernel locks in one night is enough for me).
Well, to kill the whole system it's gotta be killing the kernel. Even with Linux being monolithic, an application by itself normally can't do that; it requires a driver. The only other OpenGL program I use is Celestia, and to be honest, despite how much I love it, I never seem to run it that often, so the fact it hasn't crashed the computer on quit doesn't mean much. So I'm thinking nVidia's OpenGL has a problem (with v5336, anyway, and the newer release doesn't work with at least some 2.6 kernels -- including mine).
edit:
@Robert: Thanks for the script load suggestion, it hadn't occurred to me. That helps quite a bit.
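As a rough back-of-the-envelope check on that 10 to 20% figure: if each quit locks up independently with probability p, the chance of N clean quits in a row is (1 - p)^N. The sketch below is plain Python with made-up function names, and it assumes the quits really are independent, which may not hold for a driver bug. It works out how long a clean streak has to be before luck alone becomes an unlikely explanation.

```python
import math


def chance_of_clean_streak(p_lockup, quits):
    """Probability of `quits` clean exits in a row if each one locks up with p_lockup."""
    return (1.0 - p_lockup) ** quits


def quits_needed(p_lockup, max_luck=0.05):
    """Smallest streak length whose chance of happening by luck is no more than max_luck."""
    return math.ceil(math.log(max_luck) / math.log(1.0 - p_lockup))


# The two ends of the observed range (10% and 20% lockups per quit).
for p in (0.10, 0.20):
    print(f"p={p:.2f}: {quits_needed(p)} clean quits in a row needed")
```

At a 10% lockup rate that comes out to 29 clean quits in a row, and at 20% to 14, so a handful of good exits doesn't prove much either way.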
#17
03/19/2005 (11:40 am)
@Robert: I've been using quit(); almost exclusively and it has not crashed once since then.
#18
03/19/2005 (11:57 am)
Sorry to hear that you're still having problems, Tim. At this point, I'm not sure what else to suggest. Are you adding the "quit();" command to a button on screen rather than typing it? That would make things quicker.
I've got this thread on my bugs-list still and I won't take it off until we find out what's going on. If you do solve the issue, be sure to post here.
Good Luck,
- Melv.
#19
03/19/2005 (12:15 pm)
Well, I haven't had a crash in a week so I think that is good. The reason I posted was to let Robert, yourself, whoever know that -- for whatever reason -- calling quit() directly seems to avoid the kernel lock.
#20
03/30/2005 (3:26 pm)
Well, it is a much lower lock rate, but it has happened at least once, maybe twice since then. However, NVidia has released a new driver that works with the kernel I'm using and I just got that installed. If I'm right and the kernel lock is due to the driver hopefully this will cure it. After a bit when I feel up to the risk I'm going to start/quit a lot to see if I can make it happen.
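For the start/quit test, a small harness along these lines might help. It is only a sketch: the function names and log format are my own, and the command to launch is whatever starts T2D on your system. Each line is flushed and fsync'd before the next launch, so the log should still be readable after a hard lockup.

```python
import os
import subprocess


def log_line(log_file, text):
    """Write one line and push it to disk so it has a chance of surviving a lockup."""
    log_file.write(text + "\n")
    log_file.flush()
    os.fsync(log_file.fileno())


def run_trials(command, trials, log_path):
    """Launch `command` repeatedly, waiting for each run to exit; return the clean exits."""
    clean = 0
    with open(log_path, "a") as log_file:
        for i in range(1, trials + 1):
            log_line(log_file, f"start {i}")
            result = subprocess.run(command)  # blocks until the program exits
            log_line(log_file, f"end {i} rc={result.returncode}")
            if result.returncode == 0:
                clean += 1
    return clean
```

Something like `run_trials(["./t2d"], 30, "quit_trials.log")` would then launch it 30 times (the binary name is a placeholder), with each run closed by hand or by a scripted quit(). After a lockup and reboot, a `start N` line with no matching `end N` marks the run that took the system down.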
Torque 3D Owner Stephen Zepp