Game Development Community

Major GeForce 6600 GT performance issues

by Tom Spilman · in Torque Game Engine Advanced · 05/24/2005 (1:53 pm) · 45 replies

So we have a map that performs beautifully on an ATI 9800 Pro but is dog slow on my GeForce 6600 GT. I'm getting 2-3 fps as reported by metrics(fps) and NVPerfHud, while on the 9800 Pro we get 80-100fps.

So I started by trying to force down the pixel shader level. I tried forcing it to 2.0, 1.4, 1.1... all of them seem to give me the same performance. I did this by changing the prefs to:

$pref::Video::forcedPixVersion = 1.1;
$pref::Video::forcePixVersion = 1;

Which seems to work, as the console reports it's forcing shaders down... but I can't be sure it's not lying to me. =)

Next I fired up NVPerfHud. It shows a tremendous amount of time being spent in the driver. When I use the T key to force 2x2 textures, performance comes up to a respectable frame rate. According to NVidia's docs, that means I'm limited by texture bandwidth. Now I do have a lot of large textures in the scene... but the 9800 doesn't seem to mind.

So the 6600, which is supposed to be a better-performing card than the 9800 Pro, has a texture bandwidth issue while the 9800 does not. That seems illogical to me.

I suspect one of the shaders is the culprit here. Anyone have any tips to help diagnose this problem?

About the author

Tom is a programmer and co-owner of Sickhead Games, LLC.

#1
05/24/2005 (2:01 pm)
FYI, I changed my GFXD3DDevice::init() to support NVPerfHud:

   UINT AdapterToUse = D3DADAPTER_DEFAULT;
   D3DDEVTYPE DeviceType = D3DDEVTYPE_HAL;

   #ifdef INTERNAL_RELEASE

      // If the "NVIDIA NVPerfHUD" adapter is present then
      // select it and switch to the reference device type,
      // which NVPerfHud requires for instrumentation.
      for ( UINT Adapter = 0; Adapter < mD3D->GetAdapterCount(); Adapter++ )
      {
         D3DADAPTER_IDENTIFIER9 Identifier;
         HRESULT Res = mD3D->GetAdapterIdentifier( Adapter, 0, &Identifier );
         if ( SUCCEEDED( Res ) &&
              strcmp( Identifier.Description, "NVIDIA NVPerfHUD" ) == 0 )
         {
            AdapterToUse = Adapter;
            DeviceType = D3DDEVTYPE_REF;
            break;
         }
      }

   #endif

   HRESULT hres = mD3D->CreateDevice( AdapterToUse, DeviceType, winState.appWindow,
                                      D3DCREATE_MIXED_VERTEXPROCESSING,
                                      &d3dpp, &mD3DDevice );

It wouldn't hurt to support this in the main codebase... and it's another reason to have the INTERNAL_RELEASE build configuration in the project.
#2
05/24/2005 (2:54 pm)
Another note: NVPerfHud reports that I'm using 96MB of video memory with a tiny amount of AGP memory. I doubt that 96MB of textures is the issue. It must be the shader compiler doing something funky. Gonna try some different drivers, including the Nvidia beta drivers.
#3
05/24/2005 (6:34 pm)
I updated to Nvidia's latest developer beta drivers, 76.91, and my frame rate went up from 3fps to 8fps... still, something is really wrong here. And man... NVPerfHud 3.0 beta is freaking awesome.
#4
05/24/2005 (8:15 pm)
Reading more about the 6600 GT at TechReport, it seems that theoretically, and in single-texture situations, the 6600 is much slower than the 9800 Pro. Still, when multi-texturing, which I assume the 6600 is doing here, we should be going much faster than the 9800.

This info has me thinking that the 6600 GT just can't handle the textures we're throwing at it. We have:

3 - 1024x1024
2 - 1024x512
19 - 512x512
4 - 512x256
11 - 256x256
... and a couple more 128 and 64s.

Now those can be multiplied by 2, as most have a normal map associated with them, and at the moment normal maps have to have the same resolution as the diffuse map. Honestly, that doesn't seem like a ridiculous amount of textures to me. It's only 96MB of textures in memory, and with multi-texturing all of it happens in a single pass.

Or maybe I'm way out of touch with what is possible on cards?
#5
05/25/2005 (3:41 am)
Hmmm interesting....

Just thought I'd break this thread up a little and stop it looking like you were talking to yourself ;o)
#6
05/25/2005 (10:46 am)
It wouldn't be the first time I did. =)
#7
05/25/2005 (12:57 pm)
I had the same sort of problem, but with a GeForce 6800 GT. I loaded 16 1024x1024 32-bit textures, each with its own 1024x1024 normal map. That should take up 128 megs of my video card's 256 megs of memory. I noticed a HUGE performance drop and assumed that the video card was swapping texture data between system memory and video memory (because the video card ran out of memory).

I only had a performance issue when I loaded a lot of large textures. Everything ran full speed with a single 1024x1024 texture and a 1024x1024 normal map, even if I used the Parallax shader that was posted on the forums.
#8
05/25/2005 (1:16 pm)
Well, one 1024 diffuse+normal map pair is a lot less than 16, but it seems odd that the card wouldn't have the texture bandwidth to render everything in its texture memory at least 2x over.
#9
05/25/2005 (1:37 pm)
Well, the only thing confusing me is that my card runs Doom 3 and Half-Life 2 extremely fast, so it's very unlikely the GeForce cards have a performance issue that the Radeons don't. I mean, I can run Doom 3 with all the settings turned all the way up at 1280x1024 and still get at least 60 fps.
#10
05/25/2005 (1:40 pm)
If you think it's a shader issue, you could re-run the test with shaders disabled. The quickest spot to do that would be in gfxd3dshader.cpp: comment out the ::process() body and see what effect it has on fps. (I wouldn't expect good rendering though, heh.)

Hope that helps
#11
05/25/2005 (1:41 pm)
Is it possible that the Radeon is using some sort of feature that takes a shortcut on the render path that the Nvidia card doesn't? One of those "optimizations" where the driver has the video card render things incorrectly to get a few extra frames per second?
#12
05/25/2005 (4:12 pm)
It may be a batching problem. If it's drawing one huge texture somewhere, then drawing another huge texture, and then going BACK and rendering the first huge texture somewhere else, that's going to be a big hit on the texture cache. No idea why this would affect the Radeon so much less than the GeForce, though.

To turn off shader rendering, you could try just unmapping all the textures from Materials in materialMap.cs. That will fall back to fixed function in most cases.

Are the large textures on the terrain by chance? Might be taking a hit on the blending.
#13
05/25/2005 (7:33 pm)
Well, in my case I was just using the regular old milestone 2 demo. I created a single interior that was basically a set of cubes, each with one of the 16 textures on it. I only noticed a performance hit if I looked at the cubes. Since I was using the stock demo stuff, I didn't have any custom terrain or anything like that.
#14
06/07/2005 (7:42 am)
One idea is that the Radeon driver might be caching some shaders while the Nvidia driver doesn't...

Remember, these drivers are pretty hacky and are made to work well with certain sets of shader attributes (usually combinations used by specific games to get better benchmark speeds).

Best to talk with one of the manufacturers about their driver performance under different conditions. I'm assuming ATI is doing some speculative caching of large textures that maybe NVIDIA isn't.
#15
06/08/2005 (2:42 pm)
I was messing with Ben's 6600 briefly yesterday and found that if you remove the TerrainBlock from the demo mission, the bad performance (3fps) goes away and it hauls ass. Might want to try that. Not sure why this is affecting the 6600 so much.
#16
06/16/2005 (1:00 am)
Bump for relevance.

I ran TSE on my 6800 OC GT tonight for the first time and was underwhelmed by the FPS. The terrain_water_demo chunks along at 27-40FPS from the original camera position. Moving along in a plane from the original position does nothing to improve FPS.

Not to mention, when the camera rotates around the Space Orc in the demo so that you can look out towards the terrain through a portal, there is a HUGE performance hit... 1-2FPS... until the portal is no longer in LOS. Same when the camera paths out the door to reveal the outside terrain.

I haven't checked this on Radeon as I don't have a 9xxx series card handy. Has anyone with an ATI noticed the FPS drop in the demo?

I'm not too technically astute when it comes to the differing technologies between ATI and NVidia, but I gather there is something... funky... in TSE that is forcing the cards to interpret things differently. Unfortunately, I don't even know where to start looking. ;(
#17
06/16/2005 (1:33 am)
Bryce, I've noticed a drop in the SpaceOrc demo too but haven't measured the framerate.

I have tested with a 9800Pro and a 9600XT card using both Omega and ATI stock drivers. The 9600XT test used drivers downloaded about 2 weeks ago, the 9800Pro drivers dated roughly three months (can't remember exactly).

I have a GfFX5700LE card I'm intending to test on as well but haven't had a chance yet.

The new Terrain demo runs beautifully on both the ATI cards though (as expected).

[EDIT]The drop in the Space Orc demo was of course more noticeable on the 9600XT, but nevertheless apparent on the 9800Pro[/EDIT]
#18
06/16/2005 (3:27 am)
I have similar performance issues with the default demo on a 6600OC. Seems to only happen when the space orc is inside an interior though. If I put a space orc on the terrain, it's fine, and when I remove the space orc from the interior, it's fine.
#19
06/16/2005 (5:50 am)
I just ran through the demo again, and never once dropped below 100 fps (using metrics(fps); in the console).

Specs:
Pentium 4 3.4G, 1G memory
nVidia GeForce 6800 GT, 256MB memory, running at 1152x864, full shaders.
driver: 6.14.10.6674.
#20
06/16/2005 (7:48 am)
@Stephen: You're using the 66.74 driver set. What happens when you use 71.84?

The only difference between your config and mine is that I'm running an Athlon XP 3200+ and the most recent ForceWare drivers. Are those differences really that pronounced, or did NVidia break something in the newer driver set?

Hmmm...