Shader compilation errors when using SM 2.0
by Manoel Neto · in Torque 3D Professional · 11/20/2009 (1:40 pm) · 14 replies
I'm getting some shader compilation errors when using SM 2.0, either when running on a PS 2.0 card or when forcing the shader version using this:

    $pref::Video::forcedPixVersion = 2;
    $pref::Video::forcePixVersion = 1;

There are two errors. This one affects a lot of materials:
error X3003: redefinition of 'ConnectData::detCoord'
The other error is the classic "Compiled shader code uses too many arithmetic instruction slots" in Gideon's and swarmGuns's materials.
- EDIT -
I took a peek at the shaders, and look at this:

    struct ConnectData
    {
       float4 hpos : POSITION;
       float2 out_texCoord : TEXCOORD0;
       float2 detCoord : TEXCOORD1;
       float2 detCoord : TEXCOORD2;
       float3x3 worldToTangent : TEXCOORD3;
       float3 wsPosition : TEXCOORD6;
       float4 screenspacePos : TEXCOORD7;
       float fogAmount : TEXCOORD8;
    };

detCoord is indeed declared twice. There's also this a few lines below:

    ...
    uniform float2 detailScale : register(C12),
    uniform float2 detailScale : register(C13),
    ...

And this:

    // Base Texture
    OUT.out_texCoord = IN.texCoord;
    // Detail
    OUT.detCoord = IN.texCoord * detailScale;

It's obvious what's going on: shaderGen got confused in SM 2.0 when generating shaders with both detail diffuse and detail normal maps.
#2
11/21/2009 (2:33 pm)
Tom: the instruction limit is 64, not 96 (that's SM2.0b or something like that, that only some Radeons are capable of). The offending shaders were using 74 instructions. Disabling specular lighting on the affected materials makes them compile (I tried only removing Gideon's specular map, but that didn't do it).
#3
11/21/2009 (7:00 pm)
Right... original SM 2.0 is 64 arithmetic and 32 texture instructions. SM 2.X is 96 instructions of any kind.
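To make the numbers above concrete, here is a minimal C++ sketch of a budget check using only the limits quoted in this thread (64 arithmetic plus 32 texture for SM 2.0, 96 of any kind for SM 2.X). The struct, constants and function names are hypothetical and not part of Torque or Direct3D, and real hardware may report different limits through its caps.

```cpp
#include <cassert>

// Instruction counts reported by the compiler for one pixel shader.
// (Hypothetical type, for illustration only.)
struct ShaderCost {
    int arithmetic;
    int texture;
};

// Limits as quoted in the thread; treat them as assumptions.
constexpr int kPs20MaxArithmetic = 64;
constexpr int kPs20MaxTexture    = 32;
constexpr int kPs2xMaxTotal      = 96;

// SM 2.0: arithmetic and texture instructions have separate budgets.
bool fitsPs20(const ShaderCost &c) {
    assert(c.arithmetic >= 0 && c.texture >= 0);
    return c.arithmetic <= kPs20MaxArithmetic && c.texture <= kPs20MaxTexture;
}

// SM 2.X (per the thread): one shared budget for all instructions.
bool fitsPs2x(const ShaderCost &c) {
    assert(c.arithmetic >= 0 && c.texture >= 0);
    return c.arithmetic + c.texture <= kPs2xMaxTotal;
}
```

With the 74 arithmetic instructions reported for the failing shaders, fitsPs20() returns false regardless of the texture count, which matches the compile error.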
#4
11/21/2009 (7:56 pm)
Ok... one line fix. In shaderGen\HLSL\shaderFeatureHLSL.cpp around line 515 in the function ShaderFeatureHLSL::addOutDetailTexCoord():

    Var* ShaderFeatureHLSL::addOutDetailTexCoord( Vector<ShaderComponent*> &componentList,
                                                  MultiLine *meta,
                                                  bool useTexAnim )
    {
       // Check if its already added.
       Var *outTex = (Var*)LangElement::find( "detCoord" ); // CHANGED!
       if ( outTex )
          return outTex;

That should fix the issues with broken materials in BL with detail normal maps.
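The fix relies on a find-before-add check: look the variable up by the exact name it will be registered under, and only create it if the lookup misses. Below is a minimal standalone C++ sketch of that pattern. ConnectorSketch, its Var struct and findOrAdd() are hypothetical stand-ins, not the actual shaderGen classes, and the way semantics are numbered here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a shaderGen connector variable.
struct Var {
    std::string name;
    std::string semantic;
};

// Hypothetical registry illustrating find-before-add.
class ConnectorSketch {
public:
    // Returns the existing variable if the name is already registered,
    // otherwise creates it with the next free TEXCOORD slot.
    Var *findOrAdd(const std::string &name) {
        assert(!name.empty());
        auto it = mVars.find(name);
        if (it != mVars.end())
            return &it->second;   // already added: reuse, do not redeclare
        Var v{name, "TEXCOORD" + std::to_string(mVars.size())};
        return &mVars.emplace(name, v).first->second;
    }

    std::size_t count() const { return mVars.size(); }

private:
    std::map<std::string, Var> mVars;   // node-based, so pointers stay valid
};
```

If the lookup key differs from the name used at registration, the check never hits and the same member is emitted twice, which is the kind of redefinition reported in the first post.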
#5
03/10/2010 (12:48 pm)
This issue still remains in 1.1 beta. Graphics cards using pixel shader 2.0 continue to fail if pixel specular is turned on (1).
"FAIL: An undetermined error occurred (80004005)
C:/Torque/Torque 3D 2009 Pro 1.1 Beta 1/My Projects/New Project/game/shaders/procedural/9f86c46ff260fc71_P.hlsl(74,12): error X5608: Compiled shader code uses too many arithmetic instruction slots (74). Max. allowed by the target (ps_2_0) is 64."
The following happens when running with card profile:
Attempting to create GFX device: ATI Radeon Xpress 1150 (D3D9)
Device created, setting adapter and enumerating modes
Cur. D3DDevice ref count=1
Pix version detected: 2.000000
Vert version detected: 2.000000
Maximum number of simultaneous samplers: 16
Number of simultaneous render targets: 4
Hardware occlusion query detected: Yes
Using Direct3D9Ex: No
WMIVideoInfo: DxDiag initialized
Initializing GFXCardProfiler (D3D9)
o Chipset : 'ATI Technologies Inc.'
o Card : 'Radeon Xpress 1150 '
o Version : '7.14.0010.0449'
- Scanning card capabilities...
GFXCardProfiler (D3D9) - Setting capability 'autoMipMapLevel' to 1.
GFXCardProfiler (D3D9) - Setting capability 'maxTextureWidth' to 2048.
GFXCardProfiler (D3D9) - Setting capability 'maxTextureHeight' to 2048.
GFXCardProfiler (D3D9) - Setting capability 'maxTextureSize' to 2048.
GFXCardProfiler (D3D9) - Setting capability 'lerpDetailBlend' to 1.
GFXCardProfiler (D3D9) - Setting capability 'fourStageDetailBlend' to 1
#6
03/10/2010 (1:36 pm)
Yeah, I've seen it happen on an X600 card we have at work. A simple model with only diffuse and specular fails to render, so we ended up simply forcing specular off on SM 2.0 cards.
#7
03/10/2010 (2:08 pm)
Right, I can use $pref::Video::disablePixSpecular = 1 if the user has SM 2.0, but the loss in quality (our normal maps also have to go when specular is disabled) is pretty dramatic and I doubt it will be acceptable to my boss. I don't have much experience with shaderGen, but it should be possible to get normal, diffuse and specular to compile in SM 2.0 (64 arithmetic ops instead of 69), shouldn't it?
#8
03/10/2010 (2:21 pm)
I think it should be possible, but the BL shaders are quite big since they do lighting for 4 lights in a single pass. I'm sure someone is looking into this for the next beta, but maybe you could try modifying the shader that computes the lighting for BL to see if you can get it down to 64 instructions.
The lighting function is compute4Lights() in shaders\common\lighting.hlsl. Pay attention to the comment there: the light positions are stored in a not-so-obvious way to reduce instructions already.
You can try adding:

    #define TORQUE_BL_NOSPOTLIGHT

...at the top of the shader to disable spotlight support. I believe this will reduce the number of instructions.
#9
03/10/2010 (2:41 pm)
Thanks for the help. I got it working by reducing the size of lightVectors[3] (and the number of iterations in the for-loops) from 3 to 2, and it seems to work great. I am sure there are some problems with this approach, but at least now I am getting specularity (spec color, not texture) on my SM 2.0 profiles, and graphic quality is about 10x what I was getting before.
#10
03/10/2010 (2:53 pm)
Interesting, thanks for that Jeff. Hopefully this will be fixed properly very soon.
#11
06/05/2010 (1:45 pm)
Logged as TQA-237.
#12
07/21/2010 (3:47 pm)
Any progress on this? My workaround of reducing the array sizes only seems to work on shader 2.0 cards; it causes problems on shader 3.0 cards. So ultimately I had to make an engine change in shadergen/hlsl/pixSpecularHLSL to use a different HLSL file (I named mine lightingSM2.hlsl) if the user is using pixelVersion <= 2.0. I would like a more proper solution or an updated lighting HLSL script that will compile in both conditions.
#13
07/21/2010 (4:21 pm)
@Jeff: you can make the same shader code behave differently for different shader versions using this:

    #if TORQUE_SM > 20
       // Code here will only be compiled for GPUs above shader model 2.0
    #else
       // Code here will be compiled for shader model 2.0 or lower
    #endif

This is evaluated at shader compilation time, so it has zero runtime performance cost.
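As a rough illustration of how that conditional could carry the array-size workaround from post #9, here is the same pattern written as standalone C++, whose preprocessor evaluates #if in the same way. This is only a sketch: the default value given to TORQUE_SM, the constant kLightVectorCount and the function sumLightTerms() are assumptions made for this example and are not taken from lighting.hlsl.

```cpp
#include <cassert>

// Assumption: 20 stands for shader model 2.0. In the engine the value
// would come from the shader build, not from a default like this one.
#ifndef TORQUE_SM
#define TORQUE_SM 20
#endif

#if TORQUE_SM > 20
// Above SM 2.0: keep the full array size.
constexpr int kLightVectorCount = 3;
#else
// SM 2.0 or lower: use the reduced size from the workaround.
constexpr int kLightVectorCount = 2;
#endif

// Sums the first `count` terms, never more than the compiled-in array size.
float sumLightTerms(const float *terms, int count) {
    assert(count >= 0 && count <= kLightVectorCount);
    float total = 0.0f;
    for (int i = 0; i < count; ++i)
        total += terms[i];
    return total;
}
```

Because the branch is chosen by the preprocessor, only one of the two sizes exists in the compiled result, so a single source file can serve both shader models.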
#14
07/21/2010 (6:27 pm)
Great, that's exactly what I needed. Thanks for the help.
Associate Tom Spilman
Sickhead Games
Thanks Manoel... I'll look into these.
ShaderGen will be getting an overhaul soon, and stuff like that should be much harder to get wrong.
As far as having too many instructions for SM 2.0... I'll check that specific case and see if there is something we can optimize. I can't think of any one thing there that should be causing it to go over the 96 instructions or whatever the limit is.