Game Development Community

T3D (1.1b) eating shader constants [fix] and worldToObj bug [also fix] - RESOLVED

by Manoel Neto · in Torque 3D Professional · 02/26/2010 (3:58 am) · 9 replies

I wrote a custom shadow class to replace the projected shadow in my project. This class just re-renders my characters using a custom material that flattens their models onto a pre-defined plane and uses the stencil buffer to prevent shadows from blending against each other. It looks good; however, my shadows are always centered on their casters, so if someone jumps, the shadow lifts off the ground with them.

I re-wrote my projection over and over again, trying to project in object space, then in world-space, and nothing fixed it. Right now I perform the projection in world space and convert the result to object space before applying the modelview:

#include "common/hlslStructs.h"

struct ConnectV
{
   float4 hpos : POSITION;
};

uniform float4x4 modelview;
uniform float4x4 objTrans;
uniform float4x4 worldToObj;

ConnectV main( VertexIn_P IN )
{
	ConnectV OUT;
	
	float3 planeCenter = float3(0,0,0);
	float3 planeNormal = float3(0,0,1);
	float3 rayVec      = float3(1,1,-1);
	float3 rayOrigin   = mul(objTrans, IN.pos.xyz);
	
	float planeD = dot(planeCenter, planeNormal);
	float cosAngle = dot(rayVec, planeNormal);
	float distToPlane = planeD - dot( rayOrigin, planeNormal );
	float distToProj = distToPlane / cosAngle;	
	
	float3 shadowPos = rayOrigin + (rayVec)*distToProj;
		
	shadowPos = mul( worldToObj, shadowPos );

	OUT.hpos = mul( modelview, float4(shadowPos,1) );

	return OUT;	
}

But the shadows keep moving off the projection plane, so I then decided to look at the shader constants using NV Perf HUD, and then the problem became obvious.

Here's the objTrans shader constant (4x4 matrix at C4) value when the object is drawn normally:
i47.tinypic.com/2zivuap.png
The object is at position (0, 0, 0.11) with scale (0.45, 0.45, 0.45) and no rotation.

Now look at its value when I'm drawing my custom material:
i50.tinypic.com/rr5z50.png
The last row at C7 (the one which should contain the position) is gone! It was overwritten by the contents of worldToObj. It seems T3D decided that objTrans was a 3x3 matrix for some reason. I'm going to debug the code that sets the constants to see what is going wrong.

#1
02/26/2010 (5:01 am)
Alright, it looks like the problem is in GFXD3D9Shader::_getShaderConstants(), but I can't quite figure out a solution. Seems that D3D is "optimizing" the number of constants used by my matrix4x4 uniforms based on how they are used in the shader. If I mul() them with float3 values, the shader compiler only allocates 3 registers for them.
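To make the overlap easier to picture, here is a toy simulation of the constant registers. Everything in it is hypothetical (the `RegisterFile` type, the helper names, the upload order, and worldToObj being uploaded as four rows); it only models the situation described above, where the compiler reserves c4..c6 for objTrans and places worldToObj at c7 while the engine still uploads four rows for objTrans.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; none of these are T3D or D3D types.
using Float4       = std::array<float, 4>;
using Matrix4x4    = std::array<Float4, 4>;   // four register-sized rows
using RegisterFile = std::array<Float4, 16>;  // c0..c15

// Copies `rowCount` rows of `m` into consecutive registers starting at `first`.
inline void uploadRows(RegisterFile& regs, std::size_t first,
                       const Matrix4x4& m, std::size_t rowCount)
{
    assert(rowCount <= m.size() && first + rowCount <= regs.size());
    for (std::size_t r = 0; r < rowCount; ++r)
        regs[first + r] = m[r];
}

// Builds a matrix whose every element equals `tag`, so its rows are easy to spot.
inline Matrix4x4 filled(float tag)
{
    Matrix4x4 m;
    for (Float4& row : m)
        row.fill(tag);
    return m;
}

// Assumed scenario: objTrans owns c4..c6 only, worldToObj starts at c7,
// but objTrans is still uploaded as a full 4x4. Upload order is an assumption.
inline RegisterFile simulateMismatchedUpload()
{
    RegisterFile regs{};                  // all registers start at zero
    uploadRows(regs, 4, filled(1.0f), 4); // objTrans: writes c4..c7
    uploadRows(regs, 7, filled(2.0f), 4); // worldToObj: writes c7..c10, clobbering c7
    return regs;
}
```

After `simulateMismatchedUpload()`, c4..c6 hold objTrans rows but c7 holds the first row of worldToObj, which matches what the PerfHUD capture looked like.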

This forces all uniforms to use 4 registers:
ConnectV main( VertexIn_PNT IN )
{
	ConnectV OUT;
	
	float3 planeCenter = float3(0,0,0);
	float3 planeNormal = float3(0,0,1);
	float3 rayVec      = float3(1,1,-1);
	float3 rayOrigin   = mul(objTrans, float4(IN.pos.xyz,1)).xyz;
	
	float planeD = dot(planeCenter, planeNormal);
	float cosAngle = dot(rayVec, planeNormal);
	float distToPlane = planeD - dot( rayOrigin, planeNormal );
	float distToProj = distToPlane / cosAngle;	
	
	float3 shadowPos = rayOrigin + (rayVec)*distToProj;
		
	
	shadowPos = mul( worldToObj, float4(shadowPos,1) ).xyz;

	OUT.hpos = mul( modelview, float4(shadowPos,1) );

	return OUT;	
}
But my shadows now are drawn at completely wrong positions. It seems that worldToObj is not quite capable of bringing back coordinates from world space into object space.
#2
02/26/2010 (5:53 am)
Alright, as I suspected worldToObj is not being calculated correctly. The problem is in m_matF_invert_to_C(const F32 *m, F32 *d), which is used by every method in MatrixSet which returns an inverted matrix (like MatrixSet::getWorldToObject()).

This function was obviously written based on m_matF_inverse_C(F32 *m), but whoever wrote it didn't test it properly. There are two bugs in it:

1) It's using the non-inverted matrix to transform the position:
m_matF_x_vectorF(m, temp, &temp[3]);
The correct call is:
m_matF_x_vectorF(d, temp, &temp[3]);

2) It's not setting the 4th matrix row. Since m_matF_inverse_C() does the inversion in-place, the target matrix is likely to have been initialized (who inverts uninitialized matrices?), but this isn't the case for m_matF_invert_to_C(), which inverts from source to destination (and the destination is probably NOT initialized). This causes the 4th row to retain whatever was in memory when the matrix was created. To fix, add this at the end of m_matF_invert_to_C():
d[12] = m[12];
d[13] = m[13];
d[14] = m[14];
d[15] = m[15];
#3
02/26/2010 (9:39 am)
Great catch, Manoel!
A few weeks ago I was working with the world inverse matrix and saw that something was wrong.
#4
03/02/2010 (7:25 am)
Holy cow! Thanks. Was driving myself crazy with this one. (Didn't even expect to find a fix here.)
#5
03/02/2010 (11:35 am)

Merged.

Thanks a lot Manoel.
#6
03/02/2010 (12:00 pm)
Quote: Seems that D3D is "optimizing" the number of constants used by my matrix4x4 uniforms based on how they are used in the shader.
Yes... the HLSL compiler will do that. James fixed a bug in GFXD3D9Shader::_getShaderConstants() in r28350:

case D3DXPC_MATRIX_ROWS :
case D3DXPC_MATRIX_COLUMNS :
{
   switch (constantDesc.RegisterCount) // WAS constantDesc.Columns
   {
      case 3 :
         desc.constType = GFXSCT_Float3x3;
         break;
      case 4 :
         desc.constType = GFXSCT_Float4x4;
         break;
   }
}

It seems Columns always reported 4, whereas RegisterCount reports the compiler-optimized 3.

Quote: The problem is in m_matF_invert_to_C(const F32 *m, F32 *d)
Ouch... that would be very bad. I'll have to look into that one immediately.

UPDATE: That Rene is fast! :)
#7
03/02/2010 (1:32 pm)
@Tom: wouldn't the 3x3 matrix still miss the position information? Unexpected shader behaviors would still happen when trying to mul() float4x4 matrices with float3 vectors.

I believe the shader compiler behavior is outside our reach, so maybe the best approach would be to issue console warnings to make people aware they need to use the float4 trick?
#8
03/02/2010 (1:54 pm)
@Manoel

Right. With the mul(float4x4,float3) case or mul(float4x4,float4( float3, 0)) the HLSL compiler will "optimize" the float4x4 constant to a float3x3. We cannot keep that from happening and the compiler is doing the right thing... the 4th column isn't needed.

If you want your transform to include position you have to do mul(float4x4,float4) or mul(float4x4,float4(float3, 1)).

We might be able to add a warning for this to the console output, but that's probably all we can do. At the end of the day it's a mistake in your shader.
#9
03/02/2010 (2:10 pm)
It took me some cursing at PerfHUD until I realized what was going on. A warning would be helpful for the less resourceful.