Difference between revisions of "Talk:Lesson 2 - disable effect"

From Bo3b's School for Shaderhackers

Revision as of 20:09, 21 August 2014

To discuss Lesson 2 - disable effect, please use Add topic in the upper right.

Optimal way to disable an effect

When making a real fix, it is worth considering factors beyond just the easiest way to do it, particularly because we will generally just copy/paste the code. Let's copy/paste code that works better in all cases.

It's also worth noting that sometimes we need to set the output to all 1s instead of 0, when a pixel shader (PS) is used as part of a masking operation. It's rare, but it happens.


When we disable an effect using the simplest case, the original shader is still fully executed, but the results are thrown away by our last instruction, where we assign zero to the output. There's no point in wasting GPU cycles on results that go unused, so we can do slightly better by returning early, so that the original code is never executed.

Secondly, maybe you or others want to play the game in 2D. With the simplest approach, the effect is still disabled in 2D, even though it doesn't cause problems there. We can make it so that the disable operation only happens in 3D.


...
//   Texture2D_0              s0       1

    ps_3_0
    def c1, 0, 2, -0.333299994, 9.99999997e-007
...
    dcl_texcoord2_pp v0.xyz
...
    dcl_2d s0
...
    texld r0, v4, s0
    add r1, r0.w, c1.z
...

Would become:

...
//   Texture2D_0              s0       1

    ps_3_0
    def c1, 0, 2, -0.333299994, 9.99999997e-007
def c200, 0, 0, 0.0625, 0
...
    dcl_texcoord2_pp v0.xyz
...
    dcl_2d s0
...
// Check if we are in 3D, and only disable the effect if separation != 0
texldl r30, c200.z, s0
if_ne  r30.x, c200.x
  mov oC0.xyzw, c200.wwww
  ret
endif

    texld r0, v4, s0
    add r1, r0.w, c1.z
...
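
For the rare masking case mentioned above, where the output needs to be all 1s rather than 0, the same early-out could be sketched as follows. This is only a sketch under the same assumptions as the block above: c200 and r30 are assumed to be unused by the original shader, and s0 is copied from the example (which sampler actually carries the stereo parameters depends on your setup). The only changes are a 1 placed in c200.y and the swizzle used for the output.

    def c200, 0, 1, 0.0625, 0
...
// In 3D (separation != 0), write 1 to every output channel and skip the original code
texldl r30, c200.z, s0
if_ne  r30.x, c200.x
  mov oC0.xyzw, c200.yyyy
  ret
endif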


notes: Not positive this is actually optimal just yet. Need to revisit.

We can switch to Shader Model 4.0, because DX10 video cards have been out forever (starting with the 8800 GTX), and we don't expect anyone to be limited to SM3.0. Switching to SM4.0 would let us use the discard instruction for simplicity. Consider switching to SM4, at the expense of no longer matching some of the other Helix info.