Question

I have the following code in the Fragment Shader:

precision lowp float;

varying vec2 v_texCoord;
uniform sampler2D s_texture;

uniform bool color_tint;
uniform float color_tint_amount;
uniform vec4 color_tint_color;

void main(){
    float gradDistance;
    vec4 texColor, gradColor;
    texColor = texture2D(s_texture, v_texCoord);
    if (color_tint){
        gradColor = color_tint_color;
        gradColor.a = texColor.a;
        texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
    }
    gl_FragColor = texColor;
}

The code works fine, but interestingly, even when every color_tint value I pass in is false, the code above still causes a serious performance drop compared to:

void main(){
    float gradDistance;
    vec4 texColor, gradColor;
    texColor = texture2D(s_texture, v_texCoord);
    if (false){
        gradColor = color_tint_color;
        gradColor.a = texColor.a;
        texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
    }
    gl_FragColor = texColor;
}

The latter reaches 40+ fps, while the first manages only about 18 fps. I double-checked that every color_tint value passed to the first shader is false, so the block should never be executed.

BTW, I am programming the above in Android 2.2 using GLES20.

Does anyone know what is wrong with the shader?


Solution

I am not an expert in fragment shaders, but I assume the second one is faster because the compiler can remove the entire if statement at compile time, since its condition is never true. In the first one, the compiler cannot know that color_tint is always false until runtime, so the shader has to check it and branch for every fragment. Branches can be expensive, especially on graphics hardware, which is typically designed to run many fragments in parallel through the same predictable instruction stream.

I suggest you try rewriting it to be branchless - Darren's answer has some good suggestions in that direction.
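
As a rough sketch of what a branchless version might look like (not tested on the asker's device, and assuming the application sets color_tint_amount to 0.0 whenever no tint is wanted, so the color_tint uniform is no longer needed), the built-in mix() can do the blend unconditionally:

```glsl
precision lowp float;

varying vec2 v_texCoord;
uniform sampler2D s_texture;

uniform float color_tint_amount; // assumed: the app passes 0.0 for "no tint"
uniform vec4 color_tint_color;

void main(){
    vec4 texColor = texture2D(s_texture, v_texCoord);
    // mix(a, b, t) = a * (1.0 - t) + b * t, so t = 0.0 leaves the texel unchanged.
    vec3 rgb = mix(texColor.rgb, color_tint_color.rgb, color_tint_amount);
    // The original blend always ends up with the texture's alpha, so pass it through.
    gl_FragColor = vec4(rgb, texColor.a);
}
```

Whether this actually beats the branch depends on the GPU and driver, so it is worth measuring both versions.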

OTHER TIPS

Branches are very slow in fragment shaders, so avoid them if possible. Use a color_tint_amount of 0 for no tint. Premultiply color_tint_color by the tint amount on the CPU to save a multiply per pixel. Pass 1.0 - color_tint_amount instead of color_tint_amount, so that 1.0 now means no gradColor. These shaders run millions upon millions of times a second, so you have to save every cycle you can.
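
One possible reading of these tips, as an untested sketch: the uniform names u_tintPremul and u_texWeight are made up for illustration, and the application is assumed to compute them once per draw call as color_tint_color.rgb * color_tint_amount and 1.0 - color_tint_amount respectively.

```glsl
precision lowp float;

varying vec2 v_texCoord;
uniform sampler2D s_texture;

// Hypothetical uniforms, filled in by the application (not part of the original shader):
uniform vec3 u_tintPremul;  // color_tint_color.rgb * color_tint_amount
uniform float u_texWeight;  // 1.0 - color_tint_amount, so 1.0 means "no tint"

void main(){
    vec4 texColor = texture2D(s_texture, v_texCoord);
    // One multiply and one add per fragment, with no branch.
    // With u_tintPremul = vec3(0.0) and u_texWeight = 1.0 the texel passes through unchanged.
    gl_FragColor = vec4(u_tintPremul + texColor.rgb * u_texWeight, texColor.a);
}
```

The alpha channel is taken straight from the texture, which matches the result of the original blend, where gradColor.a is set to texColor.a before mixing.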

Licensed under: CC-BY-SA with attribution