Why are 2D filters so slow?

I really want to use 2D filters, but they really make my GPU run hot (Nvidia GTX 465). I use the DLAA filter regularly, and that one is OK. Retinex and bloom would also be nice, but they heat up my GPU so much that the framerate drops little by little.

Is the issue that 2D filters need downscaling? Or could the filters themselves be improved somehow? I found two different bloom filters on the forum; which one would be faster? In my testing the second one seems a bit faster.

Martinsh’s bloom filter

uniform sampler2D bgl_RenderedTexture;

// Bright-pass settings: only pixels brighter than the threshold contribute to the glow.
const float BRIGHT_PASS_THRESHOLD = 0.5;
const float BRIGHT_PASS_OFFSET = 1.5;

// Blur settings: the sample offset in texture coordinates, clamped so it stays small.
#define blurclamp 0.002
#define bias 0.01

#define KERNEL_SIZE 3.0

// Extract and soften the bright parts of the image.
vec4 bright(vec2 coo)
{
    vec4 color = texture2D(bgl_RenderedTexture, coo);
    color = max(color - BRIGHT_PASS_THRESHOLD, 0.0);
    return color / (color + BRIGHT_PASS_OFFSET);
}

void main(void)
{
    vec2 texcoord = gl_TexCoord[0].st;
    vec2 blur = vec2(clamp(bias, -blurclamp, blurclamp));

    // 5x5 box blur of the bright pass: 25 texture reads per fragment.
    vec4 col = vec4(0.0);
    for (float x = -KERNEL_SIZE + 1.0; x < KERNEL_SIZE; x += 1.0)
    {
        for (float y = -KERNEL_SIZE + 1.0; y < KERNEL_SIZE; y += 1.0)
        {
            col += bright(texcoord + vec2(blur.x * x, blur.y * y));
        }
    }
    col /= ((KERNEL_SIZE + KERNEL_SIZE) - 1.0) * ((KERNEL_SIZE + KERNEL_SIZE) - 1.0);

    // Add the blurred glow on top of the original image.
    gl_FragColor = col + texture2D(bgl_RenderedTexture, texcoord);
}

Gh123man’s bloom filter

uniform sampler2D bgl_RenderedTexture;
uniform sampler2D bgl_LuminanceTexture;   // declared but never used

void main()
{
    vec4 sum = vec4(0.0);
    vec4 bum = vec4(0.0);
    vec2 texcoord = vec2(gl_TexCoord[0]);
    int i;
    int j;

    // Two small asymmetric blurs (8 taps each), offset horizontally and vertically.
    for (i = -2; i < 2; i++)
    {
        for (j = -1; j < 1; j++)
        {
            sum += texture2D(bgl_RenderedTexture, texcoord + vec2(-i, j) * 0.002) * 0.40;
            bum += texture2D(bgl_RenderedTexture, texcoord + vec2(j, i) * 0.003) * 0.40;
        }
    }

    // Cubing the blurred sums acts as a crude bright pass before the glow is added.
    // Note: the original had no else branch, which left gl_FragColor undefined when
    // the test failed; with low-dynamic-range colours (0..1) it is always true anyway.
    if (texture2D(bgl_RenderedTexture, texcoord).r < 2.0)
    {
        gl_FragColor = sum * sum * sum * 0.0080 + bum * bum * bum * 0.0080
                     + texture2D(bgl_RenderedTexture, texcoord);
    }
    else
    {
        gl_FragColor = texture2D(bgl_RenderedTexture, texcoord);
    }
}

As far as I know the filters are raw; they don't use downsampling, and that could be the reason. Right now we need downsampling, and fortunately that is part of the Harmony branch.

Indeed, as stated by BlendingBGE, a 2D filter has to process as many pixels as there are on screen, which means the processing time increases linearly with the resolution.

And there is no way to downsample from the Python code (while waiting for Harmony)?

Simple: no.

Actually, it might be possible. You can grab the screen's texture with the bge.texture module, and you can adjust the capture size via Python. So it should be possible to grab that texture at a smaller resolution, apply it to a plane, and use a GLSL shader on it the same way you would use a 2D screen filter. You could also alter the texture itself in Python. That would be slow, but you could do a huge amount with it, since you have the full Python library available.
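
Something along these lines might work. Treat it purely as an illustration: the object name "ScreenPlane", the placeholder image "placeholder.png", and the 128x128 capture size are all invented, and it uses ImageRender (a low-resolution re-render of the scene from the active camera, since that is the source whose capture size can be set directly with capsize) rather than a grab of the already-rendered frame. The blur or bloom GLSL would then sit on the plane's material instead of in a 2D filter.

# Illustrative sketch only: assumes a plane named "ScreenPlane" textured with "placeholder.png".
from bge import logic, texture

def setup(cont):
    scene = logic.getCurrentScene()
    plane = scene.objects["ScreenPlane"]

    # Re-render the scene from the active camera at a reduced resolution.
    source = texture.ImageRender(scene, scene.active_camera)
    source.capsize = [128, 128]   # capture resolution (made-up value)

    # Swap the plane's placeholder image for the captured texture.
    mat_id = texture.materialID(plane, "IMplaceholder.png")
    tex = texture.Texture(plane, mat_id)
    tex.source = source

    logic.lowres_tex = tex        # keep a reference so the texture is not freed

def update(cont):
    # Run every frame: re-capture and upload the low-resolution image.
    if hasattr(logic, "lowres_tex"):
        logic.lowres_tex.refresh(True)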

I tried this for an hour or two. My down-sampled version used a 3D GLSL shader over a sized-down texture. I had a slight gain in FPS, but nothing major. If there were some way to run the shader on a down-sampled number of fragments, it would have worked. I guess that's the core problem, though, hah. I also tried a Pythonic GLSL shader, but it stressed logic too much, even for a slight effect.

However, I think I might have figured something out. It basically involves doing a full shader on pixels that are on a grid, and disregarding the shader on pixels that aren’t on a grid. The disregarded pixels would instead copy the closest grid pixel’s color. It might not look so great, but it might be worth the effort. I’ll try it out and post my results later.

@SolarLune
A long time ago, someone posted a very fast bloom shader that worked kind of like what you're describing. It used RenderToTexture to render out a low-resolution capture of the scene and displayed it on a plane in an overlay scene, with a GLSL shader that blurred the image. The plane was set to the Add blend mode, and since it was in an overlay scene, it was drawn on top of the main scene and added its bloom contribution to the final image. It was quite a brilliant concept and worked really fast. However, it involved rendering the scene twice, which could be a bottleneck for games with too many render batches.
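
If anyone wants to experiment with that, the Python side is essentially the same ImageRender-plus-capsize pattern as the sketch above, just driven from the plane in the overlay scene. This is untested; the scene, object, and image names are invented, and the Add blend mode and the blur GLSL live on the plane's material, which is set up in Blender rather than in the script.

# Runs on the blurred plane inside the overlay scene (added e.g. by a Scene actuator).
from bge import logic, texture

def setup_bloom_plane(cont):
    plane = cont.owner

    # Find the main scene so it can be re-rendered at low resolution.
    main = next((s for s in logic.getSceneList() if s.name == "Main"), None)  # invented name
    if main is None:
        return

    src = texture.ImageRender(main, main.active_camera)
    src.capsize = [128, 128]      # low-resolution capture of the scene

    tex = texture.Texture(plane, texture.materialID(plane, "IMplaceholder.png"))
    tex.source = src
    logic.bloom_tex = tex         # keep a reference alive

def update(cont):
    if hasattr(logic, "bloom_tex"):
        logic.bloom_tex.refresh(True)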

Now, if you really wanted to avoid rendering the scene twice, you could render the scene into a texture, then display that texture on a plane and call renderToTexture again to render a low-res image of this plane. Then you'd have both high-res and low-res versions of the framebuffer/texture. You could then display the final composite on a plane with a GLSL shader, accessing both the high-res and low-res renders from before to create the final image. This way you'd only need to render the scene once, and you could do all the filter effects with GLSL. The downside is that RenderToTexture is a bit slow.
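
A rough sketch of that chain, with the same caveats (untested, invented names); the composite plane's GLSL material would sample the low-res texture, and the full-resolution grab could be fed into a second texture slot the same way.

# Illustrative sketch of the "render once, then downscale" chain (invented names).
from bge import logic, texture

def setup(cont):
    scene = logic.getCurrentScene()
    hires_plane = scene.objects["HiResPlane"]       # full-size plane showing the frame grab
    downscale_cam = scene.objects["DownscaleCam"]   # camera framed so it only sees HiResPlane

    # 1) Grab the already-rendered frame from the framebuffer (no full second render).
    frame_grab = texture.ImageViewport()
    frame_grab.whole = True
    hires_tex = texture.Texture(hires_plane, texture.materialID(hires_plane, "IMhires.png"))
    hires_tex.source = frame_grab

    # 2) Re-render just that plane from its own camera at a small capsize
    #    to get the low-resolution copy.
    lowres_render = texture.ImageRender(scene, downscale_cam)
    lowres_render.capsize = [128, 128]
    composite_plane = scene.objects["CompositePlane"]
    lowres_tex = texture.Texture(composite_plane, texture.materialID(composite_plane, "IMlowres.png"))
    lowres_tex.source = lowres_render

    # Keep references alive so the textures are not freed.
    logic.hires_tex = hires_tex
    logic.lowres_tex = lowres_tex

def update(cont):
    if hasattr(logic, "hires_tex"):
        logic.hires_tex.refresh(True)
        logic.lowres_tex.refresh(True)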

Sorry if you've already tried the above concept; I couldn't quite tell.

I did get the bloom effect working pretty easily. The problem comes in with the down-sampled GLSL shader: even though the texture is down-sampled, I think the shader is still doing a screen's worth of work on it, which defeats the point of using a down-sampled texture in the first place. Here's the example. The Python shader worked at one point, but I broke it while trying to get it to run faster.

DownsampleTest.blend (1 MB)

The Python-script 2D filter runs at 10 FPS fullscreen on my Nvidia GTX 465… :confused:

https://dl.dropbox.com/u/11542084/flare_playground_3.blend

What about an approach like the one in Martinsh's flare filter? It uses a plane in front of the camera with a GLSL shader that spawns the glare and flares according to highlights in the scene, without actually capturing the frame as a texture. Couldn't we do the same, but with a shader that highlights bright pixels, for a bloom or glow effect?

In the example above, if you try setting the threshold very low, and the gain very low, you get a diffuse glare, or some kind of bloom effect. :rolleyes:

Odd. Are you getting any errors? The file was made in Blender 2.65, by the way. Also, the Python shader doesn't work; it's just a white texture. I did get it to work at one point, but it was pretty much only useful on really small textures (e.g. 16x16). If I could find a faster way to iterate through the colors, though…

As for Martinsh's shader, lowering the gain and threshold hurts the FPS for me (not sure why; I guess because the threshold is lower, it's doing more work?).

No errors.

Yes, I guess it has to do more work. But in my experience it is faster than a bloom filter :confused:

For loops slow down 2D screen filters somewhat. Hand-typing the statements out instead of looping can make a filter run a bit faster.