Geometry Shader using OpenGL Wrapper (bgl) [fix for Intel]

The geometry shader uses the real, existing objects as a reference, then adds the graphics mesh

(so the physics objects all exist)

the shader is just adding its instanced mesh wherever the physics items already are.

http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter03.html

I guess each object would still need to update, so you could not batch the render calls?

I hopefully fixed the Intel bug. I have tested it with an Intel HD Graphics 4000, but sometimes it still crashes. It seems that Intel attaches the shaders randomly.

It is very important for Intel and ATI/AMD that the latest drivers are installed. Legacy drivers will not work (no geometry shader extension).

Thank you! :slight_smile:

It would be nice if someone (who has the time, the skills and the motivation) could make a little demo file showing how to create particles. Maybe I will try it myself in the next days, but I don't know much about GLSL.
But I think in theory it's not so difficult to give the created geometry random motion by setting/changing the position or something like that. I will see what I can get.
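A rough, untested sketch of that idea (the Time uniform and the jitter amount are just placeholders): a geometry shader, kept as a Python string in the BGE style, that moves each emitted triangle by a pseudo-random, time-varying offset.

# Untested sketch: GLSL geometry shader source stored as a Python string.
# "Time" is an assumed uniform; the jitter is a cheap hash of gl_PrimitiveIDIn.
geometry_shader_source = """
#version 150
layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

uniform float Time;

float rand(float seed) {
    // cheap pseudo-random value in [0, 1)
    return fract(sin(seed * 12.9898) * 43758.5453);
}

void main() {
    float id = float(gl_PrimitiveIDIn);
    vec4 jitter = vec4(rand(id) - 0.5, rand(id + 7.0) - 0.5, rand(id + 13.0) - 0.5, 0.0);
    for (int i = 0; i < 3; i++) {
        // move the whole triangle by a small, time-varying random offset
        gl_Position = gl_in[i].gl_Position + jitter * sin(Time) * 0.2;
        EmitVertex();
    }
    EndPrimitive();
}
"""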

Using your wrapper I confirmed other shader stages available for use with it:

 
GL_VERTEX_SHADER - 0x8B31
GL_TESS_EVALUATION_SHADER - 0x8E87
GL_TESS_CONTROL_SHADER - 0x8E88
GL_GEOMETRY_SHADER - 0x8DD9
GL_FRAGMENT_SHADER - 0x8B30
GL_COMPUTE_SHADER - 0x91B9
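For reference, a minimal sketch of compiling a shader for one of these stages and checking the log. This uses PyOpenGL for illustration rather than the wrapper, so the exact calls here are an assumption:

# Untested sketch with PyOpenGL; any of the stage enums above can be passed as 'stage'.
from OpenGL.GL import (glCreateShader, glShaderSource, glCompileShader,
                       glGetShaderiv, glGetShaderInfoLog,
                       GL_COMPILE_STATUS, GL_GEOMETRY_SHADER)

def compile_stage(stage, source):
    shader = glCreateShader(stage)      # e.g. GL_GEOMETRY_SHADER (0x8DD9)
    glShaderSource(shader, source)      # PyOpenGL accepts a Python string here
    glCompileShader(shader)
    if not glGetShaderiv(shader, GL_COMPILE_STATUS):
        raise RuntimeError(glGetShaderInfoLog(shader))
    return shader

# geo_shader = compile_stage(GL_GEOMETRY_SHADER, geometry_shader_source)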

Ah, my apologies. I have not looked at the code in that blend yet.

I used the same method to write a basic tessellation shader, but it doesn't work. Actually, I don't know why.

Do we need to specify these:


glPatchParameteri(GL_PATCH_VERTICES, GLint value);
glDrawArrays(GL_PATCHES, GLint first, GLsizei count);
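For example, a hypothetical PyOpenGL call, assuming triangle patches and an already bound vertex buffer (the vertex_count value is a placeholder):

from OpenGL.GL import glPatchParameteri, glDrawArrays, GL_PATCH_VERTICES, GL_PATCHES

vertex_count = 30 * 3                      # placeholder: 30 triangle patches worth of vertices
glPatchParameteri(GL_PATCH_VERTICES, 3)    # each patch is fed 3 control points
glDrawArrays(GL_PATCHES, 0, vertex_count)  # draw patches starting at vertex 0 of the bound buffer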

I haven't tested if the compute shader will work. It needs at least GLSL 4.30.

I just tested a simple compute shader. It throws two different errors.

The first, where ‘x = 8’ could be whatever you have specified for 2D or 3D (i.e. ‘local_size_y = 8’ or ‘local_size_z = 1’):
error C3008: unknown layout specifier ‘local_size_x = 8’

and:
error C7594: OpenGL requires declaring a layout qualifier for work group size before using the builtin constant gl_WorkGroupSize

The last one makes sense if it is not taking the layout qualifiers.


EDIT:
Actually I got it to work with this simple shader code (although some uniforms fail…):

#version 430
writeonly uniform image2D Texture1;                // output image the shader writes into
uniform float Time;
layout (local_size_x = 16, local_size_y = 16) in;  // 16x16 invocations per work group
void main() {
    ivec2 storePos = ivec2(gl_GlobalInvocationID.xy);                                 // pixel this invocation writes
    float localCoef = length(vec2(ivec2(gl_LocalInvocationID.xy)-ivec2(8))/8.0);      // distance from the work-group centre
    float globalCoef = sin(float(gl_WorkGroupID.x+gl_WorkGroupID.y)*0.1 + Time)*0.5;  // animated per-work-group value
    imageStore(Texture1, storePos, vec4(1.0-globalCoef*localCoef, 0.0, 0.0, 0.0));    // write the red channel
}

I think you could use bgl to create a geometry batch … you might need to access the mesh data using
http://www.tutorialsforblender3d.com/GameModule/ClassKX_MeshProxy_2.html the KX_MeshProxy … then read the primitives into a Python copy of the mesh data, and use bgl to create a VBO (a rough sketch follows below).

edit: I’m sure there is a way to instance purely using shaders, so I think it’s possible … I don’t know how much you can do with bgl yet
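A rough sketch of that idea (untested in the BGE; PyOpenGL is used for the buffer calls since bgl may not expose them, and the material index 0 and owner object are assumptions):

# Untested sketch: read vertices from a KX_MeshProxy and upload them to a VBO.
from bge import logic
from OpenGL.GL import *
import ctypes

own = logic.getCurrentController().owner
mesh = own.meshes[0]

verts = []
for i in range(mesh.getVertexArrayLength(0)):   # vertices of material slot 0
    v = mesh.getVertex(0, i)
    verts.extend(v.getXYZ())                    # x, y, z as floats

data = (ctypes.c_float * len(verts))(*verts)    # pack into a C float array
vbo = glGenBuffers(1)
glBindBuffer(GL_ARRAY_BUFFER, vbo)
glBufferData(GL_ARRAY_BUFFER, ctypes.sizeof(data), data, GL_STATIC_DRAW)
glBindBuffer(GL_ARRAY_BUFFER, 0)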

The compute shader works on my computer, but I don't understand where the Texture1 resource is allocated or how to read it into a pixel shader. Is there some change to the Python? Here is what I have (it just shows a grey triangle), based on source already posted here. The program "geometry1" actually contains a compute shader and loads and runs it; the console prints "ok" that the compute shader runs, but I can't see anything. Can anyone help with how to set this up?
ComputeShaderExample V0.2 (5).blend (116 KB)

[quote="Nerai, post:46, topic:634103"]

I think you could use bgl to create a geometry batch …you might need to access the mesh data using
http://www.tutorialsforblender3d.com/GameModule/ClassKX_MeshProxy_2.html the KX_MeshProxy … then read the primitives into a python version of the mesh data, and use the bgl to create a vbo

edit: I’m sure there is a way to instance purely using shaders, so I think it’s possible … I don’t know how much you can do with bgl yet[/QUOTE]

Understanding the graphics pipeline would help out a lot in this situation:

  • The CPU sends information to the GPU for processing (vertex position and uniform variables)
  • The GPU processes the data and sends it to the next shader stage (tess control, tess eval, geometry, fragment)
  • The GPU outputs the processed fragment information to the framebuffer (normally the back buffer)

When the back buffer is filled, it is swapped to the front buffer which is displayed to the screen.
This means that the CPU has no idea that the GPU has created instances (much less the position of the instances).

That being said, I know of two ways of creating instanced physics on the GPU, both of which use compute shaders. Compute shaders, unlike the other shader stages, take information from the CPU, process it, and can output results back to the CPU.

If the instances are created with a compute shader, it could output the instanced information back to the CPU. The CPU could then create physics for it; however, you would not gain much performance by doing this.

The second would be to calculate the physics on the GPU. If compute shaders were used to calculate physics, then they could instance the object (same as before) AND calculate the physics for it. This way would improve performance, but only marginally.

Performance, of course, is relative. It would depend on the speed of the targeted hardware.

Unfortunately, neither of these options is available to developers right now. The BGE uses the Bullet physics engine calculated on the CPU, and compute shaders are not fully working right now (VBOs are also not fully supported).

I'm more interested in getting the compute shader to work on something useful with meaningful data … in the example above, the Texture1 resource could perhaps be read back to the CPU and then used in a second shader pass for particles or something.

edit: … perhaps bgl will update with some method of loading extensions … I think compute shaders could be improved with pynodes, although it is a bit far-fetched for me because I don't fully understand Blender's architecture

edit 2: I also don’t fully understand the compute shaders to be honest

http://rastergrid.com/blog/2010/10/opengl-4-0-mountains-demo-released/

Is there any way to get the whole pipeline for static meshes to live on the GPU?
In the article they say they do LOD, occlusion and instancing all on the GPU.
With Bullet OpenCL, can position and rotation data live as a VBO on the GPU?

I don't know about physics on the GPU. NVIDIA PhysX and newer versions of Bullet use the GPU for a lot of physics processing, so upgrading the Bullet version in Blender might be the best plan. How I'd approach this: find the inputs/outputs of the Bullet API in the Blender source and compare them to the new OpenCL version, then create a wrapper file that lets you either switch versions or simply patch over inconsistencies by keeping the Blender side of the API intact. Still, I'd expect some issues with replicating behaviour between the two builds if anything confusing had to be added to the patch; I would think they would have upgraded already if it were a simple matter.

VBOs are not currently available to the BGE as far as I know. The advantage of using VBOs stems from the fact that they reside in GPU VRAM for faster access.

Although I have not looked into occlusion culling with shaders, I can confirm that LOD and instancing can be done with shaders (mesh-wise). I use both LOD and instancing on the GPU with my procedural grass shader. The problem is that there is currently no way to get the transformed (or generated) data back from the GPU, since the output is the display.

There's a file called RAS_StorageVBO.cpp in the code (it's in the ge_oglrasterizer folder); it appears to be a VBO implementation. I don't know how to access it from Python … I think it might be possible to add some Python bindings in a test build.

edit: also I am not sure how the BGE pipeline actually works atm, I will need to read/trace through the logic unless someone can explain - it collects objects into buffers and crunches polygons somehow

if you can get it working… with a level assembly kit like this…



Wrectified can be huge and detailed…

also any game really, but I always intended Wrectified to be the tech demo that made people take the bge seriously.

using this level of texture, and normal, diffuse and reflection,

I got 225,000 faces before I dipped from 60 fps to 59 fps.

Alienware Alpha

We have wandered a bit off the topic of geometry shaders in search of VBOs. Perhaps we should create a new thread for this and not hijack HG1's post any more than we already have…

Good point … thanks for the cool shaders and nice hack …

BTW, on the subject of hacking the BGE OpenGL, I looked through some older blend files and found that you can also write:


import OpenGL
from OpenGL.GL import *    # core OpenGL functions (PyOpenGL)
from OpenGL.GLU import *   # GLU utility helpers
import glew
from glew import *         # GLEW bindings for loading extensions


glewInit()                 # initialise GLEW so extension functions can be used

I am intrigued. Will this help with GPU hardware mesh instancing?
I know literally nothing about OpenGL.

Want to open a new thread? (for HG1)

The inclusion of GLU and GLEW is helpful for shading. The GLU library is just a utility library for OpenGL, but GLEW is more interesting: it is an acronym for GL Extension Wrangler … that should add some powerful functionality … and so might be useful for geometry shaders.

[QUOTE=Yarbrough08;2870521]
Do we need to specify these:

glPatchParameteri(GL_PATCH_VERTICES, GLint value);
glDrawArrays(GL_PATCHES, GLint first, GLsizei count);
[/QUOTE]

Yes, you need these OpenGL functions too to get the tessellation shader working.

[QUOTE]I just tested a simple compute shader. It throws two different errors.[/QUOTE]

You also need to call some extra OpenGL functions to define the threads (work groups) for the compute shader.


glDispatchCompute(512 // 16, 512 // 16, 1)  # 512^2 threads in blocks of 16^2

For both of these, bgl doesn't have the needed OpenGL functions, so you need to use another OpenGL wrapper.
Maybe that way it is possible to get the compute shader working, but I don't have much hope for the tessellation shader.
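For what it's worth, here is a rough, untested sketch (using PyOpenGL) of the missing pieces for the compute shader posted earlier: allocating the output image, binding it to an image unit, and dispatching. The 512x512 size and the variable name 'program' (the already linked compute program) are assumptions.

from OpenGL.GL import *

# create the output image the shader writes into (Texture1)
tex = glGenTextures(1)
glBindTexture(GL_TEXTURE_2D, tex)
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST)
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST)
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, 512, 512, 0, GL_RGBA, GL_FLOAT, None)

glUseProgram(program)
glBindImageTexture(0, tex, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_RGBA32F)  # image unit 0
glUniform1i(glGetUniformLocation(program, "Texture1"), 0)              # point Texture1 at unit 0
glDispatchCompute(512 // 16, 512 // 16, 1)                             # 512^2 threads in 16^2 blocks
glMemoryBarrier(GL_TEXTURE_FETCH_BARRIER_BIT)                          # make the writes visible to sampling

# 'tex' can now be bound as a normal texture and sampled in a later draw pass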

Added a new version in the first post.