How to optimize a Minecraft-like Voxel system?


(BluePrintRandom) #21

what about having a model for each 4x4 voxel pattern

like

one model

0000 0000 0000 0000
1000 0000 0000 0000
0000 0000 0000 0000

maybe some way to generate them automatically?

and have it all use UV-mapped faces you can change

so 1 object represents 4 * 4 * 4 blocks?

so the mesh are pre-generated?

edit :

o poo 

even a 3x3x3 block has 2^27 = 134,217,728 filled/empty combinations (and 4x4x4 has 2^64 ≈ 1.8 × 10^19), so pre-building every mesh isn't practical
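A quick illustration of the scale of that idea (an editorial sketch, not code from the thread): each 4x4x4 occupancy pattern can be encoded as a 64-bit integer key, and counting the possible keys shows why pre-generating every mesh blows up.

# Hypothetical sketch: encode a 4x4x4 occupancy grid as a 64-bit bitmask key.
# "blocks" is assumed to be a set of (x, y, z) tuples marking filled cells.

def occupancy_key(blocks, size=4):
    key = 0
    for x in range(size):
        for y in range(size):
            for z in range(size):
                if (x, y, z) in blocks:
                    key |= 1 << (x * size * size + y * size + z)
    return key

print(2 ** (3 * 3 * 3))   # 134217728 possible 3x3x3 patterns
print(2 ** (4 * 4 * 4))   # 18446744073709551616 possible 4x4x4 patterns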

(statikregimen) #22

BluePrintRandom -
Not really following you, sorry =/

~
Here is an example of the above code, sampling the noise directly rather than caching it… the hit to generation time is smaller than I expected:

from bge import logic, types
from opensimplex import OpenSimplex
#from mathutils import Vector

seed = 123
chunkSize = 16
chunkViewDistance = 4
noiseScale = 0.1
threshold = -0.1
blockSpacing = 1

noise = OpenSimplex(seed).noise3d

s = logic.getCurrentScene()


class Chunk:
    def __init__(self, x=0, y=0, z=0):
        self.x, self.y, self.z = x, y, z

    def neighbors(self, x, y, z):
        # Return the 6 neighboring blocks (out-of-chunk neighbors count as empty)
        return [
            self[x+1,y,z] if x+1 < chunkSize else 0,
            self[x,y+1,z] if y+1 < chunkSize else 0,
            self[x,y,z+1] if z+1 < chunkSize else 0,
            self[x-1,y,z] if x-1 >= 0 else 0,
            self[x,y-1,z] if y-1 >= 0 else 0,
            self[x,y,z-1] if z-1 >= 0 else 0
        ]

    def render(self):
        # Add an object for every block that has at least one empty neighbor
        objects = []

        for pos, value in self:
            if not all( self.neighbors(*pos) ):
                o = s.addObject("Cube.002")
                o.worldPosition = pos
                objects.append(o)

        return objects

    def __iter__(self):
        for x in range(chunkSize):
            for y in range(chunkSize):
                for z in range(chunkSize):
                    v = self[x,y,z]

                    if v:
                        yield ((x,y,z), v)
    
    def __getitem__(self, i):
        # Sample the noise at this block's offset; returns a truthy value
        # for solid blocks and None (implicitly) for air
        n = noise(
            (i[0]+self.x)*noiseScale,
            (i[1]+self.y)*noiseScale,
            (i[2]+self.z)*noiseScale
        )

        if n > threshold:
            return n

    
chunk = Chunk()

#chunk.render()
b = types.KX_BatchGroup( chunk.render() )  # batch the added objects into one group to cut draw calls

(wkk.py) #23

How about caching the values?

Initialise the blocks to None; in __getitem__ you could check whether a block has been generated yet (it is still None if not), and add a visible field to the voxel data that you precompute as soon as you generate the voxel.

My random guess is that computing the neighbors for each voxel in render() is what is heavy, but I don't really know.
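A minimal sketch of that caching idea, reusing the Chunk layout and the noise(), chunkSize, noiseScale and threshold names from the snippet above (the _cache dict and visible() helper are just illustrative):

class CachedChunk:
    def __init__(self, x=0, y=0, z=0):
        self.x, self.y, self.z = x, y, z
        self._cache = {}  # (x, y, z) -> noise value for solid blocks, None for air

    def __getitem__(self, i):
        # generate a block only on first access, then reuse the cached value
        if i not in self._cache:
            n = noise(
                (i[0]+self.x)*noiseScale,
                (i[1]+self.y)*noiseScale,
                (i[2]+self.z)*noiseScale
            )
            self._cache[i] = n if n > threshold else None
        return self._cache[i]

    def visible(self, x, y, z):
        # a block is visible if any of its 6 neighbors is air or outside the chunk
        for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)):
            nx, ny, nz = x+dx, y+dy, z+dz
            if not (0 <= nx < chunkSize and 0 <= ny < chunkSize and 0 <= nz < chunkSize):
                return True
            if self[nx, ny, nz] is None:
                return True
        return False

Precomputing visible() right after generation, and storing the flag alongside the value, would then let render() skip the neighbor sampling entirely.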


(statikregimen) #24

Sampling the neighbor blocks is certainly very expensive. In the example before last, I do cache all the values and it is noticeably faster (without taking measurements - just observing - it looks about half a second faster to generate a 16x16x16 chunk). I’m not sure if I understand your suggestion, so some pseudocode or more detail would be appreciated! I do feel like there is some math magic that is eluding me here…


(BluePrintRandom) #25

I did this in the past.

it’s a huge infinite plane.

I use faces of a mesh to skin invisible cube objects,

this was quite some time ago and may be possible to do more efficiently.

this is a huge grid of points, and the player gets the points in a radius with a kdtree of all grid points in the scene.

I am thinking it would be better now to have a few trees of differing density:

1 kdtree whose entries are units of space, like 50x50 chunks

then inside each entry of that tree is another kdtree that holds the blocks

this way we can use the results of the coarse-grained tree to look up the smaller kdtree

import mathutils
from bge import logic

own = logic.getCurrentController().owner  # assuming this runs from a Python controller

KDChunks = {}
Chunks = []
index = 0
for X in range(100):
    for Y in range(100):
        for Z in range(10):
            Chunk = [ X*50 - (5000/2), Y*50 - (5000/2), Z*50 - (500/2) ]

            ChunkParts = []
            KDChunks[index] = mathutils.kdtree.KDTree(50*50*50)
            index += 1
            index2 = 0
            for X1 in range(50):
                for Y1 in range(50):
                    for Z1 in range(50):
                        micro = [ Chunk[0] - 25 + X1, Chunk[1] - 25 + Y1, Chunk[2] - 25 + Z1 ]
                        ChunkParts.append(micro)
                        KDChunks[index-1].insert(micro, index2)
                        index2 += 1
            KDChunks[index-1].balance()
            filledLocations = []
            Chunks.append( (Chunk, ChunkParts, filledLocations) )

kdMaster = mathutils.kdtree.KDTree(len(Chunks))
index = 0
for entry in Chunks:
    kdMaster.insert(entry[0], index)
    index += 1

kdMaster.balance()

own['KDMaster'] = kdMaster
own['KDChunks'] = KDChunks
own['Chunks'] = Chunks

(BluePrintRandom) #26

Basically we make a big grid, and the big grid is filled with smaller grids.

we use KDTree.find_range() to get every large-grid entry in a radius, then sweep through those to get every small-tree entry in a radius.

each kdtree query returns its entries in order of distance.

radius = 25
nearChunks = own['KDMaster'].find_range(player.worldPosition, radius)

for chunk_hit in nearChunks:
    # each find_range() result is a (position, index, distance) tuple
    Chunk = own['Chunks'][chunk_hit[1]]
    KDSmall = own['KDChunks'][chunk_hit[1]]
    Micro = KDSmall.find_range(player.worldPosition, radius)
    for entry in Micro:
        # entry[0] is a grid location one would fill or not
        if entry[0] not in Chunk[2]:
            # this voxel is not filled yet
            added = own.scene.addObject('block', own, 0)
            added.worldPosition = entry[0]
            # we can also sample perlin noise here to fill/not fill / decide what type of block we have here
    
    

(BluePrintRandom) #27

I know this can work, as it’s how I edit vertices in real time in a radius.

however you will have to play with the radius, chunk distance, number of entries per chunk, etc.

like a grid every 10 units with 10x10x10 blocks in it

edit: working test file included
tryCraft.blend (515.2 KB)

Note - this does not yet track whether a block has been destroyed or replaced.


(Lostscience) #28

Have you played with the block easy demo? It might give you ideas.


(statikregimen) #29

What is block easy demo?

BluePrintRandom-
I ran the demo really quickly earlier but couldn’t immediately figure out how to move around, so I’ll have a closer look later. It did seem to be very slow at generating blocks though.

~
I found that a small modification is all it takes to try one idea on my current code: use planes to draw only the visible faces, then batch-group the resulting game objects. This certainly reduced the number of faces but did not improve framerates much, if at all. I was pleased that it did not seem to impact chunk generation time, though.

One annoying problem, however, is that my script appears to be getting run twice no matter how I trigger it, which instantly halves performance.

I didn’t realize we can attach files. Neat.

Update: Not sure what happened, but I managed to make UPBGE crash. After restarting, it was no longer executing the script twice, and now performance has definitely increased with the current code. Next, I’m going to try to throw a real-time chunk generator together and see how hard I can push it.
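For reference, a rough sketch of the face-planes idea described above, assuming the chunk object and chunkSize from the earlier snippet and six pre-rotated plane objects on a hidden layer (the "Face.PosX" etc. names are hypothetical):

from bge import logic, types

scene = logic.getCurrentScene()

FACES = {
    ( 1, 0, 0): "Face.PosX",
    (-1, 0, 0): "Face.NegX",
    ( 0, 1, 0): "Face.PosY",
    ( 0,-1, 0): "Face.NegY",
    ( 0, 0, 1): "Face.PosZ",
    ( 0, 0,-1): "Face.NegZ",
}

def render_faces(chunk):
    added = []
    for (x, y, z), value in chunk:
        for (dx, dy, dz), name in FACES.items():
            # only add a face where the neighbor is air / outside the chunk
            nx, ny, nz = x + dx, y + dy, z + dz
            outside = not (0 <= nx < chunkSize and 0 <= ny < chunkSize and 0 <= nz < chunkSize)
            if outside or not chunk[nx, ny, nz]:
                o = scene.addObject(name)
                # the 0.5 offset pushes the plane onto the matching cube face
                o.worldPosition = (x + dx * 0.5, y + dy * 0.5, z + dz * 0.5)
                added.append(o)
    return added

batch = types.KX_BatchGroup(render_faces(chunk))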


(Lostscience) #30

it is in the blendfile on this page.


(smilebags) #31

Definitely use data structures as primitive as you can for the generation side of things; this will make it much faster. If you know enough about Python, you can get almost the entire way there with just some basic Python classes. Voxel generation and visibility culling both seem like relatively simple tasks in Python, so do as much as you can there before creating Blender cubes or anything.
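As an illustration of that idea (a sketch with made-up names, not code from this thread): keep the chunk as a flat bytearray and do the visibility culling in plain Python before any BGE objects are created.

# Sketch: pure-Python chunk storage and culling, no bge imports needed.
# SIZE and the solid/air encoding (0 = air) are assumptions for the example.

SIZE = 16

def idx(x, y, z):
    # map 3D coordinates to an index into the flat bytearray
    return (x * SIZE + y) * SIZE + z

def is_solid(data, x, y, z):
    # outside the chunk counts as air
    if not (0 <= x < SIZE and 0 <= y < SIZE and 0 <= z < SIZE):
        return False
    return data[idx(x, y, z)] != 0

def visible_blocks(data):
    # return positions of solid blocks that have at least one air neighbor
    out = []
    for x in range(SIZE):
        for y in range(SIZE):
            for z in range(SIZE):
                if not is_solid(data, x, y, z):
                    continue
                if not all(is_solid(data, x+dx, y+dy, z+dz)
                           for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0),
                                              (0,-1,0), (0,0,1), (0,0,-1))):
                    out.append((x, y, z))
    return out

# usage: data = bytearray(SIZE ** 3), fill it from noise, then hand
# visible_blocks(data) to whatever adds the game objects.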


(jesusmora) #32

ok, let me help you with the algorithm parts.
first you need to divide your script in two. the first script should create the matrix and STORE it in a dictionary (creating variables takes time, you should not do it every frame). the second one, the “update” script, has to run every frame, so it has to be simpler. it has to load the dictionary and read/update it, as well as create the geometry.
i got experience from making a 2d map using a bitmask, checking all the neighbour points to set a single “block”, the same way as a gpu shader, one by one.


the way i did it was by having parallel matrices. one matrix has the data on what the voxel is made of: say one point is dirt, that would be 1; air would be 0, so it gets immediately ignored; and so on. with blocks you don’t need borders, but you can benefit from this technique by detecting the border: if a block is surrounded by blocks on every side, it is ignored, so in one of the parallel matrices that same position is 0, representing an ABSOLUTELY hidden block, while the border blocks are always 1 and air is also 0.
then, with that parallel matrix, you go to the raytracing step: you skip all the blocks that are air (0) and create a new parallel matrix with the VISIBLE blocks, computed from the POSITION of the player in the voxel grid, not every frame. that should make it faster, since you only update when the player moves.
then you go to the “render” step using the raytrace matrix (or you can use the “border” matrix, since that is already few enough blocks): you look at the parallel matrix, and if it is 1, you look at the first matrix (what the voxel is made of) and use it to create a block. if it’s 0, you skip.
after that, you create the batch. with big enough blocks (minecraft-sized) you should get a good framerate on a not very powerful computer (i used to get good python speed on an athlon 64 x2; any modern computer is much more powerful than that).

thinking again, you should do the border detection in the same script where you define the variables, so that the render and raytrace steps become faster. a third script can be used to recalculate the “material” and “border” matrices when a block is changed or a new chunk is loaded. you should make big chunks to get better performance: use 16x16 for testing, but plan to increase it as the script gets more complex.
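A minimal sketch of the “border” parallel matrix described above, assuming material is a cubic nested list where 0 is air and anything else is a material id (build_border is a name made up for the example):

# Sketch: build a parallel "border" matrix from a material matrix.
# border[x][y][z] is 1 for solid blocks with at least one air/out-of-bounds
# neighbor, and 0 for air and for completely hidden blocks.

def build_border(material):
    size = len(material)
    border = [[[0] * size for _ in range(size)] for _ in range(size)]

    def solid(x, y, z):
        if not (0 <= x < size and 0 <= y < size and 0 <= z < size):
            return False  # outside the chunk counts as air
        return material[x][y][z] != 0

    for x in range(size):
        for y in range(size):
            for z in range(size):
                if material[x][y][z] == 0:
                    continue  # air stays 0
                hidden = all(solid(x+dx, y+dy, z+dz)
                             for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0),
                                                (0,-1,0), (0,0,1), (0,0,-1)))
                border[x][y][z] = 0 if hidden else 1
    return border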


(statikregimen) #33

smilebags-
I have just 1 class, inheriting from list. It has a separate method for rendering. This essentially boils down to 1- Sample/cache the noise in a 3D list on init, 2- Call .render() method to find visible faces & add geometry to screen.

jesusmora-
I think I’m already roughly doing most of what you describe.

Right now, chunk generation times seem to be maybe fast enough - possibly approaching Minecraft speeds in some tests, although granted I'm not generating block types or trees/treasure yet (which I don’t think will add too much overhead anyway). However, I’m still very low on framerates. Even at just 64x64x64 blocks, my framerates mirror Minecraft's when it is rendering 16x16x16 chunks.

I did find that turning physics off made an easy 10-100 fold improvement, so I will likely have to use a very simple ray tracing approach or whatever.

Anyway, here’s my progress so far:

bge_voxel_test.blend (537.9 KB)

It will start generating the first chunk (64x64x64 blocks in this example) as soon as you run it, so anticipate a few seconds delay (I haven’t begun making this real-time yet).

Oh you may want to adjust the keys too - I use f,a,v,s rather than w,a,s,d (change via logic bricks on the camera).


(statikregimen) #34

A bit more testing reveals that OpenSimplex is the biggest drag on chunk generation, so I’m looking at alternatives. For some reason, the “noise” library won’t work on my system, which I’m also looking into.

Nevertheless, framerates will still be too low. Batchgroups do seem to help a huge amount, after disabling collisions, but not enough to get where I need to be.


(wkk.py) #35

I have run some tests with OpenSimplex; it doesn’t seem to be the biggest bottleneck:

For a 16x16x256 chunk, it takes less than a second on my machine:

$ time python benchmark.py 
size: 65536

real    0m0.626s
user    0m0.613s
sys     0m0.012s

With the following script:

from opensimplex import OpenSimplex

generator = OpenSimplex(1234)

size = 16 * 16 * 256

print('size:', size)
noise = [generator.noise3d(i, i, i) for i in range(size)]

Yes, it might kill the framerate if you try to run the whole calculation in one frame, because you are only allowed ~16 ms per frame to keep up 60 fps.

On the other hand, you could spread the computation over several frames, use threading, or use subprocessing (which is what I recommend).

So what is the real bottleneck?
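One way to spread the work over several frames, as suggested above, sketched under the assumption that the script runs every frame from a module-mode controller (the generate_chunk() generator and the 5 ms budget are placeholders):

# Sketch: resume a long-running generation job a little at a time, each frame.

import time

def generate_chunk(size=16):
    # yields once per block so the caller can pause between any two blocks
    for x in range(size):
        for y in range(size):
            for z in range(size):
                # ... sample noise / build block data here ...
                yield

job = generate_chunk()

def update():  # run every frame, e.g. from an Always sensor in module mode
    budget = 0.005  # seconds of work allowed this frame
    start = time.perf_counter()
    try:
        while time.perf_counter() - start < budget:
            next(job)
    except StopIteration:
        pass  # chunk finished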


(statikregimen) #36

The plan is to spread the work over frames.

My conclusion was based on the difference between initializing my class with a blank 3D array vs. filling it with samples from OpenSimplex. Sampling noise is always going to cost a lot. I’m currently not concerned about chunk generation time - it was just an observation. I’m currently most concerned about boosting framerates.


(jesusmora) #37

let me see your code

ox, oy, oz = self.x, self.y, self.z

what is this for? you don’t need coordinates in a voxel: a voxel chunk is a matrix, and the position of a block is its position in the matrix. you only need to do a triple for loop and add the blocks (or not) with a simple if in the render part. it seems as if you are running “neighbors” inside render. as i told you, you need to create the render list beforehand, and only update it when there’s a change. the trigger for a visibility change, as i told you too, is the player moving from its absolute position (int(worldPosition[0]), etc). it doesn’t have to run every frame. you render the visible surroundings; the camera always moves in a sphere, so the only way to change “occlusion” is to move from that position.

def render(i):
    # i is the 3D matrix; "dirt" is a placeholder for whatever block object you add
    list0 = []
    for abc in range(len(i)):
        for adc in range(len(i[abc])):
            for aec in range(len(i[abc][adc])):
                if i[abc][adc][aec] == 1:
                    ob1 = scene.addObject("dirt")  # add a dirt block
                    ob1.worldPosition = [abc, adc, aec]  # position of current block in the matrix
                    list0.append(ob1)
    return list0

this render adds the blocks from the matrix; it has to run every frame. the “init” has to run once, and the visibility one only has to run when the player moves.

i could not initialize opensimplex. in my voxels i used random and randint and made perlin-like code to create the shapes, going block by block like in a shader, then saved the matrix to a dictionary. when you open your game you want to load this matrix rather than generate the world again from scratch, because you want to save changes made to the world. that is why it’s useful to have 3 parallel matrices: one has the material data, one has the “occlusion” data, and one is used for immediate visibility. you reuse the “layers” depending on which step you are in.
edit: i think the “render” doesn’t have to run every frame, just when the player moves. render a sphere around the player and keep it until the player moves. or you can ignore having a “visibility” matrix and just render the “border” blocks, i.e. the entirety of the voxel chunk as a hollow mesh.


(statikregimen) #38

That line is for positioning the chunk in the scene. It may not have been implemented in whatever iteration you’re looking at. Your code would put all chunks at the same position.

The world is just a matrix of chunks, which are a matrix of “blocks”/voxels. But you still have to tell the engine where to put each object in the scene.

There will be a “chunk manager” class or function that will be responsible for generating the chunk matrix, adding/removing chunks as the player moves, and positioning them in the scene. It will also be responsible for generating only as much as it can in the allotted time (say 0.01 s) and picking up where it left off each frame.

Ideally, I only want to save CHANGED blocks - not entire chunks or entire worlds. Most of the world, in fact, should always be generated on the fly, or save game files would rapidly grow out of control. Plus, I’m not even sure streaming data from disk would be faster than re-generating it.
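A sketch of that saved-changes-only idea under stated assumptions (the edits dict, block_at() and generate_block() names are hypothetical, layered over a noise-based generator like the earlier ones):

# Sketch: store only player edits; everything else is regenerated from noise.
# "edits" maps (chunk coords, local block coords) to a block id; 0 means air.

edits = {}  # e.g. {((0, 0, 0), (5, 3, 12)): 0} records a destroyed block

def generate_block(chunk_coords, local_coords):
    # stand-in for the procedural generation (OpenSimplex etc.)
    return 1

def block_at(chunk_coords, local_coords):
    key = (chunk_coords, local_coords)
    if key in edits:
        return edits[key]  # a block the player changed
    return generate_block(chunk_coords, local_coords)  # procedural default

def set_block(chunk_coords, local_coords, block_id):
    edits[(chunk_coords, local_coords)] = block_id

# saving the world then just means writing the (small) edits dict to disk,
# e.g. with pickle or json, instead of every chunk.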

Also, I’m calling neighbors() the way I am because it has to know that information for adding faces - not for knowing where blocks should be. I should/could generate the face data earlier than I do, but for now it’s getting the job done and it is fast enough.

My big problem remains framerates. I can’t get much further until I solve that. I have some ideas but they will take time to implement. As it is, I cannot get nearly enough blocks on screen to be useful before framerates are in the toilet.


(BluePrintRandom) #39

mathutils.noise works quite well
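For anyone wanting to try it, a minimal sketch of sampling mathutils.noise in place of OpenSimplex (the scale value is just an example):

# Sketch: mathutils ships with the engine, so no extra install is needed.
from mathutils import Vector, noise

noiseScale = 0.1  # example scale

def sample(x, y, z):
    # returns a float from the default noise basis at this position
    return noise.noise(Vector((x, y, z)) * noiseScale)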


(wkk.py) #40

Not the issue here.