Cluster Rendering

Hi there. Long time since I’ve been here, nice to have some time to spend in Blender again.

I’m near completion of the Raspberry Pi Cluster v2. When it’s all wired up I will be using it for rendering. It may not be the most efficient use of the Raspberries, but if they’re not doing anything else they may as well be churning away at Blender files.

I posted a topic on this a long time ago but I’ve lost it, so I thought I’d start this one.

This cluster is 128 Raspberry Pi 3B+ boards, each with 4 cores. In some rudimentary testing a while ago, the output rate was actually pretty good.

Another reason for building this cluster is to venture into MPI coding with Python, which got me thinking about this subject again. With mpi4py I see no practical reason why the render code couldn’t scatter tiles of a frame out to all available processors and gather them again. I’m making some pretty large assumptions here, and there may be technical reasons why this would be extremely difficult; I really don’t know. But wouldn’t it be kind of fun to have a branch of the render code that could do this?
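To make the scatter/gather idea a bit more concrete, here is a minimal sketch in Python. It says nothing about how Cycles works internally: `split_into_tiles`, `deal_out`, `render_tile` and `run_with_mpi` are made-up names, the 64-pixel tile size is arbitrary, and `render_tile` is only a placeholder where real render code would go. The only mpi4py calls used are `comm.scatter` and `comm.gather`.

```python
def split_into_tiles(width, height, tile=64):
    """Cut a frame into (x, y, w, h) rectangles; edge tiles may be smaller."""
    tiles = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            tiles.append((x, y, min(tile, width - x), min(tile, height - y)))
    return tiles


def deal_out(items, n_ranks):
    """Deal items round-robin into n_ranks lists, one list per MPI rank."""
    return [items[r::n_ranks] for r in range(n_ranks)]


def render_tile(tile):
    """Placeholder (hypothetical): a real version would return pixel data."""
    x, y, w, h = tile
    return (tile, w * h)  # pretend the "result" is just the pixel count


def run_with_mpi(width=1920, height=1080):
    from mpi4py import MPI  # imported here so the helpers work without MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Only rank 0 builds the tile list; the other ranks pass None to scatter.
    chunks = deal_out(split_into_tiles(width, height), size) if rank == 0 else None
    my_tiles = comm.scatter(chunks, root=0)

    my_results = [render_tile(t) for t in my_tiles]

    # Rank 0 receives one list of results per rank; other ranks get None.
    gathered = comm.gather(my_results, root=0)
    if rank == 0:
        done = [r for per_rank in gathered for r in per_rank]
        print(f"{len(done)} tiles handled by {size} ranks")
```

To try it, add a call to `run_with_mpi()` at the bottom of the file and launch it with something like `mpiexec -n 4 python tiles.py` (the file name is just an example). With 64-pixel tiles a 1920×1080 frame splits into 510 tiles, so on 512 cores nearly every core would get exactly one tile.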

Setting Blender running on each node of the cluster to build an animation is not difficult at all, but I wonder if it’s the most efficient use of the resources. An MPI version of the render routines would also be able to churn away at a still frame, allowing the cluster to build a test render and not tie up my development/design machine.
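For the one-Blender-per-node approach, the splitting could look roughly like this: cut the frame range into contiguous blocks and build one background render command per node. The `scene.blend` file name, the output pattern and the function names are placeholders for illustration; `-b`, `-o`, `-s`, `-e` and `-a` are Blender’s standard command-line options. How the commands actually reach the nodes (ssh, a job queue, whatever) is left out.

```python
def frame_blocks(first, last, n_nodes):
    """Split frames first..last (inclusive) into n_nodes near-equal blocks."""
    total = last - first + 1
    base, extra = divmod(total, n_nodes)
    blocks, start = [], first
    for node in range(n_nodes):
        count = base + (1 if node < extra else 0)
        if count == 0:
            break  # more nodes than frames: the remaining nodes stay idle
        blocks.append((start, start + count - 1))
        start += count
    return blocks


def blender_command(blend_file, start, end, out_pattern="//frames/frame_####"):
    """Build a background render command for one block of frames."""
    return ["blender", "-b", blend_file, "-o", out_pattern,
            "-s", str(start), "-e", str(end), "-a"]


# Example: 12,000 frames over 128 nodes, printing the first three commands.
for node, (s, e) in enumerate(frame_blocks(1, 12000, 128)[:3]):
    print(node, " ".join(blender_command("scene.blend", s, e)))
```

Contiguous blocks are the simplest choice; if some parts of the animation are much heavier than others, dealing frames out in smaller batches would probably balance the load better.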

I really do think this has legs. Many, many people are building small clusters with Pis and other small single-board PCs. There’s a practical use for this, there’s an academic opportunity for this, and again let’s consider the fun value!

I would love to hear back from a dev or two, let me know what you think.


A quick update for anyone who may have read this post and has any interest in it at all…

I now have a fully functional 128-node Pi Blender rendering cluster. That is 512 CPU cores available for bashing their way through animation frames.

A few tests have given (I reckon) fairly impressive results: 12,000 frames at 1920×1080 in Cycles, 512 samples, scenes of several thousand verts, are easily pushed to my storage in under 12 hours. Basically 1,000 frames an hour.
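Taking those numbers at face value, the per-node rate works out as below. This is only back-of-envelope arithmetic: it assumes the full 12 hours (the run was actually a bit under that) and that frames are spread evenly over all 128 nodes.

```python
frames, hours, nodes = 12_000, 12, 128

frames_per_hour = frames / hours          # whole cluster
per_node_per_hour = frames_per_hour / nodes
minutes_per_frame = 60 / per_node_per_hour  # on a single 4-core node

print(frames_per_hour, per_node_per_hour, round(minutes_per_frame, 2))
```

So each Pi would be finishing a frame roughly every seven to eight minutes.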

I have somewhat overbuilt this cluster as it’s going to be doing many different things and may be going on tour, so it’s been more expensive than I first planned. However, if you were happy with Wi-Fi, didn’t want Ethernet and switches, bolted it all to MDF and ran a simple desk fan for cooling, this can be built for less than a half-decent desktop PC. My flagship Intel with 2x 1070 Ti cards cannot outcompete the cluster.

This is of course well outside the norm for most people. A simple 6 or 8 node cluster is more common, but all the considerations are the same. And people learn stuff from this. It is time, I tell you, for some farming tools, rendering hooks and more headaches for you developers!! ;o)


" this can be built for less than a half decent desktop PC. My flagship intel with 2x 1070ti cards cannot out compete the cluster."

Until you run out of RAM? 1 GB for rendering is a very low limit… 3-4 GB should be sufficient for most people’s needs.

Can you run the standard benchmarks on one node so that we can see how fast the RPi is and how it would compare with different machines when rendering an animation?

I’d love to see how you’ve set that up. While the RAM limit may not allow you to render Hollywood-grade scenes, I could see the benefits for motion graphics and other simple renders.