Cluster Knoppix Renderfarm

I’m wanting to have a go at teaching blender this term, or at least introducing it to my class. The main problem I’m predicting is in slow render times. Renderfarming the school computers makes sense.

I’ve downloaded ClusterKnoppix Live CD. After a little bit of tinkering, I figured out how to boot a master machine up, then run the OpenMosix terminal server so the other machines can be “networkbooted” into the renderfarm.

Problem: while I believe the renderfarm is working (the master computer I am typing at right now is very fast indeed), blender does not seem to render fast at all. It’s terribly slow.

What do I need to do in order to make Blender 2.34 render using the other computers? :confused: I hear that it can, but I can’t find any settings for it.

I’ve searched for similar topics, but they are all about python scripts and stuff, and that loses me. I’d like some simple “for dummies” steps please.

Progress so far:

Did I mention I’m trying to do this on Linux? “…downloaded ClusterKnoppix Live CD…” oh yeah, I think I did.

Well, I realised that ClusterKnoppix still seems to work from the one machine, but reading through forums seemed to suggest that running it in the background through a shell might in fact cause the other machines to get on the bandwagon.

So, firing up an xterm, I decided to render the free chair model from http://blenderartists.org/forum/showthread.php?t=79006 like so…
blender -b chair.blend -f 1

I did have to change the options of chair.blend into internal renderer instead of yafray though (the yafray on the ClusterKnoppix seems bugged).

I thought that by working “in background” as a command, maybe the task would end up being spread around the nodes of the renderfarm. However, while the load on each of the 9 computers went up to between 7-10%, the render still seemed unbearably slow. I left it running and came back home. It looks like blender is determined to render as a single instance and not split its rendering tasks.

Or am I thinking about this the wrong way? Surely it’s okay to do it that way as long as the boxes are volunteering their RAM and CPU power?

Am I doing things the wrong way?

You might want to try dynebolic linux: http://dynebolic.org/manual-1/dynebolic-x171.en.html
Openmosix and Blender included, just boot it on all cluster machines.

Thanks - I’ll have a look.

Noooo…

I spent ages downloading dyne:bolic (I’m on dialup here at home, although they do have broadband at school) only to find that the promised /usr/mosix directory of tools does not exist.

Then I went to http://dynebolic.org/index.php?show=features and could see why: OpenMosix clustering is in dyne:bolic version 1 only – not the 2.2 version I’ve just downloaded.

I can still download dyne:bolic 1.4 at least, and there is a nice tutorial (I haven’t got that far yet) at http://spot.river-styx.com/viewarticle.php?id=12, so hopefully, I’ll be able to see results soon (fingers crossed).

I am slightly concerned that OpenMosix is exactly what was already running on ClusterKnoppix where blender did not work with it, although I am encouraged by the tutorial specifically showing the use of blender (albeit whichever older version ships with earlier versions of dyne:bolic).

Because we are a school using already set-up PCs, a “live” bootable distro is much preferable to installing another kernel patch directly, which could turn into a dependency nightmare. Sure hope this works out.

Waitaminute! The tutorial mentions the “Marc O. Gloor’s render script”!!! (see: http://pubwww.fhzh.ch/~mgloor/mosix-blender.html ) Looks like I might even be able to get this going on the original ClusterKnoppix? That would be cool because ClusterKnoppix only needs one CD to “live” boot the whole network.

I’m actually trying to set up the exact same thing, though I’m doing it as part of our computer club, and we’re trying to show the coolness of a cluster, not just blender. Is there any way to fork the single blender process when you have, say, a single frame, kinda like how the threads button works?

I think we’re in the same place.

I’m at school now (weekend) and have been playing with this unsuccessfully for some time.

To make blender work over a renderfarm, it’s going to need the -b option, which makes it render an image or animation in the background. You would also add either -f 1 to render a single frame, or -s (startframe) -e (endframe) -a to render an animation (or at least a series of pictures if your blender file is set to save as jpeg).
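As a rough illustration of those two forms, here is a small dry-run sketch in sh. It only prints the command lines and does not run blender. The helper names `still_cmd` and `anim_cmd` are made up for this example, and the default path is an assumption borrowed from the render script quoted later in this thread. As far as I know blender reads its arguments in order, which is why the frame range is placed before -a.

```shell
#!/bin/sh
# BLENDER is an assumed path (the render script in this thread uses
# /usr/bin/blender); override it if your copy lives elsewhere.
BLENDER=${BLENDER:-/usr/bin/blender}

# Print (not run) the command that renders one still frame.
# $1 = scene file, $2 = frame number
still_cmd() {
    echo "$BLENDER -b $1 -f $2"
}

# Print (not run) the command that renders frames $2..$3 as an animation.
# The -s/-e range options are placed before -a.
anim_cmd() {
    echo "$BLENDER -b $1 -s $2 -e $3 -a"
}

still_cmd chair.blend 1
anim_cmd chair.blend 1 20
```

Once the printed lines look right, they can be pasted into an xterm as they are.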

The problem with blender is that it refuses to thread through to nodes on the render farm because it insists on working as one large single process. The solution is, apparently, to use the “excellent rendering script” from http://pubwww.fhzh.ch/~mgloor/mosix-blender.html which goes something like this…

#!/bin/sh
#
# render.sh - a Mosix/Blender load balancing workload manager
# written 2000 by Marc O. Gloor <[email protected]>
# $Id: render.sh,v 1.1 2005/08/05 21:27:11 gloor Exp $
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
#
# example below will render 550 images from test.blend on 10 nodes:
# "./render.sh test.blend 50 600 10"

if [ $# -lt 1 ]; then
echo "syntax: render.sh blenderscene startimage endimage nodes"
exit 1
fi

BIN=/usr/bin/blender # blender path
SCN=$1 # name of blender scene
DIF=`expr \( $3 - $2 \)` # no. of images to render
RPN=`expr \( $DIF / $4 \)` # no. of images per node
LOP=`expr \( $4 - 1 \)` # loop counter

for i in `seq 0 $LOP` ; do
BEG=`expr \( $i \* $RPN \) \+ \( $2 + 1 \)`
END=`expr \( $i \+ 1 \) \* $RPN \+ $2`
$BIN -b $SCN -s $BEG -e $END -a > /dev/null 2> /dev/null &
done

echo " "
echo "Rendering" $DIF "images from" $1 "on" $4 "nodes."
echo "Tasks forked, network rendering in progress."
echo -n "Job started at: " ; date '+%d-%m-%y %H:%M:%S'
echo "Please wait while rendering…"
wait
echo "Rendering successfully finished."
echo -n "Job ended at: " ; date '+%d-%m-%y %H:%M:%S'
echo " "

#EOF

Looking at the script carefully, it does not look like it is intended to split a single frame across separate nodes at all, but rather hands each node its own set of frames to render, so if you have a 1000 frame animation and 10 nodes, each node would fully render its own allocated set of around 100 frames. :frowning: So I don’t think this script would be good for speeding up a long single frame yafray render.
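To make the frame splitting concrete, here is a minimal sketch of the same idea in plain sh. It is not Marc Gloor's script: `split_frames` is a made-up helper name, and unlike render.sh it counts the start frame and hands out leftover frames when the range does not divide evenly (as far as I can tell from its arithmetic, render.sh starts at startimage + 1 and skips any remainder). It only prints one "first last" pair per node and renders nothing.

```shell
#!/bin/sh
# split_frames START END NODES
# Print one "first last" frame range per node, covering START..END inclusive.
# Hypothetical helper for illustration; not part of render.sh.
split_frames() {
    start=$1; end=$2; nodes=$3
    total=$((end - start + 1))   # frames to render, counting both ends
    per=$((total / nodes))       # frames every node gets at least
    extra=$((total % nodes))     # leftover frames, one each to the first nodes
    beg=$start
    i=0
    while [ "$i" -lt "$nodes" ] && [ "$beg" -le "$end" ]; do
        count=$per
        if [ "$i" -lt "$extra" ]; then
            count=$((per + 1))
        fi
        if [ "$count" -gt 0 ]; then
            echo "$beg $((beg + count - 1))"
            beg=$((beg + count))
        fi
        i=$((i + 1))
    done
}

# The 20 frame test over 9 nodes tried in this thread
split_frames 1 20 9
```

Each printed pair could then be fed to the -s and -e options of one background blender process.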

Also, I am not sure whether it matters if each node has to have its own version of blender (e.g: should the latest blender be on all nodes?) or whether they “borrow” the executable from the central master and just lend the CPU power.

So far, after running live CD ClusterKnoppix, downloading and unzipping the latest 2.42a blender into a new /home/knoppix/Desktop/blender directory, and then changing the /tmp dirs etc in the blender files and also the render.sh script to output things to the Desktop directory, I still haven’t had success at getting separate nodes to take on the workload for blender, even though openmosix is working and I can view workload graphs of all nodes from the master machine. Then again, I’ve mainly been trying big single frame renders and should perhaps use simple multiframe renders to test. Sadly, when I tried…

./render.sh basic-jpeg.blend 1 20 9

…to make a 20 frame animation of just the basic cube (basic-jpeg.blend was just that), all I got was a delay while the master node had workload of 100% :frowning: There may be an issue running off the live CD as I have seen openmosix complain (terminal somewheres) about not having write permissions to something in /tmp/
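Since the complaint was about write permissions under /tmp, one cheap thing to rule out is whether the output and temp directories can actually be written to from the live CD session. This is only a sketch: `can_write` is a made-up helper, and the two directories checked are examples, not anything openMosix itself requires.

```shell
#!/bin/sh
# can_write DIR: succeed only if a scratch file can really be created in DIR.
# Hypothetical helper for illustration.
can_write() {
    probe="$1/.write_probe_$$"
    # The subshell keeps a failed redirection from aborting the script.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Example directories only; swap in wherever blender and render.sh write to.
for dir in /tmp "${HOME:-.}"; do
    if can_write "$dir"; then
        echo "$dir: writable"
    else
        echo "$dir: NOT writable"
    fi
done
```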

Would love to get this thing sorted. Feel I am so close, but not getting there may as well be miles off.

Yeah, the script is only for animations. But I know there has to be a way to split the image up, and I know yafray can handle multiple threads.

Another problem I want to tackle is the network boot. It seems like netbooting off of the CD is super slow. I already have an LTSP (Linux Terminal Server Project) setup running on FC3 on one of the servers, but I don’t have openmosix on it and don’t know how to configure it in. The solution at the moment is to try installing ClusterKnoppix to each node, which I don’t know if I can do yet, but I’m gonna try. It’s a work in progress, but our computer club will have about 20 nodes in our cluster, mostly between 500MHz and 1GHz boxes, but still. It will be sweet when we hook it all up. It will also be a ‘borg’ cluster, in that they will act as regular workstations as well as parts of the greater collective.

Oh - okay. You may not have to make a separate CD for all machines. I do the whole farm from one CD.

First I boot one machine in ClusterKnoppix. All other machines are off.

On the single machine that is running ClusterKnoppix, I go to the penguin icon (third or fourth icon along the bottom taskbar from memory), and launch the service for an OpenMosix server (or something like that). Basically, this sets the machine up ready to receive nodes. It asks me a few questions, but basically it’s a matter of clicking “next” a few times. It must be done before booting the other machines, as it generates a signal for the other clients to pick up on.

Once this has run, I then boot the other machines one at a time. At the very beginning, before even the BIOS “hit del to enter setup” stage, there is an option on my computers for “F12 = Network Boot”. It’s rather brief, though if I hit F12 at this time, that machine boots up in ClusterKnoppix even though there is no CD in it. Also it does not go to a desktop, but stays in console mode showing reports about its access. If your client computers are capable of it, you may need to enable Network Booting in their BIOS / CMOS settings.

Once they are running, go to OpenMosix viewer on the main master computer (the one with the CD). This is a white square icon in the bottom taskbar, with a simple line drawing of a penguin on it. It should bring up a control panel showing all connected nodes, allowing you to reconfigure each one’s access settings.

A side note: ClusterKnoppix is rather old. Its default directory for saving temporary files is bad (has a backslash in the wrong place). I get around this by copying across the latest 2.42a blender. It’s great that blender comes as a simple tar.gz instead of a full install needing to put stuff into /usr/lib, /usr/bin etc., because that would hardly be possible on a read-only CD. So anyway, I can get the latest blender at least working on the latest ClusterKnoppix (though not threading), but I suspect I’d be out of luck trying to install yafray over a live CD setup?

“Noooo… I spent ages downloading dyne:bolic (I’m on dialup here at home, …”

Uhm, sorry mate!

Next time you could ask someone with a broadband connection in the forum to try it out for you (or maybe even find a shipping arrangement). An entire CD on dial-up? Hope it did not block the phone line.:rolleyes:

That’s okay. Your suggestion was helpful because at the very least it pointed to the tutorial, which is in the right direction. Just a pity I haven’t managed to get it quite working for blender.

Get someone to send it to me? I doubt anyone would do that. For as long as I’ve been in open source there seems to be an RTFM or “hard work is good for you” ethic which generally means that if someone out there does know how to do something beyond trivial, there are times when they choose to keep it to themselves. Does no one on this forum know the steps needed to render over a Mosix / LAN setup?

Like I said, I can also drive in to school where there is broadband and start large downloads there.

Having said that, an interesting play with the gear (spent a few hours trying to get it to work yesterday), though I’m out of time for planning my classes once more.

I wasn’t planning on making separate CDs. I am planning on having it netboot like it does with ClusterKnoppix, just with a focus on installation from the network if at all possible, for two reasons: 1. Speed. Booting off of the CD is slow, and booting off the CD over the network is slower. 2. Resources. Installing will mean repartitioning all of the drives, which will mean dedicated temp and swap file space.

And I have a server already that netboots, I just don’t know how to configure it to netboot an install image. Also I’m having doubts about Mosix/OpenMosix as a kernel extension. I haven’t been able to really do anything with the cluster that we’ve set up so far. All the nodes seem to be working, but for some reason, none of the workload is being shared. Which sux.

What I’m really wanting is like a layer above the openmosix stuff that will split individual threads to go out to the network. So say you have a demanding process that for some reason was coded as a single thread: a middleware program would chop up the thread and then propagate it on the network. That way a single node and the entire cluster would be nearly indistinguishable…

Now, the real question… Can that happen? Has someone come up with a solution to that yet?

I’ve been trying to do some things with openmosix and clusterknoppix as well, but I never got that far (lack of time :D)

I also thought of “chopping up the thread” but this is impossible… at least, as far as I understand programming languages, this is not easy. You would have to hack blender. There is a threads button in the rendering tab, but it only starts two threads as far as I know. However, this option could help you to test the cluster: when the image takes a really long time to render, the master node should propagate the second thread and everything should be about twice as fast.

I would suggest asking a coder if it would be possible to create a small “number-box” (or whatever they call it), where you can enter the number of threads that the rendering should start. This should not be impossible because if it is possible to run two threads simultaneously, why not 20? The limit of threads would always be the number of parts you use (and it seems you could use up to 32768).

Anyone here who knows a coder and could ask him?
Hope I got it all and maybe it even helps :slight_smile:

I did some research, and although I haven’t tested to make sure, openmosix can’t propagate threads, only processes, which is a little odd. If it was written in java, I could totally hack out a multithreaded version of blender. Unfortunately, blender isn’t written in java (yeah, I’m a java guy… yeah, I know other languages, but advanced stuff like threading I’ve only done in java), and I’m not the most proficient hacker ever. Anyway, if it’s true that threads can’t be propagated on a mosix cluster, doing the thread hack wouldn’t help either. I’m going to install clusterknoppix today on a bunch of machines for our club. And maybe by the end of the week I’ll run some tests on threads and see for sure. If not, I might even get some of our more enterprising clubbers to hack together a middleware solution (but don’t hold your breath… there are only two of us who know how to code).
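If processes really are the unit that gets moved around, then the workable pattern is one whole blender process per chunk of frames, each started in the background with &. The sketch below shows only that pattern and is a dry run by default: `launch_chunks`, `dry_run` and the `RENDER_CMD` variable are names invented for this example, and whether openMosix then migrates the processes is not something this script can show.

```shell
#!/bin/sh
# dry_run: stand-in for blender that just prints what would be executed.
dry_run() {
    echo "would run: blender $*"
}

# RENDER_CMD is a hypothetical variable naming the program each job runs.
# It defaults to the dry-run stand-in; set it to the real blender path to render.
RENDER_CMD=${RENDER_CMD:-dry_run}

# launch_chunks SCENE "BEG END" ["BEG END" ...]
# Start one separate background process per frame range, then wait for all.
launch_chunks() {
    scene=$1
    shift
    for range in "$@"; do
        beg=${range% *}   # text before the space
        end=${range#* }   # text after the space
        # "&" forks a whole new process for this chunk, not a thread
        $RENDER_CMD -b "$scene" -s "$beg" -e "$end" -a &
    done
    wait   # block until every forked job has exited
}

launch_chunks test.blend "1 10" "11 20"
```

The two lines may print in either order, since the jobs run at the same time.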

Open Mosix is not really suitable for distributed rendering in my opinion. Mosix is not capable of splitting up a job into multiple threads. It is not enough to have an SMP system and simply start a linear task; you need to parallelise it first (thread it).

I use DrQueue on an HPC cluster, which can render single frames if you split them into smaller jobs first, but many raytracing systems won’t allow this because the whole image is necessary for the calculations. However, distributing animations is simple and scales linearly with respect to time.
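A back-of-envelope way to see the linear scaling for animations, as a small sh sketch. It assumes every frame takes the same time and every node is equally fast, and it ignores network and start-up overhead, so treat the result as a best case. `estimate_seconds` is a made-up name and the 60 seconds per frame is an invented figure.

```shell
#!/bin/sh
# estimate_seconds FRAMES NODES SECS_PER_FRAME
# Best-case wall-clock time when whole frames are shared out between nodes.
# Hypothetical helper; assumes equal frame cost and equal node speed.
estimate_seconds() {
    frames=$1; nodes=$2; secs_per_frame=$3
    # the busiest node gets ceil(frames / nodes) frames
    per_node=$(( (frames + nodes - 1) / nodes ))
    echo $(( per_node * secs_per_frame ))
}

estimate_seconds 1000 1 60    # one machine on its own
estimate_seconds 1000 10 60   # the same job over ten nodes
```

A single long frame gains nothing from this kind of split, because one frame still lands on one node.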

Hanni


http://ainkaboot.co.uk High Performance Computing Cluster for distributed rendering and more…