Renderbot: Scalable and Affordable Blender Rendering in the AWS Cloud. No middleman.

Hmm… It is possible that I had spot requests running which were assigned to US-East-1A (the cheapest at the time) and which then dropped out when the price increased. Once spot requests are placed in, for example, US-East-1A, they will not change subzone even if a different subzone becomes cheaper in the meantime.

Right, so I’ll remove the “-F PNG” flag in the next Client update (that will be 1.0.3). I tested out the “subzone” issue, and 6 or 7 of my requests were all automatically assigned to the cheapest subzone, so I think your explanation of pre-existing spot requests is probably the correct one. I tried to replicate the bug where the home screen was not appearing, but was not able to get it to occur. Could you describe a sequence of actions I can perform to trigger the bug?
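For context, “-F” is Blender’s command-line override of the output format, so removing it just means the Nodes use whatever format is saved in the .blend file’s render settings. Roughly (this is an illustration of the flag, not necessarily the exact command the Nodes run; the file and output path are placeholders):

   # with -F PNG, the format saved in the .blend is overridden and PNG is forced
   blender -b project.blend -o //frame_##### -F PNG -f 1
   # without -F, Blender uses the output format configured in the .blend file
   blender -b project.blend -o //frame_##### -f 1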

Cool, thank you very much. :smiley:

As for the menu bug: I’m sorry, I don’t know what triggers it. It was gone yesterday.
On the day I wrote about the bug, it would always come up right after connecting with PuTTY. I tried reconnecting, I tried setting up a new client instance, I restarted my computer, but nothing made it go away. Then the next time the bug didn’t occur anymore.
If it happens again I’ll post.

Hello all,

I just updated the Renderbot Client to 1.0.3, with the following changes:

-The output format is now defined by your Blender file, not forced to be PNG (Thanks Lumpengnom)

-The Running Instance Status Report is working again (Thanks Ibu Profen)

You’ll need to launch a new Client instance to replace your 1.0.2 instance, so I recommend launching a 1.0.3 instance before you terminate the 1.0.2 one so you can copy and paste your settings easily.

Keep the questions/comments/bugs coming.

Thanks for setting this up. I’ve successfully tested a couple of renders.
A couple of points people may find useful:

The documentation is very good but would benefit from a minor update to make it a bit clearer about starting up the Client instance first (as mentioned elsewhere in this thread). It was the only point where I stumbled during setup, and it would save other users from having to look here for that bit of info. Having re-read the document, it does make sense; it just needs a tweak to make it clear that the ‘Node’ instance doesn’t need to be started.

The AWS S3 web interface is poor: it doesn’t allow downloading multiple files or even sorting the files by any of the columns.
The documentation does suggest alternatives, and I found the freeware ‘S3 Browser’ software very easy to use; it allows sorting and multiple downloads, so all good there.

Yesterday I had trouble uploading an 80 MB file to S3. I was on a 1-2 Mbit upload speed at the time. It failed 3 times, and each time it restarted from scratch. This was through the AWS S3 web interface, so I don’t know if the ‘S3 Browser’ software provides better resilience for this. I’m back home now with a 20 Mbit upload speed, so no problems today.

For those curious about performance:
I did two tests, one using CPU and one using GPU, comparing the relative performance against my laptop, which has the following spec:
i7-4710MQ
16 GB RAM
GTX 980M 8 GB
First test (CPU rendering on laptop and AWS), using 1 instance of c4.2xlarge:
laptop: 22s per frame
AWS: 50s per frame
Second test, different blend file (GPU rendering on laptop and AWS), using 2 instances of g2.2xlarge:
laptop: 380s per frame
AWS: 450s per frame per instance - i.e. 2 frames produced per 450s across both instances

I set up both tests using a spot bid slightly above the current rate so that they would start instantly, and got $0.09 for the c4.2xlarge and $0.20 per instance for the g2.2xlarge.
Not sure if VAT is extra or not (UK user).
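A rough back-of-envelope on cost per frame from those numbers (assuming the spot prices are per hour, and ignoring start-up time and partial-hour billing):

   c4.2xlarge (CPU test): 3600 s/h ÷ 50 s per frame = 72 frames/h → $0.09 ÷ 72 ≈ $0.00125 per frame
   g2.2xlarge (GPU test): 3600 s/h ÷ 450 s per frame = 8 frames/h per instance → $0.20 ÷ 8 = $0.025 per frame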

Finally a question:
When I first tried yesterday, it didn’t do anything at the start (I had used a zip file to upload the blend file).
I then realised that I had set the render to use GPU while I was using a CPU instance, so I suspect that was the issue, but it made me wonder: is there a way to find out what is going on when there is a problem?

Thanks again


Hey, Renderbot, thanks for updating the client. I’m finishing up a larger file to render and will test the client tomorrow.

@thudjie: Thanks for your test results. Regarding your final question: you’re probably right. The GPUs in the C4 instances probably aren’t any good. The Amazon instance types guide doesn’t specify their GPU, but I assume that since they are made for CPU power they don’t have a GPU beyond some onboard chip at all.
Regarding monitoring your instances in case of some sort of screw-up: I don’t think that is possible with Renderbot at the moment. Brenda has a monitoring command, which is

$ brenda-tool ssh tail log

but I don’t think it is integrated into the Renderbot user interface.

Regarding cost and speed: I’ve rendered a fair share of projects with AWS now, and in my experience GPU is not useful in the cloud because of cost. The c4.8xlarge instances with their 36 CPU cores are a bit faster than the g2.8xlarge with their 4 GPUs, and the g2.8xlarge are usually more expensive than the c4.8xlarge. I also can never get more than 2 or 3 g2.8xlarge running at the same time; I always get a message that the area’s limit has been reached, whereas getting a couple of c4.8xlarge running usually works.
Otherwise, I’ve found that for projects which take a long time to render per frame (half an hour per frame or more) it is more cost-effective to use high-powered instances such as the c4.8xlarge, whereas for projects that take less time per frame it is more cost-effective to get a whole bunch of c4.4xlarge or even c4.2xlarge machines.
I’m not sure about the c3 machines, as they are most of the time more expensive than their c4 equivalents for some reason.

I’ve been metering the kWh usage of my computer since the beginning of 2015. All in all, I think that cost-wise, rendering on AWS is cheaper than or about as expensive as rendering at home. At least if you live in Germany, where power is about 0.25 €/kWh.


It’s true that the documentation was not too clear on that, so I just updated it. The AWS S3 online interface is really bad in terms of file management/downloads, so yeah, S3 Browser or Cyberduck makes things a lot easier.

In terms of uploads, the advantage of S3 Browser/Cyberduck is that they do multithreaded uploads (uploading multiple parts of the project at the same time). If you are having trouble, you could also try uploading directly through the AWS S3 interface; it’s actually quite usable for uploads, with the only disadvantage being no multithreading.
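If you prefer the command line, the AWS CLI can also be an option, assuming you have it installed and configured; it splits large uploads into parts automatically (the bucket name below is just a placeholder):

   aws configure                                  # one-time entry of your access key, secret key and region
   aws s3 cp project.zip s3://your-render-bucket/ # large files are uploaded in multiple parts automatically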

Thanks a lot for the test data.

Absolutely true that the error reporting is not up to speed for Renderbot right now. There is a log file available on each Node instance, so as soon as possible I’ll put out a new Client that allows you to take a look at those via SSH. Those logs are pretty descriptive, and I’ll put up some instructions on how to access them.

Thanks for trying out the system and for your feedback.

Only Cycles supported?

At the moment, Blender Internal and Cycles are supported. I’m looking into adding some other render engines, but the priority for now is ironing out any bugs/issues people are having with those two render engines.

I’m rendering on the new client right now and it is working perfectly.

Is it actually possible to render two projects at the same time?

So the S3 path to your project is “sent” to the instances at startup. There are two ways you could render two projects at the same time:

Option 1: Put the path to Project 1 and an SQS task queue name for Project 1 tasks in the Renderbot settings, and launch some instances. Then put the path to Project 2 and another SQS task queue name for Project 2 in the settings, and launch other instances. Under this scenario, the instances launched while the S3 project path was set to Project 1 will render that project, and those launched when the settings (S3 and queue) were for Project 2 will render Project 2. You can check the progress of either queue by switching the queue name in the settings back and forth and checking the task queue on the Renderbot Client.

Option 2: If you don’t want to have 2 sets of instances, I do have an option within the settings window called “push settings to instances.” So under this scenario, you can render some frames of one project, then change the S3 setting to your second project (without changing the task queue setting), and push the new settings to the running instances. Then, add the frame tasks for your second project, and the instances will render the second project. This method has the advantage of using the same instances for both projects. Keep in mind that the shutdown behavior of the instances (in the settings) must be set to “poll” and not “shutdown,” otherwise when the project 1 tasks are complete the instances will terminate.
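To make Option 1 concrete, a sketch of the two settings combinations (the paths and queue names are placeholders, and the labels are paraphrases rather than the exact field names in the settings window):

   First batch of Node instances, launched with:
      S3 project path: s3://your-bucket/project1.zip
      Task queue name: project1-tasks
   Second batch of Node instances, launched with:
      S3 project path: s3://your-bucket/project2.zip
      Task queue name: project2-tasks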

I hope this answers your question.

Hello everyone. I have released a 1.0.4 update to the Client. It is for some reason not showing up directly on the AWS Marketplace listing, but in the launch dialog (the one you see when you press “Continue” on the AWS Marketplace) it can be selected under the “Version” heading. Changes are the following:

-Added the ability to exit the Renderbot interface into the terminal

-Wrote a script allowing download of logs from running Node instances. These logs allow you to see exactly what is going on when jobs are running on instances.

Quick tutorial:
-To use the log download script, press 5 on the homescreen to exit the Renderbot interface

-Then, run the command “fetchlogs”

-The utility will prompt you to provide the Public IP Address of the instance from which you want to download the log file. This is the same IP you would use to connect to an instance with PuTTY, available in the EC2 management console when you click a running instance.

-Also, the utility asks for a filename under which to save the logs. The log file will be saved in the home folder of your client instance. (~/)

-Keep in mind that if you use the same filename twice, the old log file will be overwritten.

-You will have to type “yes” to allow connection to the Node instance.

-Use basic Linux commands to navigate to the home folder (cd ~/), list available files (ls), and view the log file you created (nano [logfilename]).

-You can restart the Renderbot interface at any time by running the command “renderbot”.
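Putting the steps together, a condensed session might look something like this (the log filename is just an example; fetchlogs will prompt you for the Node’s public IP and for a “yes” to accept the SSH connection):

   fetchlogs          # prompts for the Node's public IP and a filename to save the log under
   cd ~/              # the log file is saved in the Client's home folder
   ls                 # confirm the file is there
   nano node1.log     # view the log (use whatever filename you gave fetchlogs)
   renderbot          # restart the Renderbot interface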

I hope this helps you diagnose any issues with render jobs not starting, etc… Let me know if you have any questions.

Thanks for the info on the update. I was about to do another render but have a question - does this support cache files for things like (in my case) fluid simulations?
If so, I presume I can just upload a zip file? It’s going to be about 1 GB, so I didn’t want to use up the bandwidth if there is an issue with it.
Thanks

Hey, and sorry for the late response. I don’t have too much experience with fluid sims. I am pretty sure that anything stored within the .blend file should be fully supported, so if the bake operation for the fluid sim stores its results in the .blend file, or if you can pack them in somehow (perhaps with the “pack external files” option), you should be good to go. I also think that if you are planning on generating the cache during the render, the Nodes should be able to generate it on EC2.

Now, if you meant uploading a separate cache directory, I am less sure. From what I could gather from some basic research, there is an option to store the bake files in a folder right next to the .blend file (in the same directory). In that case, I think that you should be able to render the project if you pack the bake folder and the .blend into the same .zip file and upload that.

In terms of saving bandwidth, perhaps you could create a “test” file with a very simple fluid simulation to see if it works. I hope this answers your question. Let me know how it goes.

If Renderbot works the same as vanilla Brenda, then caches do work. I’ve rendered files with particle caches with vanilla Brenda by simply uploading a .zip file containing my .blend file and subfolders with the cache. The subfolder structure of course has to be the same as on your hard drive.
I’m currently preparing a file with a particle cache and will try rendering it tomorrow with Renderbot.
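For reference, packing the archive from the project folder with the command-line zip tool keeps the subfolder structure intact (the names are placeholders; the important part is that the cache folders sit next to the .blend at the top level of the archive):

   cd /path/to/project                                 # the folder containing the .blend and its blendcache_* subfolders
   zip -r MyProject.zip MyProject.blend blendcache_*/  # -r recurses into the cache subfolders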

Check out this Kickstarter; hopefully it will bring free render farms to the masses!

I’ve been trying to render a particle simulation with very mixed results. Particle caches cannot be packed into the .blend file as far as I am aware, so I packed the .blend file and 3 subfolders with 3 caches into a zip file and uploaded it to AWS.

It renders fine on a c3.8xlarge instance but not on c4.4xlarge and c4.8xlarge instances. I’ve downloaded the .log files of a c3.8xlarge and a c4.8xlarge, and the latter shows a bunch of errors, for example the repeated “RETRY … status=56 (Failure when receiving data from the peer)” lines.

Here are the two complete log files with all the things I assume to be personal info removed:
c4.8xlarge


RM task_count
RM task_last
MKDIR /mnt/brenda/tmp
Spot request ID: DELETED REQUEST ID
RMTREE /mnt/brenda/brenda-project.tmp.pre.tmp
MKDIR /mnt/brenda/brenda-project.tmp.pre.tmp
GET http://DELETED PROJECT FOLDER AND SIGNATURE
RETRY[8] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[1] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[9] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[7] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[10] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[11] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[5] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[15] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[13] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[3] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[14] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[4] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[6] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[12] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[0] 2/4 status=56 (Failure when receiving data from the peer)
RETRY[2] 2/4 status=56 (Failure when receiving data from the peer)
WRITE[0] 0-191996073
WRITE[1] 191996074-383992147
WRITE[12] 2303952888-2495948961
WRITE[11] 2111956814-2303952887
WRITE[5] 959980370-1151976443
WRITE[8] 1535968592-1727964665
WRITE[10] 1919960740-2111956813
WRITE[13] 2495948962-2687945035
WRITE[2] 383992148-575988221
WRITE[6] 1151976444-1343972517
WRITE[3] 575988222-767984295
WRITE[4] 767984296-959980369
WRITE[7] 1343972518-1535968591
WRITE[14] 2687945036-2879941109
WRITE[9] 1727964666-1919960739
WRITE[15] 2879941110-3071937187
*** ['unzip', 'Smoke03.zip']
Archive:  Smoke03.zip
   creating: blendcache_Smoke03_ParticleTrail1_2/
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000000_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000201_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000202_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000203_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000204_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000205_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000206_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000207_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000208_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000209_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000210_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000211_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000212_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000213_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000214_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000215_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000216_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000217_00.bphys

And the working c3.8xlarge instance:


RM task_count
RM task_last
MKDIR /mnt/brenda/tmp
Spot request ID: DELETED ID
RMTREE /mnt/brenda/brenda-project.tmp.pre.tmp
MKDIR /mnt/brenda/brenda-project.tmp.pre.tmp
GET DELETED PROJECT FOLDER AND SIGNATURE
WRITE[14] 2687945036-2879941109
WRITE[1] 191996074-383992147
WRITE[10] 1919960740-2111956813
WRITE[6] 1151976444-1343972517
WRITE[4] 767984296-959980369
WRITE[11] 2111956814-2303952887
WRITE[7] 1343972518-1535968591
WRITE[15] 2879941110-3071937187
WRITE[12] 2303952888-2495948961
WRITE[2] 383992148-575988221
WRITE[5] 959980370-1151976443
WRITE[13] 2495948962-2687945035
WRITE[8] 1535968592-1727964665
WRITE[3] 575988222-767984295
WRITE[9] 1727964666-1919960739
WRITE[0] 0-191996073
*** ['unzip', 'Smoke03.zip']
Archive:  Smoke03.zip
   creating: blendcache_Smoke03_ParticleTrail1_2/
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000000_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000201_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000202_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000203_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000204_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000205_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000206_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000207_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000208_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000209_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000210_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000211_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000212_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000213_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000214_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000215_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000216_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000217_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000218_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000219_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000220_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000221_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000222_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000223_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000224_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000225_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000226_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000227_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000228_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000229_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000230_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000231_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000232_00.bphys
  inflating: blendcache_Smoke03_ParticleTrail1_2/ParticleTrails_000233_00.bphys

It appears that I’ve found a workaround. It’s a bit annoying but it works.

  • Upload your project with all cache subfolders to your S3 bucket as a packed .zip file.
  • Create a new EC2 instance; a free t2.micro is sufficient.
  • SSH into this instance with PuTTY.
  • Install s3cmd and use it to copy the .zip file from S3.
  • Install unzip and use it to unpack your .zip file.
  • Use s3cmd to copy the unpacked files back to your S3 bucket.
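In case it helps anyone, the same workaround condensed into commands (the bucket name and destination path are placeholders, and the install line assumes an Ubuntu/Debian-based AMI; check the result in S3 before pointing Renderbot at it):

   sudo apt-get install -y s3cmd unzip
   s3cmd --configure                                  # enter your AWS access key and secret key
   s3cmd get s3://your-render-bucket/Smoke03.zip
   unzip Smoke03.zip
   s3cmd sync ./ s3://your-render-bucket/Smoke03/ --exclude 'Smoke03.zip'   # upload the unpacked files, keeping the folder structure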

It is weird. In vanilla Brenda I had no problems with c4 instances and .zip files; with Renderbot it doesn’t appear to work.

However, thank you Renderbot for updating the client. Being able to access the log file is very useful.

Thanks for the updates, guys. I’ll hopefully get around to trying the cache files in the next few days.
I’m actually using a dynamic paint cache in my animation, and there isn’t a lot of compression with a zip file (0.95 GB compared to 1.05 GB).
So I’m assuming I could just upload these to S3 unzipped and save having to uncompress them in the cloud.
I’ll try that and report back.
Thanks again.

You can try, but in my experience single large files upload a lot faster than lots of small files.
My cache files were about 3 GB uncompressed and something like 2.7 GB compressed. The uncompressed files would have taken 12 hours to upload, whereas the compressed file took a bit more than half an hour.