Rendering using AWS Batch

Hi

I mentioned this way down another thread, but I thought I’d post specifically about it now that it’s in a working state. I put together an automated process for running renders on EC2 instances in the cloud using AWS Batch.

This features:

  • A one-line deployment into your AWS environment
  • A Python command-line utility for enqueuing and processing jobs. Animation jobs can be split into many separate jobs, each rendering one or more frames (see the sketch after this list).
  • You can modify the compute environment to specify a maximum ‘spot price’ you’re willing to pay for instances, and the jobs will run when the current spot price drops below this level
  • Automatic scale-up and scale-down (to zero) of instances by AWS Batch
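
To give a feel for the splitting, here is a rough Python sketch of the chunking logic (illustrative only; the actual CLI code in the repo may differ, and the names below are made up):

def split_frames(start, end, frames_per_job):
    """Split an animation's frame range into (first, last) chunks, one per Batch job."""
    jobs = []
    frame = start
    while frame <= end:
        last = min(frame + frames_per_job - 1, end)
        jobs.append((frame, last))
        frame = last + 1
    return jobs

# e.g. split_frames(1, 250, 25) -> [(1, 25), (26, 50), ..., (226, 250)]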

I’m just posting this in case anyone is interested… no warranty as to correctness is provided. See the README.md on GitHub for more info and instructions.

Hope this helps someone.
Pete

This looks really cool. I’ve been looking at switching over to a Docker-based solution for my EC2 rendering.

How prone to interruption, and how cost-effective, do you find GPU spot instances?

Hmm, interruption all depends on how you set your spot price. I’ve mainly tested with the bid set to 100%; indeed, the spot price does sometimes sit at 100% on the GPU instances in some regions/AZs.

Cost-effectiveness: for a frame which took 10 minutes to render on my machine (RTX 2080 SUPER + GTX 970), I very roughly estimate it to cost £0.14 on the largest p3 instance (at 100% spot price). It didn’t work out much cheaper on CPU instances, as they are so much slower than the GPU ones.

Have you tried the spot prices on the c5.24xlarge or the c5.metal? I was getting about 0.90 USD/hour.

It was just an experiment for me, but on sheepit-renderfarm my two GTX 970s would earn about 60k points a day, while this instance could earn that in 2 hours.

Or, using SheepIt’s percentage system (relative to their reference machine), one GTX 970 scores ~400% vs. ~3000% for this instance.

I suggest this because, as models or shaders get complex, CPUs tend to slow down less than GPUs, and AWS’s spot prices for GPU instances seem to vary more and cost more.

No, I hadn’t tried those instances. That’s interesting; it sounds cheaper than the GPU instances, and yes, the GPU spot prices are high! Thanks for the info.

Is it possible to render projects that have external files such as mesh caches and fluid sims?

Can you use the File -> External Data -> Pack All Into Blend option?

If you really need external files then no, it’s not currently supported, but support could be added.
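
For reference, packing can also be done headlessly before you submit a job; something like this should work, though I haven’t tested it against every Blender version, so treat it as a sketch:

import bpy

# Pack all packable external files (images, sounds, fonts) into the .blend and save it.
bpy.ops.file.pack_all()
bpy.ops.wm.save_mainfile()

Run it with something like: blender -b yourfile.blend --python pack_all.py (pack_all.py being whatever you call the script).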

Ah, too bad. Mesh caches, particle caches, and fluid and smoke caches cannot be packed into the .blend file.

Are you using this to render in the cloud? Cool if so. It also needs updating to Blender 2.81… I haven’t used it myself in a while.

I tried it because I have to render a scene with a 10 GB mesh cache and a couple of even larger smoke caches, and it takes forever to upload to S3. I am rendering with the Brenda scripts.

Since I have several scenes that use the mesh caches, I have to upload them several times, which is unfortunately a limitation of Brenda.

I have a 50 Mbit upload rate, which surprisingly does run at around 30 to 50 Mbit pretty often. But it appears that Amazon doesn’t let me upload to S3 faster than 5 Mbit or so.

Now, your system is a lot more modern and easier to handle, so in the future, if I don’t need caches, I will probably use it.

I have several questions, though. How do I set the max spot price? I saw that you can set it percentage-wise, but what if the spot price spikes to something absurdly high? I have read that there are occasional spikes where the spot price goes a lot higher than the normal price. If the price spiked to 100 €, this could get very expensive.

Also, is it possible to control which kinds of instances are launched? With my test file, AWS started a fleet of seemingly random instances: an r4.large, a c4.4xlarge, and I think I saw a p-type in there, too.

The max price is set as a percentage in the compute environment (BidPercentage). It is a percentage of the on-demand price, so if you set it to 100%, you will never pay more than the listed on-demand price.

You can change which instances are launched by changing InstanceTypes from optimal to a list of instance types or classes. When set to optimal, it will launch whichever instances can satisfy your job definitions (including any GPU requirements those definitions specify).
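
Both of these live in the CloudFormation template, but just to illustrate what the two settings mean, here is roughly the boto3 equivalent (placeholder names, subnets, and roles; this is not the project’s actual code):

import boto3

batch = boto3.client("batch")

# Illustrative only: a SPOT compute environment bidding at most 80% of the
# on-demand price, restricted to two specific CPU instance types.
batch.create_compute_environment(
    computeEnvironmentName="blender-render",  # placeholder name
    type="MANAGED",
    computeResources={
        "type": "SPOT",
        "bidPercentage": 80,  # max spot bid as a % of the on-demand price
        "instanceTypes": ["c5.24xlarge", "c5.metal"],  # or ["optimal"] to let Batch choose
        "minvCpus": 0,
        "maxvCpus": 96,
        "subnets": ["subnet-xxxxxxxx"],  # placeholder
        "securityGroupIds": ["sg-xxxxxxxx"],  # placeholder
        "instanceRole": "ecsInstanceRole",  # placeholder
        "spotIamFleetRole": "arn:aws:iam::123456789012:role/spot-fleet-role",  # placeholder
    },
    serviceRole="arn:aws:iam::123456789012:role/AWSBatchServiceRole",  # placeholder
)

In the template itself the same values appear as BidPercentage and InstanceTypes on the compute environment resource.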

Yeah, I hadn’t thought about related cache files; maybe someone could implement this :slight_smile:

Ah, so a percentage of the on-demand price. Very good.

How do you set the job definitions? Or, how does AWS know how much CPU/GPU power is required?

It’s in the job definition in the CloudFormation file (in the AWS directory), so it’s set before you deploy the stack.
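
For a rough idea of what a job definition specifies, here is an illustration via boto3 (placeholder names and image; the project itself declares this in the CloudFormation template):

import boto3

batch = boto3.client("batch")

# Illustrative only: a container job definition requesting 4 vCPUs, 16 GB of
# memory and one GPU; Batch picks instances that can satisfy these requirements.
batch.register_job_definition(
    jobDefinitionName="blender-render-job",  # placeholder name
    type="container",
    containerProperties={
        "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/blender:latest",  # placeholder image
        "vcpus": 4,
        "memory": 16000,  # MiB
        "resourceRequirements": [
            {"type": "GPU", "value": "1"},  # drop this entry for CPU-only rendering
        ],
    },
)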

Hi guys, which version/fork of Brenda are you using that is recent enough for AWS and Blender 2.8x? The original one seems to be outdated (last updated 6 years ago).

Hi, can you describe the directory structure (i.e. show me the tree of files and directories) in a render like this with the caches? I’m planning on doing a bit of work on this at the weekend; I need to support a single additional file alongside the .blend, but a directory structure shouldn’t be much extra work.

And I’m going to fix it so it will only upload the data once, if it hasn’t changed.
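
The rough idea is to store a checksum alongside the object and compare it before uploading; something along these lines (a sketch only, with a made-up bucket and key, not the code I’ll actually push):

import hashlib
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def file_md5(path):
    """MD5 of a local file, read in chunks so large caches don't blow up memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8 * 1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def upload_if_changed(path, bucket, key):
    """Upload `path` to S3 only if its checksum differs from the copy already there."""
    digest = file_md5(path)
    try:
        existing = s3.head_object(Bucket=bucket, Key=key)["Metadata"].get("md5")
    except ClientError:
        existing = None  # object not uploaded yet
    if existing != digest:
        s3.upload_file(path, bucket, key, ExtraArgs={"Metadata": {"md5": digest}})

# e.g. upload_if_changed("Caches/blendcache_smoke1/smoke_cache", "my-render-bucket", "caches/smoke_cache")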

That is awesome.
My directory structures for rendering usually look like the following:

  • There is a folder with the Blender file. This file is usually called RENDER.blend

  • In this folder I have a “Caches” folder

  • In the “Caches” folder are more folders containing the individual caches (i.e. blendcache_smoke1, blendcache_smoke2, blendcache_water1 …)

  • These folders contain the cache files

  • Often there will also be subfolders for other files such as textures, or .blend files linked into the RENDER.blend

  • For example, there might be a folder called “houses” containing a bunch of .blend files which the RENDER.blend file uses to link the houses of a city scene.

Hi, so I had a look at this; it turns out your use case is slightly different from what I was doing, as I prefer to upload all required files with each job. But with my latest changes (not yet pushed) I can now include additional files along with the main .blend.

One simple option which might satisfy your requirement would be to cater for a ‘pre_render_hook.sh’. If you include this with your .blend (using the above option), it would be executed in the Docker container before the job runs.

You could then have a script with content like:

mkdir ./Assets
aws s3 cp s3://myassetsbucket/houses.blend ./Assets/
aws s3 cp s3://myassetsbucket/streets.blend ./Assets/

mkdir ./Textures
aws s3 cp s3://mytexturesbucket/house1.jpg ./Textures/
# etc.

Then, as long as you’re using relative paths in the .blend, it should render. And you don’t have to submit the large files more than once, assuming your main job.blend isn’t too large.
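
If the .blend still has absolute paths, Blender can convert them for you; from the Python console (or a small script run with --python), something like this should do it, though double-check it on your version:

import bpy

# Remap all external file references to paths relative to the .blend location, then save.
bpy.ops.file.make_paths_relative()
bpy.ops.wm.save_mainfile()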

Let me know what you think; if it will work for you, then I’ll add the hook…

Sounds great! Thank you for all the work. It is really appreciated. :smiley:

Hi, this is now available at https://github.com/petetaxi-test/AwsBatchBlender

Have a look in the examples folder to see how to use pre/post render hooks. Also see the README.md file for setup.

Let me know how you get on…