Cycles Development Updates

Alright:

1, Working on VFX in large studios does not automatically equal competency. In fact, it's often quite the contrary. Large studios have brute-force pipelines set up, which usually trade performance for being error-proof and not requiring any technical skill from the artists. That's how Arnold-based pipelines mostly work. I have been surprised how many people working in large VFX studios with shiny titles such as Senior VFX TD have a complete lack of understanding of even basic sampling optimization strategies. If those people are supposed to "Technically Direct" anything, then I am scared to even imagine what kind of technical skillset the non-TD employees possess.

2, I never stated whether I was comparing Cycles to V-Ray and Corona using secondary cached GI or using full path tracing. I actually compared Cycles to Corona using pure PT+PT in Corona, and Corona still performed about 3-4× faster in an identical scene on the CPU. On GPU, Cycles on a GTX 1080 Ti was slightly slower than PT+PT Corona on an i7 5930K, a 4-year-old 6-core CPU.

3, The times when approximate methods in either Corona or V-Ray caused any artifacts, be it in static or animated scenes, are long gone; that's why these setups are also the defaults these days. I haven't seen any animated scene, regardless of complexity, which could not be handled by V-Ray's light cache or Corona's UHD cache in the past 3-4 years, at all… not a single one. These days, retracing techniques for these caches have been refined to perfection.

4, The reason Cycles doesn't have cached secondary GI has nothing to do with it not being able to produce artifact-free results. The main reason is that cached GI on its own is hard to implement, and very, very hard to implement on GPU. It's possible, as Redshift has proven, but challenging nonetheless. Even large rendering companies have a hard time pulling it off, so it's understandable that for the Cycles dev team, with such limited resources, it's not on the roadmap.

5, What this means in practice is that the only reason not to use a cached solution for secondary GI (possible artifacts) is no longer valid, and there is no longer any excuse not to use them. This, in turn, gives renderers with such caching methods an even bigger advantage over the others. So while Corona and V-Ray are "just" faster than Cycles in pure PT+PT mode, they are much, much faster in PT+Cached mode. In this thread, however, I was initially referring just to the difference between Cycles and the PT+PT mode of Corona and V-Ray, as I generally tend to compare apples to apples too.

6, Renderman, Arnold and Manuka are made to handle complexity at the expense of performance while being error-proof, as I pointed out in my first post. It has absolutely nothing to do with instability of the approximate solutions. For example, ILM's generalist department uses V-Ray heavily and they are able to easily get their shit done :slight_smile:

7, As for the argument that this works for archviz but not animation, I don't even know what to say… I've delivered so many animated shots over the past years using approximate methods for secondary GI, and never had issues. I think you are talking about V-Ray from 6 or more years ago.

Now, my main point:
Blender's main demographic, at this point, is smaller studios and individual freelancers, exactly the kind of people who can't afford those giant pipeline-oriented rendering solutions that trade performance for reliability. Sure, we need our renderer to be reliable too, but first and foremost, we need to be able to even afford to render. Render times are still an issue, especially for animations. Trust me, I know it… this is one of my last personal projects:


Rendered in Corona with cached secondary GI. 3400 frames at Full HD of a scene mostly covered in very dense, reflective and translucent foliage. It was completely stable. I know there is some small flickering here and there, but that's not from GI, that's from the antialiasing. Wanna know why? Because I had to render it at my own expense. Believe it or not, at Full HD, each frame of this scene took just 15 minutes on my 4-year-old 6-core CPU. The scene took about 56GB of memory, so it would never fit on any GPU.

Despite using such a fast renderer as Corona, and getting an incredible render time of just 15 minutes for such complex frames, including in-render motion blur and DoF, I still had to pay over 1500EUR, or $1750, out of my own pocket to get this scene rendered. If I had ever tried to do this in Cycles, I'd have failed, because I would not be able to spend, let's say, around $7000 just on rendering a personal project. So even these days, it's still borderline impossible for a freelancer to render a short movie of a complex scene on their own. For this to change, we need to take advantage of every optimization available, and get rid of that misguided mentality of "it has to be unbiased" or "caching is cheating".
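For a rough sense of scale, the numbers above can be turned into a back-of-envelope estimate. This is only a sketch: the 3.5× factor is a midpoint of the 3-4× pure-path-tracing difference I quoted earlier, and the per-hour rate is simply implied by what I actually paid, not an actual render farm price list.

```python
# Back-of-envelope render cost estimate, using only the numbers quoted above.
frames = 3400
minutes_per_frame = 15
corona_cost_eur = 1500          # what the Corona render actually cost

core_hours = frames * minutes_per_frame / 60    # total machine-hours: 850
eur_per_hour = corona_cost_eur / core_hours     # implied hourly rate: ~1.76 EUR

# Assuming a ~3.5x slowdown without cached secondary GI (midpoint of the
# 3-4x figure above), the same project would have cost roughly:
slower_factor = 3.5
estimated_cost = corona_cost_eur * slower_factor  # ~5250 EUR
```

That lands in the same ballpark as the ~$7000 figure mentioned above.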


Do you have any guess whether that's due to Corona delivering more samples/s or Corona delivering less noise at the same number of samples? That'd be interesting to know.

I don't, but I can run the tests, right now in fact. I know where to get the rays/s stats in Corona, I just need to know where I can get them in Cycles :slight_smile: One of my bets, however, is that the random sampler makes a big difference too.

EDIT: Maybe this should be split off into another “Cycles Performance” thread :slight_smile:


Alright, so I did the tests and I am actually quite surprised by the results. I compared Cycles in the latest August 02 master build to Corona V2. Both pure path tracing, with a ray depth limit of 12 and a max ray intensity of 10 (ray clamping). Identical scenario, identical camera, identical material. And to my surprise, Cycles on CPU was actually just as fast as Corona.
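For reference, the matched settings can be expressed as a small helper. The attribute names follow the Cycles Python API (`scene.cycles.*`) as I remember it from the 2.79 era, so treat the exact names as an assumption; the helper just assigns properties, so it also works on any plain object.

```python
# Sketch: the pure path-tracing settings used for the comparison, written as
# a helper that assigns Cycles properties. Attribute names are assumed to
# match the 2.79-era Cycles Python API (scene.cycles.*).
def apply_pure_pt_settings(cycles):
    cycles.max_bounces = 12              # ray depth limit used in the test
    cycles.sample_clamp_indirect = 10.0  # rough "max ray intensity" equivalent
    cycles.sample_clamp_direct = 0.0     # leave direct lighting unclamped
    return cycles

# Inside Blender this would be called as:
# apply_pure_pt_settings(bpy.context.scene.cycles)
```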

I am not sure what caused such a difference in my previous tests. Granted, they were done over a year ago, and I may have made a mistake, but I suspect that the better tile size handling may have something to do with it, because I was not aware of the tile size impact on performance back when I was doing those tests, and in the latest master, it's handled automatically.
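The rule of thumb that the automatic handling roughly encodes is: small tiles for CPU, large tiles for GPU. The exact numbers below are just the commonly recommended 2.7x-era values, not what master actually computes:

```python
def suggested_tile_size(device_type):
    # Common 2.7x-era rule of thumb: small tiles keep all CPU threads busy,
    # large tiles reduce per-tile scheduling overhead on the GPU.
    return 32 if device_type == 'CPU' else 256
```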

Nonetheless:
Cycles, 5m 32s

Corona, 5m:


Very comparable results, very comparable times.

Now I did one more test, with Corona and secondary cached GI:


This one is the default and is used for pretty much every scene, even animated ones. It's about twice as clean as those pure path-traced renders, in one fifth of the time, while also being more accurate, since the secondary GI bounces are not limited to 12 (a limit which causes some light energy loss). So in practical scenarios, especially interior ones, cached methods are superior.

Nonetheless, I stand corrected. If we compare pure path tracing performance, it's apparent that Cycles, at least the one in the latest master, is just as performant as Corona (or V-Ray, for that matter). So really great job on that. I guess I underestimated Cycles quite a bit.

The render stamp in Corona shows rays/s. In Cycles, I still don't know how to obtain that value.


Detailed render stats are high on my wish list too.


Where do I get that scene? I can test it with Blender and V-Ray Next: clay render, BF/BF and BF/light cache, CPU and GPU.

I want to mention that Cycles is still like 30% faster in CPU mode when using Linux instead of Windows.


Excuse me, what?!

Yes, but not only Cycles. CPU is generally faster in Linux. So for a fair test, all tests should be performed on the same OS.

I was running Manjaro and it regularly rendered about 15% faster than Windows 10.

Maybe they have managed to reduce the difference over time. But I still find that difference quite large, tbh.

I can confirm that Windows 10 is a lot slower at rendering as well as at common tasks: copying folders, starting programs, etc.

Win7 and Linux are closer together, though, or even about the same.

It is the compiler they use for the windows builds, not windows itself.


This was about a year ago for me. I no longer have a dual boot machine with Manjaro on it to test.

Me neither XD

This is easily demonstrated by using the Windows Subsystem for Linux (e.g. “Ubuntu on Windows”). Several months ago I tried command-line rendering with a Linux build of Blender on Windows under the WSL and I could obtain appreciably faster rendering times using Cycles. It doesn’t have to do with OS overhead like some have tried to imply.
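For anyone wanting to reproduce this kind of comparison, here is a minimal timing wrapper. The `blender -b file.blend -f 1` invocation is the standard background-render command line; the file name in the example is just a placeholder, and everything else is generic Python.

```python
import subprocess
import time

def time_command(argv):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.monotonic()
    subprocess.run(argv, check=True)
    return time.monotonic() - start

# Example: time one background-rendered frame of a .blend file, e.g.
# time_command(["blender", "-b", "bmw27.blend", "-f", "1"])
# Run the same command on each OS (or under WSL) and compare the results.
```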


Hey, thanks, that's a nice tip.

Maybe this is because WDDM is being avoided when WSL is used?

In your tests, have you also compared render times from the command line in a native Linux installation (not from WSL)?

Just asking. Anyway there are reports of users who also find other programs like Houdini faster in Linux than in Windows (CPU).

Anyway, this is going off-topic.

I haven’t tried from a Linux installation with the same build and file.

Anyway, I just tried the latest Linux and Windows builds under Windows, and this time I'm getting about the same rendering times. 2.79b, on the other hand, is much slower.


To keep this on-topic, I've also tried the latest 2.80 build from Buildbot. It's as fast as 2.79 master (understandably so, since all changes are getting ported into 2.80). I guess that previous compiler-related issues might have been solved. I haven't tried it natively on Linux yet, however.


Here I have tried the BMW27 scene in Kubuntu (CPU). The command line is only 3 seconds faster than rendering from the GUI, so it does not really make much difference (at least in this scene).

Edit:
*BMW27: original scene from Blenderartists with 20 square samples.