Cycles Development Updates


(Indy_logic) #383

So, I’m not disagreeing with the general sentiment that vanilla Cycles is slower than most other renderers. However, I work in VFX in LA and I can tell you that 4-hour render times are not ridiculous. Also, it’s not accurate to compare Cycles to V-Ray, because you’re comparing brute-force path tracing to V-Ray’s approximate methods. We use V-Ray here at work and we NEVER use approximate methods, because they usually cause artifacts. When comparing apples to apples (like V-Ray brute force vs. Arnold) it actually comes out more similar. I can’t comment on Corona though.

The thing is, for animated feature film work, all of the major studios use a brute-force path tracer. RenderMan, Arnold, Hyperion, Manuka and Glimpse are all path tracers. They are used because of how unreliable approximate methods can be. Sure, they might work for archviz, but not for animation.

Another aspect of this is that, when rendering for film, it’s cheaper to spend money on render time than on artist time. This is also one of the main underlying reasons for doing as much as possible in camera. It takes a lot of artist time to break a scene up into individual passes and exclusions. It may take longer to render the whole scene with motion blur and DoF, but at least you’re not paying someone $60 an hour to spend tons of time breaking everything out.


(Stefan Werner) #384

I don’t know how Jeff measured render time, but when measuring from “press render” to “final file on disk”, a significant amount of time in production is just file transfers. When you have a hundred render nodes simultaneously asking for tens or hundreds of GBs worth of textures, the file server can become quite busy (and slow).

To quote an upcoming TOG article about Arnold:

[…] simply fetching all this data from a local or networked disk, even in compressed form, could easily take an hour, or much more if the file servers or network are saturated with texture requests from the render farm.

I know, four hours may sound long to many. Keep in mind, though, that those are scenes much larger than what the average Blender user has. One of the first Cycles patches that came out of the production of “Next Gen” removed the limit of 1024 textures, because that became a real issue.


(rawalanche) #385

Alright:

1. Working in VFX at a large studio does not automatically equal competency. In fact, it’s often quite the contrary. Large studios have brute-force pipelines set up, which usually trade performance for being error-proof and not requiring any technical skill from the artists. That’s how Arnold-based pipelines mostly work. I have been surprised how many people working in large VFX studios with shiny titles such as Senior VFX TD have a complete lack of understanding of even basic sampling optimization strategies. If those people are supposed to “technically direct” anything, then I am scared to even know what kind of technical skillset non-TD employees possess.

2. I never stated whether I was comparing Cycles to V-Ray and Corona using secondary cached GI or using full path tracing. I actually compared Cycles to Corona using pure PT+PT in Corona, and Corona still performed about 3–4× faster in an identical scene on the CPU. On GPU, Cycles on a GTX 1080 Ti was slightly slower than PT+PT Corona on an i7 5930K, a 4-year-old 6-core CPU.

3. The days when approximate methods in either Corona or V-Ray caused artifacts, whether on static or animated scenes, are long gone; that’s why these setups are also the defaults these days. I haven’t seen any animated scene, regardless of complexity, which could not be handled by V-Ray’s light cache or Corona’s UHD cache in the past 3–4 years… not a single one. These days, the retracing techniques for these caches have been refined to perfection.

4. The reason Cycles doesn’t have cached secondary GI has nothing to do with it not being able to produce artifact-free results. The main reason is that cached GI on its own is hard to implement, and very, very hard to implement on the GPU. It’s possible, as Redshift has proven, but challenging nonetheless. Even large rendering companies have a hard time pulling it off, so it’s understandable that for the Cycles dev team, with such limited resources, it’s not on the roadmap.

5. What this means in practice is that the only reason you would not use a cached solution for secondary GI (possible artifacts) is no longer valid, and there is no longer any excuse not to use one. This, in turn, gives renderers with such caching methods an even bigger advantage over the others. So while Corona and V-Ray are “just” faster than Cycles in pure PT+PT mode, they are much, much faster in PT+cached mode. In this thread, however, I was initially referring just to the difference between Cycles and the PT+PT mode of Corona and V-Ray, as I generally tend to compare apples to apples too.

6. RenderMan, Arnold and Manuka are made to handle complexity at the expense of performance while being error-proof, as I pointed out in my first post. It has absolutely nothing to do with instability of the approximate solutions. For example, the ILM generalist department uses V-Ray heavily and they are able to easily get their shit done :slight_smile:

7. As for the argument that approximate methods are fine for archviz but not animation, I don’t even know what to say… I’ve delivered so many animated shots over the past years using approximate methods for secondary GI, and never had issues. I think you are talking about the V-Ray of 6 or more years ago.

Now, my main point:
Blender’s main demographic, at this point, is smaller studios and individual freelancers, exactly the kind of people who can’t afford those giant pipeline-oriented rendering solutions which trade performance for reliability. Sure, we need our renderer to be reliable too, but first and foremost, we need to be able to even afford to render. Render times are still an issue, especially for animations. Trust me, I know it… this is one of my last personal projects:


Rendered in Corona with cached secondary GI: 3400 frames at Full HD of a scene mostly covered in very dense, reflective and translucent foliage. It was completely stable. I know there is some small flickering here and there, but that’s not from the GI, that’s from the antialiasing. Wanna know why? Because I had to render it at my own expense. Believe it or not, at Full HD, each frame of this scene took just 15 minutes on my 4-year-old 6-core CPU. The scene took about 56 GB of memory, so it would never fit on any GPU.

Despite using such a fast renderer as Corona, and getting an incredible render time of just 15 minutes for such complex frames, including in-render motion blur and DoF, I still had to pay over 1500 EUR, or $1750, out of my own pocket to get this scene rendered. If I had tried to do this in Cycles, I’d have failed, because I would not be able to spend, let’s say, around $7000 just on rendering a personal project. So even these days, it’s still borderline impossible for a freelancer to render a short movie of a complex scene on his own. And for this to change, we need to take advantage of every optimization available, and get rid of that misguided mentality of “it has to be unbiased” or “caching is cheating”.


(Stefan Werner) #386

Do you have any guess whether that’s due to Corona delivering more samples/s or Corona delivering less noise at the same number of samples? That’d be interesting to know.


(rawalanche) #387

I don’t, but I can run the tests, right now in fact. I know where to get the rays/s stats in Corona, I just need to know where I can get them in Cycles :slight_smile: One of my bets, however, is that the random sampler makes a big difference too.
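For separating the two factors, one common yardstick is Monte Carlo efficiency: since path-tracing error falls as 1/sqrt(samples), halving the noise costs roughly 4× the render time, so renderers can be compared by the product noise² × time regardless of their sample counts. A minimal sketch, with purely hypothetical numbers:

```python
def relative_efficiency(noise_a, time_a, noise_b, time_b):
    """Compare two renders by the product noise^2 * time (lower is better).

    Monte Carlo standard error falls as 1/sqrt(samples), so squaring the
    noise converts it into a time-like cost and makes the comparison fair
    even when the renderers used different sample counts.
    Returns how many times more efficient render B is than render A.
    """
    return (noise_a ** 2 * time_a) / (noise_b ** 2 * time_b)

# Hypothetical numbers: renderer B reaches half the noise of A in the
# same wall-clock time, which makes it 4x more efficient overall.
print(relative_efficiency(1.0, 300, 0.5, 300))  # -> 4.0
```

This is why comparing at equal sample counts alone can mislead: a renderer delivering fewer samples/s can still win if each sample carries less variance.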

EDIT: Maybe this should be split off into another “Cycles Performance” thread :slight_smile:


(rawalanche) #388

Alright, so I did the tests and I am actually quite surprised by the results. I compared Cycles in the latest master (August 02) to Corona V2. Both pure path tracing, with a ray depth limit of 12 and a max ray intensity of 10 (ray clamping). Identical scenario, identical camera, identical material. And to my surprise, Cycles on the CPU was actually just as fast as Corona.
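For reference, matching those limits on the Cycles side can be scripted; this is only a sketch of a config fragment, assuming the 2.7x-era property names, and it has to be run inside Blender (it is not a standalone script):

```python
import bpy  # only available inside Blender's embedded Python

scene = bpy.context.scene
scene.cycles.progressive = 'PATH'            # pure path tracing, not branched PT
scene.cycles.max_bounces = 12                # total ray depth limit
scene.cycles.sample_clamp_indirect = 10.0    # roughly Corona's "max ray intensity"
scene.cycles.device = 'CPU'                  # CPU-only, to match the Corona test
```

Driving both renderers from scripts like this makes it easier to be sure an apples-to-apples comparison really used identical settings.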

I am not sure what accounted for such a difference in my previous tests. Granted, they were done over a year ago, and I may have made a mistake, but I suspect that the better tile size handling may have something to do with it: I was not aware of the impact of tile size on performance back when I did those tests, and in the latest master it’s handled automatically.

Nonetheless:
Cycles, 5m 32s

Corona, 5m:


Very comparable results, very comparable times.

Now I did one more test, with Corona and secondary cached GI:


This one is the default and is used for pretty much every scene, even animated ones. It’s about twice as clean as those pure path-traced renders in one fifth of the time, while also being more accurate, since the secondary GI bounces are not limited to 12, a limit which causes some light energy loss. So in practical scenarios, especially interiors, cached methods are superior.

Nonetheless, I stand corrected. If we compare pure path tracing performance, it’s apparent that Cycles, at least the one in the latest master, is just as performant as Corona (or V-Ray, for that matter). So really great job on that. I guess I underestimated Cycles quite a bit.

The render stamp from Corona shows rays/s. In Cycles, I still don’t know how to obtain that value.
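Even without a rays/s readout, a rough throughput figure can be derived from numbers Cycles does expose (resolution, sample count, render time). A minimal sketch with hypothetical numbers; note this counts camera samples, not total rays:

```python
def camera_samples_per_second(width, height, spp, seconds):
    """Rough throughput estimate: camera (pixel) samples per second.

    Actual rays/s would be considerably higher, since every camera
    sample spawns additional bounce and shadow rays up to the ray
    depth limit, so this is only a lower bound for comparison.
    """
    return width * height * spp / seconds

# Hypothetical: a Full HD frame at 500 samples/pixel finishing
# in 5m 32s (332 seconds).
rate = camera_samples_per_second(1920, 1080, 500, 332)
print(f"{rate:.3e} camera samples/s")
```

As long as both renderers are measured the same way, this is enough to tell whether a speed gap comes from raw throughput or from noise per sample.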


Cycles Performance
(Stefan Werner) #390

Detailed render stats are high on my wish list too.


(JuhaW) #391

Where do I get that scene? I can test it with Blender and V-Ray Next: clay render, BF/BF and BF/light cache, CPU and GPU.


(anaho) #392

I want to mention that Cycles is still like 30% faster in CPU mode when using Linux instead of Windows.


(lacilaci86) #393

Excuse me, what?!


(English is not my native language) #394

Yes, but not only Cycles. CPU rendering is generally faster on Linux. So for a fair test, all tests should be performed on the same OS.


(oaschwab) #395

I was running Manjaro and it regularly rendered about 15% faster than Windows 10.


(anaho) #396

Maybe they have managed to reduce the difference over time. But I find that difference still quite large, tbh.


(rbx775) #397

I can confirm that Windows 10 is a lot slower at rendering as well as at common tasks: copying folders, starting programs, etc.

Win7 and Linux are closer together, though, or even about the same.


(anaho) #398

It is the compiler they use for the windows builds, not windows itself.


(oaschwab) #399

This was about a year ago for me. I no longer have a dual boot machine with Manjaro on it to test.


(anaho) #400

Me neither XD


(s12a) #401

This is easily demonstrated by using the Windows Subsystem for Linux (e.g. “Ubuntu on Windows”). Several months ago I tried command-line rendering with a Linux build of Blender on Windows under WSL, and I obtained appreciably faster render times with Cycles. So it doesn’t have to do with OS overhead, as some have tried to imply.


(anaho) #402

Hey, thanks, that’s a nice tip.


(English is not my native language) #403

Maybe this is because WDDM is avoided when WSL is used?

In your tests, have you also compared render times from the command line in a native Linux installation (not from WSL)?

Just asking. Anyway, there are reports of users who also find other programs, like Houdini, faster on Linux than on Windows (on the CPU).

Anyway, this is getting off topic.