NOX, Cycles, Mitsuba, YafaRay (now Lux) comparison

… +1

Wrong thinking: LuxRender is not made for toon renders (why would you want to “downscale” a photorealistic renderer to a toon render?). Better to use BI for toon rendering (if it can do that, I don't know).

But GPU is the rendering power of the future. Does it make sense to compare old CPU rendering methods?
Yes, post the blend file, I will test it.

mkay, blend with yafa setup:
http://www.pasteall.org/blend/7796

Just try to make a similar material if you're gonna use Octane, guys - very reflective at grazing angles, with high-gloss reflections (but not 100% sharp - more like 80% sharp).
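For anyone rebuilding that look in Cycles, here's a rough sketch via Blender's Python API - a diffuse base mixed with a fairly sharp glossy layer, driven by Fresnel so the reflections come up at grazing angles. The roughness and IOR values below are just guesses, not taken from the .blend:

```python
import bpy

# Minimal sketch of a "reflective at grazing angles, ~80% sharp gloss" material.
mat = bpy.data.materials.new("GrazingGloss")
mat.use_nodes = True
nodes, links = mat.node_tree.nodes, mat.node_tree.links
nodes.clear()

out     = nodes.new('ShaderNodeOutputMaterial')
mix     = nodes.new('ShaderNodeMixShader')
diffuse = nodes.new('ShaderNodeBsdfDiffuse')
glossy  = nodes.new('ShaderNodeBsdfGlossy')
fresnel = nodes.new('ShaderNodeFresnel')

glossy.inputs['Roughness'].default_value = 0.1   # "not 100% sharp" - guessed value
fresnel.inputs['IOR'].default_value = 1.45       # guessed IOR

links.new(fresnel.outputs['Fac'], mix.inputs['Fac'])   # more glossy at grazing angles
links.new(diffuse.outputs['BSDF'], mix.inputs[1])
links.new(glossy.outputs['BSDF'],  mix.inputs[2])
links.new(mix.outputs['Shader'],   out.inputs['Surface'])
```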

Also, my machine is old, so most of you guys will get better render times, but this was about relative comparison, not absolute render times. And the relative results should stay similar for everyone.

Hi JoseConseco, thanks for sharing.

There is nothing wrong with your render settings. Material emit is an unusual lighting method for Yafaray; I think that is what produced the need for high samples and the long render times.

At the bottom of the ‘Object’ panel you can enable the planes as Meshlights, though for more complex scenes area lights are probably more efficient.

Using Area lights with 16 Path Samples and 2 additional passes, it renders on my machine in a little over 2 mins. (About 3.5 mins on a tri-core?)

With higher samples it will render cleaner, but given that it is a CPU render, I think it still compares favourably with the featured alternatives. Render times with meshlights are about the same, but they can be slightly noisier if they are the only light source used.

I also turned down Depth. Yafaray does most of the work in the first couple of bounces, so you wouldn’t really need 8 bounces for a scene like this where the lights are relatively easy for rays to find.
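Just to recap those settings in one place (the key names below are made up for readability, not the actual YafaRay exporter properties):

```python
# Illustrative recap only - not real exporter property names.
yafaray_setup = {
    "lighting":          "area lights",  # meshlights in scene 1, area lights in scene 2
    "path_samples":      16,             # samples in the first pass
    "additional_passes": 2,              # adaptive resampling passes
    "depth":             None,           # turned down from 8 (exact value not stated)
}
```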

Hope this helps.

Render - http://www.pasteall.org/pic/15432

.blend - mesh lights are in scene 1, area lights in scene 2:-
http://www.pasteall.org/blend/7804 (17 MB)

Thanks for taking the time, and for letting me interfere :slight_smile:

Same file as organic's render but with gamma 2.2 and a lower threshold to remove more noise. Core i7 => 2 min 15 sec.

Render -> http://www.pasteall.org/pic/show.php?id=15438

Cool, organic. I now have a better render time and the result seems more correct. I updated the first post with a summary.

Yes. Some noise could be caused by lighting samples, but broadly speaking, the Path Samples setting removes most noise, and the 'Threshold' determines how many pixels get resampled in subsequent passes. Getting a clean, fast render involves balancing the two. Using more Path Samples on a single pass will give a clean render, but taking advantage of the adaptive sampling lets you spend extra samples only where they are needed, for a shorter overall render time.
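Roughly what that adaptive scheme boils down to, as a toy sketch - this is not YafaRay's actual code, and `render_pixel` is a hypothetical stand-in for tracing one sample:

```python
import numpy as np

def adaptive_render(width, height, render_pixel,
                    base_samples=16, extra_passes=2, threshold=0.05):
    """Toy adaptive sampler. render_pixel(x, y) returns one noisy RGB sample."""
    img = np.zeros((height, width, 3))     # accumulated image
    noise = np.zeros((height, width))      # per-pixel noise estimate
    counts = np.zeros((height, width))     # samples taken so far

    def sample(mask, n):
        for y, x in zip(*np.nonzero(mask)):
            s = np.array([render_pixel(x, y) for _ in range(n)])
            total = counts[y, x] + n
            # running average over all passes so far
            img[y, x] = (img[y, x] * counts[y, x] + s.mean(axis=0) * n) / total
            counts[y, x] = total
            noise[y, x] = s.std(axis=0).mean()

    # first pass: every pixel gets the base sample count
    sample(np.ones((height, width), dtype=bool), base_samples)

    # additional passes: resample only pixels still noisier than the threshold
    for _ in range(extra_passes):
        noisy = noise > threshold
        if not noisy.any():
            break
        sample(noisy, base_samples)

    return img
```

Lowering the threshold means more pixels survive into the extra passes, which is why organic's render above got cleaner but a bit slower.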

I'm still slightly confused about the render time though. Condar's time is broadly in agreement with mine.
Why does 2 mins on a 2.8 GHz quad-core translate to 7 mins on a 2.6 GHz tri-core? Is your Yafaray up to date?

I’m not too knowledgeable about processors, perhaps the Phenom is just older and therefore slower? If so, it is pleasantly surprising that render times can be reduced by two thirds by a newer processor with similar spec.

Just for fun, here is a quick render made with Yafaray’s Direct Lighting with Ambient Occlusion:- http://www.pasteall.org/pic/15443
All other settings are the same.
Render time - 26s

GPUs can provide a much higher render speed, no doubt about that. The big issue they have compared to CPUs is operating temperature.

With the newest cards, you'd be hard-pressed to keep them from overheating unless you have a rather pricey cooling system to remove large volumes of heat from the interior of the PC tower. Even then, a lot of homes in my area at least do not have air-conditioning systems adequate to keep a room from getting hot when those GPUs are at full load, unless you opened the window in the middle of winter.

If I were to do rendering with a late-generation GPU at full load, I'd have to build an office or a special room for the PC to be in; our early-1990s-era air-conditioning system would be inadequate for such a heat-generating machine.

In the future I guess you will have separate GPU boxes with a few GPU processors just for the rendering power. No more PCs or rigs filled with graphics cards. The cooling is a "problem" that will be solved in the future as well.

But all this costs a lot of energy! And energy will not get cheaper in the future :wink:
We should also think about caring for our environment.
So in that way of thinking I agree with you that CPU render tests CAN make sense :slight_smile:

Kind regards
Alain

Hmmm.
A modern CPU has a TDP of around 100-140 W, a modern GPU around 225-350 W.
Let's take Octane's benchmark, as it was already brought up.
A Ci7-920 has a TDP of 130 W and scores a factor of around 100,
a GTX480 has a TDP of 320 W and scores a factor of around 4000,
for one of the sample scenes.

So if a GTX480 rendered a frame for an hour, it would use 320 Wh.
The Ci7 would need 40 hours for the same frame and consume 5200 Wh.

1 kWh in Austria costs around 18 euro cents.
That's roughly 6 cents for rendering the frame with the GPU, and about 94 cents with the CPU.

Also, producing 1 kWh of electricity causes around 1000 g of CO2 emissions.
That's 320 g for the GPU and 5.2 kg for the CPU.
(However, if you run your home's 8 kW air-conditioning system because you can't take the heat, things change.)
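The arithmetic as a quick script, using only the rough figures quoted above:

```python
# Back-of-the-envelope check; all inputs are the rough numbers from this thread.
TDP_GPU_W, TDP_CPU_W   = 320, 130       # GTX480 vs Ci7-920
FACTOR_GPU, FACTOR_CPU = 4000, 100      # Octane benchmark factors
PRICE_EUR_PER_KWH      = 0.18
CO2_G_PER_KWH          = 1000

hours_gpu = 1.0
hours_cpu = hours_gpu * FACTOR_GPU / FACTOR_CPU   # 40 h for the same frame

for name, watts, hours in (("GPU", TDP_GPU_W, hours_gpu),
                           ("CPU", TDP_CPU_W, hours_cpu)):
    kwh = watts * hours / 1000
    print(f"{name}: {kwh:.2f} kWh, "
          f"{kwh * PRICE_EUR_PER_KWH * 100:.0f} cents, "
          f"{kwh * CO2_G_PER_KWH / 1000:.2f} kg CO2")
# GPU: 0.32 kWh,  6 cents, 0.32 kg CO2
# CPU: 5.20 kWh, 94 cents, 5.20 kg CO2
```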

I've got 3 GPUs rendering in a small room with no air conditioning, and we've had 30°C+ for the last few weeks. Nothing that opening two windows to create a draft couldn't fix. But as (at least here) average room air-conditioning units draw around 2 kW, I could cool my system with one and still be more energy- and time-efficient than with the CPU.

So who's expelling more heat now? :slight_smile:
Personally, I'll gladly take less render time and less power consumption, which also means less heat expelled overall.
It just feels like more because it is expelled over a shorter time.

Interesting calculation.

One thing I ask myself: is Octane really 40x faster?
I also think we should compare scenes from "real business", for example an interior scene (with instances, glossy materials, displacement and all the critical things) that someone actually did for a customer. Those scientific test scenes are mostly useless and don't say much for practical business.

Kind regards
Alain

It doesn't have to be 40x faster; if it is even twice or three times as fast, you are saving electricity by using the GPU. From using Cycles and SmallLuxGPU, the speed is definitely there - maybe not 40 times as fast, but it is quicker.

First off, Octane doesn't have instancing - it would be nice though, to save VRAM. Also, it's not yet fully fit for interior scenes, as it hasn't had any advanced form of light transport. They didn't implement MLT but a new algorithm they developed, based on MLT and some other things tailored for the GPU, and it delivers very good results.

To be completely fair, you would also have to take into account that you have to import the scene into Octane, which can take quite some time and power, and that the scene is voxelized, for which the same applies. However, as you usually tweak the materials and lights later on, and you save all the test renders you'd otherwise do with the CPU because you get more or less instant feedback, I simply neglected it. I'd say they cancel each other out in the "equation".

However, overall Octane will keep its speed advantage. The only things that slow it down are calculations that aren't parallelized or parallelizable. And you don't even have to benchmark the speed; with big-O analysis you can estimate the speed advantage, as GPGPU is, simply put, highly scientific - but ideally Octane would get even faster.

Let's shoot a ray and hit a glossy transparent material with a normal map: you get a new ray bouncing off and one being refracted.
A traditional (no SMT) quad-core CPU would have 4 threads running, each of them shooting one ray. Let's look at one thread:
The ray intersects with an object. The shading and lighting for that point is calculated, offset by the normal-map value, shooting rays (one after another) from the point to the light sources. Once done, a new ray is sent out for the mirror reflection and bounces around until we kill it or it has no energy anymore. The next ray is sent through the object, refracted, exits the object refracted, shading and lighting is calculated (again for all lights) and the ray continues until we kill it or it has no energy anymore.

Now on the GPU, let's say (for the maths' sake) we have 80 fragment processors. You shoot 40 rays at once, leaving 40 spare.
Those 40 rays are already 10 times faster than the 4 rays our quad-core can handle, not to mention that floating-point and vector operations are many times faster on a GPU than on a CPU.
Now if one of our rays intersects an object, we don't have to do things sequentially like the CPU. We can shoot 10 rays in parallel from our 10 lights back to the point for shading and lighting, while the CPU has to do it sequentially. We can also already shoot the refracted ray through the object in the meantime. It is the parallelization that makes it so fast. But the raytracer has to be coded extremely well, and I can't really tell you what the code looks like inside Octane. :slight_smile:
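As a toy illustration of that sequential-vs-parallel difference (plain Python/NumPy, nothing like Octane's real kernels): the CPU-style function loops over the lights one after another, while the GPU-style one evaluates whole batches of hit points and lights in a single step - on real hardware each batch entry would map to its own fragment processor. The arrays `hit_points`, `normals` and `lights` are assumed inputs.

```python
import numpy as np

def shade_cpu_style(hit_point, normal, lights, intensity=1.0):
    """Sequential: one shadow ray per light, one after another."""
    total = 0.0
    for light in lights:                                # one iteration per light
        to_light = light - hit_point
        dist = np.linalg.norm(to_light)
        total += intensity * max(np.dot(normal, to_light / dist), 0.0) / dist**2
    return total

def shade_gpu_style(hit_points, normals, lights, intensity=1.0):
    """Batched: all hit points x all lights evaluated in one vectorized step."""
    to_light = lights[None, :, :] - hit_points[:, None, :]      # (rays, lights, 3)
    dist = np.linalg.norm(to_light, axis=-1)                     # (rays, lights)
    cos = np.einsum('rld,rd->rl', to_light / dist[..., None], normals)
    return (intensity * np.clip(cos, 0.0, None) / dist**2).sum(axis=1)
```

With NumPy the "parallel" version is of course still running on the CPU; the point is only the shape of the computation - batches of rays and lights instead of one at a time.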

Still, like tyrant said, even if it is only 3 times faster, you save energy.
And not to forget, once you reach a certain point, dunno, say 1200 samples per pixel, there isn't much changing anymore.
So in the time the CPU renders one average image, you can do a lot of GPU renders of similar quality and composite them to remove artifacts and fireflies. By the time you are done with the GPU, you'd just be starting to render the second frame with the CPU. :slight_smile:
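Compositing those short GPU renders is basically just averaging them per pixel; a per-pixel median is even more robust against lone fireflies. A minimal sketch, assuming the renders are already loaded as equal-sized float arrays:

```python
import numpy as np

def composite_renders(renders):
    """renders: list of (H, W, 3) float arrays of the same scene (seed-varied passes).

    The mean averages out noise; the median suppresses isolated fireflies
    (single extremely bright pixels) better.
    """
    stack = np.stack(renders)            # (N, H, W, 3)
    return stack.mean(axis=0), np.median(stack, axis=0)
```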

One downside is that, with Octane as a pure GPU renderer, there's no distributed rendering. If you've got 8 graphics cards in your computer, that's as fast as it gets, or you externally connect cards to your PCIe bus, limited to half a meter of distance, while with CPUs you can cluster one fantastillion of them together all over the world if you like… talking about energy efficiency :stuck_out_tongue:

The thing though is, ray tracing has been known since the late 1960s. Since then everything was developed sequentially; multithreading only became a research topic with many-core CPUs, which is not too long ago. And only since 2006, when unified shaders were introduced, have GPGPU and parallelization become a new field of research. CUDA was introduced with the GF8, also in 2006, and no one had an idea what to do with it. SETI@home and folding@home were the first tools to have magic GPGPU powers, around 2006-2008 IIRC? So 5 years of GPGPU?
This technology is somewhere between the discovery of fire and the invention of the wheel :slight_smile: - algorithmicians' heads are bursting open from the pressure of trying to parallelize all the algorithms of the last 50 years of computing.
So I guess we haven't seen real speed yet, as GPGPUs are evolving along with the software and algorithms that use them, and the introduction of many-core CPUs just underlines the fact that parallelization holds the key to future computing power. That, and later, quantum computers :slight_smile:

Speaking of it… the world's first ray-tracing chip:

It handles 14-24M rays/second per core.

I am already curious about the price tag.

It kind of looks like it's aimed at tablets and smartphones, so they will probably sell to manufacturers, not the general public. But even for mobile platforms it kind of has the feel of a bunch of hot air: the stuff they show looks extremely stripped down to be able to run. There is nothing on the level of Epic Citadel going on there.

Indeed, it seems to be targeting mobile devices, but I think that is actually a smart way to introduce it.

I guess for a ridiculous amount of money you could produce a desktop ray-tracing chip no one would buy, and you could make ray-traced games, which would require building an engine from scratch, which no one would do because there would be no market.
I remember buying my Voodoo1 back in the day on good faith, as there was only one game available for it ^^

The RayCore chip can handle 960x640. No one with a 1080p+ display will write home about that.

If you take a tablet or mobile phone now and offer real-time ray tracing for games, it's something else: there you are running at the maximum resolution and can offer a visual quality that hasn't been possible yet with OpenGL ES.

And if you look back, 3dfx's Voodoo chips weren't that great either, yet they made Nvidia and rasterization what they are today.

So yeah. RayCore is a puny first step. But each step moves you forward :wink:

Are you sure? I've seen mentions of ray-tracing hardware long ago. Hm, ah, Wikipedia has some of them listed (well, what is not on Wikipedia nowadays…): http://en.wikipedia.org/wiki/Ray_tracing_hardware

Anyway, off topic :wink:
Daniel

From your very link:
“Siliconarts developed a dedicated ray tracing hardware (2010) to achieve at interactive rate rendering. RayCore (2011), which is the world’s first ray tracing semiconductor IP, was announced.”

Everything else was either research prototypes at universities, one card to accelerate GI, or pre-manufactured complete systems.

What about LuxRender?