Volumetric bidirectional spectral patch needs your help

Yes, you!

I think that even in its current half-arsed state it can compete with trunk on selected scenes, thanks to better sampling of places that are hard to hit from the camera with the current path tracer. On typical scenes it is about 4x slower than trunk Cycles on my CPU.

So I claim that bidirectional sampling is definitely useful. Add to that the fact that some very special scenes clean up noise faster, and since it includes a light tracer (as any bidirectional integrator does by definition), it can render caustics from point/spot lights.

The problem is that it is mostly a one-man project, the code is bad and full of bugs, and, as someone already noticed, the bidirectional algorithm has a rather steep learning curve, so I do not expect many people to start hacking on it immediately.

But I know myself as a guy who loses interest quickly as soon as the main hard conceptual task is solved (and it is solved: for the last 2 weeks I have had only positive local tests), and that means we need more people to get familiar with the algorithm to make it a robust, supported part of Blender.

What is a bidirectional tracer, and why is it better?

It is almost the same as Cycles; in fact it is Cycles, with the small addition that it traces paths not only from the camera (eye) but also from light sources such as point lights or meshes with an emission material. It works better when the scene has complex geometry, because some areas are explored faster, sometimes orders of magnitude faster than with the classic path tracer.
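For the curious, the formulation the patch follows is the standard Veach-style combination: for each pixel, every way of joining a light subpath of s vertices with an eye subpath of t vertices is evaluated and weighted with multiple importance sampling, roughly:

```latex
% Sketch of the Veach-style bidirectional estimator.
% \bar{x}_{s,t}: full path built from s light-subpath and t eye-subpath vertices,
% f: measurement contribution function, p_{s,t}: density of that strategy,
% w_{s,t}: MIS weight (e.g. balance or power heuristic).
L \;\approx\; \sum_{s \ge 0} \sum_{t \ge 1}
      w_{s,t}(\bar{x}_{s,t}) \,
      \frac{f(\bar{x}_{s,t})}{p_{s,t}(\bar{x}_{s,t})},
\qquad
\sum_{s+t=n} w_{s,t}(\bar{x}) = 1 .
```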

What is the current state in general?

It can render a few scenes in the preview window. No GPU, CPU only; one step left or right and it segfaults/crashes, or wipes your hard disk and does other very bad things. (Ouch!)

What does it need before a first graphicall build for wider testing?

The main stopper is the tiled render feature, which I cannot work around; I need help with it. In final render, Cycles splits the image into tiles and allocates them on demand to save GPU memory. But some contribution pixels must be written into other tiles, which may not exist yet, and we get a segfault / program crash.
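To illustrate the failure mode, here is a minimal, hypothetical sketch of guarding light-tracing splats against the current tile. All names (Tile, splat_contribution, the buffer layout) are made up for illustration and do not match the real Cycles tile code; a real fix would forward out-of-tile contributions to a full-frame buffer or to the owning tile instead of dropping them, otherwise the light-traced part of the image is lost near tile borders.

```cpp
#include <cstdio>
#include <vector>

struct Tile {
    int x, y, w, h;              // tile origin and size in pixels
    std::vector<float> rgba;     // tile-local buffer, w * h * 4 floats
};

// Return true only if (px, py) lies inside the tile we own right now.
static bool inside_tile(const Tile &t, int px, int py)
{
    return px >= t.x && px < t.x + t.w && py >= t.y && py < t.y + t.h;
}

static void splat_contribution(Tile &tile, int px, int py, float value)
{
    if (!inside_tile(tile, px, py))
        return;                  // would otherwise index past the tile buffer -> segfault
    int local = ((py - tile.y) * tile.w + (px - tile.x)) * 4;
    tile.rgba[local] += value;   // accumulate into the red channel only, for brevity
}

int main()
{
    Tile tile{0, 0, 32, 32, std::vector<float>(32 * 32 * 4, 0.0f)};
    splat_contribution(tile, 10, 10, 1.0f);   // inside the tile: accumulated
    splat_contribution(tile, 50, 50, 1.0f);   // outside: dropped safely
    std::printf("%f\n", tile.rgba[(10 * 32 + 10) * 4]);
    return 0;
}
```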

Are there any similar projects? Maybe it would be better not to make yet another unfinished project, but to unite efforts around a single big one?

Hard question.
LuxRender, of course, is very mature, proven over a long time and supported, has a similar algorithm, and already has more features. The only reasons I play with Cycles instead are that I never understood the SLG/main LuxRender relation (it was only explained last year), and I had the bad luck of never getting it to compile and work here, while Cycles started working immediately after make install from the moment it was first published.

What help do you need besides that obvious bug that makes final render unusable?

  • Perspective camera DoF support: connect_camera() must respect the lens aperture circle.
  • Better understanding of the non-symmetric BSDF caused by the interpolated normal. For now I use a hack, and I am not sure it works for anything but the diffuse BSDF (see the sketch after this list).
  • Code refactoring; it is still a nightmare. Maybe split the unfinished MLT-related code out of kernel_mlt.h into a separate file.
  • Many places, such as kernel_light.h and kernel_random.h, have questionable parts. They are complex to explain, related to how QMC works and how the initial direction is sampled from a light. I really need help from other people who understand that matter.
  • Too many fireflies in some scenes. I suspect the ray pdf goes negative or very close to zero when it should not; maybe precision related.
  • MIS with the background as a light is broken; I just have not had time to look at it, as more important parts pull my attention.
  • Broken MIS when the "shadow" feature is used; it double-highlights some parts and needs more work to fix, so no nice "milk in glass" pictures posted by me.
  • Russian roulette is not supported at all; you MUST set min samples = max samples. I am too lazy to fix it, as it is not very hard or interesting ^^
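For the interpolated-normal item above, the usual reference is Veach's thesis (section 5.3), which gives a correction factor to apply to the BSDF on light-traced (importance transport) paths when the shading normal differs from the geometric normal. Below is a minimal sketch of that factor; all names are illustrative and this is not the actual hack in the patch:

```cpp
// Hedged sketch of the shading-normal correction for light (importance)
// transport, following Veach's thesis. Camera paths keep a factor of 1.
#include <cmath>

struct float3 { float x, y, z; };

static float abs_dot(const float3 &a, const float3 &b)
{
    return std::fabs(a.x * b.x + a.y * b.y + a.z * b.z);
}

// wo: direction toward the previous vertex, wi: sampled next direction,
// Ns: interpolated (shading) normal, Ng: geometric normal.
static float shading_normal_correction(const float3 &wo, const float3 &wi,
                                       const float3 &Ns, const float3 &Ng)
{
    float denom = abs_dot(wo, Ng) * abs_dot(wi, Ns);
    if (denom == 0.0f)
        return 0.0f;
    return (abs_dot(wo, Ns) * abs_dot(wi, Ng)) / denom;
}
```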

Bidirectional is cool, but what happened to the volumetric, MLT and spectral sampling parts?

The volumetric part has worked for a long time, if you can call it "work" that it can produce light beams and a few other cool pictures, after some weeks of rendering per frame, of course. I barely change it; I only reuse it as a debug tool for the bidirectional part. It is a very nice feature when combined with bidirectional: for a single point/spot light source in a foggy atmosphere it produces rays, with acceptable speed if few bounces (0 or 1) are used.

MLT, or Metropolis Light Transport, or MCMC (Markov Chain Monte Carlo). Another hard question. In short, sometimes it is so nice I cannot believe it. But in 99.99% of cases it requires a terrific bounce count to produce something that looks like an object rather than an abstract point cloud. I still very much believe in that algorithm; maybe it needs to be used selectively on parts like caustics from lights. Too big a question to write about in detail here.

Spectral sampling is just a joke, not really physically correct. It was a quick hack to sample wavelength, assuming a black-body-like light source. It can render diamonds on the floor and colored volumetric Gaussian cones behind glass globes in fog. Not very useful.

Update:

MCMC/MLT is starting to show its awesomeness: a few interior scenes with only diffuse flat meshes converge almost as quickly as Maxwell (at least I have that feeling after many noisy previous attempts :P)

If I were smarter and better at coding, I would help. It’s fantastic reading about what you’ve done so far :slight_smile:

I hope you can get the needed extra hands on this as soon as possible!

I’m willing to jump in and take a look at your code. How far does it deviate from the Veach thesis? Do you have a git repository available, or a branch somewhere that we can build and play with? Your progress has been impressive, but it sounds like it may take just as much time as you’ve spent on it so far just to clean up the code and make sure it is in sync with the rest of the Blender code base.

Thanks.

It is completely Veach-based; I must admit I am a bit biased about that dissertation, it has been a long time since I read such a detailed description of complex things. Some trouble was that Veach "promotes" pdfs per projected solid angle, while Cycles uses solid angle, and that confused me a lot at one step. I have a local git repo (~2 months), but I doubt it makes sense to publish it, as it contains many unrelated things, and with my English skills the comments are just a joke.
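For anyone who hits the same confusion: projected solid angle just folds the cosine into the measure, so converting a pdf between Veach's convention and the one Cycles uses is a single cosine factor (θ measured from the surface normal):

```latex
% Relation between the two pdf conventions:
% \sigma^{\perp} = projected solid angle (Veach), \sigma = solid angle (Cycles).
d\sigma^{\perp}(\omega) = |\cos\theta|\, d\sigma(\omega)
\quad\Longrightarrow\quad
p_{\sigma}(\omega) = p_{\sigma^{\perp}}(\omega)\, |\cos\theta| .
```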

I think we must do a quick refactoring of the patch to a state where it can be placed in Blender svn, or maybe trunk, but trunk is very questionable as it intersects with too many parts of Cycles; it is hard to isolate it as a module with simple #ifdefs. Most problematic are kernel_random.h and, obviously, kernel_path.h: I made too many customisations in the QMC code, it allocates 4x more samples and uses a longer Cranley-Patterson rotation vector, because without it the volume sampling is too ordered. Maybe I should just catch Brecht and the others and get directions on what must be done to bring the code a bit closer to Blender style.
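For reference, a Cranley-Patterson rotation is just a per-dimension random shift of the QMC samples, wrapped back into [0, 1). A minimal sketch with made-up names (the real kernel_random.h code is different):

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Shift a low-discrepancy sample by an offset, modulo 1. This decorrelates
// pixels (or dimensions) without destroying the QMC stratification.
static float cp_rotate(float sample, float offset)
{
    float x = sample + offset;
    return x - std::floor(x);              // wrap back into [0, 1)
}

int main()
{
    // Pretend these are the first Sobol/Halton samples of one dimension.
    std::vector<float> qmc = {0.0f, 0.5f, 0.25f, 0.75f};
    float per_pixel_offset = 0.37f;        // would come from a hash of the pixel
    for (float s : qmc)
        std::printf("%f\n", cp_rotate(s, per_pixel_offset));
    return 0;
}
```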

By the way, if someone sets up git or another system, I will rebase onto it, no doubt. Another question: maybe it is not that useful and I should stop pushing it and do something else? LuxRender will eventually get a node system (trunk recently got one) and the SLG->main transition; do we really need another competing "fork" like my patch?

Rebased after the SSS addition, no other changes. Trying to get the asymmetric BSDF correct; it partially works for the diffuse material and the light-trace contribution, and needs to be extended to the other cases.

Cool UFO-like scene, 40 min; I think it needs 2 hr+ to clear the noise.


The fact that we are seeing a larger variety of scenes is alone a good sign that there is a higher rate of progress now than before.

Keep up the good work, perhaps you will be able to present your volumetric patch to Brecht in a state where he can use it as a base for when he starts working on volumetrics. :slight_smile:

Sorry, I am no coder, so I am afraid I am not of much help here… I can just say that I would really like to see this finished! :slight_smile:

But there is a dilemma: what if Brecht creates his own version of volumetrics for 2.69?

Maybe everyone who has time should spread this on social media too, and maybe some highly skilled programmer will see it and want to help.

Storm, good luck. I really hope for the best, because volumetrics is the most needed Cycles feature.

Hmm… in fact it would be cool if someone talked with Brecht about this, maybe you guys can join forces :wink:

Thanks. The base problem is that it requires many more samples, as it measures energy in a volume, not only along surfaces, and current hardware is a bit weak for everyday usage. I found it can be usable with a short bounce count like 0 or 1, but that introduces bias, and I love physically correct results; they look more natural without any manual artist tricks. That is the reason I tend to post only 8+ bounce, multi-hour scenes. For example, this



is the same scene, but with 1 bounce instead of 8: much less noise, as there are more samples in the same 40 min render on the same hardware. Still noisy even for animation, lol, 40 min per frame on an i7 3770K.

About the relation with Brecht's version: there are too many obstacles, GPU support being the most problematic. As an Nvidia hater I do not want to get the hardware and tweak for CUDA only, while my beloved AMD continues to ship the buggiest OpenCL compiler in the industry and cannot render anything. The Clover project still has no support for complex kernels. All of that forces me to stick to CPU-only parts, trying to keep the patch syntax close to the common OpenCL/CUDA parts in the hope that it will just work when the compiler is fixed.

Brecht is in a much worse position: he MUST support the GPU, Nvidia or AMD does not matter, as the GPU is the main part of Cycles' success (FFS, a 10x speedup matters!). Despite the GPU RAM limit, OSL and the rest, his task is to make it work and work fast. I think the best I can do is polish some parts to make it more usable on the CPU, in the hope that it can be easily reused by Brecht, or at least serve as a reference to compare against other algorithms; it is always good to have at least something working that you can compare in speed/quality.

FFS, a 10x speedup matters!

Almost nobody gets a sustained 10x speedup with GPGPU. If they do, it often means they were porting code that was inefficient in the first place, and porting it to a data-parallel format would have improved CPU performance as well (with CUDA this isn't obvious because it doesn't run on the CPU, unlike OpenCL).
Cycles shows a 2-4x speedup compared to a good CPU on non-trivial scenes and that figure is only bound to get worse when increasing kernel complexity.
Also, I don’t think Brecht MUST support the GPU. He’s made it clear that he won’t have feature progress held back by GPU inadequacies. It’s better to have a working CPU solution first, be it for Hair, SSS or your Volumetrics.

Would the GPU fail in most volume cases anyway,

like moderate smoke/fire renders?

@storm_st, I would be happy to help out if possible. I still have plenty to do on the hair system, but some of this should be finished soon… hopefully. I have actually already looked through the volumetric part of the patch. I would be happy to attempt refactoring that element… and to try to understand the rest of it.

BroadStu: Wohooo!

I have to say, the Bi-Dir render in post #5 really looks good, great work storm!

@BroadStu: Great! But you are probably better off keeping out of the patch source for now (like anyone else), even though help is very necessary, because it has become very cryptic; I can barely understand many parts myself if I have not looked at them for a week. The main goal is still not the patch itself, but speeding up noise cleaning in general.

Also, I need some proof test scenes to be sure the formulas used are correct. For example, I think I just found another very long-standing epic-fail bug in my volume pdf code (thanks to the recent SSS addition, which pushed me to try to reproduce the great SSS dragon image posted in another thread): it seems I wrongly assumed I must clamp the pdf of homogeneous media, when the sampled length passes a known obstacle (a known triangle intersection point), to the value at the intersection point. For a long time I suspected the current images were a bit too bright in the distance compared to other renderers, and I thought it was monitor calibration or some unrelated issue; with the fixed pdf the images become much more similar to other renderers. Multiple scattering is a very complex thing for me; it is hard to imagine how the image must look.
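For context, the standard formulation for distance sampling in a homogeneous medium, when the ray is known to hit a surface at distance t_max, separates a continuous density for scattering inside the medium from a discrete probability of reaching the surface; this is what I am comparing my code against now:

```latex
% Standard free-flight sampling in a homogeneous medium with extinction
% \sigma_t, when the ray is known to hit a surface at distance t_max:
p_{\mathrm{medium}}(t) = \sigma_t\, e^{-\sigma_t t}, \qquad 0 \le t < t_{\max},
\qquad\qquad
P_{\mathrm{surface}} = e^{-\sigma_t t_{\max}} .
% Clamping p(t) at the intersection point instead of using these weights
% skews the estimator and shifts brightness with distance.
```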

I am certainly not a coder and only have a vague idea of how all this stuff works, but I have been thinking about the volume thing a lot and wonder what your approach is like?

It would seem to me a lot of noise will naturally be generated and the scene will converge slowly if every visible point in space has to be hit, lit up, and absorb light in a physically correct manner.

Is bidir necessary for good volume generation? I can see how it would speed things up.

Do you use any kind of interpolation, such as cheating by adding the same value to any participating media in a configurable radius around your ray, maybe with some falloff in order to make the image converge faster? I don’t know if that approach would make sense, but it would seem to me that light would naturally scatter a bit anyway.

Of course that would present a bunch of different problems, such as not having any bleeding around the edges of a sharply defined volume.

I’m afraid I cannot really give you any tangible help, but I’m willing to ask endless dumb questions :smiley:

Is bidir necessary for good volume generation? I can see how it would speed things up.

Bidir greatly speeds up noise cleaning in some places, for example near light sources (I think it is the only "wow" factor of this patch, if you ask me; the classic PT takes 10x+ longer to converge that noise, while from an average user's point of view it is almost instant with 0-bounce settings). I am a bit too busy with more important bugs to post a good, detailed comparison of the noise-cleaning differences between patched and trunk Cycles in an equal setup (same settings, same hardware).

Do you use any kind of interpolation, such as cheating by adding the same value to any participating media in a configurable radius around your ray, maybe with some falloff in order to make the image converge faster? I don’t know if that approach would make sense, but it would seem to me that light would naturally scatter a bit anyway.

I have only started to implement a Mitsuba render trick that uses a controlled (50% by default) chance to hit either the medium or the obstacle behind it. I have trouble understanding the details, but it is somewhat as you say: it "rescales" the sampling distance a bit to make the noise split between background and medium more predictable, in the worst case 50/50, not as extreme as now (sometimes 1000:1). That code is commented out and needs more work. BTW, that method is not a fake; it is absolutely unbiased and correct and will produce the same image.
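If I understand the trick correctly (and I may not, so treat this as a sketch of one standard way to write such a defensive split, not necessarily what Mitsuba does): the medium is chosen with a fixed probability q (0.5 by default) and the surface behind it with probability 1 - q, and each branch is reweighted so the estimator stays unbiased:

```latex
% Defensive medium/surface split with fixed probability q:
% the medium branch samples t from the exponential pdf truncated to [0, t_max).
t \sim \hat{p}(t) = \frac{\sigma_t\, e^{-\sigma_t t}}{1 - e^{-\sigma_t t_{\max}}},
\qquad
W_{\mathrm{medium}} = \frac{1 - e^{-\sigma_t t_{\max}}}{q},
\qquad
W_{\mathrm{surface}} = \frac{e^{-\sigma_t t_{\max}}}{1 - q} .
```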

Any help is welcomed: for example, better source comments (I obviously have a big problem with English), variable names that are exact and descriptive, hints on how to better modularise it and group common parts into a separate source file, just anything. Too bad it has the crash/segfault issue in final render, so I cannot call for really wide testing with builds and bug reports yet, but it is getting close to that.

endless dumb questions

Any question is welcome, if you can tolerate my English skills.

I don’t know if we are talking about exactly the same thing or not when it comes to “cheating” in volumes.

If I understand correctly, the way a path tracer works is that you evaluate the color of a pixel based on what kind of surface it is and the light hitting it from surrounding light sources and objects.

Without a clear understanding, I imagine that with volumes you will have to evaluate how much light is absorbed, reflected, or refracted at each point along the entire length of a light ray terminating on the surface of something and add/mix it all together. Is this correct?

Therefore a lot of noise will happen since rays are scattered much more between each other than the termination points on surfaces in the scene.

If my (completely uninformed) assertion of how this works is true, would it make sense to only sample points every X units along the ray and interpolate between them? And also interpolate between neighboring rays in participating media, essentially lowering the resolution of the participating media and smoothing between known points.

All this is completely based on my own non-understanding of the process :slight_smile: If nothing else you can re-ignite your motivation for this by realizing you know more than most of us about this :slight_smile:

I imagine that with volumes you will have to evaluate how much light is absorbed, reflected, or refracted at each point along the entire length of a light ray terminating on the surface of something and add/mix it all together. Is this correct?

Yes.

Therefore a lot of noise will happen since rays are scattered much more between each other than the termination points on surfaces in the scene.

It depends. A path tracer "measures" not only light, which could be interpolated, but also things like shadows, and in some cases that information cannot be easily interpolated. And the current algorithm samples one path at a time; in theory it could split into a wide ray tree, but that would use a huge amount of RAM when the number of bounces grows, as it is exponential in that number. If you want to "cheat", for example by using coarse data after the first hit, it can be done with the well-known "irradiance cache" algorithm; it is very fast and produces nice images. See http://www.mitsuba-renderer.org/devblog/?p=163 and scroll to Beam Radiance Estimate. It is biased, of course, but it drastically reduces volume sampling. The idea is to build photon-tracer-like BVH storage and preprocess the scene, filling it with indirect energy, for example at every pixel, with interpolation. In the final phase it works like a 1-bounce path tracer, very fast. I am just not very interested in such biased algorithms; I must first finish a robust "reference" solution that can later be used to measure how good a biased picture's quality is, side by side.

Trunk Cycles got two very visible fixes: a BVH fix for some corner-case scene geometry setups, and a BSDF fix for vectors close to extreme angles. The first one literally halves render time on some scenes and is not very related to my patch, but the BSDF fix improves the picture a lot, clearing 99% of the fireflies that I saw on some scenes with glossy.

An example: a 12 min render on my i7 3770K, 8 threads, true 8 bounces (min = max), no cheats like clamping, No Caustics or glossy filter, nothing, just a brute-force render.

Bidirectional integrator:


Path tracer:


Some difference in brightness, I think, is because the PT cannot trace the caustics from the shiny lamp at all, but I am not sure; maybe it is a bug in my implementation.

Just for fun, I left the path tracer running to see how long it takes to clear the noise further; here is a 55 min render. It is getting close to the first image, but not there yet.


Of course, not every scene gets that speedup; this one is "cheating", with the only light source being a point light hidden inside a lamp. But you get the idea of what the bidirectional algorithm is all about.