Good news for AMD/ATI graphics card owners

Haha, excellent, s12a!

Can you post the steps you took to get that result?

In the meantime I am trying to understand what went wrong here…

Another test:

Rendering started at 3:20
3:34 minutes for 728 passes on an AMD Radeon HD7770!

Note the GPU usage and the memory consumption during preprocessing (in Task Manager).

Hold on a second, you’ll see it is an easy process, really.

Aah, cool… So if you get it going in 2.63, try the procedural rust scene again. That’s the most hardcore scene I have found so far: my computer basically grinds to a standstill rendering it, and no other scene I have found does that…

Do you just transfer the compiled kernel from the 2.62 folder to the 2.63 one?

On my side: I give up. I can’t recreate the steps I took…
I am out of ideas…

Still, there is something I don’t grasp: how on earth is your compile time so low? I mean, our CPUs are (almost) of the same generation, so compiling on one core only should take me about 5 minutes (2.4 GHz), not 10.

INSTRUCTIONS

  • Install 12.5 beta Catalyst drivers properly (as explained in the previous page)
  • Prepare Blender 2.62 Win64 Official for Cycles GPU renderings (as explained in previous pages). Render the BMW Mike-Pan test scene, then exit Blender
  • Open the \2.63\scripts\addons\cycles\kernel\kernel_types.h file in your 2.63 Win64 Official installation
  • Find the code block beginning with #ifdef __KERNEL_SHADING__ and, if necessary, edit it so that it looks like this:
#ifdef __KERNEL_SHADING__
//#define __SVM__
//#define __EMISSION__
//#define __TEXTURES__
//#define __HOLDOUT__
#endif
  • Open Blender 2.63 and render the Mike-Pan test scene. It will be a clay render. It’s ok. Exit Blender.
  • Go to the “C:\Users\xxxx\AppData\Roaming\Blender Foundation\Blender\2.62\cache” directory (replace xxxx with your Windows user name) and copy the cycles_kernel file to a safe place. Mine was called cycles_kernel_1CB8CB83963C013B0A46762C0ABBA7FD_0006CA20D63394B440D64864EF1300A7.clbin
  • Go to the “C:\Users\xxxx\AppData\Roaming\Blender Foundation\Blender\2.63\cache” directory and copy the cycles_kernel file from 2.62 there.
  • Rename the cycles_kernel file from 2.62 so that it takes the name of the 2.63 kernel file. In other words, you need to replace the 2.63 kernel with the 2.62 one without making Blender aware of the change.
  • The next time you start a render in Blender 2.63, it will use the working 2.62 kernel. On my system it is, for some reason, much faster than on 2.62! The speed is consistent with my graphics card’s true potential.
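For anyone who would rather script the last three steps than do them by hand, here is a minimal Python sketch. It is only an illustration under assumptions: it assumes each cache folder holds exactly one file matching cycles_kernel_*.clbin (as in the example name above), and the function names find_kernel and swap_kernel are made up for this post. It keeps untouched copies of both kernels before overwriting anything.

```python
import shutil
from pathlib import Path

# Assumption: cached kernels follow the naming seen in the cache folder above.
KERNEL_PATTERN = "cycles_kernel_*.clbin"


def find_kernel(cache_dir):
    """Return the only cached kernel in cache_dir, or raise if that is ambiguous."""
    kernels = sorted(Path(cache_dir).glob(KERNEL_PATTERN))
    if len(kernels) != 1:
        raise RuntimeError(
            "expected exactly one kernel in %s, found %d" % (cache_dir, len(kernels))
        )
    return kernels[0]


def swap_kernel(old_cache, new_cache, backup_dir):
    """Put the old kernel's contents under the new kernel's file name."""
    source = find_kernel(old_cache)  # working 2.62 kernel
    target = find_kernel(new_cache)  # 2.63 kernel, whose name Blender expects
    for label, path in (("old", source), ("new", target)):
        safe = Path(backup_dir) / label
        safe.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, safe / path.name)  # keep untouched copies of both
    shutil.copy2(source, target)  # overwrite 2.63 contents, keep the 2.63 name
    return target
```

With base pointing at your “Blender Foundation\Blender” folder, the call would look like swap_kernel(base / "2.62" / "cache", base / "2.63" / "cache", "kernel_backup"). Run it only after both versions have rendered once, so that both cache files exist.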

I doubt it’s going to work. Remember, you’re using Blender 2.62 Kernel on 2.63. The kernel contains “Cycles instructions” for the GPU. Cycles on 2.62 didn’t have all the features needed. But I’ll try anyway, and report the results.

EDIT: no good. I get worse results than in 2.62:

However, it rendered those 100 passes very quickly (render started at approx. minute 3:20)

You have to replace an existing, working 2.63 kernel. Just transferring the compiled 2.62 kernel won’t work, at least as far as I have seen.

Still, there is something I don’t grasp: how on earth is your compile time so low? I mean, our CPUs are (almost) of the same generation, so compiling on one core only should take me about 5 minutes (2.4 GHz), not 10.
Unfortunately, I have no idea. Maybe it’s got something to do with how I installed 12.5beta drivers (see previous page).

Wow, thanks s12a!
Great job you did there.
Next step would be to try that with a recent SVN build, one that has extra quick BVH building routines.

I think it would only save 1 or 2 seconds on this scene at most, if anything at all. 2.63 already includes some drastic BVH improvements. Anyway, I’m already extremely happy with these results. They show that current AMD cards can be at least as fast as NVidia Fermi (5xx) ones with Cycles. My Radeon HD7770 is as fast as a GeForce GTX570!

Somebody should inform the developers about this (DingTo? Brecht? Campbell?), so they can investigate the issue that prevents normal rendering speeds with AMD cards in OpenCL GPU rendering unless kernels are moved around. And, of course, the silent recompilation of the OpenCL kernel before each render.

I wonder if the memory usage could be fixed too, but I doubt that, at least with current Catalyst drivers.

Well, frankly, I think you and the other “leaders” of this thread (thinking of Sivas too) are the ones who should contact the bosses!
I am trying your stuff right now.
Something interesting: I got that black car, and the compile time was way shorter. The kernel is about 4-5 MB instead of 18, too… I’d be interested to see if that 6850 is as bad as Zalamander thinks…

If you comment out #define __SVM__ as well, it will be even smaller, because even fewer Cycles features will be compiled into the OpenCL kernel. The working 2.62 one is about 15 MB, anyway.
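If you end up toggling these defines a lot while testing, the edit can be scripted. This is only a rough Python sketch: the helper name comment_out_defines is invented here, it works on the header text as a string (reading and writing kernel_types.h is left to you), and it only touches lines that begin with #define followed by one of the names you pass in.

```python
import re

# Matches an active "#define NAME" line; lines already starting with "//" do not match.
DEFINE_RE = re.compile(r"\s*#define\s+(\w+)\b")


def comment_out_defines(header_text, names):
    """Return header_text with '//' put in front of '#define NAME' for each given NAME."""
    wanted = set(names)
    out = []
    for line in header_text.splitlines(keepends=True):
        match = DEFINE_RE.match(line)
        if match and match.group(1) in wanted:
            line = "//" + line  # disable the feature, leave everything else as-is
        out.append(line)
    return "".join(out)
```

Running it twice on the same text changes nothing the second time, since lines that are already commented out are skipped. Keep a backup of the original header before writing the result back.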

Mine is 18 MB.

Here is the render:
http://glp.lescigales.org/it/blender/testscene%20HD6850%20on%202.63%20screen.jpg

So, if you take away my 12 min 22 s of compile time, my HD6850 is down to 1 min 40 s!
That is much better than the 4 min 30 we had on 2.62.
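To put numbers on that, here is a small Python sketch of the arithmetic, using only the times quoted in this post. The total wall-clock figure is derived (compile time plus render time), not something that was measured separately, and the helper name to_seconds is just for illustration.

```python
def to_seconds(clock):
    """Convert a 'minutes:seconds' string into seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


compile_time = to_seconds("12:22")  # kernel compilation on the HD6850 system
render_263 = to_seconds("1:40")     # render only, 2.62 kernel used in 2.63
render_262 = to_seconds("4:30")     # render only, plain 2.62

total_263 = compile_time + render_263  # implied wall-clock time in seconds
speedup = render_262 / render_263      # how much faster the render itself got
```

The render itself is 2.7 times faster, but as long as the kernel is rebuilt before every render, the compile time still dominates the wait (842 seconds in total, just over 14 minutes).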

Now I have to find out how I managed to get the kernel built just once… that drives me nuts!
And it’s getting late on this side of the Atlantic !

The fact that your kernel is slightly bigger and takes longer to compile might mean that your AMD SDK is different from mine… but it could just as well be due to graphics card differences (maybe the OpenCL kernel simply compiles faster for HD7xxx cards?). We need another user with a Radeon HD7xxx to verify this.

The HD6850 should compare like this with other cards:

http://media.bestofmicro.com/T/D/326353/original/luxmark.png

The HD7770 is 36.5% faster than the HD6850 in LuxMark.
This is consistent with the performance difference between our cards in Blender Cycles.

By the way, looking at this chart makes me wonder if with an HD7970 render times on the same scene would be around 30 seconds:

http://media.bestofmicro.com/Q/8/328832/original/Luxmark.png
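As a rough cross-check of the two estimates above, here is a Python sketch. It assumes that render time scales inversely with the LuxMark score, which is a simplification, and it takes the 1:40 (100 seconds) HD6850 render reported earlier in the thread as the reference. The function names are hypothetical.

```python
def projected_time(reference_seconds, speed_ratio):
    """Render time of a card that is speed_ratio times as fast as the reference."""
    return reference_seconds / speed_ratio


def required_ratio(reference_seconds, target_seconds):
    """Speed ratio a card needs over the reference to reach target_seconds."""
    return reference_seconds / target_seconds


hd6850_seconds = 100.0                                  # 1:40 reported for the HD6850
hd7770_seconds = projected_time(hd6850_seconds, 1.365)  # HD7770 is 36.5% faster in LuxMark
ratio_for_30s = required_ratio(hd6850_seconds, 30.0)    # what a 30 second render would need
```

Under that assumption the HD7770 would land at roughly 73 seconds, and a 30 second render would need a card about 3.3 times as fast as the HD6850 in this workload.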

Yes, and it’s more or less the kind of result I was hoping for: under 2 minutes.
Now, about the price… :)

If the compile time comes from an SDK difference, I’ll uninstall and try again tomorrow, but in theory, as I used the links found in this thread, it should be the same.

I can’t wait for updates from the developers about all this!

Well, maybe it’s not as bad as I expect, after all.
The difference in compile time and kernel size is likely due to hardware differences. Every architecture has its own binary format.

Didn’t you install the OpenCL 1.2 beta SDK? I don’t think you should have bothered with that. It was an old version (dated 2011).
Next time, remove everything AMD-related cleanly as I previously suggested and install only the 12.5 beta Catalyst drivers (which include the OpenCL SDK themselves).

@Zalamander: well, OK for the architecture etc., but 12 minutes against 3-4? Q6600 vs. E8500? They are close cousins.
But what do I know…
@s12a: OK, I’ll try that !

EDIT: no change. I uninstalled everything but the driver itself (it was the 12.5b anyway) and reinstalled the whole 12.5b pack. Still 10-12 minutes of compiling and 4 1/2 minutes of rendering.

Well, it MUST be my Q6600 getting tired then… sniff…

4:30 actual rendering time? Weren’t you getting 1:40 after the latest changes?
Anyway, 10-12 minutes for OpenCL kernel compilation wouldn’t be too bad if the process occurred only once as it’s supposed to do.

Yes, 1:40 using your trick (2.62’s kernel on 2.63).
But 4:30 in 2.62, as before.

I have tried over and over to get rid of this ever-recompiling stuff. No success.

This OpenCL 1.2 SDK doesn’t install anything new anyway: the versions it offers are older than what’s in the 12.5b pack.

As somebody who didn’t follow this from the beginning: why is the RAM usage so high? Is it something that can be optimized?

It’s the AMD compiler for the OpenCL kernel. For some reason it takes much more RAM than the NVidia one (for CUDA or OpenCL) when compiling the kernel (up to 6.5 GB in my case). This process most probably depends solely on the AMD drivers, and little can be done about it. In theory, it should happen only once per Blender installation (which would be fine), but for some reason it happens every time a Cycles render is started. This is what I believe Blender developer intervention could fix.
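One way to gather evidence for the developers is to look at the cached kernel files before and after a render. The Python sketch below records the size and modification time of every cycles_kernel_*.clbin file in a cache folder and reports which ones changed. Note the assumptions: the function names are invented for this post, and it is only a guess that a silent recompilation rewrites the cached file. If Blender recompiles without touching the cache, this will show nothing.

```python
from pathlib import Path


def kernel_snapshot(cache_dir):
    """Map each cached kernel file name to (size in MB, last-modified time)."""
    snapshot = {}
    for path in sorted(Path(cache_dir).glob("cycles_kernel_*.clbin")):
        info = path.stat()
        snapshot[path.name] = (round(info.st_size / 2**20, 1), info.st_mtime)
    return snapshot


def changed_kernels(before, after):
    """Names of kernels that are new or were rewritten between two snapshots."""
    return [name for name, entry in after.items() if before.get(name) != entry]
```

Take one snapshot before starting a render and one after it finishes, then pass both to changed_kernels. The size column also makes it easy to compare kernel sizes between systems, such as the 15 MB and 18 MB figures mentioned earlier.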

EDIT to add that I just tried the “kernel swapping trick” on the latest buildbot Blender Win64 build (r46535), and unfortunately it doesn’t work there :(
For now this only works between the official 2.62 and 2.63 versions, so we can say it is something I discovered almost purely by chance. The good thing is that the developers know exactly where to look.

Hi s12a. Had a short night !

Could you do something for me ?
Can you tell me the exact version of your various AMD/ATI drivers ?

I mean, this :
http://glp.lescigales.org/it/blender/ATI%20versions.jpg

If you launch the installer, you can choose “uninstall” and then pick the manual mode.

Do you have the exact same versions ?
There is also the CCC with version 2012.0418.2133.36668 (But CCC doesn’t work)

I have the feeling my drivers don’t get properly uninstalled… Can you trust ATIMAN? The website is in Greek, and I don’t understand a single word of it…