OpenCL and System Memory?

With the latest drivers, AMD cards using OpenCL can use system memory to render scenes that are bigger than the GPU’s memory.

Can anyone confirm that this works as promised? If so, this could be a pretty substantial advantage for OpenCL from a production standpoint.

Also, does this translate across all applications out of the box, or does it need to be specifically implemented? Will I have this advantage in Houdini and RF? Is this an OpenCL 2.0 feature, or will it work on 1.2?

Yes. It works.

It’s a driver feature, so it works for AMD cards, whatever version of OpenCL and whatever program you use.
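For anyone who wants to check this on their own card, here is a minimal sketch using the plain OpenCL 1.2 C API (not Cycles code, and untested on my side). It assumes the AMD GPU is the first device on the first platform; on a driver without the feature the allocations or fills should start failing once the total passes the VRAM size, while a driver that spills to system RAM will keep going.

```c
/* Minimal sketch: keep allocating and touching OpenCL buffers until the total
 * goes well past the reported VRAM size, and see whether the driver starts
 * failing or quietly spills to system memory.  Assumes the GPU is the first
 * device on the first platform; error handling is trimmed and the buffers are
 * intentionally leaked so the allocations accumulate. */
#define CL_TARGET_OPENCL_VERSION 120
#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id platform;
    cl_device_id device;
    cl_int err = CL_SUCCESS;

    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_ulong vram = 0;
    clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(vram), &vram, NULL);
    printf("Reported device memory: %llu MB\n", (unsigned long long)(vram >> 20));

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);

    /* Single allocations are capped by CL_DEVICE_MAX_MEM_ALLOC_SIZE, so use
     * 512 MB chunks and stop once we are 4 GB past the VRAM size. */
    const size_t chunk = (size_t)512 << 20;
    cl_ulong total = 0;
    cl_uchar zero = 0;
    while (total < vram + ((cl_ulong)4 << 30)) {
        cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, chunk, NULL, &err);
        if (err != CL_SUCCESS) break;
        /* Fill the buffer so the allocation is actually backed by memory. */
        err = clEnqueueFillBuffer(queue, buf, &zero, sizeof(zero), 0, chunk,
                                  0, NULL, NULL);
        if (err != CL_SUCCESS) break;
        clFinish(queue);
        total += chunk;
        printf("Allocated %llu MB so far\n", (unsigned long long)(total >> 20));
    }
    printf("Stopped at %llu MB (err=%d)\n", (unsigned long long)(total >> 20), err);
    return 0;
}
```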

If this is the case, that’s a pretty massive feature… Why haven’t we heard much about this?

Because so many people were wrapped up in CUDA and its smaller kernel rather than the potential of OpenCL, which has been around for quite some time. Faster rendering through CUDA came along, people started throwing more and more at it, and then they began running out of core memory, but OpenCL was still there, just not used. LuxRender stepped into OpenCL early; now developers are looking into tackling the more difficult, but slightly more capable, OpenCL as a solution for all of us banging our heads because we can’t render large scenes.

It’s a driver feature, so it works for AMD cards, whatever version of OpenCL and whatever program you use.

Do other AMD cards like the R9 390 or RX 480/580 benefit from that? Or is it only for GPUs with HBM2-type VRAM?

This is a cool feature, but I think there is a huge drop in speed if it switches from VRAM to system RAM, right?
So that’s why people stick to CUDA.

Is this comment based on Cycles benchmarks of different scenes on different hardware? Do you have links to a review site or user posts?
There is a drop in performance indeed, but it’s more in the 20% range in my case on the Victor scene with dual-channel DDR4 @ 2400 MHz on Windows 7, using an RX 480 and a Vega 64. Of course, it will depend on the speed of your PCI-E bus and your RAM and on how often this data has to be accessed (so it also depends on the scene). It seems Windows 10 also has an option to make such system-memory access faster, but as far as I know, nobody has posted benchmarks comparing Windows 7 to 10 in this case. But 20% is more than acceptable to be able to use my 64 GB of system memory.
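For a back-of-the-envelope sense of why the hit stays bounded: dual-channel DDR4-2400 peaks around 38 GB/s and a PCI-E 3.0 x16 link at roughly 16 GB/s, versus about 256 GB/s for the RX 480’s GDDR5 and roughly 484 GB/s for the Vega 64’s HBM2. Anything that spills to system RAM is therefore an order of magnitude slower to fetch, but as long as the data the kernel touches most often still sits in VRAM, the overall slowdown can stay in the tens of percent rather than being a 10x hit.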

@tmz it also works at least on the RX 480 and, from what I heard, on the 290X on Linux. So at least with GCN 1.2+. The best is to test it yourself if you have one.

@bliblubli thanks for your reply.

To be clear, this is a feature that AMD chooses to implement in their OpenCL drivers, not an OpenCL feature. OpenCL doesn’t specify it and NVIDIA doesn’t support it. OpenCL on AMD simply hadn’t been working well until recently, so most people bought NVIDIA hardware as a result.

The reason the performance hit isn’t that bad is that, for most scenes, not all the memory is being read all the time, e.g. most rays don’t hit most textures. There’s a simple solution here: just store some textures in system memory explicitly; a patch for that has been sitting on the tracker for a while. That would work for CUDA and OpenCL on all hardware.
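As a rough illustration of that idea (not the patch on the tracker, just the underlying mechanism), on the OpenCL side a renderer can ask the driver to back a texture buffer with host memory via CL_MEM_ALLOC_HOST_PTR; the CUDA counterpart would be pinned, mapped host memory through cudaHostAlloc() and cudaHostGetDevicePointer(). The helper name and arguments below are made up for the example.

```c
#define CL_TARGET_OPENCL_VERSION 120
#include <CL/cl.h>

/* Hypothetical helper, not the actual patch: create a read-only buffer for
 * texture data that the driver backs with host (system) memory instead of
 * VRAM.  The GPU then reads it across the PCI-E bus on demand, which is
 * slower but frees VRAM for geometry and BVH data. */
static cl_mem create_host_resident_texture(cl_context ctx,
                                           cl_command_queue queue,
                                           const void *pixels,
                                           size_t num_bytes,
                                           cl_int *err)
{
    /* CL_MEM_ALLOC_HOST_PTR asks for host-accessible backing storage. */
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_ALLOC_HOST_PTR,
                                num_bytes, NULL, err);
    if (*err != CL_SUCCESS)
        return NULL;
    /* Upload the texture once; kernels then sample it like any global buffer. */
    *err = clEnqueueWriteBuffer(queue, buf, CL_TRUE, 0, num_bytes, pixels,
                                0, NULL, NULL);
    return buf;
}
```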

I don’t think this is “why people stick to CUDA”. The same kind of feature is available in Redshift, and that’s a pretty big attraction for some people, regardless of the performance hit when rendering out of system memory.

But this isn’t a feature most of us will run into frequently. Modern cards are now in the 8-16 GB range, and I think it’s probably safe to say that VRAM has outpaced most artists’ demands, at least on a day-to-day basis.