Request for AMD users

Hi guys.

I have an idea for Blender development. I am not a coder, and I honestly don’t know if the following request is even possible. I am planning on getting an AMD FM2 desktop, mainly for 3D and motion graphics as well as game development. As I understand it, the CPU has an integrated GPU which can be Crossfired with discrete AMD GPUs; however, current-generation FM2 CPUs will drop a GDDR5 card down to DDR3, as the current memory controller only supports DDR3.

Each card contains stream processors, which are used with OpenCL, so I am wondering whether it would be possible to add an option in Blender for Cycles, Blender Internal and the game engine to harness those processors across the linked GPUs. For example, there could be tick boxes in the render panel to choose which processors, or how many, are assigned to certain parts of a render: one card focuses on physics, another on lighting, another on the passes, and so on. The same idea would apply to the game engine: one card focuses on shaders, another on lighting, one on geometry and one on physics, while leaving room for the CPU to calculate player actions.

Would this cause bottlenecks? Another feature that could be nice is porting games to other platforms which share a similar setup: Windows, OS X, Linux, Xbox One, PS4.

Looking forward to hearing everyone’s thoughts on this.

The point of the GPU is to offload as much of the geometry and lighting calculation as possible from the CPU to the parallel GPU. You cannot (and would not want to) tell the GPU how to allocate its compute cores. GPUs are fast specifically because they solve one kind of problem: take in geometry, apply a vertex shader, apply a fragment shader, and the image is computed. Messing with that formula will slow everything down.

The Sony “Cell” processor in the PS3 was supposed to work the way you describe, but it didn’t really work out that way for the PS3 (from what I hear it was too complex), and the technology has not caught on anywhere else.

The point of technology like Crossfire is to make multiple GPU cards behave as a single GPU that is bigger than any of the component cards alone. If you want the most out of Crossfire, just enable it and pretend you have a bigger GPU.

Thank you, Kastoria, for your input and for clarifying how it works. I remember reading somewhere about a network that connects PS3s over the internet to calculate physics for astronomy, thanks to the Cell architecture, but it is understandable that it would be hard for developers to build software to support it. It seems I may have to scrap the FM2 and go for an AM3+ FX-8350, as this should at least perform better when rendering motion graphics, and just get a high-end NVIDIA card for CUDA, unless I can use Cycles with OpenCL on the setup I mentioned in the previous post. I read recently that Adobe has found a 25% increase in performance with OpenCL as it matures and plans to add it in the next release of their products, so I am not sure whether the same will be the case for Cycles in the future.

Thank you, Kastoria, for enlightening me. This has also helped me decide which desktop would be the better choice to buy. I guess an FX-8350 and an NVIDIA card would be the better choice for me.

If you have a look around the forums, there are many threads about OpenCL / Cycles / AMD. Pretty much, OpenCL was dropped from Blender due to lack of support from AMD. However, it looks promising over the next few months or years: Cycles now compiles with the latest AMD drivers, it just takes ages and is slow.

  1. AMD hardware is usually very good; the software is not. In short, AMD’s OpenCL compiler is not the best-quality product. It somewhat works, and it shines in Bitcoin mining and Photoshop, and that is almost all. As soon as your program gets more complex, it becomes very unpredictable, anecdotally up to 24 GB of RAM usage and hours of time when compiling the Cycles kernel. The very recent compiler (it is part of the Catalyst suite) has seen some improvement, but the fact is that it is still not usable.

  2. Crossfire is a very old attempt to scale game rendering to two or more cards (an evolution of AFR, alternate frame rendering). The basic idea is to render even frames on one card and odd frames on the other. It has nothing to do with truly utilising the resources of the hardware units, and it is mostly done in software; the hardware support is just for better sync between frames and for passing the render buffer to the one card that is connected to the monitor. It is unrelated to Cycles or any other OpenCL/DirectCompute work.

  3. APUs with embedded graphics are very promising. In fact, the next, fully featured HSA device (maybe the generation after Steamroller?) will be a “must have” for Cycles. The idea is exactly as you describe: truly exploiting closer on-die units, with a high-speed internal bus instead of PCIe, and programming in C/C++ (a front end will compile to some intermediate language, which will be passed to the binary AMD driver, almost like OpenMP 4.0) rather than something fancy and limited like OpenCL/CUDA. Full, true memory access across all devices, even virtual memory. The problem is that we need to wait years to see a real device. Current APUs just do not use HSA and are very limited.
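
To make point 2 a bit more concrete, here is a rough Python sketch of the alternate frame rendering idea: frames are simply handed out to the cards in turn. This is only an illustration of the scheduling concept, not how any real driver is written; the function name `assign_frames_afr` and its parameters are made up for this example.

```python
def assign_frames_afr(num_frames, num_gpus=2):
    """Hand out frame indices to GPUs in round-robin order (AFR-style).

    Hypothetical helper for illustration only: with two GPUs, GPU 0 gets
    the even frames and GPU 1 gets the odd frames.
    """
    schedule = {gpu: [] for gpu in range(num_gpus)}
    for frame in range(num_frames):
        # The frame index modulo the GPU count picks the card for this frame.
        schedule[frame % num_gpus].append(frame)
    return schedule


if __name__ == "__main__":
    for gpu, frames in assign_frames_afr(6).items():
        print(f"GPU {gpu} renders frames {frames}")
```

Note that each card still renders a whole frame on its own, which is why this kind of scheme does nothing for splitting a single Cycles render by task.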

I think that you need a top-end multi-CPU motherboard, two or more Opterons, and something like a Titan. The reason is that the CPU will render anything, even 100 GB+ of textures (this needs some code tweaks, but not a lot, just increasing the current limits, as they are the common denominator of CUDA and OpenCL). The Titan is optional, for a cool interactive viewport with Cycles, not for the final render.