The closest example for comparison will be the next-gen consoles. They will demonstrate how relatively modest hardware, on paper, can outperform much more powerful hardware. The APU and shared memory will eliminate so many of the inefficiencies of negotiating the PCIe bus that the consoles will do far more with much less.
Apple will do the exact same thing and make sure all their APIs hit the metal as efficiently as possible. I can see video editing on Apple Silicon being absolutely next level, as for the first time the benefits of heterogeneous computing come to fruition. All those inefficiencies of moving data from main memory to GPU memory and back are gone, and the best type of compute can be chosen for any task, whether GPU, CPU, or the fixed-function hardware that's also in the SoC, like codecs and machine-learning hardware.
Apple Silicon will deliver on the promise of OpenCL, which, now transformed into Metal, will have the benefit of actual heterogeneous hardware to run on. Unlike the consoles, which are updated every few years, Apple will likely update their SoCs yearly, like the iPads. They don't have to wait for Intel or AMD to produce new hardware; they are masters of their own destiny.
The biggest problem Apple will probably have to overcome is that the hardware will not look as impressive on paper as a PC, but the real-world performance will be superior, and of course it won't be a space heater.
That's with regard to the integrated GPGPU in Apple Silicon, relative to Intel's iGPUs. Discrete GPUs from AMD and Nvidia have been tile-based deferred renderers for several years now.
Apple will expand their use in the Mac Pro line, not deprecate their support. They didn't pour four years of heavy investment into the Mac Pro line just to shit-can it for a low-power Apple Silicon SoC.
The Apple Mac Pro with Silicon is a good three years out.
I think OptiX = being locked into a closed, proprietary technology that Nvidia fully owns and controls. Right now Nvidia is the nice guy, but who knows about the future?
I'm particularly looking forward to the Big Navi release and hope AMD will give me an option for Cycles rendering. Can you say more about how you're going to improve performance and possibly surpass GPUs from other vendors? What optimisations are you hoping to bring, or what do you propose to do differently, in layman's terms?
Thanks for taking this task on, it looks incredibly well timed.
It's too early to talk about performance. There's nothing to execute on the GPU yet.
There are two major things Cycles does: ray intersection and shader evaluation. Performance depends on simultaneous execution and memory access. Simultaneous execution can go very badly with the current compiler, especially for SSS and volumes, and for complex shaders. My compiler will first ensure the most-visited code (ray intersection) is executed together, via a memory trade-off. Second would be grouping similar shaders together; this worked well on the "classroom" benchmark scene.
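A toy sketch of why grouping similar shaders together helps on a GPU (this is not Cycles code; the SIMD width and shader IDs are made up for illustration). GPUs execute work items in lockstep groups, so a group that mixes several shaders pays for each one in turn; sorting work items by shader ID first means most groups run a single shader:

```python
# Hypothetical model of SIMD divergence: count how many distinct shaders
# each warp-sized chunk of work items would have to execute.
WARP = 4  # assumed SIMD width, for illustration only

def divergence(work, warp=WARP):
    """Sum of distinct shader IDs per warp; lower = less divergence."""
    chunks = [work[i:i + warp] for i in range(0, len(work), warp)]
    return sum(len(set(chunk)) for chunk in chunks)

# Work items tagged with the shader each ray hit (interleaved = divergent).
work = [0, 2, 1, 2, 0, 1, 2, 0, 1, 0, 2, 1]

before = divergence(work)          # every warp mixes all three shaders
after = divergence(sorted(work))   # each warp now runs a single shader

print(before, after)  # 9 3
```

With the interleaved order every 4-wide group must step through three different shaders; after grouping, each group runs exactly one, which is the kind of win shader grouping is after.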
The second limitation is memory access, and especially random memory access. Code has to wait for data to become available, which is somewhat mitigated by the cache. Code also has to wait for the next instructions to become available.
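A minimal sketch of one standard mitigation for random memory access (my illustration, not something the post specifies): sorting the gather indices before fetching turns a random walk over memory into a mostly sequential sweep, which the cache handles far better, while fetching exactly the same set of values:

```python
import random

random.seed(0)
data = [float(i) for i in range(10_000)]
idx = [random.randrange(len(data)) for _ in range(1_000)]

# Random-order gather: successive accesses may land on cold cache lines.
gathered = [data[i] for i in idx]

# Sorting the indices first makes the same gather cache-friendly;
# the fetched values are identical as a multiset, only the order changes.
gathered_sorted = [data[i] for i in sorted(idx)]

assert sorted(gathered) == gathered_sorted
```

In a real renderer the "indices" would be things like BVH node or texture addresses, and the reordering has to be weighed against the cost of the sort itself, which is the kind of memory trade-off mentioned above.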
I expect my compiler to outperform other vendors' GPUs on complex scenes. Volume rendering would also be blazing fast compared to now (GPU comparison only).
Hi Nirved! Is your project generation-specific (Navi), or should we expect a general uplift in performance across all (reasonably recent) generations of cards?