Mac: M3 - *Hardware accelerated RT (Part 1)

I have to say this is incredible. Three months ago we didn’t know if Metal Cycles would ever come out. Two months after Apple joined the Blender Development Fund, we have the first pre-alpha builds. This is even more mind-blowing since there is still no stable version of Octane X after more than two years of work. I have a clear feeling that Apple should have invested in Blender earlier instead of Octane.
Anyway, it is great to see what is going on around Metal Cycles now.
As far as I know, there is no sign of AMD graphics card drivers for Apple Silicon. I’m very curious how Apple will approach multi-GPU in the Mac Pro.

7 Likes

Yeah, they did say they are working to get it into the 3.1 release, which means it needs to be in master by December 29th.

Same results as me, and I also have that same issue with selecting objects in the viewport.

I will say GPU rendering is obviously much faster than CPU on the Mac, but it’s still about 1/5th the speed of my 3070 using OptiX. It depends on the scene, though.

1 Like

I agree that the current Blender macOS / Metal development pace is wonderful. I wonder if Apple also actively contributed to OTOY or not. If OTOY had to figure (nearly) everything out themselves that could explain the delay.

2 Likes

I’m hoping the Metal API (which all the supported Macs are supposed to handle the same way) will be straight plug and play, at least on macOS. For all we know, this Metal backend could even make the rendering portion of Blender compatible with iPads and iPhones (if Blender itself ran there too, that is).

2 Likes

This, and I’m hoping for future RT core support too. The advantage of combining OptiX, RT cores, and Cycles X is pretty significant. In rasterization tasks, as shown across the web in gaming benchmarks and such, the new SoCs are holding up to Apple’s claims.
With that said, even with the early Blender tests in this thread, I’m more than content with the early numbers for what I do in Blender. Couldn’t be more excited!
And like @anon80315389 said, a couple of months ago we didn’t even know this would ever happen, and now we’re on the cusp of having a native ARM version of Blender with Metal!!!
Bravo Apple and Bravo everyone who got the compiles to work, and for sharing your results. :grinning:

3 Likes

I wonder if that’s because your compile is from several days ago, like you mentioned? The latest addition from Mike Jones (which I see you linked to) has to do with Host Side changes, which from what I understand deal with the CPU interactions with Cycles. Those who built more recently are seeing those gains for maybe this reason?

A nice advantage to running a new Mac…
Just realized I only have maybe one completed scene under 8GB of VRAM and definitely none under 6GB. Most are pushing well over 20GB of VRAM. So having viewport renders run smoothly without the need for swap memory, and with a buffer of 64GB, is going to be super nice.

Don’t know how guys are working with 6 to 8GB of VRAM in this day of 3D. :grin:

1 Like

I think we can close the thread now. Could not resist, :smiley:

My understanding is that Apple has been actively involved with both Otoy and Redshift in the past to get them going. It’s unfortunate that they waited this long to get involved with Blender; we could have had full Metal support in 3.0.

3 Likes

At least for Otoy this is true. Otoy has on a couple of occasions mentioned that they are/were working closely with Apple developers for their Metal build.

That said, it may be a blessing that it went in this order. There may only be one team at Apple designated for this, and they were able to cut their teeth and work through the ins and outs of 3D DCCs with Otoy and Redshift before working on Blender.
If I remember correctly Metal Octane took close to 2 years after Otoy’s announcement of collaboration.
Blender Metal is on track for less than two months after the announcement.
Like others have said, it’s crazy how fast this has happened after the Apple devs stepped in.

1 Like

I don’t believe so, as December 1st was the initial commit of the entire set of Host Side changes. Since then they’ve mostly been reorganising things and moving class headers. No major changes to how it works. It’s probably because my build runs the CPU side under Rosetta 2, and hybrid rendering is suffering from that.

1 Like

This has perhaps been discussed earlier in the thread, and if so I apologize, but does anyone know if the GPU can access the full amount of the unified memory? Could I theoretically do a 50GB+ GPU render? This seems to be the case, but the details of the unified memory architecture are hard to parse.

I don’t think any of us know the absolute maximum percentage the GPU can utilize, but it can grab well over half. If you buy a 64GB machine, it can reserve over 32GB for itself.

1 Like

I don’t know about the legality of that, but I can absolutely give clear instructions!

  1. Make sure you have Homebrew installed
  2. Follow the instructions here, ignoring all the Xcode stuff (but you do need Xcode installed)
  3. Download the latest diff, currently here, by right-clicking on “Download Raw Diff” and saving
  4. Follow the post here. One note: you’re just editing text files, so you don’t need Xcode or VS Code or anything. Second note: in step 5, the correct path is blender/intern/cycles/CMakeLists.txt
  5. Rebuild in terminal
    cd ~/blender-git/blender
    make
  6. Open Blender in /Users/[user]/blender-git/build_darwin/bin/Blender.app

Note that to get Cycles working in the viewport, you frequently have to disable the CPU device in settings (it’s the one without GPU next to it).

Of course, you need to select “Experimental” and GPU under the Render Properties.

However, you can enable the combined CPU + GPU or GPU alone for rendering in most cases.
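
For anyone who would rather script the settings above from Blender's Python console, here is a minimal sketch. It assumes the build exposes the Metal backend under the `"METAL"` compute device identifier, the way released Blender 3.1 does; a pre-alpha patch build may differ. `select_devices` and `apply_to_blender` are hypothetical helper names, not part of Blender.

```python
def select_devices(devices, backend="METAL", use_cpu=False):
    """Enable devices of the chosen backend; enable CPU entries only on request.

    `devices` is any iterable of objects with `.name`, `.type` and `.use`
    attributes, such as the `devices` collection in the Cycles add-on
    preferences. Returns the names of the devices left enabled.
    """
    enabled = []
    for dev in devices:
        if dev.type == "CPU":
            dev.use = use_cpu
        else:
            dev.use = dev.type == backend
        if dev.use:
            enabled.append(dev.name)
    return enabled


def apply_to_blender(use_cpu=False):
    """Run inside Blender only: bpy cannot be imported from a plain Python."""
    import bpy

    prefs = bpy.context.preferences.addons["cycles"].preferences
    # Assumption: this build registers Metal under the "METAL" identifier.
    prefs.compute_device_type = "METAL"
    # The 3.x Cycles add-on uses get_devices() to refresh the device list.
    prefs.get_devices()

    enabled = select_devices(prefs.devices, use_cpu=use_cpu)

    scene = bpy.context.scene
    scene.cycles.feature_set = "EXPERIMENTAL"
    scene.cycles.device = "GPU"
    return enabled
```

Calling `apply_to_blender()` leaves the CPU entry off, which matches the viewport workaround; `apply_to_blender(use_cpu=True)` corresponds to the combined CPU + GPU setup for final renders.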

Hope more people give it a try, it’s fun to see!

1 Like

I completely agree. These results don’t compete with desktop RTX + OptiX (your result was 3.6 times faster than mine on a 24-core M1 Max, or 2.1 times faster with CUDA).
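
To put those two multipliers side by side, a bit of arithmetic helps. This sketch assumes both figures are measured against the same M1 Max Metal render time; the 100-second baseline is a made-up placeholder, not a measured result, and the function names are only illustrative.

```python
def implied_ratio(optix_speedup, cuda_speedup):
    """Speedup of OptiX over CUDA on the same card, given each one's
    speedup over a common baseline."""
    return optix_speedup / cuda_speedup


def render_time(baseline_seconds, speedup):
    """Render time for a device that is `speedup` times faster than baseline."""
    return baseline_seconds / speedup


optix_vs_m1 = 3.6  # RTX + OptiX vs. 24-core M1 Max, from the post above
cuda_vs_m1 = 2.1   # same card with CUDA vs. the same M1 Max

print(round(implied_ratio(optix_vs_m1, cuda_vs_m1), 2))  # 1.71

# Hypothetical 100 s Metal render, for scale only.
baseline = 100.0
print(f"OptiX: {render_time(baseline, optix_vs_m1):.1f} s")  # OptiX: 27.8 s
print(f"CUDA:  {render_time(baseline, cuda_vs_m1):.1f} s")   # CUDA:  47.6 s
```

In other words, if both numbers share a baseline, roughly 1.7x of the gap on that card would come from OptiX itself rather than from the GPU's raw compute.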

With that said, this is still early days for the implementation. But expectations should be kept in check I suppose.

In viewport I saw 4-6x better performance (getting to 10 samples) with Metal vs CPU only (both 3.1 alpha) in the barbershop scene. That, personally, is enough for me. And it never got hot or spun up the fans.

3 Likes

I don’t know about anyone else, but these pre-alpha build numbers, with development work still left to finish, are pretty impressive. And even if those numbers stay as they are, Apple’s iGPU is more than beating my expectations. I’m kind of blown away by these early numbers. The Intel iGPU in my other machine can’t come close to this. lol

2 Likes

According to Apple’s claims, the M1 Max should be close to Vega 56 / RTX 2080 CUDA raytracing performance (about 10 TFLOPS). The results in LuxMark seem to confirm this. Theoretically, we should get similar performance in an optimised Apple Silicon version of Cycles X.

3 Likes

I don’t use LuxMark, but I’m glad to see there are numbers out for it already! Does Lux have viewport rendering for scene setup?
And is the Lux version a fork?

Well, the 16" 32-core M1 Max has yet to be tested, so there’s still a chance we’re closer than we know. But yeah, I’d assume performance tuning and general work still need to be done (the fact that you can’t do CPU+GPU for viewport is telling).

Either way, even if this is it, I’m able to work well with this level of mobile performance. Not that I expect it to stay like this :wink:

1 Like