Mac: M3 - *Hardware accelerated RT (Part 1)

Wow, there’s a noticeable difference between Staars’ M1M and your M1P.
His is also the 24-core GPU and not the higher 32. I’m assuming yours is the 24-core too, or is it the 16-core?

M1Pro taps out at 16 cores.

That’s still really good!!! So I can probably expect half that performance from my 8-core machine.

Is it running through Rosetta?

Staars got his M1 mini hitting 2:04 on bmw.

Yeah, currently unclear whether it’s AS or Intel (Rosetta 2)…
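
One way to settle the AS-vs-Rosetta question is to ask the running process. This is only a minimal sketch, assuming it is pasted into the Python console of the build being tested on macOS: `platform.machine()` reports the architecture the process runs as, and the macOS sysctl key `sysctl.proc_translated` reads 1 for a process translated by Rosetta 2. The helper names `read_proc_translated` and `describe_arch` are made up for illustration.

```python
import platform
import subprocess


def read_proc_translated():
    """Return 1/0 from the macOS sysctl key, or None if it is unavailable."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return None  # no sysctl binary at all (not macOS)
    if result.returncode != 0:
        return None  # key missing, e.g. an Intel Mac or another OS
    try:
        return int(result.stdout.strip())
    except ValueError:
        return None


def describe_arch(machine, translated):
    """Turn the two raw values into a readable verdict (hypothetical helper)."""
    if machine == "arm64":
        return "native Apple Silicon"
    if machine == "x86_64" and translated == 1:
        return "Intel build under Rosetta 2"
    if machine == "x86_64":
        return "Intel build, not translated"
    return "unknown (" + machine + ")"


print(describe_arch(platform.machine(), read_proc_translated()))
```

If it prints "Intel build under Rosetta 2", the benchmark times are for the translated x86 build rather than the native one.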

I see that now.

2:04 for the Mini… I think he said under X86
And 00:48 for his 24 Core

Working on compiling it myself with the new instructions… we will see how it goes…

How does the viewport rendering feel?

There are some clear instructions now, so I think I’m gonna give it a go myself and see what I can report for y’all. I have a 14" M1 Max with 32GB RAM and 24 GPU cores. So same as Staars.

Is the memory usage for the scene normal compared with other system configurations? I don’t have any comparison points, but going from ~100MB to ~1400MB just from enabling the GPU sounds a bit off.

I hope that memory usage isn’t a leftover from needing to “duplicate stuff here or there” for the GPU to use, since that isn’t needed anymore: the GPU ideally always has access to the meshes, textures, etc. ready in memory now… except if the CPU is using a completely different representation of it, i.e. (hypothetical dumb example) the textures are ARGB and the GPU needs them as RGBA.
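
To make the hypothetical ARGB vs RGBA case concrete, here is a rough sketch in plain Python. It says nothing about what Cycles or Metal actually do; `argb_to_rgba` is a made-up helper that only shows why a layout mismatch forces a second copy of the same texture, even when CPU and GPU share memory.

```python
def argb_to_rgba(pixels: bytes) -> bytes:
    """Return a NEW buffer with each 4-byte ARGB pixel reordered to RGBA."""
    if len(pixels) % 4 != 0:
        raise ValueError("expected 4 bytes per pixel")
    out = bytearray(len(pixels))
    out[0::4] = pixels[1::4]  # R
    out[1::4] = pixels[2::4]  # G
    out[2::4] = pixels[3::4]  # B
    out[3::4] = pixels[0::4]  # A
    return bytes(out)


# Two ARGB pixels as the CPU side would hold them in this example.
cpu_copy = bytes([255, 10, 20, 30, 128, 40, 50, 60])
gpu_copy = argb_to_rgba(cpu_copy)

print(list(gpu_copy))
print(len(cpu_copy) + len(gpu_copy), "bytes held in total")
```

Both buffers stay alive, so the texture costs twice its size; if the two sides agreed on one layout, a single shared buffer would do.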

You have the exact configuration I would go for

So I got it working after some initial confusion. I’m now going through the process of running all the standard demo files (if there’s a specific one you want please let me know!).

Definitely some crashing going on. In Barcelona, for example, you can only use GPU Cycles for the viewport; if you try CPU + GPU it crashes. However, when rendering you can do both GPU and CPU + GPU. As others said above, the fans kick in with CPU + GPU. Certainly not anything unpleasant though. And everything else has been silent.

I’m having a blast with it honestly. It’s weird to see this on a laptop.

I’ll post results with screenshots tomorrow once I’ve finished!

Oh I forgot to say that it’s Apple Silicon.

I got it working as well… got 43 seconds on BMW (GPU) on my 14" M1 Max 32-core… seems to be about the same as the 24-core. I wonder if they will be able to improve this with performance optimizations.

Nice! Best I could get was 47s.

After doing more tests, I have to say this is a super unstable build. I can’t even open monster under the bed (even though Staars benchmarked it…)

Could you share the build?

Also, I’m getting into Geometry Nodes, and man, so much has changed in the past few months that most tutorials (for grass, for example) don’t work anymore.

Also, the M1 Mac mini GPU is really weak: with a simple grass particle scene, even the Finder slows down!

I’m kind of thinking even the 16-inch 32-core is only going to make a handful of seconds (5-ish maybe?) of difference, even though its GPU is clocked higher.

Other tests are showing slight margins… that’s why everyone is going with the 14” 32-core, I guess.

ARM. Viewport rendering crashes, however. Thanks, but I could not have done it without the mods from devtalk.

Oh yeah, and it is the base 16” model, so 10, 16, 16 (CPU cores, GPU cores, GB of RAM) if I want to put it that way.

Yeah, I found that a bit strange too. I will play with it a bit more tonight; perhaps there are some bugs, as it is like pre-alpha :wink:

Yes, it is. I did not try more as I had no time; I only opened those 2 scenes and hit render.

I know tile size had a big effect pre-3.0, but I’m not sure if it still does.

Also interesting: I saw someone post a test with Cycles in Blender 3 on the AMD 6900 XT, and it took 34 sec.

So GPU + CPU on the M1 Pro is only about 5x slower with this early version, at about 6x less power draw or more. The M1 Max only 2.5 times slower then?

Package power draw was 53 watts when rendering the classroom, if I remember right; will check later :wink:

These are my results using Rosetta 2 on an M1 Max (10-core CPU, 32-core GPU) with 64GB RAM:

barbershop_interior - M1 Max GPU - 9m19 (559s) - 800 samples
classroom - M1 Max GPU - 0m55 (55s) - 150 samples - B31-alpha
classroom - M1 Max CPU - 4m18 (258s) - 150 samples - B31-alpha
classroom - M1 Max GPU - 1m49 (109s) - 300 samples - B31-alpha
monster_under_the_bed - M1 Max - 3m08 (188s) - 3.1a
BMW - M1 Max GPU 32 core - 0m43 (43s)
BMW - M1 Max CPU - 3m22 (202s)

GPU is about 4.7x faster than CPU
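
For anyone who wants to check that ratio or plug in their own numbers, a quick sketch using only the CPU/GPU pairs from the results above (classroom at 150 samples and BMW); the `speedup` helper is just for illustration.

```python
# Render times in seconds, copied from the results above.
times = {
    "classroom (150 samples)": {"cpu": 258, "gpu": 55},
    "BMW": {"cpu": 202, "gpu": 43},
}


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU render finished than the CPU render."""
    return cpu_seconds / gpu_seconds


for scene, t in times.items():
    print(f"{scene}: GPU is {speedup(t['cpu'], t['gpu']):.1f}x faster than CPU")
```

Both scenes come out at roughly 4.7x, so the figure holds for these two files at least.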

I tried Hybrid (CPU + GPU) and it performed slower than GPU only, and it turns the fans on, so GPU only or CPU only is best for silence.

Only problem with the 3.1 build (same on the daily builds - X86 or ARM) is that I can’t select objects in the viewport. If I try 3.0 or 2.93 selecting things works fine - anyone else have that issue?

Interesting that for you hybrid is slower; for me it was faster, but I need to test more.

As for the bug, not sure; I have not really tried to do anything in 3.1, but I will check.