M1 Max Performance?

I’ve got an M1 Max MacBook Pro. If I go beyond a single thread, audio output becomes quite stuttery, as though VCV can’t fill the buffer fast enough. Performance using just one thread is great, but then I’m limited in how many modules I can run before getting audio crackles and dropouts (especially while recording).

Anyone else using this processor? What experience are you having?

I don’t have the M1 Max, but I use Rack on a 2021 M1 MacBook Pro. Occasionally I notice audio clicks with only a single thread, but usually that’s when I’ve also noticed that the Memory Pressure in Activity Monitor is high, so closing other apps has resolved it.

I’ve not tried it lately, but I didn’t have much luck with multiple threads in Rack 2. Every new thread added another “100%” to the CPU meter. I don’t know if this is the intended behaviour, or if there is some thread synchronisation/waiting that isn’t working on macOS. The only version I’ve had that has been completely solid for audio output with multiple threads is an old build (of Rack 1) that @diimdeep added high-priority audio threads to.

Yes, I used that Rack 1.x build with high-priority audio as well. With Rack 2 Pro’s new threading architecture this was supposedly not needed, but performance is much worse for me. I don’t have the chops to do a custom Apple Silicon build myself. I’m curious if that might improve the threaded performance though?

I would doubt that making it native for Apple’s M1 chips would solve the issue. I’ve not looked into it further because most of my patches are quite small these days, so a single thread has been enough.

Haven’t had any issues on an M1 Air. Curious about the type of patch where you’re running into issues.

Any patch really; it doesn’t matter too much. As soon as I enable more than one thread, or approach 90% usage on a single thread, it gets unhappy. Removing the @Eurikon MixMaster mixers tends to improve M1 Max performance, strangely enough, but it’s hard to go without them since they’re so useful.

what are these? Is it a typo and you mean MindMeld MixMasters?

Not a huge typo in fact… @Eurikon is an honorary member of MindMeld and was an important consultant when we created the mixer, so it’s all good as far as I’m concerned! :stuck_out_tongue:

I did not know. I looked in the library and did not see them.

Always quit any running browser, and anything else using background cycles that doesn’t need to be running apart from the music stuff. The multi-thread issue may be due to, first, the way the M1 handles threading, and second, the fact that multi-threaded programming is hard to do correctly and efficiently on any platform, especially when one thread essentially needs to run in near real time. You must be making some pretty large patches if you need more than one thread. I have a big MixMaster/AuxMaster/EQMaster/BassMaster and CompII cabled together for output mixing and have never had any perf issues, even with all 16 mixer channels in use… This is on a plain old M1 Mini.

I’m using an M1 Mac mini and I can easily create patches in excess of 30 modules. In some of those patches I’m using two MindMeld MixMasters: an 8-channel one for my drum bus and a 16-channel one for my main mix, with the Aux and EQ expanders on each. I’m set to a 48kHz sample rate, a buffer size of 64, and 1 thread. Our single-core performance is pretty similar: 1714 on my M1 Mac mini versus 1754 on your M1 Max MacBook Pro. How many, and what type of, modules are you using to get to 90% of a single thread? I have created some pretty big patches and have yet to go over about 60% CPU usage. Also, what is the buffer size in your audio output module set to?

That is quite a low buffer size. I generally run at 256 for example.

But I did see your other thread where you said that as you reduced the buffer size, CPU usage actually went down rather than up, which sounds a bit odd…

It would be interesting to see if you get any better ‘real world’ performance (rather than just looking at the CPU meters) by changing to say 256.

I don’t know what my real-world performance would be. I could try adding as many modules as I can with a buffer size of 256 and see if the same thing happens when I then lower the buffer size, but I can’t see the point in wasting time doing that. I run out of ideas in a patch before I hit any problem with the number of modules I can use with the buffer set at 64. There is also another reason I use a low buffer size: I incorporate my Elektron Analog Keys into most of my patches, so keeping the latency down is important to me.

This is when running clean, without other processes running. I tend to favour some of the heavier modules, like @modlfo Vult filters and distortion. I’m working in quad as well, which means quite a few channels of polyphony and many utility modules as soon as patches grow beyond a few sources. I’ve gotten quite good at optimizing patches for low CPU usage, but needing more than one thread is a situation I’m commonly running into.

Seems a shame to be limited from using all the other cores available in these chips.

While it’s running under Rosetta 2 there is little anyone can do, and an M1-native build is not likely until all the libraries Rack depends on are native as well… Plus that would add another third to the support burden, going from 3 to 4 supported platforms. It’s more likely to happen by the time Apple no longer supports anything with an Intel chip inside, as then it wouldn’t be another platform to support. So you sadly fall between a rock and a hard place. I would put in a vote for an M1 build in an email to VCV support. I guess if enough of us add our votes it may add some urgency to get a native build sooner rather than later…

Just a stupid idea that probably won’t work, but if you have the Pro version of VCV Rack it might be possible to have another instance of VCV Rack running as a plugin in a host and thus, as far as I understand it, running on another thread. I haven’t tried this as my patches are not that big, but it might be worth investigating.

That is an intriguing idea. Unless there is some M1 architecture constraint that gets in the way, I think you should be able to set the block size quite low to get extremely low latency.