Hi guys, I’ve got a non-expert question that may seem stupid; I guess I just didn’t understand how it works.
VCV Rack is handling multiple CPUs… cool, my fan will say thank you!
But when I load about 4 modules, the task manager says Rack takes 7% (a little bit less than 0.6),
and when I set Rack to 4 CPUs, the task manager says Rack takes 50% of the CPU for the exact same modules… so what is the point?
I hope my question is not too stupid but I really don’t understand what is the benefit of this…
Not sure how this confusion is so common, but multithreading takes more CPU, not less. This means more electricity, more heat, and fewer resources for other programs.
Imagine you have 1000 hours of paperwork to do. If you hire 3 other people, the total man-hours do not decrease; they increase to maybe 1500 hours, because collaborating with others adds overhead. What you gain is wall-clock time: split four ways, the paperwork is done sooner.
Yes, see https://github.com/VCVRack/Rack/blob/v1/src/engine/Engine.cpp#L104. All threads spin after they are finished until the last thread finishes. If modules like Core AUDIO call Engine::yieldWorkers(), this spinlock turns into a mutex until the last thread finishes.
All multithreading overhead is due to both serial code at the end of each timestep and thread orchestration such as spinning.
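To make the spinning cost concrete, here is a minimal sketch of a spin barrier of the kind described above. It is not Rack's actual code: `SpinBarrier`, `runTimesteps`, and the fake per-thread workload are hypothetical names invented for illustration. Each worker finishes its share of a timestep and then busy-waits until the last worker arrives, so every waiting core shows up as fully loaded in the task manager even though it does nothing useful.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin barrier (illustration only, not Rack's implementation).
// Every thread that arrives early busy-waits until the last one arrives.
class SpinBarrier {
public:
    explicit SpinBarrier(int threadCount) : total(threadCount) {
        assert(threadCount > 0);
    }

    // Blocks until all threads have arrived.
    // Returns how many loop iterations this thread burned while waiting.
    long long wait() {
        long long spins = 0;
        const int gen = generation.load(std::memory_order_acquire);
        if (arrived.fetch_add(1, std::memory_order_acq_rel) + 1 == total) {
            // Last thread in: reset the counter, then release the others.
            arrived.store(0, std::memory_order_relaxed);
            generation.fetch_add(1, std::memory_order_release);
        } else {
            // Early threads keep their core busy doing nothing useful.
            while (generation.load(std::memory_order_acquire) == gen) {
                ++spins;
            }
        }
        return spins;
    }

private:
    const int total;
    std::atomic<int> arrived{0};
    std::atomic<int> generation{0};
};

// Simulates `steps` timesteps on `threadCount` workers with uneven workloads.
// Returns the number of completed timesteps; `wastedSpins` receives the total
// number of busy-wait iterations across all workers.
int runTimesteps(int threadCount, int steps, long long& wastedSpins) {
    SpinBarrier barrier(threadCount);
    std::atomic<long long> spins{0};
    std::atomic<int> finished{0};
    std::vector<std::thread> workers;
    for (int id = 0; id < threadCount; ++id) {
        workers.emplace_back([&, id] {
            for (int s = 0; s < steps; ++s) {
                // Fake "module" work; higher ids get more, so others must wait.
                volatile long sink = 0;
                for (long i = 0; i < (id + 1) * 1000L; ++i) sink = sink + i;
                spins += barrier.wait();
                if (id == 0) ++finished;  // count each timestep once
            }
        });
    }
    for (auto& w : workers) w.join();
    wastedSpins = spins.load();
    return finished.load();
}
```

With a single worker the barrier never spins, which matches the observation that the overhead only appears once extra threads are enabled. With several workers, the wasted spin count grows with how unevenly the work is split.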
Depends on the chipset, really. Having more but weaker cores (or a chip with an elastic power profile) might be good enough for certain workloads while requiring less effort spent on cooling. I don’t think desktops fit this model, though.
The best idea is to use extra cores only when you need to; there’s quite a lot of overhead. Strangely, there’s less overhead when you’re actually using the extra cores. When you’re using cores you don’t need, they spend a lot of time spinning the CPU.
I’m sure we’ll work out a better way to do this, but for now you can get some extra modules this way.
Sleep releases the core back to idle; _mm_pause only hints to the CPU that the thread is in a spin-wait loop, so the core stays allocated to that thread.
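The difference between the two kinds of waiting can be sketched as follows. The function names `cpuRelax`, `waitSpinning`, and `waitSleeping` are made up for this example, and on non-x86 builds the sketch falls back to `std::this_thread::yield()`, which is only a stand-in and not equivalent to `_mm_pause`. Both functions wait for at least the requested time, but only the sleeping one hands the core back to the OS scheduler.

```cpp
#include <chrono>
#include <thread>
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
#define HAS_MM_PAUSE 1
#else
#define HAS_MM_PAUSE 0
#endif

using Clock = std::chrono::steady_clock;

// Spin-wait hint. The thread keeps running on its core the whole time.
inline void cpuRelax() {
#if HAS_MM_PAUSE
    _mm_pause();
#else
    std::this_thread::yield();  // fallback for non-x86 builds (assumption)
#endif
}

// Busy-waits for `d`: precise, but the core stays fully occupied.
Clock::duration waitSpinning(Clock::duration d) {
    const auto start = Clock::now();
    while (Clock::now() - start < d) {
        cpuRelax();
    }
    return Clock::now() - start;
}

// Sleeps for at least `d`: the core is free for other work meanwhile,
// but the OS decides when the thread actually wakes up again.
Clock::duration waitSleeping(Clock::duration d) {
    const auto start = Clock::now();
    std::this_thread::sleep_for(d);
    return Clock::now() - start;
}
```

Comparing the two return values for short waits gives a rough feel for how much later than requested a sleeping thread tends to wake up on a given machine.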
For game loops you can calculate a time budget for one frame, do some logic, and then sleep for the remainder. You can have 30fps or 60fps appear to use almost no CPU power this way. (I’m assuming Rack doesn’t do this because the timing precision needed to sleep 44,100 times per second is… less accessible.)
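A minimal sketch of that frame-budget pattern, with the hypothetical helpers `budgetFor` and `runFrames` (neither comes from Rack or any game engine). It also shows why the approach scales poorly to audio: at 60 fps the budget is about 16.7 ms per frame, while at a 44.1 kHz sample rate it is only about 22.7 µs per sample, which is typically far below what a general-purpose OS sleep can hit reliably.

```cpp
#include <chrono>
#include <functional>
#include <thread>

using Clock = std::chrono::steady_clock;

// Time available for one frame (or one audio sample) at a given rate.
std::chrono::nanoseconds budgetFor(long long ratePerSecond) {
    return std::chrono::nanoseconds(1000000000LL / ratePerSecond);
}

// Calls `update` once per frame and sleeps away whatever is left of each
// frame's budget. Returns the total wall-clock time spent.
std::chrono::nanoseconds runFrames(long long fps, int frames,
                                   const std::function<void()>& update) {
    const auto budget = budgetFor(fps);
    const auto start = Clock::now();
    auto deadline = start;
    for (int i = 0; i < frames; ++i) {
        update();                                 // the frame's actual work
        deadline += budget;                       // end of this frame's slot
        std::this_thread::sleep_until(deadline);  // core is idle until then
    }
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        Clock::now() - start);
}
```

For example, `runFrames(60, 600, step)` would run a hypothetical `step` function for roughly ten seconds while leaving the core idle for most of each frame, provided `step` itself is cheap.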
I have extensively researched the problem of decreasing CPU usage while not sacrificing module count or the 1-sample latency rule. I am interested in well-tested solutions for improvements, but I doubt this is possible at this point, so most speculated solutions will increase CPU usage or sacrifice module count.