Question about V1 CPU handling

pyer · May 31, 2019, 8:15pm

Hi guys, I’ve got a non-expert question that may seem stupid. I guess I just didn’t understand how it works.

VCV Rack now handles multiple CPUs… cool, my fan will say thank you!

But when I load about 4 modules, Task Manager says Rack takes 7% of the CPU (a little less than in 0.6), and when I set Rack to 4 CPUs, Task Manager says Rack takes 50% of the CPU for the exact same modules… so what is the point?

I hope my question is not too stupid, but I really don’t understand what the benefit of this is…

Skrylar · May 31, 2019, 9:01pm

It’s only multithreaded within a single sample, so you are not going to see the same difference as, e.g., giving three u-he Divas their own threads.

Turning up the CPU count is not going to be worth it unless your small number of modules are particularly hungry (Vult?) or you have quite a lot of modules mounted.

@Vortico Are you using hot spin locks to synchronize the threads? That might explain some of it if true.

Vortico · May 31, 2019, 9:28pm

Not sure how this confusion is so common, but multithreading takes more CPU, not less. This means more electricity, more heat, and fewer resources for other programs.
Imagine you have 1000 hours of paperwork to do. If you hire 3 other people, the total man-hours do not decrease but increase to maybe 1500 hours, as there is the overhead of collaborating with others.
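
To put numbers on the analogy, here is a tiny sketch of the arithmetic. It assumes the 1500 hours split evenly across the 4 people, which the analogy does not actually state; `Cost` and `split_work` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical helper for the paperwork analogy.
struct Cost {
    double total_hours;       // hours summed over all workers (what you pay for)
    double wall_clock_hours;  // hours until the job is finished
};

// Assumes the total (work plus coordination overhead) divides evenly.
Cost split_work(double total_hours_with_overhead, int workers) {
    assert(workers > 0);
    return {total_hours_with_overhead, total_hours_with_overhead / workers};
}
```

With `split_work(1500.0, 4)` the job finishes after 375 hours instead of 1000, but 1500 hours are paid for instead of 1000. In CPU terms: the patch has more headroom before it overloads, while the usage shown in the task manager goes up.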

Vortico · May 31, 2019, 10:00pm

Yes, see Rack/src/engine/Engine.cpp at v1 · VCVRack/Rack · GitHub. All threads spin after they are finished until the last thread finishes. If modules like Core AUDIO call Engine::yieldWorkers(), this spinlock turns into a mutex until the last thread finishes. All multithreading overhead is due to both serial code at the end of each timestep and thread orchestration such as spinning.
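
For readers unfamiliar with the pattern, here is a minimal sketch of a spinning barrier of the kind described above. This is not Rack’s actual code: `SpinBarrier` and `run_steps` are hypothetical names, the per-step "work" is a stand-in counter, and the mutex fallback triggered by Engine::yieldWorkers() is left out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin barrier: threads that finish early busy-wait
// until the last thread of the current timestep arrives.
class SpinBarrier {
public:
    explicit SpinBarrier(int total) : total_(total) { assert(total > 0); }

    void wait() {
        const int gen = generation_.load();
        if (arrived_.fetch_add(1) + 1 == total_) {
            // Last thread to arrive: reset the counter, then release the others.
            arrived_.store(0);
            generation_.fetch_add(1);
        } else {
            // Early finishers spin here. This is the time that shows up as
            // CPU usage even though no module is being processed.
            while (generation_.load() == gen) {
            }
        }
    }

private:
    const int total_;
    std::atomic<int> arrived_{0};
    std::atomic<int> generation_{0};
};

// Runs `threads` workers for `steps` timesteps and returns the work units done.
int run_steps(int threads, int steps) {
    SpinBarrier barrier(threads);
    std::atomic<int> work{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int s = 0; s < steps; ++s) {
                work.fetch_add(1);  // stand-in for processing this thread's modules
                barrier.wait();     // spin until every thread finished this step
            }
        });
    }
    for (auto& th : pool) th.join();
    return work.load();
}
```

With only a few light modules, each thread finishes its share almost immediately and spends the rest of the timestep in the `while` loop, which would be consistent with the jump from 7% to 50% reported in the first post.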

Skrylar · May 31, 2019, 11:38pm

Depends on the chipset, really. Having more but weaker cores (or a chip with an elastic power profile) might be good enough for certain workloads but require less effort spent on cooling. I don’t think desktops fit this model though.

Nik · June 1, 2019, 9:36am

You will only benefit from adding another thread once the first one is close to maxed out.

JimT · June 1, 2019, 12:03pm

The best idea is to only use extra cores when you need to, because there’s quite a lot of overhead. Strangely, there’s less overhead when you’re actually using the extra cores. When you’re using cores you don’t need, they spend a lot of time spinning the CPU.

I’m sure we’ll work out a better way to do this, but for now you can get some extra modules this way.

Skrylar · June 1, 2019, 5:00pm

One option is yielding for as much time as scheduling precision allows and then hot looping for the remainder, but that would take some changes and tuning to get right.
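
A rough sketch of that idea, assuming standard C++ timing facilities rather than anything Rack-specific. `wait_until_hybrid` is a hypothetical name, and `spin_margin` is a made-up tuning parameter standing in for how imprecise the OS scheduler’s wake-ups are.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Waits until `deadline`: sleeps while more than `spin_margin` remains
// (the core can go idle), then busy-waits for the remainder.
// Returns the time at which the wait actually ended.
Clock::time_point wait_until_hybrid(Clock::time_point deadline,
                                    std::chrono::microseconds spin_margin) {
    assert(spin_margin.count() >= 0);
    if (deadline - Clock::now() > spin_margin) {
        // Coarse part: hand the core back to the OS.
        std::this_thread::sleep_until(deadline - spin_margin);
    }
    // Fine part: hot loop to hit the deadline as precisely as possible.
    while (Clock::now() < deadline) {
    }
    return Clock::now();
}
```

The tuning problem is choosing `spin_margin`: too small and the sleep overshoots the deadline, too large and the wait degenerates into a pure spin.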

JimT · June 1, 2019, 5:02pm

That’s what Andrew does.

Skrylar · June 1, 2019, 5:11pm

There is a difference between nanosleep/usleep and _mm_pause:

Sleep releases the core back to idle, while _mm_pause just hints that now is a good time to do something else (but the core stays allocated).
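
A small sketch of the two waiting styles, for comparison. `wait_spinning`, `wait_sleeping`, and the `CPU_RELAX` macro are hypothetical names; on non-x86-64 targets the macro falls back to `std::this_thread::yield()`, which is only a rough substitute and not equivalent to `_mm_pause`.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#define CPU_RELAX() _mm_pause()
#else
// Assumption: no pause intrinsic available, so yield instead.
#define CPU_RELAX() std::this_thread::yield()
#endif

// Busy-waits until `flag` is set. The thread keeps its core the whole time;
// the pause hint only makes the spin cheaper for the CPU.
// Returns the number of polls that found the flag unset.
long wait_spinning(const std::atomic<bool>& flag) {
    long polls = 0;
    while (!flag.load(std::memory_order_acquire)) {
        CPU_RELAX();
        ++polls;
    }
    return polls;
}

// Polls `flag` but sleeps between polls, so the core can go idle.
// The wake-up can come later than `interval`, depending on the OS scheduler.
long wait_sleeping(const std::atomic<bool>& flag,
                   std::chrono::microseconds interval) {
    long polls = 0;
    while (!flag.load(std::memory_order_acquire)) {
        std::this_thread::sleep_for(interval);
        ++polls;
    }
    return polls;
}
```

The spinning version reacts almost immediately but shows up as a fully busy core; the sleeping version barely registers in the task manager but can wake up late.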

For game loops you can calculate a time budget for one frame, do some logic and then sleep for the remainder. You can have 30fps or 60fps appear to use almost no CPU power this way. (I’m assuming Rack doesn’t do this because timing precision to sleep 44,100 times per second is… less accessible.)
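
A minimal sketch of such a frame-budget loop, using only standard C++ timing. `run_frames` is a hypothetical name and `update` stands in for the game logic.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Calls `update` once per frame for `frames` frames at `fps`, sleeping off
// whatever is left of each frame's time budget.
// Returns how many frames overran their budget.
int run_frames(int fps, int frames, const std::function<void()>& update) {
    assert(fps > 0);
    using Clock = std::chrono::steady_clock;
    const auto budget = std::chrono::duration_cast<Clock::duration>(
        std::chrono::duration<double>(1.0 / fps));
    auto next = Clock::now() + budget;
    int overruns = 0;
    for (int i = 0; i < frames; ++i) {
        update();  // do this frame's logic
        if (Clock::now() > next) {
            ++overruns;  // logic took longer than the budget
        } else {
            std::this_thread::sleep_until(next);  // idle for the remainder
        }
        next += budget;
    }
    return overruns;
}
```

At 60 fps the budget is about 16.7 ms per frame, which an ordinary OS sleep handles comfortably. At a 44.1 kHz sample rate the budget per sample is roughly 22.7 µs, which is far harder to hit reliably with a sleep.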

JimT · June 1, 2019, 5:12pm

Not to mention it’s impossible for the engine to accurately estimate how long a module will take to process anything.

Vortico · June 1, 2019, 8:12pm

I have extensively researched the problem of decreasing CPU usage while not sacrificing module count or the 1-sample latency rule. I am interested in well-tested solutions for improvements, but I doubt this is possible at this point, so most speculated solutions will increase CPU usage or sacrifice module count.

pyer · June 2, 2019, 10:52am

Thank you for clearing up this misunderstanding of mine.