Rack v1 development blog

Added HybridBarrier, which might be interesting to anyone who writes threaded code. This is a spin barrier that turns into a mutex barrier when APP->engine->yieldWorkers() is called during a Module::step() method.

Here’s how engine workers functioned before.

  • Every sample, all workers race to grab modules in the rack and call their step() method until no modules are left.
  • When a worker is finished with that task, it spins in a while (!allWorkersAreFinished) {} loop.
  • The main engine thread (which is also worker #0) runs some serial code, like copying voltages from outputs to inputs along cables, and param smoothing.
  • Repeat.
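
The loop above can be sketched roughly as follows. This is only an illustration of the idea, not Rack's engine code: ToyModule, runSpinEngine, and the three atomic counters are hypothetical names made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a rack module (not Rack's Module class).
struct ToyModule {
    int steps = 0;
    void step() { ++steps; }
};

// Steps `moduleCount` toy modules for `frames` samples using `workerCount`
// spinning workers. Returns the total number of step() calls made.
int runSpinEngine(int moduleCount, int workerCount, int frames) {
    assert(workerCount >= 1);
    std::vector<ToyModule> modules(moduleCount);
    std::atomic<std::size_t> nextModule{0};  // next unclaimed module index
    std::atomic<int> finished{0};            // workers done with this sample
    std::atomic<int> frame{0};               // samples completed so far

    auto worker = [&](int id) {
        for (int f = 0; f < frames; ++f) {
            // Race to grab modules until none are left.
            for (;;) {
                std::size_t i = nextModule.fetch_add(1);
                if (i >= modules.size()) break;
                modules[i].step();
            }
            finished.fetch_add(1);
            if (id == 0) {
                // Worker #0 spins until every worker is finished...
                while (finished.load() < workerCount) {}
                // ...then would run the serial code (cables, param smoothing)
                // and resets the shared state for the next sample.
                nextModule.store(0);
                finished.store(0);
                frame.store(f + 1);
            } else {
                // Other workers spin until worker #0 starts the next sample.
                while (frame.load() <= f) {}
            }
        }
    };

    std::vector<std::thread> threads;
    for (int id = 1; id < workerCount; ++id) threads.emplace_back(worker, id);
    worker(0);  // the calling thread doubles as worker #0
    for (auto& t : threads) t.join();

    int total = 0;
    for (const ToyModule& m : modules) total += m.steps;
    return total;
}
```

Calling runSpinEngine(10, 2, 20) steps each of the 10 toy modules once per sample, for 200 step() calls in total. Every wait in this version is a busy loop, which is cheap only as long as the waits are short.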

Suppose you have 100 modules taking roughly 1 μs each to step one sample. Then the time between when the first worker finishes and the last worker finishes is roughly 1 μs. (Imagine you have a stack of 100 rocks of various sizes and you naively try to divide them evenly into 3 groups. The weight difference between the heaviest and lightest group will be around the average weight of 1 rock.) This isn’t a lot of time to be spinning, so not much CPU is wasted between each sample.
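
The rock analogy can be checked with a small simulation. The sketch below assumes each worker grabs the next unclaimed module the moment it is free; finishSpread and the cost values are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simulates workers that each claim the next module as soon as they are free.
// `costs` holds made-up per-module step times in microseconds.
// Returns the gap between the earliest and latest worker finish times.
double finishSpread(const std::vector<double>& costs, int workerCount) {
    assert(workerCount >= 1);
    std::vector<double> busyUntil(workerCount, 0.0);
    for (double cost : costs) {
        // The worker that frees up first claims the next module.
        auto next = std::min_element(busyUntil.begin(), busyUntil.end());
        *next += cost;
    }
    double first = *std::min_element(busyUntil.begin(), busyUntil.end());
    double last = *std::max_element(busyUntil.begin(), busyUntil.end());
    return last - first;
}
```

With 100 modules of 1 μs each on 3 workers, the workers finish at 34, 33, and 33 μs, so finishSpread(std::vector<double>(100, 1.0), 3) is 1 μs. Under this greedy assumption the spread never exceeds the cost of the most expensive module.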

Now suppose that Core AUDIO needs to block until the audio driver thread is ready to receive a new buffer. This might be 128 samples / 44100 Hz ≈ 2900 μs of wait time, so meanwhile the other worker threads will be spinning for a very long time, waiting on Core AUDIO to return. This pegs each spinning worker's CPU core at 100%. Here’s a solution.
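
As a quick check of the wait time: the duration of one audio buffer is its length in samples divided by the sample rate. The helper name bufferDurationMicros below is made up for this example.

```cpp
#include <cassert>

// Time covered by one audio buffer, in microseconds.
double bufferDurationMicros(int frames, double sampleRateHz) {
    assert(sampleRateHz > 0.0);
    return frames / sampleRateHz * 1e6;
}
```

For 128 samples at 44100 Hz this gives about 2902 μs, which is thousands of times longer than the roughly 1 μs of spinning per sample in the ordinary case.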

  • Before Core AUDIO begins waiting, tell the engine to tell all workers to stop spinning and lock on a mutex instead.
  • This (usually) causes all workers to yield to the OS and let other application threads do their thing or allow the core to idle.
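
Here is a minimal sketch of how such a hybrid barrier could work. It is not Rack's HybridBarrier: the class HybridBarrierSketch, its requestYield() method (standing in for what yieldWorkers() triggers), and the runPhases() demo are hypothetical. The sleeping path uses a std::condition_variable guarded by a std::mutex, which is an assumption about one reasonable way to "lock on a mutex".

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of a spin barrier that can fall back to sleeping.
class HybridBarrierSketch {
public:
    explicit HybridBarrierSketch(int total) : total_(total) { assert(total >= 1); }

    // Ask threads waiting in the current phase to sleep instead of spin.
    void requestYield() { yield_.store(true); }

    void wait() {
        const unsigned gen = generation_.load();
        if (arrived_.fetch_add(1) + 1 == total_) {
            // Last thread to arrive: reset the barrier and release the others.
            arrived_.store(0);
            {
                std::lock_guard<std::mutex> lock(mutex_);
                yield_.store(false);
                generation_.store(gen + 1);
            }
            cv_.notify_all();
            return;
        }
        // Spin until released, unless a yield has been requested.
        while (generation_.load() == gen) {
            if (yield_.load()) {
                // Sleep until the last thread bumps the generation.
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [&] { return generation_.load() != gen; });
                return;
            }
        }
    }

private:
    const int total_;
    std::atomic<int> arrived_{0};
    std::atomic<unsigned> generation_{0};
    std::atomic<bool> yield_{false};
    std::mutex mutex_;
    std::condition_variable cv_;
};

// Demo: every thread does one unit of "work" per phase, then waits.
// Thread 0 requests a yield each phase when `yieldEachPhase` is true.
// Returns true if no thread ever passed the barrier early.
bool runPhases(int threadCount, int phases, bool yieldEachPhase) {
    HybridBarrierSketch barrier(threadCount);
    std::atomic<int> work{0};
    std::atomic<bool> ok{true};
    std::vector<std::thread> threads;
    for (int id = 0; id < threadCount; ++id) {
        threads.emplace_back([&, id] {
            for (int p = 0; p < phases; ++p) {
                if (id == 0 && yieldEachPhase) barrier.requestYield();
                work.fetch_add(1);
                barrier.wait();
                // After phase p, all threads must have finished p + 1 units.
                if (work.load() < (p + 1) * threadCount) ok.store(false);
            }
        });
    }
    for (auto& t : threads) t.join();
    return ok.load() && work.load() == phases * threadCount;
}
```

Both runPhases(3, 20, false) and runPhases(3, 20, true) should return true; the difference is that in the second case waiting threads go to sleep rather than burn CPU. The generation counter is a design choice of this sketch that lets the same barrier be reused every sample without a thread mistaking one phase for the next.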

It takes 10-100 ns for a spinlock to “wake up” and 1,000-10,000 ns for a mutex to wake up, so calling yieldWorkers() has a massive cost. It should only be used when you know your DSP kernel needs to block for much longer than 1,000 ns.
