I’m wondering if anyone has tips on managing frame-based processing in VCV Rack. I’ve pulled together some code for a real-time pitch shifter using a short-time Fourier transform (STFT), but I’m not sure how to manage the frame processing within the VCV Rack framework.
Summary of my thoughts/where I’m at:
A short-time Fourier transform typically requires at least 512 samples to be collected before processing can start.
Once you’ve gathered a block of 512 samples, you have to do the STFT and all your processing within a single sample (because VCV Rack calls Process() every sample)… which is a lot to do in very little time… herein lies my dilemma.
Ideally, all the processing would occur in a lower priority worker thread… but I come from the land of embedded devices and I don’t typically use such features. I’ve seen: How do you buffer data for block processing - #10 by Xenakios where @Squinky and @marc_boule discuss the use of “worker threads” which can operate outside of the main Process() block. Looking over the “MindMeld EQMaster” source code, I see them being used, but there’s a lot to absorb there and I’d like a softer introduction. Is this just a typical feature in C++11? I understand threads and mutex conceptually but would like some more guidance on how to use them properly within VCV.
Even though Rack calls the process() method each sample, that isn’t a hard limit on how long your process method is allowed to run, since everything is ultimately running at the hardware buffer size anyway. Rack just needs to get everything done during the driver callback. So if the sample rate is 44100 Hz and the buffer size is 512 samples, all the modules together need to finish in under about 11.6 milliseconds (512 / 44100 s). You could start overthinking all this and involve worker threads and such, but it likely isn’t going to matter much in the end. If you do the threading incorrectly, it might easily lead to worse performance, not better. I am sure people will disagree with this, though.
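The arithmetic behind that time budget can be written as a one-line helper. This is only a sketch: `callbackBudgetMs` is a hypothetical name, not something Rack provides, and it assumes the whole patch shares the time of one driver callback.

```cpp
#include <cassert>

// Hypothetical helper (not Rack API): time available for one audio driver
// callback, in milliseconds. bufferSize is in sample frames, sampleRate in Hz.
double callbackBudgetMs(int bufferSize, double sampleRate) {
    assert(bufferSize > 0 && sampleRate > 0.0);
    return 1000.0 * bufferSize / sampleRate;
}
```

Calling `callbackBudgetMs(512, 44100.0)` gives roughly 11.6 ms, while a 128-sample buffer at the same rate leaves only about 2.9 ms for every module in the patch combined.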
Hmm… but how to manage context switching when Rack calls your process method again?
For example, let’s say I finally get a block of 512 samples and I start doing my FFT or processing my reverb or whatever, but then halfway through my FFT, Rack interrupts my processing with another Process() call?
I wouldn’t advise running a big FFT on the audio thread. You will block everyone else from running. I think most of us run the FFT on a worker thread. @Xenakios: what plugins do you have out that do block FFT processing on the audio thread?
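For anyone wanting a softer introduction to the worker-thread idea, here is a minimal sketch in plain C++11. It is not taken from any existing plugin and uses nothing from the Rack API: `FrameWorker`, `submit` and `fetch` are hypothetical names. The audio side only ever tries the lock and never waits on it, while the heavy work runs on the background thread with the lock released.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper (not Rack API): runs `work` on one frame at a time
// in a background thread.
class FrameWorker {
public:
    explicit FrameWorker(std::function<void(std::vector<float>&)> work)
        : work_(std::move(work)), thread_([this] { run(); }) {}

    ~FrameWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        cv_.notify_one();
        thread_.join();
    }

    // Audio thread: hand over a full frame. Returns false if the lock is
    // taken or the worker has not finished the previous frame yet.
    // Note: the copy may allocate; a real module would preallocate.
    bool submit(const std::vector<float>& frame) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock.owns_lock() || state_ != State::Idle) return false;
        frame_ = frame;
        state_ = State::Pending;
        lock.unlock();
        cv_.notify_one();
        return true;
    }

    // Audio thread: collect a finished frame, if one is ready.
    bool fetch(std::vector<float>& out) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock.owns_lock() || state_ != State::Done) return false;
        out.swap(frame_);
        state_ = State::Idle;
        return true;
    }

private:
    enum class State { Idle, Pending, Busy, Done };

    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return quit_ || state_ == State::Pending; });
            if (quit_) return;
            std::vector<float> local;
            local.swap(frame_);
            state_ = State::Busy;
            lock.unlock();
            work_(local);          // heavy processing happens outside the lock
            lock.lock();
            frame_.swap(local);
            state_ = State::Done;
        }
    }

    std::function<void(std::vector<float>&)> work_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::vector<float> frame_;
    State state_ = State::Idle;
    bool quit_ = false;
    std::thread thread_;  // declared last so the other members exist before it starts
};
```

In a module, process() would call `submit` when a frame fills up and poll `fetch` on later calls, outputting the previous result (or silence) in the meantime. That adds latency and needs a policy for frames that are rejected while the worker is busy; both are design choices this sketch leaves open.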
Hmmm… well, I was recently working on a reverb module where I tried to do block based processing and I was getting tons of clicks and pops when I made the block size greater than 128 samples. I assumed it was because it was too much for one Process() loop. Here’s a simplified version of the code where I use double buffering. Maybe my issue is that I grab the inputs, do the processing (if a block is ready), and then write the output samples?
void process(const ProcessArgs& args) override {
    // grab input and store it in double buffer -> called every sample
    {
        // scale to +/- 1 and store in buffer
        doubleBuffer[writePtr] = inputs[INL_INPUT].getVoltage() * 0.2f;
        writePtr++;
        if (writePtr == halfBufferSize) {
            halfComplete = true;
        } else if (writePtr == bufferSize) {
            fullComplete = true;
        }
        writePtr = writePtr % bufferSize;
    }

    if (halfComplete || fullComplete) { // Process samples in double buffer if buffer is half full
        // Clear condition, set offset
        size_t offset = 0;
        if (halfComplete) {
            offset = 0;
            halfComplete = false;
        } else if (fullComplete) {
            offset = halfBufferSize;
            fullComplete = false;
        }
        // Process the whole block here using something like -> Process(float* buffer, size_t bufferSize)
        reverb->Process(&doubleBuffer[offset], halfBufferSize);
    }

    // set output from double buffer -> called every sample
    {
        outputs[OUTL_OUTPUT].setVoltage(doubleBuffer[readPtr] * 5.0f);
        readPtr++;
        readPtr = readPtr % bufferSize;
    }
}
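For comparison, here is a sketch of one way to keep the per-sample interface while processing in blocks, with the output deliberately delayed by exactly one block so that it only ever reads samples that have already been processed. It uses separate input and output buffers rather than one shared double buffer. `BlockAdapter` and `step` are hypothetical names, not Rack API, and the callback stands in for something like `reverb->Process(...)`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical adapter (not Rack API): accepts one sample per call, runs
// `processBlock` once every `blockSize` calls, and returns output that is
// delayed by exactly `blockSize` samples.
class BlockAdapter {
public:
    using BlockFn = std::function<void(float*, std::size_t)>;

    BlockAdapter(std::size_t blockSize, BlockFn processBlock)
        : in_(blockSize, 0.0f), out_(blockSize, 0.0f), fn_(std::move(processBlock)) {
        assert(blockSize > 0);
    }

    float step(float x) {
        float y = out_[pos_];  // result of the previous block (silence at first)
        in_[pos_] = x;         // collect the current block
        if (++pos_ == in_.size()) {
            fn_(in_.data(), in_.size());  // in-place block processing
            in_.swap(out_);               // processed block becomes the output
            pos_ = 0;
        }
        return y;
    }

private:
    std::vector<float> in_;
    std::vector<float> out_;
    BlockFn fn_;
    std::size_t pos_ = 0;
};
```

Inside process(), this would be used roughly as `outputs[...].setVoltage(adapter.step(inputs[...].getVoltage() * 0.2f) * 5.0f)`. Note that the whole block is still processed inside a single call, so the CPU load arrives in one spike every `blockSize` samples; this sketch only addresses the ordering of reads and writes, not that spike.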
The reverb has all-pass filters, feedback delays, delay taps, lots of LFOs, lots of interpolation, etc. It works with no problem processing things one sample at a time (1-3% CPU on my computers), but in theory it should run a little smoother with a block-based approach. But like I said, it starts dropping samples when I increase the block size beyond 128 samples.
I assume the processing for an STFT will be on par or worse, since I plan on processing blocks of 512 or 1024.
I basically have a buffer, and when it is full, I do the FFT. Since the buffer is usually at least 512 samples, I only do an FFT every 512 process calls, and that doesn’t seem to stress the CPU too badly.
I don’t understand. I usually set the buffer size of my interface to 128 samples, but sometimes 64. Some people even use 32. Which buffer is usually 512 samples?
Yes, I know that you use a separate thread for your FFT stuff. I do the same in Colors, where I use an inverse FFT to make the colored noise. I also use a worker thread in SFZ player, because I don’t want to do file I/O on the audio thread, although I know there are people who do that.