Tapped delay line "skipping buffer" issue


I am developing a “feedback loop utility” module. It consists of a single stereo input and 3 channels, where each channel has a stereo output (“departure”), stereo input (“arrival”), a delay knob and a gain knob. At all times, the departure output contains the original input signal and the arrival signal, delayed by the given amount (from 0s to 3s) and multiplied by the given gain (from 0% to 100%). The idea is the user can create their own effects chain in between the departure and arrival and have the signal going through it feedback (controllably, with the gain knob), making for some quite interesting effects. However, with nothing in between, just two (for a stereo pair) cables going from the departure to the arrival, it should function similarly to a delay effect.

To implement the tapped delay line functionality I am using a std::vector<std::pair<float, float>> memory (as in, a buffer of stereo samples) of size 3 (maximum delay time)* samplerate. Each channel has a vector like this and a size_t write, which points to the current sample in the buffer being written to. I believe my process code below is similar to using a ring buffer(?) I could be wrong though:

/// ...
///  inside of process:
#define DELAY_TIME clamp(params[DELAY1_PARAM + i].getValue() + inputs[DELAY1_MOD_INPUT + i].getVoltage(), 0.f, DELAY_MEMORY_SIZE*args.sampleRate)
#define GAIN       clamp(params[GAIN1_PARAM + i].getValue() + inputs[GAIN1_MOD_INPUT + i].getVoltage(), 0.f, 1.f)

for (int i = 0; i < 3; i++) {
            Channel& chan = channels[i];
            size_t size = chan.memory.size();
            chan.memory[chan.write] = { inputs[ARRIVAL1_L_INPUT + i*2].getVoltage(), inputs[ARRIVAL1_R_INPUT + i*2].getVoltage() };
            // how far back in the ring buffer we have to go to reach the appropriate delayed sample
            size_t setback = (size_t)roundf(args.sampleRate * DELAY_TIME);

            size_t delay_location = ((chan.write - setback) % size + size) % size;
            std::pair<float, float> delay = chan.memory[delay_location];

            outputs[DEPARTURE1_L_OUTPUT + i*2].setVoltage(inputs[INPUT_L_INPUT].getVoltage() + delay.first*GAIN);
            outputs[DEPARTURE1_R_OUTPUT + i*2].setVoltage(inputs[INPUT_R_INPUT].getVoltage() + delay.second*GAIN);

            chan.write %= size;
#undef GAIN
/// ...

This works to some extent, but introduces weird “skipping” artifacts every so often. Sometimes an old echo will get amplified, sometimes a relatively new echo will die out immediately. This is a simple example recorded with a sine oscillator being played as the input. The expected behavior is a neatly echoing signal, where each echo takes exactly as long to die out and doesn’t suddenly get cut off in the middle. I checked practically every variable whose value could give a clue as to what’s wrong and nothing seems suspicious, which is why I’m resorting to writing here.

Moreover, this solution in general is quite memory-inefficient, 15MB taken up by a single module? We should do better! So I’m also curious about suggestions of completely rethinking the delay line.

Just to reiterate, I’m not working on a full-fledged delay effect, I just want to be able to mix in a delayed version of a signal to itself.

I can provide more samples from my code / answer any questions.

Mod with negative numbers does unexpected stuff so the push up by size you do in the double mod is probably better as pushing up inside and just doing one mod.

size_t delay_location = (chan.write - setback +size) % size;

Which should force the mod to always be of a positive number


Oh as to your other question there is no way to do a perfect delay for n samples without at least n * sizeof(flost) memory used so either you give up on perfect delay or allocate the memory

There’s a bunch of ways to dynamically grow your buffer if the max is the problem but they all end up with the max anyway when you need it.

1 Like

Wow, I think that’s an immediate fix, thank you!

1 Like

I was curious about that because the VCV Fundamental delay effect implementation uses more than one dsp::DoubleRingBuffer and I wasn’t quite sure how they worked in that context, so I ended up just writing my own ring buffer mechanism whose conceptual correctness I was more or less sure of.

I was actually curious about one more issue (I hope it’s fine to ask in the same topic); each channel also features a “kill” button, whose function is to set the departure output voltage to 0 and clear the buffer, in case feedback gets too crazy. I’m using a dsp::BooleanTrigger to process my own button widget, which inherits from SvgSwitch and has momentary set to true. It introduces some nasty distorted artifacts when kept pressed, but it seems to only do that “internally”, i.e. it does not show up on the VCV Audio module VU and does not get recorded with the Recorder module which is why I can’t provide an example. This is the code, located right after the previous snippet:

if (kill_trigger.process(params[KILL1_PARAM + i].getValue())) {
    outputs[DEPARTURE1_L_OUTPUT + i*2].setVoltage(0.f);
    outputs[DEPARTURE1_R_OUTPUT + i*2].setVoltage(0.f);
    std::fill(chan.memory.begin(), chan.memory.end(), std::make_pair(0.f, 0.f));
    chan.write = 0;

I can’t think of a simpler way to do this. I think the issue could be caused by this (somewhat costly) code being run more than one frame, which is not the expected behavior; I thought BooleanTrigger::process only returns true if the previous frame the parameter was false. Maybe it’s an entirely different reason?

That fill could be a bit expensive indeed especially if you do it twice. That could cause a crackle

But also that slam to zero will create a click. A more common approach would be to initiate a short fade with a state counter and then spend n frames zeroing then undo the fade - something like that

From a data structure perspective an vector of pairs would probably be less efficient than a pair of vectors of floats especially if you start adding delay lookup other than the zero order hold and with a pair of vectors you can just memset both to zero which in some cases will be faster

Agree though make sure it’s just done once :slight_smile:

1 Like

Thank you!