Are delays in VCV or VCV Host “blocky” or “smooth”?

TL;DR this is just a question, not a claim about anything in VCV. The question is posed at the very end of this. The rest is just setting context so that I can try and ask the question.

In particular I’m wondering about the delay from “gesture” to audio, where a gesture is typically clicking on a button with a mouse or playing a note on a MIDI keyboard. By “smooth” I mean that the delay is perfectly constant, and by “blocky” I mean that the delay is not constant, but rather quantized to a buffer boundary. Or something in between.

Why does it matter? If the delay is constant, like the delay due to the speed of sound when playing an electric guitar, then you can tolerate quite a bit of it before it becomes objectionable. But if the delay is not constant, but instead is either quantized to buffer boundaries or just random, then it is much more noticeable and objectionable.

I (and many others) have done a lot of testing around this, since back in the “bad old days” this came up in two ways: a) when the CPU in a hardware synth was over-burdened and introduced delay between pressing a key and sound coming out, and b) as the dreaded “MIDI Timing Jitter”, caused either by the variable delay between sending a MIDI event and the synth producing sound, or by a MIDI sequencer whose recording or playback algorithms introduced random timing errors.

Those who have suffered through my posts on aliasing will not be surprised that I used to get into plenty of flame wars over this issue. But it was a huge controversy even without my help.

Anyway, what does this have to do with VCV? I’m not sure, but here’s how it applies to VSTs.

In the model of VST 2, a VST plugin is given 1) a buffer of data, 2) a buffer of MIDI, 3) the instantaneous values of some continuous control parameters. Because this whole system (DAW) tends to run on block sizes, running a “stream” through a VST doesn’t add any overall delay (factor #1 doesn’t matter). If the DAW supports time-stamped MIDI, then the buffer of MIDI delivered to the VST will have accurate time stamps, and the VST would be able to render audio with no extra delay due to quantization of MIDI times.

I have no idea how “most” DAWs work. I have myself worked on a popular DAW that did not support time-stamped MIDI and hence quantized all MIDI gestures to the buffer boundaries.

VST 3 adds (among other things) the ability to deliver time stamped control information to the VST. This would allow, in theory, for a VST 3 VI to have a purely constant delay (no delay, as seen by the host system).

So the VST processing model is that audio has a constant (zero added) delay, and that gestures may or may not be quantized to buffer boundaries. All this relative to the rest of the DAW which is running at an overall delay of one or two buffers.
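To make the “blocky” versus “smooth” distinction concrete, here is a toy model in Python. It is only a sketch under assumptions of my own: it does not describe how any particular DAW or VCV behaves, and the function names are made up for illustration. In the “blocky” case a gesture takes effect at the start of the next buffer; in the “smooth” case it is time stamped and rendered exactly one buffer later.

```python
def blocky_delay(t: int, block: int) -> int:
    """Gesture at sample t takes effect at the start of the next block (assumed model)."""
    next_block_start = (t // block + 1) * block
    return next_block_start - t


def smooth_delay(t: int, block: int) -> int:
    """Gesture is time stamped and rendered exactly one block later (assumed model)."""
    return block


def jitter(delays) -> int:
    """Spread between the longest and shortest gesture-to-audio delay, in samples."""
    return max(delays) - min(delays)


BLOCK = 128
gesture_times = range(10 * BLOCK)  # one gesture at every possible sample position

blocky = [blocky_delay(t, BLOCK) for t in gesture_times]
smooth = [smooth_delay(t, BLOCK) for t in gesture_times]

print("blocky jitter (samples):", jitter(blocky))
print("smooth jitter (samples):", jitter(smooth))
```

In this model the worst-case delay is the same in both cases (one buffer); what differs is that the quantized version varies by almost a full buffer from gesture to gesture, while the time-stamped version does not vary at all.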

Aside: a 128 sample buffer is about 3ms at a 44.1kHz sample rate. That is a short constant delay time, but 3ms of timing jitter is moderately noticeable and/or annoying (depending on application and personal preference).
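The arithmetic behind that figure is just frames divided by sample rate. A minimal sketch (the helper name is mine, purely for illustration):

```python
def buffer_duration_ms(frames: int, sample_rate: float) -> float:
    """Length of one audio buffer in milliseconds."""
    return 1000.0 * frames / sample_rate


# Compare a few common buffer sizes at a 44.1 kHz sample rate.
for frames in (32, 64, 128, 256, 512):
    print(f"{frames:4d} frames -> {buffer_duration_ms(frames, 44100):.2f} ms")
```

For 128 frames at 44.1 kHz this comes out at roughly 2.9 ms, which is also the most jitter that quantizing to buffer boundaries could add at that buffer size.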

So, what does VCV do? When the sound driver (ASIO?) asks for a buffer of sounds, does VCV generate a buffer “as fast as possible” without timestamps, thus introducing some jitter in the time from gesture to audio? Or is gestural input time stamped as in VST 3, or in some other way presented to the engine without jitter?

Similar for VCV Host – if the VST is capable, can it see the 16 inputs from VCV with time stamps, or are they sampled at the start of a host buffer?

Aside – VCV Host has its own buffers, which may be set considerably smaller than the overall sound-card buffer settings.

I can answer that for VCV Host. MIDI events are timestamped, for both VST2 and VST3 plugins. Parameter changes are not timestamped; they are sent in a batch, once per block.

It’s worth emphasising that you can set the VCV Host block size to a small value such that you’d never be able to perceive any latency or jitter. (I’ve never come across a VST that doesn’t work at small block sizes, but of course such things probably exist.)

So the data is delivered to the VST with “real” timestamps by host? That’s pretty cool.

Yes. MIDI events are still buffered per-block, but they each have a timestamp within the buffer.
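As an illustration of what a per-block buffer with in-buffer timestamps allows, here is a toy Python sketch. It is not VCV Host's code or any VST API; `render_block` and its arguments are hypothetical. Each event carries a sample offset within the block, and the renderer splits the block at those offsets. With `timestamped=False` it falls back to applying everything at the block start, which is the “blocky” behaviour.

```python
def render_block(frames, events, state, timestamped=True):
    """Render one block of a toy 'gate' signal.

    events: list of (offset_in_block, new_value), sorted by offset.
    state:  gate value carried over from the previous block.
    Returns (samples, new_state).
    """
    out = []
    if not timestamped:
        # No timestamps: every event takes effect at the start of the block.
        for _, value in events:
            state = value
        events = []
    pos = 0
    for offset, value in events:
        # Hold the old value up to the event's offset, then switch.
        out.extend([state] * (offset - pos))
        state = value
        pos = offset
    out.extend([state] * (frames - pos))
    return out, state


# A note-on arriving 3 samples into an 8-sample block.
exact, _ = render_block(8, [(3, 1.0)], 0.0, timestamped=True)
quantized, _ = render_block(8, [(3, 1.0)], 0.0, timestamped=False)
print(exact)
print(quantized)
```

With timestamps the gate opens at sample 3, where the event actually happened; without them it opens at sample 0 of the block, so the timing error depends on where in the block the event fell.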
