Because in Ableton he is using an aggregate output device made up of his built-in audio (outputs 1+2, for monitoring) plus BlackHole (outputs 3+4, for clock and transport). Outputs 3+4 of the aggregate are the first two BlackHole channels.
In VCV he is using just the BlackHole device as the input, not the aggregate. So outputs 3+4 in Ableton arrive as inputs 1+2 in VCV.
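If it helps to see the channel offset spelled out, here is a rough sketch in Python. It assumes the built-in audio contributes exactly two outputs and sits first in the aggregate device; the function name and constant are made up for illustration and are not part of Ableton, VCV or BlackHole.

```python
# Assumption: the built-in device is listed first in the aggregate
# and contributes 2 output channels (1+2).
BUILT_IN_OUTPUTS = 2


def aggregate_to_blackhole(aggregate_channel: int, offset: int = BUILT_IN_OUTPUTS) -> int:
    """Map a 1-based aggregate output channel to the 1-based BlackHole channel."""
    if aggregate_channel <= offset:
        # Channels up to the offset belong to the built-in audio, not BlackHole.
        raise ValueError("channel belongs to the built-in device, not BlackHole")
    return aggregate_channel - offset


for ch in (3, 4):
    print(f"Ableton output {ch} -> VCV input {aggregate_to_blackhole(ch)}")
```

If the aggregate were built in a different order, or the built-in device had more than two outputs, the offset would change accordingly.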
My money is on his issue being that Rack has not been granted microphone access in the macOS privacy settings.