C++/DSP Optimisation Advice Please

Hi there!

I’ve coded a module that is essentially a clone of a hardware buffered unity mixer I made. It sums 6 inputs to output 2 at unity gain, unless output 1 is connected - then it sums inputs 1-3 to output 1, and sums inputs 4-6 to output 2.
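
The routing rule described above can be sketched outside Rack as a plain C++ function. This is only an illustration of the behaviour, not the module's code: `MixResult`, `routeMix`, and the `out1Connected` flag are hypothetical names standing in for the Rack ports, and gain scaling and polyphony are left out.

```cpp
#include <array>

// Hypothetical container for one sample frame of the two outputs.
struct MixResult {
    float out1;
    float out2;
};

// Sums six input voltages according to the routing rule in the post.
// `out1Connected` is an assumed flag standing in for the Rack port query.
inline MixResult routeMix(const std::array<float, 6>& in, bool out1Connected) {
    MixResult r{0.f, 0.f};
    if (out1Connected) {
        // Output 1 is patched: inputs 1-3 go to output 1, inputs 4-6 to output 2.
        for (int i = 0; i < 3; i++) {
            r.out1 += in[i];
            r.out2 += in[i + 3];
        }
    } else {
        // Output 1 is unpatched: all six inputs are summed to output 2.
        for (float v : in)
            r.out2 += v;
    }
    return r;
}
```

Calling `routeMix` with inputs of 1 V to 6 V gives 6 V and 15 V on the two outputs when output 1 is patched, and 21 V on output 2 alone when it is not.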

The good news is, the code is working exactly as I want it to! However, I’m very aware that the code is not very efficient as I’ve repeated myself a few times.

I was wondering if anyone would mind having a look through the code below to advise how (and hopefully why) I can make the code more efficient.

I have a pretty basic grasp of C++, and obviously have a lot to learn if I want to develop more interesting/complex modules in the future!

#include "plugin.hpp"

using simd::float_4;

struct BuffMix : Module {
	enum ParamId {
		PARAMS_LEN
	};
	enum InputId {
		// clever way of setting input numbers - used in for loop later
		ENUMS(IN_INPUTS, 6),
		INPUTS_LEN
	};
	enum OutputId {
		// less clever way of setting output numbers
		OUT1_OUTPUT,
		OUT2_OUTPUT,
		OUTPUTS_LEN
	};
	enum LightId {
		LIGHTS_LEN
	};

	BuffMix() {
		config(PARAMS_LEN, INPUTS_LEN, OUTPUTS_LEN, LIGHTS_LEN);
		// for loop that counts through number of inputs and appends the number
		for (int i = 0; i < 6; i++)
			configInput(IN_INPUTS + i, string::f("Input %d", i + 1));
		configOutput(OUT1_OUTPUT, "Output 1");
		configOutput(OUT2_OUTPUT, "Output 2");
	}

	void process(const ProcessArgs& args) override {
		// number of channels and connected inputs
		int channels = 1;
		int connected = 0;
		for (int i = 0; i < 6; i++) {
			channels = std::max(channels, inputs[IN_INPUTS + i].getChannels());
			if (inputs[IN_INPUTS + i].isConnected())
				connected++;
		}
		// set the gain to automatically be divided by the number of connected inputs - unity gain by default
		float gain = 1.f;
		gain /= std::max(1, connected);

		for (int ch = 0; ch < channels; ch += 4) {
			float_4 out = 0.f;
			float_4 out2 = 0.f;
			// mix the inputs
			if (outputs[OUT1_OUTPUT].isConnected()) { // if output 1 is connected route inputs 1-3 to output 1 and 4-6 to output 2
				for (int i = 0; i < 3; i++) {
					out += inputs[IN_INPUTS + i].getVoltageSimd<float_4>(ch);
					out2 += inputs[IN_INPUTS + (i + 3)].getVoltageSimd<float_4>(ch);
				}
				// applying the gain
				out *= gain;
				out2 *= gain;
				// set outputs
				outputs[OUT1_OUTPUT].setVoltageSimd(out, ch);
				outputs[OUT2_OUTPUT].setVoltageSimd(out2, ch);
			}
			else { // route all 6 inputs to output 2
				for (int i = 0; i < 6; i++) {
					out += inputs[IN_INPUTS + i].getVoltageSimd<float_4>(ch);
				}
				// applying the gain
				out *= gain;
				// set outputs
				outputs[OUT2_OUTPUT].setVoltageSimd(out, ch);
			}
		}
		// set the channels of each output
		outputs[OUT1_OUTPUT].setChannels(channels);
		outputs[OUT2_OUTPUT].setChannels(channels);
	}
};

struct BuffMixWidget : ModuleWidget {
	BuffMixWidget(BuffMix* module) {
		setModule(module);
		setPanel(createPanel(asset::plugin(pluginInstance, "res/BuffMix.svg")));

		addChild(createWidget<ScrewSilver>(Vec(RACK_GRID_WIDTH, 0)));
		addChild(createWidget<ScrewSilver>(Vec(box.size.x - 2 * RACK_GRID_WIDTH, 0)));
		addChild(createWidget<ScrewSilver>(Vec(RACK_GRID_WIDTH, RACK_GRID_HEIGHT - RACK_GRID_WIDTH)));
		addChild(createWidget<ScrewSilver>(Vec(box.size.x - 2 * RACK_GRID_WIDTH, RACK_GRID_HEIGHT - RACK_GRID_WIDTH)));

		addInput(createInputCentered<PJ301MPort>(mm2px(Vec(15.24, 18.74)), module, BuffMix::IN_INPUTS + 0));
		addInput(createInputCentered<PJ301MPort>(mm2px(Vec(15.24, 31.465)), module, BuffMix::IN_INPUTS + 1));
		addInput(createInputCentered<PJ301MPort>(mm2px(Vec(15.24, 44.19)), module, BuffMix::IN_INPUTS + 2));
		addInput(createInputCentered<PJ301MPort>(mm2px(Vec(15.24, 69.639)), module, BuffMix::IN_INPUTS + 3));
		addInput(createInputCentered<PJ301MPort>(mm2px(Vec(15.24, 82.364)), module, BuffMix::IN_INPUTS + 4));
		addInput(createInputCentered<PJ301MPort>(mm2px(Vec(15.24, 95.089)), module, BuffMix::IN_INPUTS + 5));

		addOutput(createOutputCentered<PJ301MPort>(mm2px(Vec(15.24, 56.914)), module, BuffMix::OUT1_OUTPUT));
		addOutput(createOutputCentered<PJ301MPort>(mm2px(Vec(15.24, 107.813)), module, BuffMix::OUT2_OUTPUT));
	}
};

Model* modelBuffMix = createModel<BuffMix, BuffMixWidget>("BuffMix");


I don’t see anything that looks like it is wasteful here. Have you measured its efficiency using the CPU monitoring in VCV Rack? What percentage of CPU time does your module actually consume?

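I think you can do some very minor things. For one, you can move the isConnected() test on output 1 outside the channel for loop, since it gives the same answer for every channel. For another, initialize out and out2 to the first input of each group and start the inner loops at 1 instead of 0, which saves adding to zero. I think this would make for a minuscule actual performance improvement, but since you asked :grin: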

Also if no inputs are connected, set outputs to zero and return!

      if (outputs[OUT1_OUTPUT].isConnected()) { // if output 1 is connected route inputs 1-3 to output 1 and 4-6 to output 2
         for (int ch = 0; ch < channels; ch += 4) {
            // seed the sums with the first input of each group
            float_4 out = inputs[IN_INPUTS].getVoltageSimd<float_4>(ch);
            float_4 out2 = inputs[IN_INPUTS + 3].getVoltageSimd<float_4>(ch);
            for (int i = 1; i < 3; i++) {
               out += inputs[IN_INPUTS + i].getVoltageSimd<float_4>(ch);
               out2 += inputs[IN_INPUTS + (i + 3)].getVoltageSimd<float_4>(ch);
            }
            // applying the gain
            out *= gain;
            out2 *= gain;
            // set outputs
            outputs[OUT1_OUTPUT].setVoltageSimd(out, ch);
            outputs[OUT2_OUTPUT].setVoltageSimd(out2, ch);
         }
      } else { // route all 6 inputs to output 2
         for (int ch = 0; ch < channels; ch += 4) {
            // seed the sum with input 1
            float_4 out = inputs[IN_INPUTS].getVoltageSimd<float_4>(ch);
            for (int i = 1; i < 6; i++) {
               out += inputs[IN_INPUTS + i].getVoltageSimd<float_4>(ch);
            }
            // applying the gain
            out *= gain;
            // set outputs
            outputs[OUT2_OUTPUT].setVoltageSimd(out, ch);
         }
      }
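
The early-return suggestion (set the outputs to zero and return when nothing is patched) is not shown in the snippet above. A minimal standalone sketch of the idea follows, assuming a hypothetical `FakeInput` struct in place of Rack's input ports and a hypothetical `processFrame` function in place of `process()`. It works on single floats rather than `float_4`, and keeps the divide-by-connected gain from the original code.

```cpp
#include <array>

// Hypothetical stand-in for a Rack input port: a voltage plus a patched flag.
struct FakeInput {
    float voltage;
    bool connected;
};

// Mixes one sample frame. With no inputs patched, both outputs are
// zeroed and the function returns before doing any summing.
inline void processFrame(const std::array<FakeInput, 6>& in, bool out1Connected,
                         float& out1, float& out2) {
    int connected = 0;
    for (const FakeInput& p : in)
        if (p.connected)
            connected++;

    if (connected == 0) {
        // Early exit: nothing to mix.
        out1 = 0.f;
        out2 = 0.f;
        return;
    }

    const float gain = 1.f / connected;
    if (out1Connected) {
        // Seed each sum with the first input of its group, then add the rest.
        float a = in[0].voltage;
        float b = in[3].voltage;
        for (int i = 1; i < 3; i++) {
            a += in[i].voltage;
            b += in[i + 3].voltage;
        }
        out1 = a * gain;
        out2 = b * gain;
    } else {
        float a = in[0].voltage;
        for (int i = 1; i < 6; i++)
            a += in[i].voltage;
        out1 = 0.f;
        out2 = a * gain;
    }
}
```

In the real module the same check would sit at the top of `process()`, reusing the `connected` count that is already computed for the gain, so it costs one extra comparison per call.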

Yes, there are minor things you can do. But all in all this is pretty darned efficient. I’d be really surprised if this looked bad on the CPU meters.

Hi all, thanks for getting back to me and putting my mind at ease! CPU is absolutely fine.

@chaircrusher I’ve implemented those changes for the next update. You’re right, the performance difference isn’t much, but it makes sense as good practice.

Cheers again!