Best way to incorporate wav audio into a module?

Thank you Bruce for this! I couldn’t agree more, the simplest solution is usually the best solution.

So you’re saying a global variable could serve the same purpose as a singleton, for my usages?

Your modules are known for being light on resources, which is also something I’d like to look out for when working on my modules.

Thanks. Oh, I don’t think I’m recommending anything in particular, and I don’t remember exactly what your use case is. I guess first I’d consider the very simplest thing - don’t share the wave, and make it owned by your module.

If you do want to share that wave, and it’s for sure always going to be the same file, then you could consider a simple global variable. But if you want to be finicky you will need a way to know when your plugin isn’t being used so you can free the memory associated with the wave. After all, it would be a little unfriendly if a user removed your module and you didn’t give the memory back. Probably ok for VCV, definitely not ok at all where it matters (like my day job where that would be considered a bad bug).

So be careful, and maybe don’t worry about sharing the data, especially if it isn’t huge. If you find your plugin is a wild success and people are using multiple instances a lot, then worry about getting fancy?

If you take a look at my ObjectCache thing you can see how I used weak pointers and shared pointers to share immutable data between plugins, but I would not really recommend doing something this fancy. I just felt like doing it at the time, so I did.
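For flavor, here’s a minimal sketch of that weak/shared pointer idea in plain C++ (the names and the placeholder loading are mine, not the actual ObjectCache code): the first caller loads the data, later callers share it, and the memory is returned automatically once the last module holding it goes away.

```cpp
#include <memory>
#include <vector>

// Sketch of the weak/shared pointer trick (hypothetical names, not the
// real ObjectCache). The static weak_ptr remembers the wave without
// keeping it alive, so the memory is freed as soon as the last module
// releases its shared_ptr.
using Wave = std::vector<float>;

std::shared_ptr<Wave> getSharedWave()
{
    static std::weak_ptr<Wave> cache;
    std::shared_ptr<Wave> wave = cache.lock();  // still loaded?
    if (!wave)
    {
        // First caller (or everyone released it): load the data here.
        // Placeholder: 22820 zero samples instead of a real file load.
        wave = std::make_shared<Wave>(22820, 0.f);
        cache = wave;
    }
    return wave;
}
```

Note this sketch is not thread-safe as written; if modules can be constructed from more than one thread you’d want a mutex around the lock-and-reload step.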

1 Like

Occam’s Razor

KISS Principle

1 Like

<off-topic> Just to satisfy my curiosity: What kind of sound synthesis are you thinking of? Between existing sampler modules and everything else in the 2000+ piece VCV library, I can’t think of any synthesis technique that can’t be done right now.

There’s subtractive, additive, FM, wavetable, Karplus-Strong, granular, phase-distortion, vocoder, and any or even all of those combined, all of which can be built with a modular system like VCV.

What exactly are you missing that you can’t build with a combination of existing modules, and drives you into coding your own module? Just for curiosity, please let me know. :thinking: </off-topic>

Hiya @CircadianSound! I’ve put together a demo for you that embeds a 44,100 Hz sample as code into a VCV Rack module! :tada:

Here’s the code:

#include "kick.hpp"

#define KICK_SAMPLE_LENGTH 22820

struct Circadian : Module
{
  dsp::SchmittTrigger playback_trigger;
  float left_audio = 0;
  float right_audio = 0;
  bool playback = false;
  double playback_position = 0;
  
  enum ParamIds {
    PLAYBACK_BUTTON,
    NUM_PARAMS
  };
  enum InputIds {
    NUM_INPUTS
  };
  enum OutputIds {
    AUDIO_OUTPUT_LEFT,
    AUDIO_OUTPUT_RIGHT,
    NUM_OUTPUTS
  };
  
  Circadian()
  {
    config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS);
  }
  
  void process(const ProcessArgs &args) override
  {
    if(playback_trigger.process(params[PLAYBACK_BUTTON].getValue()))
    {
      // reset sample playback position
      playback_position = 0;
      
      // Set playback flag.  The sample will playback while this is true
      playback = true;
    }
    
    if(playback)
    {
      // 44100 is the sample rate of the recorded sample
      float step_amount = 44100 / args.sampleRate;
      
      // Step the playback position forward.
      playback_position = playback_position + step_amount;
      
      // convert float to integer
      unsigned int sample_position = playback_position;
      
      // If the playback position is past the playback length, end sample playback
      if(sample_position >= KICK_SAMPLE_LENGTH)
      {
        playback_position = 0;
        playback = false;
      }
      else
      {
        left_audio = kick_drum[sample_position][0];
        right_audio = kick_drum[sample_position][1];
        
        outputs[AUDIO_OUTPUT_LEFT].setVoltage(left_audio);
        outputs[AUDIO_OUTPUT_RIGHT].setVoltage(right_audio);
      }
    }
  }
};

struct CircadianWidget : ModuleWidget
{
  CircadianWidget(Circadian* module)
  {
    setModule(module);
    setPanel(APP->window->loadSvg(asset::plugin(pluginInstance, "res/looper_front_panel.svg")));
    
    // Add output jacks
    addOutput(createOutputCentered<PJ301MPort>(mm2px(Vec(7.560, 35.0)), module, Circadian::AUDIO_OUTPUT_LEFT));
    addOutput(createOutputCentered<PJ301MPort>(mm2px(Vec(7.560, 40.0)), module, Circadian::AUDIO_OUTPUT_RIGHT));
    
    // Add playback input
    addParam(createParamCentered<LEDButton>(mm2px(Vec(7.560, 5)), module, Circadian::PLAYBACK_BUTTON));
  }
};

You’ll notice that the only include is kick.hpp. It’s a bit long to paste here, but here’s a sample of what it contains…

  float kick_drum[][2] = {
    { -0.000030,-0.000061 },
    { -0.000030,-0.000030 },
    { 0.000000,-0.000091 },
    { 0.000000,-0.000091 },
    { -0.000031,-0.000061 },  // etc...

Here’s how I generated the sample data. First, I used Wavosaur to export a kick drum as text.


This left me with a pretty awful file to work with that looked like:

-0.000030	-0.000061	
-0.000030	-0.000030	
0.000000	-0.000091	
0.000000	-0.000091	
-0.000031	-0.000061

I wrote a quick PHP program to get this closer to what I needed:

<?PHP
  /*
    For prepping samples for vcvrack embed
  */

  $input_data = file_get_contents("circadian.txt");
  $lines = explode("\r\n",$input_data);
  $output_text = "{";
  $pair_count = 0;
  foreach($lines as $line)
  {
    $pair_count ++;
    list($left_audio, $right_audio) = explode("	", $line);
    $output_text .= "{ $left_audio,$right_audio },";
  }

  $output_text .= "}";

  file_put_contents("output.text", $output_text);
  print("pair count: $pair_count");
?>

The resulting output is essentially what I used in kick.hpp, but I had to remove a trailing comma that I was too lazy to fix in the PHP code.

Here’s the code on GitHub:

Here’s a video of it in action. I borrowed a module’s front panel and didn’t spend any time on customizing it:

6 Likes

As I said above, I wouldn’t embed the sample data as code, but if I had to, I would convert the data to the desired format before putting it into the code, exactly as you did in your example.

This is definitely something to consider, so thank you for your input here. I am very grateful for all the advice and help in this thread, as it encourages me to keep pushing through as I balance what I’m learning on my own and how to apply it when referencing the Rack API.

A couple questions, if you don’t mind.

When you refer to “share the wave(s) or make it owned”, are you referring to the memory used when the module is loaded in Rack (making sure multiple instances of the module don’t load the samples multiple times and overload the user’s memory), or to loading it from a directory (public) vs. hard-coding it (private)?

If I’m leaning towards hard coding the samples, does this eliminate the need to use a singleton or global variable?

Thank you again Bruce for your input :slight_smile:

I’m at a loss for words! I can’t thank you enough for taking it upon yourself to put this all together for me. So, I’m forever grateful for the time and energy you’ve put in, to not only answer my questions and offer code snippets, but actually put together a working demo as a jumping-off point for me! You are so very kind! This is going to get me started off perfectly. So again, thank you so much Bret :smiley:

I’ve cloned and built this repo already, and begun to study your code, as I start the “learning by doing” practice.

If it’s alright, I’d like to ask you a few questions regarding what you’ve provided so far.

This part here:

        left_audio = kick_drum[sample_position][0];
        right_audio = kick_drum[sample_position][1];

(Answered below) I’m curious why the right channel is set to 1, and not 0 like the left channel? 0 resets the sample position to the beginning of the wav file, yes? The wav file length is 22820 samples long, so is this creating a 1 sample offset once the wav file is reset to the beginning? (Edit: this was answered by @Ahornberg below, as he pointed out that the 0 and 1 I’m referring to represent the stereo channels, left=0 and right=1)

I also wanted to ask about the defines.h file. I can’t see it included anywhere, so if you don’t mind me asking, what’s the intention/purpose for having it in the circadian folder with the rest of the header files? (Edit: Okay, I figured out that it’s left over from your Looper Module. If I were to guess, it has to do with Looper’s GUI display, from the looks of it?)

Again, this is amazing! I googled Wavosaur, and I saw that it’s available for MacOS as well! So I’m going to download that and start the conversion of my samples so I can start building my project. Although, if I remember correctly my samples are mono, but meant to still flow through a stereo output. I think that may be easy enough to work out though.

The trailing comma you mentioned?

{
    $pair_count ++;
    list($left_audio, $right_audio) = explode("	", $line);
    $output_text .= "{ $left_audio,$right_audio },";
  }

Is it the comma between the last curly brace and end quotation mark? For reference: …audio },"; Or has it since been removed from the .php file you attached to your post?

I did a quick Google search on how to run a .php file, and came up with this:

You just follow the steps to run PHP program using command line.

  1. Open terminal or command line window.

  2. Goto the specified folder or directory where php files are present.

  3. Then we can run the PHP code using the following command: php file_name.php.

Seems simple enough, but I am aware that sometimes when using terminal we need to set input and output arguments after the run file command. Does your .php file require that, or is it as simple as replacing the circadian.txt section with the name of the .txt file I want to format?

Lastly, regarding using PHP. Do I have to download and install PHP or some sort of library first before I can run this file for my own usages?

Bret, you’ve been a tremendous help! After all, this is exactly what I asked about in my original post. I hope you don’t mind if I follow up with other questions once I try to get this all working on my end?

Again, thank you so much!

kick_drum[][] is a 2-dimensional array: the first dimension is the sample position, represented by the variable sample_position, and the second dimension is the stereo channel, represented by 0 for the left channel and 1 for the right channel.

Yes, you have to install PHP to run PHP-scripts.

1 Like

Ah! Now the 0 and 1 makes more sense. Thank you.

Be aware that the example code only works fine when the user runs his VCV Rack at 44.1 kHz. If the user runs the Rack on a different sampling frequency like 48 kHz or whatsoever, this code will produce aliasing artefacts that may be audible. The same goes for playing the sample at a different speed.

That’s the point where DSP-coding becomes challenging and that’s the reason why I recommend learning C++ well before you start coding VCV modules. Maybe start with a module that only deals with CV-signals.

I coded 7 modules that only do CV processing before I started with audio processing. And just a “simple” fader module isn’t that simple when it comes to avoiding unwanted clicks and pops in the audio signal chain. The same goes for playing back a sample, especially at a different or externally modulated speed.
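To make the click problem concrete: an abrupt gain jump is a step discontinuity in the audio, which is audible as a click. A common fix (a generic sketch, not Ahornberg’s code, and the 5 ms time constant is just an assumption) is to slew the gain toward its target a little on every sample instead of jumping:

```cpp
#include <cmath>

// One-pole smoothing toward a target gain; call once per audio sample.
// The 5 ms time constant is an arbitrary choice for this sketch.
float smoothGain(float current, float target, float sampleRate)
{
    float coeff = 1.f - std::exp(-1.f / (0.005f * sampleRate));
    return current + coeff * (target - current);
}
```

In a process() loop you would do something like `gain = smoothGain(gain, targetGain, args.sampleRate);` and multiply the audio by `gain`, rather than applying `targetGain` directly.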

1 Like

Yes, yes, and yes. One might want to share loaded waves between modules to reduce the wasted memory of each module having its own copy. And, yes, baking in the wave data will ensure that all the instances use the same copy/memory.

1 Like

Thanks for digging in and answering some of your own questions! I’ll do my best…

Yes, that’s correct! If I wanted to get rid of the trailing comma, I would do something like:

  foreach($lines as $line)
  {
    $pair_count ++;
    list($left_audio, $right_audio) = explode("	", $line);
    $output_text .= "{ $left_audio,$right_audio }";
    if($pair_count < count($lines)) $output_text .= ",";   <<========== added this
  }

  $output_text .= "}";

In retrospect, I probably just should have done that.

As for the defines.h, you’re right, it was left over from copy/pasting the looper module.

If your samples are mono, I wouldn’t convert them to stereo. Instead, I’d rewrite my code to use a one-dimensional array instead of a two-dimensional array. This will cut down the memory consumption by half.
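To make the mono change concrete, here’s a tiny sketch (placeholder data and a hypothetical helper, not the demo’s actual code): store one float per sample and send the same value to both jacks.

```cpp
// Sketch of the mono version: a one-dimensional array, one float per
// sample. Placeholder data; the real module would hold the full sample.
float kick_drum_mono[4] = { -0.00003f, 0.f, 0.25f, -0.25f };

// Read one frame and duplicate it to both channels, mirroring the
// outputs[AUDIO_OUTPUT_LEFT/RIGHT].setVoltage(...) calls in the demo.
void readMono(unsigned int sample_position, float &left, float &right)
{
    float audio = kick_drum_mono[sample_position];
    left = audio;
    right = audio;
}
```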

When it comes to PHP: you will need to install it to run my script, but there aren’t any special libraries that you’ll need.

Again, I was lazy and “hard coded” the input and output filenames, so you’ll need to replace circadian.txt with your content. You won’t need to pass in a filename like php convertwav.php [your filename], only php convertwav.php.

Be aware that my PHP script needs some work. There’s some gobbledygook at the end, a stray { , }, which I cleaned up by hand.


Let me know if you hit any major hurdles! Happy to help! If you get really stuck on PHP, let me know and maybe I can throw together a C++ version.
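In case it helps, here’s a rough sketch of what such a C++ version could look like (the function name and stream-based interface are my own choices, not Bret’s code): it reads whitespace-separated left/right pairs and emits the array initializer with no trailing comma and no leftover garbage from blank lines.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Sketch of a C++ replacement for the PHP converter (hypothetical name).
// Reads whitespace-separated left/right pairs from `in` and writes a C
// array initializer to `out`. The comma is emitted *before* every pair
// after the first, so there is no trailing comma to clean up, and blank
// or malformed lines are skipped. Returns the number of pairs written.
int convert(std::istream &in, std::ostream &out)
{
    std::string line, left, right;
    int pairs = 0;
    out << "{";
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        if (!(fields >> left >> right))
            continue;  // skip blank or malformed lines
        if (pairs++ > 0)
            out << ",";
        out << "{ " << left << "," << right << " }";
    }
    out << "}";
    return pairs;
}
```

Wired up to a std::ifstream and std::ofstream it would replace the script; keeping the values as strings (rather than parsing floats) preserves the exact digits from the Wavosaur export.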

1 Like

Ah yes! Thank you for broaching that. It is definitely something to consider. If my memory serves me correctly, antialiasing filters can be implemented using a low-pass filter set to roll off at the Nyquist frequency, with a steep enough attenuation slope within roughly 2000 Hz of that frequency. So for my use case, an LPF set to around 20-22 kHz (roughly).

Now, I am unsure if this has to happen on the front end of sample playback, or if aliasing can be prevented just by filtering the output.

Thanks again Bruce! This seems to be another justification for hardcoding the samples, before I am able to start implementing singletons or global variables. For my first go at a module, it should serve my purposes perfectly :slight_smile:

In some cases, but not in this case. Once you have digital nasties in your signal it is generally impossible to get rid of them. They are going to be in-between frequencies that are supposed to be there, and there is no practical way to filter them out.

The issue here is that if the user sets a sample rate that isn’t the one the samples were recorded at, they will play at the wrong speed. So usually you want to do something to prevent that. And to do that you need to make up in-between samples if you are going up, or throw some away if you are going down.

So just getting the speed-pitch right takes some trickiness.
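The simplest way to “make up in-between samples” is linear interpolation: read the two samples around the fractional position and blend them. A generic sketch (the wraparound indexing is my own choice; a one-shot player would clamp at the ends instead):

```cpp
// Linear interpolation into a sample buffer at a fractional position.
// Wraps at the buffer ends; a one-shot player would clamp instead.
float lerpSample(const float *buf, int len, double pos)
{
    int i = (int)pos;
    float t = (float)(pos - i);     // fractional part of the position
    float a = buf[i % len];
    float b = buf[(i + 1) % len];
    return a + t * (b - a);         // blend between the two neighbours
}
```

As Bruce says below, this is not the last word on quality, but it is a big step up from truncating the position to an integer.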

And you “can’t” really just repeat samples or drop some. It can sound awful. It’s not really aliasing, I don’t think. It’s usually called “interpolation noise”, I think.

Wavetable VCOs have to deal with this, since they are supposed to play at the right pitch and sound ok, and have to be transposed all over the place. That said, afaik they all play back at the correct pitch, but some do actually just drop or repeat samples.

But a good one, like the VCV WT VCO, will have some fancy interpolation so that it sounds good when transposing. You might look at how that one does it. But beware, this is a pretty complex topic, and pretty quickly people start to talk about polyphase filters, sinc resamplers, etc…

I would say approachable solutions are to either let it sound bad, or find an implementation you can borrow (like the VCV one).

1 Like

To do it correctly, you have to apply cubic interpolation between samples. In the example code, the float value playback_position is truncated to the int value sample_position. You have to prevent this loss of data and calculate the sample data in between the two samples according to the fractional part of the float value. The Catmull-Rom algorithm should give you good results.

If you’re downsampling, you should apply a low-pass filter operating under the Nyquist frequency of your sample before you do the interpolation.
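For reference, the standard 4-point Catmull-Rom formula looks like this in code (a generic sketch, not any particular module’s implementation; p1 and p2 are the samples on either side of the read position, and t is the fractional part that the demo currently throws away):

```cpp
// Catmull-Rom interpolation between p1 and p2, with p0 and p3 as the
// neighbouring samples; t in [0,1) is the fractional playback position.
float catmullRom(float p0, float p1, float p2, float p3, float t)
{
    return 0.5f * ((2.f * p1)
        + (-p0 + p2) * t
        + (2.f * p0 - 5.f * p1 + 4.f * p2 - p3) * t * t
        + (-p0 + 3.f * p1 - 3.f * p2 + p3) * t * t * t);
}
```

In the demo you would take t = playback_position - sample_position and feed in the four samples around sample_position, clamping at the buffer edges.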

But before all that: Make it run!

1 Like

That is far from true. a) you don’t “have” to do anything, and b) cubic interpolation is not a perfect solution to this problem, nor is it the only one.

That said, I have always used cubic interpolation because it’s relatively easy, and it sounds “good enough” to me. Others are more finicky, and will do something fancier. As I mentioned before, some do something worse.

NYSTHI Seven Seas is an interesting example. I think it has three different user-selectable qualities for this. Presumably so you can trade off CPU usage for sound quality.

2 Likes

Yes, you’re right. No one “has” to …

Yes, it’s a trade-off between CPU usage and sound quality.

Thank you for your wise point of view. :+1:

maybe more “nit-picky” than wise :wink: but thanks. Your advice was of course quite good.