Trying to build a Math-Heavy module, am I SOL or do I just not know enough?

The maths here are specifically iterative, and in a sense doubly so. The output is derived from a sinusoidal iterative equation, but I don't want the output to step through each iteration; I want it to follow one specific iteration as it changes over time. And I want this done on a minimum of four independent outputs, and it's in 2D. As far as I can tell, this simply requires a for loop, with no way around it: I have to build the iterations in a loop, output a specific one (e.g. #8), then step time forward and run the loop again, outputting the new #8. That means running the loop every sample, with 8 or more computation-heavy equations per run (for 4 pairs of outputs). I have tried something more like this, to build a buffer and therefore do the work less often:

```cpp
void process(const ProcessArgs& args) override {
	if (loopCounter % sample_build == 0) {
		build_Samples(args);
	}
	read_to_Output(args);
	loopCounter++;
	loopCounter %= sample_build;
}

void build_Samples(const ProcessArgs& args) {
	// This should -theoretically, I guess- simply create a 'buffer' of
	// "sample_build" samples, already tuned to Rack's sampleRate.
	for (int k = 0; k < sample_build; ++k) {
		// set starting conditions for this timestep
		for (int i = 0; i < max_iterates; ++i) {
			// do iterative maths for each of the outputs
			if (i == desired_iterate) { // do this for each
				// emplace_back into x and y vectors for this output,
				// or with arrays: arrayX[k] = x, etc.
			}
		}
		increment_Time(); // this is where tuning happens, just like building a phase for a sine wave
	}
}

void read_to_Output(const ProcessArgs& args) {
	outputs[each_one].setVoltage(Each_Ones_Vector[sample_read], 0);
	++sample_read;
	sample_read %= sample_build;
}
```

but, for reasons that someone who knows the internal workings of Rack could surely explain, this doesn't increase performance all that much, and it causes what looks like a general reduction in bit depth, related to the number of samples I buffer.

This plugin, in its current technically-working state, can run 4 2D outputs simultaneously, but it consumes around 30% CPU. I'm certain there are a good number of structural optimizations that could be made (again, I am quite new to this), but I am concerned that this may simply be the wrong kind of math for this environment, being chaotic dynamical systems and all.

If you enclose your code in triple backquotes it will be readable

But on the general question, here's how I think of it.

You have at most 1/48000th of a second to return from process(). Rack is per-sample internally; sound cards provide some buffering, but doing n samples' worth of work every n samples won't help that much.

So in your case you just need things to go faster.

Couple of ways you can do that.

If you know anything about the smoothness or functional form of the answer, do the heavy computation less often. For instance, if you can compute the values and their first derivatives, you could do a four-sample Taylor extrapolation.

You could profile your code to see where it is slow. Can you replace sin with an approximation? Things like that.

Or, in your case, since you are computing 8 fields, you could use SIMD and recast the work as a vector operation. That lets the CPU do 4-wide float ops rather than single ones, but it requires some coding expertise which isn't that common.

Hope that helps?

In my searching I have come across the use of the simd float_4, but mostly in the context of polyphony. If I'm going to be assigning independent values to each of the four slots, would that not still need a loop, and end up essentially the same, time-wise, as the vectors I'm already building? I have indeed not been able to fully understand their usage. I like the idea of using a faster sin approximation, and will do that for sure at least (it's chaos; inexactitude is what gives rise to a lot of the interest), but I don't think I'm quite high enough on the calculus ladder to determine whether a Taylor series approach would even work here, as the points I'm tracking depend both on initial conditions (where time physically moves a starting location) And on which iteration of those conditions, which I can only know the value of by knowing the one before it, etc.

Is there a Rack module function that runs faster than the sample rate that I could use to build samples in the 'background', or would that inevitably intrude on other modules' time?

You could spin up a thread and build things ahead in the background, but that's tricky and a bit antisocial, and often way less performant if you don't know all the tricks.

yeah, that tracks. time to hunt down a good sin and learn more about taylor it looks like. Thank you!

The PadĆ© approximants of sine are good. The Bhaskara's-formula form is super cheap, depending on how much accuracy you want to give up.


Sin is insanely slow on some platforms. Don't use it! The approximation in the VCV SDK is not bad, btw.

You can for sure make math-heavy modules, but it requires some programming skill to do it well. For a silly example, my old ā€œLFNā€ runs white noise through a five-band graphic EQ and uses almost no CPU.

Here is an article I wrote years ago on this that some ppl like. here

If you want to see some examples (and source code) of some things, my ā€œBasic VCOā€ has two sines: one fast, using a sine approximation, and one very pure, using a lookup table.

My ā€œOrgan 3ā€ sums up a sine at a different pitch for each drawbar on a Hammond, so several sine waves per note that need to be looped over every sample. It also doesn’t use much CPU. It’s here


Your demo modules are in fact how I got started with all this! Right before I finished trying things last night I threw in the PadĆ© approximation that you used in the VCO2, but something went real weird, so I'm gonna have to troubleshoot. Is it a problem for these approximations to be taking the sin of large numbers?

Yes, it sure is. Most of the ones I know use something like a Taylor series, and they are only accurate for small values, like plus or minus pi.

Oh, I see the one in VCV isn’t quite as restrictive:

" The code is the exact rewriting of the cephes sinf function. Precision is excellent as long as x < 8192 (I did not bother to take into account the special handling they have for greater values – it does not return garbage for arguments over 8192, though, but the extra precision is missing)."

Indeed. I've spent much of today looking at the Taylor series in more depth, and I am uncertain how, when coded, that many calls to pow and that many factorial loops (if I want accuracy from -2Ļ€ to 2Ļ€) could be faster than a single call to sin, if I am to write out the series manually to an appropriate degree. Am I right in guessing that the approximations I've heard of here are essentially algorithmic approximations of this kind of function?

That is a wonderful amount of precision for the VCV version, though; I'll have to drop that in tonight, but in the meantime I'll try finding the Cephes function and putting it into my little portable testing program.

Taylor series is all polynomial; there is no pow. Look, there are a ton of modules that can compute a ton of sine waves. Just copy one, OK?

Umm… Polynomials use exponents? Which, if I don't wanna write x * x * x * x etc., or do a recursive loop, is a call to pow, isn't it? Like, I know I'm new to coding, but… what?? Also, yes, we're talking about sin functions, but I am not producing sine waves (what I'm taking the sin of is another polynomial dependent on previous positions), and besides using the approximations I am in fact largely imitating You structurally for waveform generation, at least from the demo-modules structure. Still, I can smell a dismissal when I read it, so, peace.

x * x * x is not an exponential function, at least as the term is commonly used. Yes, it is x ** 3. But usually ā€œexponentialā€ means k ** x; often k == e, and ā€œexponentialā€ means e ** x, or ā€œe to the xā€. x ** e is not exponential in x; e ** x is.

Of course you may define the phrase ā€œexponential functionā€ any way you want - that’s fine. You can google ā€œexponential vs polynomialā€ if you want to learn more about how these terms are commonly used.

Oh, and usually y = x * x * x is much faster than y = pow(x, 3).

I think your question was about how to do math quickly in c++. It is eminently doable, but I’m sorry if this doesn’t help. I, too, wish you luck in your endeavors.


> that many calls to pow

There are many ways to calculate a polynomial on a computer without involving pow. VCV includes a couple of them.

The simplest way is to simply have an ā€œaccumulatorā€ for the value for x ^ n, like this function from VCV Rack:

```cpp
template <typename T, size_t N>
T polyDirect(const T (&a)[N], T x) {
	T y = 0;
	T xn = 1;
	for (size_t n = 0; n < N; n++) {
		y += a[n] * xn;
		xn *= x;
	}
	return y;
}
```

Instead of using pow, the previous x ^ n is kept, and then multiplied by x. Multiplications are fairly fast, so this simple version is efficient enough for most applications…
Not so much for audio, though, which is why VCV Rack also includes a function using Horner’s Method (polyHorner) and one using Estrin’s Method (polyEstrin)

You can see an example of those functions being used in VCV Rack’s exp2_taylor5 function, it’s an approximation of 2 ^ x using a polynomial calculated using polyHorner.

> and that many factorial loops (if I want accuracy from -2Ļ€ to 2Ļ€)

You pre-calculate the coefficients of the Taylor series polynomial and insert them into the code, that way, you don’t have to calculate anything for the coefficients at runtime.


For my own enlightenment:

Are the simd::sin() and simd::cos() functions optimized, I mean beyond the handling of 4 floats at a time? Do they use any kind of fast approximation of sine and cosine?

Precalculating coefficients: duh, yeah, I should totally do that, silly me. This is where I start getting lost with code, though. I'm seeing very specific things that imply (to me) specific use cases: 2^x has a lot of good uses, but that function can't give me x^any-real-number, or do fractional powers on a coordinate system, so I have no idea whether it, or an alteration of it, works in my particular very niche case without trying and likely failing repeatedly and tediously, because I don't fully understand the functionality yet. I guess generally (haha, ironic) I am having trouble taking specific functions, generalizing them, then specializing back down to my very different usages. I also have absolutely no 'official' learning to speak of, which I'm sure makes me a nuisance to talk to.

I showed exp2_taylor5 only as an example of how to use those functions for evaluating polynomials included in VCV Rack. The idea is that you want to generate a Taylor polynomial approximation of sin with N coefficients, and use a function like polyHorner to evaluate it.

Note that there’s also many other ways to approximate sin. It all depends on your precision requirements. The VCV VCO does it like this, for example.


It uses this function. I presume it’s somewhat optimized, but I don’t know if that’s fast enough to be using it heavily in process.
For types other than float_4, it just aliases std::sin.


Is the formula you are trying to implement something we can look at on GitHub or some such?

If you are doing repeated sin of a polynomial in a chaotic system or some such, then yeah: things like Horner for the poly, caching coefficients, never calling pow when you can multiply, and finding a stable approximation for sin will all matter.

But also profile the code and find out where the time is going, probably by running it outside of Rack in a test harness.


std::pow(x, n) is optimal for small integer exponents. The optimized assembly is basically the same, and might actually be faster than my ā€œmanualā€ version.
