Would Highway be any good for implementing SIMD stuff in a VCV plugin?

dan.tilley · January 12, 2023, 12:14pm

I have absolutely zero experience and little knowledge of using SIMD in C++.

But, today I saw this and wondered if it might be useful / an easier way to get the benefits of SIMD in a VCV plugin. Anyone with SIMD experience / knowledge have thoughts on this?

Ahornberg · January 12, 2023, 1:11pm

In VCV Rack there’s a namespace simd you can use. You can also inspect the source to see how it works.

https://vcvrack.com/docs-v2/namespacerack_1_1simd

Curlymorphic · January 12, 2023, 1:11pm

You say you have zero experience in programming with simd. VCV does come with simd support and its own extensions that make developing with it much easier than direct simd programming. It is used in some of the modules in the Fundamental collection and is well-tested for use in plugins, as it is what most developers here use.

A quick look shows Mixer.cpp uses simd, and it is simple code to learn from. Anywhere you see a float_4 type is what you are looking for.

dan.tilley · January 12, 2023, 1:43pm

Yes I have seen this and investigated the Fundamental code before…

However, the API docs are very basic, and I found trying to extrapolate general usage from the specific implementations in Fundamental to be challenging.

It looks like Highway provides a different (larger?) set of operations to VCV, but I suppose that many might not actually be applicable in module use cases.

I guess what interested me about Highway is that it has more detailed docs, including a quick start and some examples.

When I said “an easier way to get the benefits of SIMD”, what I really meant was: an easier way to understand how SIMD works and how to use it in module code.

For example, and this might be obvious, but it is not actually explained by VCV documentation that I am aware of; in Highway you specify the number of lanes for the vectors, which I gather is the amount of data that is to be processed. For the VCV SIMD implementation everything uses float_4, so I assume that it always processes 4 data items?

I think if I wanted to use SIMD to process a full polyphonic 16 channel signal, I’d need to split up those signals into 4 separate float_4 vectors, for example: Fundamental/src/MidSide.cpp at 91847b37237a5ac4b7c79f247fbd3dd76e0c9c32 · VCVRack/Fundamental · GitHub

What happens if the number of channels is not a multiple of 4? Does that last float_4 just have null or default values in it, and do they get processed or ignored?

Maybe I am thinking SIMD is more complex than it really is. Should I just think of it as processing values in groups of 4 at a time?

Curlymorphic · January 12, 2023, 2:05pm

You do already seem to have grasped the basics. While it can be a complex subject, it is simple enough to get started. The way I started was to write a simple polyphonic VCA as a test module, then expand on that. I could see it being a bit harder to try to integrate it into an existing plugin without this learning step.

The most common use in Rack plugins is to process polyphonic channels simultaneously, so that is a good starting place.

Yes, four 32-bit floats are represented as a single element.

yes, you will see code like this in lots of modules

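They are always 4 values, and the maths will always be performed on all 4 values, so the lanes beyond the last active channel are still processed; their results are simply not used.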

yes, as I said at the start of this post, you already understand the basics

You can use them wherever the maths can be parallelised, but this is the next step. In a couple of hours' time I'm sure the simd code in Rack modules, and the Rack simd namespace, will be a lot clearer.
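
To make the "groups of 4" idea concrete, here is a minimal sketch of a polyphonic VCA loop. It deliberately does not use the Rack SDK: Float4, load4, store4 and processVca are hypothetical stand-ins written for this example, with Float4 playing the role that float_4 plays in a real plugin. It assumes each buffer holds 16 floats (the maximum polyphony), so a partial last group still reads and writes valid memory.

```cpp
#include <cassert>

// Hypothetical stand-in for a 4-lane SIMD float type (not the Rack SDK type).
struct Float4 {
    float v[4];
};

// Element-wise multiply. A real SIMD type would do this in one instruction.
inline Float4 operator*(const Float4& a, const Float4& b) {
    Float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i];
    return r;
}

// Load four consecutive floats starting at p.
inline Float4 load4(const float* p) {
    return Float4{{p[0], p[1], p[2], p[3]}};
}

// Store four lanes to consecutive floats starting at p.
inline void store4(const Float4& x, float* p) {
    for (int i = 0; i < 4; ++i) p[i] = x.v[i];
}

// Polyphonic VCA sketch: out[c] = in[c] * gain[c], processed 4 channels at a time.
// Assumption: all three buffers hold 16 floats, so the last (possibly partial)
// group never touches memory outside the buffers.
void processVca(const float in[16], const float gain[16], float out[16], int channels) {
    assert(channels >= 0 && channels <= 16);
    for (int c = 0; c < channels; c += 4) {
        Float4 x = load4(&in[c]);
        Float4 g = load4(&gain[c]);
        store4(x * g, &out[c]);  // all 4 lanes are computed, even unused ones
    }
}
```

With 6 channels the loop runs twice: lanes 6 and 7 are computed along with lanes 4 and 5, and whatever lands there is just never read. The third and fourth groups are skipped entirely.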

baconpaul · January 12, 2023, 2:50pm

I haven’t looked at highway. I know many folks who want abstractions on top of simd use xsimd and have been successful; the rack abstractions are also great.

But you get the right idea. SIMD basically makes processing 4 floats at a time about as cheap as processing 1 float at a time. Can you design your program to take advantage of that? Voice polyphony is an obvious way to do it and the one Rack uses a lot, but it's by no means the only one. (For instance, the Surge interpolated delay line uses a 12-wide FIR interpolation, so it uses 3 simd instructions.)

hope that helps!
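
As an illustration of the non-polyphonic case mentioned above, here is a minimal sketch of a 12-tap FIR evaluated as three 4-wide multiplies. This is not Surge's code: Float4, load4 and fir12 are hypothetical names made up for this example, and Float4 is a plain-C++ stand-in for a real SIMD type.

```cpp
#include <cassert>

// Hypothetical stand-in for a 4-lane SIMD float type.
struct Float4 {
    float v[4];
};

// Element-wise multiply of two 4-lane values.
inline Float4 operator*(const Float4& a, const Float4& b) {
    Float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i];
    return r;
}

// Element-wise add of two 4-lane values.
inline Float4 operator+(const Float4& a, const Float4& b) {
    Float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

// Load four consecutive floats starting at p.
inline Float4 load4(const float* p) {
    return Float4{{p[0], p[1], p[2], p[3]}};
}

// One output sample of a 12-tap FIR: the dot product of 12 samples with
// 12 coefficients, done as three 4-wide multiplies plus a final lane sum.
float fir12(const float samples[12], const float coeffs[12]) {
    assert(samples != nullptr && coeffs != nullptr);
    Float4 acc = load4(samples) * load4(coeffs);
    acc = acc + load4(samples + 4) * load4(coeffs + 4);
    acc = acc + load4(samples + 8) * load4(coeffs + 8);
    // Horizontal sum: collapse the four partial sums into one float.
    return acc.v[0] + acc.v[1] + acc.v[2] + acc.v[3];
}
```

Here the four lanes hold neighbouring taps of a single voice rather than four voices, which is the sense in which polyphony is not the only way to fill a vector.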

Squinky · January 12, 2023, 3:16pm

The SIMD library in the sdk is really good, imao. You can basically take any scalar code and turn it into SIMD by changing variables from float to float_4. That’s why I was able to make that module 10x faster in a day (last week).
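
The "change float to float_4" idea can be sketched by writing the DSP once as a template and instantiating it for both types. The example below is self-contained and does not use the SDK: Float4, splat and onePole are hypothetical names for illustration, and Float4 is a plain-C++ stand-in for a 4-lane type. It assumes the scalar code is branch-free arithmetic.

```cpp
// Hypothetical stand-in for a 4-lane SIMD float type.
struct Float4 {
    float v[4];
};

// Element-wise arithmetic, so Float4 can be used like a float in expressions.
inline Float4 operator+(const Float4& a, const Float4& b) {
    Float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

inline Float4 operator-(const Float4& a, const Float4& b) {
    Float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] - b.v[i];
    return r;
}

inline Float4 operator*(const Float4& a, const Float4& b) {
    Float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i];
    return r;
}

// Broadcast one float into all four lanes.
inline Float4 splat(float x) {
    return Float4{{x, x, x, x}};
}

// One step of a one-pole lowpass, written once for any arithmetic-like T.
// With T = float it filters one voice; with T = Float4 it filters four.
template <typename T>
T onePole(T state, T input, T coeff) {
    return state + coeff * (input - state);
}
```

Calling onePole(0.f, 1.f, 0.5f) filters one voice, while onePole(Float4{{0.f, 0.f, 0.f, 0.f}}, Float4{{1.f, 2.f, 3.f, 4.f}}, splat(0.5f)) filters four with the same source line. Scalar code that branches per sample does not convert this directly, since each lane may need a different path; that usually calls for a mask-and-select approach instead of an if.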