DIY marbles-based physical matrix mixer controller project

You’re all a knowledgeable bunch, so I thought I’d run by you a half-baked idea I’m interested in building (and, if it works out, releasing as a set of DIY instructions + custom VCV module) this summer. I’m sure you will have useful input/prior art/etc. to share.

I’m interested in making more visually interesting live virtual synth performances, and I thought that using a physical matrix mixer would be more engaging than patching virtual cables on a screen.

If you’re not sure what I mean by “matrix mixer”, refer to this picture of the EMS VCS3 routing matrix, using pins to route inputs to outputs:

Obviously, such a 16x16 mod matrix could be replaced with a MIDI grid of lit MPC-style pads. I could make one out of four boring 8x8 Novation Launchpads, like every single artist already uses, for a mere €500. Or build my own out of cheap electronics for a mere €100.

But I was thinking of something more unique: marbles + computer vision.

Think Chinese Checkers:

I bought about €20 worth of 16mm marbles. I picked high quality stuff! Various colors and textures, all of them opaque. I want them to look as cool as possible in videos.

And I bought €6 worth of webcams. Plural. I bought two of them, 3 euros each, shipping included. I bought the most worthless junk possible. The cheapest garbage Aliexpress was willing to sell me. I 100% expect to get exactly the level of quality I paid for. If I get the project to work with those cams, it means I can trust it will work with any cam.

So, I have only minimal experience with this stuff, but here’s my thinking on how I’ll go about it. This is where you come in, by talking me out of bad ideas I might have so I don’t waste my time pursuing them.

  • Build a cardboard prototype with only a handful of holes, place a good webcam inside the box (rather than above it) and a light above the box, and try to write OpenCV code that detects those holes in real time. Probably gonna use its Python bindings.
  • Once it works, experiment with filming the box from above instead.
  • Decide whether to go for a setup where the control surface is filmed from inside or from above the box (more reliable vs. more compact build)
  • Design an elegant but easy to build controller with something like Blender (which I am familiar with, easier than learning a legit CAD thingie).
  • Craft it using a material that’s easy to work with such as MDF, use a router to make the pits, and at the center of the pit, drill a hole for light to seep through. Add larger pits for the spare marbles, like on the image above. Note that I don’t have access to CNC gear, and I want it to be a DIY project other people can reproduce on a <€100 budget anyway.
  • Add a camera inside/above the box, and maybe a USB light above it too.
  • Adapt the prototype code to that build, and make it output MIDI CC on two MIDI channels. Tweak things until it works satisfactorily with the cheapo low quality webcams.
  • Rig it to Bogaudio SWITCH1616.
  • Try to get it to work on OSX and Linux.
  • Make my own VCV module that mashes up MIDI-CC and a mixer matrix.
  • Distribute build instructions, OpenCV <=> MIDI bridge (loopback driver required), custom VCV module.
  • Record a tutorial video explaining how to make your own.
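
To make the first step concrete, here’s a minimal numpy-only sketch of the per-cell brightness test I have in mind (grid size and threshold are placeholder values; in the real thing the frame would come from OpenCV’s VideoCapture):

```python
import numpy as np

GRID = 4          # handful-of-holes prototype; the real build targets 16x16
THRESHOLD = 128   # brightness cutoff, 0-255; will need tuning per webcam

def cell_states(frame, grid=GRID, threshold=THRESHOLD):
    """Split a grayscale frame into grid x grid cells and report which
    cells are lit (light seeping through, i.e. no marble in the pit)."""
    h, w = frame.shape
    ch, cw = h // grid, w // grid
    states = np.zeros((grid, grid), dtype=bool)
    for r in range(grid):
        for c in range(grid):
            cell = frame[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw]
            states[r, c] = cell.mean() > threshold
    return states

# Simulated 64x64 frame: dark everywhere except one lit hole at cell (1, 2)
frame = np.zeros((64, 64), dtype=np.uint8)
frame[16:32, 32:48] = 255
print(cell_states(frame))  # only states[1, 2] is True
```

The real code would also need to locate the grid in the frame first, which is the part OpenCV would actually help with; the brightness averaging itself is this simple.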

Feedback more than welcome! And hopefully, progress updates from me every so often.


This sounds like an amazing project!
If you are interested in connecting cables directly by MIDI messages without a matrix mixer you can take a look at this thread. I did a working prototype which is currently not available publicly, but it will be at some point. It is not very user-friendly but it works quite well. Let me know if you are interested.

Also, @mudjakub is working on something like that, with standard patch cables.


If my project works out, I’d like for it to be easy for others to reproduce on a budget without specialized tools, so I think I’ll pass on using less than official parts of the API, haha.


Do I understand it correctly? You want to ‘convert’ the gray/colour values of certain areas of the camera image into MIDI values. Or into connections, as in the matrix.

Yeah, both, I assume it would go as follows:

  1. From a webcam live feed, detect a grid of 16x16 holes through which light seeps (possibly using a pair of additional holes at the edges of the grid to help with calibration)
  2. Assign each of the 256 holes a CC on two MIDI channels.
  3. When lit, CC is 0, when unlit it’s 127, nothing in between. Add a bit of delay when going from lit to unlit to help with noise & hands hovering over the control surface.
  4. Tweak the values until it works reliably under the worst scenario possible (cheap webcam out of alignment, poor DIY build with badly aligned holes, night club settings with extreme light changes)
  5. Rig it to VCV. MIDI might not be the simplest way to achieve this, but it’s universal enough to be adapted to non-VCV environments. That way people can adapt the project to other software.
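
Step 3 (the lit/unlit delay) could be sketched like this; HOLD_FRAMES is a made-up value that would need tuning against the actual frame rate:

```python
HOLD_FRAMES = 5  # assumed: frames a cell must stay dark before it counts

class CellDebouncer:
    """One per hole: CC goes to 127 only after the cell has been unlit
    for HOLD_FRAMES consecutive frames, so a hand briefly shadowing a
    hole doesn't create a spurious connection."""
    def __init__(self, hold=HOLD_FRAMES):
        self.hold = hold
        self.dark_count = 0
        self.cc = 0  # lit -> CC 0 (no marble, no connection)

    def update(self, lit):
        if lit:
            self.dark_count = 0
            self.cc = 0          # going back to lit disconnects immediately
        else:
            self.dark_count += 1
            if self.dark_count >= self.hold:
                self.cc = 127    # stayed dark long enough -> connect
        return self.cc

d = CellDebouncer()
print([d.update(False) for _ in range(6)])  # → [0, 0, 0, 0, 127, 127]
```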

Push the cam signal through a filter for maximum contrast, pure b/w.

Have you thought about what happens when you move a marble and your hand passes between the camera and the light sources? Temporarily switch off detection. Your hands entering the box could be detected with a light switch.
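
For what it’s worth, a software-only variant of that idea could be to freeze the output whenever an implausible number of cells change in one frame: a hand shadows many holes at once, while a player moves one marble at a time. A rough sketch (MAX_CHANGES is an assumed value):

```python
MAX_CHANGES = 3  # assumed: more simultaneous flips than this means a hand, not marbles

def filter_frame(prev_states, new_states, max_changes=MAX_CHANGES):
    """Return new_states if the change looks plausible, otherwise keep
    prev_states (i.e. freeze detection while a hand is in the way)."""
    changed = sum(p != n for p, n in zip(prev_states, new_states))
    return new_states if changed <= max_changes else prev_states

prev = [False] * 16
hand_over_grid = [True] * 16            # a hand darkens every hole at once
one_marble = [True] + [False] * 15      # a single marble placed
print(filter_frame(prev, hand_over_grid) == prev)        # True: output frozen
print(filter_frame(prev, one_marble) == one_marble)      # True: change accepted
```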

Interesting project.

Very interesting idea. You mention in your original post playing live in front of an audience; a couple of points spring to mind that may have an effect. If you are thinking of a club environment, these usually have colored, flashing lighting effects to consider. Depending on the setup of the sound system, vibration could also be a factor for the camera’s focusing. However, if you are thinking more of a social bar type scenario, these would probably be less of an issue.

I think you have a really cool idea, and if you get this working I would be interested in making one of these devices.

Personally, my interest in performance is limited to videos, maybe the odd house party, but I’d like for the build to work even in bad conditions.

I didn’t consider the vibration problem - it might be worth adding something to be used as an alignment mark.

My thinking is that, with a camera inside the box and lights above, I should be able to obtain images of at least this level of quality (simulated image), which seems sufficient from what I’ve seen of OpenCV:


This sounds very cool! Reminds me of the largest sequencer.

Have you considered using infrared proximity sensors on each hole instead of the camera setup?

It would offload the processing from your PC and be more reliable and self-contained. Definitely more soldering though, and the added cost of some hardware like LEDs, resistors, multiplexers and a Teensy or an Arduino board to act as a MIDI controller. (might still be under $100)

edit: I realized you want to do 16x16… that might be a bit too much soldering :smile:


What about a literal physical patch matrix (+ midi controller board) where you complete the circuit with conductive balls, like some neodymium magnets for added stability?


Neato! I knew there’s some prior art but couldn’t find it easily.

Adding a lot of electronics to the project like that is currently beyond my skills (not that I couldn’t figure it out). Adding a single row of LEDs to make it work as a sequencer is really tempting, but it’d really increase the scope of the project. The thing about using a webcam is how cheap and reliable it would be. Plus, use as a sequencer is more of an installation art thing, while a mixer matrix would (hopefully) be a practical performance tool.
I think I really want to keep this project a DIY weekend project people can make on a €50 budget.

I might want to see if I can use OpenCV directly within a VCV module - it’d be cool to have a live video feed displayed on the module, and it’d make it easier to tweak the values for the current light conditions (though ideally I’d rather have a build you don’t have to tweak). That code could be forked later to make a generic MIDI version.

I meant the LEDs only as infrared LEDs for the proximity checking.

Haven’t looked into OpenCV myself but I’m curious about the integration with Rack! (closest I did to comp vision is hooking up the Posenet model to WebMidi to send CC messages, see poser-midi)

Wish you good luck and keep us updated!

Yeah regarding LEDs I was reacting to the video you posted, with its timeline. A sequencer is definitely more immediate than a patch matrix if you’re going to invite people to play with your instrument, but less interesting as a performer.

Thinking about it: once you have the matrix recognizer, it will be nice to just rotate the physical marble plate by 90 degrees to get a different routing instantly :smiley:

What about using fluorescent marbles with UV light to get sharp contrast on the video?


It is very easy to remove the infrared filter from a cheap webcam. Once the webcam can detect infrared light, it clearly enhances the contrast of the image. I think illuminating the grid with infrared light could be a lead worth exploring.

I think it would also reduce calibration problems due to external lighting such as stage lights or any other visible light.


I certainly see the fun of using the camera for signal generation, but marbles are heavy enough to press a small switch/knob. A small switch can easily be installed in the dimples of the board. Then it would be a matter of connecting umpteen switches to an Arduino and having MIDI output. Taking the camera concept to the extreme you probably land at something like i2sm (not your goal, I know)


Oh, definitely! But soldering 256 of 'em as a beginner isn’t my idea of a good time. Plus the webcam is way cheaper than an Arduino haha - repurposed mass-produced junk vs specialized hobbyist equipment, innit.

Getting OpenCV to work reliably in VCV would definitely open some doors! There’s already a Mac OS X only module for Rack 0.x using OpenCV, but it’s no longer active: DSW Vision 2 'Flow alpha' (currently macOS only) - #15 by Aria_Salvatrice

Once someone gets a reliable cross-platform build script and the boilerplate to read a webcam in VCV done, everyone can focus on the interesting part of the problem.


So… I couldn’t get this thing out of my head and came up with a fairly simple solution to detect a grid.

If it’s just a rectangular grid, there’s no need for any advanced blob detection. All that’s needed is a 3D renderer that can render a textured mesh based on UV coordinates and vertices, and then read pixels from the result.

Also a fragment shader to filter/enhance “active” pixels from the render. I wrote one that turns everything saturated below a threshold full black, while anything saturated enough gets maxed out in saturation.
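
A CPU mock-up of that filter in numpy, just to illustrate the idea (the real thing stays in the fragment shader):

```python
import numpy as np

def saturate_filter(rgb, threshold=0.5):
    """Pixels below the saturation threshold go black; the rest get
    their saturation pushed to maximum (keeping hue and brightness)."""
    rgb = rgb.astype(float) / 255.0
    mx = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    sat = (mx - mn) / np.maximum(mx, 1e-9)
    keep = sat >= threshold
    # max out saturation: subtract the min channel, renormalize to the original max
    boosted = rgb - mn[..., None]
    peak = boosted.max(axis=-1, keepdims=True)
    boosted = boosted / np.maximum(peak, 1e-9) * mx[..., None]
    out = np.where(keep[..., None], boosted, 0.0)
    return (out * 255).astype(np.uint8)

# a saturated red pixel survives fully saturated; a gray pixel goes black
pixels = np.array([[[255, 40, 40], [100, 100, 100]]], dtype=np.uint8)
print(saturate_filter(pixels))
```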

After the UV coordinates get selected on the webcam’s video plane, I render a mesh from them, scaled so that every cell in the grid gets exactly 1 pixel (the UV coordinates can be skewed, but the rendered mesh’s vertices form a nice rectangle). Then it is possible to simply read pixels from this rendered image to determine whether each cell is active.

The 1-pixel render was flickering a bit when the objects weren’t in the exact center, so I’ve added a secondary “oversampled” render as well that has 3x3 pixels for each cell.
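
The read-back from the oversampled render amounts to averaging each 3x3 block down to one value per cell; a numpy stand-in (the threshold value is arbitrary):

```python
import numpy as np

def read_cells(oversampled, threshold=0.5):
    """Collapse a (rows*3, cols*3) oversampled render to one activity
    boolean per cell by averaging each 3x3 pixel block."""
    rows, cols = oversampled.shape[0] // 3, oversampled.shape[1] // 3
    blocks = oversampled.reshape(rows, 3, cols, 3).mean(axis=(1, 3))
    return blocks > threshold

render = np.zeros((6, 6))   # a 2x2 grid, 3x3 pixels per cell
render[0:3, 3:6] = 1.0      # cell (0, 1) fully active
print(read_cells(render))   # only cell (0, 1) reads True
```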

That being said, I didn’t have marbles at hand, so I used some building toys - they are quite saturated and with different hues (hues could mean different values too!) - with a grid drawn by hand on a piece of paper.

Maybe marbles only need to be enhanced in terms of contrast…

check out salvagrid! :wink:


^ Here, in this corner, are the actual 1-pixel-sized renders; they might not be visible :slight_smile:

(above them the regular, 3x3 sampled and the 1x1 sampled renders scaled up)

Even though this is running in the browser, it seems to work fine in terms of performance (I’ve only tested with small grids, though, and have a good video card). Might be useful as a starting point for you if you go the OpenCV/native route.

the sources on gitlab
(written in coffeescript, using webmidi.js, dat.gui.js and THREE.js for rendering)

One of the problems that came up (since I use a top-down camera) is how to separate pixels coming from the hand vs. the cells. With the right values, my current setup works so that wherever my hands are, the cells beneath them get deactivated (or activated, depending on the parameters of the shader). Maybe it would be better to somehow detect the hand (with chroma-keying, for example) and only output pixels that are not “hand colored”. Maybe it would also be beneficial to write a custom minFilter for the resampling.

Also, the webmidi.js library doesn’t send CC values on controller numbers larger than 119 (120–127 are reserved for Channel Mode messages in the MIDI spec), so it can be a bit tricky to set up in 16x16 mode. I made it so it overflows the grid onto the next MIDI channel past 119, but still (maybe it will be cleanest with four 8x8 grids once multiple grids are supported)
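
That overflow scheme boils down to a tiny mapping (hypothetical helper, assuming 120 usable CC numbers per channel):

```python
CCS_PER_CHANNEL = 120  # CC 120-127 are Channel Mode messages, so stop at 119

def cell_to_midi(index, base_channel=0):
    """Map a flat cell index (0-255 on a 16x16 grid) to a (channel, cc)
    pair, overflowing onto the next channel past CC 119."""
    channel, cc = divmod(index, CCS_PER_CHANNEL)
    return base_channel + channel, cc

print(cell_to_midi(0))     # → (0, 0)
print(cell_to_midi(120))   # → (1, 0)
print(cell_to_midi(255))   # → (2, 15): a 16x16 grid spans three channels
```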

Had an idea while making this that I might explore now… It is about creating circle shapes out of paper, colored in a rainbow cycle (like a hue ring in color pickers) then you pin these circles onto a table or something and by only reading 1 pixel (for example from around the 12 o’clock position on the ring) you can convert the picked hue value into cc easily, so you get these funky paper knobs that you can turn like miniature vinyl decks! :smiley:
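
The hue-to-CC conversion for those paper knobs is about one line with the standard library (a sketch, assuming you already have the RGB of the sampled pixel):

```python
import colorsys

def hue_to_cc(r, g, b):
    """Convert the RGB of one pixel sampled on the hue ring to 0-127."""
    hue, _, _ = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return round(hue * 127)

print(hue_to_cc(255, 0, 0))  # red, hue 0 → CC 0
print(hue_to_cc(0, 255, 0))  # green, hue 1/3 → CC 42
print(hue_to_cc(0, 0, 255))  # blue, hue 2/3 → CC 85
```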


Wowzers nice! I’ll have to rig up a camera and try it out ASAP and report.

I have not really made significant progress on this project myself, besides becoming the proud owner of three bags of marbles, but I’ll definitely analyze in detail your implementation! Calibrating the grid then using simple color detection definitely seems like a great way to go about it.

Edit: I gave it the “Minimal viable prototype” test lol


Those are the nicest kind of marbles imo, but they might need a darkness test instead of the saturation test to be reliable.

you can also drag the ABCD corners around to match your grid!

The code isn’t the cleanest, but /shaders/saturator.frag does the color filtering, and the pixel-reading detection happens in the update_grid function inside /board.coffee
