noob Threading affirmations

Update

Updating this ask to try and get more specific:

the easiest and most general form of the question is: can someone point me to good, freely available resources for multithreading in a DSP environment, a la lock-free etc.? or help me figure out wtf is going on in Sanguine’s thread-server (there’s still a mutex hiding in there).

to go further, in case someone is willing to try to explain it themselves: very basically, i need help understanding how to share 2 arrays of data between 2 threads when both read and write but one can’t ever be locked out.

this is the format: thread1 writes to array1 and then reads from array2 one at a time, while thread2 does block processing, reading from array1, creating new values, then writing to array2, all in one go but of course intermittently.

one caveat that potentially eases things: i don’t think thread1, when it reads from array2, particularly cares if it’s a fresh value or not (while thread2 processes, thread1 has to keep pulling what are now technically old values). of course thread1 can’t ever be locked, so mutexes seem not to be the way.

However, the code below in its full form does seem to work just fine. I just ran 14 instances of it with no audio issues at all and pretty good CPU usage.

original post

Hello again! This chunk of semi-pseudo code is pulled from what I have gotten to actually work for my specific case. My question is: am I doing threading here any kind of ‘correctly’ or ‘safely’? Like, can I use this struct in multiple modules without issue, and will it play well in the environment long term? So far it’s fine, but I really don’t know what I’m doing here yet, and this is getting into the territory where I want to be very safe.

template<typename T = float, int S = 2048>
struct FFTThreaded {
  std::thread Worker;
  std::mutex workTex;
  std::condition_variable workCV;
  std::atomic<bool> FFTReady{false};
  std::atomic<bool> outReady{false};
  std::atomic<bool> Quitting{false};
  myRingBuffer<T, S> inputBuffer;
  myRingBuffer<T, S> outputBuffer;
  T FFTanalyzed[S * 2];
  T FFTsynthesized[S * 2];
  
  FFTThreaded() {
  	//memory setting and whatnot, then
  	Worker = std::thread(&FFTThreaded::processFFTThread, this);
  }
  
  ~FFTThreaded() {
   //I think all this is to make sure the thread closes when the program does.
     {
        std::lock_guard<std::mutex> lock(workTex);
        Quitting.store(true);
     } //lock, tell quit, unlock
  
     workCV.notify_one();
     if (Worker.joinable()) Worker.join();
   }
  
  //run on main thread every sample(must)
  void push(T in) {
     {
        std::lock_guard<std::mutex> lock(workTex);
        //put sample into inputBuffer, do the appropriate ring Buffer stuff
     } //lock guard in its own scope so it unlocks on its own
      
    //tell the worker thread when a block is ready to process
    if (inputBuffer.bufferIsFull()) {
        this->FFTReady.store(true);
        workCV.notify_one();
    }
  }  
  
  //run on worker thread when needed
  void processFFTThread() {
        //why exactly must the thread be locked in an infinite while loop?
        while (true) {
            std::unique_lock<std::mutex> lock(workTex);
            //wake on Quitting too, otherwise the destructor's notify_one() is ignored and join() never returns
            workCV.wait(lock, [this] {
                return this->FFTReady.load() || this->Quitting.load();
                });
            if (this->Quitting) return;
            //Do FFT Stuff if it says its ready, on members of this struct
            //take inputBuffer into FFTanalyzed,
            //alter them into FFTsynthesized
            //inverse that into outputBuffer
            //once its done, remind it so, and tell the output a new block is ready

            this->outReady.store(true);
            this->FFTReady.store(false);

            lock.unlock();
        }
  }
  
  //run on main thread every sample(must)
  T pull() {

  	if (this->outReady) {
  	    //when the output block is ready, start from the beginning of it.
  	}
  	//pull sample from the outputBuffer and return it
  }
};

since i’m using this platform as my entire motivation to learn to code, it has become time to figure out multithreading and do a nice FFT. this, when expanded slightly, should become the base by which I can do any FFT effect on said worker thread. All the stuff I can find about threads is either so simplified it doesn’t even complete the purpose of a thread (e.g. it forces the main thread to wait for the other to join) or beyond my ability to read as code. Summing up, again, I just need to know if i’ve done this alright.
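
On the question in the code comment about the infinite loop: a std::thread ends as soon as its function returns, so the loop is what keeps the worker alive between blocks, and the condition variable is what lets it sleep instead of spinning. Here is a minimal sketch of just that pattern, not the struct above. `BlockWorker`, `submitBlock`, `blocksProcessed` and `demoBlockWorker` are made-up names, and the "FFT" is only a counter. The wait predicate also checks the quit flag so the destructor can actually wake the thread.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an FFT worker: counts blocks instead of transforming them.
struct BlockWorker {
    std::mutex mtx;
    std::condition_variable cv;
    bool blockReady = false;              // guarded by mtx
    bool quitting = false;                // guarded by mtx
    std::atomic<int> blocksProcessed{0};
    std::thread worker;                   // declared last so everything above exists before it starts

    BlockWorker() : worker(&BlockWorker::run, this) {}

    ~BlockWorker() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            quitting = true;
        }
        cv.notify_one();
        if (worker.joinable()) worker.join();
    }

    // Takes the mutex, so as written this is NOT safe to call from an audio thread.
    void submitBlock() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            blockReady = true;
        }
        cv.notify_one();
    }

    void run() {
        // Without this loop the thread would handle one block and then exit.
        while (true) {
            std::unique_lock<std::mutex> lock(mtx);
            // Sleep until there is work OR we are asked to quit.
            cv.wait(lock, [this] { return blockReady || quitting; });
            if (quitting) return;
            blockReady = false;
            lock.unlock();                // do the heavy work without holding the lock
            blocksProcessed.fetch_add(1); // placeholder for the real processing
        }
    }
};

// Usage: hand over one block, wait for it, then let the destructor stop the worker.
inline int demoBlockWorker() {
    BlockWorker w;
    w.submitBlock();
    while (w.blocksProcessed.load() < 1) std::this_thread::yield();
    assert(w.blocksProcessed.load() == 1);
    return w.blocksProcessed.load();
}
```

Because `submitBlock` locks the mutex, this only illustrates thread lifetime and shutdown; it does not address the requirement that the audio thread must never wait.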

Hard to say, because some essential parts like myRingBuffer are unclear. I assume that you are going to call this from the DSP thread? Then you should drop the idea of using a mutex, because it might block the DSP thread.

I am calling this from the dsp thread, but all my attempts without this back and forth from the mutex/condition_variable combo resulted in just nothing, like the thread either never started or would never update the buffers, so I’m not sure how else to deal with all this shared data synchronously. myRingBuffer is part of the pseudo in the code, just un-complicating the overlap-add process i’m actually doing. they are just arrays with an atomic writehead (size_t) that wraps around (that’s what decides when it’s full as well), and in reality there’s a few of each for dealing with hops.

If you use dsp::RingBuffer for input and output, you should be fine without any locking. I’m guessing your issues were caused by the myRingBuffer data structure (a writehead is not enough, you need at least a readhead).
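
To make the read head / write head point concrete, here is a rough sketch of a single-producer, single-consumer ring where each head is written by only one thread. This is not Rack's dsp::RingBuffer; `SpscRing`, its members and `demoSpscRing` are hypothetical names for illustration, and it assumes exactly one pushing thread and one popping thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical lock-free ring for ONE producer thread and ONE consumer thread.
template <typename T, size_t N>
struct SpscRing {
    static_assert((N & (N - 1)) == 0, "N must be a power of two");

    T data[N] = {};
    std::atomic<size_t> writeHead{0};  // modified only by the producer
    std::atomic<size_t> readHead{0};   // modified only by the consumer

    // Producer side. Returns false instead of waiting when the ring is full.
    bool push(const T& v) {
        size_t w = writeHead.load(std::memory_order_relaxed);
        size_t r = readHead.load(std::memory_order_acquire);
        if (w - r == N) return false;
        data[w & (N - 1)] = v;
        // Release: the sample is written before the new head becomes visible.
        writeHead.store(w + 1, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns false instead of waiting when the ring is empty.
    bool pop(T& out) {
        size_t r = readHead.load(std::memory_order_relaxed);
        size_t w = writeHead.load(std::memory_order_acquire);
        if (r == w) return false;
        out = data[r & (N - 1)];
        // Release: the slot is read before the producer is allowed to reuse it.
        readHead.store(r + 1, std::memory_order_release);
        return true;
    }
};

// Usage: fill the ring, see the fifth push rejected, then read the oldest sample back.
inline float demoSpscRing() {
    SpscRing<float, 4> ring;
    for (int i = 1; i <= 4; ++i) ring.push(static_cast<float>(i));
    bool accepted = ring.push(5.f);
    assert(!accepted);
    float v = 0.f;
    ring.pop(v);
    return v;
}
```

Neither call ever waits: a full or empty ring is reported through the return value, so the caller decides what to do (drop the sample, reuse the last output, and so on).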

I mean, it is working right now without that. the in-out system is structured thus: samples are written into the inputBuffers via their writeHeads, which tick along with them. when one of the hop buffers is full, that whole block is processed from start to end by the FFT, which puts its results through the analysis and synthesis bins, then to an entire output block from start to end. then the output is read per sample from the output blocks, each of which has its own independent read head. so no read head on the input, and no write head on the output, because they always fill whole blocks.

besides, I am certain the buffer system is not the issue there, as I also have tried a nubile no-mutex lifetime worker thread on a fractal math iteration loop with the same nothing result. the issue is definitely in how i’m constructing, joining, or sharing data with the thread from within the struct, as all the ‘learning threads’ stuff i’ve looked at has been essentially: run a global function from an int main with another thread, join it, and you’re done. that is fully not doing what threads are for imo, and they don’t seem to get much further than that.
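
A whole-block hand-off like the one described above can be done without a mutex if every block has exactly one owner at a time and an atomic flag passes ownership. What follows is only a sketch under assumptions: `BlockExchange` and all its members are invented names, the "FFT" is a multiply by 0.5, the worker polls with a short sleep instead of using a condition variable, and a block is simply dropped if the worker falls behind.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical block hand-off between an audio thread and one worker thread.
template <size_t S>
struct BlockExchange {
    float in[S] = {};                  // audio thread may write only while inFull == false
    float out[S] = {};                 // worker may write only while outFull == false
    std::atomic<bool> inFull{false};
    std::atomic<bool> outFull{false};
    std::atomic<bool> quitting{false};
    float staging[S] = {};             // touched by the audio thread only
    float playback[S] = {};            // touched by the audio thread only
    size_t writePos = 0, readPos = 0;
    std::thread worker;                // declared last so the members above exist first

    BlockExchange() : worker(&BlockExchange::run, this) {}
    ~BlockExchange() {
        quitting.store(true, std::memory_order_release);
        if (worker.joinable()) worker.join();
    }

    // Audio thread, once per sample. Never waits on the worker.
    float process(float x) {
        staging[writePos++] = x;
        if (writePos == S) {
            writePos = 0;
            if (!inFull.load(std::memory_order_acquire)) {
                for (size_t i = 0; i < S; ++i) in[i] = staging[i];
                inFull.store(true, std::memory_order_release);   // hand block to worker
            }                                                    // else: block is dropped
        }
        if (outFull.load(std::memory_order_acquire)) {
            for (size_t i = 0; i < S; ++i) playback[i] = out[i];
            outFull.store(false, std::memory_order_release);     // give 'out' back
            readPos = 0;                                         // start of the fresh block
        }
        float y = playback[readPos];   // repeats old values until a new block arrives
        readPos = (readPos + 1) % S;
        return y;
    }

    // Worker thread.
    void run() {
        float work[S];
        while (!quitting.load(std::memory_order_acquire)) {
            if (!inFull.load(std::memory_order_acquire)) {
                std::this_thread::sleep_for(std::chrono::microseconds(100));
                continue;
            }
            for (size_t i = 0; i < S; ++i) work[i] = in[i] * 0.5f;  // placeholder for FFT work
            inFull.store(false, std::memory_order_release);         // give 'in' back
            while (outFull.load(std::memory_order_acquire)) {
                if (quitting.load(std::memory_order_acquire)) return;
                std::this_thread::yield();
            }
            for (size_t i = 0; i < S; ++i) out[i] = work[i];
            outFull.store(true, std::memory_order_release);         // publish result
        }
    }
};

// Usage: feed a constant 2.0 until the first processed block comes back.
inline float demoBlockExchange() {
    BlockExchange<4> ex;
    float y = 0.f;
    for (int i = 0; i < 400000 && y != 1.0f; ++i) {
        y = ex.process(2.0f);
        if (i % 4 == 3) std::this_thread::sleep_for(std::chrono::microseconds(50));
    }
    assert(y == 1.0f);
    return y;
}
```

The trade-off is two block copies on the audio thread and some extra latency from polling. In exchange the audio thread only ever does atomic loads and stores, which fits the "doesn't care if the value is fresh" caveat from the update.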

I don’t know if this helps, but in Sapphire Empath I do an FFT to update my spectrum display graphs. But I don’t do the FFT from the audio thread (the one that calls your Module::process method). It’s too much work to do all in one audio sample.

Instead, my audio thread appends each time-domain sample to an FFT input buffer. My widget’s step method, which runs in the UI thread, checks the buffer on each display frame update, which runs at your video frame rate (60 frames per second) rather than the audio rate (48000 samples per second). That’s where I do the FFT.

You can get a lot more number crunching done in your UI/step function rather than the DSP/process function. I do the FFT, draw the graphics, empty the buffer, and quit, waiting for the next time the buffer is full.

This works in my case because the FFT is only needed for a graphical display. You run into different problems if your FFT output is needed by your audio path.

sadly this is to become (already is, really) at the very least a pitch shifter, so it definitely needs the live audio abilities, but I will definitely look over your code anyway, cuz I need the reading and understanding practice.

That is an interesting approach. How do you handle that the plugin versions of Rack don’t call step when no plugin window is open? Ok - you already answered that. I’m also wondering if this might cause dropping frame rates if the load gets too heavy for one screen frame.

speaking of this, related question for when i figure this out: is creating a worker thread for the UI allowed? even there, calculating and drawing a whole fractal sucks, so currently i’m only doing it every 8 frames, but that’s not really saving me, just allowing everyone else some time to draw.

I’m not sure if it is forbidden or how this would work - it might need even more thread synchronization. I suggest looking into FramebufferWidget, which helps you render only when actually needed (see setDirty).

lol my setDirty attempts have also failed, and I can’t figure out why, cuz i’m pretty sure i was doing it exactly how i saw it in others’ code. but that’s an issue for another time. the widget is a child of a framebuffer already, so when i do, it’ll be simple to implement.