Does a computerized speech module already exist?

clone45 · April 12, 2020, 10:13pm

Hello,

I searched the library but couldn’t find a module that makes retro-style synthesized vocal sounds - similar to a Speak & Spell. I’m considering building one but I don’t know if it’s been done before. Anyone seen anything like this yet?

Cheers
Bret

Xenakios · April 12, 2020, 10:23pm

Nysthi has the SAM module. It seems to have the downside that the parameter controls don’t work while it is speaking the text and changes only take effect when it restarts speaking.

There’s a vocal synthesis mode in the Audible Instruments Macro Oscillator 2, but you can’t use your own text as input, there’s just a selection of predefined words it will do.

clone45 · April 12, 2020, 11:16pm

Thanks @Xenakios. Well… I probably wouldn’t bring much new to the table, so I suppose I’ll move on to other crazy ideas.

persy · April 13, 2020, 12:04am

what about a retro style 8bit sample player? i always liked the way midines’ sample bank sounded.

or maybe a module based on lsdj’s 59 phonemes or even the wave channel?

dag2099 · April 13, 2020, 1:25am

One downside of SAM is that it only says one line at a time, which makes it sort of hard to use if you want it to say a whole song’s worth of lyrics. I use it with Stoermelder 8face2, but that only gives me 16 lines and makes things a little squirrelly to set up.

Just being able to load a list of lines of dialogue would be really useful, but being able to change the pronunciation so you could get things to scan would be best.

clone45 · April 13, 2020, 1:37am

OK, maybe I’ll look deeper into it. My first idea was to create a simple syllable generator of sorts. I had good luck before with this type of thing (https://github.com/clone45/EquationComposer/blob/master/experimental/ModuleVocalizer.cpp), but it’s been so long that I don’t have links to the original code which it’s based on. @persy I’ll take a look at those suggestions. I also have a big project in the works that I may focus on for a while. I’ve got to keep my mind occupied until I can go out fishing again. Ha ha ha.

VCVRackIdeas · April 13, 2020, 6:20am

I’m dreaming about a vocoded, Kraftwerk-like voice in VCV Rack) If you’re going to do a module in this vein (or not), here is a suggestion from me: give each phrase its own trigger, or add RUN (held as a gate, or just switched on by a trigger) and PAUSE triggers. That would be a great, smaller solution for making a whole song without so many mysterious things)

synthi · April 13, 2020, 1:18pm

you can sequence sentences in SAM: just edit the “sam.dat” file in RES

clone45 · April 14, 2020, 4:52am

@VCVRackIdeas What if it worked like refrigerator magnets? But instead of placing them on your refrigerator (which has no concept of rhythm), you place the virtual magnets on a timeline, like this?

I had a little whiskey, so that might be very difficult to understand. It would play the top row, then the second row, then the third, etc. And each row would contain 1-bar split into 16 sections.

So, the above would sound like…

Out__of_____time. Out_of_______time. Out____of_time.
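Here is a rough sketch of how that timeline could be modeled, just to make the idea concrete. It assumes each row is one bar of 16 steps and that each magnet covers a whole number of steps. The `Magnet` class, `validate_row`, `playback_events`, and the step positions are all made up for illustration and are not taken from any existing module.

```python
from dataclasses import dataclass

STEPS_PER_ROW = 16  # one bar split into 16 sections, as described above


@dataclass(frozen=True)
class Magnet:
    """Hypothetical word magnet placed on a row of the timeline."""
    word: str
    start: int   # first step the magnet covers (0-15)
    length: int  # number of steps the magnet covers


def validate_row(row):
    """Raise ValueError if a magnet leaves the bar or overlaps another."""
    occupied = set()
    for m in row:
        if m.length < 1 or m.start < 0 or m.start + m.length > STEPS_PER_ROW:
            raise ValueError(f"{m.word!r} does not fit in the bar")
        steps = set(range(m.start, m.start + m.length))
        if occupied & steps:
            raise ValueError(f"{m.word!r} overlaps another magnet")
        occupied |= steps


def playback_events(rows):
    """Return (absolute_step, word) pairs: top row first, then the next row."""
    events = []
    for row_index, row in enumerate(rows):
        validate_row(row)
        for m in sorted(row, key=lambda m: m.start):
            events.append((row_index * STEPS_PER_ROW + m.start, m.word))
    return events


# Two example rows with assumed step positions.
rows = [
    [Magnet("out", 0, 3), Magnet("of", 5, 2), Magnet("time", 12, 4)],
    [Magnet("out", 0, 3), Magnet("of", 4, 2), Magnet("time", 12, 4)],
]
print(playback_events(rows))
```

A sequencer clock would then just walk the absolute step counter and start a word whenever the current step matches an event.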

There could be 4 rows, and in addition 16 “snapshots” of rows that you could control via CV. (Essentially, the snapshots would act like a built-in Stoermelder 8FACE.)

As for word selection, we could use existing research (such as Analysis of Billboard’s Top 100 Songs and Lyrics (1964-2015)) to choose 128 words or so to make available.

I could synthesize the words - Or maybe we could let people create “packs” of recorded words? A pack would contain a folder of samples and a file that lists what words are available and which sample is associated with which word, and how long the magnet should be for each word. Very interesting!
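To make the “pack” idea a bit more concrete, here is one possible shape for the listing file, plus a tiny loader. Everything here is an assumption: the JSON layout, the field names (`word`, `sample`, `steps`), the sample file names, and the `load_pack` function are hypothetical, not an existing format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PackWord:
    """One entry of a hypothetical word pack."""
    word: str    # text shown on the magnet
    sample: str  # sample file inside the pack folder
    steps: int   # how long the magnet should be, in timeline steps


# Assumed manifest layout; a real pack would keep this in a file
# next to the samples.
EXAMPLE_MANIFEST = """
{
  "name": "example-pack",
  "words": [
    {"word": "out",  "sample": "out.wav",  "steps": 3},
    {"word": "of",   "sample": "of.wav",   "steps": 2},
    {"word": "time", "sample": "time.wav", "steps": 4}
  ]
}
"""


def load_pack(manifest_text):
    """Parse the manifest text and return a dict mapping word -> PackWord."""
    data = json.loads(manifest_text)
    pool = {}
    for entry in data["words"]:
        item = PackWord(entry["word"], entry["sample"], int(entry["steps"]))
        if item.steps < 1:
            raise ValueError(f"{item.word!r} needs a length of at least 1 step")
        if item.word in pool:
            raise ValueError(f"{item.word!r} is listed twice")
        pool[item.word] = item
    return pool


pool = load_pack(EXAMPLE_MANIFEST)
print(sorted(pool))
```

Keeping the magnet length in the listing file means the sequencer never has to inspect the audio to decide how wide a magnet should be.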

I created another mockup to show that there would be a “pool” of words to choose from:

rsmus7 · April 14, 2020, 5:46am

Hi, Bret,

that looks very interesting and to me quite understandable.

cheers

VCVRackIdeas · April 14, 2020, 6:41am

that definitely sounds cool! Wanna test it already) it’s something new in this area)

logorrhoia · April 14, 2020, 8:57am

I love this idea, with visuals like that I could introduce it into maths and english lessons and we’d have way more fun.

I know this is not easy to work out, but grammar features leading to some weird sounds would be nice - but I could use that with some distortion module and ask them how deep we need to destroy the sentence coming out depending on the amount of mistakes they spot (or if they love distortion reward them with higher distortion freedom by level of grammatical correctness).

Lucky me we only have to build very small sentences as the vocabulary capacities of my non native english speaking learning impaired children are low.

veryfungi · September 22, 2024, 11:29am

Did this ever go anywhere? I am in the process of making robots sing… SAM is cool, but I really need it to follow pitch while speaking.

auretvh · September 22, 2024, 12:33pm

What about running it through a vocoder?

bipomuz · September 22, 2024, 2:00pm

hi! I have a patch for the dreamworld challenge using a SAM voice getting pitch from external sequencer cv. don’t know if this can help somehow.

veryfungi · September 22, 2024, 2:38pm

I never even thought about that. Yeah that could be fun. Thanks

veryfungi · September 22, 2024, 2:43pm

That sounds crazy. Nicely done. Here is my recent use of SAM.

Squinky · September 22, 2024, 8:10pm

Myself I don’t think I’d want to make a grid sequencer and a speech synthesizer combined. If the words were “just” CV inputs, then it would be much simpler to make, and you could use it with any sequencer (there are an awful lot of them out there!).
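As a rough illustration of the “words as CV inputs” approach, the sketch below picks a word from a pool using a CV level and fires it on the rising edge of a trigger. It assumes a 0 to 10 V unipolar CV range and a 1 V trigger threshold; the `WordSelector` class, `cv_to_index`, and the word list are hypothetical names for this example only.

```python
WORDS = ["out", "of", "time", "love"]  # hypothetical word pool


def cv_to_index(volts, count, v_min=0.0, v_max=10.0):
    """Map a CV voltage onto one of `count` equal-width slots."""
    clamped = min(max(volts, v_min), v_max)
    index = int((clamped - v_min) / (v_max - v_min) * count)
    return min(index, count - 1)  # v_max itself belongs to the last slot


class WordSelector:
    """Returns a word when the trigger input goes from low to high."""

    def __init__(self, words, threshold=1.0):
        self.words = words
        self.threshold = threshold  # assumed trigger threshold in volts
        self._was_high = False

    def process(self, trigger_volts, word_cv):
        """Call once per sample; returns a word on a rising edge, else None."""
        is_high = trigger_volts >= self.threshold
        rising = is_high and not self._was_high
        self._was_high = is_high
        if rising:
            return self.words[cv_to_index(word_cv, len(self.words))]
        return None


selector = WordSelector(WORDS)
print(selector.process(10.0, 5.0))
```

Because the module only sees a trigger and a voltage, any existing sequencer could drive it, which is the simplification being suggested here.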

clone45 · September 22, 2024, 9:53pm

I never went anywhere with this idea. I probably got sidetracked. Ha ha ha. In fact, I totally forgot about it. If I get a chance, I’ll revisit it, but if anyone else wants to run with the idea, I’ll step aside and focus on other things.