Voice recognition: Where to start?

divine87 · 2009-07-15T14:58:04+00:00

Hi. My idea for my final project is voice recognition. My problem is that I don't know where to start. Can you help me? Thanks. =)

Updated: Oct 26, 2024

Replies

silenthorde

Member • Jul 15, 2009

You can probably take a fuzzy approach. Fuzzy logic takes care of uncertainties in recognition: it accommodates the fact that the sampled voice signal will be different every time, which is what makes the system so difficult to design. So you can make decisions based on the closeness of the match. If you know fuzzy logic, you will realize that this is quite close to its basic principle: it allows an element to have partial membership in a set.

Say, for instance, you take the Fast Fourier Transform (FFT) of the sampled voice signal. It gives you the various spectral components of the voice signal. You can define input fuzzy sets like ultra high frequency, very high frequency, high frequency, medium frequency, slightly low frequency, and low frequency to convert your input signals to fuzzy values. (Please note that voice signals lie below about 20 kHz, so you will not need the UHF, VHF, and HF sets; they are just for the sake of demonstration 😁.)
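
A minimal sketch of that fuzzification step, as one possible reading of the idea above: take the FFT of a frame, find the strongest frequency, and assign it a grade in each fuzzy set. The set names, the triangular shapes, the breakpoints in Hz, and the 440 Hz test tone are all assumptions made for illustration; none of them come from the post. A small radix-2 FFT is written out to keep the example dependency-free, where a library routine such as `numpy.fft.rfft` would be the usual choice.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Frequency in Hz of the strongest bin below the Nyquist frequency."""
    spectrum = fft(samples)
    half = len(samples) // 2
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate / len(samples)


def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, 1 at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets (breakpoints in Hz), chosen only for this demo.
FREQUENCY_SETS = {
    "low": (0.0, 250.0, 1000.0),
    "medium": (500.0, 1500.0, 3000.0),
    "high": (2000.0, 4000.0, 6000.0),
}


def fuzzify(freq_hz):
    """Membership grade of a frequency in every fuzzy set."""
    return {name: triangular(freq_hz, *abc) for name, abc in FREQUENCY_SETS.items()}


# Synthetic stand-in for a recorded voice frame: a 440 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) for i in range(256)]
peak_hz = dominant_frequency(frame, SAMPLE_RATE)
print(peak_hz, fuzzify(peak_hz))
```

A real system would more likely fuzzify several features per frame (for example, the energy in each band) rather than a single peak frequency; the single value just keeps the partial-membership idea visible.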

Now the sampled signal will be assigned a membership grade in each of these sets, say 0.2 for HF, 0.6 for UHF, etc. Next, frame the fuzzy rules; they determine how the fuzzy inference engine decides whether the voice is a match or not. These rules will be of the if-then form. The inputs decide the firing strength of these rules. Say you have 8 rules: each of them fires, but with differing intensities. Use a defuzzification strategy like center of gravity or mean of maxima to get the final defuzzified output.
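
The rule-firing and defuzzification steps can be sketched roughly as follows. This is only an illustration of a Mamdani-style engine with center-of-gravity defuzzification: the three rules, the output sets "poor", "fair" and "good", and the 0-to-1 match score are hypothetical, and a real design would need its own rule base (the post suggests something like 8 rules).

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, 1 at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical output sets over a match score in [0, 1].
OUTPUT_SETS = {
    "poor": (-0.5, 0.0, 0.5),
    "fair": (0.0, 0.5, 1.0),
    "good": (0.5, 1.0, 1.5),
}

# Hypothetical rules for a speaker whose voice is mostly low-pitched:
# IF frequency is <antecedent> THEN match is <consequent>.
RULES = [("low", "good"), ("medium", "fair"), ("high", "poor")]


def mamdani_match_score(grades, steps=101):
    """Clip each rule's output set by its firing strength, combine the
    clipped sets with max, and return the center of gravity."""
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        y = i / (steps - 1)  # candidate match score
        mu = 0.0
        for antecedent, consequent in RULES:
            strength = grades.get(antecedent, 0.0)  # firing strength
            clipped = min(strength, triangular(y, *OUTPUT_SETS[consequent]))
            mu = max(mu, clipped)
        numerator += y * mu
        denominator += mu
    return numerator / denominator if denominator else 0.0


# Every rule fires, but with a different intensity.
print(mamdani_match_score({"low": 0.75, "medium": 0.25, "high": 0.0}))
```

The final match decision would then be a threshold on the defuzzified score, and that threshold would have to be tuned on real recordings.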

This is just a schematic of a generalised fuzzy system (Mamdani type); there are other models like Sugeno and Tsukamoto. I have just tried to share my idea of the system implementation. This may be far from perfect. 🙂

Harshad Italiya

Member • Jul 15, 2009

Very informative answer, mate. Can you please share some more knowledge with us? It will be very useful to many CEans.

We have to use MATLAB for the processing, am I right?

divine87

Member • Jul 16, 2009

Thanks for the information. 😛

silenthorde

Member • Jul 19, 2009

godfather
Very informative answer, mate. Can you please share some more knowledge with us? It will be very useful to many CEans.

We have to use MATLAB for the processing, am I right?
Ya sure, GODFATHER. I'm thinking of starting a thread on fuzzy logic; there isn't any on CE. Ya, MATLAB's Fuzzy Logic Toolbox is the standard development and test platform, I'd say, though I haven't worked with it.

wassup

Member • Jul 20, 2009

divine, speech recognition will take a lot of time if you start from scratch. Before you start, you need to decide a lot of things. Most importantly, do you really want to get into the basics, with all the signal processing, neural networks, fuzzy logic, and HMMs? Or do you just want the end application, written using a speech library? A full-fledged speech recognition system is not a 4-month project.
If you are not comfortable with all the math, you had better aim for the end-user application and use a back-end speech library. This is

Predictor

Member • Jul 26, 2009

divine87
My idea for my final project is voice recognition. My problem is that I don't know where to start. Can you help me?
You could save one (or more) examples of each known speaker's voice, and compare the new case to all of those, looking for the closest match. Which method of determining the similarity of two sampled voices is most accurate can only be determined by experimentation.

One obvious thing to try is the average absolute difference between the FFT magnitudes.
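
A rough sketch of that comparison, assuming each stored example and the new recording are frames of the same power-of-two length: compute the FFT magnitudes of each and pick the stored speaker with the smallest average absolute difference. The speaker names and the pure test tones are made up for the demonstration, and the hand-written radix-2 FFT only stands in for a library routine.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(samples):
    """FFT magnitudes for the bins below the Nyquist frequency."""
    return [abs(v) for v in fft(samples)[: len(samples) // 2]]


def mean_abs_difference(a, b):
    """Average absolute difference between two equal-length spectra."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def closest_speaker(templates, sample):
    """Name of the stored example whose spectrum is nearest to the sample."""
    target = magnitude_spectrum(sample)
    return min(
        templates,
        key=lambda name: mean_abs_difference(magnitude_spectrum(templates[name]), target),
    )


def tone(freq_hz, n=256, rate=8000):
    """Synthetic stand-in for a recorded voice frame."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


# Hypothetical stored examples, one per known speaker.
templates = {"alice": tone(250), "bob": tone(1000)}
print(closest_speaker(templates, tone(265)))
```

Real recordings differ in loudness and length, so normalizing the magnitudes and averaging over several frames would probably be needed before the distances mean much.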

You might try using something simple. A fuzzy logic system was once built, to recognize spoken digits, which used only sample counts from the digitized waveform for each of 24 zones (4 amplitude segments by 6 time segments).
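
One way to read the 24-zone feature is sketched below: split the waveform into 6 equal time segments and 4 equal amplitude bands, then count how many samples fall into each zone. This is an interpretation rather than a description of the original system; the function name, the equal-width zones, and the assumed amplitude range of -1 to 1 are all assumptions.

```python
def zone_counts(samples, amp_bins=4, time_bins=6, lo=-1.0, hi=1.0):
    """Count samples per (time segment, amplitude band) zone.

    Returns a flat list of time_bins * amp_bins counts, ordered by time
    segment first and amplitude band second.
    """
    counts = [[0] * amp_bins for _ in range(time_bins)]
    n = len(samples)
    for i, s in enumerate(samples):
        t = min(i * time_bins // n, time_bins - 1)  # time segment index
        a = int((s - lo) / (hi - lo) * amp_bins)    # amplitude band index
        a = max(0, min(a, amp_bins - 1))            # clamp out-of-range values
        counts[t][a] += 1
    return [c for row in counts for c in row]


# Toy waveform: quiet-negative first half, loud-positive second half.
features = zone_counts([-1.0] * 6 + [0.9] * 6)
print(len(features), features)
```

The resulting 24 counts could then be fuzzified or compared against stored examples, for instance with an average absolute difference.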

-Will Dwinnell
Data Mining in MATLAB: https://matlabdatamining.blogspot.com/
