Contact Project Developer Ashish D. Tiwari [astiwz@gmail.com]
Desktop Applications Java BE-Engineering(CO/IT) ME-Engineering(CO/IT) BCS MCS BCA MCA MCM BSC Computer/IT MSC Computer/IT Diploma (CO/IT) IEEE-2016

Intelligent Hands Free Speech based SMS System on Android

Text to speech and speech to text conversion

Abstract:

Speech recognition has become widely adopted over the years. Speech input can be used in varied domains, such as automatic reading and data entry. It can reduce the need for typed text and other types of input, and at the same time reduce the computation needed for the process. A decade ago speech recognition was difficult to use in any system, but advances in technology have brought new algorithms, techniques, and tools, and it is now possible to obtain the desired recognition output. One such method is the hidden Markov model (HMM), which is used in this paper. Voice input is captured through a speech device such as a microphone, then processed and converted to text so that it can be sent as an SMS. The phone number can be entered either by voice or by selecting it from the contact list. Voice input opens up data entry to a wider variety of users, such as people who are illiterate or have disabilities: for a person who cannot write, speech input is a boon, and it can lead to better use of the application by other users too. The application accepts only numeric characters for the contact number, i.e. the number is validated. The speech recognizer (SR) listens to the input, converts the spoken digits to text, and displays them in the contact field for verification. If the user tries to enter any other character, an error is displayed; for example, if the user speaks a name in the contact field, it is reported as an invalid contact. The message box accepts any character. To use speech recognition, the user has to speak loudly and clearly so that the command is executed properly by the system.

Module Description:

1. Speech to text conversion:

          The system uses speech recognition (SR) through the Google server, which uses the HMM method. Briefly, speech is recognized as follows. First, the speech is captured; sound is a fluctuating set of signals, which are recorded. These signals depend on the speaker, that is, on his or her voice quality and command of the language. The input is then divided into words and phrases, i.e. the command is split into several parts. Finally, in the processing phase, the system interprets the command and executes it.

Speech recognition rests on five main pillars: feature extraction, an acoustic model database built from training data, a dictionary, a language model, and the speech recognition algorithm. The input data, i.e. the voice, is first converted to a digital signal, sampled on the time and amplitude axes. This digitized signal is then processed. For processing, the signal is divided into small intervals whose length depends on the algorithm used.
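The division of the digitized signal into small intervals can be pictured with a short sketch. The Java class below is illustrative only: `SignalFramer`, `frameSize`, and `hopSize` are hypothetical names, and the overlapping-frame scheme is a common convention assumed here, not something this project description specifies.

```java
import java.util.ArrayList;
import java.util.List;

final class SignalFramer {
    /**
     * Splits the samples into frames of frameSize samples, advancing by hopSize
     * samples each time (hopSize < frameSize gives overlapping frames).
     * A trailing partial frame is dropped. Assumed scheme, for illustration.
     */
    static List<double[]> frame(double[] samples, int frameSize, int hopSize) {
        if (frameSize <= 0 || hopSize <= 0) {
            throw new IllegalArgumentException("frameSize and hopSize must be positive");
        }
        List<double[]> frames = new ArrayList<>();
        for (int start = 0; start + frameSize <= samples.length; start += hopSize) {
            double[] frame = new double[frameSize];
            System.arraycopy(samples, start, frame, 0, frameSize);
            frames.add(frame);
        }
        return frames;
    }
}
```

Each frame returned here would then be passed on to feature extraction. The actual interval length is left as a parameter because, as noted above, it depends on the algorithm used.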

This division is based on the features of the data, since those features are compared with the database elements. Each database element holds the feature information for a word, and the command is built from the words found. The basic element can be a phoneme for continuous speech or a word for isolated-word recognition. The dictionary, or database, connects the frequency model, i.e. the spoken word, with the actual vocabulary word. Speech as a signal has its constraints: what is said should match the meaning of the text the speaker had in mind. The HMM models words. Its output is a probabilistic function of a hidden state, which cannot be specified deterministically. The state sequence itself is not observed; an SR system generally assumes that the signal is a realization of a message encoded as a sequence of symbols. Here the symbols are the sampled words.

To perform the reverse operation, recognizing the underlying symbol sequence from a spoken utterance, the continuous speech waveform is first converted to a sequence of equally spaced discrete parameter vectors. These vectors of speech characteristics consist mostly of MFCCs (Mel Frequency Cepstral Coefficients), standardized by the European Telecommunications Standards Institute for speech recognition. MFCCs are straightforward to compute: Fourier analysis is performed on the sampled, i.e. divided, data; triangular filters of variable bandwidth are then placed along the Mel frequency scale, and the energy in each filter is calculated from the spectrum.
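As a rough illustration of the filter placement step, the sketch below spaces triangular filters evenly on the Mel scale and sums the spectrum energy under each one. It assumes the widely used conversion mel = 2595 log10(1 + f/700); the class and method names are hypothetical, and the Fourier analysis is assumed to have been done already, producing a power spectrum. It is not the front end used by the Google recognizer.

```java
final class MelFilterBank {
    /** Converts a frequency in Hz to the Mel scale (common formula, assumed here). */
    static double hzToMel(double hz) {
        return 2595.0 * Math.log10(1.0 + hz / 700.0);
    }

    /** Inverse of hzToMel. */
    static double melToHz(double mel) {
        return 700.0 * (Math.pow(10.0, mel / 2595.0) - 1.0);
    }

    /** Returns numFilters + 2 edge frequencies in Hz, equally spaced on the Mel scale. */
    static double[] edgeFrequencies(int numFilters, double lowHz, double highHz) {
        double lowMel = hzToMel(lowHz);
        double highMel = hzToMel(highHz);
        double[] edges = new double[numFilters + 2];
        for (int i = 0; i < edges.length; i++) {
            edges[i] = melToHz(lowMel + (highMel - lowMel) * i / (numFilters + 1));
        }
        return edges;
    }

    /** Weight of a triangular filter that rises from left to center and falls to right. */
    static double triangle(double hz, double left, double center, double right) {
        if (hz <= left || hz >= right) {
            return 0.0;
        }
        return hz <= center ? (hz - left) / (center - left) : (right - hz) / (right - center);
    }

    /** Sums the power spectrum under each filter; bin k is assumed to sit at k * binHz. */
    static double[] filterEnergies(double[] powerSpectrum, double binHz, double[] edges) {
        double[] energies = new double[edges.length - 2];
        for (int m = 0; m < energies.length; m++) {
            for (int k = 0; k < powerSpectrum.length; k++) {
                energies[m] += powerSpectrum[k]
                        * triangle(k * binHz, edges[m], edges[m + 1], edges[m + 2]);
            }
        }
        return energies;
    }
}
```

Because the edges are equally spaced in Mel rather than in Hz, the filters become wider at higher frequencies, which is the "variable bandwidth" mentioned above. A full MFCC computation would go on to take the logarithm of these energies and apply a discrete cosine transform; that part is omitted from this sketch.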

2. Sending SMS:

          The user has two ways to send an SMS in this project. One is to send it directly by speaking the mobile number and the message: the user taps the mic and speaks the number, which is then displayed in the number edit text. Only digits may be spoken into the number edit text. The same is done for the message edit text. The other is to pick the number from the contact list.
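The digits-only rule for the number field can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `SpokenSms`, `SmsSender`, and `sendIfValid` are hypothetical names, and the whitespace stripping assumes the recognizer may return digits in spaced groups. On Android the `SmsSender` role would typically be filled by a call to `android.telephony.SmsManager.sendTextMessage`, which is left out so that the sketch runs as plain Java.

```java
final class SpokenSms {
    /** Hypothetical transport abstraction; stands in for the platform SMS API. */
    interface SmsSender {
        void send(String number, String message);
    }

    /** Removes whitespace the recognizer may insert between digit groups (assumption). */
    static String normalizeNumber(String recognized) {
        return recognized == null ? "" : recognized.replaceAll("\\s+", "");
    }

    /** True only if the text consists of one or more digits. */
    static boolean isValidNumber(String number) {
        return number.matches("\\d+");
    }

    /**
     * Sends the message only when the recognized number passes validation.
     * Returns false (and sends nothing) for an invalid contact, e.g. a spoken name.
     */
    static boolean sendIfValid(String recognizedNumber, String message, SmsSender sender) {
        String number = normalizeNumber(recognizedNumber);
        if (!isValidNumber(number)) {
            return false;
        }
        sender.send(number, message);
        return true;
    }
}
```

With this shape, the recognized text for the number field is checked before anything is sent, while the message text is passed through unchanged because the message box accepts any character.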

3. View and Add contacts:

         Instead of sending a message directly, the user can also add and view contacts. Contacts are added by speaking them, and all contacts are displayed in a list view. Tapping an entry in the list triggers text-to-speech conversion, and the entry is read out to the user.
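One possible shape for the contact list is sketched below. It is a hedged illustration only: `ContactBook`, `Contact`, and `spokenForm` are hypothetical names, and separating the digits with spaces so that a text-to-speech engine reads them one by one is an assumption about how the entry might be prepared. The call into the Android text-to-speech engine itself is omitted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ContactBook {
    /** A stored contact: a display name and a digits-only number. */
    static final class Contact {
        final String name;
        final String number;

        Contact(String name, String number) {
            this.name = name;
            this.number = number;
        }
    }

    private final List<Contact> contacts = new ArrayList<>();

    /** Adds a contact if the name is non-empty and the number is digits only. */
    boolean add(String name, String number) {
        if (name == null || name.trim().isEmpty() || number == null || !number.matches("\\d+")) {
            return false;
        }
        contacts.add(new Contact(name.trim(), number));
        return true;
    }

    /** Read-only view of all contacts, in insertion order, for display in a list. */
    List<Contact> all() {
        return Collections.unmodifiableList(contacts);
    }

    /** Builds the phrase to hand to a TTS engine, with digits separated by spaces. */
    static String spokenForm(Contact c) {
        StringBuilder sb = new StringBuilder(c.name).append(", ");
        for (int i = 0; i < c.number.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(c.number.charAt(i));
        }
        return sb.toString();
    }
}
```

For a contact named "Ravi" with number "9876" (made-up values), `spokenForm` yields "Ravi, 9 8 7 6", which would be the text passed to the speech engine when the list entry is tapped.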
