Google Video Looks At ‘Science Of Talking With Computers’
Language. Easy for humans to understand (most of the time), but not so easy for computers. This is a short film about speech recognition, language understanding, neural nets, and using our voices to c...
Psychoacoustics
Psychoacoustics is the scientific study of sound perception. More specifically, it is the branch of science studying the psychological and physiological responses associated with sound (including spee...
Speaker recognition
Speaker recognition is the identification of the person who is speaking by characteristics of their voice (voice biometrics), also called voice recognition. There is a difference between speaker recog...
Speech processing
Speech processing is the study of speech signals and the processing methods of these signals. The signals are usually processed in a digital representation, so speech processing can be regarded as a s...
Speech recognition
In computer science and electrical engineering, speech recognition (SR) is the translation of spoken words into text. It is also known as "automatic speech recognition" (ASR), "computer speech recogni...
Speech synthesis
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-t...
Voice Navigator
The Voice Navigator was the first voice recognition device for command and control of a graphical user interface (Patent no. 5377303). The system was originally designed for the Apple Macintosh Plus a...
Voice risk analysis
The Voice Risk Analysis or VRA, not to be confused with Voice Stress Analysis (VSA), is marketed in the UK as a lie detection technology developed by Digilog. It is said to work by detecting the chang...
Equal-loudness contour
An equal-loudness contour is a measure of sound pressure (dB SPL), over the frequency spectrum, for which a listener perceives a constant loudness when presented with pure steady tones. The unit of me...
Vlingo
Vlingo is an intelligent software assistant and knowledge navigator functioning as a personal assistant application for Symbian, Android, iPhone, BlackBerry, and other smartphones. The application us...
VoxForge
VoxForge is a free speech corpus and acoustic model repository for open source speech recognition engines. VoxForge was set up to collect transcribed speech to create a free GPL speech corpus for use w...
PerSay
PerSay was an Israeli start-up company specializing in Voice Biometrics technology. Founded in 2000, its voice biometrics systems are used in the banking, insurance, governments, and telecommunication...
MBROLA
MBROLA is a speech synthesis algorithm, the software that implements it (distributed at no financial cost but in binary form only), and a worldwide collaborative project. The MBROLA project web page provide...
Voice stress analysis
Voice Stress Analysis (VSA) technology is said to record psychophysiological stress responses that are present in the human voice when a person suffers psychological stress in response to a stimulus (...
Mel scale
The mel scale, named by Stevens, Volkmann, and Newman in 1937, is a perceptual scale of pitches judged by listeners to be equal in distance from one another. The reference point between this scale and...
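As an illustration of how a frequency in hertz maps onto a perceptual pitch scale, here is a minimal Python sketch. It assumes one commonly used analytic approximation, m = 2595 log10(1 + f/700), which is not given in the text above; other mel formulas exist, and the function names `hz_to_mel` and `mel_to_hz` are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in hertz to mels.

    Assumption: uses the approximation m = 2595 * log10(1 + f / 700),
    one of several formulas in circulation for the mel scale.
    """
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Invert hz_to_mel under the same assumed formula."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


if __name__ == "__main__":
    # Equal steps in hertz shrink in mels as frequency rises.
    for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
        print(f"{f:7.1f} Hz -> {hz_to_mel(f):8.2f} mel")
```

Under this approximation, a 1000 Hz tone lands very close to 1000 mels, and the spacing between successive octaves in mels narrows at higher frequencies.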
Source–filter model of speech production
The source–filter model of speech production models speech as a combination of a sound source, such as the vocal cords, and a linear acoustic filter, the vocal tract (and radiation characteristic). A...
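The idea of a source passed through a linear filter can be sketched in a few lines of Python. This is only a toy under stated assumptions: the source is an idealized impulse train standing in for vocal cord pulses, and the vocal tract is reduced to a single two-pole resonator (one "formant"). The function names, the 8 kHz sample rate, and the formant values are hypothetical choices for illustration, not taken from the text above.

```python
import math


def impulse_train(n_samples: int, period: int) -> list[float]:
    """Idealized source: one unit impulse every `period` samples."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def resonator(x: list[float], freq_hz: float, bandwidth_hz: float,
              sample_rate: float) -> list[float]:
    """Linear filter: two-pole resonator y[n] = x[n] + a1*y[n-1] + a2*y[n-2].

    The pole radius r < 1 is set from the bandwidth, so the filter is stable.
    """
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    y = []
    y1 = y2 = 0.0  # previous two output samples
    for sample in x:
        out = sample + a1 * y1 + a2 * y2
        y.append(out)
        y2, y1 = y1, out
    return y


if __name__ == "__main__":
    sample_rate = 8000.0
    # 100 Hz pulse rate at 8 kHz means one impulse every 80 samples.
    source = impulse_train(800, 80)
    # Hypothetical single formant at 500 Hz with 100 Hz bandwidth.
    speech_like = resonator(source, 500.0, 100.0, sample_rate)
    print(len(speech_like), max(abs(v) for v in speech_like))
```

Changing the impulse period alters the pitch of the source without touching the filter, and changing the resonator frequency alters the "vowel" colour without touching the source, which is the separation the model relies on.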
Selectable Mode Vocoder
Selectable Mode Vocoder (SMV) is a variable bitrate speech coding standard used in CDMA2000 networks. SMV provides multiple modes of operation that are selected based on input speech characteristics. The...
Vector sum excited linear prediction
Vector sum excited linear prediction (VSELP) is a speech coding method used in several cellular standards. The VSELP algorithm is an analysis-by-synthesis coding technique and belongs to the class of ...
Computer-assisted telephone interviewing
Computer-assisted telephone interviewing (CATI) is a telephone surveying technique in which the interviewer follows a script provided by a software application. It is a structured system of microdata ...
Text to Speech in Digital Television
Text-to-Speech in Digital television refers to digital television products that use speech synthesis (computer generated speech providing a product that “talks” to the end user) to enable access by b...
Vocoder
A vocoder (/ˈvoʊkoʊdər/, short for voice encoder) is an analysis and synthesis system, used to reproduce human speech. The vocoder was originally developed as a speech coder for telecommunications app...
Auditory processing disorder
Auditory processing disorder (APD), also known as central auditory processing disorder (CAPD), is an umbrella term for a variety of disorders that affect the way the brain processes auditory informati...
Microsoft Speech API
The Speech Application Programming Interface or SAPI is an API developed by Microsoft to allow the use of speech recognition and speech synthesis within Windows applications. To date, a number of vers...
RTAudio
RTAudio is a Microsoft produced adaptive wide-band speech codec. It is used by Microsoft Office Communications Server (OCS) and the related OCS clients (Microsoft Office Communicator, and Microsoft Li...
Windows Speech Recognition
Windows Speech Recognition is a speech recognition application included in Windows Vista, Windows 7 and Windows 8.
Windows Speech Recognition allows the user to control the computer by giving spec...