Sound and Speech Processing

Discrete Dynamic Programming Toolbox  
This zip file contains a set of functions for solving and analyzing dynamic programming problems with discrete state and action spaces. It also includes a PostScript file explaining the problem and the code.
Submitted: Nov 11, 1999
SINUS Measurement Toolbox (SMT) for MATLAB®  
Two families of data acquisition hardware with a common dynamic MATLAB® interface for the field of acoustics & vibration.
Submitted: Jul 15, 2004
Clear, Efficient Audio Signal Processing in ANSI C  
This paper presents recommendations for efficiently implementing signal processing algorithms in C without sacrificing clarity or portability.
Submitted: Dec 24, 2003
Sound Processing Kit  
Sound Processing Kit is an object-oriented class library for audio signal processing. The site contains the Sound Processing Kit online documentation and the up-to-date software distribution.
Submitted: Dec 24, 2003
Time and Pitch Scaling of Audio Signals  
A document on pitch scaling, a way to change the pitch of a signal without changing its length.
Submitted: Dec 24, 2003
Speech Coding  
These web pages describe the principles involved in speech coding, and details of commonly used coders. Included also are links to other related pages, and the source code of some common speech codecs.
Submitted: Dec 24, 2003
A Real Time 3D Signal Analysis/Synthesis Tool Based on the Short Time Fourier Transform  
This paper describes a system for audio analysis, modification, and synthesis, based on the Short Time Fourier Transform (STFT). The system is intended both as a tool for sound manipulation, and as a means to reinforce people's intuitions regarding the relationships between timbre and the harmonic structure of music and other audio signals, as conveyed via their spectrograms.
Submitted: Dec 24, 2003
Speech  
A list of Internet resources on speech processing research.
Submitted: Dec 24, 2003
Audio Effects - Frequently Asked Questions  
This document is intended to help anyone who has questions about audio effects. It was written after the same questions about audio effects were asked time and time again on Usenet newsgroups.
Submitted: Dec 24, 2003
A comparison of Internet audio compression formats  
This page compares various music compression formats used over the Internet: PCM, MPEG, GSM, ADPCM, VSC112, TrueSpeech, RealAudio, and others.
Submitted: Dec 24, 2003
Amazon.com: Digital Audio Processing Book  
This book explains digital techniques for processing sound in accessible language that any experienced programmer can understand, and provides C++ software tools for Windows 32-bit sound programming. The author explains the principles of digital signal processing (DSP) and the basic mechanisms of human sound perception, and provides practical DSP programming tricks.
Submitted: Dec 24, 2003
Digital Audio Filtering Software  
EchoFilter is a software program that turns a PC with a sound card into a sophisticated programmable digital filter and frequency counter.
Submitted: Dec 24, 2003
Intelligent Speech Analyser™ Home Page  
The main areas of application include: Phonetics, Phoniatrics, Logopedics, Audiology, Speech Analysis, Sound Analysis, Singing Analysis, Music Analysis, Music Instrument Analysis, Research on Children's Crying, Research on Lung Sounds and Heart Sounds, Good Radio Voice Analysis, and Sound Editing.
Submitted: Dec 24, 2003
Active Noise Control FAQ  
This FAQ discusses active noise control, a novel way of using basic physics to control noise and/or vibration. The FAQ has four purposes: Provide concise, accurate answers to common questions about active noise control. Dispel popular misconceptions about what active noise control can and cannot do. Refer interested readers to web links, magazine articles, technical references, and other sources of information. Stimulate public interest in acoustics.
Submitted: Dec 24, 2003
MN Library  
MNLib is a set of C++ libraries and programs for audio processing and music. It is designed with portable C++ code for use on Linux, Microsoft platforms and others.
Submitted: Dec 24, 2003
PAiA: How Vocoders Work  
This document explains the workings of vocoders (voice coders).
Submitted: Dec 24, 2003
Sig++: Musical Signal Processing in C++  
Sig++ is a set of C++ classes primarily intended for use in creating sound synthesis/filtering programs - primarily for the uninterpreted elegant environment of a Unix command line. The site contains its code, explanation, documentation, and more.
Submitted: Dec 24, 2003
The GSM 06.10 lossy speech compression  
Information, an algorithm description, and source code for the GSM 06.10 lossy speech compression library and its applications.
Submitted: Dec 24, 2003
codecs: Dave's Encoding Guide  
One day, procrastinating from doing schoolwork and about two days after Microsoft had released their new standard for compressing sound, called MS Audio 4, I decided to see just how good (or bad) the codec was and ran these tests, pitting it against MP3 and RealAudio, both of which it was supposed to crush. While I certainly don't think the quality is earthshattering as it does not scale well to provide CD-quality audio and has annoying high-frequency artifacts, it may give RealAudio a run for its money in the low-bitrate market. As it turned out, the report became pretty popular. Over 30,000 people are estimated to have viewed this report. A second report will be forthcoming, covering MP2, MP3, AAC, AC-3, QDesign, EPAC, RealAudio, MS Audio 4, and VQF (I decided to reserve CodecReview.com from Internic.)
Submitted: Dec 24, 2003
Sampled Sound Processors  
This article describes sampled sound processors and methods of storing and processing sound using digital and analog delays, filters, and similar devices.
Submitted: Dec 24, 2003
Aglaophone  
Aglaophone is a system of interconnectable modules for the recording, processing, and playback of real-time audio. Aglaophone has modules for recording and playback, spectrogram and oscilloscope display, reading and writing unformatted sample data, a gateway to an MP3 compressor/decompressor, filtering, and much more. If the modules included with the Aglaophone distribution don't do what you need, you can write your own.
Submitted: Dec 24, 2003
GEO - the Guitar Effects Oriented Web Page  
Home of the Guitar Effects FAQ, the Tube Amp FAQ, the Tube Amp Debugging Page, and "The Technology of" effects series.
Submitted: Dec 24, 2003
Harmony Central®: Effects Resources  
This site contains lots of resources on sound effects: information and comments, effects explanation and examples, links to effects manufacturers, effects retailers, classified ads, effects related software, schematics and construction tips.
Submitted: Dec 24, 2003
Digital Recording  
CONVERTING SOUND INTO NUMBERS - This article is an introduction to digital recording. It describes the effects of word size and sample rate on effective processing, and discusses error correction in digital recording and the benefits of digital sound.
Submitted: Dec 24, 2003
Speech Technology And Research (STAR) Laboratory  
The Speech Technology And Research (STAR) Laboratory at SRI is a world leader in speech technology, and is active in both technology creation and technology transfer. Nuance Communications is exploiting technology developed in the STAR Laboratory in over-the-telephone applications. SRI's DECIPHER technology is capable of recognizing natural, continuous speech without requiring the user to train the system in advance (i.e., speaker-independent recognition).
Submitted: Dec 24, 2003
Speech Technology Products for Military Applications  
NATO IST-03 is a research study group that focuses on the application of speech technology in the military environment. The purpose of these pages is to inform potential users of speech technology about the usability of commercially available systems. Results for systems tested in applicable environments are available for a variety of technologies.
Submitted: Dec 24, 2003
CSLU - center for spoken language understanding  
Mission: To make spoken language systems work. CSLU is an OGI Research Center that focuses on Spoken Language Technologies. Spoken Language Technologies enable people to interact with computers using speech, so that people can talk to computers (Speech Recognition), computers can talk to people (Speech Synthesis), computers know who is talking to them (Speaker Identification or Verification) or computers and people can talk to each other (Dialogue). The site includes a speech toolkit and language resources and explanations about spoken language technology and about CSLU.
Submitted: Dec 24, 2003
Implementation of a High-Quality Dolby* Digital Decoder Using MMX™ Technology  
This paper describes the research performed and the resultant techniques Intel used in creating its Dolby Digital decoder using MMX technology.
Submitted: Dec 24, 2003
Speech at CMU  
Carnegie Mellon University is dedicated to speech technology research, development, and deployment. CMU has a historic position in computational speech research, and continues to test the limits of the art.
Submitted: Dec 24, 2003
The Centre for Speech Technology Research (CSTR)  
CSTR is a multidisciplinary research group that undertakes application-oriented speech research. Its main work is in the areas of speech synthesis and speech recognition. CSTR has developed several large scale software systems, including the Festival speech synthesis system and the Edinburgh speech tools.
Submitted: Dec 24, 2003
Amazon.com: Digital Processing of Speech Signals Book  
The purpose of this text is to show how digital signal processing techniques can be applied to problems related to speech communication. The book gives an extensive description of the physical basis for speech coding, including Fourier analysis, digital representation, and digital and time-domain models of the waveform. It goes on to discuss homomorphic speech processing, linear predictive coding, and digital processing for machine communication by voice.
Submitted: Dec 24, 2003
Canonical WAVE File Format  
The WAVE file format is a subset of Microsoft's RIFF spec, which can include lots of different kinds of data. It was originally intended for multimedia files, but the spec is open enough to allow pretty much anything to be placed in such a file, and ignored by programs that read the format correctly. This description is not meant to be exhaustive, but to suggest simple ways of doing common tasks with waveform audio, and give some pointers to other sources of information.
Submitted: Dec 24, 2003
Signalogo by Vadim Schetinkin  
Signalgo is a DSP-supporting library written in Java. It provides means to manage signals (particularly, audio ones) in object-oriented fashion. It's quite simple, but functional.
Submitted: Dec 24, 2003
CCRMA  
The Stanford University Center for Computer Research in Music and Acoustics (CCRMA) is a multi-disciplinary facility where composers and researchers work together using computer-based technology both as an artistic medium and as a research tool. Areas of ongoing interest at CCRMA include: Composition, Synthesis Techniques and Algorithms, Physical Modeling, Signal Processing, Digital Recording and Editing, Psychoacoustics and Musical Acoustics, Real-Time Applications and Controllers, Collaborative Works with other Art Disciplines and Music Manuscripting by Computer.
Submitted: Dec 24, 2003
Ultra Power Effects Max II  
This program reads from the soundcard, transforms the sound, and writes it back to the soundcard. It is free and available for download.
Submitted: Dec 24, 2003
Julius Smith's Home Page  
Selected acoustical modeling publications and software by Julius Orion Smith - Associate Professor at the Center for Computer Research in Music and Acoustics (CCRMA), Department of Music, Stanford University.
Submitted: Dec 24, 2003
Sonic Flow  
This page contains an overview of a software project in signal processing. The project was originally carried out at Tampere University of Technology for the course 80961 Signal Processing Project in fall 1998. In the project we designed and implemented a program for designing and simulating signal processing networks. The design and the API are in the C++ programming language. The program allows the design and simulation of audio dataflow networks on ordinary modern computer workstations... The program is released under the GNU General Public License.
Submitted: Dec 24, 2003
Survey of the State of the Art in Human Language Technology  
This book surveys the state of the art of human language technology. The goal of the survey is to provide an interested reader with an overview of the field: the main areas of work, the capabilities and limitations of current technology, and the technical challenges that must be overcome to realize the vision of graceful human-computer interaction using natural communication skills.
Submitted: Dec 24, 2003
Machine Listening Group  
The Machine Listening Group is working towards bridging the gap between the current generation of audio technologies and those that will be needed for future interactive media applications. Research includes new description-based representations for audio that enable controllable, compact and computationally-efficient sound and music rendering and presentation.
Submitted: Dec 24, 2003


