Voice Activity Detection and Noise Estimation for Teleconference Phones
Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
When communicating via a teleconference phone, the transmitted signal (speech) needs to be crystal clear so that all participants can communicate well. However, many environmental conditions contaminate the signal with background noise, i.e. sounds not of interest for communication purposes, which impedes communication. Noise can be removed from the signal if its characteristics are known, so this work evaluated different ways of estimating the characteristics of the background noise. The focus was on using speech detection to define the noise, i.e. the non-speech part of the signal, but methods that rely on characteristics of the noisy speech signal rather than solely on speech detection were also included. The implemented techniques were compared with the current solution used by the teleconference phone in two respects: first, their speech detection ability, and second, their ability to correctly estimate the noise characteristics. The evaluation was based on simulations of the methods' performance in various noise conditions, ranging from harsh to mild environments. The proposed method improved on the existing solution, as implemented in this study, in speech detection ability, and it improved the noise estimate in certain conditions. It was also concluded that the proposed method would provide two sources of noise estimation, compared with the current single source, and it was suggested to investigate how using two noise estimators could affect performance.
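The idea of defining the noise as the non-speech part of the signal can be illustrated with a minimal sketch. It is not the method evaluated in the thesis: a simple energy-threshold detector stands in for the speech detector, and the function names, the `threshold` and `alpha` parameters, and the synthetic test signal are all assumptions made for illustration.

```python
import math
import random


def frame_powers(signal, frame_len):
    """Mean power of each non-overlapping frame of the signal."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def vad_gated_noise_estimate(powers, threshold=3.0, alpha=0.9, init_frames=5):
    """Return (speech_flags, noise_track) for a list of frame powers.

    A frame is flagged as speech when its power exceeds `threshold` times
    the current noise estimate. The noise estimate is updated by recursive
    averaging only on frames flagged as non-speech.
    """
    # Assumption: the first few frames contain noise only.
    noise = sum(powers[:init_frames]) / init_frames
    flags, track = [], []
    for p in powers:
        is_speech = p > threshold * noise
        if not is_speech:
            noise = alpha * noise + (1.0 - alpha) * p
        flags.append(is_speech)
        track.append(noise)
    return flags, track


# Synthetic demo: noise, then a tone buried in noise, then noise again.
random.seed(0)
frame_len = 160  # 20 ms at an assumed 8 kHz sampling rate
noise_only = [random.gauss(0.0, 0.1) for _ in range(20 * frame_len)]
speech_like = [
    math.sin(2 * math.pi * 200 * n / 8000) + random.gauss(0.0, 0.1)
    for n in range(10 * frame_len)
]
signal = noise_only + speech_like + noise_only[:10 * frame_len]

powers = frame_powers(signal, frame_len)
flags, track = vad_gated_noise_estimate(powers)
```

In this sketch the noise estimate is frozen while the detector reports speech, so a detector that misses speech would let speech energy leak into the estimate. That dependence is one reason estimators that do not rely solely on speech detection are of interest.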
Place, publisher, year, edition, pages
2015. 61 p.
Voice Activity Detection (VAD), noise estimation, continuous noise estimation (CNE), statistical model-based VAD, improved minima-controlled recursive average (IMCRA), Rangachari noise estimation (RNE or MCRA-2), likelihood ratio approach, signal-to-noise ratio dependent recursive average, teleconferencing
Identifiers
URN: urn:nbn:se:umu:diva-108395
OAI: oai:DiVA.org:umu-108395
DiVA: diva2:852787
Master of Science in Engineering and Management
Yu, Jun, Professor
Rydén, Patrik, Senior Lecturer