Abstract
In this paper we report recent experiments on an isolated, speaker-independent (SI) recognition task under car-noise conditions. This work extends results we previously obtained in a speaker-dependent (SD) context [2][3][12][13]. The main topic of interest is the use of root-homomorphic deconvolution schemes [11], since it has been shown that such a method gives an optimal solution to the deconvolution problem of voiced speech under the constraint of minimising the effects of background noise [3]. In an SI context, the first complication comes from inter-speaker variability. To minimise this effect, a low-dimensional MEL-scaled filter bank [5] is generally used. It is evident that such processing destroys the convolutional structure of speech; consequently, root-homomorphic deconvolution is no longer justified. The goal of this paper is to give experimental results for root-homomorphic schemes combined with a MEL filter bank representation. In an attempt to reduce the effects of noise on the parameters, the use of high-pass filtering techniques was recently suggested [9][10], on the assumption that the stationary component of the noise can be separated from speech. We show here that this filtering technique can be interpreted as a spectral subtraction technique in which the noise model is estimated continuously, so that no voice activity detection (VAD) is required (VAD-free spectral subtraction). Under this interpretation, we show that the technique is less efficient than spectral subtraction in which the noise model is estimated during non-speech activity (VAD-dependent spectral subtraction). The experiments are carried out using an HMM-based isolated digit recogniser.
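The interpretation of high-pass filtering as spectral subtraction can be illustrated with a minimal sketch. It is not the implementation used in the paper: the function names, the first-order exponential average, the smoothing constant `alpha`, the frame layout (one row of filter-bank energies per frame) and the assumption that the first frame contains only noise are all hypothetical choices made for illustration. Flooring of negative values, which practical spectral subtraction usually applies, is omitted for brevity.

```python
import numpy as np


def vad_free_subtraction(frames, alpha=0.95):
    """High-pass filter each channel by subtracting a running average.

    The running average plays the role of a noise estimate that is
    updated on every frame, speech or not (hypothetical sketch).
    """
    frames = np.asarray(frames, dtype=float)
    noise = frames[0].copy()          # assumption: first frame is noise only
    out = np.empty_like(frames)
    for t, frame in enumerate(frames):
        noise = alpha * noise + (1.0 - alpha) * frame   # always updated
        out[t] = frame - noise
    return out


def vad_dependent_subtraction(frames, is_speech, alpha=0.95):
    """Same subtraction, but the noise estimate is frozen during speech.

    `is_speech` is a boolean array with one entry per frame, assumed to
    come from an external voice activity detector (hypothetical sketch).
    """
    frames = np.asarray(frames, dtype=float)
    noise = frames[0].copy()          # assumption: first frame is noise only
    out = np.empty_like(frames)
    for t, frame in enumerate(frames):
        if not is_speech[t]:
            noise = alpha * noise + (1.0 - alpha) * frame  # non-speech only
        out[t] = frame - noise
    return out


# Toy input: stationary noise of level 2 with a speech burst of level 5.
frames = np.full((20, 4), 2.0)
frames[10:15] += 5.0
is_speech = np.zeros(20, dtype=bool)
is_speech[10:15] = True

vad_free = vad_free_subtraction(frames)
vad_dependent = vad_dependent_subtraction(frames, is_speech)
```

On this toy input the continuously updated estimate absorbs part of the speech burst, so the VAD-free output underestimates the burst, whereas the VAD-dependent output recovers it. This is only an intuition for the comparison stated in the abstract, not a reproduction of its experiments.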
| Original language | English |
|---|---|
| Pages | 1255-1258 |
| Number of pages | 4 |
| Publication status | Published - 1 Jan 1993 |
| Externally published | Yes |
| Event | 3rd European Conference on Speech Communication and Technology, EUROSPEECH 1993 - Berlin, Germany; Duration: 22 Sept 1993 → 25 Sept 1993 |
Conference
| Conference | 3rd European Conference on Speech Communication and Technology, EUROSPEECH 1993 |
|---|---|
| Country/Territory | Germany |
| City | Berlin |
| Period | 22/09/93 → 25/09/93 |