Full metadata record
DC Field / Language
dc.contributor.author: Chin, Siew Wen Jr
dc.contributor.author: Seng, Kah Phooi Jr
dc.contributor.author: Ang, Li-Minn Jr
dc.contributor.author: Lim, King Hann Jr
dc.date.accessioned: 2011-01-21T01:05:26Z
dc.date.accessioned: 2020-08-06T07:15:43Z
dc.date.available: 2011-01-21T01:05:26Z
dc.date.available: 2020-08-06T07:15:43Z
dc.date.issued: 2011-01-21T01:05:26Z
dc.date.submitted: 2010-12-16
dc.identifier.uri: http://dspace.fcu.edu.tw/handle/2377/29927
dc.description.abstract: An improved voice activity detection (VAD) method based on the radial basis function neural network (RBF NN) and the continuous wavelet transform (CWT) for speech recognition systems is presented in this paper. The input speech signal is analyzed in fixed-size windows using Mel-frequency cepstral coefficients (MFCC). Within each windowed signal, the proposed RBF-CWT VAD algorithm detects speech/non-speech using the RBF NN. Once a transition from speech to non-speech, or vice versa, occurs, the energy changes of the CWT coefficients are calculated to localize the final coordinates of the speech start and end points. Conventional VAD with a binary classifier classifies the speech signal from the MFCC at the frame level, which easily captures undesired noise. The proposed RBF NN, aided by the CWT, instead analyzes the transformation of the MFCC at the window level, which offers better compensation for noisy signals. The simulation results show an improvement in the precision of speech detection and in the overall ASR rate, particularly under noisy conditions, compared with conventional VAD based on zero-crossing rate, short-term signal energy, and a binary classifier.
dc.description.sponsorship: National Cheng Kung University, Tainan
dc.format.extent: 6p.
dc.relation.ispartofseries: 2010 ICS Conference
dc.subject: voice activity detection
dc.subject: continuous wavelet transform
dc.subject: mel frequency cepstral coefficient
dc.subject: radial basis function
dc.subject.other: Image Processing, Computer Graphics, and Multimedia Technologies
dc.title: Improved Voice Activity Detection for Speech Recognition System
Collection: 1995 NCS National Computer Symposium
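
The abstract names zero-crossing rate and short-term signal energy as the features behind the conventional VAD used as a baseline. The record does not describe how that baseline was implemented, so the following is only a minimal sketch of the two frame-level features and a simple threshold rule. The function names, the frame length, both thresholds, and the "high energy and low zero-crossing rate" decision rule are illustrative assumptions, not taken from the paper.

```python
import math


def frame_features(samples, frame_len):
    """Return one (energy, zcr) pair per non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-term energy: mean of the squared amplitudes in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        features.append((energy, zcr))
    return features


def energy_zcr_vad(samples, frame_len=160, energy_thresh=0.01, zcr_thresh=0.25):
    """Label each frame True (speech-like) or False (non-speech).

    Assumed rule for illustration: a frame is speech-like when its energy is
    above energy_thresh and its zero-crossing rate is below zcr_thresh.
    """
    return [
        energy > energy_thresh and zcr < zcr_thresh
        for energy, zcr in frame_features(samples, frame_len)
    ]


if __name__ == "__main__":
    # Synthetic input: one silent frame, then one frame of a 200 Hz tone
    # sampled at 8 kHz (both values are arbitrary choices for the demo).
    silence = [0.0] * 160
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]
    print(energy_zcr_vad(silence + tone, frame_len=160))  # [False, True]
```

Because each frame is judged in isolation, a burst of loud low-frequency noise would pass this rule as speech, which is the kind of frame-level weakness the abstract says the window-level RBF-CWT approach is meant to compensate for.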

Files in this item:
File: 518_ICS2010.pdf
Size: 643.07 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.