Title: Survey of the Smoothing Issues on Mandarin Language Models
Authors: Huang, Feng-Long
Yu, Ming-Shing
Chiang, Yang-Kua
Keywords: Language models
smoothing methods
statistical behavior
entropy
Journal/Conference: 中華民國92年全國計算機會議 (National Computer Symposium, ROC year 92, i.e. 2003)
Abstract: We survey several smoothing methods commonly used in language models for Mandarin. Because of data sparseness, smoothing techniques are used to re-estimate the probabilities of all events when computing the probability of occurrence. Among the well-known smoothing methods, Good-Turing is widely used. In this paper we propose a set of properties for analyzing the behavior of Good-Turing smoothing, and we propose two novel smoothing methods. Finally, we implement three n-gram models for Mandarin and analyze the entropy and related issues of Good-Turing smoothing, such as the cut-off value and the types of events.
Date: 2006-06-14T03:26:50Z
Collection: 2003 NCS 全國計算機會議 (National Computer Symposium)

Files in this item:
File: OT_1342003236.pdf
Size: 71.74 kB
Format: Adobe PDF

