Title: | Survey of the Smoothing Issues on Mandarin Language Models |
Authors: | Huang, Feng-Long; Yu, Ming-Shing; Chiang, Yang-Kua |
Keywords: | Language models; smoothing methods; statistical behavior; entropy |
Journal/Conference: | 中華民國92年全國計算機會議 (National Computer Symposium, ROC year 92, i.e. 2003) |
Abstract: | We survey several smoothing methods commonly used in language models for Mandarin. Because of data sparseness, smoothing techniques are employed to re-estimate the probabilities of all events when computing the probability of occurrence. Among the well-known smoothing methods, Good-Turing is widely used. In this paper we propose a set of properties for analyzing the behavior of Good-Turing, and we propose two novel smoothing methods. Finally, we implement three n-gram models for Mandarin and analyze the entropy and related issues of Good-Turing, such as the cut-off value and the types of events. |
Date: | 2006-06-14T03:26:50Z |
Collection: | 2003 NCS 全國計算機會議 (National Computer Symposium) |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
OT_1342003236.pdf | | 71.74 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.