Title: | A Novel Concept to Improve the Redistribution Process for Language Models |
Author: | Huang, Feng-Long |
Keywords: | Language model; Smoothing method; Good-Turing; Cross entropy; Non-uniform redistribution |
Journal/Conference: | 2005 NCS Conference |
Abstract: | This paper proposes a new concept for improving smoothing methods in language models, based on a non-uniform redistribution of probability to novel events. Smoothing methods consist of two basic processes: 1) discounting and 2) redistribution. Most smoothing methods assign a uniform probability to every unseen event; we instead propose a new technique that improves the redistribution process. By referring to the probabilistic behavior of all seen events, our method redistributes probability to novel events non-uniformly. The proposed technique is applied to the well-known and frequently used Good-Turing smoothing method. Empirical results are presented and analyzed for two n-gram models. The technique clearly improves the smoothing method, especially at higher unseen-event rates. |
Date: | 2006-10-13T08:15:19Z |
Collection: | 2005 NCS (National Computer Symposium) |
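
The two processes named in the abstract, discounting and redistribution, can be illustrated with a minimal Python sketch of standard Good-Turing smoothing using the uniform redistribution that the abstract describes as the common baseline. This is not the paper's method: the non-uniform redistribution is not specified in this record and is not implemented here. The function name `good_turing_uniform`, the toy data, the fallback to the raw count when no event has count r + 1, and the renormalization of seen events are all assumptions made for illustration.

```python
from collections import Counter


def good_turing_uniform(counts, vocab):
    """Good-Turing discounting with UNIFORM redistribution (baseline sketch).

    counts: mapping from event to its observed count
    vocab:  iterable of all possible events (seen and unseen)
    Returns a dict mapping every event in vocab to a probability.
    """
    seen = {w: r for w, r in counts.items() if r > 0}
    total = sum(seen.values())                 # N: number of observations
    freq_of_freq = Counter(seen.values())      # N_r: number of events seen r times
    unseen = [w for w in vocab if w not in seen]

    # Discounting: Good-Turing reserves N_1 / N of the mass for unseen events.
    unseen_mass = freq_of_freq.get(1, 0) / total if unseen else 0.0

    # Adjusted counts r* = (r + 1) * N_{r+1} / N_r.
    # Assumption: keep the raw count r when N_{r+1} is zero.
    adjusted = {}
    for w, r in seen.items():
        n_r = freq_of_freq[r]
        n_r1 = freq_of_freq.get(r + 1, 0)
        adjusted[w] = (r + 1) * n_r1 / n_r if n_r1 > 0 else float(r)

    # Assumption: rescale seen events so they share exactly 1 - unseen_mass.
    adjusted_total = sum(adjusted.values())
    probs = {w: (1.0 - unseen_mass) * c / adjusted_total
             for w, c in adjusted.items()}

    # Redistribution: every unseen event receives the same share.
    for w in unseen:
        probs[w] = unseen_mass / len(unseen)
    return probs


# Toy unigram example (hypothetical data).
toy_counts = Counter("a a a b b c d".split())
toy_vocab = "a b c d e f g h".split()
toy_probs = good_turing_uniform(toy_counts, toy_vocab)
```

In this sketch the last loop is the redistribution step; a non-uniform scheme such as the one the abstract proposes would replace the equal share `unseen_mass / len(unseen)` with weights that differ per unseen event.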
Files in this item:
File | Description | Size | Format |
---|---|---|---|
ce07ncs002006000231.pdf | | 96.91 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.