Title: Text Categorization Using Latent Topics as Additional Features
Authors: Mizugai, Hiroshi
Paik, Incheon
Kanemoto, Shigeru
Keywords: Machine Learning
Text Categorization
Latent Topics
AdaBoost
Journal/Conference name: 2008 ICS Conference
Abstract: In feature selection for text categorization, some methods address word sense disambiguation by capturing synonymy and polysemy among the words in documents. One such approach uses a topic model to exploit the latent topics underlying the documents; PLSA and LDA have been proposed as representative models. In this paper, two feature sets that include both TF-IDF values and latent topic values extracted automatically from topic models were used for text categorization with AdaBoost. Their performance was then compared with that of TF-IDF features alone. On this basis, the study evaluates the strengths and weaknesses of the augmented features.
Date: 2009-02-12T02:15:47Z
Collection: 2008 ICS International Computer Symposium
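
The feature augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the labels, and the per-document topic proportions in `topics` are hypothetical (in practice the proportions would come from a fitted PLSA or LDA model), and the helper names `tfidf_vectors`, `augment`, `train_adaboost`, and `predict` are invented for illustration. The booster is a plain discrete AdaBoost over one-feature threshold stumps, which may differ from the variant used in the paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({w for doc in docs for w in doc})
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))  # document frequency
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[w] / len(doc) * math.log(n / df[w]) for w in vocab])
    return vocab, vectors

def augment(tfidf, topics):
    """Append each document's topic proportions to its TF-IDF vector."""
    return [x + t for x, t in zip(tfidf, topics)]

def train_adaboost(X, y, rounds=5):
    """Discrete AdaBoost with threshold stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for j in range(len(X[0])):
            for thr in sorted({x[j] for x in X}):
                for sign in (1, -1):
                    pred = [sign if x[j] >= thr else -sign for x in X]
                    err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign, pred)
        err, j, thr, sign, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, sign))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, pred)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (s if x[j] >= t else -s) for a, j, t, s in model)
    return 1 if score >= 0 else -1

# Hypothetical toy corpus: +1 = finance, -1 = nature.
docs = [
    ["stock", "market", "bank"],
    ["bank", "loan", "interest"],
    ["river", "bank", "water"],
    ["river", "fish", "water"],
]
labels = [1, 1, -1, -1]
# Hypothetical topic proportions, standing in for topic-model output.
topics = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]

vocab, tfidf = tfidf_vectors(docs)
X = augment(tfidf, topics)  # TF-IDF columns followed by topic columns
model = train_adaboost(X, labels)
predictions = [predict(model, x) for x in X]
print(predictions)
```

Training a second model on `tfidf` alone and scoring both on held-out documents would mirror the comparison the abstract describes; the toy data here is too small to say anything about which feature set performs better.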

Files in this item:
File: ce07ics002008000154.pdf
Size: 181.08 kB
Format: Adobe PDF

