Title: Automatic Clustering of Web News Based on Topics-Discovery
Authors: Chang, Hsi-Cheng
Lin, Jeen-Fong
Keywords: Document clustering, feature selection
keyword clustering
topic identification
Journal/Conference Name: 2007 NCS Conference
Abstract: This paper proposes a new method for the unsupervised clustering of large, high-dimensional sets of textual data. The system begins with a topics-discovery process, which determines the k groups of documents that have maximal intra-group similarity and are well scattered throughout the similarity space of the text collection. These k document groups are regarded as the central topics of the entire document collection. An intelligent feature selection algorithm is then applied to derive the features, called topic keywords, that best represent the topics. Finally, all documents in the collection are grouped into k clusters according to the topic keywords. The method offers a very efficient clustering operation and involves no human-defined thresholds, which means that no expert intervention is required. The experimental results indicate that this approach generated higher-quality clusters than many well-known document clustering algorithms.
Date: 2008-07-22T07:04:00Z
Collection: 2007 NCS National Computer Symposium
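
The three stages named in the abstract (topics-discovery, topic-keyword selection, keyword-based assignment) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract gives no formulas, so the seed-group heuristic (a document plus its nearest neighbours, kept only if disjoint from groups already chosen), the keyword score (document frequency inside the group minus outside), and the parameters `group_size` and `n_keywords` are all assumptions made for illustration. The paper states that its method needs no human-defined thresholds, which this sketch does not reproduce.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for real news preprocessing."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def discover_topics(vectors, k, group_size):
    """Greedy heuristic (assumed): pick up to k cohesive, non-overlapping seed groups.

    group_size must be at least 2 so that each group has a pair to score.
    """
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    candidates = []
    for i in range(n):
        # A candidate group is a document plus its most similar neighbours.
        members = sorted(range(n), key=lambda j: -sim[i][j])[:group_size]
        pairs = [(a, b) for a in members for b in members if a < b]
        cohesion = sum(sim[a][b] for a, b in pairs) / len(pairs)
        candidates.append((cohesion, members))
    candidates.sort(key=lambda c: -c[0])
    groups = []
    for _, members in candidates:
        if len(groups) == k:
            break
        # Crude proxy for "well scattered": reject groups sharing a document.
        if all(not set(members) & set(g) for g in groups):
            groups.append(members)
    return groups


def topic_keywords(vectors, groups, n_keywords):
    """Score terms by in-group minus out-of-group document frequency (assumed)."""
    keywords = []
    for members in groups:
        inside, outside = Counter(), Counter()
        for i, vec in enumerate(vectors):
            (inside if i in members else outside).update(vec.keys())
        n_out = max(len(vectors) - len(members), 1)
        score = {t: inside[t] / len(members) - outside[t] / n_out for t in inside}
        ranked = sorted(score, key=lambda t: (-score[t], t))
        keywords.append(set(ranked[:n_keywords]))
    return keywords


def assign(vectors, keywords):
    """Put each document in the cluster whose topic keywords it uses most."""
    labels = []
    for vec in vectors:
        overlap = [sum(vec[t] for t in kw) for kw in keywords]
        labels.append(max(range(len(keywords)), key=lambda c: overlap[c]))
    return labels


def cluster_news(docs, k, group_size=2, n_keywords=3):
    """Run the three stages; returns (cluster label per document, keyword sets)."""
    vectors = [Counter(tokenize(d)) for d in docs]
    groups = discover_topics(vectors, k, group_size)
    keywords = topic_keywords(vectors, groups, n_keywords)
    return assign(vectors, keywords), keywords
```

Calling `cluster_news(headlines, k=2)` on a small list of headline strings returns one cluster index per headline plus the keyword set chosen for each topic. Because the final assignment compares each document only against k short keyword sets rather than against every other document, that step is cheap, which is consistent with the efficiency claim in the abstract.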

Files in this item:
File                      Description  Size       Format
CE07NCS002007000009.pdf                216.09 kB  Adobe PDF

