Title: | Mining Maximal Frequent Itemsets in Data Streams |
Authors: | Li, Hua-Fu; Lee, Suh-Yin; Shan, Man-Kwan |
Keywords: | Data mining; data streams; maximal frequent itemsets; online algorithm; single-pass mining |
Journal/Conference name: | 2004 ICS Conference |
Abstract: | Mining streaming data brings not only unique opportunities but also difficult new challenges for online algorithm design, such as a single scan of the streaming data, bounded memory requirements, fast processing time, and short response time. In this paper, we propose a single-pass algorithm, called DSM-MFI (Data Stream Mining for Maximal Frequent Itemsets), to mine the set of all maximal frequent itemsets (MFI) in a continuous stream of transactions. In a single scan of the incoming streaming data, an in-memory summary data structure, called IPM-Forest (Item-Prefix Maximal-itemset Forest), is built to store all the information about the maximal frequent itemsets of the data stream. In DSM-MFI, two efficient mechanisms, namely Transaction Item-prefix Projection (TIP) and Top-Down Maximal frequent itemset Finding (TDMF), are used to improve the performance of mining MFI in data streams. More specifically, TIP makes the space requirement of DSM-MFI predictable and reconstructs the smallest parts of the IPM-Forest. In addition, TDMF finds all maximal frequent itemsets by a “MaxTo3” approach from the IPM-Forest generated so far. To the best of our knowledge, DSM-MFI is the first algorithm for online mining of maximal frequent patterns in continuous data streams. |
Date: | 2006-10-16T05:43:21Z |
Collection: | 2004 ICS (International Computer Symposium) |
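
The abstract states the mining target (all maximal frequent itemsets) without defining it. The following is a minimal sketch of that definition only: it is a brute-force, multi-scan illustration over a small in-memory batch, not DSM-MFI, and it uses neither IPM-Forest, TIP, nor TDMF. The function names, the sample transactions, and the use of an absolute count for `min_support` are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset contained in at least
    min_support transactions (brute force, level by level)."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[cset] = count
                found = True
        if not found:
            # No frequent itemset of this size, so none of a larger size
            # can exist: every subset of a frequent itemset is frequent.
            break
    return result


def maximal_frequent_itemsets(transactions, min_support):
    """A frequent itemset is maximal if no proper superset is frequent."""
    frequent = frequent_itemsets(transactions, min_support)
    return {s for s in frequent if not any(s < other for other in frequent)}


# Hypothetical batch of transactions (each a set of items).
stream = [
    frozenset("abc"),
    frozenset("abd"),
    frozenset("ab"),
    frozenset("cd"),
    frozenset("abc"),
]
mfi = maximal_frequent_itemsets(stream, min_support=2)
print(sorted("".join(sorted(s)) for s in mfi))  # ['abc', 'd']
```

With a support threshold of 2, the frequent itemsets are a, b, c, d, ab, ac, bc, and abc; only abc and d have no frequent proper superset, so the maximal set is a compact summary from which every frequent itemset can be derived as a subset.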
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
ce07ics002004000095.pdf | | 423.67 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.