Title: Browser-Oriented Data Extraction
Authors: Wu, I-Chen
Su, Jui-Yuan
Chen, Loon-Been
Keywords: data extraction
Internet
BODED
Journal/Conference: 2004 ICS Conference
Abstract: Traditionally, most researchers have used the URL-oriented model for data extraction. In this model, a system extracts URLs from pages and then uses the extracted URLs to access the next pages. However, a growing number of pages use script functions to navigate to the next page. Because it is hard to extract URLs from script programs, the URL-oriented model is ill-suited to such pages. To solve this problem, this paper proposes a new model, named the browser-oriented data extraction model. In this model, the system is built on top of a browser and accesses pages by simulating users' operations in the browser, which can also trigger script functions. In addition, this paper defines a scripting language, named the BODED (Browser-Oriented Data Extraction Description) Language, which instructs the system how to perform data extraction.
Date: 2006-10-16T05:35:58Z
Category: 2004 ICS International Computer Symposium
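
Illustrative note: the limitation of the URL-oriented model described in the abstract can be sketched as follows. This is a minimal sketch and not the paper's system or the BODED language; the class name `LinkExtractor`, the sample page, and the `example.com` address are all hypothetical. It shows that a crawler reading only `href` attributes finds the plain link but gets no usable URL from links that navigate through script functions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Hypothetical URL-oriented extractor: collects URLs found in <a href>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []          # links a URL-oriented crawler can follow
        self.script_links = 0   # links that only a script function can trigger

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = (attr.get("href") or "").strip()
        is_script_href = href.lower().startswith("javascript:")
        if href and href != "#" and not is_script_href:
            # A plain link: resolve it against the page address.
            self.urls.append(urljoin(self.base_url, href))
        elif is_script_href or "onclick" in attr:
            # Navigation is hidden inside a script call; no URL to extract.
            self.script_links += 1


# Hypothetical page: one plain link and two script-driven "next page" links.
SAMPLE_PAGE = """
<a href="/list?page=2">Next</a>
<a href="javascript:goPage(3)">Next (script)</a>
<a href="#" onclick="submitForm('next')">Next (form)</a>
"""

extractor = LinkExtractor("http://example.com/list")
extractor.feed(SAMPLE_PAGE)
print(extractor.urls)          # URLs the URL-oriented model can reach
print(extractor.script_links)  # links it cannot follow without a browser
```

In this sample, only one of the three links yields a URL. The other two would need something that actually executes the page's scripts, which is the gap the browser-oriented model in the abstract is meant to fill.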

Files in this item:
File: ce07ics002004000100.pdf (781.36 kB, Adobe PDF)

