Title: | DESDL: A Data Extraction Service Description Language |
Authors: | Wu, I-Chen; Su, Jui-Yuan; Chen, Loon-Been; Chien, Kuang-Ting |
Keywords: | Data extraction; XML; HTML; XML-QL; XPath; WIDL; DESDL |
Journal/Conference name: | 2002 ICS Conference |
Abstract: | This paper presents the Data Extraction Service Description Language (DESDL), an XML-based language for describing data extraction services. A DESDL script describes a set of services, each of which extracts data from designated web pages and then saves the data into local databases or navigates to subsequent services. The language provides: (a) query expressions that specify the rules for extracting data from designated web pages; (b) multi-way navigation for traversing web pages; and (c) plug-in code, named DESDLet, with which users define how extracted data are processed, e.g., saving them into databases or using them to navigate to further pages. We also implement a system for DESDL and demonstrate it by building a price-comparison site that extracts product information from over 50 electronic-commerce sites in Taiwan. In our experience, one engineer needs only about one working day to write a DESDL script that extracts product information from one e-commerce web site, which greatly reduces the cost of maintaining such a site. |
Date: | 2006-10-24 |
Category: | 2002 ICS International Computer Symposium |
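The three features named in the abstract (query expressions, multi-way navigation, and plug-in code) can be illustrated with a minimal Python sketch. This is not DESDL syntax and not the authors' implementation: the `Service` and `Runner` classes, the in-memory `PAGES` site, and the reading of "multi-way navigation" as one page fanning out to several follow-up pages are all assumptions made for illustration. Query expressions are approximated with the limited XPath subset supported by Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical in-memory "web site" (URL -> well-formed XHTML), so the sketch
# runs without network access.
PAGES: Dict[str, str] = {
    "/list": "<html><body>"
             "<a class='item' href='/p/1'>A</a>"
             "<a class='item' href='/p/2'>B</a>"
             "</body></html>",
    "/p/1": "<html><body><h1>Widget</h1>"
            "<span class='price'>100</span></body></html>",
    "/p/2": "<html><body><h1>Gadget</h1>"
            "<span class='price'>250</span></body></html>",
}

Record = Dict[str, List[str]]


@dataclass
class Service:
    """One extraction service: named queries plus a plug-in callback."""
    name: str
    # field name -> (path expression, attribute to read, or None for text)
    queries: Dict[str, Tuple[str, Optional[str]]]
    # Plug-in code (standing in for a DESDLet): decides what to do with
    # the extracted record.
    plugin: Callable[[Record, "Runner"], None]


class Runner:
    def __init__(self, services: List[Service]) -> None:
        self.services = {s.name: s for s in services}
        self.database: List[dict] = []  # stands in for a local database

    def run(self, service_name: str, url: str) -> None:
        service = self.services[service_name]
        root = ET.fromstring(PAGES[url])
        record: Record = {}
        for field, (path, attr) in service.queries.items():
            nodes = root.findall(path)
            record[field] = [
                n.get(attr, "") if attr else (n.text or "") for n in nodes
            ]
        service.plugin(record, self)


def follow_products(record: Record, runner: Runner) -> None:
    # Navigation: the listing page fans out to one product page per link.
    for url in record["links"]:
        runner.run("product", url)


def save_product(record: Record, runner: Runner) -> None:
    # Storage: save the extracted fields as one row.
    runner.database.append(
        {"name": record["name"][0], "price": int(record["price"][0])}
    )


def demo() -> List[dict]:
    listing = Service(
        "listing", {"links": (".//a[@class='item']", "href")}, follow_products
    )
    product = Service(
        "product",
        {"name": (".//h1", None), "price": (".//span[@class='price']", None)},
        save_product,
    )
    runner = Runner([listing, product])
    runner.run("listing", "/list")
    return runner.database
```

Calling `demo()` runs the hypothetical `listing` service, which follows each extracted link into the `product` service, which in turn stores one row per product. Keeping the extraction rules declarative and pushing all side effects into the callback mirrors the separation the abstract describes between query expressions and DESDLet code.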
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
ce07ics002002000249.PDF | | 432.67 kB | Adobe PDF | View/Open |
Items in the DSpace system are protected by copyright, with all rights reserved, unless their copyright terms specifically indicate otherwise.