Title: An AdaBoost Approach to Detecting and Extracting Texts from Natural Scene Images
Authors: Fahn, Chin-Shyurng
Liu, Chia-Wei
Keywords: AdaBoost algorithm
connected component labeling
text detection
text extraction
natural scene image
Journal/Conference: 2008 ICS Conference
Abstract: In this paper, we present a new connected-component-based method for detecting and extracting text from natural scene images using AdaBoost techniques. First, we apply the Canny operator and devise a two-phase, two-scan connected-component labeling algorithm to locate candidate character blocks precisely. Next, several basic filtering rules derived from the characteristics of text are used to screen out non-text blocks. Reducing the number of candidate character blocks speeds up the text classifier and improves its accuracy. We then distinguish text blocks from the remaining candidate blocks with a strong classifier trained by an AdaBoost algorithm, and group the detected characters into words. Compared with other machine learning algorithms, AdaBoost converges faster during training, so we can update the training samples to cover a wide range of conditions at low computational cost. Finally, we adopt a binarization method with an adaptive threshold to extract the text regions, which allows text to be extracted successfully even under uneven illumination. Experimental results show that the text recall and precision rates both exceed 95% and that the execution speed of the system is satisfactory.
Date: 2009-02-12T06:25:46Z
Category: 2008 ICS (International Computer Symposium)
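
Illustrative sketch: the abstract describes training a strong classifier with AdaBoost to separate text blocks from non-text blocks. The record does not say which weak learners or features the authors used, so the following is only a minimal sketch of the standard discrete AdaBoost procedure. It assumes threshold decision stumps as weak learners and labels in {+1, -1}; the function names (train_stump, adaboost_train, adaboost_predict) and the toy one-feature data are hypothetical and are not taken from the paper.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump.

    A stump predicts `polarity` when x[feature] >= threshold, else -polarity.
    Assumption: decision stumps are the weak learners (not stated in the paper).
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for polarity in (1, -1):
                preds = [polarity if x[j] >= thr else -polarity for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, x):
    _, j, thr, polarity = stump
    return polarity if x[j] >= thr else -polarity

def adaboost_train(X, y, rounds=10):
    """Train a weighted ensemble of stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start from uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error to avoid division by zero or log(0).
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)   # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase weights of misclassified samples, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data: one hypothetical feature per candidate block, +1 = text, -1 = non-text.
X = [[float(i)] for i in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost_train(X, y, rounds=3)
predictions = [adaboost_predict(model, x) for x in X]
```

On this toy set no single stump separates the two classes, but the weighted vote of three stumps does. In a pipeline like the one in the abstract, each row of X would instead hold features computed from one candidate character block.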

Files in this item:
File: ce07ics002008000137.pdf
Size: 233.55 kB
Format: Adobe PDF

