A Multimodal Approach for Audiovisual Data Segmentation and Annotation

Zhang, Tong; Shih, Hsuan-Huei; Kuo, C.C.

Full metadata record

DC Field	Value	Language
dc.contributor.author	Zhang, Tong
dc.contributor.author	Shih, Hsuan-Huei
dc.contributor.author	Kuo, C.C.
dc.date.accessioned	2009-06-02T07:22:40Z
dc.date.accessioned	2020-05-29T06:17:13Z	-
dc.date.available	2009-06-02T07:22:40Z
dc.date.available	2020-05-29T06:17:13Z	-
dc.date.issued	2006-10-27T07:33:54Z
dc.date.submitted	1999-12-20
dc.identifier.uri	http://dspace.fcu.edu.tw/handle/2377/2761	-
dc.description.abstract	While most approaches to video segmentation and indexing focus on the pictorial part, the accompanying audio stream also contains significant clues. Only by combining audio and visual information can a fully functional system for video content parsing be achieved. Based on an investigation of the data structures of different video types, we present in this paper a scheme for the segmentation and annotation of audiovisual sequences that includes tools for both audio and visual content analysis. In the proposed system, the video data is segmented into audio scenes and visual shots by detecting abrupt changes in the audio and visual features, respectively. Each audio scene is then indexed as one of the basic audio types, such as speech, music, song, environmental sound, or speech with music background, while each visual shot is represented by keyframes and associated image features. An index table is generated automatically for each video clip by integrating the outputs of the audio and visual analyses. Experimental results show that the proposed scheme achieves more meaningful and robust indexing of video content than previous work.
dc.description.sponsorship	Tamkang University, Taipei County
dc.format.extent	8p.
dc.format.extent	1863600 bytes
dc.format.mimetype	application/pdf
dc.language.iso	zh_TW
dc.relation.ispartofseries	1999 NCS Conference
dc.subject	audiovisual data segmentation and indexing
dc.subject	audio content analysis
dc.subject	visual content analysis
dc.subject	video database management
dc.subject	information filtering and retrieval
dc.subject.other	Image/Multimedia Database
dc.title	A Multimodal Approach for Audiovisual Data Segmentation and Annotation
Appears in Collections:	1999 NCS National Computer Symposium

Files in This Item:

File	Description	Size	Format
ce07ncs001999000043.pdf		1.82 MB	Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Feng Chia University Institutional Repository