Full metadata record
DC Field Value Language
dc.contributor.author 李韋宏 zh_TW
dc.contributor.author 吳令堯 zh_TW
dc.date Academic Year 113, First Semester zh_TW
dc.date.accessioned 2025-01-21T04:02:49Z -
dc.date.available 2025-01-21T04:02:49Z -
dc.date.submitted 2025-01-21 -
dc.identifier.other D1155099, D1153539 zh_TW
dc.identifier.uri http://dspace.fcu.edu.tw/handle/2376/4978 -
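dc.description.abstract Chinese abstract: As modern life demands more intelligent services, giving robots more natural and intuitive interaction capabilities has become an important issue. This project combines generative artificial intelligence with robotics to build a highly intelligent interaction system for highly interactive settings such as educational assistance and medical care, improving the naturalness and efficiency of human-robot interaction. The system uses the Pepper robot as the interaction platform, with OpenAI's Whisper model for speech recognition and the GPT-4o model for natural language generation. Image data is captured by the robot's built-in camera, encoded, and transmitted together with the transcribed speech text to a server for processing. Prompt Engineering is used to keep the generated content logical and correctly formatted and to produce responses adapted to specific scenarios. The system performs well in speech recognition accuracy, in the fluency and logic of language generation, and in the appropriateness of the robot's executed motions. In particular, the multimodal combination of transcribed speech text and image data makes the interaction more natural and fluid. The system has potential for application in daily services, educational assistance, and medical care, and offers new possibilities for the development of intelligent robotics. zh_TW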
dc.description.abstract Abstract: With the increasing demand for intelligent services in modern life, making human-robot interaction more natural and intuitive has become a critical challenge. This study integrates generative artificial intelligence with robotic technology to develop a highly intelligent interaction system for highly interactive scenarios such as educational assistance and medical care, improving the naturalness and efficiency of human-robot interaction. The Pepper robot serves as the interaction platform, combining OpenAI's Whisper model for speech recognition with the GPT-4o model for natural language generation. Visual data is captured by the robot's built-in camera, encoded, and sent to the server, where it is processed together with the transcribed speech text. Prompt Engineering is used to ensure that the generated language output is logical, well formatted, and tailored to specific scenarios. The system performs well in speech recognition accuracy, in the fluency and coherence of language generation, and in the appropriateness of robot motion execution. Notably, the multimodal integration of transcribed speech text and visual data makes the interaction more natural and seamless. The system has potential for applications in daily services, educational assistance, and medical care, offering new possibilities for the advancement of intelligent robotic technologies. zh_TW
dc.description.tableofcontents Chinese Abstract (p. 1); Abstract (p. 2); Table of Contents (p. 3); List of Figures (p. 4); Chapter 1 Introduction (p. 5); 1.1 Preface (p. 5); 1.2 Research Motivation (p. 5); 1.3 Research Objectives (p. 6); Chapter 2 Technology and Literature Review (p. 7); 2.1 Generative Pre-trained Transformer (p. 7); 2.2 Large Multimodal Models (p. 7); 2.3 Prompt Engineering (p. 9); 2.4 Pepper robot (p. 10); 2.5 Whisper (p. 11); Chapter 3 Research Framework and Methods (p. 14); 3.1 System Architecture (p. 14); 3.2 Experimental Design and Test Methods (p. 15); 3.2.1 Experimental Design (p. 15); 3.2.2 Evaluation Metrics (p. 17); Chapter 4 Research Results (p. 18); Chapter 5 Conclusion and Future Work (p. 21); 5.1 Conclusion (p. 21); 5.2 Future Work (p. 21); References (p. 22) zh_TW
dc.format.extent 22p. zh_TW
dc.language.iso zh zh_TW
dc.rights openbrowse zh_TW
dc.subject 人形機器人 (humanoid robot) zh_TW
dc.subject 大型多模態模型 (large multimodal models) zh_TW
dc.subject Generative Pre-trained Transformer zh_TW
dc.subject Whisper zh_TW
dc.subject Humanoid Robot zh_TW
dc.subject Large Multimodal Models zh_TW
dc.title 大型多模態模型實現人形機器人智能交互系統開發 zh_TW
dc.title.alternative Development of an Intelligent Interaction System for Humanoid Robots Based on Large Multimodal Models zh_TW
dc.type UndergraReport zh_TW
dc.description.course Robotics zh_TW
dc.contributor.department Department of Automatic Control Engineering, College of Information and Electrical Engineering zh_TW
dc.description.instructor 黃, 清輝 -
dc.description.programme Department of Automatic Control Engineering, College of Information and Electrical Engineering zh_TW
Collection: College of Information and Electrical Engineering, Academic Year 113

Files in this item:
File Description Size Format
1131-13.pdf 1.03 MB Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.