Please use this identifier to cite or link to this item: http://ds.libol.fpt.edu.vn/handle/123456789/3676

Title: English to Vietnamese subtitle generation system
Other Titles: Hệ thống tạo phụ đề tiếng Việt (Vietnamese subtitle generation system)
Authors: Phan, Duy Hùng
Lê, Hoàng Phúc
Ngô, Anh Kiệt
Kiều, Minh Duy
Keywords: Artificial Intelligence
Computer Science
Natural Language Processing
Subtitle Generation
Issue Date: 2023
Publisher: FPTU Hà Nội
Abstract: Recently, applications of artificial intelligence (AI) have become increasingly popular and diverse across many domains of life, including personal healthcare, education, and automation in logistics and production. AI applications in education have proven their value, especially in giving access to up-to-date knowledge from texts, manuscripts, textbooks, lectures, and videos in foreign languages. So far, most applications and models have worked well for English, Spanish, and other popular European languages. We asked whether we could create an application that helps Vietnamese people expand their knowledge or simply understand foreign-language content. In this thesis, we build an application that generates Vietnamese subtitles from audio, video, and YouTube URLs, along with some auxiliary functions, using well-performing models throughout the process. The application is divided into three main tasks: enhancement, recognition, and translation. In the enhancement module, background noise in the input video or audio is minimized and human speech is enhanced before the signal is passed to recognition. In recognition, the application uses an automatic speech recognition model to transcribe spoken English into text. The transcript is then passed to the translation module, where it is processed by a neural network and a machine translation algorithm to create the Vietnamese subtitles. This thesis provides an effective solution for English-Vietnamese subtitle generation, and its outcomes demonstrate high accuracy and quality. In addition, a dataset was collected to evaluate both models and services. Our validation dataset includes 40 hours of audio along with subtitles in Vietnamese and English. Among the models tested on our dataset, Whisper-medium achieved a notable WER score of 1.0065.
For the translation part, we devised a preprocessing method for the input text. Combining that method with the EnviT5 model improves the BLEU score by at least 0.5% compared to the original, and shows a particular advantage over alternatives such as Google Translate and Amazon Web Services.
URI: http://ds.libol.fpt.edu.vn/handle/123456789/3676
Appears in Collections: Khoa học máy tính - Trí tuệ nhân tạo (Computer Science - Artificial Intelligence)
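
The abstract reports a word error rate (WER) for the speech recognition stage. As a reminder of how that metric is commonly defined, here is a minimal sketch; it is not the authors' evaluation code. WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The function name `word_error_rate` is hypothetical, and the lowercasing and whitespace tokenization are assumptions; the record does not describe the text normalization used in the thesis.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to reach an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors while the denominator is the reference length, this definition is not bounded by 1: a hypothesis much longer than its reference can score above 1.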

Files in This Item:

File                               Description   Size       Format
Report_English-to-Vietnamese.pdf   Free          1.87 MB    Adobe PDF
Slide_English-to-Vietnamese.pdf    Free          14.22 MB   Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
