Please use this identifier to cite or link to this item: https://hdl.handle.net/10321/3794
DC Field | Value | Language
dc.contributor.advisor | Reddy, Seren | -
dc.contributor.author | Shezi, Nokwanda | en_US
dc.date.accessioned | 2022-01-19T10:56:29Z | -
dc.date.available | 2022-01-19T10:56:29Z | -
dc.date.issued | 2021-12-01 | -
dc.identifier.uri | https://hdl.handle.net/10321/3794 | -
dc.description | Submitted in fulfillment of the degree of Master of Engineering at the Department of Electronic and Computer Engineering in the Faculty of Engineering and the Built Environment at the Durban University of Technology, 2021. | en_US
dc.description.abstract | A key component of artificial intelligence is human-to-machine communication. Such communication has been realised through virtual assistants such as Apple's Siri, Google Now and Amazon's Alexa. This technology is made possible through Automatic Speech Recognition (ASR). Only in recent years have previously marginalised or developing countries started researching ASR for their indigenous languages. This research focuses on ASR in isiZulu, which is one of South Africa's most spoken indigenous languages. The research involves two main fields of study, i.e., digital signal processing (DSP) and machine learning (ML). DSP was applied in word boundary estimation and feature extraction. Machine learning was used to convert the word boundary estimation problem to a classification problem as well as for word recognition. Word boundary estimation achieved an accuracy of 68.4%, which is on par with current research. Mel-frequency cepstral coefficients (MFCCs) were used for the feature extraction of the speech, and deep neural networks were chosen for the ML component. For the detection and classification of a word in a sentence, the trained neural network was tested by considering the effect of including and excluding explicit boundaries on the overall recognition. Word recognition accuracy with manually demarcated boundaries was 78.18%. In-sentence recognition accuracy without demarcated boundaries was 17.74%, while an accuracy of 23.28% was achieved with boundaries demarcated using classification. While in-sentence recognition accuracy was low for both algorithms, the accurately recognised words were determined by different heuristics. Other factors, such as the complex differences between the indigenous isiZulu language and other more commonly spoken languages, are also highlighted and further research avenues are proposed. | en_US
dc.format.extent | 132 p | en_US
dc.language.iso | en | en_US
dc.subject.lcsh | Automatic speech recognition | en_US
dc.subject.lcsh | Zulu language--Data processing | en_US
dc.subject.lcsh | Signal processing--Digital techniques | en_US
dc.subject.lcsh | Natural language processing (Computer science) | en_US
dc.subject.lcsh | Machine learning | en_US
dc.title | Automatic speech recognition of the isiZulu language | en_US
dc.type | Thesis | en_US
dc.description.level | M | en_US
dc.identifier.doi | https://doi.org/10.51415/10321/3794 | -
local.sdg | SDG17 | -
item.grantfulltext | open | -
item.cerifentitytype | Publications | -
item.fulltext | With Fulltext | -
item.languageiso639-1 | en | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.openairetype | Thesis | -
Appears in Collections: Theses and dissertations (Engineering and Built Environment)
Files in This Item:
File | Description | Size | Format
Nokwanda Shezi.pdf | | 12.19 MB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.