Recognition of speech emotion in Call Centre conversations in a multilingual environment

Zvarevashe, Kudakwashe

Please use this identifier to cite or link to this item: https://hdl.handle.net/10321/3801

DC Field	Value	Language
dc.contributor.author	Zvarevashe, Kudakwashe	en_US
dc.date.accessioned	2022-01-20T08:09:51Z	-
dc.date.available	2022-01-20T08:09:51Z	-
dc.date.issued	2021-10-15	-
dc.identifier.uri	https://hdl.handle.net/10321/3801	-
dc.description	Submitted in fulfilment of the requirements of the Degree of Doctor of Philosophy in Information Technology, Durban University of Technology, Durban, South Africa, 2021.	en_US
dc.description.abstract	The use of customer call centres has increased exponentially in the modern business world and is at the heart of marketing in the customer services industry. Previous studies have shown that the quality of service that customers receive from call centres paints a picture of how they view the company. Reliance on suggestion boxes to crowdsource customer views on call centre services is not adequate and, at times, may not give an accurate record of the services in question. Therefore, speech emotion recognition has been applied in customer call centres as a tool for evaluating customer service perception, emotion, and sentiment. This approach presents several advantages; for instance, the performance of call centre agents can be adequately scrutinised because their emotions can be automatically classified using machine learning methods for emotion recognition. In recent times, various techniques and methods have been used to develop robust speech emotion recognition systems for customer call centres, but the primary problem associated with these novel applications is that most of them do not perform well in multilingual environments. In addition, most of the proposed models do not properly recognise the fear archetype of emotion. The effectiveness of a speech emotion recognition system depends largely on the strength of the features used. Consequently, the purpose of this research was to discover the most efficacious features for recognising speech emotion in call centre conversations. Therefore, this thesis reports on the development of hybrid acoustic features based on spectral and prosodic descriptors.
The set of hybrid features proposed in this study comprises the logarithm of energy, fundamental frequency, zero-crossing rate, spectral roll-off point, spectral flux, spectral centroid, spectral compactness, spectral variability, fast Fourier transform, Mel frequency cepstral coefficients, and linear prediction cepstral coefficients. Furthermore, this thesis reports on the development of a novel stacked ensemble machine learning algorithm based on a combination of inducers and ensemble classifiers. The discovery of effective speech emotion features and the development of an efficient machine learning algorithm are essential stages of effective speech emotion recognition in call centre conversations. The proposed speech emotion recognition methods, based on feature extraction and feature classification for applications in call centre conversations, were verified and validated through a series of experiments. This was accomplished by testing the crafted hybrid acoustic features on five distinct speech emotion databases. The acoustic features were evaluated against deep learning auto-generated features and a hybrid of popular acoustic features. In addition, a set of four ensemble algorithms was evaluated against the newly invented stacked ensemble algorithm. The performance of the stacked ensemble algorithm developed in this study was analysed using the widely used statistical evaluation metrics of accuracy, precision, F-score, area under the receiver operating characteristic curve, and computation time. The results demonstrate that the newly developed stacked ensemble algorithm, coupled with the crafted hybrid acoustic features, has consistently performed better than many other state-of-the-art algorithms and speech features across various standard speech corpora.	en_US
dc.format.extent	224 p	en_US
dc.language.iso	en	en_US
dc.subject	Customer call centres	en_US
dc.subject	Speech emotion recognition	en_US
dc.subject.lcsh	Call centers--South Africa	en_US
dc.subject.lcsh	Emotion recognition	en_US
dc.subject.lcsh	Call centers--Customer services	en_US
dc.subject.lcsh	Customer services--Quality control	en_US
dc.subject.lcsh	Telephone in business	en_US
dc.subject.lcsh	Consumer satisfaction	en_US
dc.title	Recognition of speech emotion in Call Centre conversations in a multilingual environment	en_US
dc.type	Thesis	en_US
dc.description.level	D	en_US
dc.identifier.doi	https://doi.org/10.51415/10321/3801	-
item.grantfulltext	open	-
item.cerifentitytype	Publications	-
item.openairetype	Thesis	-
item.languageiso639-1	en	-
item.fulltext	With Fulltext	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
Appears in Collections:	Theses and dissertations (Accounting and Informatics)
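
The abstract names several frame-level descriptors among the hybrid features (logarithm of energy, zero-crossing rate, spectral centroid, spectral roll-off point). This record does not describe how the thesis computes them, so the following is only a minimal sketch using common textbook definitions. All function names are hypothetical, the 0.85 roll-off fraction is an assumed convention, and a naive DFT stands in for a fast Fourier transform to keep the example dependency-free.

```python
import cmath
import math


def log_energy(frame):
    """Natural log of the frame's total energy (small floor avoids log(0))."""
    return math.log(sum(x * x for x in frame) + 1e-12)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for a short demo frame)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mags, sample_rate, n):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def spectral_rolloff(mags, sample_rate, n, fraction=0.85):
    """Lowest bin frequency (Hz) at which cumulative magnitude reaches `fraction` of the total.

    The 0.85 default is an assumption, not a value taken from the thesis.
    """
    target = fraction * sum(mags)
    running = 0.0
    for k, m in enumerate(mags):
        running += m
        if running >= target:
            return k * sample_rate / n
    return (len(mags) - 1) * sample_rate / n


# Synthetic stand-in for one speech frame: a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_LEN = 256
frame = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
mags = magnitude_spectrum(frame)
features = {
    "log_energy": log_energy(frame),
    "zcr": zero_crossing_rate(frame),
    "centroid_hz": spectral_centroid(mags, SAMPLE_RATE, FRAME_LEN),
    "rolloff_hz": spectral_rolloff(mags, SAMPLE_RATE, FRAME_LEN),
}
```

For a pure tone, the centroid and roll-off should both sit near the tone's frequency, and the zero-crossing rate should be close to twice the tone frequency divided by the sample rate. Real speech frames would normally be windowed first and the per-frame values aggregated over an utterance before classification.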

Files in This Item:

File	Description	Size	Format
ZvarevasheK_Doctorate_2021.pdf		3.97 MB	Adobe PDF

Page view(s): 448 (checked on Dec 22, 2024)
Download(s): 177 (checked on Dec 22, 2024)
