The challenge of TTS intelligibility for CALL 
Intelligibility must be regarded as the first and most significant criterion for the 
assessment of synthetic speech. The term ‘intelligibility’ is not confined here to the 
recognition of individual phonemes or words in isolation, as it was by a number of 
investigators towards the beginning of the century, such as Koul (2003). Rather, it refers 
to the listener’s ability to recognise (and orthographically transcribe) a set of sentences, 
without requiring any higher-level cognitive functioning such as constructing a coherent 
mental representation of the information contained in the text and relating it to any 
pre-existing knowledge which the listener may have (Kintsch & van Dijk, 1978).
Questions are raised as to the adequacy of tests of ‘intelligibility’ which are used for 
synthetic voices.


2014 CALL Conference 
LINGUAPOLIS
www.antwerpcall.be 
It is argued in this paper that intelligibility tests alone are inadequate to describe 
the complexity of human reaction to synthetic voices, and a method for measuring ‘ease 
of intelligibility’, otherwise referred to here as ‘clarity’, is proposed.
Evaluating TTS Synthesis
There is a considerable body of research into the evaluation procedures which may 
be used for synthetic speech in both commercial and educational environments. The 
facets of speech investigated include syllable articulation, word intelligibility, 
sentence intelligibility and overall quality, which covers rhythm, speaking rate, 
continuity, intonation, nearness to human voice and suitability for the users’ purpose 
(Itahashi, 2000). Campbell (2007) refers to the many different ways speech synthesis 
can be evaluated. These include diagnostic or comparative evaluations, subjective or 
objective evaluations, modular or global evaluations, and task-based or generic evaluations. 
The Expert Advisory Group on Language Engineering Standards (EAGLES) produced a TTS 
assessment taxonomy to explore the many dimensions of synthesised speech. They 
distinguish between what they call ‘black box’ and ‘glass box’ assessments (van 
Bezooijen & van Heuven, 1997, p. 485). Glass box assessments focus on testing the 
output of specific components of a speech system in a laboratory environment and may 
use human raters or automated methods. This type of investigation is considered 
objective in that it can produce objective data which can measure the quality of the 
synthetic speech output. The glass box assessment is used primarily as a diagnostic tool 
by system developers in order to improve the quality of the output of the TTS system 
being developed (Lampert, 2004). The main problem with this methodology is that there 
is not always a precise match between the objective measures it yields and more 
subjective measures of the quality of synthetic speech. Some objective measures may be 
over-sensitive compared to the human ear. Conversely, synthetic speech may be perfectly 
intelligible and rated highly in glass box assessments but nonetheless regarded as 
unnatural by the listener (Cryer & Home, 2010). It is accepted that current objective 
measures are not suitable for predicting the subjective quality of synthetic speech 
(Huang, 2011). 
The black box assessment concentrates on the functionality of the overall system and its 
fitness for use in specific situations such as the telephone answering systems used in 
banking (Léwy & Hornstein, 1994). This is typically done by way of human evaluation of 
the synthesised speech (Bachan, 2008). Evaluations of synthetic speech in a CALL 
context are mainly subjective, with judgements gathered through Likert-type 
questionnaires. Such assessments try to evaluate the extent to which the 
synthetic speech can help end users perform their intended task (Lampert, 2004) or try 
to determine the fitness of a system for a purpose (Furui, 2007). Functionality rather 
than form is the focus of attention. 
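As an illustration of how responses to a single Likert-type questionnaire item might be summarised, the following is a minimal sketch only. The function name `summarise_likert`, the 1 to 5 scale and the example responses are all hypothetical and are not taken from any of the studies cited above.

```python
from collections import Counter
from statistics import mean, median


def summarise_likert(ratings, scale_min=1, scale_max=5):
    """Summarise the responses to one Likert-type item.

    The 1-5 scale is an assumption made for this sketch.
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    for r in ratings:
        if not (scale_min <= r <= scale_max):
            raise ValueError(f"rating {r} outside {scale_min}-{scale_max}")
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        "mean": mean(ratings),
        "median": median(ratings),
        # Number of respondents choosing each point on the scale.
        "distribution": {
            point: counts.get(point, 0)
            for point in range(scale_min, scale_max + 1)
        },
    }


# Hypothetical responses to a statement such as
# "The voice was suitable for this task"
# (1 = strongly disagree, 5 = strongly agree).
responses = [4, 5, 3, 4, 2, 4, 5]
summary = summarise_likert(responses)
print(summary)
```

Reporting the median and the full distribution alongside the mean is a cautious choice here, since Likert responses are ordinal and a mean alone can hide a split in listener opinion.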
It is widely claimed in recent literature reviews that state-of-the-art synthesised speech 
has developed to a point where ‘intelligibility’ is no longer a factor (Mayo, Clark, & King, 
2011) and that research is focusing more on factors such as naturalness, likeability, prosody, 
the ability to express emotions, persuasiveness, etc. (Cryer & Home, 2010). 
However, the TTS systems used in specific CALL applications may differ considerably, so high 
intelligibility cannot be taken for granted. This is particularly true for new or developing 
TTS systems, especially systems in a new language, as is the case here with the ABAIR 
system, and any evaluation of an emerging TTS system must include tests of 
‘intelligibility’. 
The term ‘intelligibility’ itself needs examination since it should not be seen as an “all or 
nothing” concept. While we undoubtedly need a task/measure to assess the degree of 
intelligibility of utterances produced by a specific system, we would argue that it is 
equally important to ascertain the ease with which listeners can carry out that task. This 
latter factor is likely to be a critical indicator of the eventual acceptability of a particular 
synthesis-based CALL application. For that reason the term ‘ease of intelligibility’ is being 
introduced. 
This may be seen as being closely related to ‘clarity’, or the mental effort required for 
successful completion of a task, frequently referred to as ‘cognitive loading’ (Pillay, 
1994). ‘Clarity’ seems a more suitable concept when TTS synthetic speech is being used 
for a functional purpose such as CALL games. Two types of test are therefore introduced: 
a transcription task to ascertain intelligibility (i.e. performance measures) and subjects’ 
rating of clarity, which provides opinion measures (Cryer & Home, 2010). 
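The text does not specify how the transcription task is scored. One common possibility, sketched here purely as an assumption, is a word-level accuracy score derived from the edit distance between the reference sentence and the listener's transcription. The function names and the example sentences are hypothetical.

```python
import re


def tokenise(text):
    """Lower-case the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def word_edit_distance(ref, hyp):
    """Minimum number of word substitutions, insertions and deletions
    needed to turn the token list `ref` into the token list `hyp`."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word missing from transcription
                            curr[j - 1] + 1,      # extra word in transcription
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, transcription):
    """Proportion of the reference sentence transcribed correctly (0.0 to 1.0).

    This scoring rule is an assumption for illustration, not the
    procedure used in the study.
    """
    ref = tokenise(reference)
    hyp = tokenise(transcription)
    if not ref:
        raise ValueError("reference sentence has no words")
    errors = word_edit_distance(ref, hyp)
    return max(0.0, 1.0 - errors / len(ref))


# Hypothetical stimulus sentence and listener transcription.
score = word_accuracy("The cat sat on the mat.", "the cat sat in the mat")
print(round(score, 3))
```

A score of this kind would serve as the performance measure, while the listener's clarity rating for the same sentence would be recorded separately as the opinion measure.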
