Conclusion
This paper set out to evaluate the suitability of the synthetic Irish-language voices for
an interactive CALL platform. Performance measures, as indicated by the respondents’
ability to transcribe successfully, are very high. Opinion measures, on the other hand,
are far lower. It is argued that both measures need to be combined in order to arrive at
a more realistic measure of the intelligibility and clarity of the speech. The high
performance scores are in line with the general agreement that TTS synthesis has reached
a very high level of intelligibility (Chang, 2011).
This study suggests that a broader view is needed when deciding the fitness-for-purpose
of TTS synthesis in CALL contexts. ‘Ease of intelligibility’ is a fundamental issue
which needs to be taken into account in order to understand the rather ambivalent
relationship frequently observed between end-users and synthetic speech (Cryer & Home,
2009). The factors associated with ease of intelligibility are varied, and many of them
are enumerated by Mayo, Clark and King (2011). These factors continue to be of interest
to those involved in building and developing TTS systems. The evaluation carried out by
CALL practitioners, on the other hand, can be more global in nature, and the formula
proposed in the present paper offers the practitioner a means of evaluating the
fitness-for-CALL-purpose of any TTS system. The combination of a subjective and an
objective measure offers greater validity than either measure taken in isolation.
-109-
2014 CALL Conference
LINGUAPOLIS
www.antwerpcall.be
References
Al-Wabil, A., Al-Khalifa, H., & Al-Saleh, W. (2007). Arabic Text-To-Speech Synthesis: A
Preliminary Evaluation. In C. Montgomerie & J. Seale (Eds.), Proceedings of World
Conference on Educational Multimedia, Hypermedia and Telecommunications (pp. 4423–4430).
Chesapeake, VA: AACE.
Bachan, J. (2008). Experimental Phonetic Methods in Speech Synthesiser Evaluation. In
Workshop of Arbeitskreis Phonetik - Phonologie - Prosodie. Bielefeld.
Campbell, N. (2007). Evaluation of Speech Synthesis. In L. Dybkjær, H. Hemsen, & W.
Minker (Eds.), Evaluation of text and speech systems (pp. 29–64). Springer.
Chang, Y.-Y. (2011). Evaluation of TTS Systems in Intelligibility and Comprehension
Tasks. In Proceedings of the 23rd Conference on Computational Linguistics and Speech
Processing (pp. 64–78). Association for Computational Linguistics.
Chapelle, C. (2001). Computer applications in second language acquisition: Foundations
for teaching, testing, and research. Cambridge: Cambridge University Press.
Clark, R. A. J., Richmond, K., & King, S. (2004). Festival 2 - build your own general
purpose unit selection speech synthesiser. In Fifth ISCA Workshop on Speech Synthesis
(pp. 173–178).
Cryer, H., & Home, S. (2009). User attitudes towards synthetic speech for Talking Books
(p. 46). RNIB Centre for Accessible Information, Birmingham: Research report #7.
Cryer, H., & Home, S. (2010). Review of methods for evaluating synthetic speech. RNIB
Centre for Accessible Information, Birmingham: Technical report #8.
Delogu, C., Conte, S., & Sementina, C. (1998). Cognitive factors in the evaluation of
synthetic speech. Speech Communication, 24(2), 153–168.
Furui, S. (2007). Speech and speaker recognition evaluation. In L. Dybkjær, H. Hemsen,
& W. Minker (Eds.), Evaluation of text and speech systems (pp. 1–27). The Netherlands:
Springer.
Gupta, P., & Schulze, M. (2012). Human Language Technologies (HLT). Module 3.5 in
Davies, G. (ed.) Information and Communications Technology for Language Teachers
(ICT4LT), Slough, Thames Valley University [Online].
Handley, Z. (2005). Evaluating Text-to-Speech (TTS) Synthesis for use in Computer-
Assisted Language Learning (CALL). (Unpublished doctoral dissertation). University of
Manchester.
Handley, Z. (2009). Is text-to-speech synthesis ready for use in computer-assisted
language learning? Speech Communication, 51(10), 906–919.
doi:10.1016/j.specom.2008.12.004
Higgins, J. (1988). Language, Learners, and Computers: Human Intelligence and Artificial
Unintelligence. New York: Longman.
Huang, D.-Y. (2011). Prediction of Perceived Sound Quality of Synthetic Speech. In
APSIPA ASC 2011 Xi’an (p. 6).
Itahashi, S. (2000). Guidelines for Japanese Speech Synthesizer Evaluation. In LREC2000
(pp. 655–660). Athens.
Keller, E., & Zellner-Keller, B. (2000). Speech synthesis in language learning: Challenges
and opportunities. In InSTIL 2000 Conference. Dundee, Scotland.
Kintsch, W., & van Dijk, T. A. (1978). Toward a model of text comprehension and
production. Psychological Review, 85(5), 363–394.
Koul, R. (2003). Synthetic Speech Perception in Individuals With and Without Disabilities.
Augmentative and Alternative Communication, 19(1), 49–58.
Lampert, A. (2004). Evaluation of the MU-Talk Speech Synthesis System (p. 29).
Retrieved from http://sgi.nu/nlp/content/pdf/SynthesisEvaluation.pdf
Léwy, N., & Hornstein, T. (1994). Text-to-Speech Technology: A Survey of German
Speech Synthesis Systems (p. 52).
Mayo, C., Clark, R. A. J., & King, S. (2011). Listeners’ weighting of acoustic cues to
synthetic speech naturalness: A multidimensional scaling analysis. Speech
Communication, 53(3), 311–326.
Peirce, N., & Wade, V. (2010). Personalised Learning for Casual Games: The “Language
Trap” Online Language Learning Game. In 4th European Conference on Game Based
Learning (ECGBL) (pp. 306–315). Copenhagen, Denmark: Bente Meyer, Academic
Publishing.
Pellegrini, T., Costa, Â., & Trancoso, I. (2012). Less errors with TTS? A dictation
experiment with foreign language learners. In 13th Annual Conference of the
International Speech Communication Association.
Peterson, M. (2010). Computerized Games and Simulations in Computer-Assisted
Language Learning: A Meta-Analysis of Research. Simulation & Gaming, 41(1), 71–93.
Pillay, H. K. (1994). Cognitive load and mental rotation: structuring orthographic
projection for learning and problem solving. Instructional Science, 22(2), 91–113.
Sha, G. (2010). Using TTS voices to develop audio materials for listening comprehension:
A digital approach. British Journal of Educational Technology, 41(4), 632–641.
Stratil, M., Burkhardt, D., Jarratt, P., & Yandle, J. (1987). Computer-Aided Language
Learning with Speech Synthesis: User Reactions. Innovations in Education & Training
International, 24(4), 309–316.
Van Bezooijen, R., & van Heuven, V. J. (1997). Assessment of synthesis systems. In D.
Gibbon, R. Moore, & R. Winski (Eds.), Handbook of standards and resources for spoken
language systems (Vol. 3, pp. 481–563). New York: Mouton de Gruyter.
Van Merrienboer, J. J. G., & Sweller, J. (2005). Cognitive Load Theory and Complex
Learning: Recent Developments and Future Directions. Educational Psychology Review,
17(2), 147–177.