Education of the Republic of Uzbekistan, Kokand State Pedagogical Institute named after Mukumi, the Faculty of Foreign Languages



Date: 01.01.2022
''Checking speaking skills'' 1

2.4 Raters

In order for raters to achieve a common understanding and application of a scale, rater training is an important part of assessing speaking. As the standard for speaking assessment procedures involving high-stakes decisions is an inter-rater reliability coefficient of 0.80, some variability among raters is expected and tolerated. Under optimal conditions, the sources of error that can be associated with the use of a scale are expected to be random rather than systematic. Therefore, research aims to identify and control systematic error resulting from rater performance.
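The 0.80 standard mentioned above can be illustrated with a small calculation. The text does not say which reliability coefficient is meant, so the sketch below assumes a Pearson correlation between two raters who scored the same examinees. The scores, the 1-6 scale, and the names `pearson_r`, `rater_1`, and `rater_2` are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(scores_a, scores_b):
    """Pearson correlation between two raters' scores for the same examinees."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long score lists with at least 2 items")
    mean_a, mean_b = mean(scores_a), mean(scores_b)
    # Sum of cross-products and sums of squared deviations from each mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(scores_a, scores_b))
    var_a = sum((a - mean_a) ** 2 for a in scores_a)
    var_b = sum((b - mean_b) ** 2 for b in scores_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a rater who gives every examinee the same score has no variance")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores on a 1-6 scale for eight examinees (not data from any study).
rater_1 = [3, 4, 2, 5, 4, 3, 6, 2]
rater_2 = [3, 5, 2, 4, 4, 3, 6, 1]

THRESHOLD = 0.80  # the high-stakes standard cited in the text
r = pearson_r(rater_1, rater_2)
meets_standard = r >= THRESHOLD
print(f"inter-rater r = {r:.2f}; meets {THRESHOLD:.2f} standard: {meets_standard}")
```

A correlation of this kind shows only whether two raters rank examinees similarly. Two raters can correlate highly while one is consistently harsher than the other, which is why systematic error has to be examined separately.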

One type of systematic error results from a rater’s tendency to assign either harsh or lenient scores. When such a pattern emerges in comparison with other raters in a pool, the rater may be identified as negatively or positively biased. Systematic effects on score assignment have been found in association with rater experience, rater native language background, and examinee native language background. Every effort should be made to identify and remove such effects, as their presence negatively affects the accuracy, utility, interpretability, and fairness of the scores we report.
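A rough way to see how a harsh or lenient pattern might be spotted within a pool is to compare each rater's scores with the pool average for the same examinees. The sketch below is a simple descriptive check under stated assumptions, not a substitute for the statistical models used in rater research. The function name `severity_by_rater`, the flagging cutoff `FLAG_AT`, and all scores are hypothetical.

```python
from statistics import mean


def severity_by_rater(ratings):
    """Return each rater's mean deviation from the pool average.

    `ratings` maps a rater name to a list of scores, where position i in
    every list refers to the same examinee. Negative values suggest
    harshness and positive values suggest leniency, relative to this pool only.
    """
    raters = list(ratings)
    n_examinees = len(ratings[raters[0]])
    if any(len(ratings[r]) != n_examinees for r in raters):
        raise ValueError("every rater must score the same examinees")
    # Pool average for each examinee across all raters.
    pool_means = [mean(ratings[r][i] for r in raters) for i in range(n_examinees)]
    return {
        r: mean(score - pool for score, pool in zip(ratings[r], pool_means))
        for r in raters
    }


# Hypothetical scores from three raters for the same five examinees.
ratings = {
    "rater_a": [4, 5, 3, 4, 5],
    "rater_b": [3, 4, 2, 3, 4],
    "rater_c": [4, 4, 3, 4, 4],
}

FLAG_AT = 0.5  # assumed cutoff in scale points, chosen only for illustration
severity = severity_by_rater(ratings)
flagged = {r: d for r, d in severity.items() if abs(d) >= FLAG_AT}
print(severity)
print("flagged:", flagged)
```

Because the deviations are measured against the pool itself, they always sum to zero across raters. A flag therefore says only that a rater differs from these particular colleagues, not that the rater is wrong.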

With fairness at issue, researchers have studied factors affecting ratings. One study compared Japanese language teachers and professional tour guides in their assignment of scores to 51 Japanese tour guide candidates. While no differences were found in the scores assigned, the two pools of raters did apply different criteria: teachers tended to focus on grammar, vocabulary, and fluency, while tour guides tended to focus on pronunciation. Another study examined the performance of three rater groups who differed in professional background and place of residence and found a tendency for the teachers to rate grammar more harshly in comparison to the nonteaching groups, who emphasized communicative success. A further study compared native-speaking English raters from Australia, Canada, the UK, and the USA and found raters from the UK to be the harshest and raters from the USA the most lenient. Differences in raters’ application of a scale have been found not only across raters of different backgrounds and experiences, but also across trained raters of similar backgrounds.

Studies comparing native and non-native speakers as raters have produced mixed findings. While some studies have identified tendencies for non-native speakers to assign harsher scores, others have found the opposite. In the Winke study, raters whose first language backgrounds matched those of the candidates were found to be more lenient when rating second language English oral proficiency, and the authors suggest that this effect may be due to familiarity with accent. In an attempt to ameliorate such potential effects, researchers provided special training for Indian raters who were evaluating the English language responses of Indian examinees on the TOEFL iBT. While the performance of the Indian raters was found comparable to that of Educational Testing Service raters both before and after the training, the Indian raters showed some improvement and increased confidence after participating in it. Far fewer studies have been conducted on differences in ratings assigned by interviewers; however, there is no reason to expect that interviewers would be less subject to interviewer effects than raters are to rater effects. Indeed, an examination of variability across two interviewers with respect to how they structured the interview, their questioning techniques, and the feedback they provided identified differences that could easily result in different score assignments as well as different interpretations of the interviewee’s ability.

These findings underscore the importance of rater training; however, the positive effects of training tend to be short-lived. In a study examining rater severity over time, Lumley and McNamara found that many raters tended to drift. The phenomenon of rater drift calls into question the practice of certifying raters once and for all after they have completed only a single training program, and it highlights the importance of ongoing training in order to maintain rater consistency. A more important concern raised by studies of rater variability, and one that rater training can only partially address, is whose standard is the most appropriate to apply: that of an experienced rater, an inexperienced rater, a teacher, a native speaker, or a non-native speaker.



