Three basic tenets
1. We need to use assessment tasks which focus on the kinds of texts the learners will hear in 'the real world'.
2. We need to design tasks which accurately reflect the learners' ability.
3. We need to have a reliable way to score the learners' performance.
These three factors are to do with ensuring reliability and validity. For more on those two concepts, see the guide to testing, assessment and evaluation. The rest of this guide assumes basic familiarity with the content of that guide. Fulfilling all three criteria adequately requires a little care.
Identifying listening text types
The first step is to find out what sorts of texts the learners will need to access and what strategies are appropriate for the purposes of listening. This is by no means an easy undertaking, especially if the course is one in General English (also known as ENAP [English for No Apparent Purpose]), where it is almost impossible to predict what sorts of texts, and for what purposes, the learners may one day need to access (see below for a generic checklist of skills).
On courses for very specific purposes, it is easier to identify the sorts of texts the learners will encounter and the purposes for which they will listen to them, but there is no related set of subskills which we can confidently identify as giving them easy access to texts in particular topic areas. We can, however, look at the types of texts and identify key listening strategies to focus on. For example:
Situation and skills needed:

ANNOUNCEMENTS:
Good monitoring skills to decide on relevance (Is this my flight?) and the ability to extract vital data (gate numbers, platforms etc.)

LECTURES:
Listening for signposting (sequences, itemisation, prioritisation, importance etc.)

RADIO AND TV:
Gist listening to entertainment to follow a plot
Monitoring for relevance in a news broadcast
Using visual clues to understand TV programmes

INSTRUCTIONS AND DIRECTIONS:
Intensive listening for detailed understanding

MEETINGS AND SEMINARS:
Intensive listening to understand detail and locate relevance
On-going monitoring to identify questions and invitations to comment

DIALOGUES:
Gist listening to follow a conversation
Intensive listening if the listener is a (potential) participant
Once we know the kinds of settings in which our learners will need to operate, the text types we are targeting, the purposes of listening and the subskills deployed, we can get on and design assessment procedures which measure how well the learners can deploy the skills they will need.
There are some generic guidelines for all tasks. If you have followed the guide to testing, assessment and evaluation (see above), you will know that this is something of a balancing act because there are three main issues to contend with:
1. Reliability:
A reliable test is one which will produce the same result if it is administered again (and again). In other words, it is not affected by the learner's mood, level of tiredness, attitude etc. This is a challenging area in the case of assessing listening because the skill requires high levels of concentration, especially if more than gist is to be gleaned. We need to be aware that very long listening tasks will result in fatigue, and that may overwhelm learners who are otherwise good listeners. Unless there is a good reason for using a long text (e.g., when preparing people for study in English), a range of short tasks focused as far as possible on micro-skills is a better way forward in most circumstances. Assessment outcomes are often in written form and the listening text itself is often recorded and repeatable, so marking can be quite reliable.
2. Validity:
Two questions here:
a) Do the texts represent the sorts of texts the learners are likely to encounter?
For example, if we set out to test someone's ability to understand a lecture, we need to ensure that the topic area is valid for them. On the other hand, if we know that our learners will rarely, if ever, need to listen to extended monologues from native speakers but will need to understand what they are told in service and informational encounters, then the texts we use to assess their abilities have to match those encounters.
b) Do we have enough tasks to target all the skills we want to assess?
For example, if we want to test the ability to use context and co-text to infer meaning, do we have a task or tasks focused explicitly and discretely on that skill? If we want to test the ability to monitor a series of announcements for crucial data, do we have a test that requires that skill?
3. Practicality:
Against the two main factors above, we have to balance practicality. It may be advisable to set as many different tasks as possible to ensure reliability, and to try to measure as many of the subskills as possible in the same assessment procedure to ensure validity, but in the real world time is often limited and concentration spans are not infinite. Practicality applies to both learners and assessors:
a) for learners, the issue is often one of test fatigue.
Too many tests over too short a time may result in learners losing commitment to the process. On shorter courses, in particular, testing too much can be perceived as a waste of learning time.
b) for the assessors, too many time-consuming tests which need careful assessment and concentration may put an impractical load on time and resources. Assessors may become tired and unreliable.
c) the third issue concerns technology. If we know, for example, that our learners will rarely have to understand audio-only, disembodied text, then providing context and clues through the use of video recordings should be considered. Even settings which are heavily text-laden (such as lectures) are accompanied by gesture, expression and visual data which cannot be excluded from a valid test of the skills.