Appendix: Linguistic corpora consulted
The British National Corpus (BNC): 100 million words of different types of
speech (10 million words) and writing (90 million words). Examples
taken from the BNC that are included in this book contain an
identification number specifying the register, particular text, and
particular lines within the text from which the example was taken. More
information on the registers in the BNC can be found at:
http://www.natcorp.ox.ac.uk/XMLedition/URG/codes.html#classcodes
(accessed June 19, 2008)
The International Corpus of English (ICE): Examples from three components
of ICE are included in this book (see below). For general information
on ICE, go to: http://www.ucl.ac.uk/english-usage/ice/index.htm
(accessed June 19, 2008)
The British Component of the International Corpus of English (ICE-GB): one
million words of various kinds of speech (600,000 words) and writing
(400,000 words). Examples taken from ICE-GB that are included in this
book contain an identification number indicating whether the example
was taken from speech (S) or writing (W) and specifying the particular
register, sample number, and line number from which the example was
taken. Additional information about ICE-GB can be found at:
http://www.ucl.ac.uk/english-usage/projects/ice-gb/
(accessed June 19, 2008)
The New Zealand Component of the International Corpus of English (ICE-NZ):
Same kinds of texts as ICE-GB. Examples from ICE-NZ cited in this book
are followed by identification numbers in the same format as those used
for ICE-GB. For details on this component, go to:
http://www.victoria.ac.nz/lals/research/corpora/index.aspx#ice
(accessed June 19, 2008)
The American Component of the International Corpus of English (ICE-USA):
currently under development; will eventually contain the same kinds of
texts as ICE-GB and ICE-NZ.
The Cambridge International Corpus (CIC): The CIC is a computer database of
contemporary spoken and written English, which currently stands at
over one billion words. It includes British English, American English
and other varieties of English. It also includes
the Cambridge Learner
Corpus, developed in collaboration with the University of Cambridge
ESOL Examinations. Cambridge University Press has built up the CIC
to provide evidence about language use that helps to produce better
language teaching materials.
Michigan Corpus of Academic Spoken English (MICASE): 1.8 million words of
various kinds of speech found in academic contexts (e.g. class
lectures, study groups, advising sessions). Examples from this corpus
used in this book contain an identification number listing the
particular register, sample, and line number within the sample from which
the example was taken. For additional details, go to:
http://quod.lib.umich.edu/m/micase/ (accessed June 19, 2008)
Santa Barbara Corpus of Spoken American English (SBCSAE): transcriptions
and audio recordings of hundreds of dialogues and some monologues;
the transcriptions will eventually become part of ICE-USA. For more
information, go to:
http://www.linguistics.ucsb.edu/research/sbcorpus.html
(accessed June 19, 2008)
The London-Lund Corpus: One million words of various kinds of spoken
English representing many different registers, from spontaneous
dialogues to scripted monologues. Samples have been prosodically
transcribed with annotation marking various features of intonation (e.g.
tone unit boundaries). Examples included in this book contain
identification numbers similar to those used in ICE-GB. For additional
information, go to:
http://khnt.hit.uib.no/icame/manuals/londlund/index.htm
(accessed June 21, 2008)
acceptability:
Judgments that speakers make concerning what they feel are good vs. bad
uses of language. For instance, many people dislike double negatives or
the word ain’t, even though these expressions are perfectly grammatical
and are regularly used by many speakers of English. Compare with
gram-