Genres, Registers, Text Types, Domains, and Styles: Clarifying the Concepts and Navigating a Path through the BNC Jungle




Domains Versus Genres: The BNC Sampler & Why We Need Genre Information

The BNC Users' Reference Guide states that only three criteria were used to "balance" the corpus:

domain, time, and medium. In choosing texts for inclusion into the BNC Sampler (the 2-million word

sub-set of the BNC), domain was probably the most important criterion used to ensure a wide-enough

coverage of a variety of texts. On the BNC Web page for the Sampler, the following comment on its

representativeness is made:




David Lee

Genres, Registers, Text Types, Domains, and Styles

Language Learning & Technology


In selecting from the BNC, we tried to preserve the variety of text-types represented, so



the Sampler includes in its 184 texts many different genres [italics added] of writing and

modes of speech.

It should be noted that no real claim to representativeness is made, and that what they really meant was

that many different texts were chosen on the basis of domain and other criteria.[13] The fact that the

Sampler contains many different genres is not in doubt, but the texts were not chosen on this basis, since

they had no genre classification, and hence the Sampler cannot (and, indeed, it does not) claim to be

representative in terms of "genre."

It is my belief that it is because "domain" is such a broad classification in the BNC that the Sampler

turned out to be rather unrepresentative of the BNC and of the English language. Anyone wishing to use

the Sampler should be under no illusion that it is a balanced corpus or that it represents the full range of

texts as in the full BNC. The Sampler may be broadly balanced in terms of the domains, but when broken

down by genre, a truer picture emerges of exactly how (un)representative it really is. Appendix A lists

missing or unrepresentative genres in the Sampler BNC which demonstrate this.

"Genre" is perhaps a more insightful classification criterion than "domain," at least as far as getting a

representatively balanced corpus is concerned. If the compilers of the BNC Sampler had known the genre

membership of each BNC text, they would probably have created a more balanced and representative

sub-corpus. As things stand, however, any conclusions about "spoken English" or "written English" made on

the basis of the BNC Sampler will have to be evaluated very cautiously indeed, bearing in mind the

genres missing from the data.
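
The contrast between balance by domain and balance by genre can be made concrete with a minimal sketch in Python. Everything in it is an assumption made for illustration: the text IDs, the domain and genre labels, and the `coverage` helper are invented, and none of the data comes from the BNC or the Sampler. It shows only how a sample can include every domain while still missing several genres.

```python
from collections import Counter

# Hypothetical index: text ID -> (domain, genre). All IDs and labels are
# invented for illustration; they are not taken from the BNC.
FULL_CORPUS = {
    "T01": ("applied science", "academic medicine"),
    "T02": ("applied science", "instructional"),
    "T03": ("applied science", "popular science"),
    "T04": ("imaginative", "prose fiction"),
    "T05": ("imaginative", "poetry"),
    "T06": ("imaginative", "drama"),
}

# Hypothetical sample that draws texts from every domain.
SAMPLE_IDS = ["T01", "T02", "T04"]


def coverage(index, sample_ids, field):
    """Return (category counts in the sample, categories missing from it).

    field is 0 for domain and 1 for genre.
    """
    all_categories = {labels[field] for labels in index.values()}
    sample_counts = Counter(index[text_id][field] for text_id in sample_ids)
    missing = sorted(all_categories - set(sample_counts))
    return sample_counts, missing


domain_counts, missing_domains = coverage(FULL_CORPUS, SAMPLE_IDS, 0)
genre_counts, missing_genres = coverage(FULL_CORPUS, SAMPLE_IDS, 1)

print("Missing domains:", missing_domains)
print("Missing genres:", missing_genres)
```

In this toy case no domain is missing from the sample, yet three of the six genres are absent, which is the kind of gap that a domain-only breakdown would hide.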

There is another example of how large, undifferentiated categories similar to domain can unhelpfully

lump disparate kinds of text together. Wikberg (1992) criticises the LOB text category E ("Skills, trades,

and hobbies") as being too baggy or eclectic. He demonstrates how, on the evidence of both external and

internal criteria, the texts in Category E can actually be better sub-classified into "procedural" versus

"non-procedural" discourse. He also notes that it is not just text categories that can be heterogeneous.

Sometimes texts themselves are "multitype" or mixed in terms of having different stages with different

rhetorical or discourse goals. He thus concludes with the following comment:

An important point that I have been trying to make is that in the future we need to pay

more attention to text theory when compiling corpora. For users of the Brown and the

LOB corpora, and possibly other machine-readable texts as well, it is also worth noting

the multitype character of certain text categories. (p. 260)

This is a piece of advice worth noting.

THE BNC (BIBLIOGRAPHICAL) INDEX

The BNC Index spreadsheet I am about to describe was created as one solution to the previously

mentioned problems and difficulties. It is similar to the plain-text files prepared by Adam Kilgarriff, which I

have benefited from and found rather useful.[14] However, those files do not contain all the details which

are needed for compiling your own sub-corpus (author type, author age, author sex, audience type,

audience sex, section of text sampled, [topic] keywords, etc.). Sebastian Hoffmann's files were useful too,

in a complementary way, but these do not include (a) keywords and (b) the full bibliographical details of

files. A third existing resource, the "bncfinder.dat" file that comes with the standard distribution of the

BNC (version 1) has most of the header information, but in the form of highly abbreviated numeric codes,

and also does not include any bibliographical information about the files or keywords. The BNC Index

consolidates the kinds of information available in the above three resources, but, in addition, includes (a)

BNC-supplied keywords (as entered in the file headers by the compilers); (b) COPAC keywords[15] for

published non-fiction texts[16] (topic keywords entered by librarians); (c) full bibliographical details



