Creating parallel and comparable corpora for work in domain specific areas of language



Date: 28.10.2022

Belinda Maia

FLUP

Parallel corpora - definition

  • “A parallel corpus is a collection of texts, each of which is translated into one or more other languages than the original. The simplest case is where two languages only are involved: one of the corpora is an exact translation of the other. [...] The direction of the translation may not even be known”.

Parallel corpora - uses

  • “Parallel corpora are objects of interest at present because of the opportunity offered to align original and translation and gain insights into the nature of translation. From this work it is hoped that tools to aid translation will be devised. Probabilistic machine translation systems can moreover be trained on such corpora”.
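As a rough illustration of what aligning original and translation involves, the sketch below pairs sentences one-to-one by position and flags pairs whose character lengths diverge strongly. It is a deliberately naive stand-in, not a method taken from the EAGLES guidelines: the function name `align_by_position`, the `max_ratio` threshold and the English/Portuguese sample sentences are all hypothetical, and it assumes both texts are already split into sentences in the same order.

```python
from typing import List, Tuple


def align_by_position(
    source: List[str], target: List[str], max_ratio: float = 2.0
) -> List[Tuple[str, str, bool]]:
    """Pair sentences 1:1 by position and flag pairs of very unequal length.

    Assumption: the translation keeps the sentence order of the original.
    The flag is only a hint that a pair may be misaligned.
    """
    pairs = []
    for src, tgt in zip(source, target):
        longer = max(len(src), len(tgt))
        shorter = max(1, min(len(src), len(tgt)))  # avoid division by zero
        suspicious = longer / shorter > max_ratio
        pairs.append((src, tgt, suspicious))
    return pairs


# Hypothetical sample data, invented for this sketch
english = ["The corpus is small.", "It contains technical texts."]
portuguese = ["O corpus é pequeno.", "Contém textos técnicos."]

aligned = align_by_position(english, portuguese)
for src, tgt, suspicious in aligned:
    marker = "?" if suspicious else "="
    print(marker, src, "|", tgt)
```

Real aligners also handle one-to-many and many-to-one sentence pairs, which positional pairing cannot; the length check here only marks pairs that deserve a second look.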

Comparable corpora - definition

  • “A comparable corpus is one which selects similar texts in more than one language or variety. There is as yet no agreement on the nature of the similarity, because there are very few examples of comparable corpora”.
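Since the definition leaves the nature of the similarity open, any selection procedure has to fix its own criteria. The sketch below assumes, purely for illustration, that texts count as similar when they share a domain and a genre; the `Text` record, its fields, the function `select_comparable` and the sample entries are hypothetical and not part of the EAGLES guidelines.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Text:
    language: str
    domain: str   # assumed similarity criterion
    genre: str    # assumed similarity criterion
    content: str


def select_comparable(
    texts: List[Text], domain: str, genre: str
) -> Dict[str, List[Text]]:
    """Group the texts that match the given domain and genre by language."""
    selected: Dict[str, List[Text]] = defaultdict(list)
    for text in texts:
        if text.domain == domain and text.genre == genre:
            selected[text.language].append(text)
    return dict(selected)


# Hypothetical collection, invented for this sketch
collection = [
    Text("en", "medicine", "research article", "..."),
    Text("pt", "medicine", "research article", "..."),
    Text("pt", "law", "contract", "..."),
]

comparable = select_comparable(collection, "medicine", "research article")
print(sorted(comparable))
```

Other criteria (date range, text length, intended audience) could be added as further fields; which ones matter is exactly the point on which the quotation says there is no agreement.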

Comparable corpora - uses

  • “The possibilities of a comparable corpus are to compare different languages or varieties in similar circumstances of communication, but avoiding the inevitable distortion introduced by the translations of a parallel corpus”.

Quotations from:

  • EAGLES (Expert Advisory Group on Language Engineering Standards), Guidelines, 1996, at: http://www.ilc.pi.cnr.it/EAGLES96/browse.html

Parallel corpora - alignment & annotation

