Software Tools for Morphological and Syntactic Analysis of Natural Language Texts




1. Introduction 

 

The “Software Tools for Morphological and Syntactic Analysis of Natural Language
Texts” is a software system designed for processing natural language texts. The
system analyzes the syntactic and morphological structure of such texts. Specific
formalisms have been worked out for this purpose; they allow us to write down the
syntactic and morphological rules defined by the grammar of a particular natural
language [1]. These formalisms represent a new, comprehensive approach that solves
some of the problems connected with natural language processing. A software system
has been implemented according to these formalisms. Both syntactic analysis of
sentences and morphological analysis of word-forms can be done within this system.
Several special algorithms were designed for it. The formalisms described in [2-3]
are very difficult to use for the Georgian language.

 

The system consists of two parts: a syntactic analyzer and a morphological
analyzer. The purpose of the syntactic analyzer is to parse an input sentence, to
build a parse tree that describes the relations between the individual words within
the sentence, and to collect all important information about the input sentence that
was established during the analysis. The syntactic analyzer must be given a grammar
file containing the syntactic rules of the particular natural language. It also needs
information about the grammatical categories of the word-forms of that language,
which is used during the analysis. However, it may be quite difficult to include all
word-forms of a natural language in a dictionary file. To avoid this problem and to
reduce the size of the dictionary file, a morphological analyzer is used. The
morphological analyzer uses a dictionary file of the unchanged parts of words. This
file is considerably smaller, because many word-forms can be produced from a single
unchanged part of a word. The morphological analyzer also needs its own grammar
file, in which the morphological rules of the natural language must be written
according to the specific formalism. When these rules are applied, an input word is
divided into morphemes, and important information about the grammatical categories
of the word-form can be deduced during the analysis.
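
The idea of storing only the unchanged parts of words can be illustrated with a
minimal C++ sketch. It is not the system's actual implementation or rule formalism:
the names Categories, Analysis and analyzeWord, and the restriction to a single
stem followed by a single suffix, are assumptions made only for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical representation of grammatical categories, e.g. {"pos": "verb"}.
using Categories = std::map<std::string, std::string>;

// One way of dividing a word-form into an unchanged part and a suffix.
struct Analysis {
    std::string stem;
    std::string suffix;
    Categories categories;
};

// Try every split of `word` into stem + suffix and keep the splits in which
// both parts are known. The dictionary `stems` holds only unchanged parts, so
// one entry covers every word-form that can be built from it.
std::vector<Analysis> analyzeWord(
    const std::string& word,
    const std::map<std::string, Categories>& stems,
    const std::map<std::string, Categories>& suffixes) {
    std::vector<Analysis> results;
    for (std::size_t cut = 1; cut <= word.size(); ++cut) {
        auto stemIt = stems.find(word.substr(0, cut));
        auto suffixIt = suffixes.find(word.substr(cut));  // may be the empty suffix
        if (stemIt == stems.end() || suffixIt == suffixes.end()) continue;
        // Combine the categories of both parts; on a key clash the stem's
        // value is kept, because map::insert does not overwrite.
        Categories merged = stemIt->second;
        merged.insert(suffixIt->second.begin(), suffixIt->second.end());
        results.push_back({stemIt->first, suffixIt->first, merged});
    }
    return results;
}
```

With a single stem entry "walk" and the suffix entries "" and "ed", both "walk" and
"walked" can be analyzed, while the dictionary itself contains only one stem.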

An input sentence is passed to the syntactic analyzer, which passes each word of
the sentence to the morphological analyzer. The morphological analyzer analyzes each
word according to the rules from its grammar file, using the dictionary of unchanged
word parts. After a successful analysis, each word-form obtains information about its
grammatical categories, and this information is returned to the syntactic analyzer.
Finally, the syntactic analyzer tries to parse the sentence according to the rules
from the syntax file.
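
The data flow described above can be sketched in C++ as follows. This is only an
illustration under stated assumptions: MorphAnalyzer, SyntaxParser and
analyzeSentence are hypothetical names, words are assumed to be separated by
whitespace, and stopping at the first word that cannot be analyzed is a choice
made for this sketch, since the text does not say how such words are handled.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical representation of grammatical categories of one word-form.
using Categories = std::map<std::string, std::string>;

// Stand-in for the morphological analyzer: fills `out` with the categories of
// one word-form and returns false when the word cannot be analyzed.
using MorphAnalyzer = std::function<bool(const std::string&, Categories&)>;

// Stand-in for the parsing step: receives the categories of all words, in
// sentence order, and reports whether the sentence could be parsed.
using SyntaxParser = std::function<bool(const std::vector<Categories>&)>;

// The syntactic analyzer drives the process: each word goes to the
// morphological analyzer first, and parsing starts once all words are analyzed.
bool analyzeSentence(const std::string& sentence,
                     const MorphAnalyzer& morph,
                     const SyntaxParser& parse) {
    std::istringstream words(sentence);
    std::vector<Categories> analyzed;
    std::string word;
    while (words >> word) {
        Categories categories;
        if (!morph(word, categories)) return false;  // assumption: give up here
        analyzed.push_back(categories);
    }
    return parse(analyzed);
}
```

Passing the two analyzers as callbacks keeps the sketch independent of any concrete
grammar or dictionary format.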

 

The basic methods and algorithms that were used to develop the system are:
operations defined on feature structures, a trace-back (backtracking) algorithm (for
the morphological analyzer), a general syntactic parsing algorithm, and the feature
constraints method. Feature structures are widely used at all levels of analysis. As
an abstract data type they are used to hold various information about dictionary
entries. Each symbol defined in a morphological or syntactic rule has an associated
feature structure, which is initially filled from the dictionary or by the previous
levels of analysis. Feature structures and the operations defined on them are used to
build feature constraints. With the general parsing algorithm it is possible to
obtain a syntactic analysis of any sentence defined by a context-free grammar and
simultaneously check the feature constraints that may be associated with grammatical
rules. Feature constraints are logical expressions composed of the operations that
are defined on feature structures. They can be attached to rules defined within a
grammar file. If a constraint is not satisfied during the analysis, the current rule
is rejected and the search continues. Feature constraints can also be attached to
morphological rules. However, unlike in syntactic rules, constraints can be attached
at any place within a morphological rule, not only at the end. This speeds up
morphological analysis, because constraints are checked as soon as they are
encountered in the rule, so incorrect divisions of a word-form into morphemes are
rejected early.
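
The benefit of checking constraints inside a morphological rule can be shown with a
small C++ sketch. It rests on several assumptions that are not taken from the
system itself: feature structures are simplified to flat attribute-value maps, the
only constraint is agreement checked by a unification-style merge (unify), and a
rule is a fixed sequence of slots (Slot), each listing candidate morphemes. The
names FeatureStructure, Morpheme, Slot, unify and segment are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Simplified feature structure: a flat attribute-value map.
using FeatureStructure = std::map<std::string, std::string>;

// Merge `b` into `a`. Fails, leaving `a` untouched, when a shared feature
// has conflicting values.
bool unify(FeatureStructure& a, const FeatureStructure& b) {
    FeatureStructure merged = a;
    for (const auto& feature : b) {
        auto it = merged.find(feature.first);
        if (it != merged.end() && it->second != feature.second) return false;
        merged[feature.first] = feature.second;
    }
    a = merged;
    return true;
}

struct Morpheme {
    std::string text;
    FeatureStructure features;
};

// One position of a hypothetical morphological rule and its candidates.
using Slot = std::vector<Morpheme>;

// Backtracking search over the slots of `rule`. The constraint is checked
// right after each morpheme is matched, so a wrong division is abandoned
// before the rest of the word is examined.
bool segment(const std::string& word, std::size_t pos,
             const std::vector<Slot>& rule, std::size_t slot,
             const FeatureStructure& current,
             std::vector<std::string>& parts, FeatureStructure& result) {
    if (slot == rule.size()) {
        if (pos != word.size()) return false;  // leftover characters
        result = current;
        return true;
    }
    for (const Morpheme& m : rule[slot]) {
        if (word.compare(pos, m.text.size(), m.text) != 0) continue;
        FeatureStructure next = current;
        if (!unify(next, m.features)) continue;  // constraint fails: prune now
        parts.push_back(m.text);
        if (segment(word, pos + m.text.size(), rule, slot + 1, next, parts, result))
            return true;
        parts.pop_back();  // backtrack and try the next candidate
    }
    return false;
}
```

If the first slot contains "cat" marked as a noun and the second slot contains a
verbal "s" and a plural nominal "s", the verbal candidate is rejected by the
agreement check immediately and the search continues with the nominal one.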

 

The formalisms developed for the syntactic and morphological analyzers are
designed to be convenient for human authors. They offer many constructs that make
grammar files easier to write. The morphological analyzer has a built-in
preprocessor that can process parameterized macro insertions.

 

The software system is written in standard C++ and uses the STL. The program
runs on UNIX and Windows operating systems, and it can also be compiled and used on
any other platform that has a modern C++ compiler.

 

 



