and automated workplaces at domestic enterprises. In the process, the programmer had the idea of
creating a computer program that could find information in large texts while taking into account
the morphology of the Russian language (that is, matching a query word in any of its forms).
Together with Arkady Borkovsky, a specialist in computational linguistics at the USSR Academy of
Sciences, Volozh founded the Arcadia company in 1989. In 1990, Ilya Segalovich, a former classmate
and friend of Volozh, joined the CompTek programming team. Over two years, Arcadia's specialists
created two information retrieval systems for the Central Research Institute of Patent Information
and Technical and Economic Research (TsNIIIPI): the International Classification of Inventions
and the Classifier of Goods and Services. The programs, recorded on floppy disks, were then sold
for three years to research institutes and organizations involved in patent management.
By 1993, CompTek was engaged in marketing network technologies, and so as not to abandon search
technology, Arcadia, whose work had become less in demand, was made one of CompTek's
departments.
In 1993, the first working version of the application for local search (on the computer's hard
drive) was written and named Yandex. The name stands for "yet another indexer" or, alternatively,
"Language Index".
In 1993-1994, CompTek programmers began working with the Laboratory of Computational
Linguistics of the Institute for Information Transmission Problems, which was headed by
Academician Yuri Apresyan, a Russian specialist in structural linguistics and theoretical
semantics. Segalovich studied morphology and improved the search technology; after some time,
Mikhail Maslov, Dmitry Teiblum, Sergey Ilyinsky and Leonid Brovkin began to help him.
As the main developer, Segalovich wrote a program for automatic morphological analysis, which
was used in the search. The result of the programmers' joint work was a dictionary with a search
that took into account the morphology of the Russian language. Another of its advantages was that
it was loaded entirely into RAM and worked quickly.
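The idea of a morphology-aware search held entirely in memory can be illustrated with a minimal sketch. It assumes a small in-memory table that maps each word form to its dictionary form (lemma) and an index keyed by lemma. The table, the English word forms, and the function names are all hypothetical and chosen for readability; this is not the data or code used by the CompTek programmers.

```python
from collections import defaultdict

# Hypothetical in-memory table: word form -> lemma (English toy data).
FORM_TO_LEMMA = {
    "index": "index", "indexes": "index", "indexed": "index",
    "search": "search", "searches": "search", "searched": "search",
}

def lemma(word):
    """Map a word form to its lemma; unknown words are left unchanged."""
    w = word.lower()
    return FORM_TO_LEMMA.get(w, w)

def build_index(docs):
    """Build an inverted index keyed by lemma rather than by surface form."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in text.split():
            index[lemma(token)].add(doc_id)
    return index

def search(index, word):
    """Return ids of documents containing any form of the query word."""
    return sorted(index.get(lemma(word), set()))

docs = [
    "The robot indexes pages",
    "Users searched the archive",
    "An index of patents",
]
index = build_index(docs)
print(search(index, "indexed"))   # documents 0 and 2 match via the lemma "index"
```

Because both the documents and the query are reduced to lemmas, a query in one form finds documents that contain other forms of the same word.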
In 1994, based on the developed technologies, CompTek programmers created the "Bible
Computer Reference", an information retrieval system that worked with the text of the Bible. To
convert the text to electronic form, almost half of the book had to be typed manually. From 1995,
the company worked on the project "Academic publication of classics on CD-ROM", which
envisaged the release of full electronic academic editions of Alexander Griboedov and
Alexander Pushkin, with a dictionary of Griboedov's language. By 1996, an algorithm for
constructing hypotheses had been developed: if the queried word was not in the dictionary, the
search used the most similar dictionary words, and an inflection model was built from
them.
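The hypothesis-building step described above can be sketched as follows. The source does not say how similarity was measured, so this sketch assumes that "most similar" means sharing the longest word ending with a known dictionary word. The transliterated toy lexicon, the paradigm names, and the function names are all hypothetical.

```python
# Hypothetical paradigms: a list of endings, the first being the base form.
PARADIGMS = {
    "fem_a": ["a", "y", "e", "u", "oj"],
    "masc_hard": ["", "a", "u", "om", "e"],
}

# Hypothetical lexicon: known lemma -> paradigm name (transliterated Russian).
LEXICON = {
    "mashina": "fem_a",
    "kartina": "fem_a",
    "stol": "masc_hard",
    "zavod": "masc_hard",
}

def common_suffix_len(a, b):
    """Length of the longest common ending of two words."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def guess_paradigm(word):
    """Use the paradigm of the known word with the longest shared ending."""
    if word in LEXICON:
        return LEXICON[word]
    best = max(LEXICON, key=lambda known: common_suffix_len(word, known))
    if common_suffix_len(word, best) == 0:
        return None  # nothing similar enough: no hypothesis
    return LEXICON[best]

def inflect(word):
    """Build hypothetical word forms for a word, known or unknown."""
    paradigm = guess_paradigm(word)
    if paradigm is None:
        return [word]
    endings = PARADIGMS[paradigm]
    stem = word[: len(word) - len(endings[0])]
    return [stem + ending for ending in endings]

print(inflect("vitrina"))  # unknown word, inflected like "mashina"
print(inflect("komod"))    # unknown word, inflected like "zavod"
```

Matching on word endings is only one plausible similarity measure; it is used here because inflection in Russian is carried mostly by endings.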
When CompTek went online in 1995, its creators decided to adapt the search program to the new
conditions, so that the user could easily navigate the World Wide Web and find any necessary
information. To do this, the programmers rewrote the search program to work with Internet sites:
at first the search covered a limited number of sites, and then the entire Russian segment of the
Internet, the so-called Runet.
On September 23, 1997, the Yandex.ru search engine was demonstrated for the first time at the
Softool exhibition, and two months later natural-language queries were implemented.
Yandex was not the first search engine in Russia: Rambler appeared in 1996, and AltaVista even
earlier, in December 1995. AltaVista had the most powerful server of its time and was the fastest
among its competitors, processing millions of queries per day. Two months after yandex.ru, the
search engine "Aport" was announced (although it had first been shown back in February 1996).
The search engine, originally called Yandex-Web, indexed sites in the .su and .ru domains, as
well as foreign Russian-language pages. The program took into account the morphology of the
Russian language: its algorithms could find the initial form of a word, and a query for an exact
word form was also possible. The query language included the logical operators AND, OR, and
NOT and allowed searching within a single document, a paragraph, headings, and other fields, as
well as taking into account the distance between words. In the first period after the launch, the
search robot crawled 4 GB of text on 5 thousand servers every week. The documents found were
sorted by relevance, calculated from the position of the word, the frequency of its mentions in
the document, and the distance between the query words.
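The three relevance signals named above (word position, frequency of mention, and distance between words) can be combined in a minimal sketch. The source gives no formula, so the equal weighting, the AND-only matching, and the function names are all assumptions made for illustration.

```python
def positions(tokens, word):
    """Indexes at which the word occurs in the token list."""
    return [i for i, token in enumerate(tokens) if token == word]

def min_distance(pos_a, pos_b):
    """Smallest gap, in tokens, between any occurrences of two words."""
    return min(abs(a - b) for a in pos_a for b in pos_b)

def score(text, query_words):
    """Toy relevance score; every query word must occur (AND semantics)."""
    tokens = text.lower().split()
    total = 0.0
    found = []
    for word in query_words:
        pos = positions(tokens, word)
        if not pos:
            return 0.0                       # a required word is missing
        total += len(pos) / len(tokens)      # frequency of mention
        total += 1.0 / (1 + pos[0])          # earlier first position scores higher
        found.append(pos)
    for a, b in zip(found, found[1:]):
        total += 1.0 / (1 + min_distance(a, b))  # closer query words score higher
    return total

docs = [
    "yandex search engine indexes russian sites",
    "the engine of the car needs a search party for yandex",
    "rambler appeared in 1996",
]
query = ["search", "engine"]
ranked = sorted(docs, key=lambda d: score(d, query), reverse=True)
print(ranked[0])
```

Here the first document ranks highest because the query words occur early and next to each other, while the third scores zero because it contains neither word.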
Yandex twice tried to sell the morphology-aware search from Arkady Borkovsky's team: in 1996 to
Rambler, and in 2003 to Google, but neither of them considered the technology to be of any
importance.