Microsoft Word Avtoreferat (Muhiddinov M. N. Tatu)

Date: 25.02.2022
Figure 2. The proposed text detection and recognition methods and Uzbek language TTS synthesizer

In the chapter entitled «Development of text extraction and recognition method
and algorithm based on neural network», text detection and recognition methods
based on a fully convolutional neural network (FCN) and the Tesseract OCR model
were developed. The proposed methods are divided into four main stages, as shown
in Figure 2: 1) pre-processing, in which global contrast is enhanced using
histogram equalization; 2) text detection using the FCN; 3) text extraction and
recognition; and 4) the Uzbek language TTS synthesizer. End-to-end text
recognition approaches consist of two parts: text detection and text
recognition. It should be noted that existing approaches, both conventional and
deep neural network-based, mostly consist of many steps and components, which
makes them sub-optimal and time-consuming. Accordingly, the efficiency and
accuracy of such approaches are still far from sufficient.
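
The pre-processing stage named above, global contrast enhancement by histogram equalization, can be illustrated with a minimal pure-Python sketch. It is not the dissertation's implementation: the function name `equalize_histogram`, the list-of-rows image representation and the 256-level assumption are chosen here for illustration only.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization of a grayscale image.

    `image` is a non-empty list of rows of integer intensities in
    [0, levels - 1]. This representation is an assumption of the sketch.
    """
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function (CDF) of the intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, i.e. the CDF at the darkest pixel present.
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Flat image: every pixel has the same value, nothing to stretch.
        return [row[:] for row in image]

    # Map each intensity so that the output CDF is approximately linear.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


low_contrast = [[50, 50, 60], [60, 70, 80]]
enhanced = equalize_histogram(low_contrast)
```

In this example the narrow input range 50 to 80 is stretched so that the darkest pixel maps to 0 and the brightest to 255, which is the effect the pre-processing stage relies on before detection.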
To overcome these drawbacks, a fast and reliable scene text detection and
localization method with only two steps is proposed. The proposed method uses
an FCN model that directly produces word-level or text-line-level predictions,
without unnecessary and costly intermediate stages. A conventional
convolutional neural network (CNN) is not an effective way to determine the
precise location of text in an image. Therefore, the FCN model is used, as it
is expected to yield more accurate results.
Furthermore, several factors must be considered when designing neural networks
for text detection. Because the sizes of text regions vary widely, detecting
the presence of long sentences requires features from the late stages of a
neural network, while predicting accurate geometry around a short text region
requires low-level information from the early stages. For that reason, the
network must use features from several levels to satisfy both demands. The
proposed method merges feature maps gradually while keeping the upsampling
branches small. The result is a network that can use features from several
levels while keeping the computation cost low. The model can be decomposed
into three parts: the feature extractor, the feature-merging branch and the
output layer. The feature extractor can be a convolutional network pre-trained
on the ImageNet dataset, with interleaved convolution and pooling layers. Four
levels of feature maps, denoted $f_i$, are obtained from the feature extractor,
whose sizes are 1/32, 1/16, 1/8 and 1/4 of the input image, respectively.
Mathematically, the feature merging is formulated as:
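$$g_i = \begin{cases} \mathrm{unpool}(h_i) & \text{if } i \le 3 \\ \mathrm{conv}_{3\times 3}(h_i) & \text{if } i = 4 \end{cases} \tag{5}$$

$$h_i = \begin{cases} f_i & \text{if } i = 1 \\ \mathrm{conv}_{3\times 3}(\mathrm{conv}_{1\times 1}([g_{i-1}; f_i])) & \text{otherwise} \end{cases} \tag{6}$$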

where $g_i$ is the merge base, $h_i$ is the merged feature map, and the
operator [· ; ·] denotes concatenation along the channel axis.
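
To make the merging scheme easier to follow, the sketch below traces only the tensor shapes (channels, height, width) through the four merging steps; it does not compute any convolutions. Several things here are assumptions and not taken from the text: the function name `merge_shapes`, the example channel counts, the choice that unpooling doubles the spatial size, that convolutions preserve spatial size, that the final 3×3 convolution keeps the channel count, and that the input height and width are divisible by 32.

```python
def merge_shapes(height, width, extractor_channels, merged_channels):
    """Trace (channels, height, width) of each merged feature map h_i.

    extractor_channels: channels of f_1..f_4 (hypothetical values).
    merged_channels: output channels of the convolutions producing
    h_2..h_4 (hypothetical values).
    """
    strides = [32, 16, 8, 4]  # f_1..f_4 are 1/32, 1/16, 1/8, 1/4 of the input
    f = [(c, height // s, width // s)
         for c, s in zip(extractor_channels, strides)]

    h = []
    g_prev = None
    for i, (c, fh, fw) in enumerate(f, start=1):
        if i == 1:
            h_i = (c, fh, fw)  # h_1 = f_1
        else:
            # [g_{i-1}; f_i]: spatial sizes must agree before concatenation
            # along the channel axis.
            assert g_prev[1:] == (fh, fw), "spatial sizes must match"
            concat_channels = g_prev[0] + c
            assert concat_channels > 0
            # conv 1x1 then conv 3x3: assumed to keep height and width.
            h_i = (merged_channels[i - 2], fh, fw)
        h.append(h_i)

        if i <= 3:
            # g_i = unpool(h_i): assumed to double height and width.
            g_prev = (h_i[0], h_i[1] * 2, h_i[2] * 2)
        else:
            # g_4 = conv 3x3 (h_4): assumed to keep the shape unchanged.
            g_prev = h_i
    return h, g_prev


# Hypothetical channel counts for a 512 x 512 input image.
h_maps, g_last = merge_shapes(512, 512, [2048, 1024, 512, 256], [128, 64, 32])
```

With these example values the merged maps grow from 16 × 16 to 128 × 128, one quarter of the input resolution, while the channel count shrinks at each step, which is how the merging can stay cheap.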
In the next stages, once a text region has been detected, it can be cropped
and processed further to recognize the text. For this purpose, a Tesseract OCR
model trained on Uzbek Latin and Cyrillic alphabet characters can be used. The
proposed method also sends the recognized text to a TTS synthesizer for the
Uzbek language.
In this chapter of the dissertation, as a result of studying and analyzing the
words in the Uzbek dictionary, an electronic database of 31.5 thousand words
was formed and arranged in alphabetical order. The Uzbek language speech
synthesizer is based on the concatenation method and contains the pronunciation
of these words. To this end, the Uzbek vocabulary of 31.5 thousand words was
studied and all words were broken down into 2.5 thousand units, i.e. syllables.
To pronounce recognized text correctly and to keep the Uzbek language database
up to date, each recognized word is compared with the database: if the word
exists in the database, the system sends it to the Uzbek language TTS
synthesizer; otherwise, the word is sent to a language specialist to be
confirmed as a new word.
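
The database check described above can be sketched as a simple routing step. This is an illustration under stated assumptions, not the dissertation's code: the function name `route_recognized_words`, the use of an in-memory set as the word database, and the lower-casing and whitespace trimming before lookup are all hypothetical choices.

```python
def route_recognized_words(words, lexicon):
    """Split recognized words into those ready for TTS and those needing review.

    words: iterable of strings produced by the OCR stage.
    lexicon: set of known words (a stand-in for the Uzbek word database).
    Returns (to_tts, to_review) as two lists in input order.
    """
    to_tts, to_review = [], []
    for word in words:
        # Normalization is an assumption of this sketch.
        key = word.strip().lower()
        if not key:
            continue  # skip empty OCR fragments
        if key in lexicon:
            to_tts.append(key)      # known word: send to the TTS synthesizer
        else:
            to_review.append(key)   # unknown word: send to a language specialist
    return to_tts, to_review


# Tiny hypothetical lexicon standing in for the 31.5 thousand word database.
sample_lexicon = {"kitob", "maktab"}
known, unknown = route_recognized_words(["Kitob", "maktab", "xyzq", " "], sample_lexicon)
```

Words confirmed by the specialist would then be added to the database, so that they are routed to the synthesizer the next time they are recognized.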
In the fourth chapter of the dissertation,
