Date: 17.05.2023
Database   Mean length   Std      Eukaryota         Bacteria          Archaea
                                  Mean     Std      Mean     Std      Mean     Std
CATH       150.4         90.9     147.9    90.8     154.8    91.8     137.2    81.0
SCOP       196.8         129.7    179.5    128.9    215.4    129.5    189.5    115.1
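
Summary statistics of the kind tabulated above can be computed from a list of domain records. The following is a minimal sketch, assuming each record is a (superkingdom, length) pair; the records, the function name length_stats, and the "All" group label are hypothetical and are not taken from CATH or SCOP. The source does not say whether the population or the sample standard deviation was used, so the population form here is also an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: (superkingdom, domain length in residues).
# The values are invented for illustration only.
domains = [
    ("Eukaryota", 120), ("Eukaryota", 180),
    ("Bacteria", 150), ("Bacteria", 250),
    ("Archaea", 90), ("Archaea", 110),
]

def length_stats(records):
    """Return {group: (mean, std)}, including an 'All' group over every record."""
    groups = defaultdict(list)
    for kingdom, length in records:
        groups[kingdom].append(length)
        groups["All"].append(length)
    # pstdev is the population standard deviation (an assumption, see lead-in).
    return {name: (mean(vals), pstdev(vals)) for name, vals in groups.items()}

stats = length_stats(domains)
for name, (m, s) in stats.items():
    print(f"{name:10s} mean={m:.1f} std={s:.1f}")
```

Swapping pstdev for statistics.stdev would give the sample standard deviation instead.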

4. Discussion and conclusion
The exact identification of protein domains and their boundaries is one of the most important problems in the study of protein structure and function. Therefore, a number of domain prediction methods and databases have been developed, which can be divided into two categories: sequence-based and structure-based.
When the three-dimensional structure is known, accuracy is often not the main issue. The problem that needs to be considered is the ambiguity in domain definition: the same protein can often be decomposed into domains in more than one reasonable way. To the best of our knowledge, the recently developed Sword [53] is the only method that has tried to address this problem, by producing multiple alternative decompositions of a protein. More innovative multipartitioning algorithms are therefore needed to tackle it.
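
One way to make the ambiguity concrete is to measure how far two alternative decompositions of the same protein agree. The sketch below uses the Rand index over residue pairs, a general clustering-agreement measure chosen here for illustration; it is not the procedure used by Sword [53], and the toy label lists are hypothetical.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of residue pairs on which two decompositions agree.

    Each argument assigns one domain label per residue. A pair counts as
    agreement when both decompositions put the two residues in the same
    domain, or both put them in different domains. Label names themselves
    do not matter. Assumes equal lengths and at least two residues.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("decompositions must cover the same residues")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical six-residue protein: split into two domains vs. kept as one.
two_domains = [1, 1, 1, 2, 2, 2]
one_domain = [1, 1, 1, 1, 1, 1]
score = rand_index(two_domains, one_domain)
```

A score of 1.0 means the two decompositions group the residues identically; lower values indicate that they disagree on more residue pairs.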
The difficulty of obtaining experimental protein structures limits the scope of structure-based domain identification methods. Sequence-based methods have been developed on the assumption that members of a domain family share common sequence features. When close templates exist, such methods can achieve high prediction accuracy. However, accuracy decreases sharply when homologous templates are unavailable. A number of template-independent approaches have therefore been developed, most of them based on machine learning. Despite extensive research, predicting domain boundaries from sequence data alone remains a challenging problem, and the accuracy of most of these methods is not high enough for large-scale sequence annotation. Another problem is that sequence-based methods generally do not consider the prediction of discontinuous domains. With the development of machine learning algorithms and the improvement of contact map prediction, substantial progress can be expected in both domain prediction accuracy and discontinuous domain detection.
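
To illustrate what boundary prediction accuracy and discontinuous domains mean in practice, here is a minimal sketch. It scores predicted boundaries against reference boundaries with a greedy match inside a residue tolerance, and represents a domain as a list of segments so that discontinuous domains can be expressed. The 20-residue tolerance, the function names, and all coordinates are illustrative assumptions, not values taken from the methods discussed above.

```python
def boundary_hits(predicted, reference, tolerance=20):
    """Greedily count reference boundaries matched within +/- tolerance.

    Each predicted boundary can match at most one reference boundary.
    """
    unused = sorted(predicted)
    hits = 0
    for ref in sorted(reference):
        match = next((p for p in unused if abs(p - ref) <= tolerance), None)
        if match is not None:
            unused.remove(match)
            hits += 1
    return hits

def precision_recall(predicted, reference, tolerance=20):
    """Return (precision, recall) of predicted boundary positions."""
    hits = boundary_hits(predicted, reference, tolerance)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall

# A domain as a list of (start, end) segments, 1-based and inclusive.
# Hypothetical protein: domain_a is interrupted by domain_b.
domain_a = [(1, 80), (201, 260)]
domain_b = [(81, 200)]

def is_discontinuous(segments):
    """A domain made of more than one sequence segment is discontinuous."""
    return len(segments) > 1

def domain_length(segments):
    """Total number of residues across all segments of a domain."""
    return sum(end - start + 1 for start, end in segments)

precision, recall = precision_recall([95, 310], [100, 200])
```

A predictor that outputs only a flat list of boundary positions cannot express that the two segments of domain_a belong together, which is why discontinuous domains need a segment-based representation like the one above.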
Alongside the development of domain identification methods, a variety of protein domain databases have been constructed to classify protein sequences and structures. A newly identified protein can be assigned to a family by searching the available protein domain family databases. InterPro [78], an integrated domain family resource, has annotated 79.1% of the protein sequences in UniProt [9]. It is hoped that, as protein domain detection methods improve, the proportion of protein sequences with domain annotations will increase.
CRediT authorship contribution statement
Yan Wang: Conceptualization, Writing - review & editing, Formal analysis, Funding acquisition. Hang Zhang: Conceptualization, Writing - review & editing, Data curation. Haolin Zhong: Writing - review & editing. Zhidong Xue: Conceptualization, Supervision, Writing - review & editing, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.