6.1. BLAST vs. PSI-BLAST
BLAST stands for Basic Local Alignment Search Tool. First it constructs a dictionary of all the k-long words in the query sequence. Then it initiates a local alignment for each word match between the words in the query and the words in the database. The alignment is determined using ungapped extensions in both directions until the score drops below a statistical threshold. The output of BLAST is all the alignments with scores above that threshold.
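The seed-and-extend idea described above can be sketched in a few lines of Python. This is a toy illustration, not the BLAST implementation: the function names, the match/mismatch scores (+1/-1), the `drop` rule used to stop an extension, and the `min_score` cutoff are all assumptions made for the example. Seeds are exact word matches here, and a fixed score cutoff stands in for the statistical threshold.

```python
from collections import defaultdict


def build_word_index(query, k):
    """Dictionary of every k-long word in the query -> list of start positions."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    return index


def extend(query, subject, q, s, step, match, mismatch, drop):
    """Ungapped extension in one direction (step is +1 or -1).

    Stops at a sequence end, or once the running score has fallen more than
    `drop` below the best score seen (a simplified, assumed stopping rule).
    Returns (best_score, length_of_best_extension).
    """
    best = run = best_len = steps = 0
    while 0 <= q < len(query) and 0 <= s < len(subject):
        run += match if query[q] == subject[s] else mismatch
        steps += 1
        if run > best:
            best, best_len = run, steps
        elif best - run > drop:
            break
        q += step
        s += step
    return best, best_len


def blast_like(query, subject, k=3, min_score=5, match=1, mismatch=-1, drop=2):
    """Return sorted (score, query_start, subject_start, length) tuples."""
    index = build_word_index(query, k)
    hits = set()  # a set, because several seeds can extend to the same alignment
    for si in range(len(subject) - k + 1):
        for qi in index.get(subject[si:si + k], ()):
            right, right_len = extend(query, subject, qi + k, si + k, +1,
                                      match, mismatch, drop)
            left, left_len = extend(query, subject, qi - 1, si - 1, -1,
                                    match, mismatch, drop)
            score = k * match + left + right
            if score >= min_score:
                hits.add((score, qi - left_len, si - left_len,
                          left_len + k + right_len))
    return sorted(hits, reverse=True)
```

For example, `blast_like("ACGTACGT", "TTACGTACGTTT")` returns the full-length alignment `(8, 0, 2, 8)` as its top hit: score 8, starting at query position 0 and subject position 2, with length 8.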
PSI-BLAST is an extension of the original BLAST algorithm that implements an iterative approach. The basic steps are as follows:
Find all pairwise alignments of query x to sequences in database D.
Collect all matches of x to y with some minimum significance.
Construct position-specific matrix M.
Using M, search D for more matches.
Iterate until convergence.
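The iteration above can be illustrated with a deliberately simplified Python sketch. It is not how PSI-BLAST is implemented: alignments are ungapped windows of the query's length, the matrix M is a log-odds table built from raw counts with a pseudocount against a uniform background, and "minimum significance" is a plain score threshold. The names `build_pssm`, `best_window` and `psi_blast_like`, and all parameter values, are assumptions for the example.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_pssm(segments, pseudocount=1.0):
    """Position-specific matrix M: one {residue: log-odds score} dict per column.

    Assumes all segments have equal length and use only letters in ALPHABET.
    """
    background = 1.0 / len(ALPHABET)  # uniform background, an assumption
    pssm = []
    for pos in range(len(segments[0])):
        counts = {a: pseudocount for a in ALPHABET}
        for seg in segments:
            counts[seg[pos]] += 1
        total = sum(counts.values())
        pssm.append({a: math.log2(counts[a] / total / background)
                     for a in ALPHABET})
    return pssm


def best_window(pssm, seq):
    """Best-scoring ungapped window of seq under M, as (score, start) or None."""
    width = len(pssm)
    best = None
    for start in range(len(seq) - width + 1):
        score = sum(pssm[j][seq[start + j]] for j in range(width))
        if best is None or score > best[0]:
            best = (score, start)
    return best


def psi_blast_like(query, database, threshold, max_iter=10):
    """Iterate search -> collect matches -> rebuild M until the matches stop changing."""
    pssm = build_pssm([query])       # round 1: a matrix built from x alone
    matched = {}                     # database index -> matched segment
    for _ in range(max_iter):
        new_matched = {}
        for idx, seq in enumerate(database):
            hit = best_window(pssm, seq)
            if hit is not None and hit[0] >= threshold:
                new_matched[idx] = seq[hit[1]:hit[1] + len(query)]
        if new_matched == matched:   # convergence: no change in the match set
            break
        matched = new_matched
        pssm = build_pssm([query] + list(matched.values()))
    return matched, pssm
```

With `psi_blast_like("MKVLAA", ["GGMKVLAAGG", "GGMKILAAGG", "WWWWWWWWWW"], threshold=3.0)`, the first two database sequences are collected and the third is not. After the matrix is rebuilt, column 2 scores V highest, I next, and unseen residues such as W lowest.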
7. TRILOGY
TRILOGY is a program which identifies sequence-structure patterns in proteins. A pattern object in TRILOGY consists of a sequence pattern that specifies the spacing and amino acid type of the three residues and a structure pattern that specifies the three-dimensional arrangement and orientation of the residues. A residue triplet that matches a particular structure pattern must have Cα-Cα distances and Cα-Cβ vectors that agree with those of the structure pattern to within 1.5 Å and 60° respectively. TRILOGY scores each triplet pattern such that the score reflects the degree of correlation between the sequence and structure components. This score is basically the probability of seeing the number of pattern matches if the sequence and structure matches are chosen independently and at random.
Figure 8: Examples of long patterns found by TRILOGY
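One way to make the score above concrete is a hypergeometric tail probability. The notes do not say which distribution TRILOGY uses, so the model below is an assumption: out of `total` residue triplets, `seq_matches` match the sequence pattern and `struct_matches` match the structure pattern. If the two sets were chosen independently and at random, how likely is it that at least `both_matches` triplets fall in both? The function name and arguments are hypothetical.

```python
from math import comb


def independence_p_value(total, seq_matches, struct_matches, both_matches):
    """P(X >= both_matches) for X ~ Hypergeometric(total, seq_matches, struct_matches).

    A small value means the sequence and structure components co-occur far
    more often than independence would predict.
    """
    upper = min(seq_matches, struct_matches)
    favourable = sum(
        comb(seq_matches, k) * comb(total - seq_matches, struct_matches - k)
        for k in range(both_matches, upper + 1)
    )
    return favourable / comb(total, struct_matches)
```

For instance, with 10 triplets in total, 3 sequence matches and 3 structure matches, seeing all 3 coincide has probability `independence_p_value(10, 3, 3, 3)`, which is 1/120 under this model.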
The TRILOGY algorithm begins by analyzing three-residue patterns and selecting a subset of these patterns as seeds for identifying longer patterns. This extension of the patterns is implemented by "gluing" together two three-amino-acid patterns that overlap in two amino acids. TRILOGY scores these extended patterns by counting the number of matches to the pattern and its sequence and structure components independently. Some examples of patterns found by TRILOGY are shown in Figure 8.
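The gluing step can be sketched for the sequence component alone. The representation below is an assumption made for illustration: a pattern is a tuple of residue types plus the sequence separations between consecutive residues. Only the case where the last two residues of one triplet coincide with the first two of the other is handled, and the structure component is left out entirely. `SeqPattern`, `glue` and `count_matches` are hypothetical names.

```python
from typing import NamedTuple, Optional, Tuple


class SeqPattern(NamedTuple):
    residues: Tuple[str, ...]  # amino acid types, in sequence order
    gaps: Tuple[int, ...]      # gaps[i] = separation between residue i and i+1


def glue(a: SeqPattern, b: SeqPattern) -> Optional[SeqPattern]:
    """Glue b onto a if a's last two residues (and their gap) equal b's first two."""
    if a.residues[-2:] == b.residues[:2] and a.gaps[-1:] == b.gaps[:1]:
        return SeqPattern(a.residues + b.residues[2:], a.gaps + b.gaps[1:])
    return None  # the patterns do not overlap in two amino acids this way


def count_matches(p: SeqPattern, sequence: str) -> int:
    """Count the start positions at which the sequence pattern matches."""
    offsets = [0]
    for g in p.gaps:
        offsets.append(offsets[-1] + g)
    span = offsets[-1]
    return sum(
        all(sequence[start + off] == res
            for off, res in zip(offsets, p.residues))
        for start in range(len(sequence) - span)
    )
```

As a usage example, gluing `SeqPattern(("C", "C", "H"), (2, 3))` to `SeqPattern(("C", "H", "H"), (3, 2))` gives the four-residue pattern `SeqPattern(("C", "C", "H", "H"), (2, 3, 2))`. The match counts of the glued pattern and of its parts can then be compared with `count_matches`.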
8. References
S. Batzoglou, Protein Structure, Motifs and their Identification, Apr 2004.
Wikipedia, Alpha helix, http://en.wikipedia.org/wiki/Alphahelix, Mar 2004.
Wikipedia, Beta sheet, http://en.wikipedia.org/wiki/Betasheet, Mar 2004.
P. Bradley, P. Kim, B. Berger, TRILOGY: Discovery of sequence-structure patterns across diverse proteins, PubMed, Jun 2002.