Article Catalogue (Journal Indexing)

Title:
Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases
Authors:
Wallqvist, A; Fukunishi, Y; Murphy, LR; Fadel, A; Levy, RM;
Addresses:
Rutgers State Univ, Dept Chem, Wright Rieman Labs, Piscataway, NJ 08854 USA
Journal title:
BIOINFORMATICS
issue: 11, volume: 16, year: 2000,
pages: 988 - 1002
SICI:
1367-4803(200011)16:11<988:ISSSFP>2.0.ZU;2-Q
Source:
ISI
Language:
ENG
Subject:
SECONDARY STRUCTURE PREDICTION; MYCOPLASMA-GENITALIUM PROTEINS; HIDDEN MARKOV-MODELS; METHANOCOCCUS-JANNASCHII; SIMILARITY SEARCHES; REMOTE HOMOLOGS; IDENTIFICATION; CLASSIFICATION; MATRICES; FEATURES;
Document type:
Article
Nature:
Periodical
Subject area:
Life Sciences
Citations:
77
Review:
Reprint addresses:
Address: Wallqvist, A, Rutgers State Univ, Dept Chem, Wright Rieman Labs, 610 Taylor Rd, Piscataway, NJ 08854 USA
Citation:
A. Wallqvist et al., "Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases", BIOINFORMAT, 16(11), 2000, pp. 988-1002

Abstract

Motivation: Sequence alignment techniques have been developed into extremely powerful tools for identifying the folding families and function of proteins in newly sequenced genomes. For a sufficiently low sequence identity it is necessary to incorporate additional structural information to positively detect homologous proteins. We have carried out an extensive analysis of the effectiveness of incorporating secondary structure information directly into the alignments for fold recognition and identification of distant protein homologs. A secondary structure similarity matrix based on a database of three-dimensionally aligned proteins was first constructed. An iterative application of dynamic programming was used which incorporates linear combinations of amino acid and secondary structure sequence similarity scores. Initially, only primary sequence information is used. Subsequently, contributions from secondary structure are phased in and new homologous proteins are positively identified if their scores are consistent with the predetermined error rate.

Results: We used the SCOP40 database, where only PDB sequences that have 40% homology or less are included, to calibrate homology detection by the combined amino acid and secondary structure sequence alignments. Combining predicted secondary structure with sequence information results in an 8-15% increase in homology detection within SCOP40 relative to the pairwise alignments using only amino acid sequence data at an error rate of 0.01 errors per query; a 35% increase is observed when the actual secondary structure sequences are used. Incorporating predicted secondary structure information in the analysis of six small genomes yields an improvement in the homology detection of approximately 20% over SSEARCH pairwise alignments, but no improvement in the total number of homologs detected over PSI-BLAST, at an error rate of 0.01 errors per query. However, because the pairwise alignments based on combinations of amino acid and secondary structure similarity are different from those produced by PSI-BLAST, and the error rates can be calibrated, it is possible to combine the results of both searches. An additional 25% relative improvement in the number of genes identified at an error rate of 0.01 is observed when the data are pooled in this way. Similarly, for the SCOP40 dataset, PSI-BLAST detected 15% of all possible homologs, whereas the pooled results increased the total number of homologs detected to 19%. These results are compared with recent reports of homology detection using sequence profiling methods.
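
Illustrative note: the iterative scheme described in the abstract (dynamic programming over a linear combination of amino acid and secondary structure similarity, with the secondary structure weight phased in) can be sketched roughly as below. This is a minimal sketch and not the authors' implementation. The match/mismatch scores, the linear gap penalty, the weight schedule, and the fixed score threshold are all placeholder assumptions; the paper uses real similarity matrices and a cutoff calibrated to an error rate. The names `aa_score`, `ss_score`, `combined_local_score`, and `iterative_search` are hypothetical.

```python
def aa_score(a, b):
    # Toy amino acid similarity (assumption): stands in for a real substitution matrix.
    return 2.0 if a == b else -1.0


def ss_score(x, y):
    # Toy secondary structure similarity (assumption) over states such as H, E, C.
    return 2.0 if x == y else -2.0


def combined_local_score(seq_a, ss_a, seq_b, ss_b, w, gap=4.0):
    """Smith-Waterman local alignment score where each aligned position scores
    (1 - w) * aa_score + w * ss_score, with a linear gap penalty (assumption)."""
    assert len(seq_a) == len(ss_a) and len(seq_b) == len(ss_b)
    prev = [0.0] * (len(seq_b) + 1)
    best = 0.0
    for i in range(1, len(seq_a) + 1):
        curr = [0.0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            s = ((1.0 - w) * aa_score(seq_a[i - 1], seq_b[j - 1])
                 + w * ss_score(ss_a[i - 1], ss_b[j - 1]))
            # Local alignment: extend diagonally, open a gap, or restart at zero.
            curr[j] = max(0.0, prev[j - 1] + s, prev[j] - gap, curr[j - 1] - gap)
            best = max(best, curr[j])
        prev = curr
    return best


def iterative_search(query, database, weights=(0.0, 0.5, 1.0), threshold=6.0):
    """Start with sequence only (w = 0), then phase in secondary structure.
    A database entry is accepted at the first weight where its score reaches
    the threshold; the threshold stands in for a calibrated error-rate cutoff."""
    q_seq, q_ss = query
    hits = {}
    for w in weights:
        for name, (seq, ss) in database.items():
            if name in hits:
                continue  # already identified at an earlier weight
            score = combined_local_score(q_seq, q_ss, seq, ss, w)
            if score >= threshold:
                hits[name] = (w, score)
    return hits


if __name__ == "__main__":
    db = {"close": ("ACDE", "HHHH"), "remote": ("GGGG", "HHHH")}
    print(iterative_search(("ACDE", "HHHH"), db))
```

In this toy setup the entry with the identical sequence is found in the sequence-only pass, while the entry that shares only secondary structure is picked up once the structure term carries enough weight. A realistic version would recalibrate the acceptance cutoff at each weight rather than reuse one threshold.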

ASDD Area Sistemi Dipartimentali e Documentali, Università di Bologna, Catalogue of journals and other periodicals
Document generated on 24/09/20 at 02:03:17