Magnum provides evolutionary and structural bioinformatics resources that are useful for identifying experimentally testable hypotheses about the molecular basis of protein behaviors and functions, as illustrated with the examples from the cellular retinoid binding proteins.

The amino acid sequences from a set of homologous proteins contain an imperfect record of the history of sequence divergence within that protein family. Much of this history can be modeled by a process that formally reconstructs the sequences of ancestral proteins throughout an evolutionary tree, given a multiple sequence alignment relating individual sites in the descendent proteins. These reconstructions are generally presented in probabilistic form: for each site at each point in the tree, a vector whose coefficients sum to unity gives the probability, conditional on the reconstruction model, that each of the standard amino acids occupied that site at that point. Reconstructed ancestral sequences can be viewed as an expedient representation of extant sequence data, because they include all of the sequence information in a way that represents a best-guess model of the historical reality. It is not surprising, therefore, that explicitly considering reconstructed ancestral sequences is a powerful tool for interpreting sequence data. For example, explicit consideration of ancestral states provided an understanding of the general features of correlated changes in proteins, an understanding that eluded a perplexed literature that attempted to analyze correlated change by leaf-leaf comparisons. Likewise, many compelling tools for predicting the folded structure of proteins are based on analyses that consider reconstructed ancestral sequences.
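The probabilistic form of a reconstructed ancestral site can be pictured with a small sketch. This is an illustration only, not Magnum's own data format: the function names `normalize` and `most_probable` and the example weights are hypothetical, chosen to show a 20-entry vector whose coefficients sum to unity.

```python
import math

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def normalize(weights):
    """Return a 20-entry probability vector (as a dict) that sums to 1."""
    unknown = set(weights) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"not standard amino acids: {sorted(unknown)}")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Residues absent from the input get probability 0.
    return {aa: weights.get(aa, 0.0) / total for aa in AMINO_ACIDS}


def most_probable(site):
    """Return the residue with the highest probability at this site."""
    return max(site, key=site.get)


# Hypothetical relative support for one site at one internal tree node.
site = normalize({"L": 6.0, "I": 3.0, "V": 1.0})

# The coefficients sum to unity, as the probabilistic form requires.
assert math.isclose(sum(site.values()), 1.0)
print(most_probable(site), round(site["L"], 2))
```

Keeping the full vector, rather than only the single most probable residue, preserves the uncertainty of the reconstruction for any downstream analysis.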
When accurate models for the divergent evolution of protein sequences are integrated with complementary biological information, such as folded protein structures, analyses of the combined data often lead to new hypotheses about molecular physiology. This represents an excellent example of how bioinformatics can be used to guide experimental research. However, progress in this direction has been slowed by the lack of a publicly available resource suitable for general use. The precomputed Magnum database offers a solution to this problem for ca. 1,800 full-length protein families with at least one crystal structure. The Magnum deliverables include 1) multiple sequence alignments, 2) mapping of alignment sites to crystal structure sites, 3) phylogenetic trees, 4) inferred ancestral sequences at internal tree nodes, and 5) amino acid replacements along tree branches. Comprehensive evaluations revealed that the automated procedures used to construct Magnum produced accurate models of how proteins divergently evolve, or genealogies, and correctly integrated these with the structural data. To demonstrate Magnum's capabilities, we asked for amino acid replacements requiring three nucleotide substitutions, located at internal protein structure sites, and occurring on short phylogenetic tree branches. In the cellular retinoid binding protein family, a site that potentially modulates ligand binding affinity was discovered. Recruitment of cellular retinol binding protein to function as a lens crystallin in the diurnal gecko afforded another opportunity to showcase the predictive value of a browsable database containing branch replacement patterns integrated with protein structures. We integrated two areas of protein science, evolution and structure, on a large scale and created a precomputed database, known as Magnum, which is the first freely available resource of its kind.
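The demonstration query described above (replacements requiring three nucleotide substitutions, at internal structure sites, on short branches) can be sketched as a simple filter. This is a rough illustration under stated assumptions, not Magnum's actual schema or interface: the `Replacement` record, its field names, the `buried` flag, and the 0.05 branch-length cutoff are all hypothetical. Only the standard genetic code is taken as given.

```python
from dataclasses import dataclass
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AA_BY_CODON = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AA_BY_CODON):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def min_nucleotide_substitutions(aa_from, aa_to):
    """Fewest base changes needed to turn any codon of one residue into the other."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS_FOR[aa_from]
        for c2 in CODONS_FOR[aa_to]
    )


@dataclass
class Replacement:
    """Hypothetical record for one amino acid replacement on one tree branch."""
    site: int             # alignment site mapped to the crystal structure
    ancestral: str        # residue at the parent node
    derived: str          # residue at the child node
    branch_length: float  # assumed units: expected substitutions per site
    buried: bool          # True if the site is internal to the folded structure


def select_candidates(replacements, max_branch_length=0.05):
    """Keep three-substitution replacements at buried sites on short branches."""
    return [
        r for r in replacements
        if r.buried
        and r.branch_length <= max_branch_length
        and min_nucleotide_substitutions(r.ancestral, r.derived) == 3
    ]


# Made-up records for illustration only; these are not Magnum data.
records = [
    Replacement(12, "M", "Y", 0.02, True),   # passes all three criteria
    Replacement(40, "L", "I", 0.01, True),   # needs only one substitution
    Replacement(77, "W", "N", 0.30, True),   # branch too long
    Replacement(93, "F", "K", 0.03, False),  # surface site
]
hits = select_candidates(records)
print([r.site for r in hits])
```

Counting substitutions as the minimum over all codon pairs is one reasonable convention; a real analysis might instead use the reconstructed ancestral codons where nucleotide data are available.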