Pearson, William R.
Professor, Biochemistry and Molecular Genetics
- BS, Chemistry, University of Illinois, Urbana-Champaign
- PhD, Biochemistry, California Institute of Technology
- Postdoc, Molecular Biology, Johns Hopkins School of Medicine
Bioinformatics and Genomics, Biotechnology, Computational Biology
Protein Evolution; Computational Biology
We have a long-standing interest in exploiting protein sequence information, both to better understand how new protein sequences arise and to understand the relationship between protein sequence and protein structure. Since the description of the FASTP program in 1985, our group has been developing more effective methods for identifying distantly related protein sequences. Over the past 10 years, state-of-the-art methods have improved to the point where proteins that diverged from a common ancestor within the past billion years are likely to be detected by sequence similarity searching. We hope to push that threshold back beyond 2 billion years (near the time when prokaryotes and eukaryotes diverged), but it is already possible to identify novel proteins that are likely to have emerged in the last 500 to 800 million years. If we can identify proteins that emerged in the last 100 to 250 million years, it may be possible to identify the mechanisms by which new proteins are formed.
We are also exploring alignment-based strategies for integrating variation, domain, and functional annotations into protein and DNA sequence alignments. Traditionally, alignment programs display only the aligned protein or DNA sequences; to learn about the function of a homologous sequence, an investigator must follow links and read separate web pages. The latest version of the FASTA program integrates functional and variation information directly into its alignment displays.