Andreas D. Baxevanis, Ph.D.
Deputy Director
Division of Intramural Research
Director
Computational Genomics Program
Associate Investigator
Genome Technology Branch
B.S. Cornell University, 1984
Ph.D. The Johns Hopkins University, 1991
(301) 496-8570
(301) 480-2634
andy@nhgri.nih.gov
Building 50, Room 5222B
50 South Drive, MSC 8002
Bethesda, MD 20892-8002
With the advent of the new millennium, the scientific community marked a significant milestone in the study of biology: completion of the working draft of the human genome. This work signals a new beginning for modern biology, one in which more and more biological and biomedical research will be performed in a sequence-based fashion. This new approach promises to quickly lead to advances not only in the understanding of basic biological processes, but also in the prevention, diagnosis and treatment of many genetic and genomic disorders.
One of the major focuses of this group involves the testing of candidate sequences for compatibility against known three-dimensional structures. The major technique employed is called homology model building, or threading, and this technique relies on the tenet that three-dimensional structure is conserved to a greater extent than sequence, and that a number of sequences can adopt similar conformations. This technique, therefore, allows for the identification of significant structural similarities, even when traditional sequence alignments do not show an obvious relationship. More importantly, it allows for an assessment of the effect of a mutation on a protein, an assessment that can help to discern the underlying cause of phenotypes characterizing a given genetic or genomic disorder.
One case where we have recently utilized this approach involves two different proteins belonging to the homeodomain family. These proteins play a fundamental role in a diverse set of functions that include the specification of body plan, pattern formation and cell fate determination during metazoan development. In the first study, we examined mutations in the homeobox gene PITX2, a gene that is responsible for a range of clinical phenotypes involving ocular and craniofacial development. Several mutations within the PITX2 homeodomain region are specifically responsible for the development of the related autosomal-dominant disorders Rieger syndrome and iridogoniodysgenesis. Here, the threading-based analysis revealed that point mutations responsible for the development of these genetic disorders lead to the inability of PITX2 to adopt its proper structure and bind to the regulatory sequences of its target gene(s), which, in turn, affects its metabolic role in the cell. In a related study, threading-based analysis of mutations in the winged-helix FOXC1 transcription factor, found in Axenfeld-Rieger patients, pointed to a particular mutant (I87M) as reducing the stability of the FOXC1 protein; biochemical analyses on this mutant also indicated that the I87M mutation reduced FOXC1 protein stability.
Additional studies on the homeodomain proteins have centered on the evolutionary relationships between members of this protein family. All members of this family are characterized by a helix-turn-helix DNA-binding motif, and these proteins regulate various cellular processes by specifically binding to the transcriptional control region of a target gene. An evolutionary classification of 129 human homeodomain proteins, many of which are involved in inherited human disorders when mutated, indicates that these proteins segregate into six distinct classes and that this classification is consistent with the known structural and functional characteristics of these proteins. This analysis, coupled with recent observations from the initial analysis of the human genome sequence, provides some insight as to the pattern of distribution of the homeobox genes within the genome and to the array of functions that can be performed by these proteins.
In addition to basic research questions directly involving human disease genes, our group is involved in the development and application of automated methods for the analysis of sequence and expression data. For example, the WebBLAST suite of programs is intended to assist in organizing sequence data and to provide first-pass sequence analysis in an automated fashion. Data processing is fully automated, with end-users being presented with both tabular and graphical summaries of data that can be viewed using any Web browser. Following this, a logical next step to the archiving and basic interpretation of sequence data is a higher-level analysis involving the prediction of putative genes based solely on raw sequence data. To this end, a free-standing program called GeneMachine was developed, with the aim of helping researchers find potential coding regions and deduce gene structures within long stretches of what is, essentially, anonymous DNA.
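To make the idea of finding potential coding regions in anonymous DNA concrete, the following is a minimal sketch of a naive open reading frame (ORF) scan in Python. It is not GeneMachine's method and is far simpler than real gene prediction: it looks only at the forward strand, ignores introns, and treats any ATG-to-stop run as a candidate. The function name find_orfs and the min_codons parameter are hypothetical, chosen for illustration only.

```python
# Illustrative only: a naive forward-strand ORF scan, not GeneMachine's algorithm.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is the 0-based index of the ATG, end is exclusive and includes
    the stop codon, and frame is 0, 1 or 2. min_codons counts the codons
    from the ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume the search after this stop codon
    return orfs
```

Calling find_orfs(sequence, min_codons=100) on a long genomic sequence would return only the longer candidates; each hit is at best a starting point for closer inspection of the kind that dedicated gene-prediction tools perform.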
Finally, with the development of techniques such as microarrays and the serial analysis of gene expression (SAGE), the means are now at hand to perform large-scale and whole-genome expression studies. In addition to advancing our current understanding of gene expression and regulation, such studies also present substantial challenges for data management and analysis. For this reason, it becomes important to focus on effective informatics methods with which to make valid biological conclusions. To address this need, we have developed ArrayDB 2.0, a software suite that provides an interactive user interface for the mining and analysis of microarray gene expression data. The program is a Sybase relational database with a Web front-end that allows for the identification of clones with high or low red/green intensity ratios based on user-defined confidence limits. The software also provides value-added information in the form of links to offsite databases such as UniGene, GeneCards, and KEGG. As of this writing, ArrayDB is the only freely available software package for the analysis of array data. Continuing work in this area involves the development of more robust clustering methods for the analysis of such data, as well as the investigation of the idea of genetic profiles, whereby one might be able to apply expression-based data to questions regarding patient diagnosis and treatment.
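The selection of clones with high or low red/green intensity ratios can be sketched in a few lines of Python. This is an assumption-laden illustration, not ArrayDB's implementation: it assumes the ratios are log2-transformed and that the user-defined confidence limit is expressed as a number of standard deviations around the mean log ratio. The function name flag_outlier_clones and the z_limit parameter are hypothetical.

```python
import math
import statistics


def flag_outlier_clones(clones, z_limit=1.96):
    """Split clones into high and low red/green outliers.

    clones maps a clone ID to a (red, green) intensity pair. A clone is
    flagged when its log2(red/green) lies more than z_limit sample
    standard deviations above or below the mean log ratio.
    Assumption: this cutoff rule is illustrative, not ArrayDB's own.
    """
    # Skip clones with non-positive intensities, whose log ratio is undefined.
    log_ratios = {
        clone_id: math.log2(red / green)
        for clone_id, (red, green) in clones.items()
        if red > 0 and green > 0
    }
    if len(log_ratios) < 2:
        return [], []  # a standard deviation needs at least two values
    mean = statistics.mean(log_ratios.values())
    sd = statistics.stdev(log_ratios.values())
    upper = mean + z_limit * sd
    lower = mean - z_limit * sd
    high = sorted(c for c, v in log_ratios.items() if v > upper)
    low = sorted(c for c, v in log_ratios.items() if v < lower)
    return high, low
```

Raising z_limit tightens the selection, so fewer clones are reported; a real analysis would also need to normalize the two channels before comparing ratios.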

