Bioinformatics / Computational Biology Probabilistic Modeling / Machine Learning
A. D. Long, H. J. Mangalam, B. Y. P. Chan, L. Tolleri, G. W. Hatfield, and P. Baldi. Improved Statistical Inference from DNA Microarray Data Using Analysis of Variance and a Bayesian Statistical Framework. Journal of Biological Chemistry, 276, 23, 19937-19944, (2001).
Book: P. Baldi. The Shattered Self: The End of Evolution, MIT Press, in press, (2001).
Book: P. Baldi and S. Brunak. Bioinformatics: the Machine Learning Approach, MIT Press, (1998). Second Edition (2001).
Book: P. Baldi and G. W. Hatfield. DNA Microarrays and Gene Regulation, MIT Press, (expected Fall 2001).
G. Pollastri, P. Baldi, P. Fariselli, and R. Casadio. Improved Prediction of the Number of Residue Contacts in Proteins by Recurrent Neural Networks. Bioinformatics, 17, Supplement 1, S234-S242, (2001).
P. Baldi and A. D. Long. A Bayesian Framework for the Analysis of Microarray Expression Data: Regularized t-Test and Inference of Gene Changes. Bioinformatics, 17, 6, 509-519, (2001).
P. Baisnée, P. Baldi, S. Brunak, and A. Gorm Pedersen. Flexibility of the Genetic Code with Respect to DNA Structure. Bioinformatics, 17: 237-248, (2001).
P. Baldi and P. Baisnée. Sequence Analysis by Additive Scales: DNA Structure for Sequences and Repeats of All Lengths. Bioinformatics, 16, 10, 865-889, (2000).
P. Baldi, S. Brunak, Y. Chauvin, and H. Nielsen. Assessing the Accuracy of Prediction Algorithms for Classification: An Overview. Bioinformatics, 16, 5, 412-424, (2000).
P. Baldi, S. Brunak, P. Frasconi, G. Soda, and G. Pollastri. Exploiting the Past and the Future in Protein Secondary Structure Prediction. Bioinformatics, 15, 937-946, (1999).
Layer leader, California Institute for Telecommunications and Information
Dr. Baldi's group works at the intersection of biological and computer sciences, using probabilistic/machine learning techniques to address biological problems and mine large data sets produced by massive data acquisition technologies, such as genome sequencing, high-throughput drug screening, and DNA microarrays. Current projects include the prediction of protein secondary and tertiary structure, the study of DNA structure in relation to several biological processes (protein binding, gene regulation, triplet repeat expansion diseases), and the analysis of gene expression data. Dr. Baldi's group is building a suite of genomics and proteomics programs for the prediction of protein structure and function and the analysis of microarray data.