Ongoing Projects

Overview

We envision a future in which the complex multitude of biological cellular processes can be controlled in real time and in vivo, as the trajectories of moving bodies are controlled today. In this future, cancer and disease can be stopped or reversed, damaged tissues can be engineered to regenerate, and aging can be slowed or even halted altogether. For this to become a reality, it will be necessary to understand and be able to predict the progression of biological cellular processes, just as Newton could "calculate the motion of heavenly bodies (but not the madness of the people)," and NASA can plot the trajectories of its spacecraft.

Biology and medicine today may be at a point similar to where physics was after the advent of the telescope. Recent advances in high-throughput technologies, such as DNA microarrays, enable acquisition of different types of molecular biological data, such as DNA copy number, RNA expression and proteins' DNA-binding occupancy levels, on genomic and proteomic scales. For the first time it is possible to record the complete signals that guide the progression of biological cellular processes. The rapidly growing number of large-scale datasets holds the key to a fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment and drug development, just as the astronomical tables compiled by Galileo and Brahe enabled accurate predictions of planetary motions and, later, the discovery of universal gravitation. Just as Kepler and Newton made these predictions and discoveries by using mathematical frameworks to describe trends in astronomical data, so future discovery, prediction and control in biology and medicine will come from the mathematical modeling of genomic and proteomic data.

In the Genomic Signal Processing Lab at UT Austin we are creating models from these large-scale molecular biological data, through adaptations and generalizations of mathematical frameworks that have proven successful in describing the physical world in such diverse areas as mechanics and perception. To this end, we created the first data-driven models from these data using frameworks from matrix and tensor computations.

We showed that the singular value decomposition (SVD) model, the generalized SVD (GSVD) comparative model and the pseudoinverse projection integrative model provide mathematical descriptions of the genetic networks that generate and sense the measured data, where the mathematical variables and operations represent biological reality. The variables, which are patterns uncovered in the data, are associated with activities of cellular elements, such as regulators or transcription factors, that drive the measured signal, and with cellular states in which these elements are active. The operations, such as data reconstruction, rotation and classification in subspaces of selected patterns, simulate experimental observation of only the cellular programs that these patterns represent. Similarly, the eigenvalue decomposition (EVD) model, the pseudoinverse projection integrative model and a tensor higher-order EVD (HOEVD) model provide mathematical descriptions of the pathways that compose the cellular system from genome-scale nondirectional networks of correlations among the genes of the system, which are computed from the measured data.
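
To make the idea of data reconstruction in a subspace of selected patterns concrete, the following is a minimal sketch in Python with NumPy. It is an illustration only, not the lab's code: the function name svd_filter, the synthetic genes-by-arrays matrix, and the choice of which pattern to drop are all assumptions made for this example. It decomposes the matrix with the SVD, reports the fraction of the overall signal that each pattern captures, and rebuilds the data after zeroing the singular value of one pattern.

```python
import numpy as np


def svd_filter(data, drop):
    """Rebuild `data` without the SVD patterns whose indices are in `drop`.

    `data` is a genes x arrays matrix. Returns the filtered matrix and the
    fraction of the overall signal captured by each pattern.
    (Hypothetical helper, written for this illustration.)
    """
    # Columns of u: patterns across genes; rows of vt: patterns across arrays.
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    fractions = s**2 / np.sum(s**2)

    # Zero the singular values of the patterns to be removed, then rebuild.
    s_kept = s.copy()
    s_kept[list(drop)] = 0.0
    filtered = (u * s_kept) @ vt
    return filtered, fractions


# Synthetic data (an assumption, not real measurements): a constant offset
# per gene plus one oscillation across a 12-array time course, plus noise.
rng = np.random.default_rng(0)
n_genes, n_arrays = 200, 12
t = np.arange(n_arrays)
offset = np.outer(rng.normal(5.0, 1.0, n_genes), np.ones(n_arrays))
oscillation = np.outer(rng.normal(0.0, 1.0, n_genes),
                       np.cos(2 * np.pi * t / n_arrays))
noise = 0.01 * rng.normal(size=(n_genes, n_arrays))
data = offset + oscillation + noise

# Drop the most significant pattern, which here is dominated by the offset.
filtered, fractions = svd_filter(data, drop=[0])
print("signal fraction per pattern:", np.round(fractions[:3], 3))
```

In this synthetic case the first pattern is dominated by the constant offset, so removing it leaves mostly the oscillation. With real data, which patterns to keep or remove would have to be decided from their biological or experimental interpretation.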

We illustrated these models in the analyses of RNA expression data from yeast and human during their cell cycle programs and of DNA-binding data from yeast cell cycle, mating and biosynthesis transcription factors and replication initiation proteins. Two alternative pictures of RNA expression oscillations during the cell cycle emerged from these analyses. Both parallel well-known designs of physical oscillators, and they convey the capacity of the models to elucidate the design principles of cellular systems as well as to guide the design of synthetic ones. In these analyses, the power of the models to predict previously unknown biological principles was demonstrated by the prediction of a novel mechanism of regulation that correlates DNA replication initiation with cell cycle-regulated RNA transcription in yeast.

Our models may become the foundation of a future in which biological systems are modeled as physical systems are today. The novel mechanism of regulation we predicted may form the basis of a future in which the cell division cycle and cancer can be controlled.