Warnow named a Fellow of ACM

Founder Professor Tandy Warnow, in the departments of Bioengineering and of Computer Science at the University of Illinois at Urbana-Champaign, has been named a 2015 Fellow of the Association for Computing Machinery (ACM). The award recognizes Warnow “for contributions to mathematical theory, algorithms, and software for large-scale molecular phylogenetics and historical linguistics.” She is one of 42 ACM members being recognized this year for their research contributions.

Warnow is an expert in the application of mathematics and computer science to develop algorithms for complex problems in the fields of phylogenomics (the intersection of evolution and genomics), metagenomics (the study of genetic material in the environment), and historical linguistics.

In both the biological and linguistic aspects of her research, Warnow studies ways to understand and estimate the evolution of a collection of genes, species, or languages from their common ancestor. Her more recent work has focused on the biological side: developing methods that can analyze large, complex datasets with better accuracy than existing methods, and collaborating with biologists to apply those methods to biological datasets.

Warnow has also contributed to mathematical theory in phylogenetics and was part of a team of researchers who developed the first absolute fast converging (AFC) methods. These methods carry a theoretical guarantee of reconstructing the true evolutionary tree with high probability from polynomial-length sequences, that is, sequences whose length grows only polynomially with the number of species.

She has focused attention on the problem of performing large multiple sequence alignments in a reasonably efficient manner. When she first started looking at the problem, few methods were available that could analyze large datasets, and those that existed were not very accurate on large datasets. In recent years, Warnow and her students developed two separate approaches to producing accurate multiple sequence alignments of large datasets: PASTA and UPP. Both approaches can work with datasets of up to 1 million sequences. While most projects may not need the capability of working with such large numbers of sequences, some of the biological projects Warnow participates in do require this kind of power.

“The Thousand Plant Transcriptome project has datasets that have more than 100,000 sequences that need multiple [sequence alignments],” Warnow said. “So this was part of the motivation for us to develop these methods. So it goes hand in hand with trying to analyze actual biological data.”

One of the projects Warnow works on is the Avian Phylogenomics Project, featured in a December 2014 special issue of Science. Warnow served as the computational leader for the phylogenetic analyses of the project, which used the genomic sequences of 48 bird species to develop a new understanding of the evolutionary family tree of birds.

One difficulty that Warnow and her colleagues addressed in the avian project was finding a way to rigorously estimate the avian species tree using different genes within different species. This work led to a new method, called “Statistical Binning,” which was published separately in the special issue of Science.

Warnow has also developed models and computational methods for the study of the evolution of languages. “Languages also evolve like species, and there are a lot of the same challenges. For example, just as there is horizontal gene transfer in biology, languages have ‘loan words,’ and so also have this kind of horizontal transfer,” Warnow said.

Warnow began working on Indo-European languages, trying to determine how that large family of languages (which includes English) evolved and what its evolutionary tree may look like. “Basically we were trying to characterize the properties of how different features of languages evolve,” she said. “Using that structure that we developed, we came up with methods to estimate evolution of Indo-European and other families using this model.”

The result was influential in the historical linguistics community. The work was featured in the top journals in that field and has had a lasting impact on the understanding of the development of those languages. “It stood the test of time in that linguists accept [our] estimations as opposed to those that people posited based on other types of analyses,” Warnow said.

Although her recent work has centered on biology, Warnow plans to return to linguistics.

“It’s been so busy with biology, and my students have been wanting to primarily work on the biology side,” she said. “But I’m still planning to get back on the linguistics side. I’m still involved in that community.”

ACM will formally recognize the 2015 fellows at the ACM Awards Banquet, scheduled for Saturday, June 11, 2016, in San Francisco. 

Among the many other recognitions Warnow has received for her research achievements are an NSF National Young Investigator Award in 1994, a David and Lucile Packard Foundation Fellowship in 1996, a Radcliffe Institute for Advanced Study Fellowship in 2003, and a John Simon Guggenheim Foundation Fellowship in 2011. Warnow served as the chair of the NIH study section on Biological Data Management and Analysis (BDMA) from 2010 to 2012.

Tom Moone