Molecular sequences evolve under processes that include substitutions,
insertions and deletions (the latter two jointly called "indels"), and other
mechanisms (e.g., duplications and rearrangements). The inference of
the evolutionary history of these sequences has typically been performed
in two stages: the first estimates an alignment of the sequences,
and the second estimates the tree given that alignment. While such
methods seem to work well on relatively small datasets, these two-stage
approaches can produce highly incorrect trees and alignments when
applied to large datasets, or to ones that evolved with many indels.
In this talk, I will present SATe, a new method under development in my
lab that uses maximum likelihood to estimate the alignment
and tree at the same time, and that can analyze datasets
with up to 1000 sequences on a desktop in 24 hours. Our study, using
both real and simulated data, shows that this method produces
much more accurate trees than the current best methods.
Tandy Warnow is Professor of Computer Sciences at the University of
Texas at Austin. Her research combines mathematics, computer science,
and statistics to develop improved models and algorithms for
estimating complex and large-scale evolutionary histories in both
biology and historical linguistics. Tandy received her PhD in
Mathematics from UC Berkeley under the direction of Gene Lawler, and did
postdoctoral training with Simon Tavare and Michael Waterman at USC.
She received the National Science Foundation Young Investigator Award in
1994, and the David and Lucile Packard Foundation Award in Science and
Engineering in 1996. Tandy is a member of five graduate programs at the
University of Texas, including Computer Science; Ecology, Evolution, and
Behavior; Molecular and Cellular Biology; Mathematics; and
Computational and Applied Mathematics. She is also the director of the
multi-disciplinary CIPRES (Cyber-Infrastructure for Phylogenetic Research)
Project, currently funded by the NSF under their Information Technology
Program.
Date and Time
Wednesday, February 3, 2010, 4:30pm - 5:30pm
Location
Computer Science Small Auditorium (Room 105)