| Alignment | ||
|---|---|---|
|
Marc Suchard
UCLA |
"A Bayesian perspective on alignment" |
November 18, 2009 |
|
Ward Wheeler
AMNH |
"Dynamic homology and phylogenetic systematics" |
December 7, 2009 |
| Gene-tree species-tree | ||
|
Joseph Heled
U Auckland |
"The end of lineage sorting: inferring species trees using *BEAST" |
January 25, 2010 |
|
Noah Rosenberg
Stanford |
"Consistency properties of species tree inference algorithms under the multispecies coalescent." |
February 24, 2010 |
|
Jens Lagergren
KTH |
"Probabilistic Analysis of gene families with respect to gene duplication, loss, and transfer." |
March 29, 2010 |
| Infectious disease | ||
|
Trevor Bedford
U Edinburgh |
"Adaptation and migration in the human influenza virus." The influenza A virus infects approximately 500 million individuals each year. Owing to its RNA makeup, influenza mutates extremely rapidly, allowing the virus population to escape the pull of the human immune system. A single individual may be infected year after year by antigenically novel strains. As a result of this rate of mutation, the timescale of influenza evolution is a human timescale. We get the chance to observe the process of evolution in action. However, the rapid pace of evolution also causes an intrinsic link between evolutionary and ecological dynamics in the virus population. The availability of temporally spaced sequence data allows estimation of details of these dynamics unavailable in other systems. Through analysis of this data, I address open questions regarding patterns of adaptation and the effects of seasonality in the human influenza virus. |
April 23, 2010 |
|
Philippe Lemey
KU Leuven |
"Phylogenetic diffusion models and their applications in viral epidemiology" Emerging infectious diseases continue to appear all over the world, and importantly, they have also risen significantly over time. Having the potential to quickly adapt to new hosts and environments, RNA viruses are prime candidates to emerge as global threats to human health. Their rapid rate of evolution, however, also turns viral genomes into valuable resources to reconstruct the spatial and temporal processes that are shaping epidemic or endemic dynamics. In this seminar, I will highlight recent developments in phylogenetic diffusion models that tie together sequence evolution and geographic history in a coherent statistical framework. Both discrete and continuous phylogeographic models have recently been implemented in a Bayesian statistical approach. I will position this approach among other popular phylogeographic methods, and then focus on applications in viral molecular epidemiology to demonstrate their use. Finally, I will hint at future extensions that may provide entirely new opportunities for phylogeographic hypothesis testing. |
September 10, 2010 |
|
Marco Salemi
U Florida |
"Phylogenetic challenges in the retroviridae branch of the tree of life." The representation of all virus families within a single phylogenetic tree may be a misleading description of their evolutionary history. First, it is unlikely that all viruses originated from a unique common ancestor. Second, viruses (retroviruses in particular) can integrate into the host genome and be transmitted vertically as well as horizontally. Third, different viral genera can evolve according to dramatically different molecular clocks. Three paradigmatic examples from the retroviridae family will be considered here: the simian foamy viruses (SFVs); the primate T-lymphotropic viruses (PTLVs), which include HTLV and STLV; and the primate lentiviruses (PLVs), which include SIV, HIV-1 and HIV-2. SFV is an example of an ancient virus that has been co-evolving with its primate hosts over the last 30 million years. PTLVs emerged around 300 thousand years ago and are characterized by frequent interspecies transmissions and multiple introductions into human populations since prehistoric times. PLVs have a much more recent origin and only within the last 200 years have been able to spread successfully within the human population. The complex relationship between population dynamics and evolutionary time-scale of these retroviruses, as well as the challenge of their integration within the tree of life, will be discussed. |
September 20, 2010 |
|
Sergei Kosakovsky Pond
UCSD |
"Accurate estimation of evolutionary attributes of coding sequences and evolutionary fingerprinting." Codon substitution models have facilitated the interpretation of evolutionary forces operating on genomes. Most of these models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have different rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation or the adoption of a particular residue exchangeability scale. We present an alternative procedure which assigns substitution rates between amino acid pairs to a few rate classes, dependent on the information content of the alignment. This procedure permits us to infer generalizable models for specific genes, organisms and taxonomic clades. |
October 28, 2010 |
| Macroevolution | ||
|
Joe Felsenstein
U Washington |
"What poultry breeders and guinea pigs have to tell us about statistical nonmolecular phylogenetics." We are far from having an understanding of the determination of morphological characters at the genome level, so most evolutionary biologists working on them still need to use phenotypic approaches. I will discuss the prospects for using the tools of quantitative genetics, which has faced the same dilemma for the past century. I will use as examples three projects of my own. One, which is joint work with Fred Bookstein, adapts the tools of morphometrics, of which he is a chief developer, to modeling change of morphological forms on phylogenies. The second is a similar project that asks how to best place fossil forms into a phylogeny of present-day species when there is molecular data enabling us to get a good estimate of the phylogeny for those species. The third models discrete 0/1 characters using the Threshold Model developed by Sewall Wright for his work on guinea pigs. All of these lead to asking whether we can connect Brownian Motion models with quantitative genetics models. In all such cases we will have limits on what we can infer, and need to be aware of the need to carry that uncertainty through any subsequent inference using these results. |
January 24, 2011 |
|
Luke Harmon
U Idaho |
"New Frontiers for the Comparative Analysis of Diversification." We're building the tree of life, but what can we do with it? It seems clear that there is a wealth of information about evolution in the structure of this tree. There are some methods that can use phylogenetic trees to test macroevolutionary models, but the range of models that we can test is still severely limited. In some cases, such as the estimation of extinction rates from phylogenetic trees, current methods have proven controversial. We are now beginning to develop and implement methods that use tree-of-life scale data to answer key questions in evolution. I will review three new approaches developed in my lab for analyzing comparative datasets: MECCA, fossil-Medusa, and reversible-jump MCMC. I argue that these methods represent the next generation of comparative methods that will open the door to analyzing a much broader range of models with large datasets. |
February 25, 2011 |
|
Brian O'Meara
U Tennessee |
"Making comparative methods as easy as ABC." For decades, biologists have addressed evolutionary and ecological questions using measurements of species traits, phylogenies, and an assortment of comparative methods. Unfortunately, while there is a large assortment of these methods, they are still fairly limited and development of new methods is slow. It took seven years between the introduction of using a simple Brownian motion model for looking at trait evolution (Felsenstein, 1985) and the use of this same model for looking at rates of trait evolution (Garland, 1992), and an additional 14 years to arrive at more powerful tests using a small modification of the basic model (O'Meara et al., 2006). Still other promising methods are described and even tested but remain unavailable to empiricists because they are not put into software. As a result, the questions empiricists can ask about the world are limited by the research productivity of the few dozen scientists who develop and implement new methods in phylogenetics. We describe a new approach based on Approximate Bayesian Computation and implemented in R that will allow researchers to easily develop their own models for trait evolution without requiring them to have specialized mathematical or computational knowledge. |
March 30, 2011 |
| Evolutionary genomics | ||
|
Mike Lin
MIT |
"Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes." The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes -- especially at synonymous sites. We developed a method to systematically locate short regions within known ORFs that show conspicuously low estimated rates of synonymous substitution, based on phylogenetic codon rate models and likelihood ratio tests. We applied this method to genome alignments of 29 placental mammals, resulting in more than 10,000 “synonymous constraint elements” (SCEs) with resolution down to nine-codon windows. These are found within more than a quarter of all human protein-coding genes and contain ~2% of their synonymous sites. We collected numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. We also ruled out certain alternative explanations such as codon usage bias and neutral rate variation. Our initial results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape. Furthermore, anticipating the future availability of additional mammalian and vertebrate genomes, we are currently developing Bayesian codon modeling methods to measure synonymous rates at even higher resolutions, perhaps eventually allowing the detection of individual regulator binding sites embedded in protein-coding ORFs. |
April 26, 2011 |
|
Adam Siepel
Cornell |
"Bayesian inference of ancient human demography from individual genome sequences." Besides their value for biomedicine, individual genome sequences represent a rich source of information about human evolution. I will describe an effort to estimate key evolutionary parameters from the genome sequences of six individuals from diverse human populations. We have used a Bayesian approach based on coalescent theory to extract information about ancestral population sizes, divergence times, and migration rates from inferred genealogies at many neutrally evolving loci from across the genome. We introduce new methods for accounting for gene flow between populations and integrating over possible phasings of diploid genotypes. I will also describe a custom pipeline for genotype inference to mitigate possible biases from heterogeneous sequencing technologies, coverage levels, and read lengths. Our analysis indicates that the San of Southern Africa diverged from other human populations 108--157 thousand years ago (kya), that Eurasian populations diverged 38--64 kya, and that the effective population size of the ancestors of all modern humans was ~9,000. |
May 24, 2011 |
|
Jason Stajich
UC Riverside |
"Fungal phylogenomics: Getting lost in the moldy forest." Fungi occupy diverse ecological niches in roles from nutrient cycling in rainforest floors to aggressive plant and animal pathogens. Molecular phylogenetics has helped resolve many of the branches on the fungal tree of life, enabling studies of evolution across this diverse kingdom. The genome sequences from hundreds of fungi now permit the study of change in genes and gene content in this phylogenetic context and help connect molecular evolution with adaptation to ecological niches or changes in lifestyles. I will describe our work in studies contrasting pathogenic and non-pathogenic fungi and efforts to unravel the evolution of multicellularity in fungi, comparing unicellular basal fungi with multicellular mushrooms and molds. The development of tools for data mining and use of fungal genomics is also driving the pace of molecular biology and genetics of fungi. I will highlight new approaches to make this easier and the ways data integration can inform and transform studies of functional biology of fungi. |
June 29, 2011 |
| Beyond IID | ||
|
Oscar Westesson
UC Berkeley |
"Accurate reconstruction of insertion-deletion histories by statistical phylogenetics" The "multiple sequence alignment" is a computational artifact. In nature there is no such thing; rather, an alignment represents a partial summary either of indel history, or of structural similarity. Here we show, via evolutionary simulation tests, that all currently-available multiple alignment tools introduce systematic biases into downstream evolutionary analysis - particularly when used to reconstruct histories of insertions and deletions. I will present our unification of Felsenstein's "pruning" algorithm and "progressive alignment" to build a fast, linearly-scaling approximate-maximum-likelihood phylogenetic alignment/reconstruction algorithm. Inference of evolutionary history in this framework displays a clear improvement in accuracy over non-statistical phylogenetic reconstructions and a massive improvement in performance over slow-running MCMC statistical reconstructions. |
September 20, 2011 |
|
Alexandre Bouchard-Côté
U British Columbia |
"The Poisson Indel Process" The key component of a probabilistic joint approach to tree and alignment inference is a Continuous Time Markov Chain (CTMC) over strings. Ideally, this CTMC should support tractable inference algorithms and should be easily extensible to support a wide range of evolutionary models. The classical string-valued CTMC, the TKF91 model (Thorne et al., 1991), is limited in both of these axes. Previous work has focussed on increasing the complexity of the TKF91 model, making the inference problem computationally more difficult (Miklos et al., 2004). In this work, we present a new stochastic process, the Poisson Indel Process (PIP), which allows simple and practical inference algorithms. Efficient computations are based on an exchangeable representation and on Poisson processes. This representation gives a natural way of extending the capacity of the model while keeping inference computationally practical. We used this process to design a joint Bayesian estimator over alignments and trees. We evaluated both consensus trees and alignments against standard baselines on synthetic and real data. These experiments demonstrate that competitive trees and alignments can be inferred using a Bayesian model equipped with a PIP prior. |
October 18, 2011 |
| Software | ||
|
Liam Revell and Klaus Schliep
UMass Boston and University of Paris |
"Introduction to phytools and phangorn: phylogenetics tools for R" phytools is a new multifunctional phylogenetics package for the R statistical computing environment. The focus of the package is on methods for phylogenetic comparative biology; however, it also includes tools for simulation, phylogeny input/output, manipulation, and even inference. The phytools library is designed for maximum interoperability with other important R phylogenetics packages such as ape, geiger, and phangorn. phangorn is a package for phylogenetic reconstruction and analysis in the R language. Previously it was only possible to estimate phylogenetic trees with distance methods in R. phangorn now offers the possibility of reconstructing phylogenies with distance-based methods, maximum parsimony or maximum likelihood (ML) and performing Hadamard conjugation. Extending the general ML framework, this package provides the possibility of estimating mixture and partition models. Furthermore, phangorn offers several functions for comparing trees, phylogenetic models or splits, simulating character data and performing congruence analyses. |
December 15, 2011 |
|
Sergei Kosakovsky Pond
UCSD |
"Introduction to HyPhy: Hypothesis testing using Phylogenies" HyPhy is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning. It features a complete graphical user interface (GUI) and a rich scripting language for limitless customization of analyses. Additionally, HyPhy features support for parallel computing environments (via message passing interface) and it can be compiled as a shared library and called from other programming environments such as Python or R. |
January 25, 2012 |
|
John P. Huelsenbeck and Sebastian Höhna
UC Berkeley and Stockholm University |
"RevBayes: An R like Environment for Bayesian phylogenetic inference" RevBayes is a computer program that uses directed acyclic graphs (DAG's) to specify any type of model, to hold the model and data in memory, and to compute the likelihood of the parameters of the model. DAG's provide a framework for the construction of modular models. Models can easily be extended and/or parts of the model exchanged (e.g., the substitution process and clock model) and several models can be combined. The design of RevBayes should allow the implementation of any extension to existing models. RevBayes is mainly developed for Bayesian phylogenetic analyses, but it can be extended to any inference on probabilistic models. In this talk, I will give a brief introduction to the concept of DAG's and how they are used to construct a model. Once the model is specified, I will show how to simulate new observations under the model and how to estimate its parameters. I will demonstrate this in the RevLanguage, which is an R-like language for building DAG's for phylogenetic problems. The RevLanguage is used interactively to specify the model, as done with R. I will show how a full phylogenetic model is specified, step-by-step. I will mainly focus on various standard substitution models, relaxed clock models, and divergence times priors. Specifically, I will show a new birth-death model with speciation and extinction rates varying over time and use this in an integrative analysis. In the integrative analysis I condition only on the alignment (only the alignment is considered to be known) and estimate the tree and divergence times simultaneously as well as the speciation and extinction rates. Example files for the demonstration are available here. |
February 29, 2012 |
| Structure and molecular evolution | ||
|
David Liberles
U Wyoming |
"Protein Structural, Biophysical, and Genomic Underpinnings of Protein Sequence Evolution" Common models for amino acid substitution assume that each site evolves independently according to average properties in the absence of a genomic, protein structural or functional context. Two characterizations of amino acid substitution will be presented. One approach extends a population genetic model to inter-specific genomic data and a second approach evaluates the effects of selection for protein folding and protein-protein interaction on sequence evolution. Several take home lessons include the importance of considering linkage independent of protein structure, the importance of negative pleiotropy (or not statements in folding and binding), and the nature of the co-evolution of sites and how it links standard substitution models with covarion models when binding function is conserved and when it changes. |
March 28, 2012 |
|
Richard Goldstein
National Institute for Medical Research, London |
"Simulating evolution with in silico models of protein thermodynamics" Many of the most basic issues of protein evolution are difficult to determine from the relationship between existent protein sequences. We would ideally like to analyse the complete evolutionary record: what mutations were attempted when in what lineage, which ones were deleterious or advantageous and by how much, which ones were accepted, and how these substitutions affected further mutations and the overall evolution of protein properties. In the absence of available biological data, we can create our own - simulate protein evolution in silico, such as in our work modelling how proteins would evolve given their need to be thermodynamically stable. These simulations allow us to explore a range of phenomena and develop a conceptual framework that tells us which questions may be interesting and important to consider in real proteins. Such simulations can also illuminate which conditions are necessary and/or sufficient to explain observed protein characteristics. We consider how evolution of protein thermostability explains why proteins are generally marginally stable, why eukaryotes may have more disordered proteins than prokaryotes, and what the consequences of this are for biochemical networks. We also consider how various locations in a protein can co-evolve, and how this can inform the next generation of substitution models. |
April 30, 2012 |
|
David Pollock
University of Colorado School of Medicine |
"Adaptation, coevolution, and convergence in the context of protein thermodynamics" Interactions within and between proteins are a fundamentally important part of how they evolve and adapt. We have been considering how and why proteins adapt, coevolve, and converge, and working to understand these concepts in the context of protein thermostability and function. We will expand from the previous talk of our collaborator, Dr. Goldstein, and discuss how and why coevolution is and should be detected, and how thermostability affects reconstruction of ancestral functions. Further, we will discuss our work on adaptive redesign in mitochondrial proteins, perhaps the largest known case of an adaptive burst in multiple metabolic proteins. The convergence between ancestral snakes and ancestral acrodont lizards is also perhaps the largest known case of adaptive convergence. We will consider what these examples tell us about the theory of how proteins appear to evolve in the context of nearly neutral versus cases of adaptive change. Further, we will discuss the impact on understanding phylogenetic relationships, and we will also discuss a unified theory of nearly neutral and adaptive evolution in the context of structure and function. |
May 30, 2012 |
| Rates and Dates | ||
|
Tanja Gernhard Stadler
ETH Zurich |
"Inferring macroevolutionary processes based on phylogenetic trees" Phylogenetic trees of present-day species allow inference of the rate of speciation and extinction which led to the present-day diversity. Classically, inference methods assume a constant rate of diversification, or neglect extinction. I will discuss major limitations of this null model and will present a new framework which allows speciation and extinction rates to change through time (environmental-dependent diversification), with the number of species (density-dependent diversification), and with a trait of a species (trait-dependent diversification). For the latter model, particular focus is given to the trait being the age of a species. Issues arising in empirical data analysis, such as incomplete taxon sampling, model selection, and confidence interval estimation, will be discussed. The methods reveal interesting macroevolutionary dynamics for mammals, birds and ants, and can easily be applied to other datasets using the R packages TreePar and TreeSim available on CRAN. |
September 19, 2012 |
|
Hélène Morlon
Ecole Polytechnique |
"Understanding biodiversity patterns using the Tree of Life" Species richness results from past and current speciation, extinction and dispersal events, themselves influenced by various ecological and evolutionary processes. Estimating rates of diversification, and understanding how and why they vary over evolutionary time, geographical space, and species groups, is thus key to understanding how ecological and evolutionary processes generate biological diversity. Phylogenetic approaches are critical for making such inferences, especially in groups or regions lacking fossil data. I will illustrate how phylogenies, coupled with models of cladogenesis, can be used to test the role of ecological limits, boom-then-bust diversity dynamics, the paleoenvironment, and population dynamics on the biodiversity patterns that we observe today. |
December 5, 2012 |
| Phylogenetics and language | ||
|
Simon Greenhill
Australian National University |
"Language phylogenies and cultural evolution" Charles Darwin famously noted that there were many curious parallels between the evolution of species and languages. Since then evolutionary biology and historical linguistics have used trees to conceptualise evolution. However, whilst evolutionary biology developed the vast discipline of phylogenetic methods, linguistics dabbled with computational methods before rejecting them. The last decade or so has seen the introduction of phylogenetic methods into linguistics, often with some startling results. In this talk I will present some of these studies, and discuss how phylogenetics can help us grapple with the problems of linguistic and cultural evolution. These problems range from testing population dispersal hypotheses, to investigating the shape of cultural evolution, to inferring the rates at which languages change. |
January 16, 2013 |
|
Fiona Jordan
University of Bristol |
"Testing hypotheses about cultural evolution" Anthropologists had a name for the non-independence-of-species-problem way back in the 1880s. Solving "Galton's Problem", and the promise of comparative methods for testing hypotheses about cultural adaptation and correlated evolution was a major catalyst for the field of cultural phylogenetics. In this talk I will show how linguistic, cultural, and archaeological data is used in comparative phylogenetic analyses. The "treasure trove of anthropology" - our vast ethnographic record of cultures - is now being put to good use answering questions about cross-cultural similarities and differences in human social and cultural norms in a rigorous evolutionary framework. |
February 5, 2013 |
|
Thomas Currie
University College London |
"Bobbins, Borrowing, and Bayesian Inference: Horizontal Transfer and the application of Phylogenetic Methods in Cultural Evolution studies" Researchers have applied quantitative phylogenetic methods to study human cultural and linguistic evolution. However, a common critique of this approach is that cultural evolution and biological evolution differ in important ways that make phylogenetic analyses unsuitable for cultural data. Principally, horizontal transmission (or borrowing) of cultural and linguistic traits is argued to be so pervasive as to invalidate the approach. In this talk I will address this issue by asking: how much does horizontal transfer occur, and does it matter if it does? Contra the skeptics, I will discuss studies that demonstrate that 1) many biological systems also show non-tree-like patterns of evolution, 2) cultural systems vary in the degree to which horizontal transfer occurs, and 3) borrowing does not necessarily cause big problems. Rather than being a reason to give up on the whole project, borrowing can be productively investigated using phylogenetic techniques to yield deeper insights into cultural and linguistic evolution. |
March 11, 2013 |
| In honor of Carl Woese | ||
|
Norman R. Pace
University of Colorado – Boulder |
"Following Carl Woese into the Natural Microbial World – The Beginnings of Metagenomics" Carl Woese, one of the great scientists of all time, died in December, 2012. Among other important contributions, he used primitive sequencing technology to compare small subunit (16S) ribosomal RNA sequences from different organisms and thereby establish the outlines of a universal tree of life. His results also put in place a sequence-based reference framework within which to understand and articulate biological diversity. Since this perspective is based on molecular sequences and not properties of organisms, it opened the door to begin to understand the kinds of organisms that make up the natural microbial world. Prior to Woese’s sequence-based reference framework, microbial ecologists had to culture organisms to study them, but not many environmental organisms, <<1%, are cultured using standard methods. Sequence surveys of environmental microbial genes and genomes – “metagenomics” - have now revolutionized understanding of microbial ecology, including its influence on human health. The seminar will discuss how metagenomics developed and the impact it has had on our understanding of environmental microbial diversity and the structure of the molecular tree of life. |
April 16, 2013 |
|
Ed DeLong
MIT |
"How Carl Woese transformed the field of microbial ecology" The challenges of dissecting naturally occurring microbial assemblages, with respect to their community composition, interspecies interactions, functional attributes, and activities, are numerous and daunting. For many years, these challenges impeded our understanding of the properties and dynamics of microbial communities, and thus hindered development of the field of microbial ecology. Enter Carl Woese: the theory and application of molecular phylogenetics and genomics in studies of microbial evolution and ecology can be traced directly to Woese and one of his primary collaborators, Norman Pace. This lecture will trace the logic and roots of the application of molecular phylogenetics and genomics to the study of microbial ecology, through a historical review and examination of its past and current applications. |
May 13, 2013 |
To watch a recording, simply click on the name of the speaker (and be patient while it starts...).