PhD Exit Seminar
Wednesday, September 13, 2017 at 2:10 pm, Earth Sciences Building, Room 3087
Shalabh Thakur (Guttman Lab)
“Comparative and Evolutionary Genomics of Pseudomonas syringae”
The Pseudomonas syringae species complex comprises many genetically diverse strains found ubiquitously in both agricultural and non-agricultural environments. The species complex has a very broad host range; however, individual strains show strong host specificity and cause disease on only a limited range of crops. Although best known as plant pathogens, many P. syringae strains are reported to be non-pathogenic and are found in habitats linked to water sources, soil, and snow fields. Multi-locus sequence typing studies have subdivided the P. syringae complex into at least 13 subgroups, referred to as phylogroups. Given such extensive genetic and ecological diversity within the species complex, there is an ongoing debate over the species definition of P. syringae strains. My research investigated whether strains within the P. syringae species complex belong to a single species population or whether distinct phylogroups are in fact different species. We performed a whole-genome analysis of approximately 400 P. syringae strains using various comparative and evolutionary approaches to examine the extent of genetic cohesion within the species complex arising from ecological and evolutionary mechanisms. Comparative genomics projects often face a computational challenge because the time and resources needed for pairwise sequence comparisons increase quadratically with the number of sequenced genomes. To overcome this challenge and facilitate large-scale sequence comparisons between hundreds of closely related prokaryotic genomes, we designed a novel comparative genomics pipeline named DeNoGAP. DeNoGAP provides a robust computational framework for performing various comparative genomics tasks, such as gene prediction, ortholog prediction, functional annotation, and pan-genome analysis.
DeNoGAP implements an iterative homolog clustering strategy to increase the speed and accuracy of large-scale ortholog prediction. Because of this strategy, DeNoGAP is more efficient than other ortholog prediction tools that implement traditional pairwise comparison algorithms. Our whole-genome comparative analysis of more than 400 strains provided insight into the P. syringae pan-genome. We found that the P. syringae pan-genome is large and diverse, comprising more than 79,000 gene families. We also found substantial diversity in the distribution of virulence-associated gene families, such as type III secreted effectors and toxins, across P. syringae strains. Evolutionary analyses of the gene families in the P. syringae pan-genome showed evidence of homologous recombination and positive selection across the entire genomes of P. syringae strains. We found that P. syringae strains in different phylogroups rarely exchange genes via homologous recombination. However, despite being rare, inter-phylogroup homologous recombination occurs disproportionately among virulence-associated and positively selected genes that are essential for the ecological adaptation and evolution of strains within the P. syringae species complex. Based on these findings, we hypothesized that P. syringae maintains genetic cohesion among its divergent strains through the exchange of ecologically and evolutionarily relevant genes. Together, my work provides a robust computational pipeline for large-scale comparative genomics projects and sheds light on the species definition of the P. syringae species complex based on strong evolutionary species concepts rather than molecular methods.