Each of us contains a collection of trillions of microbes that is unique and vital to our health. The realization that the microbiome is highly individualized has led to advances in personalized nutrition and medicine. However, it also complicates our ability to compare microbiomes among different people. For example, it is likely that there is no universal ‘healthy’ human microbial community, but, rather, that each individual has their own ‘healthy’ state. To account for this inter-individual variation when comparing microbiota between host phenotypes (e.g. healthy vs. sick), repeated measures studies are increasingly popular. These studies involve collecting multiple samples from the same subject, either longitudinally or from different body sites. The end goal of a repeated measures experimental design is to separate individual variation over time or space from the host phenotype(s) of interest.
Our group has long been interested in developing methods and visualization tools to better understand how hosts or environments differ in their microbial communities. Pioneering methods such as UniFrac, developed by Dr. Catherine Lozupone while she was a graduate student in the lab, have helped reveal associations between the microbiome and a wide range of phenotypes. These include aspects of development in the human microbiome: for example, one of the strongest factors influencing the microbiome immediately after birth is delivery mode (i.e. vaginal delivery vs. caesarean section). Many other studies have since confirmed the strong effect of birth mode on the developing microbiome and have documented other examples of temporal variability in the microbiome. As the popularity of repeated measures studies has grown, we set out to develop a method to account for inter-individual variation in the microbiome over time or space.
In our Nature Biotechnology article, we describe our new method, Compositional Tensor Factorization (CTF), which accounts for repeated measures study designs. Like UniFrac, which was developed for cross-sectional data, CTF allows us to explore how microbial community composition differs between subjects, but it does so across time. In the paper, we first demonstrate that, in both real data and data-driven simulations, CTF significantly improves our ability to separate cases from controls compared to current beta-diversity methods. Next, we re-analyzed two studies that tracked infant gut development over time, focusing specifically on birth mode. We found that CTF not only separated infants born via C-section from those delivered vaginally better than existing methods did, but also revealed the microbial taxa that drive those differences, and that these taxa are consistent between studies. From this shared set of microbial taxa, we derived a birth mode ‘signature’, defined as the log-ratio of vaginal-associated to caesarean-associated microbes. Using the American Gut Project, a large citizen science microbiome dataset produced by members of our group, we found that this signature is detectable in the first four years of life.
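To make the idea of a log-ratio signature concrete, here is a minimal sketch in Python. It is an illustration only, not the code used in the paper: the function name `log_ratio_signature`, the pseudocount, the choice of which columns count as vaginal-associated or caesarean-associated, and the toy counts are all assumptions made for demonstration.

```python
import numpy as np


def log_ratio_signature(counts, numerator_idx, denominator_idx, pseudocount=1.0):
    """Per-sample log-ratio of summed counts for two groups of taxa.

    counts: array of shape (n_samples, n_taxa).
    numerator_idx / denominator_idx: column indices of the two taxon groups.
    pseudocount: hypothetical choice, added to each sum to avoid log(0).
    """
    counts = np.asarray(counts, dtype=float)
    numerator = counts[:, numerator_idx].sum(axis=1) + pseudocount
    denominator = counts[:, denominator_idx].sum(axis=1) + pseudocount
    return np.log(numerator / denominator)


# Toy data (made up): rows are samples, columns are taxa.
# Columns 0-1 stand in for vaginal-associated taxa,
# columns 2-3 for caesarean-associated taxa.
toy_counts = np.array([
    [90, 60, 5, 5],   # sample dominated by the numerator group
    [4, 6, 70, 80],   # sample dominated by the denominator group
])

signature = log_ratio_signature(toy_counts, [0, 1], [2, 3])
print(signature)
```

In this sketch, a positive value means the numerator group dominates the sample and a negative value means the denominator group does. Because only the ratio of the two groups enters the calculation, the value does not depend on the other taxa in the sample or on sequencing depth, apart from the small effect of the pseudocount.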
We believe that, using CTF, researchers will discover many more shared temporal signatures across host phenotypes, which may ultimately lead to improvements in personalized medicine and in early-life interventions that prevent the development of chronic disease in adulthood or old age. For example, we imagine early-life interventions targeting the developing microbiome that could affect the risk of asthma in childhood, cardiovascular disease in middle age, or Alzheimer’s in the elderly, since all of these conditions have been linked to the microbiome.
Our paper: Martino, C., Shenhav, L., Marotz, C. A., Armstrong, G., McDonald, D., Vázquez-Baeza, Y., Morton, J. T., Jiang, L., Dominguez-Bello, M. G., Swafford, A. D., Halperin, E. & Knight, R. Context-aware dimensionality reduction deconvolutes gut microbial community dynamics. Nat. Biotechnol. (2020).
CTF method code repository: https://github.com/biocore/gemelli