Simultaneous Repression of Multiple Bacterial Genes using Nonrepetitive Extra-Long sgRNA Arrays (ELSAs)
We developed ELSAs to stably co-express many CRISPR guide RNAs without introducing repetitive DNA. Adding many-site targeting to CRISPR will have profound biotech implications.
Our work, just published in Nature Biotechnology, started as a curious side-project conceived in a free-wheeling elective graduate course. We were driven by the ambitious goal of engineering sophisticated genetic circuits with many programmable transcriptional regulators. As a result, we developed a scalable approach to simultaneously and stably co-express many CRISPR-based regulators within an easily synthesized and assembled, nonrepetitive DNA cassette. CRISPR has so many applications across biotechnology that achieving the ability to target many distinct genomic locations has profound implications.
But how many is many? In our work here, we chose to co-express 20 CRISPR sgRNAs within a single cassette because it served as a useful benchmark of our approach’s scalability. In more recent, unpublished experiments, we’ve designed and characterized more than 1000 unique nonrepetitive CRISPR regulators, and our calculations suggest that the upper limit is closer to 100,000. So if your approach scales well, many can be a seriously large number. Here, we will provide some introspection into how this project started and highlight some of its surprising discoveries.
Since the field’s inception, synthetic biologists have engineered living cells to act like primitive computers; unlike their silicon counterparts, living computers can self-replicate, manufacture valuable chemicals (e.g. therapeutics or materials), sense their environment, perform decision-making, and accordingly modify their environment. These “programs” are written as genetic circuits, encoded within DNA sequences, that must be synthesized, assembled, and inserted into the cell’s genome. As with silicon circuits, a genetic circuit’s capabilities are limited by its number of regulators. The most advanced genetic circuit to date has about 12 individually expressed, interacting regulators, and yet we need to co-express many more regulators to develop truly sophisticated signal processing and decision-making capabilities inside cells. The living computers of the future can help solve pressing 21st-century problems, for example, reducing atmospheric carbon dioxide levels or combating antibiotic resistance, but only if they are equipped with the necessary genetic circuitry to simultaneously sense many environmental changes, process many signals, and coordinate many cellular activities.
In a fairly open-ended graduate course, we brainstormed how CRISPR could completely upend the state of the art in genetic circuit engineering, which resulted in several genetic circuit designs. We carefully arranged the long DNA sequences to implement these designs and arrived at an observation that we think was shared by many in the field: there are just too many repetitive DNA sequences in these CRISPR genetic parts to do what we wanted to do. Repetitive DNA can’t be synthesized by gene synthesis companies. And even if you assemble repetitive DNA via other cloning techniques and insert it into an organism, it will eventually be deleted by homologous recombination [1], particularly if the introduced DNA’s program causes any growth-inhibiting stress, which is common. By the end of the course, a new side project was born: if we could design, construct, and characterize nonrepetitive CRISPR genetic parts, we would overcome a field-wide challenge to scaling up genetic circuit engineering.
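To make "repetitive" concrete, here is a minimal sketch of a repeat check: it lists every subsequence of length k that appears more than once in a DNA cassette. The function names, the toy sequences, and the choice of k are assumptions for illustration only; the post does not state what repeat length matters for synthesis or recombination, and this sketch looks only at direct repeats on one strand.

```python
from collections import defaultdict


def find_repeats(seq, k):
    """Return each k-mer that occurs more than once, mapped to its start positions."""
    positions = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def is_nonrepetitive(seq, k):
    """True if no k-mer occurs twice (direct repeats on the given strand only)."""
    return not find_repeats(seq, k)


# Toy example (made-up sequences): the same 15-nt unit placed twice in a cassette.
unit = "ACGTTGCAAGCTTGA"
cassette = unit + "CCCGGGTTT" + unit

# k = 12 is an arbitrary illustrative threshold, not a value from the paper.
for kmer, starts in find_repeats(cassette, 12).items():
    print(kmer, starts)
print("nonrepetitive:", is_nonrepetitive(cassette, 12))
```

Reusing one handle, promoter, or terminator across several sgRNA units is exactly what makes a check like this fail, which is why each unit needs its own distinct parts.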
Chronologically, we first targeted the CRISPR single-guide RNA (sgRNA) handles: the 61-nucleotide sequences that fold into RNA structures and are responsible for binding to and loading into the Cas9 protein. The goal was to maximally diversify these nucleotide sequences without compromising their function, using algorithmic design to achieve both efficiently. We were surprised to find that we could introduce up to 23 mutations into these handle sequences while retaining their functionality (surprising discovery #1)! However, they had to be the right mutations. To differentiate between good and bad mutations, we developed a completely new way of designing genetic parts according to specified sequence and structural constraints, while leveraging machine learning to optimize those constraints. After three rounds of a design-build-test-learn cycle, we had learned the optimal design constraints and applied them to experimentally validate a toolbox of 28 highly functional and nonrepetitive sgRNA handle sequences. With 28 programmable and nonrepetitive regulators in hand, we turned our attention to the genetic parts used to express them. To eliminate all sources of repetitive DNA, we designed, constructed, and characterized 64 nonrepetitive promoter sequences and 63 biologically neutral DNA spacers. We also identified 137 nonrepetitive transcriptional terminators from an existing toolbox [2]. Our Supplementary Data is a veritable smorgasbord of reusable genetic part data. Have a feast!
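Two quantities in the paragraph above are easy to compute for any candidate handle: how many mutations it carries relative to a reference, and how long a stretch it still shares with another part. The sketch below is only an illustration of those two measurements. The sequences are made-up 12-nt toys rather than real 61-nt handles, the function names are hypothetical, and it does not model RNA structure or function, which the authors also had to constrain.

```python
from itertools import combinations


def count_mutations(reference, variant):
    """Number of substituted positions between two equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(reference, variant))


def longest_shared_run(a, b):
    """Length of the longest substring common to a and b (dynamic programming)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def max_shared_run(toolbox):
    """Worst-case shared run over every pair of parts in a toolbox."""
    return max(longest_shared_run(a, b) for a, b in combinations(toolbox, 2))


# Toy sequences, invented for illustration.
reference = "ACCGTTAGGCAT"
variant = "ACCGATAGGCAC"

print("mutations:", count_mutations(reference, variant))
print("longest shared run:", longest_shared_run(reference, variant))
```

A toolbox is more useful the smaller its worst-case shared run is, since that is the longest repeat any two of its parts could introduce when placed in the same cassette.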
Our next step was to place these nonrepetitive genetic parts together into Extra-Long sgRNA Arrays (ELSAs) to simultaneously co-express many sgRNAs. However, solving a grand challenge is like peeling an onion; you pull back one layer of problems to find another, all while wiping away the tears. Our initial arrays were hand-designed and accidentally contained several undesired genetic elements, such as promoters that transcribe in the opposite direction to express anti-sense RNA, which likely accelerated sgRNA degradation. Hand-designed arrays were also more prone to DNA synthesis failures; the arrangement of the nonrepetitive genetic parts within the array affected key metrics that determine a gene synthesis company’s ability to synthesize DNA fragments. In all, we devised 23 quantitative design rules governing sgRNA expression, DNA synthesis success, and guide RNA design, leveraging our pre-existing Cas9 Calculator algorithm for predicting sgRNA off-target activity [3]. We then developed an optimization algorithm, the ELSA Calculator, to identify the optimal guide RNA sequences and ordering of nonrepetitive genetic parts to maximally satisfy our design rules. Once we switched to computational design, our synthesis success rates rose considerably and the undesired genetic elements were eliminated (surprising discovery #2). Our largest ELSA contained 100 nonrepetitive genetic parts to co-express 20 sgRNAs, all within a 4186 bp DNA cassette, synthesized as 2 non-clonal fragments, received within 5 days, and readily assembled using Gibson assembly. We co-expressed this ELSA with deactivated Cas9 (dCas9) to redirect metabolic flux towards a classic metabolic engineering target, succinic acid, resulting in a 150-fold increase in production titer. Notably, with so many sgRNAs co-expressed, we needed to increase dCas9 expression considerably to achieve enzyme knock-downs of up to 3500-fold.
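The idea of choosing a part ordering that best satisfies design rules can be illustrated with a toy search. This is not the ELSA Calculator and does not reproduce any of its 23 rules, which are not listed here. As a stand-in, the sketch assumes a single hypothetical rule (sliding windows whose GC fraction falls outside a chosen range are penalized) and tries every ordering of a few made-up units. The window size, the GC bounds, and the sequences are all assumptions.

```python
from itertools import permutations


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def window_penalty(seq, window=20, low=0.30, high=0.70):
    """Count sliding windows whose GC fraction lies outside [low, high].

    Hypothetical stand-in for a synthesis-related design rule.
    """
    return sum(
        not (low <= gc_fraction(seq[i:i + window]) <= high)
        for i in range(len(seq) - window + 1)
    )


def best_ordering(units, penalty=window_penalty):
    """Try every ordering of the units and return the one with the lowest penalty."""
    return min(permutations(units), key=lambda order: penalty("".join(order)))


# Made-up units: two AT-rich and two GC-rich toy sequences.
units = ["ATATTAATTAAT", "GCGGCCGCGGCC", "TTAATATTAAAT", "CCGCGGGCCGCG"]

best = best_ordering(units)
print("original penalty:", window_penalty("".join(units)))
print("best ordering:   ", best)
print("best penalty:    ", window_penalty("".join(best)))
```

Exhaustive search is only practical for a handful of units: 20 units already have about 2.4 × 10^18 possible orderings, so a design tool working at the scale described above would need a heuristic or other optimization strategy rather than brute force.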
We then rigorously tested the ELSA’s genetic stability over several days of adaptation and production culturing without finding a single mutation. A single genome-integrated DNA cassette rewired the cell’s metabolic network with excellent genetic stability!
Figure: a. The basic expression unit of one sgRNA in a bacterial ELSA. b. Repeat chord diagrams for the natural S. pyogenes CRISPR locus, a 12-sgRNA ELSA using wild-type genetic parts, a 12-sgRNA ELSA using engineered genetic parts and a 20-sgRNA ELSA using engineered genetic parts. c. The part compositions of a 20-sgRNA ELSA targeting six genes, called ELSA-Succinate. d. Characterization of the ELSA-Succinate strain, employing RT-qPCR and quantitative LC-MS measurements to record changes in mRNA and metabolite levels, compared to a no-ELSA strain control.
Overall, we are most excited by this technology’s versatility; anyone can use the ELSA Calculator and its database of nonrepetitive genetic parts to design their own ELSAs for a wide variety of biotech applications. We demonstrated this in our work by using ELSAs across additional examples: knocking down the expression of amino acid biosynthesis enzymes to create multi-auxotrophic strains (an engineered cell biocontainment example); and knocking down bacterial adaptive stress responses to greatly reduce persister cell formation in response to antibiotic treatment (a clinically relevant example). We also show that our nonrepetitive sgRNAs retain their ability to cleave DNA when loaded into cleavage-competent Cas9. This level of programmable functionality within a single compact DNA cassette, so easily synthesized and assembled, is what excites us the most.
The ELSA Calculator is available at https://salislab.net/software/design_elsa_calculator.
By Alex Reis, Sean Halper and Howard Salis
Our Paper: Reis, A.C., Halper, S.M., Vezeau, G.E., Cetnar, D.P., Hossain, A., Clauer, P.R. and Salis, H.M. (2019) Simultaneous repression of multiple bacterial genes using nonrepetitive extra-long sgRNA arrays. Nature Biotechnology. https://www.nature.com/articles/s41587-019-0286-9
1. Shen, P. and Huang, H.V. (1986) Homologous recombination in Escherichia coli: dependence on substrate length and homology. Genetics, 112, 441-457.
2. Chen, Y.-J., Liu, P., Nielsen, A.A., Brophy, J.A., Clancy, K., Peterson, T. and Voigt, C.A. (2013) Characterization of 582 natural and synthetic terminators and quantification of their design constraints. Nature Methods, 10, 659.
3. Farasat, I. and Salis, H.M. (2016) A biophysical model of CRISPR/Cas9 activity for rational design of genome editing and gene regulation. PLoS Computational Biology, 12, e1004724.