We are on the cusp of a new age in synthetic biology, one in which it is possible to engineer biological systems from scratch. The ultimate aim is for these systems to synthesize drugs, produce biofuels, and unravel disease mechanisms. Unfortunately, these goals remain out of reach because we still lack an understanding of how to optimize the genetic parts used to build these systems. Inducible promoters, which can switch gene expression “on” and “off”, are a crucial piece of this puzzle. However, the rules for creating inducible promoters with designed properties remain unclear.
We used the classic lacZYA promoter as a model to learn how to design inducible promoters in bacteria. The lacZYA promoter sequence encodes binding sites that recruit RNA polymerase (RNAP) as well as a transcriptional repressor protein, LacI. When bacterial cells are exposed to an inducer molecule (lactose or IPTG), LacI becomes unable to repress the promoter, allowing RNAP to transcribe. In this study, we synthesized and tested thousands of lacZYA variants built from combinations of strong and weak RNAP and LacI binding sites to explore how tuning the strength and positioning of these binding sites affects the inducible activity of the lacZYA promoter (Figure 1). A major takeaway from our results was that stronger LacI sites generally resulted in more inducible promoters, but the strongest LacI sites led to surprisingly poor induction. This suggests that the strongest sites are not always ideal for designing inducible systems.
To explore this phenomenon further, we collaborated with the Phillips lab at Caltech, who used our data to train a thermodynamic model of lac promoter dynamics (Figure 2). This model provided two main insights. First, the model illustrated how best to pair LacI and RNAP binding sites to maximize fold-change. Notably, it was crucial to tune the strength of the LacI sites in proportion to that of the RNAP sites. If the LacI sites are too strong, they suppress the RNAP sites even in the induced “on” state; if they are too weak, they fail to suppress the RNAP sites even in the uninduced “off” state. Second, we found that training the model on just 5% of the data enabled us to accurately predict the activities of the remaining held-out promoters, suggesting that powerful models applicable to the vast space of promoter sequence variation could be generated with surprisingly little data.
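The trade-off between LacI and RNAP site strength can be illustrated with a minimal sketch. This is a textbook-style three-state occupancy calculation, not the model trained in the study, and every constant and function name in it (`RNAP_COPIES`, `LACI_COPIES`, `ACTIVE_FRACTION_INDUCED`, `on_off_expression`, the binding energies) is a hypothetical value chosen only for illustration.

```python
import math

# Assumed, illustrative constants (not taken from the study).
N_NONSPECIFIC_SITES = 4.6e6      # rough count of nonspecific genomic sites in E. coli
RNAP_COPIES = 1000               # hypothetical RNAP copies per cell
LACI_COPIES = 10                 # hypothetical LacI repressors per cell
ACTIVE_FRACTION_INDUCED = 0.01   # hypothetical fraction of LacI still able to bind with inducer


def boltzmann_weight(copies, binding_energy_kt):
    """Statistical weight of a bound state.

    Energies are in units of kT; more negative means a stronger site.
    """
    return copies / N_NONSPECIFIC_SITES * math.exp(-binding_energy_kt)


def on_off_expression(rnap_energy_kt, laci_energy_kt):
    """Return (induced, uninduced) RNAP occupancy of the promoter.

    The promoter is treated as having three states: empty, RNAP bound,
    or LacI bound. LacI binding is assumed to exclude RNAP.
    """
    p = boltzmann_weight(RNAP_COPIES, rnap_energy_kt)
    r_off = boltzmann_weight(LACI_COPIES, laci_energy_kt)
    # With inducer present, only a small fraction of LacI can still bind.
    r_on = boltzmann_weight(LACI_COPIES * ACTIVE_FRACTION_INDUCED, laci_energy_kt)
    induced = p / (1.0 + p + r_on)
    uninduced = p / (1.0 + p + r_off)
    return induced, uninduced


if __name__ == "__main__":
    rnap_energy = -5.0  # hypothetical RNAP site strength, in kT
    for laci_energy in (-10.0, -15.0, -20.0):  # weak, intermediate, strong LacI site
        induced, uninduced = on_off_expression(rnap_energy, laci_energy)
        print(f"LacI {laci_energy:6.1f} kT  on={induced:.5f}  off={uninduced:.5f}")
```

Under these assumed values, the weakest LacI site barely lowers occupancy in the uninduced state, while the strongest site lowers occupancy even in the induced state. In this simplified sketch the on/off ratio itself keeps rising with LacI strength until it saturates, so it illustrates only the loss of induced-state expression and does not by itself reproduce the poor induction observed for the strongest sites.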
Lastly, our study demonstrated that alternative lacZYA architectures, generated by altering the placement and composition of binding sites, can be optimized to outperform the current state-of-the-art inducible promoter, lacUV5. Non-canonical architectures offer several benefits. First, they may be more compatible with precise applications that require custom sequence designs. Furthermore, they enable us to expand beyond the canonical lacZYA architecture and access a more comprehensive genetic toolbox with broader dynamic ranges of induced and uninduced expression. Finally, the flexibility of alternative architectures enables us to avoid repetitive genetic parts, which have previously been associated with adverse effects such as recombination and genetic instability (Hossain et al., 2020).
Ultimately, the demands of engineering biological circuits are becoming more specific and complex, which calls for a more refined understanding of the individual genetic parts. Our systematic dissection addresses many questions about the relationship between promoter architecture and transcriptional dynamics, and we hope it will guide synthetic biologists towards the inducible promoters best suited to their genetic circuit designs.