The ability to precisely integrate large DNA sequences at desired genomic locations represents a fundamental genome editing capability with high therapeutic potential. For example, many diseases are caused by heterogeneous mutations dispersed across an entire gene, making the design and implementation of bespoke editing approaches for each variant costly and impractical. Instead, programmable gene insertion could enable a blanket therapy approach whereby a given cDNA is inserted at the endogenous target site (Fig. 1a). Moreover, this capability could place gene cassettes into safe harbor loci to address loss-of-function variants or for cellular engineering, obviating the safety concerns associated with randomly integrating viral vectors or transposons. Despite this promise, current tools for targeted insertion largely depend on double-strand breaks (DSBs), which lead to uncontrolled indels at the target site, unintended genome rearrangements, and cellular toxicity, and on homology-directed repair (HDR), which is inefficient in many cell types (Fig. 1b). Therefore, new programmable gene insertion technologies independent of DSBs and HDR are needed.
One promising approach is to use CRISPR-associated transposases (CASTs)[2, 3], which combine CRISPR-Cas effector(s) with Tn7- or Tn5053-like transposition enzymes to direct RNA-guided integration of a desired cargo (Fig. 2a). Thus far, there are two main classes, type I CASTs and type V-K CASTs, which have complementary advantages and disadvantages (Fig. 2b). Type I systems have high product purity (only the intended cargo is inserted at the target site, an outcome called a simple insertion) and high specificity (insertions occur only at the intended target site), but they are larger, have more components, and can give inverted insertions. Type V-K systems are smaller, have fewer components, and give unidirectional insertions, but they generate undesired cointegrate products (consisting of cargo duplication and plasmid backbone insertion) (Fig. 2c) and have lower specificity (Fig. 2d).
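One way to make the two terms above concrete is to treat them as simple fractions of counted insertion events. The sketch below is only an illustration of that reading: the function names are hypothetical, the counts are made up, and the exact metrics used in the study may be defined differently.

```python
def product_purity(simple: int, cointegrate: int) -> float:
    """Fraction of on-target insertions that are simple insertions.

    Assumed definition for illustration: simple / (simple + cointegrate).
    """
    total = simple + cointegrate
    if total == 0:
        raise ValueError("no insertion events observed")
    return simple / total


def specificity(on_target: int, off_target: int) -> float:
    """Fraction of all insertions that landed at the intended target site.

    Assumed definition for illustration: on_target / (on_target + off_target).
    """
    total = on_target + off_target
    if total == 0:
        raise ValueError("no insertion events observed")
    return on_target / total


# Hypothetical counts, NOT data from the study.
print(product_purity(simple=80, cointegrate=20))  # 0.8
print(specificity(on_target=90, off_target=10))   # 0.9
```

Under this reading, a system can score well on one metric and poorly on the other, which is the trade-off between the two CAST classes described above.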
Ideally, we could engineer a CAST that combines the advantages of both types while avoiding their disadvantages. We hypothesized that starting with a type V-K CAST and engineering in purity and specificity would be much more feasible than starting with a type I CAST and trying to reduce its size and complexity and to enforce unidirectional insertion. That became our goal.
First, we recognized that the main difference between type I and type V-K CASTs that leads to the product purity difference is an enzyme called TnsA, which is present in type I systems but absent in all known type V-K systems. TnsA and the main transposase, TnsB, work together to excise the cargo on the donor plasmid – ultimately leading to cut-and-paste transposition (Fig. 3a). Lack of TnsA means only a single nick is created on each side of the cargo – leading to a complex mechanism called replicative transposition, which results in undesired cointegrate products (Fig. 3b).
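The difference between the two outcomes can be pictured with a toy model that treats DNA as an ordered list of named segments. This is a deliberately simplified sketch: the segment labels and function names are hypothetical, and the code only reproduces the final product structures described above, not the underlying strand-level biochemistry.

```python
def cut_and_paste(target, site, cargo):
    """Cargo is fully excised from the donor and inserted: a simple insertion."""
    return target[:site] + cargo + target[site:]


def replicative(target, site, cargo, backbone):
    """Cargo is only nicked, so donor and target end up joined: a cointegrate.

    The product carries two cargo copies flanking the donor plasmid backbone.
    """
    return target[:site] + cargo + backbone + cargo + target[site:]


# Hypothetical segment labels for illustration.
target = ["genome_left", "genome_right"]
cargo = ["LE", "payload", "RE"]          # transposon left end, payload, right end
backbone = ["plasmid_backbone"]

print(cut_and_paste(target, 1, cargo))
print(replicative(target, 1, cargo, backbone))
```

The second product contains the payload twice plus the plasmid backbone, which is why cointegrates count against product purity.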
We asked whether we could add the missing TnsA function in type V-K CASTs by fusing an orthogonal DNA nickase to TnsB, thereby switching its transposition mechanism from replicative to cut-and-paste and improving insertion product purity. The nickase recognition sequence could be encoded on the donor backbone, which would not complicate RNA-guided programmability. An ideal nickase fusion would:
- Be small in size
- Have a long recognition sequence so nicking occurs only on the donor and not at the on-target or off-target site(s)
- Have a predictable, strand-specific nick location
- Have been proven to function in bacterial and human cells
One nearly perfect class of enzyme that fits these criteria is homing endonucleases! Interestingly, these proteins preceded CRISPR-Cas as promising genome editing reagents. Thus, we were excited to combine the old and the new in genome editing to enable improved CAST function.
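To see why recognition-sequence length matters so much, a rough back-of-the-envelope estimate helps. The sketch below assumes a genome of uniformly random bases and an exact-match site, both of which are simplifications, and the genome size is an approximate figure used only for illustration. Homing endonuclease sites are commonly cited as roughly 14 to 40 bp long, and real enzymes tolerate some mismatches, so the true number of cleavable sites can be higher than this estimate.

```python
def expected_chance_sites(genome_bp: float, site_bp: int, strands: int = 2) -> float:
    """Expected number of chance exact matches to a site of length site_bp.

    Assumes uniformly random sequence, so each position matches with
    probability 1 / 4**site_bp; strands=2 counts both DNA strands.
    """
    return strands * genome_bp / 4 ** site_bp


HUMAN_GENOME_BP = 3.1e9  # approximate haploid size, an assumption for illustration

for length in (6, 12, 20):
    n = expected_chance_sites(HUMAN_GENOME_BP, length)
    print(f"{length} bp site: ~{n:.3g} expected chance matches")
```

A 6 bp site is expected to occur by chance more than a million times in a genome of this size, whereas a 20 bp site is expected far less than once, which is the property that lets nicking be confined to the donor.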
By fusing a nicking homing endonuclease (nHE) to TnsB of type V-K CASTs, along with modifying the donor plasmid to contain the corresponding nHE site flanking the transposon ends, we created HELIX - which stands for Homing Endonuclease-assisted Large-sequence Integrating CAST-compleX (Fig. 3c).
When characterizing HELIX and comparing it to CASTs, we discovered the following:
- HELIX dramatically increases CAST product purity by reducing cointegrate products. Importantly, integration efficiency remains comparable to the wild-type systems (Fig. 4a).
- The HELIX approach generalizes across diverse CAST systems. One version, AcHELIX, was extremely efficient and had nearly perfect product purity (Fig. 4a).
- HELIX alone is substantially more specific than its parent CAST. As of now, we are unsure why this is the case – though we can generally speculate that the nHE fusion is altering transposome conformation in a favorable way. Additional component fusions (e.g. Cas12k-TniQ or Cas12k-TnsC) further increased specificity in a HELIX, but not CAST, architecture (Fig. 4b).
- A particular factor, called pi protein, further increases HELIX, but not CAST, specificity without component fusions (Fig. 4c). The role of pi protein is still an enigma, though we think it may have something to do with altering donor DNA topology.
- One HELIX variant, coined N7HELIX, showed detectable insertions in human lysate and human cells; however, this occurs at low efficiency with current constructs and conditions (Fig. 4d).
There are several exciting future directions to improve HELIX systems, to further define their mechanism, to enable them to efficiently insert DNA into mammalian genomes, and to apply them in interesting contexts. This study, in combination with other recent exciting work[1, 4-7], highlights a growing interest in trying to harness the diversity of mobile genetic elements and their encoded cargos (e.g. CASTs, recombinases, retrotransposases, homing endonucleases, IscB/TnpB) as novel and targeted genome editing tools.
1. Tou, C.J. & Kleinstiver, B.P. Recent advances in double-strand break-free kilobase-scale genome editing technologies. Biochemistry (2022).
2. Klompe, S.E. et al. Transposon-encoded CRISPR-Cas systems direct RNA-guided DNA integration. Nature 571, 219–225 (2019).
3. Strecker, J. et al. RNA-guided DNA insertion with CRISPR-associated transposases. Science 365(6448), 48–53 (2019).
4. Anzalone, A.V., Gao, X.D., Podracky, C.J. et al. Programmable deletion, replacement, integration, and inversion of large DNA sequences with twin prime editing. Nat. Biotechnol. 40, 731–740 (2022).
5. Yarnall, M.T.N., Ioannidi, E.I., Schmitt-Ulms, C., Krajeski, R.N. et al. Drag-and-drop genome insertion of large sequences without double-strand DNA cleavage using CRISPR-directed integrases. Nat. Biotechnol. (2022).
6. Durrant, M.G., Fanton, A., Tycko, J. et al. Systematic discovery of recombinases for efficient integration of large DNA sequences into the human genome. Nat. Biotechnol. (2022).
7. Altae-Tran, H., Kannan, S. et al. The widespread IS200/605 transposon family encodes diverse programmable RNA-guided endonucleases. Science 374(6563), 57–65 (2021).
Cover image is adapted from smithsonianmag.org. Ernesto del Aguila III/National Human Genome Research Institute/NIH