15 non cross-reacting protein glues for one-pot seamless assembly of multiple peptides

Inteins are remarkable seamless protein ligation tools with demonstrated broad applications in biotechnology and biochemistry. Our expanded library provides orthogonal split inteins that can be used simultaneously, thus unlocking the full potential of this exciting tool in many application areas.

By Filipe Pinto and Ella Lucille Thornton

Nature has evolved numerous proteins for the manipulation of DNA, which we use in the lab on a daily basis. We cut it, ligate it, modify it; but what about proteins? Of course, there are proteases that can cut proteins, but tools for the post-translational ligation of proteins are somewhat more unusual. Here, we show the remarkable capabilities of split inteins as protein ligation tools (Fig. 1).

Figure 1 | Protein ligation using split inteins.

Inteins are protein segments capable of protein ligation and have been referred to as “Nature's gift to protein chemists” (ref. 1). An intein excises itself from its host protein and joins the flanking sequences (the exteins) with a peptide bond, a process known as protein splicing (trans-splicing if the intein is split). Because the intein itself is not present in the final sequence, the ligation is scarless.

Two characteristics make inteins exceptionally valuable: the splicing reaction is self-catalytic, relying solely on protein folding, and the inteins are not part of the final product. Together, these make them remarkable tools for both protein engineering and synthetic biology applications.

As nothing is flawless, inteins have particular preferences for the amino acid residues on either side of the ligation site. Deviation from these 6-residue sequences (three residues on each extein) may affect splicing efficiency, which makes life difficult for those wishing to use inteins with proteins that cannot tolerate amino acid additions or substitutions.
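As a rough illustration of the junction constraint, the sketch below counts how many of the six residues around a candidate ligation site differ from an intein's preferred extein residues. The function name `junction_mismatches` and the sequences are invented for illustration and do not come from the paper. A mismatch count is only a crude first filter, since the effect of each substitution on splicing efficiency has to be measured.

```python
def junction_mismatches(target, site, native_n, native_c):
    """Count residues around `site` that differ from the intein's preferred exteins.

    `site` is the index at which the intein would be inserted, i.e. between
    target[site - 1] and target[site]. `native_n` holds the three preferred
    residues upstream of the junction, `native_c` the three downstream.
    """
    if len(native_n) != 3 or len(native_c) != 3:
        raise ValueError("expected three residues on each side of the junction")
    if site < 3 or site + 3 > len(target):
        raise ValueError("site is too close to the end of the target sequence")

    upstream = target[site - 3:site]
    downstream = target[site:site + 3]
    observed = upstream + downstream
    preferred = native_n + native_c
    return sum(1 for a, b in zip(observed, preferred) if a != b)


# Hypothetical target sequence and extein preferences, for illustration only.
target = "AAAGGYCFNKK"
perfect_site = junction_mismatches(target, 6, "GGY", "CFN")
shifted_site = junction_mismatches(target, 5, "GGY", "CFN")
```

Scanning every position of a target with a function like this, for every intein in a library, gives a quick shortlist of sites that need the fewest substitutions.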

The more you have, the more you can choose from!

Bearing this in mind, we aimed to expand the set of orthogonal inteins currently available, both to increase the chances of finding inteins compatible with a given target protein and to allow several inteins to be used simultaneously. To this end, we searched the literature and InBase, the intein database (ref. 2), for inteins with different native insertion sites (exteins) and selected an initial set of 34 phylogenetically distant inteins (Fig. 2), on the assumption that homology correlates negatively with orthogonality.

Figure 2 | Phylogram based on the structural alignment of the 34 inteins selected for characterization (inteins retrieved from InBase are in bold).

Next came the question of how to efficiently characterize such a large library of inteins under similar conditions and close to their native context. To address this, we split the fluorescent protein mCherry and observed how efficiently and quickly each split intein in our library was able to glue the two halves back together and reconstitute fluorescence (Fig. 3a). Not only did this approach allow us to analyse how each intein performs, it also gave us a high-throughput way to screen for cross-reactivity between the inteins (Fig. 3b). We identified 15 highly orthogonal split inteins, five of which were previously uncharacterised, and defined reaction conditions in which 10 of them can be used concurrently in vitro.

Figure 3 | A novel split mCherry platform (a) for rapidly assessing split inteins trans-splicing and orthogonality (b).
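To make the idea of an orthogonality screen concrete, here is a minimal sketch of how a pairwise cross-reactivity matrix could be reduced to a mutually orthogonal set. It is not the analysis used in the paper: the intein names, the fluorescence values, and the 0.1 threshold are all invented, and the exhaustive search only suits small toy libraries.

```python
from itertools import combinations


def is_orthogonal(subset, signal, threshold):
    """True if no N-half in `subset` reacts with a non-cognate C-half in it."""
    return all(
        signal[(n_half, c_half)] < threshold
        for n_half in subset
        for c_half in subset
        if n_half != c_half
    )


def largest_orthogonal_set(names, signal, threshold):
    """Return the first largest subset whose non-cognate pairs all stay below threshold.

    Exhaustive search from the largest subset size downwards; fine for a toy
    example, but the number of subsets grows exponentially with library size.
    """
    for size in range(len(names), 0, -1):
        for subset in combinations(names, size):
            if is_orthogonal(subset, signal, threshold):
                return subset
    return ()


# Invented data: normalised fluorescence for each (N-half, C-half) pairing.
names = ["A", "B", "C", "D"]
signal = {(n, c): (1.0 if n == c else 0.02) for n in names for c in names}
signal[("A", "B")] = 0.6  # hypothetical cross-reaction: A's N-half with B's C-half

orthogonal = largest_orthogonal_set(names, signal, threshold=0.1)
```

In this toy data set, A and B cannot be used together, so the search settles on a three-member set that excludes one of them.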

“Once you have the tools to build your dreams, they are a step closer to become a reality!” FP

With this expanded library in hand, we further demonstrated its versatility in enabling new synthetic biology (in vivo complex logic circuit design) and protein engineering (in vitro modular protein assembly) applications.

(i) We have built and connected three orthogonal AND gates based on intein-split extracytoplasmic function (ECF) sigma factors to develop a 3-input 3-output integrated logic circuit (Fig. 4a). It produces fluorescent outputs when Escherichia coli cells are exposed to two or more inputs and discriminates between them by reporting a corresponding fluorescent protein.

(ii) We have performed the in vitro assembly of a large (~226 kDa), highly repetitive protein from smaller precursors expressed individually in different bacterial cells. We used cleared E. coli lysates directly (no purification of the precursors required) to assemble the protein either in ‘one pot’ using five orthogonal inteins (Fig. 4b) or by on-column assembly, using only two inteins to glue the protein fragments one after the other and reusing the cell lysates.
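The one-pot strategy in (ii) can be sketched as a simple design rule, assuming a linear assembly in which every junction gets its own orthogonal intein: the upstream fragment carries that intein's N-terminal half at its C-terminus, and the downstream fragment carries the C-terminal half at its N-terminus. The fragment and intein names below are placeholders, and `design_precursors` is a hypothetical helper rather than anything from the paper.

```python
def design_precursors(fragments, inteins):
    """Lay out one precursor per fragment for a linear one-pot assembly.

    Junction i (between fragments i and i + 1) is assigned inteins[i], so
    joining n fragments needs n - 1 mutually orthogonal inteins.
    """
    junctions = len(fragments) - 1
    if junctions > len(inteins):
        raise ValueError(
            f"{len(fragments)} fragments need {junctions} inteins, "
            f"but only {len(inteins)} were given"
        )

    precursors = []
    for i, fragment in enumerate(fragments):
        parts = []
        if i > 0:
            # C-terminal intein half of the upstream junction sits before the fragment.
            parts.append(f"{inteins[i - 1]}-C")
        parts.append(fragment)
        if i < junctions:
            # N-terminal intein half of the downstream junction sits after the fragment.
            parts.append(f"{inteins[i]}-N")
        precursors.append("::".join(parts))
    return precursors


# Placeholder names for illustration only.
layout = design_precursors(["F1", "F2", "F3"], ["I1", "I2"])
```

Under this rule, the number of fragments that can be joined in a single pot is capped at one more than the number of inteins that stay orthogonal under the chosen reaction conditions. The on-column route avoids that cap by adding fragments one at a time.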

Figure 4 | Split inteins enabled complex genetic circuits (a) and in vitro large protein assemblies (b).
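The behaviour of the circuit in (i) can be written out as a truth table. The sketch below assumes one AND gate per pair of inputs, each driving its own reporter; that wiring, along with the input and reporter names, is an assumption made for illustration. It models only the Boolean logic, not the split sigma factors or their kinetics.

```python
from itertools import combinations, product

INPUTS = ("in1", "in2", "in3")  # hypothetical input names

# Assumed wiring: one AND gate for every pair of inputs, each with its own reporter.
GATES = {f"reporter_{a}_{b}": (a, b) for a, b in combinations(INPUTS, 2)}


def circuit(state):
    """Evaluate every AND gate for a dict mapping input name -> bool."""
    return {reporter: state[a] and state[b] for reporter, (a, b) in GATES.items()}


def truth_table():
    """Return (input state, reporter state) for all eight input combinations."""
    rows = []
    for bits in product((False, True), repeat=len(INPUTS)):
        state = dict(zip(INPUTS, bits))
        rows.append((state, circuit(state)))
    return rows


for state, outputs in truth_table():
    active = [name for name, on in state.items() if on]
    lit = [name for name, on in outputs.items() if on]
    print(f"inputs on: {active or 'none'} -> reporters on: {lit or 'none'}")
```

With this wiring, no reporter is on when fewer than two inputs are present, exactly one reporter identifies which pair is present, and all three reporters are on when every input is present.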

This work provides an expanded library of orthogonal split inteins and shows its potential both in vivo and in vitro. In living cells, it allows genetic logic circuit design to be scaled up. In vitro, it offers a rapid and simple way to seamlessly assemble large repetitive proteins that are of biotechnological interest but are often difficult to clone and produce using heterologous expression systems.

We believe that our work will be of interest to the broad scientific community, in particular to synthetic and chemical biology researchers who seek new enabling tools to scale up gene circuit design and new efficient methods for protein assembly and biomaterial engineering.

The paper:

Pinto F, Thornton EL & Wang B. An expanded library of orthogonal split inteins enables modular multi-peptide assemblies. Nature Communications, 11, 1529 (2020). DOI: 10.1038/s41467-020-15272-2


1.   Shah NH & Muir TW. Inteins: nature’s gift to protein chemists. Chem. Sci. 5, 446-61 (2014).

2.   Perler FB. InBase: the Intein Database. Nucleic Acids Res. 30, 383-84 (2002).