Targeted & directional nanopore sequencing: Exploring the unknown
Long-read sequencing can resolve complex genomic rearrangements, such as oncogenic fusion genes, but the limited throughput of a single sequencing run restricts accurate detection. We therefore developed an assay in which we target Cas9 to one region of the genome and selectively explore its surroundings.
We like to think that we know everything in the genome. But the more we sequence, the more we discover new things that we have previously missed. Sometimes this limitation is technical, with current methodologies simply unable to sequence certain regions of the genome.
For some time, I have investigated gene fusions in cancer genomes (genomically combined genes that acquire altered, oncogenic properties) from a fundamental angle. However, I continually came back to the same question: why aren’t these fusion genes, which clearly have implications for the patient, detected quickly and robustly in cancer patients? The reason is diversity: diversity in gene fusion partners and diversity in breakpoints, the positions where two genes are joined. Current diagnostic methodologies struggle with this diversity.
I come from a lab with strong expertise in cancer genomics, especially in sequencing technologies such as long-read nanopore sequencing. My group continually looks for ways to use this technology to overcome the technical limitations of other sequencing technologies and to discover what we are missing. For fusion genes, nanopore sequencing should have been the tool of choice, since its long reads span complex regions and it delivers results fast, but the throughput of a single sequencing run is just too low. We needed a method to target the areas of the genome we were interested in and sequence only those regions.
Our breakthrough came only once Oxford Nanopore Technologies provided a protocol for targeted nanopore sequencing based on Cas9 enrichment. This protocol worked well, but it required two known sites flanking the region of interest, so that the region in between could be excised and sequenced. For fusion genes, however, usually only one gene, and therefore only one region, is known. We thought: why focus only on what is known? What if we sequence from one known targeted area and use the long reads to “discover” the surrounding genomic sequence (Figure 1)?
Figure 1: Targeted nanopore sequencing. (A) Genomic DNA with a known fusion partner (green) and surrounding unknown sequences (orange). With our targeted sequencing approach we can enrich for and selectively sequence the region adjacent to the known part of the genome. (B) Normal coverage (2-5X) of a gene fusion using one long-read sequencing run. (C) Targeting Cas9 to one known gene partner at a specific locus and directionally reading into the unknown fusion partner results in much higher coverage of the gene fusion and the ability to discover the fusion partner and the exact genomic breakpoint.
We adapted the protocol and it worked immediately, essentially letting the long reads discover the sequences that followed the targeted area. By designing the guides in a specific way, we could choose the direction in which to sequence, allowing us to target any area of the genome and see what lies around it. My colleague Glen Monroe and I applied this technique to fusion-positive cell lines and tumor samples for this manuscript and even solved diagnostically problematic samples. Because the technique can target any region of the genome and sequence in a chosen direction, it has led to many collaborations, some far from our original research interest. We are excited to help others solve research questions with this technique and are now looking at other applications where sequencing into the unknown can be useful.
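To make the idea of letting long reads "discover" the unknown partner more concrete, here is a minimal Python sketch. It is an illustration only, not the analysis pipeline used for the manuscript. It assumes each read has already been aligned and split into segments; the `Segment` record, the function names, and all coordinates are hypothetical. A read counts as evidence if it starts inside the known, targeted gene and then continues somewhere outside it.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """One aligned piece of a read (hypothetical, simplified record)."""
    read_start: int  # offset within the read where this piece begins
    read_end: int    # offset within the read where this piece ends
    chrom: str       # reference sequence the piece aligns to
    ref_start: int   # leftmost reference coordinate of the alignment
    ref_end: int     # rightmost reference coordinate of the alignment
    strand: str      # "+" or "-"


def overlaps_target(seg: Segment, chrom: str, start: int, end: int) -> bool:
    """True if the segment overlaps the known (targeted) region."""
    return seg.chrom == chrom and seg.ref_start < end and seg.ref_end > start


def find_junction(segments: List[Segment], chrom: str, start: int,
                  end: int) -> Optional[Tuple[str, int]]:
    """Return (partner_chrom, partner_pos) if the read begins in the target
    region and then continues outside it; otherwise return None."""
    ordered = sorted(segments, key=lambda s: s.read_start)
    if len(ordered) < 2 or not overlaps_target(ordered[0], chrom, start, end):
        return None
    for seg in ordered[1:]:
        if not overlaps_target(seg, chrom, start, end):
            # The read enters a "+" alignment at ref_start and a "-"
            # alignment at ref_end.
            pos = seg.ref_start if seg.strand == "+" else seg.ref_end
            return seg.chrom, pos
    return None


def partner_support(reads: List[List[Segment]], chrom: str, start: int,
                    end: int) -> Counter:
    """Count, per partner chromosome, the reads that cross out of the target."""
    counts: Counter = Counter()
    for segments in reads:
        junction = find_junction(segments, chrom, start, end)
        if junction is not None:
            counts[junction[0]] += 1
    return counts


# Made-up example: target region chrA:1000-6000, unknown partner on chrB.
example_reads = [
    [Segment(0, 3000, "chrA", 2000, 5000, "+"),
     Segment(3000, 8000, "chrB", 70000, 75000, "+")],
    [Segment(0, 2900, "chrA", 2100, 5000, "+"),
     Segment(2900, 8000, "chrB", 65000, 70010, "-")],
    [Segment(0, 4000, "chrA", 1500, 5500, "+")],  # stays within the target
]
print(partner_support(example_reads, "chrA", 1000, 6000))
```

In this sketch, a partner chromosome supported by many independent reads, with junction positions that cluster together, would point to a candidate fusion partner and breakpoint. A real analysis would also have to handle sequencing errors, partners on the same chromosome, and alignment artifacts.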