Research on the DNA “packaging problem” (how 2 meters of mammalian chromatin fits into a nucleus with a diameter on the order of 10⁻⁶ meters) has revealed distinct territories for each chromosome, separate clustering of the transcriptionally active and silent parts of chromosomes, and more frequent chromatin contacts within the boundaries of genome stretches around 100 kb long. To understand the control of gene expression, a particular focus lies on chromatin loops, which bring regulatory DNA elements into close proximity to transcriptional units. However, the spatial organization within the nucleus of the genome's actual transcriptional output, RNA, has so far received less attention.
Several characteristics of the transcriptome suggest evolutionary benefits of 3-dimensional nuclear RNA organization: i) Cellular RNA is found at concentrations that initiate self-assembly, with or without proteins, under in vitro conditions¹. In vivo, reversible, liquid, membrane-less assemblies of RNA and proteins can increase the efficiency of biochemical processes²,³, but irreversible aggregation can be detrimental for cells, so regulating the equilibrium between the different states is crucial. ii) The abundances of RNAs in cells span several orders of magnitude, with ribosomal RNA contributing around 50% of the nuclear transcriptome and 90% or more in whole cells. Dynamic sequestration of such highly expressed RNAs might help to avoid sponging of RNA-binding proteins into non-functional interactions with high-abundance transcripts. iii) RNA in nuclei can diffuse non-directionally once synthesized from the encoding gene⁴. However, the spatial reach of an RNA might be locally restricted if the RNA is sufficiently short-lived and/or chromatin-retained, as suggested by RNA-FISH images of nascent transcripts that appear as spots or clouds around the corresponding gene loci⁵. For regulatory RNAs, which affect the expression of other genes, this can limit the number of genes in 3-dimensional space that they can act upon. In contrast, regulatory proteins such as transcription factors enter nuclei through pores from all directions and usually rely on binding to defined DNA sequences to achieve specificity. Consequently, the positioning of regulatory RNAs and of the genes that encode them shapes the spatial regulatory landscape in nuclei.
Excited about the prospect of these different aspects of nuclear organization, we set out to develop a sequencing method that could directly infer proximity between RNAs. We started with experiments based on ligation between RNA ends in chemically crosslinked cells to obtain pairwise contacts, but soon revised our approach. Ligation-based methods are limited to very short distances between transcripts and can only join two nearby transcripts, rather than capture larger groups of RNAs in close proximity to one another. We chose instead to sequence RNA from individual subnuclear chunks of chemically crosslinked nuclei after tagging all transcripts contained in a chunk with the same, unique DNA barcode in water-in-oil emulsion droplets. In “Proximity RNA-seq”, no distance cut-off applies: proximities are identified whenever RNAs are found co-barcoded more frequently than expected from their abundance. Barcoding through random priming and reverse transcription in droplets is not restricted to pairs of RNAs but allows the detection of groups of transcripts. With a simple vortexing protocol, and without the need for microfluidics equipment, millions of droplets can be generated rapidly. Beads can be barcoded by emulsion PCR, and RNA in subnuclear chunks can later be tagged with the bead-bound barcodes in such droplets. Soon after we prototyped Proximity RNA-seq, single-cell RNA sequencing in droplets was published⁶,⁷, which suggested that our approach of in-droplet barcoding of subcellular chunks is suitable for inferring RNA proximity in cells. Steven Wingett of the Babraham Bioinformatics group led the development of CloseCall, a tailored pipeline to convert raw sequencing data into RNA proximities.
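The co-barcoding criterion described above can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not the statistics implemented in CloseCall: the function name `cobarcoding_enrichment`, the toy barcodes and the transcript names are all hypothetical, and the expected count simply assumes that transcripts are distributed over barcodes independently of one another.

```python
from collections import Counter
from itertools import combinations


def cobarcoding_enrichment(barcode_to_rnas):
    """Compare observed and expected co-barcoding for each transcript pair.

    barcode_to_rnas maps a barcode to the transcripts tagged with it.
    Returns {(rna_a, rna_b): (observed, expected, ratio)}.
    Assumption: the expected count treats transcripts as independently
    distributed over barcodes (a naive null model, chosen for illustration).
    """
    n_barcodes = len(barcode_to_rnas)
    occurrence = Counter()  # number of barcodes containing each transcript
    observed = Counter()    # number of barcodes containing both transcripts

    for rnas in barcode_to_rnas.values():
        unique = sorted(set(rnas))  # count each transcript once per barcode
        occurrence.update(unique)
        observed.update(combinations(unique, 2))

    result = {}
    for (a, b), obs in observed.items():
        expected = occurrence[a] * occurrence[b] / n_barcodes
        result[(a, b)] = (obs, expected, obs / expected)
    return result


# Hypothetical toy data: six barcodes, i.e. six subnuclear chunks.
toy = {
    "BC1": ["rRNA", "snoRNA_X"],
    "BC2": ["rRNA", "snoRNA_X"],
    "BC3": ["rRNA", "mRNA_Y"],
    "BC4": ["mRNA_Y"],
    "BC5": ["rRNA"],
    "BC6": ["mRNA_Z"],
}

scores = cobarcoding_enrichment(toy)
for pair, (obs, exp, ratio) in sorted(scores.items()):
    print(pair, obs, round(exp, 2), round(ratio, 2))
```

In this toy example the abundant rRNA shares barcodes with both other transcripts, but only the pairing with snoRNA_X exceeds what its abundance alone would predict (ratio above 1). A real analysis would also need a significance test and far more barcodes than shown here.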
Diving further into Proximity RNA-seq data analysis, we started to learn how the transcriptome is non-randomly positioned and organized within the nucleus. Using the nucleolus as a landmark nuclear structure, we found that different transcripts are located at distinct distances from the nucleoli. The distance from the nucleoli correlates with the degree of alternative splicing and tissue specificity of transcripts. Interestingly, the transcriptional output measured by Proximity RNA-seq revealed that heterochromatin peripheral to nucleoli is interspersed with regions of active transcription. Together with Irene Farabella and Marc Marti-Renom from the CRG, Barcelona, we were able to identify different chromatin compartments in whole chromosome 3D models with differential RNA density and kinetic parameters of transcription. Proximity RNA-seq and biological findings are summarized in our paper in Nature Biotechnology (https://www.nature.com/articles/s41587-019-0166-3).
With growing evidence that RNA functions as a key regulator of phase separation and spatial organization, we believe that transcriptome-wide assessment of the positioning and clustering of transcripts in cells will be very valuable, and that Proximity RNA-seq is likely to serve as a blueprint for such measurements in future studies.
1. Van Treeck, B. & Parker, R. Emerging Roles for Intermolecular RNA-RNA Interactions in RNP Assemblies. Cell 174, 791-802 (2018).
2. Stanek, D. et al. Spliceosomal small nuclear ribonucleoprotein particles repeatedly cycle through Cajal bodies. Mol Biol Cell 19, 2534-2543 (2008).
3. Klingauf, M., Stanek, D. & Neugebauer, K.M. Enhancement of U4/U6 small nuclear ribonucleoprotein particle association in Cajal bodies predicted by mathematical modeling. Mol Biol Cell 17, 4972-4981 (2006).
4. Misteli, T. Physiological importance of RNA and protein mobility in the cell nucleus. Histochemistry and cell biology 129, 5-11 (2008).
5. Engreitz, J.M., Ollikainen, N. & Guttman, M. Long non-coding RNAs: spatial amplifiers that control nuclear structure and gene expression. Nat Rev Mol Cell Biol 17, 756-770 (2016).
6. Macosko, E.Z. et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell 161, 1202-1214 (2015).
7. Klein, A.M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187-1201 (2015).