We have just been lucky enough to have a paper published in Nature Genetics (Mercer et al. 2013) showing how the 3D structure of the genome appears to play an important role in gene expression. You can find the original paper here and the press release here.
While it has been known for a few years that many exons are marked by a nucleosome sitting over them, we found that a subset of exons shows the opposite. These exons instead seem to have nucleosomes sitting adjacent to them, and this lack of nucleosomes helps create DNaseI hypersensitive sites (DHS) at these locations. Looking across the large number of cell types investigated as part of the ENCODE project, we can also see that the DNaseI sensitivity of these exons is specific to subsets of cells.
Where this gets more interesting is that these DNaseI-marked exons also show ChIP-seq profiles that look like those normally found at other kinds of genomic elements. Some of these exons are marked by protein binding and chromatin modification profiles indicative of promoters, enhancers and CTCF insulator regulatory regions. We don’t think, however, that this means these exons actually have these proteins bound. Instead, we find that they are in close physical proximity to promoters, enhancers and so on, and so carry a shadow of their protein binding and chromatin profile. When ChIP-seq is performed, chromatin is usually fixed to stick protein complexes to the pieces of DNA they are interacting with. However, this fixation procedure will also bring along pieces of DNA that are in close proximity, a fact that is exploited in some procedures for determining 3D genome structure (e.g. Hi-C).
There are several implications to this finding. Firstly, as emphasised in the paper, this means that a subset of exons, marked by DNaseI sensitivity, is being brought into close proximity with upstream promoters or with regulatory elements (be they upstream or downstream). Secondly, it means that ChIP-seq contains information about the 3D structure of the genome, although at present determining which exons are interacting where is better achieved by methods such as ChIA-PET. That said, internal exons with promoter signals and no sign of transcriptional initiation (from CAGE tag sequencing, for example) could reasonably be hypothesised to interact with their own gene promoter.
Thirdly, it may also call for some reinterpretation of current ChIP-seq results. Generally, the ChIP-seq signal we found on exons was weaker than that found on the promoter, enhancer or insulator elements they were close to. This suggests that weaker ChIP-seq signals in parts of the genome with no evidence of binding motifs for the proteins being investigated may instead reflect a physical interaction. We didn’t really go into this, but others may want to look through existing ChIP-seq datasets with matching ChIA-PET data and work out what proportion of ChIP-seq peaks may be interaction “shadows” rather than actual binding sites.
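As a rough illustration of that last suggestion, here is a minimal Python sketch of how one might flag candidate "shadow" peaks. It is not the method used in the paper. The criteria (no binding motif, a ChIA-PET-style link to another region, and a stronger motif-containing peak at the linked region) are assumptions drawn loosely from the reasoning above, and the names Peak, Interaction, is_candidate_shadow and shadow_fraction are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    """A ChIP-seq peak (hypothetical minimal representation)."""
    chrom: str
    start: int
    end: int
    signal: float
    has_motif: bool  # assumed to come from a separate motif scan


@dataclass(frozen=True)
class Interaction:
    """One ChIA-PET-style interaction; each anchor is (chrom, start, end)."""
    anchor_a: tuple
    anchor_b: tuple


def overlaps(peak, anchor):
    """True if the peak and the anchor share at least one base (half-open coordinates)."""
    chrom, start, end = anchor
    return peak.chrom == chrom and peak.start < end and start < peak.end


def is_candidate_shadow(peak, peaks, interactions):
    """Flag a motif-less peak linked to a stronger, motif-containing peak.

    These criteria are illustrative assumptions, not validated thresholds.
    """
    if peak.has_motif:
        return False
    for inter in interactions:
        # Try the interaction in both orientations.
        for near, far in ((inter.anchor_a, inter.anchor_b),
                          (inter.anchor_b, inter.anchor_a)):
            if not overlaps(peak, near):
                continue
            for other in peaks:
                if (other is not peak and overlaps(other, far)
                        and other.has_motif and other.signal > peak.signal):
                    return True
    return False


def shadow_fraction(peaks, interactions):
    """Proportion of peaks flagged as candidate interaction shadows."""
    if not peaks:
        return 0.0
    flagged = sum(is_candidate_shadow(p, peaks, interactions) for p in peaks)
    return flagged / len(peaks)


# Toy data, invented purely for illustration.
promoter = Peak("chr1", 1000, 1500, signal=50.0, has_motif=True)
exon = Peak("chr1", 20000, 20300, signal=8.0, has_motif=False)
isolated = Peak("chr2", 500, 800, signal=5.0, has_motif=False)
loops = [Interaction(("chr1", 900, 1600), ("chr1", 19900, 20400))]

print(shadow_fraction([promoter, exon, isolated], loops))
```

In this toy example only the exon peak is flagged: it has no motif, is weaker than the promoter peak, and is linked to it by an interaction. The isolated weak peak is not flagged because nothing links it to a stronger site. Real data would need proper interval handling, motif scanning and signal normalisation.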
So what are the biological implications of exons looping back to promoters and distal regulatory elements? We found that DHS were associated with alternative splicing of the marked exons, especially when the signal indicated looping to a promoter or enhancer. This was cell line specific: we only saw the association with alternative splicing in cell lines where these exons were marked by DHS. The same exons with no mark in a different cell type were not associated with alternative splicing. How this all works mechanistically is unclear, but the looping out of intervening sequence may help splicing to occur correctly. There is good evidence that splicing factors and regulators interact with the transcription initiation machinery, so bringing the exons into close contact with these proteins may affect their splicing. This is reminiscent of the finding that different promoters can lead to different splicing outcomes. Future work may discover other functions, including the importance of exon association with CTCF elements.
Lastly, this study was only made possible by the incredible amounts of data generated by the ENCODE consortium, and it benefited greatly from the different genome analyses carried out on a wide variety of matched cell types. For example, our initial finding of cell type specific DHS signals on specific exons was made possible by the wide range (86, in fact) of cell type profiles generated by the Stamatoyannopoulos lab. Likewise, the ChIA-PET validation of ChIP-seq physical interaction signals depended on the same cell lines having been used for both analyses. Thanks, big data.
Mercer et al. DNase I–hypersensitive exons colocalize with promoters and distal regulatory elements (2013). Nat Genet. doi: 10.1038/ng.2677