There is an interesting new paper out in Genome Research from Eric Lai’s lab (Miura et al. 2013) that finds many genes have much longer 3’UTRs than previously annotated. Sometimes these extended 3’UTRs look constitutive; other times the authors found alternative gene isoforms whose 3’UTRs terminate (on average) several kb further downstream.
In some ways this isn’t too surprising. Having spent a lot of time these past years gazing at the UCSC genome browser, I find it clear that annotated 3’UTRs keep getting longer and longer. For those in the lncRNA field, this presents some difficulties in determining whether an RNA downstream from a 3’UTR in the sense direction is an independent transcript with its own start site, a processed RNA from the 3’UTR, or part of the UTR for which the transcription joining the two simply hasn’t been detected. Generally, transcripts downstream from a 3’UTR that pass whatever cutoffs a study imposes will look like (and be called) long noncoding RNAs (lncRNAs).
Miura et al. find that in some cases putative lncRNAs downstream from 3’UTRs are actually part of those 3’UTRs. For example, some of the <200 putative lncRNAs identified by Ponjavic et al. (2009) that were downstream from coding genes expressed in brain were found to be extended 3’UTRs. Again, this isn’t too surprising, but the analysis would have been more compelling if the authors had used their brain RNA-sequencing data to estimate the percentage that appeared to be UTRs rather than lncRNAs, instead of just showing some northerns for exemplar cases.
Unfortunately for me, our study on lncRNA stability has also come in for a bit of stick over these “not actually lncRNAs”.
Miura et al. say:
“We next analyzed Tcf4, a transcription factor in the Wnt pathway. Recent studies proposed differential stability of TCF4 and its downstream lincRNAs (Clark et al. 2012). However, RNA-seq data indicate continuous transcription downstream from Tcf4 (Fig. 6E). Northern analysis using a probe to the downstream unannotated region supported the presence of a 3’ UTR extension of Tcf4 and did not detect shorter ncRNAs (Fig. 6F).”
“Our studies suggest that additional loci currently annotated as lincRNAs may actually correspond to unannotated 3’ UTR extensions. For example, we provide Northern evidence that lincRNAs described by Mattick and colleagues (Clark et al. 2012) can be detected as stable 3’ UTR extensions of Etv1, Paqr9, and Tcf4 mRNAs (Figs. 1C, 6F).”
My issue with Miura et al here isn’t with their experimental results, which look to be sound methodologically, but with the errors in these statements and the resulting (no doubt accidental) misrepresentation of what we did and found in Clark et al. (2012).
First things first. At no stage in our paper did we say TCF4 had downstream lincRNAs. What we said was that the stability of the TCF4 3’UTR (i.e., part of the coding gene) looked lower than the stability of exons in the body of the gene, which suggested something different was going on with the regulation of the 3’UTR compared to the gene body. So the fact that Miura et al. didn’t find any lncRNAs downstream from the annotated 3’UTR isn’t surprising, since we never claimed any existed.
Second, there is a definitional problem here. Throughout the whole paper, Miura et al. refer to lincRNAs (long intergenic/intervening noncoding RNAs). The idea is that lincRNAs are distant from coding genes. Since there is no standard definition for lincRNAs, how far away depends entirely on the study. LincRNAs are a subset of lncRNAs (long noncoding RNAs), which also include intronic lncRNAs, cis-antisense lncRNAs, etc.
In our study we were aware that some putative lncRNAs might be part of extended 3’UTRs and that with a microarray-based study it would often be difficult to tell these apart. This is why we called any noncoding transcript that initiated less than 30 kb downstream from a stop codon a 3’UTR-associated ncRNA (uaRNA). So, contrary to the Miura et al. assertion, we didn’t call any of these lincRNAs; we called them uaRNAs with the specific knowledge that some could be 3’UTR transcripts. This is part of the reason we split up the different genomic subtypes of lncRNAs in the paper, to show that our lncRNA stability results were not dependent on the uaRNA subset (see figure 5D of Clark et al. 2012). We mentioned this difference specifically in our methods section:
“Intergenic transcripts were defined as ncRNAs not transcribed within 2 kb upstream of a protein-coding region on either strand or 30 kb downstream in the sense direction or 2 kb downstream in the antisense direction. 3’UTR-associated transcripts (uaRNAs) were defined as any ncRNA that initiates within 30 kb downstream from a stop codon on the same strand [to allow for the potential of 3’UTRs that extend beyond their annotations (Moucadel et al. 2007; Mercer et al. 2011)].”
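To make the quoted definitions concrete, here is a minimal Python sketch of how such a positional classification could be written. It is not the code used in Clark et al. (2012). The names (`Interval`, `classify`, `exclusion_window` and so on) are hypothetical, the coding-region interval is assumed to end at the stop codon, "transcribed within" is interpreted as simple interval overlap, and the "other" label is just a placeholder for everything that is neither a uaRNA nor intergenic.

```python
from dataclasses import dataclass

UA_WINDOW = 30_000  # 30 kb downstream, sense strand (from the quoted methods)
FLANK = 2_000       # 2 kb upstream / antisense downstream (from the quoted methods)


@dataclass(frozen=True)
class Interval:
    """Hypothetical record for a coding region or an ncRNA."""
    chrom: str
    strand: str  # "+" or "-"
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate (end >= start)


def tss(t):
    """Transcription start site: left end on '+', right end on '-'."""
    return t.start if t.strand == "+" else t.end


def stop_codon(gene):
    """Assumption: the coding-region interval ends at the stop codon."""
    return gene.end if gene.strand == "+" else gene.start


def is_uarna(nc, gene):
    """True if nc initiates within 30 kb downstream of the stop codon, same strand."""
    if nc.chrom != gene.chrom or nc.strand != gene.strand:
        return False
    d = tss(nc) - stop_codon(gene)
    if gene.strand == "-":
        d = -d  # downstream means lower coordinates on the minus strand
    return 0 < d <= UA_WINDOW


def exclusion_window(nc, gene):
    """Span around gene inside which nc does not count as intergenic."""
    down = UA_WINDOW if nc.strand == gene.strand else FLANK
    if gene.strand == "+":
        return gene.start - FLANK, gene.end + down
    return gene.start - down, gene.end + FLANK


def classify(nc, genes):
    """Return 'uaRNA', 'intergenic' or 'other' for one ncRNA."""
    nearby = [g for g in genes if g.chrom == nc.chrom]
    if any(is_uarna(nc, g) for g in nearby):
        return "uaRNA"
    for g in nearby:
        lo, hi = exclusion_window(nc, g)
        if nc.start <= hi and nc.end >= lo:  # overlaps the excluded span
            return "other"
    return "intergenic"


gene = Interval("chr1", "+", 100_000, 200_000)
print(classify(Interval("chr1", "+", 210_000, 215_000), [gene]))  # uaRNA
print(classify(Interval("chr1", "+", 240_000, 245_000), [gene]))  # intergenic
```

In this toy example a sense-strand transcript starting 10 kb past the stop codon is labelled a uaRNA, while one starting 40 kb past it is labelled intergenic. Moving the 30 kb threshold shifts transcripts between the two classes, which is exactly the cutoff problem discussed in this post.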
Given this, it is not surprising Miura et al. found some uaRNAs that were actually part of extended 3’UTRs. As I said earlier, establishing a cutoff for novel transcripts in a genome-scale data set (or from the FANTOM cDNA data set, which is what our microarray was predominantly designed against) can be difficult. One can, of course, define a long cutoff for where putative 3’UTRs might end and lncRNAs might begin, but there will always be examples of UTR extensions beyond your cutoff and independent transcripts within those cutoffs being missed. This is why Miura et al. is an interesting study, as it expands our knowledge of 3’UTR dynamics, but I think it would have been better, albeit a bit less dramatic, if they had read the studies they criticize a little more carefully.
Miura P, Shenker S, Andreu-Agullo C, Westholm JO, & Lai EC (2013) Widespread and extensive lengthening of 3′ UTRs in the mammalian brain. Genome Research.
Clark MB, et al. (2012) Genome-wide analysis of long noncoding RNA stability. Genome Research 22:885-898.