Description
The MALAT1 (Metastasis-Associated Lung Adenocarcinoma Transcript 1) gene encodes a non-coding RNA that is processed into a long nuclear-retained transcript (MALAT1) and a small cytoplasmic tRNA-like transcript (mascRNA). Using an RNA sequence- and structure-based covariance model, we identified more than 130 genomic loci in vertebrate genomes containing the MALAT1 3’-end triple helix structure and its immediate downstream tRNA-like structure, including 44 in the green lizard Anolis carolinensis. Structural and computational analyses revealed coevolution of the 3’-end module. MALAT1-like genes in Anolis carolinensis are highly expressed in adult testis, so we named them testis-abundant long noncoding RNAs (tancRNAs). TancRNA loci produce multiple small RNA species, including piRNAs, from the antisense strand. The coevolved 3’-end of tancRNAs serves as a potential target for the PIWI-piRNA complex. Thus, we have identified an evolutionarily conserved class of lncRNAs with similar structural constraints, post-transcriptional processing, subcellular localization and a distinct function in spermatocytes.

Overall design: Four RNA-Seq datasets analyzing gene expression in Anolis carolinensis: NSR-Seq of brain and testis, small RNA-Seq of testis (19-33 nt), and small RNA-Seq of testis (52-68 nt).