MixMir: microRNA motif discovery from gene expression data using mixed linear models

BioRxiv: the Preprint Server for Biology
Liyang Diao, Kevin C Chen

Abstract

microRNAs (miRNAs) are a class of ~22 nt non-coding RNAs that potentially regulate over 60% of human protein-coding genes. miRNA activity is highly specific, differing between cell types, developmental stages and environmental conditions, so identifying the active miRNAs in a given sample is of great interest. Here we present MixMir, a novel computational approach that analyzes mRNA sequence and gene expression data together. Our method corrects for 3′ UTR background sequence similarity between transcripts, which is known to correlate with mRNA transcript abundance. We demonstrate that after accounting for k-mer sequence similarities in 3′ UTRs, a statistical linear model based on motif presence/absence can effectively discover active miRNAs in a sample. MixMir uses fast software implementations for solving mixed linear models, which are widely used in genome-wide association studies (GWAS). Essentially, we use 3′ UTR sequence similarity in place of population cryptic relatedness in the GWAS problem. Compared to similar methods such as miReduce, Sylamer and cWords, we found that MixMir performed better at discovering true miRNA motifs in three mouse Dicer knockout experiments from different tissues, two of which were ...
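
To make the two ingredients named in the abstract concrete, here is a minimal sketch of a pairwise k-mer similarity matrix between 3′ UTRs (the quantity that would stand in for relatedness in a GWAS-style mixed model) and a motif presence/absence vector. It is an illustration only, not the MixMir implementation: the choice of cosine similarity over k-mer counts, the function names, the value of k, the motif and the toy sequences are all assumptions, since the abstract does not specify how similarity is computed.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping k-mer in one sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count profiles.

    Assumption: cosine similarity is used purely for illustration;
    the abstract does not name a similarity measure.
    """
    dot = sum(a[m] * b[m] for m in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_matrix(utrs, k=3):
    """Pairwise similarity between sequences, analogous to a kinship matrix."""
    counts = [kmer_counts(s, k) for s in utrs]
    return [[cosine_similarity(a, b) for b in counts] for a in counts]


def motif_indicator(utrs, motif):
    """1 if the motif occurs anywhere in the sequence, else 0."""
    return [1 if motif in s else 0 for s in utrs]


# Toy sequences (hypothetical, not real 3' UTRs).
utrs = ["ACGTACGTAC", "ACGTACGTAC", "TTTTGGGGCC"]
K = similarity_matrix(utrs, k=3)       # would play the role of relatedness
x = motif_indicator(utrs, "GTACGT")    # fixed-effect predictor for one motif
```

In a mixed linear model, a matrix like `K` would define the covariance of the random effect and a vector like `x` would be the tested fixed effect, with expression change as the response. Fitting that model is left to dedicated mixed-model software and is not shown here.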

Related Concepts

Gene Expression
RNA, Messenger
Computer Software
Transfection
Cell Line, Tumor
Mice, Knockout
3' Untranslated Regions
Research Study
RNA, Untranslated
MicroRNAs
