RDDpred: A condition-specific RNA-editing prediction model from RNA-seq data

Min su Kim, Benjamin Hur, Sun Kim

Research output: Contribution to journal (Article, peer-reviewed)

27 Scopus citations

Abstract

Background: RNA-editing is an important post-transcriptional RNA sequence modification performed by two catalytic enzymes, ADAR (A-to-I) and APOBEC (C-to-U). High-throughput sequencing technologies have enabled active investigation of the biological function of RNA-editing. Currently, RNA-editing is considered a key regulator of various cellular functions, such as protein activity, alternative splicing patterns of mRNA, and substitution of miRNA targeting sites. DARNED, a public database of RNA-DNA differences (RDDs), reports more than 300 thousand RNA-editing sites detected in the human genome (hg19). Moreover, multiple studies have suggested that RNA-editing events occur in highly specific conditions. According to DARNED, 97.62 % of registered editing sites were detected in a single tissue or in a specific condition, which also supports the view that RNA-editing events occur condition-specifically. Since RNA-seq can capture the whole landscape of the transcriptome, it is widely used for RDD prediction. However, significant numbers of false positives or artefacts can be generated when detecting RNA-editing from RNA-seq. Since it is difficult to perform experimental validation at the whole-transcriptome scale, a powerful computational tool is needed to distinguish true RNA-editing events from artefacts.

Results: We developed RDDpred, a Random Forest RDD classifier. RDDpred reports potentially true RNA-editing events from RNA-seq data. RDDpred was tested with two publicly available RNA-editing datasets and successfully reproduced the RDDs reported in the two studies (90 %, 95 %) while rejecting false discoveries (NPV: 75 %, 84 %).

Conclusion: RDDpred automatically compiles condition-specific training examples without experimental validation and then constructs an RDD classifier. As far as we know, RDDpred is the first machine-learning based automated pipeline for RDD prediction. We believe that RDDpred will be very useful and can contribute significantly to the study of condition-specific RNA-editing. RDDpred is available at http://biohealth.snu.ac.kr/software/RDDpred.
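
The abstract reports negative predictive value (NPV) for rejected candidates without defining it. As a minimal sketch, assuming the standard definition NPV = TN / (TN + FN), the calculation can be written as follows. The function name and the counts are hypothetical and chosen only to illustrate the arithmetic; they are not taken from the paper or from RDDpred itself.

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """Fraction of rejected candidates that are genuinely artefacts.

    true_negatives:  rejected candidates that really are artefacts
    false_negatives: rejected candidates that were in fact true editing sites
    """
    rejected = true_negatives + false_negatives
    if rejected == 0:
        raise ValueError("NPV is undefined when no candidates were rejected")
    return true_negatives / rejected


# Hypothetical counts, for illustration only (not data from the study).
npv = negative_predictive_value(true_negatives=75, false_negatives=25)
print(f"NPV = {npv:.0%}")
```

Under this definition, an NPV of 75 % means that three out of every four candidates the classifier rejects are indeed artefacts.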

Original language: English
Article number: 5
Journal: BMC Genomics
Volume: 17
Issue number: 1
State: Published - Jan 11 2016
Externally published: Yes

Funding

This research was supported by the Bio & Medical Technology Development Program of the NRF funded by the Korean government, MSIP (No. NRF-2014M3C9A3063541). This research was also supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (No. NRF-2012M3C4A7033341). The publication cost will be paid by the Seoul National University Office of Research.

Funders (funder number)
Ministry of Science, ICT and Future Planning (NRF-2014M3C9A3063541, NRF-2012M3C4A7033341)
National Research Foundation of Korea

    Keywords

    • Condition-specific
    • Machine-learning
    • RNA-editing
    • RNA-seq
    • Random forest
    • Systematic artefact
