EHS

Classification of riboswitch sequences using k-mer frequencies.

Biosystems. 2018 Sep 08.

Authors: Guillén-Ramírez HA, Martínez-Pérez IM

Abstract
Riboswitches are non-coding RNAs that regulate gene expression by altering the structural conformation of mRNA transcripts. Their regulation mechanism might be exploited for interesting biomedical applications such as drug targets and biosensors. A major challenge consists in accurately identifying metabolite-binding RNA switches which are structurally complex and diverse. In this regard, we investigated the classification of 16 riboswitch families using supervised learning algorithms trained solely with sequence-based features. We generated a reduced feature set and proposed a visual representation to explore its components. We induced Support Vector Machine, Random Forest, Naive Bayes, J48, and HyperPipes classifiers with our proposed feature set and tested their performance over independent data. Our best multi-class classifier achieved F-measure values of 0.996 and 0.966 in the training and test phases, respectively, outperforming those of a previous approach. When compared against BLAST, our best classifiers yielded competitive results. This work shows that the classifiers trained with our sequence-based feature set accurately discriminate riboswitches.
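
As an illustration of the kind of sequence-based feature the abstract refers to, the following minimal sketch computes relative k-mer frequencies for a single RNA sequence. It is not the authors' pipeline: the abstract does not state the value of k, how the reduced feature set was selected, or how ambiguous bases were handled, so the function name `kmer_frequencies`, the default `k=2`, the T-to-U conversion, and the skipping of windows with non-ACGU symbols are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; DNA input is converted below


def kmer_frequencies(sequence, k=2):
    """Return the relative frequency of every possible k-mer over ACGU.

    Hypothetical helper for illustration only; the paper's actual
    feature extraction may differ.
    """
    seq = sequence.upper().replace("T", "U")
    # Slide a window of length k across the sequence.
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Assumption: windows containing ambiguous symbols (e.g. N) are skipped.
    valid = [w for w in windows if all(c in ALPHABET for c in w)]
    counts = Counter(valid)
    total = len(valid)
    # Enumerate all 4**k k-mers so every sequence yields the same keys.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


freqs = kmer_frequencies("AUGGCUAUGG", k=2)
```

Each sequence maps to a fixed-length vector (4^k entries), which is the form a supervised classifier such as a Support Vector Machine or Random Forest would take as input.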

PMID: 30205141 [PubMed – as supplied by publisher]
