gkmSVM: an R package for gapped-kmer SVM

Bioinformatics
Mahmoud Ghandi, Michael A Beer

Abstract

We present a new R package for training gapped-kmer SVM classifiers for DNA and protein sequences. We describe an improved algorithm for kernel matrix calculation that reduces run time about 2- to 5-fold relative to our original gkmSVM algorithm. The package supports several sequence kernels, including gkmSVM, kmer-SVM, the mismatch kernel and the wildcard kernel. The gkmSVM package is freely available through the Comprehensive R Archive Network (CRAN) for Linux, Mac OS and Windows platforms. The C++ implementation is available at www.beerlab.org/gkmsvm. Contact: mghandi@gmail.com or mbeer@jhu.edu. Supplementary data are available at Bioinformatics online.
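For readers unfamiliar with the term, the idea behind a gapped k-mer kernel can be sketched in a few lines. The Python snippet below is a naive illustration only, not the package's R interface and not the improved kernel-matrix algorithm the abstract refers to. It assumes the usual definition: a gapped k-mer feature is a choice of k informative positions within a window of length l, together with the letters at those positions, and the kernel is the normalized dot product of the two sequences' feature counts. The function names and the small default values of l and k are hypothetical, chosen for readability.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def gapped_kmer_counts(seq, l=4, k=2):
    """Count gapped k-mer features in seq (naive enumeration).

    Each feature is (informative positions within the window, letters there).
    """
    counts = Counter()
    for start in range(len(seq) - l + 1):
        window = seq[start:start + l]
        # Every way of keeping k of the l positions and gapping the rest.
        for positions in combinations(range(l), k):
            letters = tuple(window[p] for p in positions)
            counts[(positions, letters)] += 1
    return counts


def gkm_kernel(a, b, l=4, k=2):
    """Normalized dot product of the gapped k-mer count vectors of a and b."""
    ca = gapped_kmer_counts(a, l, k)
    cb = gapped_kmer_counts(b, l, k)
    dot = sum(v * cb[f] for f, v in ca.items())
    norm = sqrt(sum(v * v for v in ca.values()) *
                sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0
```

Calling `gkm_kernel("ACGTACGT", "ACGTTCGT")` gives a similarity between 0 and 1. This direct enumeration scales poorly with l and the number of sequences, which is why efficient kernel computation matters in practice.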
