GAD: A Python Script for Dividing Genome Annotation Files into Feature-Based Files.

Interdisciplinary Sciences, Computational Life Sciences
Norhan Yasser, Ahmed Karam

Abstract

Manipulating and analyzing genomic data stored in publicly accessible repositories has become a daily task in genomics and bioinformatics laboratories. Rapid advances in genome sequencing and the emergence of many sequencing projects have pushed bioinformaticians to create a variety of programs and pipelines that automatically analyze such big data, in particular gene annotation pipelines. Easy, simple programs for handling annotation files are very important, particularly for non-developers, because they accelerate genomic data analysis. One of the first tasks when working with genomic annotation files is extracting the different features. To this end, we developed GAD ( https://github.com/bio-projects/GAD ) in Python as a fast, easy, and controlled script for handling annotation files such as GFF3 and GTF. GAD is a cross-platform graphical interface tool that extracts genome features such as intergenic regions, upstream, and downstream genes. In addition, GAD finds all ambiguous sequence ontology names and either extracts them or treats them as genes or transcripts. The results are produced in a variety of file format...
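
To make the idea of feature-based division concrete, the following is a minimal Python sketch and not GAD's actual implementation. It assumes tab-separated, nine-column GFF3/GTF records, groups them by the feature type in column 3, and reports the gaps between consecutive genes on each sequence as intergenic regions. The function names `split_by_feature` and `intergenic_regions` and the sample records are hypothetical, and the sketch ignores strand as well as the regions before the first and after the last gene.

```python
from collections import defaultdict


def split_by_feature(gff_lines):
    """Group GFF3/GTF records by their feature type (column 3)."""
    groups = defaultdict(list)
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed nine-column record
        groups[cols[2]].append(line)
    return dict(groups)


def intergenic_regions(gff_lines, gene_type="gene"):
    """Return (seqid, start, end) gaps between genes, 1-based inclusive."""
    genes = defaultdict(list)
    for line in split_by_feature(gff_lines).get(gene_type, []):
        cols = line.split("\t")
        genes[cols[0]].append((int(cols[3]), int(cols[4])))
    gaps = []
    for seqid, spans in genes.items():
        spans.sort()
        covered_end = spans[0][1]
        for start, end in spans[1:]:
            # A gap exists only if the next gene starts past the covered end.
            if start > covered_end + 1:
                gaps.append((seqid, covered_end + 1, start - 1))
            covered_end = max(covered_end, end)  # handles overlapping genes
    return gaps


# Hypothetical sample records, for illustration only.
demo_lines = [
    "##gff-version 3",
    "chr1\tdemo\tgene\t100\t200\t.\t+\t.\tID=g1",
    "chr1\tdemo\tmRNA\t100\t200\t.\t+\t.\tID=t1;Parent=g1",
    "chr1\tdemo\tgene\t150\t300\t.\t-\t.\tID=g2",
    "chr1\tdemo\tgene\t501\t600\t.\t+\t.\tID=g3",
]

for feature, records in split_by_feature(demo_lines).items():
    print(feature, len(records))
print(intergenic_regions(demo_lines))
```

Each group returned by `split_by_feature` could then be written to its own file, which is the sense in which an annotation file is divided into feature-based files.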

Software Mentioned

GAD (Genome Annotation Divider)
GFF
BEDtools
Lubuntu
gffread
Python
Tkinter
Ensembl
