How to search for RNA structures. Theoretical concepts in evolutionary biotechnology
Abstract
The relation between RNA sequences and minimum free energy secondary structures is viewed as a mapping from sequence space into shape space. The properties of such mappings depend strongly on the ratios of the numbers of sequences and structures and, hence, substantial differences are observed between samples of structures derived from AUGC, pure AU or pure GC sequences. Statistical analysis of large samples is used to demonstrate that structures from AUGC sequences are much less sensitive to point mutations than those from sequences containing exclusively AU or GC. The frequency with which a structure is realized in sequence space is inversely proportional to some power c > 1 of the structure's frequency rank, thus following a (generalized) Zipf law. For long sequences the exponent approaches c = 1. An inverse folding algorithm is used to compute samples of sequences folding into the same secondary structure. These sequences are distributed randomly in sequence space. Common structures form extended neutral networks along which populations can migrate through the entire sequence space without changing structure. In this migration, moves of Hamming distance d = 1 and d = 2 are accepted in order to allow for base and base pair e...
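The moves of Hamming distance d = 1 and d = 2 mentioned above can be made concrete with a minimal Python sketch. It only enumerates the sequences reachable by such moves; it does not fold sequences or decide which moves are accepted. The names `ALPHABET`, `hamming`, and `neighbors`, as well as the example sequence, are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations, product

# Assumption: the four-letter RNA alphabet used for the AUGC samples.
ALPHABET = "AUGC"


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def neighbors(seq, d=1, alphabet=ALPHABET):
    """Yield every sequence at exactly Hamming distance d from seq."""
    for positions in combinations(range(len(seq)), d):
        # At each chosen position, allow every letter except the current one.
        choices = [[c for c in alphabet if c != seq[i]] for i in positions]
        for replacement in product(*choices):
            mutant = list(seq)
            for i, c in zip(positions, replacement):
                mutant[i] = c
            yield "".join(mutant)


if __name__ == "__main__":
    seq = "GGCAU"  # hypothetical example sequence
    one_step = list(neighbors(seq, d=1))
    two_step = list(neighbors(seq, d=2))
    print(len(one_step), len(two_step))
```

For a sequence of length n over a four-letter alphabet there are 3n neighbors at d = 1 and 9·n(n−1)/2 at d = 2. A neutral-network walk would keep only those neighbors that fold into the same structure, which requires a folding routine that is outside the scope of this sketch.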