Structure Mismatch Score {R4RNA}    R Documentation

Scores how a basepair structure fits a sequence

Description

Calculates a score indicating how badly a set of basepairs (i.e. a secondary structure) fits a sequence. A perfect fit, in which every basepair is a valid pairing (A:U, G:C, G:U, and equivalents), has a score of 0. With the default penalties, each non-canonical basepair increases the score by 1, and each basepair in which one or both bases are gaps increases the score by 2.
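
The scoring rule can be illustrated with a minimal base R sketch. This is not the R4RNA implementation: toy_mismatch_score is a hypothetical helper, and its treatment of "-" and "." as gaps and of T as U is an assumption made for illustration. The penalty defaults are copied from the Usage signature.

```r
# Hypothetical helper, not part of R4RNA: scores one aligned sequence
# against basepairs given as alignment columns i and j (1-based).
toy_mismatch_score <- function(seq, i, j,
                               one.gap.penalty = 2,
                               two.gap.penalty = 2,
                               invalid.penalty = 1) {
  bases <- strsplit(toupper(seq), "")[[1]]
  bases[bases == "T"] <- "U"              # assumption: treat T as U
  left  <- bases[i]
  right <- bases[j]

  # Assumption: "-" and "." both denote gaps
  gap_left  <- left  %in% c("-", ".")
  gap_right <- right %in% c("-", ".")

  # Valid pairings: A:U, G:C, G:U, and their reverses
  valid <- paste0(left, right) %in% c("AU", "UA", "GC", "CG", "GU", "UG")

  both_gaps <- gap_left & gap_right
  one_gap   <- xor(gap_left, gap_right)
  invalid   <- !gap_left & !gap_right & !valid

  sum(both_gaps) * two.gap.penalty +
    sum(one_gap) * one.gap.penalty +
    sum(invalid) * invalid.penalty
}

# Toy alignment with three basepairs: columns 1:9, 2:8 and 3:7
msa <- c(perfect = "GCAUUUUGC",   # all pairs valid
         mixed   = "GC-UUUUAC",   # one invalid pair, one single-sided gap
         gapped  = "G--UUU--C")   # two pairs with gaps on both sides
scores <- vapply(msa, toy_mismatch_score, numeric(1),
                 i = c(1, 2, 3), j = c(9, 8, 7))
scores
# perfect 0, mixed 3, gapped 4
```

Each sequence is scored independently, so the result has one entry per sequence; the actual function should be preferred for real alignments.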

Usage

    structureMismatchScore(msa, helix, one.gap.penalty = 2, two.gap.penalty = 2,
                 invalid.penalty = 1)

Arguments

msa

A multiple sequence alignment. Can be either a Biostrings XStringSet object or a named character vector, such as one obtained by converting an XStringSet with as.character.

helix

A helix data.frame describing the basepairs (the secondary structure) to score

one.gap.penalty

Penalty score for basepairs with one of the bases being a gap

two.gap.penalty

Penalty score for basepairs with both bases being gaps

invalid.penalty

Penalty score for non-canonical basepairs

Value

Returns an array of mismatch scores, one per sequence in msa.

Author(s)

Jeff Proctor, Daniel Lai

Examples

    data(helix)
    mismatch <- structureMismatchScore(fasta, known)
    
    # Sort by increasing mismatch
    sorted_fasta <- fasta[order(mismatch)]

[Package R4RNA version 1.18.0 Index]