This function takes two coding sequences as a DNAStringSet and their corresponding translations as an AAStringSet, calculates a pairwise protein alignment (global by default), and converts this alignment back into a codon alignment.
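
The conversion step can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name aa_aln_to_codon_aln is hypothetical, and the gapped protein rows are written by hand (chosen to match the output shown in Examples) rather than computed by pairwiseAlignment. Each non-gap residue is replaced by its codon and each gap by "---".

```r
# Hypothetical helper (not part of MSA2dist): thread the codons of an
# ungapped coding sequence onto one gapped row of a protein alignment.
aa_aln_to_codon_aln <- function(cds, aa_aln) {
  # split the coding sequence into consecutive codons
  starts <- seq(1, nchar(cds) - 2, by = 3)
  codons <- substring(cds, starts, starts + 2)
  # split the aligned protein row into single characters
  residues <- strsplit(aa_aln, "")[[1]]
  is_gap <- residues == "-"
  # every non-gap residue must correspond to exactly one codon
  stopifnot(sum(!is_gap) == length(codons))
  # gaps become three dashes, residues become their codons
  out <- rep("---", length(residues))
  out[!is_gap] <- codons
  paste(out, collapse = "")
}

# assumed inputs: plain character vectors instead of XStringSet objects
cds <- c(cds1 = "ATGCAACATTGC", cds2 = "ATGCATTGC")
aa_aln <- c(cds1 = "MQHC", cds2 = "M-HC")  # hand-written protein alignment
codon_aln <- mapply(aa_aln_to_codon_aln, cds, aa_aln)
print(codon_aln)
```

Working on the protein level and mapping back keeps gaps in multiples of three, so the reading frame of both sequences is preserved in the codon alignment.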

cdsstring2codonaln(
  cds,
  aa,
  type = "global",
  substitutionMatrix = "BLOSUM62",
  gapOpening = 10,
  gapExtension = 0.5,
  remove.gaps = FALSE
)

Arguments

cds

two coding sequences as a DNAStringSet [mandatory]

aa

the two corresponding protein sequences as an AAStringSet [mandatory]

type

type of alignment (see pairwiseAlignment) [default: global]

substitutionMatrix

substitution matrix representing the fixed substitution scores for an alignment (see pairwiseAlignment) [default: BLOSUM62]

gapOpening

the cost for opening a gap in the alignment (see pairwiseAlignment) [default: 10]

gapExtension

the incremental cost incurred along the length of the gap in the alignment (see pairwiseAlignment) [default: 0.5]

remove.gaps

specify if gaps in the codon alignment should be removed [default: FALSE]

Value

codon alignment as DNAStringSet

References

Pagès, H et al. (2014) Biostrings: Efficient manipulation of biological strings. R package version, 2(0).

Author

Kristian K Ullrich

Examples

## define two cds sequences
cds <- Biostrings::DNAStringSet(c("ATGCAACATTGC", "ATGCATTGC"))
names(cds) <- c("cds1", "cds2")
## translate cds into protein sequences
aa <- MSA2dist::cds2aa(cds)
## align the proteins and convert the result into a codon alignment
cdsstring2codonaln(cds, aa)
#> DNAStringSet object of length 2:
#>     width seq                                               names               
#> [1]    12 ATGCAACATTGC                                      cds1
#> [2]    12 ATG---CATTGC                                      cds2
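
The call above uses the defaults. As a further sketch, assuming MSA2dist and Biostrings are installed, the same input can be passed with non-default arguments taken from the signature on this page. The choice of "BLOSUM80" and of the gap penalties is illustrative only, and the output is not shown here.

```r
library(MSA2dist)

## same input as in the example above
cds <- Biostrings::DNAStringSet(c("ATGCAACATTGC", "ATGCATTGC"))
names(cds) <- c("cds1", "cds2")
aa <- MSA2dist::cds2aa(cds)

## request that gaps be removed from the codon alignment
aln_nogaps <- cdsstring2codonaln(cds, aa, remove.gaps = TRUE)

## global alignment with a different matrix and gap penalties
## (values chosen for illustration, not as a recommendation)
aln_custom <- cdsstring2codonaln(
  cds,
  aa,
  substitutionMatrix = "BLOSUM80",
  gapOpening = 12,
  gapExtension = 1
)
```

Both results are DNAStringSet objects, as described under Value.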