This function converts a DNAStringSet into a codon matrix.

dnastring2codonmat(cds, shorten = FALSE, frame = 1, framelist = NULL)

Arguments

cds

DNAStringSet [mandatory]

shorten

shorten all sequences to a multiple of three [default: FALSE]

frame

indicates the first base of the first codon [default: 1]

framelist

vector of frames, one for each sequence in cds [default: NULL]
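
The interplay of frame, shorten and framelist can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper split_codons() is a hypothetical name introduced here, it works on plain character strings rather than a DNAStringSet, and the error message is an assumption.

```r
# Hypothetical helper (not part of the package): split one sequence into codons.
split_codons <- function(x, frame = 1, shorten = FALSE) {
  x <- substring(x, frame)            # start reading at base `frame`
  extra <- nchar(x) %% 3              # bases left over after the last full codon
  if (extra != 0) {
    if (!shorten) stop("sequence length is not a multiple of three")
    x <- substr(x, 1, nchar(x) - extra)   # drop the incomplete trailing codon
  }
  starts <- seq(1, by = 3, length.out = nchar(x) %/% 3)
  substring(x, starts, starts + 2)
}

# one sequence, default frame
single <- split_codons("ATGCAACATTGC")

# one frame per sequence, analogous to what framelist supplies
seqs <- c("-ATGCAACATTGC-", "--ATG---CATTGC")
per_seq <- mapply(split_codons, seqs, frame = c(2, 3),
    MoreArgs = list(shorten = TRUE), USE.NAMES = FALSE)
```

In this sketch per_seq is a character matrix with one column per input sequence, because every sequence yields the same number of codons after trimming.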

Value

A character matrix of codons, with one row per codon position and one column per sequence.

See also

Author

Kristian K Ullrich

Examples

## define two cds sequences
cds1 <- Biostrings::DNAString("ATGCAACATTGC")
cds2 <- Biostrings::DNAString("ATG---CATTGC")
cds1.cds2.aln <- c(Biostrings::DNAStringSet(cds1),
    Biostrings::DNAStringSet(cds2))
## convert into codon matrix
#dnastring2codonmat(cds1.cds2.aln)
cds1.cds2.aln |> dnastring2codonmat()
#>      [,1]  [,2] 
#> [1,] "ATG" "ATG"
#> [2,] "CAA" "---"
#> [3,] "CAT" "CAT"
#> [4,] "TGC" "TGC"
## use frame 2 and shorten to circumvent multiple of three error
cds1 <- Biostrings::DNAString("-ATGCAACATTGC-")
cds2 <- Biostrings::DNAString("-ATG---CATTGC-")
cds1.cds2.aln <- c(Biostrings::DNAStringSet(cds1),
    Biostrings::DNAStringSet(cds2))
cds1.cds2.aln |> dnastring2codonmat(frame=2, shorten=TRUE)
#>      [,1]  [,2] 
#> [1,] "ATG" "ATG"
#> [2,] "CAA" "---"
#> [3,] "CAT" "CAT"
#> [4,] "TGC" "TGC"
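
The printed result can be handled like any character matrix. The sketch below rebuilds the matrix shown above by hand, so that it runs without the package, and assumes the returned object has the same shape as the printed output (codon positions in rows, sequences in columns). The variable names are illustrative only.

```r
# Rebuild the printed result by hand (assumed to mirror the returned object)
codonmat <- cbind(c("ATG", "CAA", "CAT", "TGC"),
    c("ATG", "---", "CAT", "TGC"))

# codon positions (rows) where any sequence carries a gap codon
gap_rows <- which(apply(codonmat == "---", 1, any))

# codon positions where the two sequences differ
diff_rows <- which(codonmat[, 1] != codonmat[, 2])

# collapse each column back into a sequence string
seqs <- apply(codonmat, 2, paste, collapse = "")
```

Here the only gapped and the only differing codon position is the second one, and seqs holds the two original aligned sequences.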