This function converts a DNAStringSet into a codon matrix.
dnastring2codonmat(cds, shorten = FALSE, frame = 1, framelist = NULL)
cds
DNAStringSet [mandatory]
shorten
shorten all sequences to a multiple of three [default: FALSE]
frame
indicates the first base of the first codon [default: 1]
framelist
supply a vector of frames, one for each entry [default: NULL]
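The effect of frame and shorten can be illustrated with a minimal base R sketch. The helper split_codons below is hypothetical and is not part of the package; it only assumes that frame drops the bases before the given start position and that shorten trims trailing bases until the length is a multiple of three. The mapply call mimics a per-sequence frame vector in the spirit of framelist, again as an assumption about the intended behaviour.

```r
# Hypothetical helper (not the package implementation):
# split a single sequence string into codons.
split_codons <- function(x, frame = 1, shorten = FALSE) {
  stopifnot(frame %in% c(1, 2, 3))
  # drop the bases that precede the requested reading frame
  x <- substring(x, frame)
  n <- nchar(x)
  if (shorten) {
    # trim trailing bases so the length is a multiple of three
    n <- n - n %% 3
    x <- substr(x, 1, n)
  } else if (n %% 3 != 0) {
    stop("sequence length is not a multiple of three")
  }
  if (n == 0) return(character(0))
  starts <- seq(1, n, by = 3)
  substring(x, starts, starts + 2)
}

# one column per sequence, one row per codon
seqs <- c("-ATGCAACATTGC-", "-ATG---CATTGC-")
codonmat <- sapply(seqs, split_codons, frame = 2, shorten = TRUE,
                   USE.NAMES = FALSE)

# a different frame per sequence, analogous to a framelist vector
seqs2 <- c("ATGCAACATTGC", "-ATG---CATTGC")
codonmat2 <- mapply(split_codons, seqs2, frame = c(1, 2),
                    MoreArgs = list(shorten = TRUE), USE.NAMES = FALSE)
```

With shorten = FALSE the sketch stops on any sequence whose remaining length is not a multiple of three, which is the situation the shorten argument is meant to avoid.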
An object of class alignment, which is a list with the following components:
nb
the number of aligned sequences
nam
a vector of strings containing the names of the aligned sequences
seq
a vector of strings containing the aligned sequences
com
a vector of strings containing the commentaries for each sequence, or NA if there are no comments
## define two cds sequences
cds1 <- Biostrings::DNAString("ATGCAACATTGC")
cds2 <- Biostrings::DNAString("ATG---CATTGC")
cds1.cds2.aln <- c(Biostrings::DNAStringSet(cds1),
    Biostrings::DNAStringSet(cds2))
## convert into alignment
#dnastring2codonmat(cds1.cds2.aln)
cds1.cds2.aln |> dnastring2codonmat()
#>      [,1]  [,2]
#> [1,] "ATG" "ATG"
#> [2,] "CAA" "---"
#> [3,] "CAT" "CAT"
#> [4,] "TGC" "TGC"
## use frame 2 and shorten to avoid the "not a multiple of three" error
cds1 <- Biostrings::DNAString("-ATGCAACATTGC-")
cds2 <- Biostrings::DNAString("-ATG---CATTGC-")
cds1.cds2.aln <- c(Biostrings::DNAStringSet(cds1),
    Biostrings::DNAStringSet(cds2))
cds1.cds2.aln |> dnastring2codonmat(frame = 2, shorten = TRUE)
#>      [,1]  [,2]
#> [1,] "ATG" "ATG"
#> [2,] "CAA" "---"
#> [3,] "CAT" "CAT"
#> [4,] "TGC" "TGC"