This function calculates Ka/Ks (pN/pS) for all combinations given in an indices list of a DNAStringSet. If the sequences in the DNAStringSet are not a multiple-sequence alignment, pairwise codon alignments can be calculated on the fly. Models used and implemented according to Li (1993) (using seqinr) or Nei and Gojobori (1986) (own implementation) or models from KaKs_Calculator2 ported to MSA2dist with Rcpp.

indices2kaks(
  cds,
  indices,
  model = "Li",
  threads = 1,
  isMSA = TRUE,
  sgc = "1",
  verbose = FALSE,
  ...
)

Arguments

cds

DNAStringSet coding sequence alignment [mandatory]

indices

list list of indices to calculate Ks/Ks [mandatory]

model

specify codon model either "Li" or "NG86" or one of KaKs_Calculator2 model "NG", "LWL", "LPB", "MLWL", "MLPB", "GY", "YN", "MYN", "MS", "MA", "GNG", "GLWL", "GLPB", "GMLWL", "GMLPB", "GYN", "GMYN" [default: Li]

threads

number of parallel threads [default: 1]

isMSA

cds DNAStringSet represents MSA [default: TRUE]

sgc

standard genetic code (for KaKs Calculator models) [default: 1]

verbose

verbosity (for KaKs Calculator models) [default: FALSE]

...

other codon alignment parameters

Value

A data.frame of KaKs values

References

"MS/MA/GNG/GLWL/GLPB/GMLWL/GMLPB/GYN:" Wang et al. (2010) KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies.Genomics, proteomics & bioinformatics. 8(1), 77-80.

"Li/LWL:" Li et al. (1985) A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol., 2(2), 150-174.

"Li/LPB:" Li (1993). Unbiased estimation of the rates of synonymous and nonsynonymous substitution. Journal of molecular evolution, 36(1), pp.96-99.

"NG86/NG:" Nei and Gojobori. (1986) Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol., 3(5), 418-426.

"LPB:" Pamilo and Bianchi. (1993) Evolution of the Zfx and Zfy genes: Rates and interdependence between genes. Mol. Biol. Evol., 10, 271-281.

"MLWL/MLPB:" Tzeng et al. (2004). Comparison of three methods for estimating rates of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol., 21(12), 2290-2298.

"GY:" Goldman and Yang (1994). A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol., 11(5) 725-736.

"YN:" Yang et al. (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 155(1), 431-449.

"MYN:" Zhang et al. (2006). Computing Ka and Ks with a consideration of unequal transitional substitutions. BMC evolutionary biology, 6(1), 1-10.

"data(hiv):" Ganeshan et al. (1997) Human immunodeficiency virus type 1 genetic evolution in children with different rates of development of disease. J. Virology. 71(1), 663-677.

Wang et al. (2009). gamma-MYN: a new algorithm for estimating Ka and Ks with consideration of variable substitution rates. Biology Direct, 4(1), 1-18.

See also

Author

Kristian K Ullrich

Examples

## load example sequence data
data("hiv", package="MSA2dist")
## create indices
idx <- list(c(2, 3), c(5,7,9))
#indices2kaks(hiv, idx, model="Li")
hiv |> indices2kaks(idx, model="Li")
#>   Comp1 Comp2   seq1   seq2        ka         ks         vka          vks
#> 1     2     3 U68497 U68498 0.1229150 0.01780069 0.001145105 0.0005798517
#> 2     5     7 U68500 U68502 0.1598801 0.14384729 0.001607816 0.0064428544
#> 3     5     9 U68500 U68504 0.1280077 0.16775540 0.001386777 0.0080191008
#> 4     7     9 U68502 U68504 0.1080925 0.15227654 0.001025740 0.0058858088
#>          Ka         Ks     Ka/Ks
#> 1 0.1229150 0.01780069 6.9050687
#> 2 0.1598801 0.14384729 1.1114569
#> 3 0.1280077 0.16775540 0.7630618
#> 4 0.1080925 0.15227654 0.7098433
#indices2kaks(hiv, idx, model="NG86")
hiv |> indices2kaks(idx, model="NG86")
#>            Comp1 Comp2   seq1   seq2 Codons Compared Ambigiuous Indels Ns
#> result.1       2     3 U68497 U68498     91       91          0      0  0
#> result.1.1     5     7 U68500 U68502     91       91          0      0  0
#> result.2       5     9 U68500 U68504     91       91          0      0  0
#> result.2.1     7     9 U68502 U68504     91       91          0      0  0
#>                          Sd               Sn                S                N
#> result.1                1.5             22.5             57.5            215.5
#> result.1.1 6.33333333333333 29.6666666666667 56.6666666666667 216.333333333333
#> result.2                6.5             24.5 56.1666666666667 216.833333333333
#> result.2.1 7.83333333333333 21.1666666666667             57.5            215.5
#>                            ps                 pn             pn/ps
#> result.1   0.0260869565217391  0.104408352668213  4.00232018561485
#> result.1.1  0.111764705882353   0.13713405238829  1.22698888978996
#> result.2    0.115727002967359  0.112990007686395 0.976349553597824
#> result.2.1  0.136231884057971 0.0982211910286156  0.72098533840154
#>                           ds                dn             dn/ds
#> result.1   0.026551445288187 0.112429520622234  4.23440303915417
#> result.1.1 0.121024643713475  0.15144523277581  1.25135863348919
#> result.2   0.125695312562783 0.122465898420489  0.97430760084485
#> result.2.1 0.150342008285561  0.10527596415443 0.700243167927274
#>                           Ka                Ks             Ka/Ks
#> result.1   0.112429520622234 0.026551445288187  4.23440303915417
#> result.1.1  0.15144523277581 0.121024643713475  1.25135863348919
#> result.2   0.122465898420489 0.125695312562783  0.97430760084485
#> result.2.1  0.10527596415443 0.150342008285561 0.700243167927274
#indices2kaks(hiv, idx, model="NG86", threads=2)
hiv |> indices2kaks(idx, model="NG86", threads=2)
#>            Comp1 Comp2   seq1   seq2 Codons Compared Ambigiuous Indels Ns
#> result.1       2     3 U68497 U68498     91       91          0      0  0
#> result.1.1     5     7 U68500 U68502     91       91          0      0  0
#> result.2       5     9 U68500 U68504     91       91          0      0  0
#> result.2.1     7     9 U68502 U68504     91       91          0      0  0
#>                          Sd               Sn                S                N
#> result.1                1.5             22.5             57.5            215.5
#> result.1.1 6.33333333333333 29.6666666666667 56.6666666666667 216.333333333333
#> result.2                6.5             24.5 56.1666666666667 216.833333333333
#> result.2.1 7.83333333333333 21.1666666666667             57.5            215.5
#>                            ps                 pn             pn/ps
#> result.1   0.0260869565217391  0.104408352668213  4.00232018561485
#> result.1.1  0.111764705882353   0.13713405238829  1.22698888978996
#> result.2    0.115727002967359  0.112990007686395 0.976349553597824
#> result.2.1  0.136231884057971 0.0982211910286156  0.72098533840154
#>                           ds                dn             dn/ds
#> result.1   0.026551445288187 0.112429520622234  4.23440303915417
#> result.1.1 0.121024643713475  0.15144523277581  1.25135863348919
#> result.2   0.125695312562783 0.122465898420489  0.97430760084485
#> result.2.1 0.150342008285561  0.10527596415443 0.700243167927274
#>                           Ka                Ks             Ka/Ks
#> result.1   0.112429520622234 0.026551445288187  4.23440303915417
#> result.1.1  0.15144523277581 0.121024643713475  1.25135863348919
#> result.2   0.122465898420489 0.125695312562783  0.97430760084485
#> result.2.1  0.10527596415443 0.150342008285561 0.700243167927274

## define three unaligned cds sequences
cds1 <- Biostrings::DNAString("ATGCAACATTGC")
cds2 <- Biostrings::DNAString("ATGCATTGC")
cds3 <- Biostrings::DNAString("ATGCAATGC")
cds_sequences <- Biostrings::DNAStringSet(list(cds1, cds2, cds3))
names(cds_sequences) <- c("cds1", "cds2", "cds3")
## create indices
idx <- list(c(1, 2), c(1,3))
## set isMSA to FALSE to automatically create pairwise codon alignments
#indices2kaks(cds_sequences, idx, model="Li", isMSA=FALSE)
cds_sequences |> indices2kaks(idx, model="Li", isMSA=FALSE)
#>          Comp1 Comp2 seq1 seq2 ka       ks vka      vks Ka       Ks Ka/Ks
#> result.1     1     2 cds1 cds2  0 9.999999   0 9.999999  0 9.999999     0
#> result.2     1     3 cds1 cds3  0 9.999999   0 9.999999  0 9.999999     0