Subsets a DNAStringSet or an AAStringSet by a mask, a region, or both, each supplied as an IRanges object.

string2region(seq, mask = NULL, region = NULL, add = TRUE)

Arguments

seq

DNAStringSet or AAStringSet [mandatory]

mask

IRanges object indicating masked sites [default: NULL]

region

IRanges object indicating the region to use for the distance calculation (by default, all sites are used) [default: NULL]

add

logical indicating whether mask and region should be added to the metadata [default: TRUE]
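
How mask and region combine can be illustrated with a small base R sketch. This is not the package's implementation: `used_sites()` is a hypothetical helper, ranges are passed as plain data frames rather than IRanges objects, and the alignment length of 91 sites is an assumption taken from the example output on this page. The sketch keeps the sites inside region (all sites when region is NULL), drops the sites covered by mask, and collapses the remainder into ranges.

```r
# Hypothetical helper (not part of MSA2dist): sites kept after applying
# an optional region and an optional mask, using base R only.
# `mask` and `region` are data frames with columns `start` and `end`.
used_sites <- function(n_sites, mask = NULL, region = NULL) {
  # Expand ranges into the individual positions they cover.
  expand <- function(r) unlist(Map(seq, r$start, r$end), use.names = FALSE)

  # Start from all sites, or only those inside the region if one is given.
  keep <- rep(TRUE, n_sites)
  if (!is.null(region)) {
    keep <- seq_len(n_sites) %in% expand(region)
  }

  # Drop masked sites (positions beyond the alignment are ignored).
  if (!is.null(mask)) {
    m <- expand(mask)
    keep[m[m <= n_sites]] <- FALSE
  }

  # Collapse consecutive kept sites back into start/end ranges.
  runs <- rle(keep)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L
  data.frame(start = starts[runs$values], end = ends[runs$values])
}

mask1 <- data.frame(start = c(11, 41, 71), end = c(20, 50, 80))
region1 <- data.frame(start = c(1, 75), end = c(45, 85))

# Assumed alignment length of 91 amino-acid sites.
both <- used_sites(91, mask = mask1, region = region1)
print(both)
```

With these inputs the kept ranges are 1-10, 21-40 and 81-85, which agrees with the regionUsed reported for the combined mask and region example in the Examples section.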

Value

A list object with the following components:

DNAStringSet or AAStringSet

regionUsed

Author

Kristian K Ullrich

Examples

## load example sequence data
data("hiv", package="MSA2dist")
## create mask
mask1 <- IRanges::IRanges(start=c(11,41,71), end=c(20,50,80))
## use mask
hiv.region <- hiv |> cds2aa() |> string2region(mask=mask1)
#(hiv.region |> slot("metadata"))$regionUsed
hiv.region |> regionused()
#> IRanges object with 4 ranges and 0 metadata columns:
#>           start       end     width
#>       <integer> <integer> <integer>
#>   [1]         1        10        10
#>   [2]        21        40        20
#>   [3]        51        70        20
#>   [4]        81        91        11
## use region
region1 <- IRanges::IRanges(start=c(1,75), end=c(45,85))
hiv.region <- hiv |> cds2aa() |> string2region(region=region1)
#(hiv.region |> slot("metadata"))$regionUsed
hiv.region |> regionused()
#> IRanges object with 2 ranges and 0 metadata columns:
#>           start       end     width
#>       <integer> <integer> <integer>
#>   [1]         1        45        45
#>   [2]        75        85        11
## use mask and region
hiv.region <- hiv |> cds2aa() |> string2region(mask=mask1, region=region1)
#(hiv.region |> slot("metadata"))$regionUsed
hiv.region |> regionused()
#> IRanges object with 3 ranges and 0 metadata columns:
#>           start       end     width
#>       <integer> <integer> <integer>
#>   [1]         1        10        10
#>   [2]        21        40        20
#>   [3]        81        85         5