Quantile normalization — quant.norm • PRECISION.array

Quantile normalize the training dataset and store its quantiles as the reference distribution, which is then used to frozen quantile normalize the test dataset.

Usage

quant.norm(train = NULL, test = NULL, ref.dis = NULL)

Arguments

train: training dataset to be quantile normalized. The dataset must have rows as probes and columns as samples. This can be left unspecified if ref.dis is supplied to frozen normalize the test set.
test: test dataset to be frozen quantile normalized. The dataset must have rows as probes and columns as samples. The number of rows must equal the number of rows in the training set. By default, the test set is not specified (test = NULL) and no frozen normalization is performed.
ref.dis: reference distribution used to frozen quantile normalize the test set against a previously normalized training set. This is required when train is not supplied. By default, ref.dis = NULL.

Value

a list of two datasets and one reference distribution:

train.qn: the quantile normalized training set (NULL if train is not supplied)
test.fqn: the frozen quantile normalized test set (NULL if test is not specified)
ref.dis: the reference distribution
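
How the three returned elements relate can be illustrated with a minimal base-R sketch. The helper qn_sketch and the toy matrices below are hypothetical and are not part of PRECISION.array; the sketch assumes complete data (no missing values) and breaks ties by order of appearance, so its results may differ from quant.norm wherever a package implementation handles ties or missing values differently.

```r
# Hypothetical helper (not part of PRECISION.array): a base-R sketch of
# quantile normalization with a frozen reference distribution.
qn_sketch <- function(train, test = NULL) {
  # Reference distribution: row means of the column-wise sorted training set
  ref <- rowMeans(apply(unname(train), 2, sort))

  # Replace each value with the reference quantile at its within-column rank
  map_to_ref <- function(x) {
    out <- apply(x, 2, function(col) ref[rank(col, ties.method = "first")])
    dimnames(out) <- dimnames(x)
    out
  }

  list(train.qn = map_to_ref(train),
       test.fqn = if (is.null(test)) NULL else map_to_ref(test),
       ref.dis  = ref)
}

# Toy data: 5 probes, 4 training samples, 2 test samples
set.seed(1)
toy.train <- matrix(rnorm(20), nrow = 5)
toy.test  <- matrix(rnorm(10, mean = 2), nrow = 5)
res <- qn_sketch(toy.train, toy.test)
```

In this sketch the test samples never contribute to ref.dis, which is the sense in which the normalization is "frozen": every column of train.qn and test.fqn ends up with the same set of values, taken from the training-derived reference.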

References

Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P. (2003). A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics 19(2), pp. 185-193. http://bmbolstad.com/misc/normalize/normalize.html

Examples

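# assumes quant.norm() and the nuhdata.pl example data come from PRECISION.array
library(PRECISION.array)

set.seed(101)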
group.id <- substr(colnames(nuhdata.pl), 7, 7)
train.ind <- colnames(nuhdata.pl)[c(sample(which(group.id == "E"), size = 64),
                               sample(which(group.id == "V"), size = 64))]
train.dat <- nuhdata.pl[, train.ind]
test.dat <- nuhdata.pl[, !colnames(nuhdata.pl) %in% train.ind]

# normalize only training set
data.qn <- quant.norm(train = train.dat)
str(data.qn)
#> List of 3
#>  $ train.qn: num [1:1810, 1:128] 7.26 7.15 6.94 7.22 7.16 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:1810] "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" ...
#>   .. ..$ : chr [1:128] "GL5140E" "JB5556E" "JB4783E" "GL4527E" ...
#>  $ test.fqn: NULL
#>  $ ref.dis : num [1:1810] 4.15 4.24 4.29 4.32 4.35 ...

# normalize training set and frozen normalize test set
data.qn <- quant.norm(train = train.dat, test = test.dat)
str(data.qn)
#> List of 3
#>  $ train.qn: num [1:1810, 1:128] 7.26 7.15 6.94 7.22 7.16 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:1810] "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" ...
#>   .. ..$ : chr [1:128] "GL5140E" "JB5556E" "JB4783E" "GL4527E" ...
#>  $ test.fqn: num [1:1810, 1:64] 6.31 6.7 7.01 5.83 5.98 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:1810] "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" ...
#>   .. ..$ : chr [1:64] "JB4166E" "JB5669E" "JB4112E" "JB5847E" ...
#>  $ ref.dis : num [1:1810] 4.15 4.24 4.29 4.32 4.35 ...

# frozen normalize test set with reference distribution
ref <- quant.norm(train = train.dat)$ref.dis
data.qn <- quant.norm(test = test.dat, ref.dis = ref)
str(data.qn)
#> List of 3
#>  $ train.qn: NULL
#>  $ test.fqn: num [1:1810, 1:64] 6.31 6.7 7.01 5.83 5.98 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:1810] "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" ...
#>   .. ..$ : chr [1:64] "JB4166E" "JB5669E" "JB4112E" "JB5847E" ...
#>  $ ref.dis : num [1:1810] 4.15 4.24 4.29 4.32 4.35 ...