Quantile normalization
quant.norm.Rd
Quantile normalize the training dataset and store its quantiles as the reference distribution, which is then used to frozen quantile normalize the test dataset.
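To illustrate what the training-side step computes, here is a minimal base R sketch of quantile normalization on a probes-by-samples matrix. The helper `qn_sketch` and the toy data are hypothetical and are not part of this package. The sketch assumes no missing values and breaks ties by order of appearance, so it may differ from what `quant.norm` does internally.

```r
# Hypothetical helper for illustration only; not the package's quant.norm().
# Assumes a numeric matrix with probes in rows, samples in columns, no NAs.
qn_sketch <- function(x) {
  x <- as.matrix(x)
  # Sort each sample, then average across samples at each rank position.
  # The result is the reference distribution (one value per rank).
  ref <- unname(rowMeans(apply(x, 2, sort)))
  # Replace every value by the reference value at its within-sample rank.
  # ties.method = "first" is a simplification chosen for this sketch.
  out <- apply(x, 2, function(col) ref[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(x)
  list(normalized = out, ref = ref)
}

set.seed(1)
toy <- matrix(rnorm(20, mean = 7), nrow = 5,
              dimnames = list(paste0("probe", 1:5), paste0("s", 1:4)))
res <- qn_sketch(toy)

# After normalization, every sample has the same sorted values:
# the reference distribution.
apply(res$normalized, 2, sort)
res$ref
```

Keeping `ref` alongside the normalized matrix is what makes a later "frozen" step possible: new samples can be mapped onto the same distribution without recomputing it.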
Arguments
- train
training dataset to be quantile normalized. The dataset must have rows as probes and columns as samples. This can be left unspecified if ref.dis is supplied to frozen normalize the test set.
- test
test dataset to be frozen quantile normalized. The dataset must have rows as probes and columns as samples. The number of rows must equal the number of rows in the training set. By default, the test set is not specified (test = NULL) and no frozen normalization is performed.
- ref.dis
reference distribution used to frozen quantile normalize the test set against a previously normalized training set. This is required when train is not supplied. By default, ref.dis = NULL.
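The role of ref.dis can be sketched in a few lines of base R: each test sample is ranked on its own, and the value at each rank is replaced by the reference value at that rank, so the reference distribution itself never changes. The helper `fqn_sketch` and the numbers below are made up for illustration. The tie handling and the absence of missing values are assumptions of this sketch, not documented behavior of `quant.norm`.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# test:    numeric matrix, probes in rows, samples in columns, no NAs.
# ref.dis: numeric vector with one reference value per row of `test`.
fqn_sketch <- function(test, ref.dis) {
  test <- as.matrix(test)
  # The reference must supply exactly one quantile per probe (row).
  stopifnot(length(ref.dis) == nrow(test))
  ref <- unname(sort(ref.dis))
  # Map each value to the reference value at its within-sample rank.
  # ties.method = "first" is a simplification chosen for this sketch.
  out <- apply(test, 2, function(col) ref[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(test)
  out
}

ref <- c(4.1, 4.3, 4.6, 5.0, 5.8)            # made-up reference quantiles
test <- cbind(t1 = c(9, 2, 7, 5, 3),         # made-up test samples
              t2 = c(1, 8, 6, 4, 10))
fqn <- fqn_sketch(test, ref)
fqn
```

Because the mapping uses only ranks within each test sample, test samples can be processed one at a time or in any batch without affecting each other or the training set.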
Value
A list of two datasets and one reference distribution:
- train.qn
the normalized training set, or NULL if no training set is supplied
- test.fqn
the frozen normalized test set, or NULL if no test set is specified
- ref.dis
the reference distribution
References
Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P. (2003). A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics 19(2), pp. 185-193. http://bmbolstad.com/misc/normalize/normalize.html
Examples
set.seed(101)
group.id <- substr(colnames(nuhdata.pl), 7, 7)
train.ind <- colnames(nuhdata.pl)[c(sample(which(group.id == "E"), size = 64),
sample(which(group.id == "V"), size = 64))]
train.dat <- nuhdata.pl[, train.ind]
test.dat <- nuhdata.pl[, !colnames(nuhdata.pl) %in% train.ind]
# normalize only training set
data.qn <- quant.norm(train = train.dat)
str(data.qn)
#> List of 3
#> $ train.qn: num [1:1810, 1:128] 7.26 7.15 6.94 7.22 7.16 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:1810] "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" ...
#> .. ..$ : chr [1:128] "GL5140E" "JB5556E" "JB4783E" "GL4527E" ...
#> $ test.fqn: NULL
#> $ ref.dis : num [1:1810] 4.15 4.24 4.29 4.32 4.35 ...
# normalize training set and frozen normalize test set
data.qn <- quant.norm(train = train.dat, test = test.dat)
str(data.qn)
#> List of 3
#> $ train.qn: num [1:1810, 1:128] 7.26 7.15 6.94 7.22 7.16 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:1810] "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" ...
#> .. ..$ : chr [1:128] "GL5140E" "JB5556E" "JB4783E" "GL4527E" ...
#> $ test.fqn: num [1:1810, 1:64] 6.31 6.7 7.01 5.83 5.98 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:1810] "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" ...
#> .. ..$ : chr [1:64] "JB4166E" "JB5669E" "JB4112E" "JB5847E" ...
#> $ ref.dis : num [1:1810] 4.15 4.24 4.29 4.32 4.35 ...
# frozen normalize test set with reference distribution
ref <- quant.norm(train = train.dat)$ref.dis
data.qn <- quant.norm(test = test.dat, ref.dis = ref)
str(data.qn)
#> List of 3
#> $ train.qn: NULL
#> $ test.fqn: num [1:1810, 1:64] 6.31 6.7 7.01 5.83 5.98 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:1810] "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" "A_25_P00011991" ...
#> .. ..$ : chr [1:64] "JB4166E" "JB5669E" "JB4112E" "JB5847E" ...
#> $ ref.dis : num [1:1810] 4.15 4.24 4.29 4.32 4.35 ...