You will need R 2.1.0 or R 2.1.1
(http://www.R-Project.org)
for this lab. This section lists all of the R packages
you need to have installed and also lists some additional R packages which are recommended.
Most of the R packages can be installed from the Bioconductor site automatically using:
source("http://www.bioconductor.org/getBioC.R")
getBioC()
However, Bioconductor releases only occur once every six months, whereas the authors of these labs
would typically use a more up-to-date version of some packages e.g. limma, which is
why it is advisable to check the R package version numbers below and install them from the links
provided if your installation is not up-to-date.
It is highly desirable to have these R packages in a directory in which you have write permission,
especially for the estrogen package. You can use
.libPaths("C:/Custom/R/library/directory") or .libPaths("C:\\Custom\\R\\library\\directory")
before you run install.packages() (or equivalent) to install the package(s) in a customized directory location.
| Package | Windows | MacOS X | Source |
|---|---|---|---|
| affy_1.6.7 | affy_1.6.7.zip | affy_1.6.7.tgz | affy_1.6.7.tar.gz |
| Biobase_1.5.12 | Biobase_1.5.12.zip | Biobase_1.5.12.tgz | Biobase_1.5.12.tar.gz |
| estrogen_1.5.0 | estrogen_1.5.0.zip | estrogen_1.5.0.tar.gz | estrogen_1.5.0.tar.gz |
| hgu95av2_1.8.4 | hgu95av2_1.8.4.zip | hgu95av2_1.8.4.tar.gz | hgu95av2_1.8.4.tar.gz |
| hgu95av2cdf_1.5.1 | hgu95av2cdf_1.5.1.zip | hgu95av2cdf_1.5.1.tar.gz | hgu95av2cdf_1.5.1.tar.gz |
| limma_2.0.2 | limma_2.0.2.zip | limma_2.0.2.tar.gz | limma_2.0.2.tar.gz |
| statmod_1.1.1 or statmod_1.2.0 | statmod_1.2.0.zip | statmod_1.2.0.tar.gz | statmod_1.2.0.tar.gz |
| xtable_1.2-5 | xtable_1.2-5.zip | xtable_1.2-5.tgz | xtable_1.2-5.tar.gz |
| Package | Windows | MacOS X | Source |
|---|---|---|---|
| affylmGUI_1.3.1 | affylmGUI_1.3.1.zip | affylmGUI_1.3.1.tar.gz | affylmGUI_1.3.1.tar.gz |
| limmaGUI_1.3.9 | limmaGUI_1.3.9.zip | limmaGUI_1.3.9.tar.gz | limmaGUI_1.3.9.tar.gz |
| R2HTML_1.54 | R2HTML_1.54.zip | R2HTML_1.54.tar.gz | R2HTML_1.54.tar.gz |
| sma_0.5.14 | sma_0.5.14.zip | sma_0.5.14.tgz | sma_0.5.14.tar.gz |
| tkrplot_0.0-12 | tkrplot_0.0-12.zip | tkrplot_0.0-12.tar.gz | tkrplot_0.0-12.tar.gz |
The Estrogen data set is required, but this is available as an R package,
so it is listed in Section 1.1. An additional data set,
ApoAI is required for this lab. It can be obtained from:
http://bioinf.wehi.edu.au/marray/jsm2005/apoai.zip.
This lab introduces linear modelling as a tool for identifying differentially express genes
in the conxtext both a cDNA microarray experiment and an Affymetrix GeneChip microarray experiment.
The main R library we will use for this purpose is called limma an abbreviation for
"Linear Models for Microarray Analysis". Labs 1 and 2 introduced pre-processing for two color arrays.
Lab 3 discussed pre-processing for Affymetrix GeneChips. In a typical microarray analysis project
these procedures would be applied before beginning the search for differentially expressed genes.
For this lab we will do a the minimal amount of pre-processing for brevity.
In this section we consider a case study where two RNA sources are compared through a common reference RNA. The analysis of the log-ratios involves a two-sample comparison of means for each gene.
In this example we assume that the data is available as an RG list in the
data file ApoAI.RData.
Background. The data is from a study of lipid metabolism by Callow et al (2000) ([1]). The apolipoprotein AI (ApoAI) gene is known to play a pivotal role in high density lipoprotein (HDL) metabolism. Mice which have the ApoAI gene knocked out have very low HDL cholesterol levels. The purpose of this experiment is to determine how ApoAI deficiency affects the action of other genes in the liver, with the idea that this will help determine the molecular pathways through which ApoAI operates.
Hybridizations. The experiment compared 8 ApoAI knockout mice with 8 wild type (normal) C57BL/6 ("black six") mice, the control mice. For each of these 16 mice, target mRNA was obtained from liver tissue and labelled using a Cy5 dye. The RNA from each mouse was hybridized to a separate microarray. Common reference RNA was labelled with Cy3 dye and used for all the arrays. The reference RNA was obtained by pooling RNA extracted from the 8 control mice.
| Number of arrays | Red (Cy5) | Green (Cy3) |
|---|---|---|
| 8 | Wild Type "black six" mice (WT) | Pooled Reference (Ref) |
| 8 | ApoAI Knockout (KO) | Pooled Reference (Ref) |
The experimental design could also be described in a diagram, such as:

This experiment is slightly old. Today it would be quite unusual to use this many replicate arrays without including any dye-swaps. In two color arrays, dye bias is still a significant problem. This is the main reason why normalization is always necessary.
This is an example of a single comparison experiment using a common reference. The fact that the comparison is made by way of a common reference rather than directly as for the swirl experiment makes this, for each gene, a two-sample rather than a single-sample setup.
library(limma)
load("ApoAI.RData") # from: http://bioinf.wehi.edu.au/marray/jsm2005/apoai.zip
objects()
names(RG)
MA <- normalizeWithinArrays(RG) # uses print-tip loess by default
In order to construct a design matrix, let us remind ourselves of the linear model which we are fitting for each gene:

where
is the vector of
normalized log ratios from the sixteen arrays,
is the Expected Value
of
,
is the
design matrix and
is the vector of log ratios to estimate, corresponding to the
"M" (fold change) column in the final
list of differentially expressed genes given by topTable).
The estimated log ratios are also known as "coefficients",
"parameters" and "log fold changes".
This experiment has three types of RNA: Reference (Ref), Wild Type (WT),
and Knockout (KO), so it is sufficient to estimate two log ratios in the
linear model for each gene, i.e. we will estimate two parameters, so our
design matrix should have two columns. In our case, the two parameters in the
vector are the log ratios
which compare gene expression levels in WT vs Ref and KO vs WT.
(There are other possible parameterizations which could have been chosen
instead. We are using one which allows us to estimate the contrast of interest
(KO vs WT) directly from the linear model fit, rather than estimating it
later as a contrast (i.e. a linear combination of parameters estimated from the
linear model).
The design matrix we will use is:

where the first column is for the "WT vs Ref" parameter and the second column is for the "KO vs WT" parameter. The first 8 arrays hybridize WT RNA with Ref RNA so it makes sense that they each have a '1' in the WT vs Ref column. The last 8 arrays hybridize KO RNA with Ref RNA which corresponds to the sum of the two parameters, "WT vs Ref" and "KO vs WT" which is clear if you replace "vs" with a minus sign (remembering that everything has been log2 transformed so that subtraction here actually represents a log ratio).
This design matrix can be defined in R as follows:
design <- matrix(c(rep(1,16),rep(0,8),rep(1,8)),ncol=2)
colnames(design) <- c("WT-Ref","KO-WT")
design
fit <- lmFit(MA,design=design) names(fit)
fit <- eBayes(fit) names(fit)
We now use the function topTable to obtain a list the genes with the most
evidence of differential expression between the Knockout and Wild-Type RNA samples. The
knockout gene (ApoAI) should theoretically have a log fold change of minus infinity, but
microarrays cannot measure extremely large fold changes. While the M value of the ApoAI
gene in the topTable may not have much biological meaning, the high ranking shows that this
gene is consistently down-regulated across the replicate arrays.
topTable(fit,coef="KO-WT") numGenes <- nrow(RG$genes) completeTableKOvsWT <- topTable(fit,coef="KO-WT",number=numGenes) write.table(completeTableKOvsWT,file="KOvsWTgenes.xls",sep="\t",quote=FALSE,col.names=NA)
The ".xls" extension is used so that the (tab-delimited text) file can
be opened in Excel by double-clicking its icon.
The arguments of topTable can be studied in more detail with ?topTable or
args(topTable) or just topTable. Currently in limma,
topTable is a high-level wrapper function which calls a lower level function
"toptable". The default method for ranking genes is the B statistic (log odds
of differential expression, Lonnstedt and Speed 2003 [2]),
but the moderated t statistic and p-value can also be used. Using the average fold-change
(the M column) is not recommended because this ignores the variability between replicate
arrays.
Using an M A plot, we can see which genes are selected as being differentially expressed by the B statistic (log odds of differential expression), which is the default ranking statistic for the topTable. Of course, the differentially expressed genes selected by the B statistic may not have the most extreme fold changes (M values), because some of the genes with extreme average fold changes may vary significantly between replicate arrays so they will be down-weighted by the empricial Bayes statistics.
M <- fit$coefficients[,"KO-WT"] A <- fit$Amean ord <- order(fit$lods[,"KO-WT"],decreasing=TRUE) top10 <- ord[1:10] plot(A,M,pch=16,cex=0.1) text(A[top10],M[top10],labels=substring(fit$genes[top10,"NAME"],1,5),cex=0.8,col="blue")
For this section of the lab, we will use data from a set of eight Affymetrix chips from a
2x2 factorial-design experiment designed to measure changes in gene expression in
a breast-cancer cell line due to the presence (or absence) of estrogen and due to
a time effect (10 hours or 48 hours). The experiment was performed by Scholtens
et. al [6] who have kindly provided their data for public access on the Bioconductor
website:
http://www.bioconductor.org/data/experimental.html.
The Estrogen data is stored in the "data" subdirectory of an R package called "estrogen" which is listed on this website.
It can be installed like any other R package, then the data can be found from within R in the directory:
system.file("data",package="estrogen").
This data set is described in detail in the "Estrogen 2x2 Factorial Design" vignette by Denise Scholtens and Robert Gentleman [7].
The following description of the experiment is taken from that vignette.
Experimental Data"The investigators in this experiment were interested in the effect of estrogen on the genes in ER+ breast cancer cells over time. After serum starvation of all eight samples, they exposed four samples to estrogen, and then measured mRNA transcript abundance after 10 hours for two samples and 48 hours for the other two. They left the remaining four samples untreated, and measured mRNA transcript abundance at 10 hours for two samples, and 48 hours for the other two. Since there are two factors in this experiment (estrogen and time), each at two levels (present or absent,10 hours or 48 hours), this experiment is said to have a 2x2 factorial design."
The table below describes the experimental conditions for each of the eight arrays. When stored as tab-delimited text, this is known as the RNA Targets file.
| Name | FileName | Target |
|---|---|---|
| Abs10.1 | low10-1.cel | EstAbsent10 |
| Abs10.2 | low10-2.cel | EstAbsent10 |
| Pres10.1 | high10-1.cel | EstPresent10 |
| Pres10.2 | high10-2.cel | EstPresent10 |
| Abs48.1 | low48-1.cel | EstAbsent48 |
| Abs48.2 | low48-2.cel | EstAbsent48 |
| Pres48.1 | high48-1.cel | EstPresent48 |
| Pres48.2 | high48-2.cel | EstPresent48 |
The file (shown as a table above) is known as the "RNA Targets" file in
limma or in affylmGUI. It should be stored in
tab-delimited text format and it can be created either in a spreadsheet
program (such as Excel) or in a text editor. The column headings must appear
exactly as above. Each chip should be given a unique name or number (in the
Name column). The Affymetrix CEL file name should be listed for each
chip in in the FileName column. The Target column tells
limma or affylmGUI which chips are replicates.
By using the same "Target" name for the first two rows in the table, we are
telling limma or affylmGUI that these two CEL files
represent replicatechips for the experimental condition
(Estrogen Absent, Time 10 hours). Note that we could use a different Targets file
for this analysis in which the time effect was ignored if we only wanted to compare
Estrogen Absent and Estrogen Present. This would simply require removing the
10's and 48's from the Target column.
There is actually a file called "phenoData.txt" supplied within the
estrogen package, which could be used as a Targets file, but in this
exercise you should use the format above or download "EstrogenTargets.txt"
from
http://bioinf.wehi.edu.au/marray/jsm2005/lab4/EstrogenTargets.txt.
This way is simpler because we describe the type of RNA target in just one column (Target), rather
than in two columns (estrogen and time.h).
exprSet objectLab 3 discussed how to pre-process and produce gene expression measures for Affymetrix GeneChip datasets. The first stage for
the analysis in this lab is to produce RMA expression values for the data stored in the estrogen package. This package contains raw data in CEL file format. Before starting you should check that the file EstrogenTargets.txt is in your current directory. You can check this using dir().
To load the libraries we will need to use:
library(limma) library(affy) library(hgu95av2cdf)
The next stage is to read this data and compute RMA expression measures. Accomplish this using:
datadir <- system.file("extdata",package="estrogen")
dir(datadir)
# Please ensure that either "EstrogenTargets.txt" from
# http://bioinf.wehi.edu.au/marray/jsm2005/Data/EstrogenTargets.txt
# is in the current working directory as given by getwd(), or that you
# provide the read.phenoData function with a full path to
# "EstrogenTargets.txt", e.g. "C:/JSM/EstrogenTargets.txt"
# or "C:\\JSM\\EstrogenTargets.txt".
pd <- read.phenoData("EstrogenTargets.txt",header=TRUE,row.names=1,as.is=TRUE)
abatch <- ReadAffy(filenames=pData(pd)$FileName, celfile.path=datadir)
abatch@phenoData <- pd
eset <- rma(abatch)
As discussed in Lab 3 it is usual, and appropriate, to check dataset quality before continuing your analysis. However, due to brevity we will skip over this in this lab. A full set of quality assessment plots can be found at http://www.stat.berkeley.edu/~bolstad/PLMImageGallery/ under the title "estrogen". These plots show no significant quality problems with any arrays in this dataset.
Defining a design matrix is quite simple for Affymetrix data because each microarray chip has only a single channel, as distinct from the two-color cDNA microarrays slides. Recall that we used an RNA Targets file (stored in tab-delimited text) to describe the Estrogen experiment:
| Name | FileName | Target |
|---|---|---|
| Abs10.1 | low10-1.cel | EstAbsent10 |
| Abs10.2 | low10-2.cel | EstAbsent10 |
| Pres10.1 | high10-1.cel | EstPresent10 |
| Pres10.2 | high10-2.cel | EstPresent10 |
| Abs48.1 | low48-1.cel | EstAbsent48 |
| Abs48.2 | low48-2.cel | EstAbsent48 |
| Pres48.1 | high48-1.cel | EstPresent48 |
| Pres48.2 | high48-2.cel | EstPresent48 |
When we created an object of class exprSet with rma, the
phenotype data (i.e. the RNA Targets table) was automatically included in our
eset object (of class exprSet), and it can be recovered using either
the pData method of the exprSet class.
targets <- pData(eset)
We have four pairs of replicate arrays (each of which is a single-channel array), so we should estimate four parameters in the linear model. The design matrix should have four columns and the rows should describe the four pairs of replicate arrays.

where the four columns of the matrix correspond to EstAbsent10,
EstPresent10, EstAbsent48 and EstPresent48.
This matrix can be defined in R as follows:
design <- model.matrix(~ -1 + factor(targets$Target,levels=unique(targets$Target))) colnames(design) <- unique(targets$Target) numParameters <- ncol(design) parameterNames <- colnames(design) design
Now that we have defined our design matrix, fitting a linear model is as simple as:
fit <- lmFit(eset,design=design)
fit is an object of class MArrayLM. Its components can be
accessed with the $ operator just a standard list in R.
names(fit)
Contrasts are linear combinations of parameters from the linear model fit.

where
is a vector of contrasts for gene
,
is the contrasts matrix, and
is a vector of
coefficients (estimated log fold changes), obtained from a
linear model fit.
We will estimate three contrasts (so our contrasts matrix will have three columns). The first contrast is an estrogen effect at time 10 hours, the second as an estrogen effect at time 48 hours and the third is a time effect in the absence of estrogen.

contrastNames <- c(paste(parameterNames[2],parameterNames[1],sep="-"),
paste(parameterNames[4],parameterNames[3],sep="-"),
paste(parameterNames[3],parameterNames[1],sep="-"))
contrastsMatrix <- matrix(c(-1,1,0,0,0,0,-1,1,-1,0,1,0),nrow=ncol(design))
rownames(contrastsMatrix) <- parameterNames
colnames(contrastsMatrix) <- contrastNames
contrastsMatrix
fit2 <- contrasts.fit(fit,contrasts=contrastsMatrix) names(fit2)
fit2 <- eBayes(fit2) names(fit2)
We now use the function topTable to obtain a list genes differentially
expressed between Estrogen-Present and Estrogen-Absent at time 10 hours, followed by a list of
genes differentially expressed between Estrogen-Present and Estrogen-Absent at time 48 hours.
topTable(fit2,coef="EstPresent10-EstAbsent10") topTable(fit2,coef="EstPresent48-EstAbsent48") numGenes <- nrow(eset@exprs) completeTableEst10 <- topTable(fit2,coef="EstPresent10-EstAbsent10",number=numGenes) write.table(completeTableEst10,file="Est10genes.xls",sep="\t",quote=FALSE,col.names=NA) completeTableEst48 <- topTable(fit2,coef="EstPresent48-EstAbsent48",number=numGenes) write.table(completeTableEst48,file="Est48genes.xls",sep="\t",quote=FALSE,col.names=NA)
If the genes int the topTable have standard IDs (e.g. UniGene or GenBank), then they can be
linked with external annotation information on the Internet.
Load the annotation package hgu95av2, which can be obtained from
http://www.bioconductor.org/data/metaData.html.
library(hgu95av2cdf) library(hgu95av2)
Now we obtain:
AffyBatch object, ab),hgu95av2SYMBOL environment in the hgu95av2 annotation package),
hgu95av2GENENAME environment in the hgu95av2 annotation package), and
hgu95av2UNIGENE environment in the hgu95av2 annotation package).
geneIDs <- ls(hgu95av2cdf)
It is possible to extract the annotation information from the appropriate
R environments within the annotation package (hgu95av2), and
store it in simple R data structures as follows:
# geneSymbols <- unlist(as.list(hgu95av2SYMBOL)) # geneNames <- unlist(as.list(hgu95av2GENENAME)) # unigene <- unlist(as.list(hgu95av2UNIGENE))
However we must be very careful, because in recent versions of Bioconductor annotation packages, some gene IDs can map to multiple gene names, multiple symbols and/or multiple unigene clusters. The method above, while appearing simple, also does not allow for the possibility that the gene names are stored in a different order in the annotation package from the gene IDs. We will therefore use some rather complicated-looking R code to extract the gene symbols, names and unigene IDs from the hgu95av2 environment. Multiple entries will be collapsed and separated by semicolons, e.g. if geneID 001 corresponds to gene names "Name1" and "Name2", these will be collapsed into "Name1; Name2".
geneSymbols <- as.character(unlist(lapply(mget(geneIDs,env=hgu95av2SYMBOL),
function (symbol) { return(paste(symbol,collapse="; ")) } )))
geneNames <- as.character(unlist(lapply(mget(geneIDs,env=hgu95av2GENENAME),
function (name) { return(paste(name,collapse="; ")) } )))
unigene <- as.character(unlist(lapply(mget(geneIDs,env=hgu95av2UNIGENE),
function (unigeneID) { return(paste(unigeneID,collapse="; ")) } )))
The key functions to note in the code above are mget which
extracts multiple annotation strings from the appropriate annotation environmentand lapply which applies our multiple-entry-collapsing function to
every element in a list of annotation strings.
Now we abbreviate the gene names to a maximum length of 40 characters, for neat formatting in a table, and we extract the unigene ID from the string which contains "Hs" (for Homo sapiens) as well as the unigene ID.
geneNames <- substring(geneNames,1,40)
unigene <- gsub("Hs\\.","",unigene)
genelist <- data.frame(GeneID=geneIDs,GeneSymbol=geneSymbols,GeneName=geneNames,
UniGeneHsID=paste("<a href=http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?ORG=Hs&CID=",
unigene,">",unigene,"</a>",sep=""))
Now we recreate the toptable for the two contrasts considered earlier, "EstPresent10-EstAbsent10" and "EstPresent48-EstAbsent48" this time providing a hyperlink to the UniGene website for each gene in the toptable.
unigeneTopTableEst10 <- topTable(coef="EstPresent10-EstAbsent10",n=20,fit=fit2,genelist=genelist) unigeneTopTableEst48 <- topTable(coef="EstPresent48-EstAbsent48",n=20,fit=fit2,genelist=genelist)
library(xtable)
xtableUnigeneEst10 <- xtable(unigeneTopTableEst10,display=c("d","s","s","s","s","g","g","g","e","g"))
xtableUnigeneEst48 <- xtable(unigeneTopTableEst48,display=c("d","s","s","s","s","g","g","g","e","g"))
cat(file="estrogenUniGeneE10.html","<html>\n<body>")
print.xtable(xtableUnigeneEst10,type="html",file="estrogenUniGeneE10.html",append=TRUE)
cat(file="estrogenUniGeneE10.html","</body>\n</html>",append=TRUE)
cat(file="estrogenUniGeneE48.html","<html>\n<body>")
print.xtable(xtableUnigeneEst48,type="html",file="estrogenUniGeneE48.html",append=TRUE)
cat(file="estrogenUniGeneE48.html","</body>\n</html>",append=TRUE)
The display argument to the xtable function is used to specify the format of text or numbers
displayed in cells of the HTML table:
| Format code | Meaning |
|---|---|
| d | Decimal (base ten) integer, e.g. 48 |
| s | Character string, e.g. "Block" |
| g | General real floating-point number, e.g. 8.25 |
| e | Floating point number in exponent format, e.g. 1.02E-05 |
Now we plot an M vs A plot for each contrast. Recall that M is a log fold change in base 2
and A is a log intensity in base 2 (in this case for each gene, we will use the average of all
of the A values). The geneSymbols vector should be available from
Section 3.8. If you could not complete Section 3.8 because you don't
have the hgu95av2 annotation package installed, then you can just use the geneIDs vector
also described in Section 3.8, which does not require the annotation package.
M.Est10 <- fit2$coefficients[,"EstPresent10-EstAbsent10"] A <- fit2$Amean ord.Est10 <- order(fit2$lods[,"EstPresent10-EstAbsent10"],decreasing=TRUE) top30.Est10 <- ord.Est10[1:30] plot(A,M.Est10,pch=16,cex=0.1) text(A[top30.Est10],M.Est10[top30.Est10],labels=geneSymbols[top30.Est10],cex=0.8,col="blue") M.Est48 <- fit2$coefficients[,"EstPresent48-EstAbsent48"] ord.Est48 <- order(fit2$lods[,"EstPresent48-EstAbsent48"],decreasing=TRUE) top30.Est48 <- ord.Est48[1:30] plot(A,M.Est48,pch=16,cex=0.1) text(A[top30.Est48],M.Est48[top30.Est48],labels=geneSymbols[top30.Est48],cex=0.8,col="blue") M.EstAbs48vs10 <- fit2$coefficients[,"EstAbsent48-EstAbsent10"] ord.EstAbs48vs10 <- order(fit2$lods[,"EstAbsent48-EstAbsent10"],decreasing=TRUE) top30.EstAbs48vs10 <- ord.EstAbs48vs10[1:30] plot(A,M.EstAbs48vs10,pch=16,cex=0.1) text(A[top30.EstAbs48vs10],M.EstAbs48vs10[top30.EstAbs48vs10],labels=geneSymbols[top30.EstAbs48vs10],cex=0.8,col="blue")
We can now examine which genes respond to Estrogen at either time (10 hours or 48 hours) by examining moderated F-statistics on 2 degrees of freedom and p-values calculated from these F-statistics:
F.stat <- fit2$F p.value <- fit2$F.p.value
What p-value cutoff should be used? One guide is to examine control probe-clusters which are known not to be differentially expressed.
i <- grep("AFFX",geneNames(eset))
summary(p.value[i])
We will choose a p-value of 1.0E-5 which is well below the average p.value for the control
probe-clusters. Consider those genes with moderated F-statistics with p-values below
1.0E-5, and classify these according to whether they are significantly up or
down regulated at the early or late times:
results <- classifyTestsF(fit2, p.value=1.0E-5)
vennDiagram(results,names=c("Est10PresAbs","Est48PresAbs","EstAbs48vs10"))
Use the table command (below) to determine how many genes were up-regulated at 10 hours,
and how many of those genes were still up-regulated at 48 hours. The same question can be answered for
down-regulated genes, using the same table.
table(E10=results[,1],E48=results[,2])
Thanks to Scholtens et al. [3] for providing the estrogen data on the Bioconductor site (http://www.bioconductor.org/data/experimental.html). Thanks to Yee Hwa Yang and Sandrine Dudoit for the ApoAI data.
Thanks also to Gordon Smyth for the limma usersguide which
this lab was based on.
This lab is heavily based on a Lab produced by James Wettenhall for IBC 2004.
| Knockout RNA | RNA extracted from a biological specimen which has had one gene artificially knocked out (removed) from it in a laboratory. |
| Wild Type RNA | RNA extracted from a biological specimen whose genes are in their natural form (as found in the wild). |