I started this list in 2001. Since then, most R packages for genomic analysis have come together under the roof of the Bioconductor project. For that reason, and because of time constraints on my side, I will not update this list any more (last change: July 2005). For a well-organized list of the packages available in Bioconductor, see the BioC task list (version 1.8). The CRAN Multivariate Statistics task view should also be helpful.
Overview:
This list contains R packages, and software based on the R system, for analyzing gene expression data from DNA array experiments, both for oligonucleotide chips and cDNA microarrays. Please drop me a line (strimmer@uni-leipzig.de) to report inaccuracies or to suggest other packages that should be listed here.
General-purpose Packages:
Package | Description | Author | Contact | Version | License |
BioConductor (Stable and development packages) | Collaborative open-source project to develop a modular general framework for the analysis of cDNA arrays and gene chips. Includes and unifies some of the packages below. | R. Gentleman (and many others) | rgentlem@hsph.harvard.edu | 1.6 | GNU GPL |
FunDaMiner | All-purpose package for gene expression analysis. Collection of various methods, from preprocessing through differential expression and multiple testing to clustering and classification. | Michael T. Mader | mtmader@gmx.de | 1.0.1 | GNU GPL |
OOMAL | Object-Oriented Microarray Analysis Library implemented in S-Plus. | ? | ? | 4.0 | No commercial use |
DNAMR | Companion R package to "Exploration and Analysis of DNA Microarray and Protein Array Data" book. | J. Cabrera | cabrera@rci.rutgers.edu | 0.1 | GPL |
SMIDA | Companion R package to "Statistics for Microarrays" book. | J. McClure and E. Wit | johndm@stats.gla.ac.uk | 0.1 | GPL |
Specialized Packages: Differential Expression (Error Models)
Package | Description | Author | Contact | Version | License |
LPE | Local pooled error (LPE) test for gene expression data with a small number of replicates (2-3). | N. Jain | nitin.jain@pfizer.com | 1.1.5 | GNU GPL |
HEM | Heterogeneous Error Model for Analysis of Microarray Data. | HJ. Cho | hcho@virginia.edu | 1.0.2 | GNU GPL |
SMA | Data import from GenePIX and SPOT image programs, data management utilities, normalization and differential expression, diagonal discriminant analysis, various plot functions etc. Includes some mouse data. | S. Dudoit (and others) | sandrine@stat.berkeley.edu | 0.5.13 | GNU GPL |
vsn | Variance stabilization and calibration for microarray data. | W. Huber | w.huber@dkfz.de | 1.5 | GNU GPL |
YASMA | ANOVA analysis, filtering and interpolation functions. Includes some tuberculosis data. | L. Wernisch (and others) | l.wernisch@cryst.bbk.ac.uk | 0.20 | GNU GPL |
maanova | Analysis of two-dye Micro Array experiment (using ANOVA, permutation and bootstrap, cluster and consensus tree). | Hao Wu | hao@jax.org | 0.91-3 | GPL 2 |
LIMMA | Linear models for microarray data. | G. Smyth | smyth@wehi.edu.au | 1.8.9 | GNU GPL |
DEDS | Various statistics of differential expression for microarray data, including t statistics, fold change, F statistics, SAM, moderated t and F statistics and B statistics. | Y. Xiao and J. Yang | jean@biostat.ucsf.edu | 1.03 | GNU GPL |
Specialized Packages: Differential Expression (Empirical Bayes and FDR)
Package | Description | Author | Contact | Version | License |
siggenes | Significance Analysis of Microarrays (SAM) and Empirical Bayes Analyses of Microarrays (EBAM). | H. Schwender | holger.schw@gmx.de | 1.2.11 | Free for non-commercial use |
EBarrays | Empirical Bayes methods for microarray analysis. | C. Kendziorski and M. Newton | kendzior@biostat.wisc.edu | 1.0-19 | GNU GPL |
qvalue | Several approaches to estimate the false discovery rate (FDR). | J. Storey (and others) | jstorey@u.washington.edu | 1.1 | GNU GPL |
fdrtool | Estimation and Control of (Local) False Discovery Rates. | K. Strimmer | strimmer@uni-leipzig.de | 1.1.0 | GNU GPL |
BUM/SPLOSH | Improved estimates of local FDR to detect differentially expressed genes. | S. Pounds | stanley.pounds@stjude.org | ? | ? |
High Probability | Decision-theoretic improvement of FDR (=dFDR) to detect differentially expressed genes. | D. Bickel | bickel@prueba.info | 1.0-2 | GPL |
locfdr | Computation of local false discovery rates. | B. Efron and B. Narasimhan | brad@stat.stanford.edu | 1.0-3 | GPL |
twilight | Improved estimate of local FDR to detect differentially expressed genes. | S. Scheid | stefanie.scheid@molgen.mpg.de | 1.0.1 | GNU GPL |
LBE | Improved estimate of local FDR to detect differentially expressed genes. | C. Dalmasso (and others) | dalmasso@vjf.inserm.fr | ? | ? |
LocalFDR/Varmixt | Improved estimate of local FDR from p-values to detect differentially expressed genes. | J. Aubert (and others) | aubert@inapg.fr | ? | ? |
localFDR | Estimation of local FDR from p-values using stochastic order models. | J.G. Liao | liaojg@umdnj.edu | ? | ? |
FDVAR | Variance estimation of FDR. | A.B. Owen | owen@stat.stanford.edu | ? | ? |
OCplus | Computes theoretical and empirical FDR, sensitivity, false positive rate and sample size requirements when selecting differentially expressed genes in simple microarray experiments. | Y. Pawitan and A. Ploner | alexander.ploner@meb.ki.se | 1.2.0 | GNU GPL |
SAGx | Statistical Analysis of the GeneChip. Includes methods for identifying differentially expressed genes (pava FDR). | P. Broberg | per.broberg@astrazeneca.com | 1.5.2 | GNU GPL |
Specialized Packages: Networks
Package | Description | Author | Contact | Version | License |
GeneNet | Modeling and inferring gene networks. | K. Strimmer (and others) | strimmer@uni-leipzig.de | 1.1.0 | GNU GPL |
GeneNT | Testing of edges in gene networks with two-stage screening algorithm. | D. Zhu (and others) | zhud@umich.edu | 1.0 | GNU GPL |
Specialized Packages: Other
Package | Description | Author | Contact | Version | License |
aroma | Object-oriented microarray analysis package. | H. Bengtsson | hb@maths.lth.se | 0.75 | GNU GPL |
pickgene | Normalization, differential expression, simulation, etc. | B. S. Yandell | yandell@stat.wisc.edu | 1.0.0 | GNU GPL |
som | Clustering using self-organizing maps. Also provides simple filtering and normalization functions. Includes yeast cell cycle data. | J. Yan | jyan@stat.wisc.edu | 0.3-4 | GNU GPL |
permax | Permutation tests for microarray data (2-sample t-test, correlation test, etc.). | R. J. Gray | gray@hsph.harvard.edu | 2.2 | GNU GPL |
hdarray | Bayesian t-tests for expression change. Part of GeneX/Cyber T. | A. D. Long | tdlong@uci.edu | 3.70 | No commercial use |
ISIS | Class discovery based on maximizing a discriminant score. | A. von Heydebreck | heydebre@molgen.mpg.de | 2.0 | No commercial use |
GeneClust | Exploratory analysis of gene expression microarray data (S-Plus). Implements Gene Shaving. | K.-A. Do (and others) | kim@mdanderson.org | 1.0b11 | No commercial use |
affyR | Analysis of data from Affymetrix oligonucleotide arrays. | L. Gautier | laurent@cbs.dtu.dk | 0.3.3 | No commercial use |
maffy | Routines for normalizing Affymetrix Oligonucleotide Arrays. | M. Åstrand | magnus.astrand@astrazeneca.com | 0.2 | GNU GPL |
tRMA | Tools for microarray analysis: normalisation, differential expression, visualisation etc. | P. Baker (and others) | trma@cmis.csiro.au | 1.7.0 | ? |
POE | Probability of expression (POE). An approach to the analysis of gene expression microarrays using three-component mixtures. | G. Parmigiani and E. Garrett | gp@jhu.edu, esg@jhu.edu | ? | GNU GPL |
Li-Wong Model | S-Plus Scripts for Li-Wong full and reduced model estimates. | F. A. Wright | fwright@bios.unc.edu | ? | ? |
PAM | Sample classification from gene expression data, by the method of nearest shrunken centroids. | R. Tibshirani (and others) | tibs@stat.stanford.edu | 1.24 | ? |
statomics | Statistical analysis of genomic and proteomic data. | D. Bickel | bickel@prueba.info | 0.2 | GNU GPL |
plsgenomics | PLS analyses for genomics. | A.-L. Boulesteix | boulesteix@stat.uni-muenchen.de | 1.0 | GNU GPL |
Frontends Using R (Web-based or Windows-based):
Package | Description | Author | Contact | Version | License |
GeneX | General platform for the analysis and comparison of gene expression data. | NCGR and the Computational Genomics Group at the University of California, Irvine | genex@ncgr.org | ? | GNU LGPL |
GeneTraffic | General platform for the data management and the analysis of two-colour microarray experiments (requires Windows). | IOBION Informatics LLC | help@iobion.com | 2.5 | Commercial |
SNOMAD | Various procedures for standardization and normalization of microarray data. | C. Colantuoni (and others) | ccolantu@jhmi.edu | ? | ? |
cDNA microarray analysis | Analysis of two-color microarray data (requires Windows). | G. C. Tseng | ctseng@pitt.edu | ? | ? |
General Packages:
Package | Description | Author | Contact | Version | License |
mva | Classical multivariate analysis: principal component analysis, K-means, hierarchical clustering, factor analysis, etc. R base package. | R Core Team | R-core@r-project.org | 1.5.0 | GNU GPL |
modreg | Modern regression analysis: Smoothing and Local Methods. R base package. | R Core Team | R-core@r-project.org | 1.5.0 | GNU GPL |
multiv | Multivariate data analysis routines: hierarchical clustering, principal component analysis, Sammon's nonlinear mapping, correspondence analysis, K-means, etc. | F. Murtagh | fmurtagh@eso.org | 1.1-4 | No commercial use |
fastICA | Implementation of FastICA algorithm to perform Independent Component Analysis (ICA) and Projection Pursuit. | J. L. Marchini | marchini@stats.ox.ac.uk | 1.1-4 | GNU GPL |
cluster | Functions for (hierarchical) cluster analysis. | P. Rousseeuw (and others) | rousse@uia.ua.ac.be | 1.6-4 | GNU GPL |
cclust | Convex clustering methods, including K-means algorithm and calculation of several indexes for finding the number of clusters in a data set. | E. Dimitriadou | dimi@ci.tuwien.ac.at | 0.6-9 | GNU GPL |
tree | Classification and regression trees. | B. Ripley | ripley@stats.ox.ac.uk | 1.0-12 | GNU GPL |
class | Various functions for classification. | B. Ripley | ripley@stats.ox.ac.uk | 6.2-6 | GNU GPL |
nnet | Feed-forward neural networks and multinomial log-linear models. | B. Ripley | ripley@stats.ox.ac.uk | 6.2-6 | GNU GPL |
mclust | Model-based clustering and discriminant analysis, including hierarchical clustering and EM for parameterized Gaussian mixtures and Poisson noise. | C. Fraley (and others) | fraley@stat.washington.edu | 1.1-4 | No commercial use |
e1071 | Functions for latent class analysis, support vector machines, fuzzy clustering, bagged clustering, etc. | F. Leisch (and others) | Friedrich.Leisch@ci.tuwien.ac.at | 1.3-11 | GNU GPL |
dr | Dimension reduction regression, incl. sliced inverse regression (SIR). | S. Weisberg | sandy@stat.umn.edu | 1.0-3 | GNU GPL |
randomForest | Classification based on a forest of classification trees using random inputs. | A. Liaw | andy_liaw@merck.com | 3.4-4 | GNU GPL |
LogitBoost | Classification with LogitBoost. | M. Dettling | dettling@stat.math.ethz.ch | 1.0 | GNU GPL |
kmethods | Kernel based dimensionality reduction and clustering methods (ISOMAP etc). | M. Kuss | astro@cs.tu-berlin.de | 0.1-1 | GNU GPL |