Methods Inf Med 2005; 44(03): 468-472
DOI: 10.1055/s-0038-1633995
Original Article
Schattauer GmbH

Microarray Annotation and Biological Information on Function

B. Brors
Dept. of Theoretical Bioinformatics, German Cancer Research Center, INF 280, Heidelberg, Germany

Publication History

Publication Date: 06 February 2018 (online)

Summary

Objectives: Many methods for the statistical analysis of gene expression studies by DNA microarrays produce lists of genes as output. To interpret such gene lists in terms of traditional biology, e.g. which pathways may be affected, appropriate annotations are needed for the probes on an array.

Methods: Problems arise from the different sources that manufacturers have used to design microarray probes, and from the association of these probes with biological entities such as genes, transcripts and proteins. Functional annotation is of crucial importance, and systems like Gene Ontology can be used for this purpose. Gene Ontology arranges annotation terms hierarchically and thus makes the annotations in a gene list amenable to automated analysis.

Results: Several methods for analyses of gene function are described. The hierarchical nature of systems like Gene Ontology particularly suggests using methods from graph theory.
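
As an illustration of why a hierarchical term system lends itself to graph methods, the following minimal sketch propagates each gene's direct annotations to all ancestor terms in a small directed acyclic graph and then counts genes per term. The term names, the gene names and the helper functions are hypothetical and chosen only for illustration; they are not taken from the article or from the actual Gene Ontology files.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: each term maps to its set of parent terms.
# The structure is a directed acyclic graph, as in Gene Ontology.
PARENTS = {
    "apoptosis": {"cell death"},
    "cell death": {"biological process"},
    "DNA repair": {"DNA metabolism"},
    "DNA metabolism": {"biological process"},
    "biological process": set(),
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def count_genes_per_term(annotations, parents=PARENTS):
    """Count genes per term after propagating direct annotations upward."""
    genes_by_term = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            for node in ancestors(term, parents):
                genes_by_term[node].add(gene)
    return {term: len(genes) for term, genes in genes_by_term.items()}


# Hypothetical gene list carrying direct annotations only.
GENE_LIST = {
    "geneA": {"apoptosis"},
    "geneB": {"apoptosis", "DNA repair"},
    "geneC": {"DNA metabolism"},
}

counts = count_genes_per_term(GENE_LIST)
for term in sorted(counts):
    print(term, counts[term])
```

Using sets of genes per term, rather than simple counters, avoids counting a gene twice when two of its annotations share an ancestor.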

Conclusions: The main problem in annotating microarray probes and inferring affected functional modules is the incompleteness of, and the degree of error in, current biological databases. Initial approaches that make use of functional annotation exist, but they have to be extended, in particular with respect to estimating the statistical significance of results.
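
To make the question of statistical significance concrete, the sketch below computes a hypergeometric upper-tail probability for the overrepresentation of one annotation term in a gene list. This is one commonly used choice and is shown here as an assumption, not as the method proposed in the article. The function name and the numbers in the example are hypothetical, and the sketch does not address multiple testing across terms or the dependence between terms in a hierarchy.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """Probability of at least k annotated genes in a list of n genes.

    N: genes represented on the array (the reference set)
    K: genes in the reference set annotated with the term
    n: genes in the list of interest
    k: genes in the list annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 10 genes on the array, 3 carry the term,
# and 2 of the 4 genes in the list carry it.
p_value = hypergeom_upper_tail(k=2, n=4, K=3, N=10)
print(p_value)
```

A small value indicates that the term occurs in the list more often than expected from random sampling of the array, under the assumption that the annotations themselves are complete and correct.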

 