Genome annotation is the description of a gene in terms of its structure, function, products, regulation, and other enriching information. “The ‘unit’ of genome annotation is the description of an individual gene and its protein (or RNA) product, and the focal point of each such record is the function assigned to the gene product. The record may also include a brief description of the evidence for this assigned function.”
Gene Ontology (GO) terms aid gene annotation by providing a standardized vocabulary: genes are annotated with GO terms (and, conversely, each GO term is linked to the genes it describes) in the Gene Ontology database. This makes it possible to search for genes that share common characteristics (described by GO terms) and to run objective analyses based on these characteristics.
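The two directions of annotation (gene to terms, term to genes) can be illustrated with a small lookup table. The following is only a minimal sketch: the gene names are hypothetical, the GO identifiers are used purely as illustrative labels, and the helper `invert_annotations` is an assumption made for this example rather than part of any Gene Ontology software.

```python
from collections import defaultdict

# Hypothetical gene -> GO term annotations (illustrative data only).
gene_to_terms = {
    "geneA": {"GO:0006412", "GO:0005840"},
    "geneB": {"GO:0006412"},
    "geneC": {"GO:0006281"},
}


def invert_annotations(annotations):
    """Build the reverse mapping: GO term -> set of genes annotated with it."""
    term_to_genes = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            term_to_genes[term].add(gene)
    return dict(term_to_genes)


term_to_genes = invert_annotations(gene_to_terms)

# Search for all genes that share one characteristic (one GO term).
shared = sorted(term_to_genes["GO:0006412"])
print(shared)
```

Keeping both mappings makes the "search by shared characteristic" step a single dictionary lookup instead of a scan over every gene.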
Gene enrichment is a statistical analysis that identifies characteristics that are over-represented in a set of genes or proteins selected from a large pool (e.g. the genes of interest from a microarray experiment); genes sharing an over-represented characteristic could be associated with a disease. Enrichment analysis statistically compares GO terms against the gene set to understand the biological processes in which those genes take part. The test determines whether a GO term is enriched in the gene set, that is, whether it is annotated to more of the genes than would be expected by chance.
GO-TermFinder is a tool that retrieves Gene Ontology information for a list of genes (for example, a large group of genes from a microarray experiment) and analyzes the GO terms annotated to them, calculating the statistical significance of each annotation to find the GO terms that are significantly enriched in the list. With this analysis, scientists can determine whether the genes share a common characteristic by finding the terms most strongly associated (by annotation significance) with the group. GO-TermFinder uses the hypergeometric distribution to calculate the significance of the annotated GO terms.
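The hypergeometric calculation can be sketched as a one-sided tail probability: the chance of seeing at least k annotated genes in a list of n genes drawn from a background of N genes, M of which carry the GO term. The code below is not GO-TermFinder's own implementation; the function name, parameter names, and example numbers are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(N, M, n, k):
    """Return P(X >= k) under the hypergeometric distribution.

    N: total number of genes in the background population
    M: background genes annotated with the GO term
    n: number of genes in the list being tested
    k: genes in the list annotated with the GO term
    """
    upper = min(n, M)    # the list cannot contain more annotated genes than this
    total = comb(N, n)   # number of ways to draw n genes from the background
    tail = sum(
        comb(M, i) * comb(N - M, n - i)  # ways to draw exactly i annotated genes
        for i in range(k, upper + 1)
    )
    return tail / total


# Illustrative numbers only: 20 background genes, 5 annotated with the term,
# a list of 4 genes of which 2 carry the term.
p = hypergeom_enrichment_pvalue(N=20, M=5, n=4, k=2)
print(p)
```

With these made-up numbers the probability is 1205/4845, roughly 0.25, so the term would not be considered enriched. In practice one such test is run per GO term, which is why the resulting p-values are usually adjusted for multiple testing before terms are reported as significant.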
1. Koonin EV, Galperin MY. Sequence – Evolution – Function: Computational Approaches in Comparative Genomics. Boston: Kluwer