Other Titles
- Chemical Effect Integrated into GO Pathways
- Go Pathways Response to Chemical Exposure
- GO Pathway and Genes Affected Upon Chemical Exposure
- Chemical GO Enriched Associations Gene Analysis
- Chemical GO Enriched
Keywords
- Toxicogenomics
- Gene Disease Association
- Gene Chemical Pathways
- Gene Enrichment
- Gene Ontology Enrichment Analyses
- Gene Ontology Terms
- Comparative Toxicogenomics Database
- Relationships Between Chemicals and Diseases
- Chemical and Disease Inferences
- Chemical Disease Hypotheses
Chemical Gene Ontology Enriched Associations
This dataset contains the results of Gene Ontology (GO) enrichment analyses performed on groups of genes that are affected in some way by a chemical. The analyses were run with the GO-TermFinder tool, which identifies the GO terms shared among the genes. These shared terms support inferences about the biological processes, molecular functions or cellular components that might be involved in the effect of the chemical on the genes and/or in the mechanism of disease.
Get The Data
- Research: Non-Commercial, Share-Alike, Attribution (Free Forever)
- Commercial: Commercial Use, Remix & Adapt, White Label
Description
The Gene Ontology, developed by the Gene Ontology Consortium, is a system for describing in words the functions and processes of genes once these are identified, by annotating the genes to a series of standardized terms and vice versa. The system also makes it possible to find common characteristics (common denominators) between genes by performing a Gene Ontology enrichment analysis.
Genome annotation is the description of a gene together with its structure, function, products, regulation and other enriching information. “The ‘unit’ of genome annotation is the description of an individual gene and its protein (or RNA) product, and the focal point of each such record is the function assigned to the gene product. The record may also include a brief description of the evidence for this assigned function.” [1]
Gene Ontology (GO) terms aid the annotation of a gene by providing standardized descriptions: GO terms are annotated to genes (or vice versa) in the Gene Ontology database. This makes it possible to search for genes that share common characteristics (described with GO terms) and to run objective analyses on these characteristics.
Gene enrichment is a statistical analysis that identifies characteristics that are over-represented in a group of genes or proteins drawn from a larger pool (e.g. the genes on a microarray); genes that share an over-represented characteristic could be associated with the disease. Enrichment analysis statistically compares GO terms against the gene set to understand the biological processes of those genes. The test determines whether a GO term is enriched for the genes.
GO-TermFinder is a tool that retrieves Gene Ontology information for a list of genes (for example, a large group of genes from a microarray) and calculates the statistical significance of each GO annotation, in order to find the GO terms that are significantly enriched in the list. With this analysis, scientists can determine whether the genes share a common characteristic by finding the terms most strongly associated (by annotation significance) with the group of genes. GO-TermFinder uses the hypergeometric distribution to compute the significance of the annotated GO terms.
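The hypergeometric calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not GO-TermFinder's own code: the function name `enrichment_p_value` is hypothetical, and it assumes the reported p-value is the upper tail of the distribution, that is, the probability of seeing at least the observed number of annotated genes. The four arguments correspond to the Target Match Quantity, Target Total Quantity, Background Match Quantity and Background Total Quantity fields of this dataset.

```python
from math import comb


def enrichment_p_value(target_match, target_total, background_match, background_total):
    """Upper-tail hypergeometric probability P(X >= target_match).

    X is the number of term-annotated genes in a random draw of
    `target_total` genes from a background of `background_total` genes,
    of which `background_match` are annotated to the GO term.
    """
    max_overlap = min(target_total, background_match)
    not_annotated = background_total - background_match
    # Count the draws that contain at least `target_match` annotated genes.
    favourable = sum(
        comb(background_match, k) * comb(not_annotated, target_total - k)
        for k in range(target_match, max_overlap + 1)
    )
    return favourable / comb(background_total, target_total)


# "Myc-Max complex" row from the Data Preview: 2 of the 4 target genes and
# 2 of the 45352 background genes are annotated to the term.
p = enrichment_p_value(2, 4, 2, 45352)
print(f"{p:.3g}")  # 5.83e-09
```

Under this assumption the result agrees with the P Value listed for that row in the Data Preview, which suggests, but does not prove, that the dataset values were computed the same way.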
1. Koonin EV, Galperin MY. Sequence – Evolution – Function: Computational Approaches in Comparative Genomics. Boston: Kluwer Academic; 2003.
About this Dataset
Data Info
Field | Value |
---|---|
Date Created | 2004-01-20 |
Last Modified | 2024-05-30 |
Version | 2024-05 |
Update Frequency | Monthly |
Temporal Coverage | N/A |
Spatial Coverage | N/A |
Source | John Snow Labs; Comparative Toxicogenomics Database |
Source License URL | |
Source License Requirements | Publicly available and free for research application but citation is required. Permission asked for commercial uses |
Source Citation | Publicly available and free for research application but citation is required. Permission asked for commercial uses |
Keywords | Toxicogenomics, Gene Disease Association, Gene Chemical Pathways, Gene Enrichment, Gene Ontology Enrichment Analyses, Gene Ontology Terms, Comparative Toxicogenomics Database, Relationships Between Chemicals and Diseases, Chemical and Disease Inferences, Chemical Disease Hypotheses |
Other Titles | Chemical Effect Integrated into GO Pathways, Go Pathways Response to Chemical Exposure, GO Pathway and Genes Affected Upon Chemical Exposure, Chemical GO Enriched Associations Gene Analysis, Chemical GO Enriched |
Data Fields
Name | Description | Type | Constraints |
---|---|---|---|
Chemical_Name | Name of the chemical associated with the group of genes analyzed for Gene Ontology enrichment. | string | required : 1 |
Chemical_ID | Identifier of the chemical in the US National Library of Medicine’s Medical Subject Headings (MeSH). MeSH is a controlled vocabulary of thousands of biomedical terms that standardizes the terminology used in life-sciences publications. Each MeSH term has a unique identifier, which can be 7 or 8 characters long. The length of the MeSH unique identifier was changed to 10 characters after November 2013. | string | required : 1 |
Cas_Registry_Name | Unique numeric identifier assigned by CAS to the chemical substance. The CAS registry number also serves as a reference for finding information on the specific chemical. CAS is a division of the American Chemical Society (ACS); the CAS registry collects information on millions of chemical substances identified since the early 1900s. | string | - |
Gene_Ontology | Gene Ontology (GO) class of the term. The Gene Ontology is a controlled vocabulary used to describe gene function and properties; it collects concepts and the relationships between them. GO divides gene functions into three main classes: molecular function, cellular component and biological process. Molecular function = names for activities performed at the molecular level by individual gene products; these are often appended with the word “activity” to avoid confusion with gene product names (e.g. adenylate cyclase activity). Cellular component = names for structures inside the cell, or structures at the molecular level formed by groups of gene products (groups of proteins). Biological process = names for series of steps that lead to a biological change; these include pathways and other processes in which the activities of multiple gene products take part. | string | required : 1 |
GO_Term_Name | Gene ontology term that describes the molecular function, biological process or cellular component. | string | required : 1 |
GO_Term_ID | Alphanumerical Identification for GO terms. The GO term ID is used to browse the GO terms in the Gene Ontology database. The GO database is a relational database comprised of the GO ontologies as well as the annotations of genes and gene products to terms in those ontologies. The GO database is the source of all data available through the legacy AmiGO 1.8 browser and search engine. | string | required : 1 |
Highest_GO_Level | The highest level to which the GO term is assigned within the GO hierarchical ontology. Many GO terms are located at multiple levels within the ontology; only the highest level is displayed. Level 1 constitutes “children” of the most general Biological Process, Cellular Component, and Molecular Function terms. The structure of GO can be described in terms of a graph, where each GO term is a node, and the relationships between the terms are edges between the nodes. GO is loosely hierarchical, with 'child' terms being more specialized than their 'parent' terms, but unlike a strict hierarchy, a term may have more than one parent term (note that the parent/child model does not hold true for all types of relation). For example, the biological process term “hexose biosynthetic process” has two parents, “hexose metabolic process” and “monosaccharide biosynthetic process.” This is because “biosynthetic process” is a subtype of “metabolic process” and a “hexose” is a subtype of “monosaccharide.” | integer | level : Ordinal |
P_Value | Raw p-value. The p-value indicates how significant the GO term is for the group of genes related to the chemical; the closer it is to zero, the less likely it is that these genes share the GO term by chance alone. The p-value is calculated with the hypergeometric distribution, which compares the GO terms shared by the genes with the background distribution of annotation; the inputs to the formula are Target Match Quantity, Target Total Quantity, Background Match Quantity and Background Total Quantity. | string | - |
Corrected_P_Value | P-value after applying the Bonferroni adjustment. The Bonferroni correction is applied when many hypotheses are tested together (in this case, many GO terms against the same group of genes), so that the reported significance accounts for the number of tests. | string | - |
Target_Match_Quantity | Number of genes that interact with the chemical and are annotated to the GO term. | integer | level : Ratio |
Target_Total_Quantity | Total number of genes that interact with the chemical. | integer | level : Ratio |
Background_Match_Quantity | Total number of genes that are annotated to the GO term. | integer | level : Ratio |
Background_Total_Quantity | Total number of human genes. | number | level : Ratio |
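The relationship between P_Value and Corrected_P_Value can be sketched as follows. This is a minimal illustration of a standard Bonferroni adjustment (raw p-value multiplied by the number of tests, capped at 1), not a confirmed description of how the dataset was produced. The number of tests is not stated on this page: the value 1060 used below is an assumption, chosen only because it is consistent with the ratio between the two columns in the sample rows for chemical 10074-G5, and it would differ for other chemicals.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value: raw p-value times the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical number of GO terms tested for chemical 10074-G5
# (an assumption; the dataset documentation does not state it).
N_TESTS = 1060

# Raw P Value of "negative regulation of cellular metabolic process".
print(f"{bonferroni(4.82e-06, N_TESTS):.3g}")  # 0.00511
```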
Data Preview
Chemical Name | Chemical ID | Cas Registry Name | Gene Ontology | GO Term Name | GO Term ID | Highest GO Level | P Value | Corrected P Value | Target Match Quantity | Target Total Quantity | Background Match Quantity | Background Total Quantity |
---|---|---|---|---|---|---|---|---|---|---|---|---|
10074-G5 | C534883 | | Biological Process | negative regulation of cellular metabolic process | GO:0031324 | 3 | 4.82e-06 | 0.00511 | 4 | 4 | 2126 | 45352 |
10074-G5 | C534883 | | Biological Process | negative regulation of nitrogen compound metabolic process | GO:0051172 | 3 | 5.08e-06 | 0.00538 | 4 | 4 | 2154 | 45352 |
10074-G5 | C534883 | | Biological Process | positive regulation of miRNA transcription | GO:1902895 | 6 | 8.97e-06 | 0.00951 | 2 | 4 | 56 | 45352 |
10074-G5 | C534883 | | Cellular Component | Myc-Max complex | GO:0071943 | 4 | 5.83e-09 | 6.18e-06 | 2 | 4 | 2 | 45352 |
10074-G5 | C534883 | | Cellular Component | RNA polymerase II transcription repressor complex | GO:0090571 | 3 | 4.55e-07 | 0.000482 | 2 | 4 | 13 | 45352 |
10074-G5 | C534883 | | Molecular Function | cis-regulatory region sequence-specific DNA binding | GO:0000987 | 6 | 3.29e-06 | 0.00349 | 3 | 4 | 427 | 45352 |
10074-G5 | C534883 | | Molecular Function | DNA-binding transcription factor binding | GO:0140297 | 4 | 4.28e-06 | 0.00454 | 3 | 4 | 466 | 45352 |
10074-G5 | C534883 | | Molecular Function | E-box binding | GO:0070888 | 8 | 6.57e-06 | 0.00697 | 2 | 4 | 48 | 45352 |
10074-G5 | C534883 | | Molecular Function | RNA polymerase II cis-regulatory region sequence-specific DNA binding | GO:0000978 | 7 | 2.63e-06 | 0.00278 | 3 | 4 | 396 | 45352 |
10074-G5 | C534883 | | Molecular Function | RNA polymerase II transcription regulatory region sequence-specific DNA binding | GO:0000977 | 6 | 5.38e-06 | 0.0057 | 3 | 4 | 503 | 45352 |
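Because P_Value and Corrected_P_Value are typed as strings in the Data Fields table, they need to be converted to numbers before filtering or sorting. The sketch below shows one way to do this on a few rows copied from the preview. The helper name `significant_terms`, the dictionary layout and the thresholds are illustrative assumptions, not part of the dataset or of any John Snow Labs tooling.

```python
# A few preview rows, keyed by the field names from the Data Fields table.
# P-values are kept as strings, as in the dataset schema.
preview_rows = [
    {"GO_Term_ID": "GO:0031324", "Corrected_P_Value": "0.00511"},
    {"GO_Term_ID": "GO:1902895", "Corrected_P_Value": "0.00951"},
    {"GO_Term_ID": "GO:0071943", "Corrected_P_Value": "6.18e-06"},
    {"GO_Term_ID": "GO:0090571", "Corrected_P_Value": "0.000482"},
]


def significant_terms(rows, alpha=0.05):
    """Return the rows whose corrected p-value is below alpha, most significant first."""
    # float() accepts both plain decimals and scientific notation such as "6.18e-06".
    scored = [(float(row["Corrected_P_Value"]), row) for row in rows]
    scored.sort(key=lambda pair: pair[0])
    return [row for p, row in scored if p < alpha]


for row in significant_terms(preview_rows, alpha=0.001):
    print(row["GO_Term_ID"], row["Corrected_P_Value"])
```

With the threshold of 0.001 chosen here, only the two Cellular Component terms from the preview pass the filter.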