Gene Ontology Disease Gene Inference Networks

$447.50 / year

This dataset from the Comparative Toxicogenomics Database (CTD) contains inferred relationships between Gene Ontology (GO) terms and diseases. A relationship is inferred when a GO term and a disease are each independently associated with the same gene or group of genes. The inferences were made through curation of research publications, the building of diagrams and statistical analysis.


The dataset includes several types of standardized identifiers for each GO term and disease. These provide cross-platform compatibility: the term and the disease can be looked up in major scientific databases, and the references for the research on which an inference was based can be located. The dataset also lists the genes for which each inference was made.

Chemicals are among the main environmental factors that influence health, yet the ways in which they cause disease are not fully understood. The purpose of the Comparative Toxicogenomics Database is to help generate new hypotheses about the role of chemicals in the development of disease. It does so by collecting curated data from the scientific literature on chemicals, genes and diseases, and by inferring relationships among these three elements. This is accomplished through transitive inference: when, for example, a chemical and a disease both have interactions with one or more of the same genes, a relationship between the chemical and the disease is inferred, linked to a process or product of those genes. From this information one can further infer the chemical's mechanism of action on the gene in producing the disease, the genes linked to the disease, the pathophysiology of the disease, and other relationships. “For example, if chemical A interacts with gene B, and independently gene B is associated with disease C, then chemical A is inferred to have a relationship with disease C (via gene B).” (1)

Inferences can also run in other directions; for example, a gene and a disease may share the same group of chemicals. Some inferences are supported by direct evidence, meaning that published research documents the relationship. Others have no direct evidence in the literature and can be used to create new testable hypotheses about disease mechanisms, initiate new research on the relationship and potentially predict disease treatment and prevention.
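The transitive inference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CTD's actual pipeline: the function name `infer_relationships` and the toy pairs are hypothetical, and real CTD inferences also involve curation and statistical scoring that this sketch omits.

```python
from collections import defaultdict


def infer_relationships(left_gene_pairs, gene_right_pairs):
    """Link every left-hand entity to every right-hand entity sharing a gene.

    Returns a dict mapping (left, right) to the sorted list of shared genes.
    """
    # Index the right-hand side by gene for quick lookup.
    rights_by_gene = defaultdict(set)
    for gene, right in gene_right_pairs:
        rights_by_gene[gene].add(right)

    # Walk the left-hand side and collect the genes behind each inferred pair.
    shared = defaultdict(set)
    for left, gene in left_gene_pairs:
        for right in rights_by_gene.get(gene, ()):
            shared[(left, right)].add(gene)

    return {pair: sorted(genes) for pair, genes in shared.items()}


# Hypothetical curated facts, named after the quoted example.
chemical_gene = [("chemical A", "gene B"), ("chemical A", "gene X")]
gene_disease = [("gene B", "disease C"), ("gene Y", "disease D")]

inferences = infer_relationships(chemical_gene, gene_disease)
print(inferences)  # {('chemical A', 'disease C'): ['gene B']}
```

The same function would cover the GO term to disease case by passing (GO term, gene) pairs as the first argument, since only the shared-gene step matters.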

1. Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2017. Nucleic Acids Res. 2016 Sep 19;[Epub ahead of print]

Date Created


Last Modified




Update Frequency


Temporal Coverage


Spatial Coverage



John Snow Labs; Comparative Toxicogenomics Database;

Source License URL

Source License Requirements

Publicly available and free for research applications, but citation is required. Permission must be requested for commercial use.

Source Citation

Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2017. Nucleic Acids Res. 2016 Sep 19;[Epub ahead of print]


Relationship Between Gene Ontology and Diseases, Gene Interactions, Mechanisms of Diseases, Gene and Disease Relationship, Comparative Toxicogenomics Database, Relationships Between Genes and Diseases, Chemical and Disease Inferences, Chemical Disease Hypotheses, Toxicogenomics

Other Titles

Pathway Disease Relationships, Pathway GO Context of Gene-Disease Interaction

Disease_Name (string, required)
Name of the disease.

Disease_ID (string, required)
Unique identifier assigned to the disease by MeSH or OMIM, linked to the source record(s) for the disease. OMIM (Online Mendelian Inheritance in Man) is a database of human genes and genetic disorders that displays the type of genetic variation and expression; it uses a six-digit identifier for each gene or genetic disorder. MeSH is a controlled vocabulary of thousands of biomedical terms (including diseases) that standardizes the terminology used in life sciences publications. Each MeSH term has a unique identifier 7 to 8 characters long. The MeSH unique identifier length was changed to 10 characters after November 2013.

GO_Name (string, required)
Gene Ontology term that describes the molecular function, biological process or cellular component. Gene Ontology (GO) is a controlled vocabulary used to describe gene function and properties; it collects concepts and the relationships between them. GO divides gene functions into three main classes:
Molecular function: names for activities performed at the molecular level by individual gene products; these are often appended with the word “activity” to avoid confusion with gene product names (e.g. adenylate cyclase activity).
Cellular component: names for structures inside the cell, or molecular-level structures formed by groups of gene products (that is, groups of proteins).
Biological process: names for a series of steps that lead to a biological change; these include pathways and other processes in which the activities of multiple gene products are involved.

GO_ID (string, required)
Alphanumeric identifier for the GO term. The GO term ID is used to look up GO terms in the Gene Ontology database. The GO database is a relational database comprising the GO ontologies as well as the annotations of genes and gene products to terms in those ontologies. It is the source of all data available through the legacy AmiGO 1.8 browser and search engine.

Inference_Gene_Quantity (integer, level: Ratio, required)
Number of genes curated in the inference.

Inference_Gene_Symbols (string)
Short-form abbreviation of the name of the gene(s). The approved symbols for human genes are collected in the HUGO Gene Nomenclature Committee database; each name and symbol is unique to a gene and can be applied to other species.

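The field definitions above can be mirrored in a small record type with basic checks. This is a minimal sketch, not an official schema: the class name `GoDiseaseInference` and both regular expressions are assumptions (GO IDs as `GO:` plus seven digits, disease IDs prefixed with `MESH:` or `OMIM:` as in the sample rows), so adjust them if the delivered files differ.

```python
import re
from dataclasses import dataclass

# Assumed ID shapes; the listing itself does not spell these patterns out.
GO_ID_PATTERN = re.compile(r"GO:\d{7}")
DISEASE_ID_PATTERN = re.compile(r"MESH:[A-Z]\d+|OMIM:\d{6}")


@dataclass(frozen=True)
class GoDiseaseInference:
    disease_name: str
    disease_id: str
    go_name: str
    go_id: str
    inference_gene_quantity: int
    inference_gene_symbols: tuple = ()  # the only field not marked required

    def problems(self):
        """Return a list of schema violations; an empty list means none found."""
        found = []
        if not self.disease_name:
            found.append("Disease_Name is required")
        if not DISEASE_ID_PATTERN.fullmatch(self.disease_id):
            found.append(f"unexpected Disease_ID: {self.disease_id!r}")
        if not self.go_name:
            found.append("GO_Name is required")
        if not GO_ID_PATTERN.fullmatch(self.go_id):
            found.append(f"unexpected GO_ID: {self.go_id!r}")
        if self.inference_gene_quantity < 0:
            found.append("Inference_Gene_Quantity must not be negative")
        return found


record = GoDiseaseInference(
    "Autistic Disorder", "MESH:D001321", "3M complex", "GO:1990393", 1, ("CUL7",)
)
print(record.problems())  # []
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a row at once.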
| Disease Name | Disease ID | GO Name | GO ID | Inference Gene Quantity | Inference Gene Symbols |
|---|---|---|---|---|---|
| Autistic Disorder | MESH:D001321 | 3M complex | GO:1990393 | 1 | CUL7 |
| Carcinoma, Renal Cell | MESH:D002292 | 3M complex | GO:1990393 | 1 | CUL7 |
| Miller-McKusick-Malvaux-Syndrome (3M Syndrome) | MESH:C535314 | 3M complex | GO:1990393 | 1 | CUL7 |
| Three M Syndrome 2 | MESH:C567862 | 3M complex | GO:1990393 | 1 | OBSL1 |
| Urinary Bladder Neoplasms | MESH:D001749 | 3M complex | GO:1990393 | 1 | FBXW8 |
| Acute Kidney Injury | MESH:D058186 | 3-methyl-2-oxobutanoate dehydrogenase (lipoamide) complex | GO:0017086 | 0 | |
| Acute Lung Injury | MESH:D055371 | 3-methyl-2-oxobutanoate dehydrogenase (lipoamide) complex | GO:0017086 | 0 | |
| Albuminuria | MESH:D000419 | 3-methyl-2-oxobutanoate dehydrogenase (lipoamide) complex | GO:0017086 | 0 | |
| Alopecia | MESH:D000505 | 3-methyl-2-oxobutanoate dehydrogenase (lipoamide) complex | GO:0017086 | 0 | |
| Anxiety Disorders | MESH:D001008 | 3-methyl-2-oxobutanoate dehydrogenase (lipoamide) complex | GO:0017086 | 0 | |
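As an illustration of how the sample rows might be used, the sketch below groups diseases by inference gene. It assumes the data is delivered as a CSV file whose header uses the field names listed earlier, and that multiple gene symbols in one cell are separated by `|`; neither detail is confirmed by this listing, and the embedded `SAMPLE` text is only a hand-copied subset of the rows above.

```python
import csv
import io
from collections import defaultdict

# A few sample rows, laid out as an assumed CSV export.
SAMPLE = """\
Disease_Name,Disease_ID,GO_Name,GO_ID,Inference_Gene_Quantity,Inference_Gene_Symbols
Autistic Disorder,MESH:D001321,3M complex,GO:1990393,1,CUL7
"Carcinoma, Renal Cell",MESH:D002292,3M complex,GO:1990393,1,CUL7
Three M Syndrome 2,MESH:C567862,3M complex,GO:1990393,1,OBSL1
Alopecia,MESH:D000505,3-methyl-2-oxobutanoate dehydrogenase (lipoamide) complex,GO:0017086,0,
"""

diseases_by_gene = defaultdict(list)
for row in csv.DictReader(io.StringIO(SAMPLE)):
    if int(row["Inference_Gene_Quantity"]) == 0:
        continue  # no inference genes listed for this GO term and disease pair
    # Assumed separator for cells that hold more than one symbol.
    for symbol in row["Inference_Gene_Symbols"].split("|"):
        diseases_by_gene[symbol].append(row["Disease_Name"])

print(dict(diseases_by_gene))
# {'CUL7': ['Autistic Disorder', 'Carcinoma, Renal Cell'], 'OBSL1': ['Three M Syndrome 2']}
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle; the grouping logic stays the same.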