Disease Pathway Associations


This dataset contains relationships between biological pathways and diseases. Each relationship is inferred because the pathway and the disease are independently associated with the same gene or group of genes. The inferences were made through curation of research publications, the building of diagrams, and statistical analysis.


The dataset, from the Comparative Toxicogenomics Database (CTD), provides standardized identifiers for both the pathway and the disease. These identifiers offer cross-platform compatibility: they make it possible to look up the pathway and the disease in major scientific databases and to locate the references for the research on which the inference was based. The dataset also provides the gene through which each inference was made.

Chemicals are among the main environmental factors that influence health, yet the ways in which they cause disease are not fully understood. The purpose of the Comparative Toxicogenomics Database is to provide a tool for generating new hypotheses about the mechanisms by which chemicals contribute to the development of diseases. It does so by collecting curated data reported in the scientific literature on chemicals, genes and diseases, and by making inferences about the relationships among these three elements.

These relationships are derived through transitive inference. When, for example, a chemical and a disease share interactions with one or more genes, a relationship is inferred between the chemical and the disease, linked to a process or product of those genes. From this information it may be possible to infer the mechanism by which the chemical acts on the gene to produce the disease, the genes linked to the disease, the pathophysiology of the disease, and other relationships. “For example, if chemical A interacts with gene B, and independently gene B is associated with disease C, then chemical A is inferred to have a relationship with disease C (via gene B).” (1)

Inferences can also run in other directions; for example, a gene and a disease could share the same group of chemicals. Some inferences have direct evidence, meaning that published research supports the relationship. Others have no direct evidence in the literature and can be used to create new testable hypotheses about the mechanism of disease, to initiate new research on the relationship, and potentially to predict disease treatment and prevention.
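The transitive inference described above can be illustrated with a minimal Python sketch. This is not CTD's actual implementation; the function name `infer_chemical_disease` and the toy facts ("chemical A", "gene B", "disease C", and so on) are hypothetical and chosen only to mirror the quoted example.

```python
from collections import defaultdict


def infer_chemical_disease(chem_gene, gene_disease):
    """Return {(chemical, disease): set of genes that link them}.

    chem_gene:    iterable of (chemical, gene) curated interactions
    gene_disease: iterable of (gene, disease) curated associations
    """
    # Index diseases by gene so each chemical-gene pair can be joined quickly.
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)

    # A chemical is linked to a disease whenever they share at least one gene.
    inferred = defaultdict(set)
    for chemical, gene in chem_gene:
        for disease in diseases_by_gene.get(gene, ()):
            inferred[(chemical, disease)].add(gene)
    return dict(inferred)


# Hypothetical curated facts (illustrative only, not taken from CTD).
chem_gene = [("chemical A", "gene B"), ("chemical A", "gene X")]
gene_disease = [("gene B", "disease C"), ("gene Y", "disease D")]

inferred = infer_chemical_disease(chem_gene, gene_disease)
print(inferred)
```

In this toy input only "gene B" appears on both sides, so the single inferred pair is chemical A and disease C, with gene B recorded as the supporting gene. The same join, with pathways in place of chemicals, gives the pathway and disease associations in this dataset.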

1. Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2017. Nucleic Acids Res. 2016 Sep 19;[Epub ahead of print]

John Snow Labs; Comparative Toxicogenomics Database;

Source License Requirements

Publicly available and free for research applications, but citation is required. Permission must be requested for commercial use.

Source Citation

Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2017. Nucleic Acids Res. 2016 Sep 19;[Epub ahead of print]


KEGG or REACTOME Identifiers, Relationships Between Biological Pathways, Relationships Between Biological Diseases, Comparative Toxicogenomics Database, Relationships Between Chemicals and Diseases, Chemical and Disease Inferences, Chemical Disease Hypotheses, Toxicogenomics, Gene Disease Association, Gene Chemical Pathways

Other Titles

Diseases and Genes Linked Through Pathways, Diseases and Genes Affecting Pathways, Genes and Pathways Under Disease Conditions

Disease_Name (string; required : 1)
Name of the disease associated with the pathway.

Disease_ID (string; required : 1)
Unique identifier assigned to the disease by MeSH or OMIM, linked to the source record(s) for the disease. OMIM (Online Mendelian Inheritance in Man) is a database of human genes and genetic disorders that describes the type of genetic variation and expression; OMIM uses a six-digit identifier for each gene or genetic disorder. MeSH is a controlled vocabulary of thousands of biomedical terms (including diseases) that standardizes the terminology used in life sciences publications. Each MeSH term has a unique identifier that is 7 to 8 characters long. The MeSH unique identifier was changed to 10 characters after November 2013.

Pathway_Name (string; required : 1)
Name of the pathway associated with the disease. A pathway is a series of steps in a biological process at the molecular level that leads to a biological change; a pathway is made possible by the activities of multiple gene products and other elements. Pathways involve molecular interactions and reactions pertaining to metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, human diseases and drug development. Every pathway has a generic or standardized name that helps give an overview of the pathway’s nature; some also have unique but well-known names, for example, the Krebs cycle.

Pathway_ID (string; required : 1)
KEGG or REACTOME identifier. Alphanumeric code used to look up the pathway in the KEGG or REACTOME databases. The KEGG Pathway Database is a collection of manually drawn pathway maps representing knowledge of the molecular interaction and reaction networks for metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, human diseases and drug development. The Reactome Pathway Database is a curated database of pathways and reactions (pathway steps) in human biology. The Reactome definition of a 'reaction' includes many events in biology that are changes in state, such as binding, activation, translocation and degradation, in addition to classical biochemical reactions. Information in the database is authored by expert biologist researchers, maintained by Reactome editorial staff, and extensively cross-referenced to other resources, e.g. NCBI, Ensembl, UniProt, UCSC Genome Browser, HapMap, KEGG (Gene and Compound), ChEBI, PubMed and GO.

Inference_Gene_Symbol (string; required : 1)
Short-form abbreviation of the name of the gene through which the association between the pathway and the disease was inferred. The approved symbols for human genes are collected in the HUGO Gene Nomenclature Committee database; each name and symbol is unique to a gene and can also be applied to other species.
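As a rough illustration of how the five required string fields might be checked when loading a record, the following Python sketch validates one row and splits the prefixed identifiers. The names `Association`, `split_prefixed_id` and `parse_record` are hypothetical helpers, not part of any CTD or vendor API. The `SOURCE:accession` layout and the `MESH`, `KEGG` and `REACT` prefixes are taken from the sample rows on this page; the `OMIM` prefix is an assumption based on the Disease_ID description.

```python
from typing import NamedTuple

# Assumed identifier prefixes (MESH, KEGG, REACT appear in the sample rows;
# OMIM is assumed from the Disease_ID field description).
KNOWN_DISEASE_SOURCES = {"MESH", "OMIM"}
KNOWN_PATHWAY_SOURCES = {"KEGG", "REACT"}


class Association(NamedTuple):
    Disease_Name: str
    Disease_ID: str
    Pathway_Name: str
    Pathway_ID: str
    Inference_Gene_Symbol: str


def split_prefixed_id(value, allowed_sources):
    """Split 'SOURCE:accession' and check the source prefix."""
    source, sep, accession = value.partition(":")
    if not sep or not accession or source not in allowed_sources:
        raise ValueError(f"unexpected identifier: {value!r}")
    return source, accession


def parse_record(record):
    """Build an Association from a dict, enforcing the required string fields."""
    for field in Association._fields:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing required field: {field}")
    assoc = Association(**{f: record[f].strip() for f in Association._fields})
    # Both identifiers must carry a recognized source prefix.
    split_prefixed_id(assoc.Disease_ID, KNOWN_DISEASE_SOURCES)
    split_prefixed_id(assoc.Pathway_ID, KNOWN_PATHWAY_SOURCES)
    return assoc


record = {
    "Disease_Name": "17-Hydroxysteroid Dehydrogenase Deficiency",
    "Disease_ID": "MESH:C537805",
    "Pathway_Name": "Androgen biosynthesis",
    "Pathway_ID": "REACT:R-HSA-193048",
    "Inference_Gene_Symbol": "HSD17B3",
}
assoc = parse_record(record)
print(split_prefixed_id(assoc.Pathway_ID, KNOWN_PATHWAY_SOURCES))
```

Keeping the source prefix separate from the accession makes it straightforward to route a lookup to KEGG or Reactome, or to MeSH or OMIM, depending on where the identifier originated.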
Disease Name | Disease ID | Pathway Name | Pathway ID | Inference Gene Symbol
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Androgen biosynthesis | REACT:R-HSA-193048 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Fatty acid, triacylglycerol, and ketone body metabolism | REACT:R-HSA-535734 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Fatty Acyl-CoA Biosynthesis | REACT:R-HSA-75105 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Metabolic pathways | KEGG:hsa01100 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Metabolism | REACT:R-HSA-1430728 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Metabolism of lipids and lipoproteins | REACT:R-HSA-556833 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Metabolism of steroid hormones | REACT:R-HSA-196071 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Steroid hormone biosynthesis | KEGG:hsa00140 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Synthesis of very long-chain fatty acyl-CoAs | REACT:R-HSA-75876 | HSD17B3
17-Hydroxysteroid Dehydrogenase Deficiency | MESH:C537805 | Triglyceride Biosynthesis | REACT:R-HSA-75109 | HSD17B3
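The sample rows above can be grouped by disease and pathway source with a short Python sketch. The delivery format of the dataset is not stated on this page, so the tab-separated layout, the `SAMPLE_TSV` constant and the `pathways_by_source` helper are assumptions made for illustration; only three of the sample rows are reproduced to keep the example short.

```python
import csv
import io
from collections import defaultdict

# Three of the sample rows, written as tab-separated text (assumed layout).
SAMPLE_TSV = (
    "Disease_Name\tDisease_ID\tPathway_Name\tPathway_ID\tInference_Gene_Symbol\n"
    "17-Hydroxysteroid Dehydrogenase Deficiency\tMESH:C537805\t"
    "Androgen biosynthesis\tREACT:R-HSA-193048\tHSD17B3\n"
    "17-Hydroxysteroid Dehydrogenase Deficiency\tMESH:C537805\t"
    "Metabolic pathways\tKEGG:hsa01100\tHSD17B3\n"
    "17-Hydroxysteroid Dehydrogenase Deficiency\tMESH:C537805\t"
    "Steroid hormone biosynthesis\tKEGG:hsa00140\tHSD17B3\n"
)


def pathways_by_source(tsv_text):
    """Group pathway names by (disease ID, pathway source prefix)."""
    grouped = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # The text before the first colon names the source database.
        source = row["Pathway_ID"].split(":", 1)[0]
        grouped[(row["Disease_ID"], source)].append(row["Pathway_Name"])
    return dict(grouped)


grouped = pathways_by_source(SAMPLE_TSV)
for (disease_id, source), names in grouped.items():
    print(disease_id, source, names)
```

A tab delimiter is used here because some pathway names, such as "Fatty acid, triacylglycerol, and ketone body metabolism", contain commas.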