Chemical Vocabulary

$179 / year

This dataset contains the hierarchically organized vocabulary terms that the Comparative Toxicogenomics Database (CTD) uses to describe chemicals inferred to interact with a gene or disease. It also provides several standardized identifiers for each chemical, so that the chemical can be located across major scientific databases.


The purpose of the Comparative Toxicogenomics Database (CTD) is to support the generation of new hypotheses about the mechanisms by which chemicals contribute to the development of diseases. It does this by collecting curated data reported in the scientific literature on chemicals, genes, and diseases, and by making inferences about the relationships among these three elements.

The CTD datasets can be used to build a query tool that returns inferred relationships between genes, chemicals, and diseases, together with the significance of each inference. When a query is run, the terms in this dataset are used to enter and search for chemicals.
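As a rough illustration of how the terms might serve as query input, the sketch below resolves a search term to a chemical identifier by matching it against names and synonyms. The record layout, the function names (`build_term_index`, `find_chemical_ids`), and the case-insensitive exact matching are assumptions made for this example; they are not part of CTD or of this dataset's tooling. The values are copied from the sample rows of this dataset.

```python
from collections import defaultdict

# Hypothetical in-memory records. The values come from the dataset's sample
# rows; the dict layout itself is an assumption made for illustration.
RECORDS = [
    {
        "Chemical_Name": "001-C8-NBD",
        "Chemical_ID": "MESH:C114385",
        "Synonyms": "001 C8 NBD|H-MeTyr-Arg-MeArg-D-Leu-NH(CH2)8NH-NBD|MeTyr-Arg-MeArg-Leu-NH-NBD",
    },
    {
        "Chemical_Name": "06-Paris-LA-66 protocol",
        "Chemical_ID": "MESH:C046983",
        "Synonyms": "Paris-LA",
    },
]


def build_term_index(records):
    """Map each lower-cased name or synonym to the set of matching Chemical_IDs."""
    index = defaultdict(set)
    for record in records:
        terms = [record["Chemical_Name"]]
        if record.get("Synonyms"):
            # Synonyms is a '|'-delimited list in this dataset.
            terms.extend(record["Synonyms"].split("|"))
        for term in terms:
            index[term.strip().lower()].add(record["Chemical_ID"])
    return index


def find_chemical_ids(index, query):
    """Return the sorted Chemical_IDs whose name or synonym equals the query."""
    return sorted(index.get(query.strip().lower(), set()))


TERM_INDEX = build_term_index(RECORDS)
print(find_chemical_ids(TERM_INDEX, "paris-la"))
```

Exact matching keeps the sketch short; a real search tool would probably also need partial or fuzzy matching.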

John Snow Labs; Comparative Toxicogenomics Database;

Source License Requirements

Publicly available and free for research use, but citation is required. Permission must be requested for commercial use.

Source Citation

Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database update 2017. Nucleic Acids Res. 2016 Sep 19;[Epub ahead of print]


Toxicogenomics, Gene Disease Association, Gene Chemical Pathways, Gene or Disease Interaction, Comparative Toxicogenomics Database, Relationships Between Chemicals and Diseases, Chemical and Disease Inferences, Chemical Disease Hypotheses

Other Titles

Chemical Vocabulary Chemistry Dictionary, Chemical Vocabulary Chemistry Terms, Chemical Vocabulary Chemistry Definition

Chemical_Name (string; required): Name of the chemical.
Chemical_ID (string; required): Identifier of the chemical in the US National Library of Medicine’s Medical Subject Headings (MeSH). MeSH is a controlled vocabulary of thousands of biomedical terms that standardizes the terminology used in life-science publications. Each MeSH term has a unique identifier of 7 to 8 characters; the identifier length was changed to 10 characters after November 2013. In this dataset the identifier carries a "MESH:" prefix, for example MESH:C089250.
Cas_Registry_Number (string; optional): Unique numeric identifier assigned by CAS to the chemical substance. The CAS registry number also serves as a reference for finding information on the specific chemical. CAS is a division of the American Chemical Society (ACS); the CAS registry collects information on millions of chemical substances identified since the early 1900’s.
Definition (string; optional): Description of the chemical.
Parent_ID (string; optional): Identifiers of the parent terms ('|'-delimited list). The chemical vocabulary is structured as a polyhierarchical tree in which a chemical may appear as a descendant term in more than one branch, and so may belong to more than one parent term.
Tree_Numbers (string; required): Identifiers of the nodes where the chemical is located in the tree ('|'-delimited list).
Parent_Tree_Numbers (string; optional): Identifiers of the nodes where the parent terms are located in the tree ('|'-delimited list).
Synonyms (string; optional): Other names by which the chemical is known ('|'-delimited list).
Drug_Bank_ID (string; optional): Identifier of the chemical in the DrugBank database ('|'-delimited list). DrugBank is a bioinformatics and cheminformatics resource that combines detailed drug data (chemical, pharmacological, and pharmaceutical) with comprehensive drug target information (sequence, structure, and pathway).
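Because several fields hold '|'-delimited lists inside a single string, a consumer will usually split them before use. The following is a minimal sketch of that step; the helper name `split_list_fields` and the dict-based record are assumptions for illustration, and the sample values are taken from one of the dataset's rows.

```python
# Columns that the field descriptions mark as '|'-delimited lists.
LIST_FIELDS = (
    "Parent_ID",
    "Tree_Numbers",
    "Parent_Tree_Numbers",
    "Synonyms",
    "Drug_Bank_ID",
)


def split_list_fields(record, list_fields=LIST_FIELDS, delimiter="|"):
    """Return a copy of record with each list column split into a Python list.

    Empty or missing list columns become empty lists.
    """
    parsed = dict(record)
    for field in list_fields:
        raw = record.get(field) or ""
        parsed[field] = [item for item in raw.split(delimiter) if item]
    return parsed


# Hypothetical record layout; the values are copied from a sample row.
RAW_RECORD = {
    "Chemical_Name": "001-C8 oligopeptide",
    "Chemical_ID": "MESH:C114386",
    "Cas_Registry_Number": "",
    "Definition": "",
    "Parent_ID": "MESH:D009842",
    "Tree_Numbers": "D12.644.456/C114386",
    "Parent_Tree_Numbers": "D12.644.456",
    "Synonyms": "001 C8 oligopeptide|H-MeTyr-Arg-MeArg-D-Leu-NH(CH2)8NH2",
    "Drug_Bank_ID": "",
}

PARSED_RECORD = split_list_fields(RAW_RECORD)
print(PARSED_RECORD["Synonyms"])
print(PARSED_RECORD["Drug_Bank_ID"])
```

Scalar columns such as Chemical_Name are left untouched, so the parsed record can be used wherever the raw one was.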
Sample records (the Cas Registry Number, Definition, and Drug Bank ID columns are empty in every row shown):

Chemical Name: (0.017ferrocene)amylose
Chemical ID: MESH:C089250
Parent ID: MESH:D000075163|MESH:D000688|MESH:D005296
Tree Numbers: D01.490.200/C089250|D02.691.657/C089250|D09.301.915.361/C089250|D09.698.365.855.361/C089250
Parent Tree Numbers: D01.490.200|D02.691.657|D09.301.915.361|D09.698.365.855.361
Synonyms: (0.017 ferrocene)amylose

Chemical Name: 001-C8-NBD
Chemical ID: MESH:C114385
Parent ID: MESH:D009842|MESH:D010069
Tree Numbers: D03.383.129.462.580/C114385|D12.644.456/C114385
Parent Tree Numbers: D03.383.129.462.580|D12.644.456
Synonyms: 001 C8 NBD|H-MeTyr-Arg-MeArg-D-Leu-NH(CH2)8NH-NBD|MeTyr-Arg-MeArg-Leu-NH-NBD

Chemical Name: 001-C8 oligopeptide
Chemical ID: MESH:C114386
Parent ID: MESH:D009842
Tree Numbers: D12.644.456/C114386
Parent Tree Numbers: D12.644.456
Synonyms: 001 C8 oligopeptide|H-MeTyr-Arg-MeArg-D-Leu-NH(CH2)8NH2

Chemical Name: 0231A , Streptomyces
Chemical ID: MESH:C434150
Parent ID: MESH:D006576
Tree Numbers: D03.633.400/C434150
Parent Tree Numbers: D03.633.400
Synonyms: (empty)

Chemical Name: 0231B, Streptomyces
Chemical ID: MESH:C434149
Parent ID: MESH:D006576
Tree Numbers: D03.633.400/C434149
Parent Tree Numbers: D03.633.400
Synonyms: (empty)

Chemical Name: 027075 compound
Chemical ID: MESH:C000620092
Parent ID: MESH:D010793|MESH:D052117
Tree Numbers: D03.383.246.118/C000620092|D03.383.710.605/C000620092|D03.633.100.115/C000620092
Parent Tree Numbers: D03.383.246.118|D03.383.710.605|D03.633.100.115
Synonyms: N-(benzo(d)(1,3)dioxol-5-yl)-2-(1-oxo-4-(p-tolyl)-5,6,7,8-tetrahydrophthalazin-2(1H)-yl)butanamide

Chemical Name: 06-Paris-LA-66 protocol
Chemical ID: MESH:C046983
Parent ID: MESH:D003630|MESH:D008727|MESH:D011239|MESH:D014750
Tree Numbers: D02.455.426.559.847.562.050.200/C046983|D03.132.436.681.827.817/C046983|D03.633.100.473.402.681.827.817/C046983|D03.633.100.733.631.192.500/C046983|D04.210.500.745.432.769.795/C046983|D04.615.562.050.200/C046983|D09.408.051.059.200/C046983
Parent Tree Numbers: D02.455.426.559.847.562.050.200|D03.132.436.681.827.817|D03.633.100.473.402.681.827.817|D03.633.100.733.631.192.500|D04.210.500.745.432.769.795|D04.615.562.050.200|D09.408.051.059.200
Synonyms: Paris-LA

Chemical Name: 071031B compound
Chemical ID: MESH:C585814
Parent ID: MESH:D013876|MESH:D052117
Tree Numbers: D02.886.778/C585814|D03.383.246.118/C585814|D03.383.903/C585814|D03.633.100.115/C585814
Parent Tree Numbers: D02.886.778|D03.383.246.118|D03.383.903|D03.633.100.115
Synonyms: (empty)
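The polyhierarchy in these records can be explored with a small parent-to-children index. The sketch below also shows a pattern visible in the sample rows: each tree number looks like a parent tree number followed by "/" and the chemical's accession. Treat that as an observation about these rows, not a documented rule. The function names and the tuple layout are assumptions made for illustration.

```python
from collections import defaultdict

# (Chemical_ID, Parent_ID, Tree_Numbers) copied from three of the sample rows.
ROWS = [
    ("MESH:C434150", "MESH:D006576", "D03.633.400/C434150"),
    ("MESH:C434149", "MESH:D006576", "D03.633.400/C434149"),
    (
        "MESH:C585814",
        "MESH:D013876|MESH:D052117",
        "D02.886.778/C585814|D03.383.246.118/C585814|"
        "D03.383.903/C585814|D03.633.100.115/C585814",
    ),
]


def children_by_parent(rows):
    """Build a mapping from parent ID to the list of child Chemical_IDs."""
    children = defaultdict(list)
    for chemical_id, parent_ids, _tree_numbers in rows:
        # A chemical with several parents is registered under each of them.
        for parent_id in parent_ids.split("|"):
            children[parent_id].append(chemical_id)
    return dict(children)


def parent_tree_numbers(tree_numbers):
    """Derive parent node identifiers by dropping the last '/'-separated part.

    Assumption: this mirrors the pattern seen in the sample rows only.
    """
    return [node.rsplit("/", 1)[0] for node in tree_numbers.split("|")]


CHILDREN = children_by_parent(ROWS)
print(CHILDREN["MESH:D006576"])
print(parent_tree_numbers(ROWS[2][2]))
```

In the last row, two parent terms account for four tree nodes, which suggests that a single parent term can itself sit at more than one position in the tree.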