This dataset is a selection of attributed target lists extracted from the literature as supplementary data for other downloadable databases. The criterion for inclusion is drug target coverage of human proteins. However, the exact definition of a drug target varies between lists, as explained in the description below. The lists use different terminology (e.g. “successful”, “approved” or “proven”). They also differ in whether they record primary target mappings (~1:1 drug:protein) or secondary or subunit mappings (1:many).
The lists can be put to several uses, two of which are: a) following the database links and b) comparing the lists for intersects (protein IDs in common) and differentials (protein IDs unique to particular lists or subsets). Such comparisons can extend to lists generated in the course of other studies or taken from other published work (e.g. expression data or disease association gene candidates). Two further questions remain open: a) which other utilities would be found valuable and b) which other recently published target lists should be recommended for inclusion.
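The intersect and differential comparison described above amounts to set arithmetic on protein IDs. The following is a minimal sketch in Python, assuming each list has already been reduced to UniProtKB accessions. The function name `compare_lists` and the two example lists are hypothetical illustrations and are not taken from the dataset.

```python
def compare_lists(list_a, list_b):
    """Return shared and list-specific protein IDs for two target lists."""
    a, b = set(list_a), set(list_b)
    return {
        "intersect": sorted(a & b),  # protein IDs in common
        "only_a": sorted(a - b),     # differential: unique to the first list
        "only_b": sorted(b - a),     # differential: unique to the second list
    }


# Hypothetical example lists of UniProtKB accessions (illustration only).
approved_targets = ["P35354", "P23219", "P08172"]
expression_hits = ["P35354", "P08172", "P04637"]

result = compare_lists(approved_targets, expression_hits)
print(result["intersect"])  # accessions present in both lists
print(result["only_a"])     # accessions found only in approved_targets
print(result["only_b"])     # accessions found only in expression_hits
```

The same function can be applied pairwise to more than two lists, or to a target list and an externally generated list such as a set of disease association candidates.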
The metadata descriptions from this source are minimal, since context is provided in the references and/or in the download descriptions for the relevant databases or sources. The lists are Excel sheets with live links to UniProtKB, HGNC and ChEMBL.
Lists that are not already UniProtKB accessions are normalized to them (e.g. by mapping HUGO Gene Nomenclature Committee (HGNC) symbols or Entrez Gene IDs (EGID) to UniProtKB). They are then filtered to human Swiss-Prot entries (i.e. any TrEMBL entries are removed) and, where the original list offers this option, to approved drug targets. In such cases the hosted lists are transformations, rather than direct facsimiles, of the primary sources.
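The normalization and filtering steps can be sketched as follows. This is an illustration under stated assumptions, not the source's actual pipeline: the mapping table `symbol_to_accession`, the set `reviewed_human` and the function `normalize_to_swissprot` are hypothetical stand-ins for whatever ID-mapping resource is used in practice, and `TREMBL_PLACEHOLDER` is a made-up token standing for an unreviewed entry.

```python
def normalize_to_swissprot(ids, id_to_accession, reviewed_human):
    """Map source IDs to UniProtKB accessions, keeping human Swiss-Prot only.

    Returns (kept, unmapped): the retained accessions in first-seen order,
    and the source IDs for which no mapping was found.
    """
    kept, unmapped, seen = [], [], set()
    for identifier in ids:
        accessions = id_to_accession.get(identifier)
        if not accessions:
            # Cross-mappings are imperfect, so record what could not be mapped.
            unmapped.append(identifier)
            continue
        for acc in accessions:
            # Drop anything that is not a reviewed human entry, and skip duplicates.
            if acc in reviewed_human and acc not in seen:
                seen.add(acc)
                kept.append(acc)
    return kept, unmapped


# Hypothetical mapping table: HGNC symbol -> UniProtKB accessions.
symbol_to_accession = {
    "PTGS2": ["P35354"],
    "TP53": ["P04637", "TREMBL_PLACEHOLDER"],  # placeholder for a TrEMBL entry
}
# Hypothetical set of human Swiss-Prot (reviewed) accessions.
reviewed_human = {"P35354", "P04637"}

kept, unmapped = normalize_to_swissprot(
    ["PTGS2", "TP53", "NOTAGENE"], symbol_to_accession, reviewed_human
)
print(kept)      # accessions retained after filtering
print(unmapped)  # source IDs that could not be mapped
```

Returning the unmapped identifiers separately makes it easy to see how far a transformed list departs from its primary source.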
Because such ID cross-mappings are not perfect, absolute correctness is not guaranteed. However, the International Union of Basic and Clinical Pharmacology (IUPHAR) and British Pharmacological Society (BPS) Guide to PHARMACOLOGY versions are supplied in good faith, and the originals are available in every case.
For readers unfamiliar with protein list “slicing and dicing”, the source recommends the following:
1. The UniProtKB interface
2. Venny (for comparing up to four lists)
3. Panther (for displaying detailed protein classifications and attributes from lists)
“ChEMBL, DrugBank, Human Metabolome Database and the Therapeutic Target Database are resources of curated chemistry-to-protein relationships widely used in the chemogenomic arena. In this work we have extended an earlier analysis (PMID 22821596) by comparing chemistry and protein target content between 2010 and 2013. For the former, details are presented for overlaps and differences, statistics of stereochemistry as well as stereo representation and MW profiles between the four databases. For 2013 our results indicate quality improvements, major expansion, increased achiral structures and changes in MW distributions. An orthogonal comparison of chemical content with different sources inside PubChem highlights further interpretable differences. Expansion of protein content by UniProt IDs is also recorded for 2013 and Gene Ontology comparisons for human-only sets indicate differences. These emphasise the expanding complementarity of chemistry-to-protein relationships between sources, although different criteria are used for their capture.” Wiley Online Library Abstract on “Comparing the Chemical Structure and Protein Content of ChEMBL, DrugBank, Human Metabolome Database and the Therapeutic Target Database”.