Protein Post Translational Modifications

$395 / year

This dataset includes protein post-translational modifications and associated annotation data obtained from the Biological General Repository for Interaction Datasets (BioGRID) for major model organism species, including the type of modification, the protein sequence and the specific amino acid involved.

Complexity

A gene is a localized section of DNA. Through a process called transcription, this DNA is copied into messenger RNA (mRNA), so the mRNA is a copy of the gene's DNA sequence. The mRNA is later translated into a protein through a process called translation. After translation, the protein often needs the addition or removal of certain elements in order to become active and perform its correct function; these changes are called post-translational modifications. A protein consists of a sequence of amino acids, and each modification is made on a specific amino acid within that sequence.
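The flow described above can be sketched with a toy example. This is a deliberately simplified illustration, not a bioinformatics tool: the DNA string is made up, the codon table lists only the four codons the example uses, and transcription is modeled as replacing T with U on the coding strand. The function names are hypothetical.

```python
# Partial codon table: only the codons used in this toy example.
# "*" marks a stop codon.
CODON_TABLE = {"AUG": "M", "AAA": "K", "ACC": "T", "UAA": "*"}


def transcribe(coding_dna: str) -> str:
    """Model transcription: the mRNA matches the coding strand with U for T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def residues_at(protein: str, residue: str) -> list:
    """Return the 1-based positions at which a given amino acid occurs."""
    return [i + 1 for i, aa in enumerate(protein) if aa == residue]


mrna = transcribe("ATGAAAACCTAA")   # made-up gene fragment
protein = translate(mrna)
lysines = residues_at(protein, "K")  # candidate sites for a lysine modification
print(mrna, protein, lysines)
```

A modification record then only needs to name the protein, the position and the residue (here, lysine K at position 2) to identify the modified amino acid.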

BioGRID interactions are recorded as relationships between two proteins or genes (i.e. they are binary relationships), with an evidence code that supports the interaction and a publication reference. The term “interaction” covers direct physical binding of two proteins as well as co-existence in a stable complex and genetic interaction. It should not be assumed that an interaction reported in BioGRID is direct and physical in nature; the BioGRID experimental system definitions indicate the nature of the supporting evidence for an interaction between the two biological entities. It should also be noted that interactions in BioGRID have varying levels of evidential support. BioGRID simply curates the result of the experiment from the publication and does not guarantee that any individual interaction is true, well established or the current consensus view of the community. Curating all available evidence supporting an interaction enables orthogonal data from various sources to be collated, allowing users of the database to decide how confident they are in the existence and/or physiological relevance of that interaction.
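As a rough illustration of the record shape described above, the following sketch models an interaction as an unordered pair of identifiers plus an evidence code and a publication reference, and then collates the evidence for each pair. The class, its field names and the placeholder values are hypothetical; they are not BioGRID's actual schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record: one curated observation linking two genes/proteins."""
    interactor_a: str
    interactor_b: str
    evidence_code: str  # experimental system reported by the publication
    pubmed_id: int      # publication reference

    def pair(self) -> frozenset:
        # Unordered key, so that A-B and B-A are collated together.
        return frozenset((self.interactor_a, self.interactor_b))


def collate_evidence(records):
    """Group the (evidence code, publication) tuples by interacting pair."""
    by_pair = defaultdict(set)
    for rec in records:
        by_pair[rec.pair()].add((rec.evidence_code, rec.pubmed_id))
    return dict(by_pair)


# Made-up placeholder values, for illustration only.
records = [
    Interaction("GENE_A", "GENE_B", "EVIDENCE_X", 1),
    Interaction("GENE_B", "GENE_A", "EVIDENCE_Y", 2),
    Interaction("GENE_A", "GENE_C", "EVIDENCE_X", 1),
]
evidence = collate_evidence(records)
for pair, support in evidence.items():
    print(sorted(pair), len(support), "supporting observation(s)")
```

Keeping every observation, rather than a single yes/no flag per pair, is what lets a user weigh orthogonal evidence before deciding how much to trust an interaction.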

The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (thebiogrid.org). BioGRID currently holds over 980,000 interactions curated from both high-throughput datasets and individual focused studies, as derived from over 55,000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (S. cerevisiae), fission yeast (S. pombe) and thale cress (A. thaliana), and efforts to expand curation across multiple metazoan species are underway. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. BioGRID provides interaction data from several model organism databases, resources such as Entrez-Gene, SGD, TAIR, FlyBase and other interaction meta-databases.

Description source: Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, O’Donnell L, Oster S, Theesfeld C, Sellam A, Stark C, Breitkreutz BJ, Dolinski K, Tyers M. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2016 Dec 14;2017(1)

Date Created

2015-12-25

Last Modified

2017-06-30

Version

3.4.150

Update Frequency

Monthly

Temporal Coverage

N/A

Spatial Coverage

N/A

Source

John Snow Labs => Biological General Repository for Interaction Datasets

Source License URL

John Snow Labs Standard License

Source License Requirements

N/A

Source Citation

Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D535-9. PubMed PMID: 16381927.

Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M. The BioGRID interaction database: 2015 update. Nucleic Acids Res. 2015 Jan;43(Database issue):D470-8. PubMed PMID: 25428363.

Keywords

PTM, Protein Modification Database, Posttranslational, Glycosylation Prediction, Phosphoproteome, PTM Proteomics, PTM Protein, Protein Modification, Amino Acids

Other Titles

Protein Posttranslational Modifications Database, Species Wide Posttranslational Modifications, Annotated Major Model Systems Posttranslational Modifications, PTM Proteomics Post Translational Modifications, PTM Protein Post Translational Modifications, Protein Modification Post Translational Modifications, Amino Acids Post Translational Modifications

| Name | Description | Type | Constraints |
|---|---|---|---|
| Post_Translational_Modification_ID | BioGRID unique identifier for the gene's post-translational modification. | integer | unique: 1; required: 1; level: Nominal |
| Entrez_Gene_ID | Entrez unique identifier for the gene whose protein has been reported to be post-translationally modified. This is the unique identifier for the gene in the National Center for Biotechnology Information (NCBI) Entrez Gene database. The integer can be looked up in the Entrez system online to find the nomenclature, sequence, products and other specific details of the gene. The identifier is species specific: the gene ID of a human gene cannot be applied to the same gene in a different species. | integer | required: 1; level: Nominal |
| Biogrid_ID | BioGRID database identifier for the gene whose protein product has been reported to be modified. | integer | required: 1; level: Nominal |
| Systematic_Name | Name of the protein given for the experiment. | string | - |
| Official_Symbol | Short-form abbreviation of the name of the gene coding for the protein. The approved symbols for human genes are collected in the HUGO Gene Nomenclature Committee database; each name and symbol is unique for every gene and can be applied to other species. | string | required: 1 |
| Synonyms | Synonyms for the name of the gene whose protein is involved in the post-translational modification. | string | - |
| Sequence | Sequence of the amino acids that make up the protein, in FASTA format. FASTA format is used in bioinformatics to represent DNA or protein sequences; in a protein sequence, each letter represents one amino acid. | string | - |
| Refseq_ID | Identification number of the sequence in the RefSeq database of the National Center for Biotechnology Information. The RefSeq database contains annotated sequences of proteins and DNA. | string | - |
| Position | Position number of the modified amino acid within the protein sequence. | integer | level: Nominal |
| Post_Translational_Modification | Type of modification reported. | string | required: 1 |
| Residue | Specific amino acid on which the modification was made, given as its one-letter FASTA symbol. | string | required: 1 |
| Author | First author (surname) of the publication in which the modification was shown, optionally followed by additional indicators, e.g. Stephenson A (2005). | string | required: 1 |
| Pubmed_ID | Identification number in the PubMed database of the publication describing the modification. PubMed is a US National Library of Medicine citation database that contains millions of abstracts, references and full-text links for biomedical literature from different trusted sources. | integer | required: 1; level: Nominal |
| Organism_ID | Identification number in the BioGRID database for the organism in which the modification was found. | integer | required: 1; level: Nominal |
| Organism_Name | Official name of the organism in which the modification was found. | string | required: 1 |
| Is_Relationships_Present | Whether the referenced study found a relationship with other proteins due to the modification. | boolean | required: 1 |
| Notes | Notes on the experimental setup and the experimental conditions. | string | - |
| Source_Database | Database from which the data were extracted. | string | required: 1 |
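The constraints in the schema above can be checked mechanically. The sketch below is one possible way to do it, under several assumptions: each row is available as a dictionary of strings keyed by column name, Position is 1-based, and Sequence holds only residue letters (no FASTA header line). The helper names are hypothetical, and the toy sequence is made up for illustration.

```python
# Columns marked "required : 1" in the schema.
REQUIRED_FIELDS = (
    "Post_Translational_Modification_ID", "Entrez_Gene_ID", "Biogrid_ID",
    "Official_Symbol", "Post_Translational_Modification", "Residue",
    "Author", "Pubmed_ID", "Organism_ID", "Organism_Name",
    "Is_Relationships_Present", "Source_Database",
)
# Columns typed as integer in the schema.
INTEGER_FIELDS = (
    "Post_Translational_Modification_ID", "Entrez_Gene_ID", "Biogrid_ID",
    "Position", "Pubmed_ID", "Organism_ID",
)


def _text(row, name):
    """Return the stripped string value of a field, or '' if absent."""
    value = row.get(name)
    return "" if value is None else str(value).strip()


def validate_record(row):
    """Return a list of problems found in one row (an empty list means OK)."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not _text(row, name):
            problems.append(f"missing required field: {name}")
    for name in INTEGER_FIELDS:
        value = _text(row, name)
        if value and not value.isdigit():
            problems.append(f"not an integer: {name}={value!r}")
    # Cross-check Residue against Sequence, assuming a 1-based Position.
    seq, pos, residue = (_text(row, n) for n in ("Sequence", "Position", "Residue"))
    if seq and pos.isdigit() and residue:
        index = int(pos) - 1
        if not 0 <= index < len(seq):
            problems.append(f"Position {pos} is outside the Sequence")
        elif seq[index] != residue:
            problems.append(
                f"Residue {residue!r} does not match Sequence at Position {pos} "
                f"(found {seq[index]!r})"
            )
    return problems


# Values as read from the first sample row of this dataset.
first_row = {
    "Post_Translational_Modification_ID": "677815", "Entrez_Gene_ID": "8454",
    "Biogrid_ID": "114032", "Official_Symbol": "CUL1",
    "Post_Translational_Modification": "Neddylation", "Residue": "K",
    "Author": "Li T (2006)", "Pubmed_ID": "16620772", "Organism_ID": "9606",
    "Organism_Name": "Homo sapiens", "Is_Relationships_Present": "true",
    "Source_Database": "BIOGRID",
}
# Made-up sequence and positions, only to exercise the cross-check.
toy_row = dict(first_row, Sequence="MKT", Position="2")
bad_row = dict(toy_row, Position="3")

for row in (first_row, toy_row, bad_row):
    print(validate_record(row))
```

Optional columns that are empty, such as Sequence and Position in the sample rows, are simply skipped, so the cross-check only runs when all three of Sequence, Position and Residue are present.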
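| Post_Translational_Modification_ID | Entrez_Gene_ID | Biogrid_ID | Systematic_Name | Official_Symbol | Synonyms | Sequence | Refseq_ID | Position | Post_Translational_Modification | Residue | Author | Pubmed_ID | Organism_ID | Organism_Name | Is_Relationships_Present | Notes | Source_Database |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 677815 | 8454 | 114032 | | CUL1 | | | | | Neddylation | K | Li T (2006) | 16620772 | 9606 | Homo sapiens | true | | BIOGRID |
| 682089 | 8454 | 114032 | | CUL1 | | | | | Neddylation | K | Nie L (2011) | 21119685 | 9606 | Homo sapiens | true | | BIOGRID |
| 732399 | 790 | 107243 | | CAD | | | | | Neddylation | K | Jones J (2008) | 18247557 | 9606 | Homo sapiens | true | | BIOGRID |
| 677441 | 8454 | 114032 | | CUL1 | | | | | Neddylation | K | Park Y (2008) | 18627766 | 9606 | Homo sapiens | true | | BIOGRID |
| 677459 | 6667 | 112550 | | SP1 | | | | | Sumoylation | K | Wang YT (2008) | 18572193 | 9606 | Homo sapiens | true | | BIOGRID |
| 703472 | 7769 | 113552 | | ZNF226 | | | | | Sumoylation | K | Li T (2004) | 15161980 | 9606 | Homo sapiens | true | | BIOGRID |
| 703474 | 7638 | 113454 | | ZNF221 | | | | | Sumoylation | K | Li T (2004) | 15161980 | 9606 | Homo sapiens | true | | BIOGRID |
| 707425 | 8454 | 114032 | | CUL1 | | | | | Neddylation | K | Jin HS (2013) | 23267066 | 9606 | Homo sapiens | true | | BIOGRID |
| 732508 | 375 | 106870 | | ARF1 | | | | | Neddylation | K | Jones J (2008) | 18247557 | 9606 | Homo sapiens | true | | BIOGRID |
| 677807 | 6996 | 112855 | | TDG | hTDG | | | | Sumoylation | K | Li T (2006) | 16620772 | 9606 | Homo sapiens | true | | BIOGRID |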
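A typical first look at rows like these is to count modifications by type and to see how many distinct publications support each gene and modification pair. The sketch below does this for the ten sample rows only, using three of the columns as read from the sample above; it makes no claim about the full dataset, and the variable names are illustrative.

```python
from collections import Counter, defaultdict

# (Official_Symbol, Post_Translational_Modification, Pubmed_ID)
# taken from the ten sample rows.
SAMPLE = [
    ("CUL1", "Neddylation", 16620772),
    ("CUL1", "Neddylation", 21119685),
    ("CAD", "Neddylation", 18247557),
    ("CUL1", "Neddylation", 18627766),
    ("SP1", "Sumoylation", 18572193),
    ("ZNF226", "Sumoylation", 15161980),
    ("ZNF221", "Sumoylation", 15161980),
    ("CUL1", "Neddylation", 23267066),
    ("ARF1", "Neddylation", 18247557),
    ("TDG", "Sumoylation", 16620772),
]

# Number of sample rows per modification type.
by_type = Counter(modification for _, modification, _ in SAMPLE)

# Distinct publications supporting each (gene, modification) pair.
publications = defaultdict(set)
for symbol, modification, pubmed_id in SAMPLE:
    publications[(symbol, modification)].add(pubmed_id)

print(by_type.most_common())
for (symbol, modification), pubmed_ids in sorted(publications.items()):
    print(symbol, modification, len(pubmed_ids))
```

In this sample, one publication can report modifications on several genes, and one gene can have the same modification reported by several publications, which is why the grouping key is the pair rather than the gene alone.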