Other titles
- Human Gene Information and miRNA Prediction Site
- Human Gene Information and miRNA Profiling
- Human Gene Information and miRNA Target Prediction
Keywords
- Human Gene Information
- miRNA Annotations
- Microrna
- miRNA
- microRNA
- Prediction Site
- miRNA Profiling
- miRNA Target Prediction
- miRNA Cancer
Human Gene Information and miRNA Annotations
This dataset describes 28,353 human genes and their miRNA annotations. Each record gives the transcript ID, gene ID, gene symbol, gene description, species ID, number of 3P-seq tags plus 5, and a flag marking whether the transcript is the representative transcript.
Description
Proteins are built by using the information contained in molecules of messenger RNA (mRNA). Cells have several ways of controlling the amounts of different proteins they make. For example, a so-called ‘microRNA’ molecule can bind to an mRNA molecule to cause it to be more rapidly degraded and less efficiently used, thereby reducing the amount of protein built from that mRNA. Indeed, microRNAs are thought to help control the amount of protein made from most human genes, and biologists are working to predict the amount of control imparted by each microRNA on each of its mRNA targets.
The human and mouse databases started with Gencode annotations (Harrow et al., 2012), for which 3′ UTRs were extended, when possible, using RefSeq annotations (Pruitt et al., 2012), recently identified long 3′-UTR isoforms (Miura et al., 2013), and 3P-seq clusters marking more distal cleavage and polyadenylation sites (Nam et al., 2014). Zebrafish reference 3′ UTRs were similarly derived in a recent 3P-seq study (Ulitsky et al., 2012).
3P-seq data were available for seven developmental stages or tissues of zebrafish, enabling isoform profiles to be generated and predictions to be tailored for each of these. For human and mouse, however, 3P-seq data were available for only a small fraction of tissues/cell types that might be most relevant for end users, and thus results from all 3P-seq datasets available for each species were combined to generate a meta 3′-UTR isoform profile for each representative ORF. Although this approach reduces accuracy of predictions involving differentially expressed tandem isoforms, it nonetheless outperforms the previous approach of not considering isoform abundance at all, presumably because isoform profiles for many genes are highly correlated in diverse cell types (Nam et al., 2014).
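The pooling idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that "combining" means summing tag counts per 3′ end across datasets and normalizing to fractions; the function name `meta_isoform_profile`, the site labels, and the tag counts are hypothetical and are not taken from the source pipeline.

```python
from collections import Counter


def meta_isoform_profile(datasets):
    """Pool 3P-seq tag counts per 3' end across datasets and
    return the fraction of pooled tags assigned to each 3' end.

    ASSUMPTION: simple summation followed by normalization; the
    actual combination procedure is not specified in the text.
    """
    pooled = Counter()
    for tags_by_site in datasets:
        pooled.update(tags_by_site)  # adds counts site by site
    total = sum(pooled.values())
    if total == 0:
        return {}
    return {site: count / total for site, count in pooled.items()}


# Hypothetical tag counts for two tandem 3'-UTR isoforms of one gene,
# measured in two tissues with opposite isoform preferences.
tissue_a = {"proximal": 30, "distal": 10}
tissue_b = {"proximal": 10, "distal": 30}

profile = meta_isoform_profile([tissue_a, tissue_b])
print(profile)
```

In this made-up case the pooled profile is an even split, although each tissue on its own favors one isoform three to one. That is the accuracy loss for differentially expressed tandem isoforms mentioned above.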
About this Dataset
Data Info
Date Created | 2006-10 |
---|---|
Last Modified | 2021-09-01 |
Version | Release 8.0 |
Update Frequency | Irregular |
Temporal Coverage | 2006-2016 |
Spatial Coverage | N/A |
Source | John Snow Labs; TargetScanHuman Prediction of microRNA Targets |
Source License URL | |
Source License Requirements | N/A |
Source Citation | N/A |
Keywords | Human Gene Information, miRNA Annotations, Microrna, miRNA, microRNA, Prediction Site, miRNA Profiling, miRNA Target Prediction, miRNA Cancer |
Other Titles | Human Gene Information and miRNA Prediction Site, Human Gene Information and miRNA Profiling, Human Gene Information and miRNA Target Prediction |
Data Fields
Name | Description | Type | Constraints |
---|---|---|---|
Transcript_ID | Identifier of the transcript, in versioned Ensembl format (e.g. ENST00000263100.3). A transcript is the RNA (typically mRNA) copied from a particular segment of DNA by the enzyme RNA polymerase in the first step of gene expression. | string | required : 1 |
Gene_ID | Name or identification/ID of a human gene (from UTR input file). | string | required : 1 |
Gene_Symbol | Symbol of a human gene (from UTR input file). | string | required : 1 |
Gene_Description | Specific description of a human gene’s basic physical and functional unit of heredity. | string | - |
Species_ID | Numeric identifier of the species (from UTR input file), e.g. 9606 for human. | integer | level : Nominal, required : 1 |
ThreeP_Sequence_Tags | Number of 3P-seq tags recorded for the transcript, plus 5. 3P-seq (poly(A)-position profiling by sequencing) tags mark cleavage and polyadenylation sites and are used to build 3′-UTR isoform profiles. | integer | level : Nominal, required : 1 |
Is_Representative_Transcript | True if this transcript is the representative transcript for its gene, False otherwise. (A related concept is the representative miRNA of a family: the miRNA with the lowest total context score. Although only one miRNA is chosen as representative, all other miRNAs of the family are also predicted to target the same gene at the same target site(s).) | boolean | required : 1 |
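As a rough illustration of the field table above, the sketch below parses rows into typed records and enforces the `required : 1` constraints. It assumes the data arrives as comma-separated text with a header row that uses the field names exactly as listed; the delivery format, the `GeneRecord` class, and the helper names are assumptions, not part of the dataset documentation.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    transcript_id: str
    gene_id: str
    gene_symbol: str
    gene_description: str  # the only field not marked as required
    species_id: int
    threep_sequence_tags: int
    is_representative_transcript: bool


# Fields marked "required : 1" in the Data Fields table.
REQUIRED = (
    "Transcript_ID", "Gene_ID", "Gene_Symbol", "Species_ID",
    "ThreeP_Sequence_Tags", "Is_Representative_Transcript",
)


def parse_row(row):
    """Convert one CSV row (a dict of strings) into a GeneRecord."""
    for name in REQUIRED:
        if not (row.get(name) or "").strip():
            raise ValueError(f"missing required field: {name}")
    flag = row["Is_Representative_Transcript"].strip().lower()
    if flag not in ("true", "false"):
        raise ValueError(f"not a boolean: {flag!r}")
    return GeneRecord(
        transcript_id=row["Transcript_ID"].strip(),
        gene_id=row["Gene_ID"].strip(),
        gene_symbol=row["Gene_Symbol"].strip(),
        gene_description=(row.get("Gene_Description") or "").strip(),
        species_id=int(row["Species_ID"]),
        threep_sequence_tags=int(row["ThreeP_Sequence_Tags"]),
        is_representative_transcript=(flag == "true"),
    )


def load_records(text):
    """Parse CSV text (ASSUMED format) into a list of GeneRecord."""
    return [parse_row(row) for row in csv.DictReader(io.StringIO(text))]


# Two rows copied from the data preview, in the assumed CSV layout.
SAMPLE = (
    "Transcript_ID,Gene_ID,Gene_Symbol,Gene_Description,Species_ID,"
    "ThreeP_Sequence_Tags,Is_Representative_Transcript\n"
    "ENST00000263100.3,ENSG00000121410.7,A1BG,"
    "alpha-1-B glycoprotein [Source:HGNC Symbol;Acc:5],9606,74,True\n"
    "ENST00000545511.1,ENSG00000081760.12,AACS,"
    "acetoacetyl-CoA synthetase [Source:HGNC Symbol;Acc:21298],9606,1594,False\n"
)

records = load_records(SAMPLE)
print(records[0].gene_symbol, records[0].threep_sequence_tags)
```

Some gene descriptions contain commas (for example "alpha 1,4-galactosyltransferase"), so a real CSV export would need those values quoted; `csv.DictReader` handles quoted fields without extra configuration.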
Data Preview
Transcript ID | Gene ID | Gene Symbol | Gene Description | Species ID | ThreeP Sequence Tags | Is Representative Transcript |
---|---|---|---|---|---|---|
ENST00000263100.3 | ENSG00000121410.7 | A1BG | alpha-1-B glycoprotein [Source:HGNC Symbol;Acc:5] | 9606 | 74 | True |
ENST00000374001.2 | ENSG00000148584.10 | A1CF | APOBEC1 complementation factor [Source:HGNC Symbol;Acc:24086] | 9606 | 80 | True |
ENST00000318602.7 | ENSG00000175899.10 | A2M | alpha-2-macroglobulin [Source:HGNC Symbol;Acc:7] | 9606 | 3121 | True |
ENST00000299698.7 | ENSG00000166535.15 | A2ML1 | alpha-2-macroglobulin-like 1 [Source:HGNC Symbol;Acc:23336] | 9606 | 5 | True |
ENST00000401850.1 | ENSG00000128274.11 | A4GALT | alpha 1,4-galactosyltransferase [Source:HGNC Symbol;Acc:18149] | 9606 | 21 | True |
ENST00000236709.3 | ENSG00000118017.3 | A4GNT | alpha-1,4-N-acetylglucosaminyltransferase [Source:HGNC Symbol;Acc:17968] | 9606 | 5 | True |
ENST00000209873.4 | ENSG00000094914.8 | AAAS | achalasia, adrenocortical insufficiency, alacrimia [Source:HGNC Symbol;Acc:13666] | 9606 | 98 | True |
ENST00000261686.6 | ENSG00000081760.12 | AACS | acetoacetyl-CoA synthetase [Source:HGNC Symbol;Acc:21298] | 9606 | 1594 | True |
ENST00000545511.1 | ENSG00000081760.12 | AACS | acetoacetyl-CoA synthetase [Source:HGNC Symbol;Acc:21298] | 9606 | 1594 | False |
ENST00000232892.7 | ENSG00000114771.9 | AADAC | arylacetamide deacetylase [Source:HGNC Symbol;Acc:17] | 9606 | 5 | True |
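The preview shows that a gene can appear more than once (AACS has two transcripts), with the representative flag distinguishing them. A minimal sketch of selecting the representative transcript per gene follows; the rows are copied from the preview above, while the tuple layout and the function name `representative_by_gene` are illustrative assumptions.

```python
from collections import defaultdict

# (Transcript ID, Gene ID, Gene Symbol, ThreeP Sequence Tags, Is Representative)
PREVIEW = [
    ("ENST00000263100.3", "ENSG00000121410.7", "A1BG", 74, True),
    ("ENST00000261686.6", "ENSG00000081760.12", "AACS", 1594, True),
    ("ENST00000545511.1", "ENSG00000081760.12", "AACS", 1594, False),
    ("ENST00000232892.7", "ENSG00000114771.9", "AADAC", 5, True),
]


def representative_by_gene(rows):
    """Map each gene ID to the transcript IDs flagged as representative."""
    reps = defaultdict(list)
    for transcript_id, gene_id, _symbol, _tags, is_rep in rows:
        if is_rep:
            reps[gene_id].append(transcript_id)
    return dict(reps)


reps = representative_by_gene(PREVIEW)
for gene_id, transcripts in reps.items():
    print(gene_id, transcripts)
```

Keeping a list per gene, rather than a single value, makes it easy to check whether any gene in the full file has zero or several transcripts flagged as representative.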