MetaSRA provides an API for programmatic access to the current version of the database, with two resources: samples and terms.
If you need to download the whole database, or need an older version of MetaSRA, see the
Access samples as JSON by fetching http://metasra.biostat.wisc.edu/api/v01/samples.json? or as CSV by fetching http://metasra.biostat.wisc.edu/api/v01/samples.csv?, appending query-string arguments after the ? in either case.
For downloading SRA data with the SRA Toolkit, you can fetch a text file with one run ID per line from http://metasra.biostat.wisc.edu/api/v01/runs.ids.txt? or a CSV file with one run per line and accompanying metadata from http://metasra.biostat.wisc.edu/api/v01/runs.csv?. See
for instructions on using these files to download sequence data from SRA and processed expression data from Recount2.
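As a rough illustration, the run ID file can be read into a list before the IDs are handed to a download tool. The sketch below assumes the file is plain text with one run ID per line, as described above. The helper name parse_run_ids is hypothetical, and the sample text reuses the placeholder IDs shown elsewhere on this page rather than real runs.

```python
def parse_run_ids(text):
    """Return the run IDs from a one-ID-per-line text file, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Placeholder content standing in for a downloaded runs.ids.txt file.
example_text = "SRR0123456\nSRR0123457\n\n"
run_ids = parse_run_ids(example_text)
print(run_ids)
```

Stripping whitespace and dropping empty lines guards against a trailing newline at the end of the file.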
You can use any combination of the following query-string arguments to filter samples:
study | Filter samples by this SRA study ID. Required: you must provide a value for "study", for "and", or for both. |
and | Given a comma-separated list of ontology term IDs (see below), return only samples that match all of the terms. Required: you must provide a value for "study", for "and", or for both. |
not | Given a comma-separated list of ontology term IDs (see below), return only samples that do not match any of the terms. |
sampletype | Show only samples matching this computationally predicted sample type. Valid options are cell line, tissue, primary cells, stem cells, in vitro differentiated cells, and iPS cells. The server accepts any of the following as equivalent: primary cells, primary+cells, or primary%20cells. |
species | Filter samples by species. Valid options are human and mouse. |
assay | Filter samples by assay type. Valid options are RNA-seq and ChIP-seq. |
limit | Limit the results to this many studies. |
skip | Skip this many studies (useful for paging in combination with limit). |
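The arguments above can be combined into a query URL with the standard library. This is a minimal sketch, not an official client: build_samples_url and its keyword names (and_terms, not_terms) are hypothetical, and it only enforces the documented rule that study and/or and must be supplied. The term and study IDs are the placeholders used elsewhere on this page.

```python
from urllib.parse import urlencode

API_BASE = "http://metasra.biostat.wisc.edu/api/v01"


def build_samples_url(fmt="json", study=None, and_terms=None, not_terms=None,
                      sampletype=None, species=None, assay=None,
                      limit=None, skip=None):
    """Build a samples query URL (hypothetical helper, not part of MetaSRA)."""
    if study is None and not and_terms:
        raise ValueError("provide a study ID and/or at least one 'and' term")
    params = {
        "study": study,
        "and": ",".join(and_terms) if and_terms else None,
        "not": ",".join(not_terms) if not_terms else None,
        "sampletype": sampletype,
        "species": species,
        "assay": assay,
        "limit": limit,
        "skip": skip,
    }
    # Drop unset arguments; leave ':' and ',' unescaped so ID lists stay readable.
    query = urlencode({k: v for k, v in params.items() if v is not None}, safe=":,")
    return f"{API_BASE}/samples.{fmt}?{query}"


print(build_samples_url(and_terms=["DOID:14566"], species="human", limit=10))
```

Spaces are encoded as +, which is one of the forms the server is documented to accept for sampletype. Passing limit=10 with skip=20 would request the third page of ten studies.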
Returned CSV files have one row per sample, with the following fields:
study_id | SRA study ID |
study_title | Study title |
sample_id | SRA sample ID |
sample_name | Sample name from SRA metadata - not all samples have a sample name. |
sample_type | Computationally-predicted sample type |
sample_type_confidence | Sample type confidence |
mapped_ontology_ids | An ID for each of the most-specific ontology terms mapped to this sample (see note below), comma-separated |
mapped_ontology_terms | Term name for each of the most-specific ontology terms mapped to this sample (see note below), comma-separated |
raw_SRA_metadata | Raw SRA metadata for this sample, except blacklisted fields (see note below), as semicolon-separated "key: value" pairs |
sample_species | Sample species |
assay | Assay type |
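Several of these columns pack multiple values into one field, so a small amount of post-processing is usually needed. The sketch below assumes the CSV has a header row with the field names listed above and uses standard CSV quoting. The sample row is invented for illustration from the placeholder values on this page, and parse_sample_row is a hypothetical helper.

```python
import csv
import io

# Invented example data in the documented column layout; real files may
# differ in quoting or spacing. A header row is assumed.
SAMPLE_CSV = (
    "study_id,study_title,sample_id,sample_name,sample_type,"
    "sample_type_confidence,mapped_ontology_ids,mapped_ontology_terms,"
    "raw_SRA_metadata,sample_species,assay\n"
    "SRP012345,My super fantastic study,SRS0123456,My sample,tissue,"
    '0.9445349,"UBERON:0003100,DOID:14566",'
    '"female organism,disease of cellular proliferation",'
    '"tissue: lung; sex: female",human,RNA-seq\n'
)


def parse_sample_row(row):
    """Split the multi-valued columns of one CSV row into Python structures."""
    term_ids = [t.strip() for t in row["mapped_ontology_ids"].split(",") if t.strip()]
    metadata = {}
    for pair in row["raw_SRA_metadata"].split(";"):
        key, sep, value = pair.partition(":")
        if sep:  # ignore fragments that have no "key: value" separator
            metadata[key.strip()] = value.strip()
    return {
        "sample_id": row["sample_id"],
        "term_ids": term_ids,
        "metadata": metadata,
        "confidence": float(row["sample_type_confidence"]),
    }


samples = [parse_sample_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(samples[0]["term_ids"], samples[0]["metadata"])
```

Only the ID column is split here. Term names and metadata values may themselves contain commas or semicolons, so splitting those fields on the delimiter alone could be unreliable.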
Returned JSON files have this shape:
{
  studyCount: 24, // Total number of studies matching your search (not accounting for limit and skip)
  sampleCount: 170, // Total number of samples matching your search (not accounting for limit and skip)
  terms: [ // Common ontology terms for samples in your search, roughly sorted by frequency
    ...
    {
      sampleCount: 34, // A rough count of matching samples (not counting descendant terms, see note below)
      dterm: {
        name: "female organism", // Term name
        ids: ["UBERON:0003100"] // List of IDs for this term in one or more ontologies
      }
    }
    ...
  ],
  studies: [ // Matching samples are grouped by study
    {
      study: {
        title: "My super fantastic study",
        id: "SRP012345" // SRA study ID
      },
      sampleCount: 22, // Number of samples from this study that match your search
      dterms: [ // All matching terms for samples in this study (see note below)
        {
          name: "Brodmann (1909) area 11",
          ids: ["UBERON:0013528"]
        }
        ...
      ],
      sampleGroups: [ // Samples in each study are grouped by their raw SRA attributes, all being the same
                      // except for a blacklist of ID-like fields (see note below) which can vary.
        {
          samples: [ // List of samples in this group
            {
              id: "SRS0123456", // SRA sample ID
              name: "My sample", // Not all samples have a name
              experiments: [ // Associated SRA experiment and run IDs
                {
                  id: "SRX0123456",
                  runs: ["SRR0123456", "SRR0123457", ...]
                },
                ...
              ]
            }
            ...
          ],
          info: {
            species: "human", // Sample species
            assay: "RNA-seq" // Sample assay type
          },
          attr: [
            ["tissue", "lung"], // [key, value] for raw SRA metadata fields for these samples, excluding blacklist (see note below)
            ...
          ],
          type: {
            type: "tissue", // Sample type, computationally predicted from sample attributes
            conf: 0.9445349 // Sample type confidence
          },
          dterms: [ // Most-specific terms for these samples (see note below)
            {
              name: "disease of cellular proliferation",
              ids: ["DOID:14566"]
            }
            ...
          ]
        }
        ...
      ]
    }
    ...
  ]
}
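The nesting above (studies, then sampleGroups, then samples, then experiments, then runs) can be walked with a few loops. The following is a sketch under the assumption that a response parses into dictionaries and lists with exactly the key names shown. The helpers fetch_json and runs_by_sample are hypothetical, and the example dictionary is a hand-written stand-in built from the placeholder IDs above, not real API output.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch and decode a JSON response (performs a network request)."""
    with urlopen(url) as response:
        return json.load(response)


def runs_by_sample(result):
    """Map each SRA sample ID to its run IDs across all studies and groups."""
    runs = {}
    for study in result.get("studies", []):
        for group in study.get("sampleGroups", []):
            for sample in group.get("samples", []):
                runs[sample["id"]] = [
                    run
                    for experiment in sample.get("experiments", [])
                    for run in experiment.get("runs", [])
                ]
    return runs


# Hand-written stand-in for a parsed response, using placeholder IDs.
example = {
    "studyCount": 1,
    "sampleCount": 1,
    "studies": [{
        "study": {"title": "My super fantastic study", "id": "SRP012345"},
        "sampleGroups": [{
            "samples": [{
                "id": "SRS0123456",
                "name": "My sample",
                "experiments": [
                    {"id": "SRX0123456", "runs": ["SRR0123456", "SRR0123457"]},
                ],
            }],
        }],
    }],
}

print(runs_by_sample(example))
```

With a live server, the same function could be applied to the result of fetch_json on a samples.json URL. The lookups use get with empty defaults so that a study without sample groups does not raise an error.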
Note on terms: Ontology terms are hierarchical; for example, "lung disease" is a descendant of "disease". When you search on a term, MetaSRA includes matches to all of its more-specific descendants, so a search for "disease" returns samples that are labeled with "lung disease". For brevity, however, the results show only the most specific terms for each sample, that is, terms with no descendants in the sample's set of terms: a sample labeled with "lung disease" shows "lung disease" in the results but not "disease".
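The "most specific terms" rule in this note can be sketched as a set operation: drop every term that is an ancestor of another term in the same set. The code below is an illustration of that idea only, not MetaSRA's implementation. The parent map is a toy hierarchy built from the example in the note, and the function names are hypothetical.

```python
def ancestors_of(term, parents):
    """Return every ancestor of term by following parent links transitively."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def most_specific(terms, parents):
    """Keep only the terms that are not an ancestor of another term in the set."""
    terms = set(terms)
    implied = set()
    for term in terms:
        implied |= ancestors_of(term, parents)
    return terms - implied


# Toy hierarchy based on the example in the note; not real ontology data.
PARENTS = {"lung disease": ["disease"], "lung": ["organ"]}
labels = {"disease", "lung disease", "lung"}
print(sorted(most_specific(labels, PARENTS)))  # ['lung', 'lung disease']
```

Here "disease" is dropped because "lung disease" already implies it, while "lung" is kept because none of its descendants appear in the set.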
Note on attributes and sampleGroups: MetaSRA excludes some raw SRA attributes using a blacklist. There is inconsistency in how the fields are used, but the blacklisted fields are generally ID fields that carry no information characterizing the sample. Excluding them means that when samples are grouped by like attributes, the grouping is not interrupted by ID fields (sampleGroups are present in the JSON files, but not in the CSVs). You can view the blacklist at the top of this file.
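To make the grouping idea concrete, the sketch below groups samples whose attributes are identical once blacklisted keys are ignored. It is only an illustration of the principle described in this note. The blacklist entries, sample IDs, and attribute values are hypothetical, and the real blacklist is the one referenced above.

```python
from collections import defaultdict

# Hypothetical blacklist entries; consult the real list referenced above.
BLACKLIST = {"sample id", "patient id"}


def group_samples(samples, blacklist=BLACKLIST):
    """Group sample IDs by their attributes, ignoring blacklisted keys."""
    groups = defaultdict(list)
    for sample_id, attrs in samples.items():
        # A sorted tuple of (key, value) pairs is hashable and order-independent.
        key = tuple(sorted((k, v) for k, v in attrs.items() if k not in blacklist))
        groups[key].append(sample_id)
    return dict(groups)


# Invented samples: the first two differ only in a blacklisted ID field.
samples = {
    "SRS1": {"tissue": "lung", "sample id": "A1"},
    "SRS2": {"tissue": "lung", "sample id": "A2"},
    "SRS3": {"tissue": "liver", "sample id": "A3"},
}
print(group_samples(samples))
```

Without the blacklist, every sample here would fall into its own group because the ID field is unique per sample.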
To query ontology terms used by MetaSRA, you can fetch this URL as JSON: http://metasra.biostat.wisc.edu/api/v01/terms?
This resource only returns terms that are associated with at least one sample in MetaSRA.
You can use any combination of these arguments to filter terms:
q | Search string: return terms whose names resemble this argument, sorted by relevance. |
ids | Comma-separated list of ontology term IDs. Return terms matching any of these IDs. |
limit | Only return up to this many terms. The limit cannot exceed 500, and defaults to 500 if none is provided. |
Returned JSON has this shape:
{
  terms: [
    {
      name: "tetrapod frontal bone", // Term name
      ids: ["UBERON:0000209"], // List of IDs for this term in one or more ontologies
      syn: "frontal, frontal bone, os frontal, os frontale", // Comma-separated list of synonyms
      ancestors: [ // Jumble of less-specific (ancestor) related terms (at radius one or two)
        {
          name: "dermal bone",
          ids: ["UBERON:0001474"]
        }
        ...
      ],
      descendants: [...] // List of more-specific (descendant) related terms (at radius one or two)
    }
    ...
  ]
}
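A terms query can be assembled and its response summarized along the following lines. This is a hedged sketch: build_terms_url and summarize_terms are hypothetical helpers, the only rule enforced is the documented maximum limit of 500, and the example response is hand-written from the placeholder term above rather than fetched from the server.

```python
from urllib.parse import urlencode

TERMS_URL = "http://metasra.biostat.wisc.edu/api/v01/terms"
MAX_LIMIT = 500  # documented upper bound for the limit argument


def build_terms_url(q=None, ids=None, limit=None):
    """Build a terms query URL (hypothetical helper, not part of MetaSRA)."""
    if limit is not None and limit > MAX_LIMIT:
        raise ValueError(f"limit cannot exceed {MAX_LIMIT}")
    params = {}
    if q:
        params["q"] = q
    if ids:
        params["ids"] = ",".join(ids)
    if limit is not None:
        params["limit"] = limit
    # Leave ':' and ',' unescaped so ID lists stay readable.
    return f"{TERMS_URL}?{urlencode(params, safe=':,')}"


def summarize_terms(result):
    """Map each term name to its IDs and a list of synonyms."""
    summary = {}
    for term in result.get("terms", []):
        synonyms = [s.strip() for s in (term.get("syn") or "").split(",") if s.strip()]
        summary[term["name"]] = {"ids": term["ids"], "synonyms": synonyms}
    return summary


# Hand-written stand-in for a parsed response, based on the shape above.
example = {"terms": [{
    "name": "tetrapod frontal bone",
    "ids": ["UBERON:0000209"],
    "syn": "frontal, frontal bone, os frontal, os frontale",
}]}

print(build_terms_url(q="frontal bone", limit=5))
print(summarize_terms(example))
```

Checking the limit on the client side is a design choice made for this sketch, since the page does not say how the server responds to a larger value.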