Normalized metadata for the Sequence Read Archive

MetaSRA is an annotation/re-coding of sample-specific metadata in the Sequence Read Archive using biomedical ontologies. Currently, MetaSRA labels biological samples with terms in the Disease Ontology, Experimental Factor Ontology, Cell Ontology, Uberon, and Cellosaurus.

Frequently asked questions

How often is the MetaSRA updated?

MetaSRA obtains its raw metadata from the SRAdb project and is updated whenever SRAdb is updated. It is also updated when improvements are made to the computational pipeline that performs the annotation.

What species does the MetaSRA label?

MetaSRA contains annotated data for both human and mouse samples. Future releases will include more species.

What assays does the MetaSRA label?

MetaSRA contains annotated data for both RNA-seq and ChIP-seq samples. Future releases will include more assays.

How were the samples labeled?

Each sample's raw metadata was annotated using a custom automated computational pipeline. This pipeline is described in detail in the publication.

Where can I find processed expression data for MetaSRA search results?

Processed expression data for RNA-seq samples returned by a MetaSRA query can be downloaded from refine.bio. Click the "Download" link, then click the "Generate refine.bio URL" button. The refine.bio URL will appear in the box below the button; generating it may take several minutes, depending on the size of the dataset. Click the link, or copy and paste it, to go to refine.bio. There, click the "Move to Dataset" button; a "Download" button will then appear that lets you download the dataset.