Overview
The OSDR biological data API is an application programming interface enabling granular access to GeneLab and ALSDA data. It comprises:
- a REST interface, which exposes a higher-level, structured view of the objects in the database, with JSON being the primary output format;
- a query interface, which allows for complex querying and filtering of both the metadata and the data; outputs are provided as tables (CSV/TSV, and tables converted to JSON).
TL;DR
Organization of datasets, assays, samples, and their metadata as presented via the REST interface:

- id.accession
- id.sample%20name=Mmus_C57-6J_LVR_GC_I_Rep1_M31
- investigation.ontology%20source%20reference
- study.characteristics
- study.characteristics.strain
- study.factor%20value.spaceflight!=basal%20control
- file.data%20type=pca
- etc.
- column.*
- column.ENTREZID
- column.Mmus_C57-6J_LVR_GC_I_Rep1_M31!=0
- etc.
TL;DR: examples
Hierarchical metadata
For example, if a certain sample has been assayed with microscopy as well as RNA-Seq, the metadata describing such a sample can be found under the respective assays accordingly (note that any sample may be associated with some, but not all, assays in a dataset). E.g. in order to request the ISA (investigation, study, assay) metadata of the OSD-48 sample Mmus_C57-6J_LVR_GC_I_Rep1_M31 that represents it as an analysis object in a microscopy assay, the following hierarchy is applicable: and, respectively, for the same sample, but in an RNA-Seq assay:
Note: the syntax of the two example links is discussed further down in the REST interface section. The REST interface also serves as a means to identify exact assay and sample names of interest.
On-demand metadata combinations
This is exemplified by the two links above producing JSONs that share identical sections (investigation, study) and only differ in the assay section.
Indeed, the same section may be repeated in full across outputs for multiple samples; however, this also means that an output for one sample in one assay of one dataset will always contain complete relevant metadata.
The REST interface
Metadata REST endpoints
File records REST endpoints
REST syntax extensions
The query interface
All query endpoints accept an optional parameter format (see Output formats). If omitted, the format defaults to CSV.
Sample-level metadata query endpoints
Providing fields as period-separated keys, optionally paired with values, as components of a GET request (e.g. study.characteristics.strain=S288C) allows filtering by the values of multiple metadata fields at once. The GET request syntax follows these conventions:
Format example | Meaning |
study.characteristics.strain | Include this field in the output, regardless of its value, for all implicated samples. |
=study.characteristics.strain | Only include samples that have this field annotated with a non-null (non-NaN) value; also include the field itself in the output. |
study.characteristics.strain=S288C | Only include samples whose value of this field is equal to the provided value; also include the field itself in the output. |
study.characteristics.strain=S288C|BY4743 | Only include samples whose value of this field is equal to either of the provided values (separated by a vertical pipe, i.e. a logical "OR"); also include the field itself in the output. |
study.characteristics.strain!=S288C | Only include samples whose value of this field is not equal to the provided value; still include the field itself in the output. Note that this excludes null (NaN) values, since NaNs are not equal to anything (not even to themselves) by definition. |
study.characteristics.strain=/^BY\d+$/, study.characteristics.strain=/^BY\d+$/i, study.characteristics.strain=/^BY\d+$/c | Only include samples whose value of this field matches the provided regular expression (in this case: ^BY\d+$, i.e. a leading "BY" followed by a number); also include the field itself in the output. The /i flag invokes case-insensitive matching, while the /c flag enforces case sensitivity. Note: the flags override the behavior.matchcase modifier. |
study.characteristics | Include all fields in the given section that are present for any of the samples in the request. This wildcard syntax is applicable to any ISA field starting from the 2nd level (e.g., assay.parameter value, study.factor value, investigation.study assays, etc.) |
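The conventions in the table above map fairly directly onto string assembly. The following is a minimal sketch, assuming that spaces are percent-encoded as %20 and that regex delimiters and the OR pipe may stay unencoded, as they do in the examples on this page. The helpers `component` and `build_query` are hypothetical names introduced here for illustration; they are not part of the API.

```python
from urllib.parse import quote


def component(field, value=None, *, negate=False, require=False):
    """Encode one GET component (hypothetical helper, not part of the API)."""
    key = quote(field, safe="*")  # spaces become %20; periods stay as-is
    if value is None:
        # "=field" requires a non-null value; bare "field" only adds the column.
        return f"={key}" if require else key
    if isinstance(value, (list, tuple)):
        value = "|".join(value)  # a vertical pipe acts as a logical OR
    operator = "!=" if negate else "="
    # Keep regex delimiters and the OR pipe readable, as in the examples here.
    return f"{key}{operator}{quote(value, safe='/|^$')}"


def build_query(path, components):
    """Join already-encoded components into a relative query URL."""
    return f"{path}?{'&'.join(components)}"


url = build_query(
    "/v2/query/metadata/",
    [
        component("id.accession", "OSD-509", negate=True),
        component("study.characteristics.organism", "/Saccharomyces/"),
        component("study.characteristics.strain", "S288C", negate=True),
        component("study.parameter value.growth temperature", require=True),
        component("study.factor value.spaceflight"),
    ],
)
print(url)
```

Characters outside the assumed safe set (for instance the backslash and plus sign in a pattern such as /^BY\d+$/) are percent-encoded by `quote`, which a server would normally decode back to the original pattern.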
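Example usage
Revisiting one of the examples above, if one were interested in
- samples from any dataset other than OSD-509,
- whose organism annotation matches "Saccharomyces",
- whose strain is not S288C,
- that have a growth temperature annotated,
- along with their spaceflight factor values,
then, respectively, these URL components would be used:
- id.accession!=OSD-509
- study.characteristics.organism=/Saccharomyces/
- study.characteristics.strain!=S288C
- =study.parameter%20value.growth%20temperature
- study.factor%20value.spaceflight
resulting in the following URL:
/v2/query/metadata/?id.accession!=OSD-509&study.characteristics.organism=/Saccharomyces/&study.characteristics.strain!=S288C&=study.parameter%20value.growth%20temperature&study.factor%20value.spaceflight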
Naturally, if one also wanted to include the names of all files associated with each of these samples, and to constrain the output to only those files whose names end in "csv", an additional component would be added:
- file.file%20name=/csv$/
resulting in the following URL:
/v2/query/metadata/?id.accession!=OSD-509&study.characteristics.organism=/Saccharomyces/&study.characteristics.strain!=S288C&=study.parameter%20value.growth%20temperature&study.factor%20value.spaceflight&file.file%20name=/csv$/
Note that in the latter case, since the tabular format allows only one entry per row, but each sample may have multiple CSV files associated with it, a sample can take up multiple rows that differ only in the file.file name column.
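When a sample spans several rows that differ only in the file.file name column, a client may want to fold those rows back into one entry per sample. The sketch below does this with the standard csv module. The CSV text, the accession OSD-0, and the sample and file names are made-up placeholders, and a single flat header row is an assumption; the real output layout may differ.

```python
import csv
import io

# Made-up stand-in for a CSV response; not real OSDR data.
SAMPLE_CSV = """id.accession,id.sample name,file.file name
OSD-0,sample_A,a_counts.csv
OSD-0,sample_A,a_pca.csv
OSD-0,sample_B,b_counts.csv
"""


def collapse_file_rows(csv_text, file_column="file.file name"):
    """Group rows that are identical except for the file name column."""
    grouped = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        filename = row.pop(file_column)
        key = tuple(row.items())  # all remaining columns identify the sample
        grouped.setdefault(key, []).append(filename)
    return [(dict(key), files) for key, files in grouped.items()]


for sample, files in collapse_file_rows(SAMPLE_CSV):
    print(sample["id.sample name"], files)
```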
Assay-grouped metadata query endpoints
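Example usage
Revisiting one of the examples above, if one were interested in
- the technology type of each assay,
- restricted to entries with a non-null organism annotation,
- and a non-null spaceflight factor value,
then, respectively, these URL components would be used:
- investigation.study%20assays.study%20assay%20technology%20type
- =study.characteristics.organism
- =study.factor%20value.spaceflight
resulting in the following URL:
/v2/query/assays/?investigation.study%20assays.study%20assay%20technology%20type&=study.characteristics.organism&=study.factor%20value.spaceflight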
Data query endpoints
If the data type is understood by the API as being tabular, additional GET key-value pairs can be provided, addressing columns as column.COLUMN_NAME and otherwise following the same conventions as those outlined for sample-level metadata. Note: by default, providing any column query component constrains the output to only the requested column(s); to display all columns regardless, use the wildcard column.*.
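The column selection rule can be illustrated with a small local emulation. This is only a sketch of the rule as described, not the API's implementation: the function name `selected_columns` is hypothetical, the column names other than ENTREZID are placeholders, and preserving the table's original column order is an assumption.

```python
def selected_columns(available, query_keys):
    """Return the columns that would be shown for a set of query keys.

    Hypothetical emulation of the documented rule:
    - no column.* components at all: every column is shown;
    - any column.NAME component: only the requested columns are shown;
    - column.* present: every column is shown regardless.
    """
    prefix = "column."
    requested = [key[len(prefix):] for key in query_keys if key.startswith(prefix)]
    if not requested or "*" in requested:
        return list(available)
    return [name for name in available if name in requested]


columns = ["ENTREZID", "sample_1", "sample_2"]  # placeholder column names
print(selected_columns(columns, ["id.accession"]))
print(selected_columns(columns, ["column.ENTREZID"]))
print(selected_columns(columns, ["column.ENTREZID", "column.*"]))
```

The third call shows why the wildcard matters: a filter on one column would otherwise hide all the others.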
Query interface modifiers
Output formats
Category | REST endpoints | Query endpoints | Notes | Examples |
default | json | csv | see conventions below | REST, query |
interactive | html | html | see conventions below | REST, query |
raw | | raw | only for query data endpoints: retrieves the original data file | query |
alternative | | tsv | see conventions below | query |
alternative | | json.split | format: {"columns": […], "data": [[…], …]} | query |
alternative | | json.records | format: [{"field": value, …}, …] | query |
alternative | | json.table | format: {"schema": …, "data": [{"field": value, …}, …]} | query |
auxiliary | browser | browser | resolves to html if possible to visualize; to raw otherwise | |
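The json.split and json.records layouts carry the same table in two shapes, so one can be derived from the other on the client side. The sketch below converts between them, assuming that "columns" is a flat list of field names (the real output may nest or structure them differently). The payload is a made-up placeholder and both function names are hypothetical.

```python
import json


def split_to_records(payload):
    """Turn {"columns": [...], "data": [[...], ...]} into a list of dicts."""
    columns = payload["columns"]
    return [dict(zip(columns, row)) for row in payload["data"]]


def records_to_split(records):
    """Inverse conversion; column order is taken from the first record."""
    columns = list(records[0]) if records else []
    return {
        "columns": columns,
        "data": [[record.get(name) for name in columns] for record in records],
    }


# Made-up payload in the json.split shape; not real OSDR data.
payload = json.loads(
    '{"columns": ["id.accession", "study.characteristics.strain"],'
    ' "data": [["OSD-0", "strain_X"], ["OSD-0", "strain_Y"]]}'
)
records = split_to_records(payload)
print(records)
```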
Output format conventions