LUMA Data Extractor

API documentation

Data can be extracted automatically using the LUMA RESTful interface. As with the main extractor, this requires a personal access token.

Dataset metadata

A description of the dataset(s) available to your access token is obtained via the endpoint at:
http://www.urban-climate.net/metadata_public/datasets.php?userId=<token>

The output is a JSON list of dataset objects. An example that contains one dataset is shown below:

[
    {
    "label": "24t34gweg",
    "userId": "<token>",
    "maxRequestSize": "44",
    "siteId": "KSSW",
    "instId": "CSAT3",
    "outDefId": "CSAT3_ECpack",
    "varId": null,
    "timeRes": "30min",
    "levelId": "1",
    "startDate": {
      "date": "2017-05-01 09:35:01.000000",
      "timezone_type": 3,
      "timezone": "UTC"
    },
    "endDate": {
      "date": "2017-05-31 09:35:01.000000",
      "timezone_type": 3,
      "timezone": "UTC"
    },
    "instSuffix": null,
    "vars": [
      [
        "u",
        "zonal_wind_component",
        "m.s-1"
      ],
      [
        "v",
        "meridional_wind_component",
        "m.s-1"
      ],
      [
        "w",
        "Vertical wind component",
        "m.s-1"
      ],
      [
        "Tsonic",
        "sonic_temperature",
        "Celsius"
      ],
      [
        "Q_H",
        "upward_turbulent_sensible_heat_flux",
        "W.m-2"
      ]
    ],
    "deployments": [
      {
        "serial": "0194-2",
        "easting": "0.0",
        "northing": "0.0",
        "height": "52.9",
        "yaw": "210.0",
        "pitch": "",
        "roll": null,
        "levelNum": "1",
        "deploymentId": "391",
        "instId": "CSAT3",
        "fileIdentifier": "CSAT3_ECpack_%SITE",
        "suffix": "B",
        "varId": "nSamples",
        "defId": "CSAT3_ECpack",
        "siteId": "KSSW",
        "type": "ec",
        "friendlyName": "EC - check sonic serieal",
        "startDate": "2016-09-15 00:00",
        "endDate": null,
        "dailyProcessing": "1",
        "outputFile": "%Y\/London\/L{levelNum}\/KSSW\/DAY\/%DOY\/CSAT3_ECpack_%SITE",
        "timeRes": [
          "30min"
        ]
      }
    ],
    "instrumentSuffixes": [
      "B"
    ]
  }]

Field descriptions

label: The unique identifier of the dataset
userId: The token of the user requesting the data
maxRequestSize: The maximum number of rows that the user can download per request (null means unlimited)
siteId: The measurement site at which the data was obtained
instId: The type of instrument
timeRes: Time resolution string of dataset
levelId: The QAQC level of the data (lowest is 0)
startDate: Earliest allowed date (null if no limit is set)
endDate: Latest allowed date (null if no limit is set)
instSuffix: Recommended instrument suffix: the administrator recommends that only deployments (see below) involving an instrument with this suffix be used for extraction. Null if no recommendation was made.
vars: List of variables available from this dataset. Each variable has [identifier, description, units]
deployments: The list of instrument deployments that serves this dataset (see below)
instrumentSuffixes: For convenience, a summary of all the instrument suffixes that appear in the deployments list
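
As a minimal sketch of how the metadata might be consumed from Python (standard library only), the functions below build the metadata URL, download the JSON list and reduce each dataset to its site, instrument and variable identifiers. The function names are hypothetical and not part of the LUMA API; the field names are taken from the example above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

METADATA_URL = "http://www.urban-climate.net/metadata_public/datasets.php"


def metadata_url(token):
    """Build the metadata URL for a personal access token."""
    return METADATA_URL + "?" + urlencode({"userId": token})


def summarise(datasets):
    """Map each dataset label to its site, instrument and variable identifiers."""
    return {
        d["label"]: {
            "site": d["siteId"],
            "instrument": d["instId"],
            # Each entry of "vars" is [identifier, description, units].
            "variables": [v[0] for v in d["vars"]],
        }
        for d in datasets
    }


def fetch_datasets(token):
    """Download and decode the dataset list (performs a network request)."""
    with urlopen(metadata_url(token)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling summarise(fetch_datasets("<token>")) with a valid token should then give one entry per dataset available to that token.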

Deployments list

A dataset may cover a long period of time, during which instruments are moved or replaced. A deployment contains the metadata covering a period when an instrument with a specific serial number is placed at a location (relative to the site position) and makes measurements. A new deployment is started if the instrument is reconfigured, replaced or adjusted.

If a deployment consists of multiple instruments working in tandem (e.g. latent heat fluxes using eddy covariance require two instruments), then the deployment will contain as many entries as there are instruments.

serial: Instrument serial number - defines the physical instrument being used
easting: Instrument location - metres east of site co-ordinates
northing: Instrument location - metres north of site co-ordinates
height: Instrument height - metres above sea level (unless soil or indoors)
yaw: Yaw (degrees clockwise from north; null if empty)
pitch: Pitch (degrees; 180=up, 0=down, 90=horizontal; null if empty)
For instruments that are pointed at something (e.g. scintillometer or camera):
- roll: Roll (degrees clockwise around line of sight axis; null if empty)
- target: Target site or description
- targetHeight: Height of target (m above sea level)
- targetDistance: Distance to target (m)
For indoor sensors:
- storey: Building storey
- roomType: Room type
- distToWindow: Distance to nearest window (m)
levelNum: QAQC level number (0 is lowest)
instId: Instrument identifier (instrument type)
suffix: Instrument suffix: Linked to serial number for simpler description
siteId: As above
startDate: Start date of deployment YYYY-mm-dd HH:MM
endDate: End date of deployment YYYY-mm-dd HH:MM (null if still running)
timeRes: The time resolution(s) available - note that the dataset time resolution dictates this
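
Because a dataset can span several deployments, it is often useful to find which deployment entries were active at a given time. The sketch below does this with the startDate and endDate fields described above. The function name is hypothetical, and treating endDate as inclusive is an assumption rather than documented behaviour.

```python
from datetime import datetime

# Deployment dates are given as YYYY-mm-dd HH:MM.
DATE_FORMAT = "%Y-%m-%d %H:%M"


def active_deployments(deployments, when):
    """Return the deployment entries whose period covers the datetime `when`."""
    active = []
    for dep in deployments:
        start = datetime.strptime(dep["startDate"], DATE_FORMAT)
        # A null endDate means the deployment is still running.
        end = dep.get("endDate")
        still_running = end is None
        # Assumption: the end date itself counts as part of the deployment.
        if start <= when and (still_running or when <= datetime.strptime(end, DATE_FORMAT)):
            active.append(dep)
    return active
```

More than one entry can be returned for the same time, for example when several instruments work in tandem.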

Downloading data

Data is downloaded from the following endpoint: http://data.urban-climate.net/LUMA/dataset/<dataset_id>/get_data

Query variables required:

dataset_id: Unique identifier for dataset
token: User token
start_date: Earliest date for extraction (YYYY-mm-dd-HH:MM:SS)
end_date: Latest date for extraction (YYYY-mm-dd-HH:MM:SS)
instSuffix: Defines which instrument suffix to use in extraction. Mandatory if more than 2 instrument suffixes are present in the dataset; optional otherwise (see "Notes on extractor behaviour" below)
var: Measurement variable(s) to extract. Submit multiple var query variables to add more variables (e.g. ?var=Tair&var=RH for both Tair and RH)
missing: Values to denote missing data. Can be -999, blank, or x
data_format: Output format: Can be html, json, or csv
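
A request URL can be assembled from these query variables as sketched below, using only the Python standard library. The function name and its defaults are hypothetical. The source lists dataset_id both in the endpoint path and among the query variables, so this sketch sends it in both places; whether the query copy is needed is not confirmed here.

```python
from datetime import datetime
from urllib.parse import urlencode

BASE_URL = "http://data.urban-climate.net/LUMA/dataset/{dataset_id}/get_data"
# start_date and end_date use YYYY-mm-dd-HH:MM:SS.
DATE_FORMAT = "%Y-%m-%d-%H:%M:%S"


def download_url(dataset_id, token, start, end, variables,
                 inst_suffix=None, missing="-999", data_format="csv"):
    """Build a get_data URL; `start` and `end` are datetime objects."""
    params = [
        ("dataset_id", dataset_id),
        ("token", token),
        ("start_date", start.strftime(DATE_FORMAT)),
        ("end_date", end.strftime(DATE_FORMAT)),
    ]
    if inst_suffix:
        params.append(("instSuffix", inst_suffix))
    # One "var" query variable per requested measurement variable.
    params.extend(("var", name) for name in variables)
    params.append(("missing", missing))
    params.append(("data_format", data_format))
    # urlencode percent-encodes the colons in the dates (":" becomes "%3A").
    return BASE_URL.format(dataset_id=dataset_id) + "?" + urlencode(params)


if __name__ == "__main__":
    print(download_url("24t34gweg", "<token>",
                       datetime(2017, 5, 1), datetime(2017, 5, 2),
                       ["u", "v"], data_format="json"))
```

Passing the parameters as a list of pairs, rather than a dictionary, is what allows the var key to be repeated.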

Notes on extractor behaviour

If the start_date and end_date variables produce a longer time series than maxRequestSize, the output will be truncated before end_date is reached.
Metadata is only provided in the json output format.
instSuffix query variable:
- If only 1 instrument suffix is involved in the dataset, instSuffix can be blank.
- If 2 instrument suffixes are involved in the dataset and the instruments operate simultaneously, instSuffix can be blank. If blank, a continuous time series is synthesised from the two instruments, and the administrator's recommendation decides which instrument gets priority when both are available (an error is raised if no recommendation was made). If not blank, only data from the specified instrument is downloaded.
- If more than 2 instrument suffixes are involved in the dataset, instSuffix must be specified. Only data from this instrument is downloaded.
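
The instSuffix rules above can be checked on the client before a request is sent, using the instrumentSuffixes and instSuffix fields of the dataset metadata. The following is a sketch only: the function name is hypothetical, it does not test whether two instruments actually operate simultaneously, and rejecting a suffix that is absent from instrumentSuffixes is an assumption about what the server would do.

```python
def check_inst_suffix(dataset, inst_suffix=None):
    """Return the instSuffix to send (None for blank) or raise ValueError."""
    suffixes = dataset.get("instrumentSuffixes") or []
    if inst_suffix:
        # Assumption: a suffix not listed for the dataset cannot be extracted.
        if inst_suffix not in suffixes:
            raise ValueError("instSuffix %r is not one of %r" % (inst_suffix, suffixes))
        return inst_suffix
    if len(suffixes) <= 1:
        # Only one instrument suffix: instSuffix can be blank.
        return None
    if len(suffixes) == 2:
        # Blank is allowed only if the administrator made a recommendation.
        if dataset.get("instSuffix") is None:
            raise ValueError("no recommended suffix: specify instSuffix")
        return None
    # More than two suffixes: instSuffix must be specified.
    raise ValueError("more than 2 instrument suffixes: specify instSuffix")
```

For the example dataset shown earlier, which has the single suffix "B", the function returns None and the request can be made without instSuffix.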