How to access data from the ONS beta API

Introduction

Over the past 12 months we’ve had a lot of interest from our users in accessing our data through our Beta API to build automated charts, apps and tools.

A group from Cambridge University built a coronavirus deaths tracker using our Beta service. In July 2020 ONS published a subnational ageing tool that also uses this service.

We have data on a variety of topics available, covering the likes of inflation, unemployment and well-being. More are added regularly.

This post introduces our Beta service, focusing on the API, and goes into detail about how users can get the most out of it.

What is the ONS Beta service?

Our Beta service is called Customise My Data (CMD). It takes data we publish at ONS and makes it available in a more open, useful format for our users. It runs alongside our regular publications via our release calendar.

CMD comes in two parts: the filter journey and the Application Programming Interface (API).

The filter journey lets you build a custom download through your browser. For example, let’s say you want to know the number of people in their thirties in Cardiff. You can use our population filter to generate a custom spreadsheet containing just that data, which you can download as an XLS or CSV file.

For users with no particular background in programming, this is likely to be the best way to access CMD. A good place to dive into our data is our local statistics page: scroll down to the datasets listed under ‘Customise My Data’.

Our API is a system that allows users to make more advanced requests for this same data.

It’s designed for more technically minded users who want data in a consistent, structured format. You can work with our API data in a variety of programming languages, such as Python, JavaScript and R.

Access to the API is free with no registration required. It returns data in a JSON format. There are some rate limits: please see the Frequently Asked Questions at the bottom of this post.

For users who just want to filter and access data quickly, with no special technical knowledge, please try filtering the datasets through your browser.

Getting started with the API

Our datasets endpoint lists all the datasets we have available on the beta API.
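
As a minimal sketch in Python (standard library only), the snippet below fetches the datasets endpoint and lists what it finds. The URL is the root of the example queries used later in this post. The "items", "id" and "title" keys are assumptions about the response layout, so please check them against the JSON you get back.

```python
import json
from urllib.request import urlopen

# Root of the example query URLs shown later in this post.
DATASETS_URL = "https://api.beta.ons.gov.uk/v1/datasets"


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def dataset_summaries(payload):
    """Return (id, title) pairs from a datasets response.

    Assumption: datasets sit in an "items" list and each item has
    "id" and "title" keys. Missing keys come back as None.
    """
    return [(item.get("id"), item.get("title")) for item in payload.get("items", [])]
```

Calling dataset_summaries(fetch_json(DATASETS_URL)) should then give one (id, title) pair per dataset returned.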

We will take Consumer Prices Index including owner occupiers’ housing costs (CPIH) as an example dataset and walk through it.

Navigating to the CPIH dataset endpoint will return some JSON metadata that tells you some relevant information about the dataset, such as the release_frequency (monthly) and the next_release (17 February 2021 at the time of writing).
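
A short Python sketch of reading those two fields is shown here. It assumes the dataset ID is cpih01 (as in the example queries later in this post) and that release_frequency and next_release are top-level keys in the metadata. The function names are our own, for illustration only.

```python
import json
from urllib.request import urlopen

API_ROOT = "https://api.beta.ons.gov.uk/v1"


def dataset_url(dataset_id):
    """Build the endpoint URL for one dataset, for example cpih01."""
    return f"{API_ROOT}/datasets/{dataset_id}"


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def release_info(metadata):
    """Pick the release fields out of a dataset metadata dict.

    Assumption: both fields are top-level keys. A missing key gives None.
    """
    return {
        "release_frequency": metadata.get("release_frequency"),
        "next_release": metadata.get("next_release"),
    }
```

For CPIH you would call release_info(fetch_json(dataset_url("cpih01"))).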

All datasets on the API have at least one edition and one version. A version is an update to the data, usually the next release in a time series.

An edition contains all the versions that fit together. We may have to start a new edition if there is a significant change to the data structure such as a change in geography or a revision to an official classification.

We can see from the editions endpoint for CPIH that it has just one edition, called time-series. This time-series has a latest_version which at the time of writing was number 4.
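
To find the latest version in code, something like the following may help. It is a sketch only: it assumes the editions response lists editions under "items", each with an "edition" name and a "links" object holding "latest_version" with an "href". Those key names are assumptions to verify against the real response.

```python
def latest_version_links(editions_payload):
    """Map each edition name to the URL of its latest version.

    Assumed layout (check against the real JSON):
    {"items": [{"edition": ..., "links": {"latest_version": {"href": ...}}}]}
    """
    result = {}
    for item in editions_payload.get("items", []):
        latest = item.get("links", {}).get("latest_version", {})
        result[item.get("edition")] = latest.get("href")
    return result
```

With the CPIH editions response, we would expect a single entry keyed by time-series.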

How data is structured in the API

Before we start to query the data itself it will help to understand how data is structured in the API.

Data is held in a tidy data format. This means:

  • Each observation (or value) is a row in the dataset.
  • Each variable (also known as a dimension) is a column in the dataset.

All datasets in the API have a time and a geography dimension plus one or more additional dimensions.

In the CPIH example, the time dimension is a range of months stretching (currently) from 1988 to 2021, and the geography dimension has just one option: United Kingdom. We also have an aggregate dimension that contains the various goods and services covered in the dataset. To make a request for data, users must specify an option from every dimension in the dataset.
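
To make the tidy layout concrete, here is one CPIH row written as a Python dict, using the options and value from the example query later in this post. The helper is a hypothetical convenience of our own that checks a selection names an option for every dimension before you build a request.

```python
# One tidy row: a single value plus one option for each dimension.
row = {
    "time": "Apr-20",
    "geography": "K02000001",
    "aggregate": "cpih1dim1G100000",
    "value": 117.5,
}

# The three CPIH dimensions described in this post.
CPIH_DIMENSIONS = ("time", "geography", "aggregate")


def missing_dimensions(selection, dimensions=CPIH_DIMENSIONS):
    """Return the dimensions that have no option chosen yet."""
    return [name for name in dimensions if not selection.get(name)]
```

An empty list from missing_dimensions means the selection is complete. Anything else names the dimensions still to be filled in.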

How to query the API

There are three ways to query the ONS API:

  1. Download the entire dataset.
  2. Query an observation.
  3. Filter a dataset for more advanced queries.

To download the entire CPIH dataset, navigate to the latest_version URL. From here you will see the CSV or XLS download link. You can read it directly into a program written in a language such as Python, or paste the URL into a web browser to download it automatically.
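
Reading the CSV from Python might look like the sketch below. It assumes the version metadata holds its links under a "downloads" key (for example downloads, then csv, then href) and that the file is UTF-8 encoded. Both are assumptions to check against the metadata you receive.

```python
import csv
import io
from itertools import islice
from urllib.request import urlopen


def download_href(version_metadata, fmt="csv"):
    """Find the download link for a format ("csv" or "xls").

    Assumed layout: {"downloads": {"csv": {"href": ...}}}.
    Returns None when the link is not present.
    """
    downloads = version_metadata.get("downloads", {})
    return downloads.get(fmt, {}).get("href")


def first_rows(csv_url, count=5):
    """Stream the CSV and return its first few rows as lists of strings."""
    with urlopen(csv_url, timeout=60) as response:
        text = io.TextIOWrapper(response, encoding="utf-8", newline="")
        return list(islice(csv.reader(text), count))
```

Streaming only the first rows is a quick way to inspect the column layout before loading the whole file.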

The second way to query the API is the observations endpoint. This allows you to query an observation or several observations of data. To write our query we will need to know what dimensions are contained within the CPIH dataset and select at least one option from each.

To do this, take the latest_version URL and add /dimensions.

This will show you a list of dimensions included in this data. In our case we have three:

  1. Time
  2. Geography
  3. Aggregate
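
In Python, building that URL and reading the dimension names could be sketched as follows. The "items" and "name" keys are assumptions about the response layout.

```python
def dimensions_url(version_url):
    """Add /dimensions to a version URL."""
    return version_url.rstrip("/") + "/dimensions"


def dimension_names(payload):
    """List dimension names, assuming {"items": [{"name": ...}, ...]}."""
    return [item.get("name") for item in payload.get("items", [])]
```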

To find out what options are available for each dimension, choose one and add it to your request, followed by /options. For example, here are all the valid time options for this dataset.
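
A sketch of the same step in Python is below. We assume here that the dimension sits under /dimensions/ in the path and that each item in the response carries its code under an "option" key. Please confirm both against the developer documentation.

```python
def options_url(version_url, dimension):
    """Build the options URL for one dimension of a version.

    Assumed path shape: <version>/dimensions/<dimension>/options
    """
    return f"{version_url.rstrip('/')}/dimensions/{dimension}/options"


def option_codes(payload):
    """List option codes, assuming {"items": [{"option": ...}, ...]}."""
    return [item.get("option") for item in payload.get("items", [])]
```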

Pick the option you want from each dimension and make a note of it. Once you have an option for every dimension, you can put them together in a query to the observations endpoint. This can be found by putting /observations after the latest_version.

A valid observation query therefore looks like this:

https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions/time-series/versions/4/observations?time=Apr-20&geography=K02000001&aggregate=cpih1dim1G100000

This will return an observation of 117.5.
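
The query above can be put together in Python like this. The URL building follows the example exactly. The layout assumed when reading the response (an "observations" list whose items have an "observation" key) is our assumption, so check it against the JSON returned.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

VERSION_URL = "https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions/time-series/versions/4"


def observations_url(version_url, **options):
    """Build an observations query with one option per dimension."""
    return f"{version_url}/observations?{urlencode(options)}"


def observation_values(payload):
    """Pull raw values out of a response.

    Assumed layout: {"observations": [{"observation": ...}, ...]}.
    Values are returned as received, without type conversion.
    """
    return [item.get("observation") for item in payload.get("observations", [])]


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)
```

For example, observations_url(VERSION_URL, time="Apr-20", geography="K02000001", aggregate="cpih1dim1G100000") rebuilds the URL shown above, ready to pass to fetch_json.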

You can also use one wildcard operator, *, in your call. This will return all options for that dimension. This call returns all months covered by time for this aggregate:

https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions/time-series/versions/4/observations?time=*&geography=K02000001&aggregate=cpih1dim1G100000

Please note that the wildcard operator is limited to one dimension per query.
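
If you build queries in code, it can help to enforce the wildcard rule before sending the request. The function below is a sketch of our own, not part of the API. It raises an error when more than one dimension is set to * and otherwise builds the query, leaving the * unescaped as in the example above.

```python
from urllib.parse import urlencode

VERSION_URL = "https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions/time-series/versions/4"


def wildcard_observations_url(version_url, **options):
    """Build an observations query that allows at most one wildcard."""
    wildcards = [name for name, value in options.items() if value == "*"]
    if len(wildcards) > 1:
        raise ValueError(f"only one wildcard allowed, got: {', '.join(wildcards)}")
    # safe="*" keeps the wildcard as a literal * rather than %2A.
    return f"{version_url}/observations?{urlencode(options, safe='*')}"
```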

If you require larger slices of data there is a more advanced ‘filter a dataset’ functionality. This requires POST requests rather than the GET requests we have been covering in this article. Please see the developer documentation for more details.

Frequently Asked Questions (FAQs)

When is the API updated?

The API is updated as soon as possible after publication of the same data on the main ONS release calendar. Currently we don’t have the ability to publish the data simultaneously at 07:00 or 09:30 along with our bulletins. There is a time lag that varies from dataset to dataset, which we aim to keep to a minimum.

Are there any limits on API use?

Yes. Use of the API is limited to 120 requests per 10 seconds and 200 requests per minute. Any traffic above these limits will be blocked for one minute and then allowed to continue. If this happens, the user will see a JSON response with a 429 HTTP error code and a Retry-After header showing the number of seconds until they may continue.
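
One way to handle this in Python is sketched below. It waits for the number of seconds given in Retry-After and then tries again, falling back to 60 seconds (the block length described above) if the header is missing or unreadable. The function names and the default of three attempts are our own choices for illustration.

```python
import json
import time
from urllib.error import HTTPError
from urllib.request import urlopen


def retry_delay(headers, default=60.0):
    """Read the wait in seconds from Retry-After, else use the default."""
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        return default


def get_json_with_retry(url, max_attempts=3, opener=urlopen, sleep=time.sleep):
    """GET JSON, pausing and retrying when the API answers 429."""
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url, timeout=30) as response:
                return json.load(response)
        except HTTPError as error:
            # Re-raise anything that is not a rate limit, or the final failure.
            if error.code != 429 or attempt == max_attempts:
                raise
            sleep(retry_delay(error.headers))
```

The opener and sleep parameters exist only so the function can be exercised without real network calls or real waiting.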

We also have some limits on the number of items returned for certain endpoints and an offset parameter to navigate these limits. Please see the full developer documentation for more details.
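
Paging through a long list with offset might be sketched as follows. The offset parameter is the one mentioned above. The limit parameter name and the "items" key in the response are assumptions, so please confirm them in the developer documentation.

```python
from urllib.parse import urlencode


def page_url(url, offset, limit):
    """Add paging parameters to an endpoint URL ("limit" is an assumed name)."""
    return f"{url}?{urlencode({'offset': offset, 'limit': limit})}"


def iter_items(fetch_page, limit=20):
    """Yield every item, requesting pages until an empty one comes back.

    fetch_page(offset, limit) must return a dict with an "items" list.
    """
    offset = 0
    while True:
        items = fetch_page(offset, limit).get("items", [])
        if not items:
            break
        yield from items
        # Advance by what was actually returned, in case the server
        # hands back fewer items than requested.
        offset += len(items)
```

Stopping on an empty page costs one extra request, but it avoids relying on a total count field that may or may not be present in the response.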

Why isn’t the data I want available?

All data published on CMD has to be assessed for its suitability, processed, reformatted and signed off, which takes time. We are always working on adding new datasets. If there are any you would like to be made available please get in touch with us via customise.my.data@ons.gov.uk.

Do you have a sub-national breakdown of data available via this service?

Yes – please see the ONS local statistics page and scroll down to ‘Customise my data (beta website)’.

How can I get in touch with you about this service?

Please email customise.my.data@ons.gov.uk.

Is there more technical documentation available?

Yes, please see our full developer docs.
