This is a companion post to the one Andy has published outlining ‘why’ we have built the API, and a quick overview of the ‘how’. More detail on all of this is available on our new developer site and I won’t be offended if you want to dive straight into that.
To start, you will need a list of available datasets. Because these are the same API’s we use ourselves to provide the web journey for Customise My Data the available data will be the same and, importantly, updated and available at the same time.
Access to statistical data from the API is possible in two ways both of which allow access to all or a subset of a single dataset.
Requesting data directly
The first way to get data is through what we have referred to internally as the ‘observation level’ part of the API. This forms part of the ‘Explore our data’ service and is there to support simpler queries and will return the resulting data in JSON.
This allows data to be accessed through GET requests and will return the data and metadata, for example;
As you can see from the above example the request specifies the dataset and contains an option for each dimension in the dataset; in this case time, geography and aggregate (the coicop hierarchy used to classify goods and services in inflation statistics).
This method allows one of these dimension values to be replaced with a wildcard of ‘*’ and because you get the data back in the response it allows the simple creation of things like this time series chart;
Returned alongside the data are any relevant metadata we hold, for example confidence intervals or data markings and links to information about the dataset the observation came from.
Filtering a dataset
The other route is through our ‘Filter a dataset’ service which supports the front end user journeys for CMD. This allows more complex querying of a dataset and does not have the same restrictions as the observation level API, returning the data itself in CSV and XLSX.
The XLSX is designed to provide the human readable version whereas the CSV is more suitable for machines, with one observation per row (for interest it is also the same format we use to import data into our systems).
It is not intended for these to be the final formats formats for both these files and for the CSV in particular are currently looking at how we can extend (or additionally include as a 3rd option) CSVW.
This method is split into two parts, filters and filter-output, to allow users to create a set of filters and resubmit whenever required rather than having to POST and create a new filter each time.
If you have any questions or comments let me know on twitter @robchamberspfc, comment on this blog or let us know by emailing email@example.com