Top Tip: A data state of mind

Every week, Request Initiative will share its top tips for submitting a Freedom of Information request. This week: how to go about asking for entire datasets.

The Freedom of Information Act is a way for you to put questions to the government and have them answered. How many arrests for dangerous driving were there in the last month? What’s the average wait for an ambulance in Birmingham? Or how much has the Department of Health spent on ‘refreshments’ in the last six months? (It was £120,000.)

Asking a question means that you’re getting the Freedom of Information officer to go through whichever database holds the records you’re interested in, pull out what you want, do a little number crunching and then send you a number. Instead, you can cut out the middleman and simply request the database itself. It gives you additional information to play with and makes the request quicker to handle so you’re less likely to be denied based on cost.

Let’s take the police’s Automatic Number Plate Recognition (ANPR) system as an example. There are a lot of requests on WhatDoTheyKnow asking how many cameras there are and where they are, but the responses don’t tell you much about the system as a whole. By requesting the whole dataset, we might have more luck.

Firstly, think about what is likely to be recorded alongside the information you’re interested in. Often, you can find out from blank copies of the forms used to input the data. If this isn’t an option, ask for the obvious choices as well as “any other non-sensitive data which is routinely recorded.” And remember, your FOI officer has a duty to advise and assist under section 16 of FOIA, so don’t be afraid to ask.

In this case, a user manual for the ANPR system was disclosed two years ago. It tells us that each record in the database has a:

- Numberplate

- Digitised picture of the numberplate

- Time the data was captured

- Date the data was captured

- Location (and GPS coordinate) of the camera

- Force identification code

- Camera name (such as description of where it is and the exact location)

The best way to request a whole dataset is in CSV (comma-separated values) format. Whatever proprietary software might be in use, almost every database can export to CSV, and you can open the file with Excel when you receive it.
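If you would rather not use Excel, a CSV file can also be read with a few lines of Python. The sketch below is only an illustration: the column names (`force_code`, `camera_id`, `date`, `time`) and the sample rows are made up, and a real disclosure will carry whatever headers the authority's export produces.

```python
import csv
import io

# Hypothetical sample standing in for a disclosed file; real headers will differ.
SAMPLE_CSV = """force_code,camera_id,date,time
F01,CAM-001,2012-06-01,08:15:02
F01,CAM-002,2012-06-01,08:15:40
F02,CAM-003,2012-06-01,23:59:10
"""


def load_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


records = load_records(SAMPLE_CSV)
print(len(records), "records with columns:", list(records[0].keys()))
```

For a file on disk, the same `csv.DictReader` can be given a file opened with `open(path, newline="")` instead of the in-memory sample.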

But be careful. Requesting a database means you can capture a lot of sensitive information. The numberplate is ‘personal data’, and so exempt from disclosure (section 40). And the camera locations could tip off criminals and terrorists and trigger exemptions for national security (section 24) and law enforcement (section 31).

You still have a right to ask for that information, but be prepared to make several appeals to get it. To avoid that fight, you can ask for the database with the sensitive columns removed rather than the database in full.

Frustratingly, because the camera name is also a description of its location this would mean losing our only means of working out how many cameras there are, and in which police authorities. But you could ask that each camera name be replaced by a unique ID number, or that the identifying details are simply redacted.
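To make the unique-ID idea concrete, here is a rough sketch of the kind of substitution you would be asking the authority to perform before disclosure. Everything in it is an assumption for illustration: the `camera_name` column, the `CAM-0001` ID style and the sample rows are invented, and the authority's own tools will work differently.

```python
def pseudonymise(rows, column="camera_name"):
    """Return copies of rows with each distinct value in `column`
    replaced by a stable ID such as CAM-0001."""
    ids = {}
    redacted_rows = []
    for row in rows:
        name = row[column]
        if name not in ids:
            # First time this camera is seen: give it the next number.
            ids[name] = f"CAM-{len(ids) + 1:04d}"
        new_row = dict(row)  # copy so the original rows are left untouched
        new_row[column] = ids[name]
        redacted_rows.append(new_row)
    return redacted_rows


# Hypothetical rows; the location descriptions are made up.
rows = [
    {"force_code": "F01", "camera_name": "High Street northbound"},
    {"force_code": "F01", "camera_name": "Ring road junction 4"},
    {"force_code": "F01", "camera_name": "High Street northbound"},
]
redacted = pseudonymise(rows)
print(redacted)
```

Because the same camera always gets the same ID, you can still count cameras and compare their activity without learning where any of them are.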

By requesting the dataset, you can work out which authorities have the most cameras, when they’re in operation and which cameras are most active. Then you can start to think about how the numberplate data collected en masse interacts with the separate police database for vehicles of interest.
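Once the data arrives, those three questions come down to simple counting. The following is a minimal sketch, assuming records with hypothetical `force_code`, `camera_id` and `time` fields; the sample values are invented and a real dataset would need its own field names substituted.

```python
from collections import Counter, defaultdict

# Hypothetical records as they might look after redaction.
records = [
    {"force_code": "F01", "camera_id": "CAM-0001", "time": "08:15:02"},
    {"force_code": "F01", "camera_id": "CAM-0001", "time": "09:01:44"},
    {"force_code": "F01", "camera_id": "CAM-0002", "time": "08:15:40"},
    {"force_code": "F02", "camera_id": "CAM-0003", "time": "23:59:10"},
]

# Which authorities have the most cameras: distinct camera IDs per force.
cameras_by_force = defaultdict(set)
for record in records:
    cameras_by_force[record["force_code"]].add(record["camera_id"])
camera_counts = {force: len(cams) for force, cams in cameras_by_force.items()}

# Which cameras are most active: number of captures per camera.
reads_per_camera = Counter(record["camera_id"] for record in records)

# When cameras are in operation: captures per hour of the day.
reads_per_hour = Counter(int(record["time"][:2]) for record in records)

print(camera_counts)
print(reads_per_camera.most_common(1))
print(sorted(reads_per_hour.items()))
```

The hour-of-day count assumes times are recorded as `HH:MM:SS`; if the disclosed format differs, that one line is the part to adapt.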

Requesting datasets can be a useful way of getting more than you asked for, but it has its own pitfalls. Remember:

- If you want a piece of information, think about what else will be stored in the same system and whether it’s better to request the whole dataset

- Try to get as clear a picture as you can of what the database you’re requesting looks like

- If you want a quick disclosure, make sure that you’re not requesting any sensitive information
