27 September 2023

Information overload: How new data regulations may not be the answer


Jon Porter* says Europe’s new data regulations make it easier to access the data companies hold on their clients, but that doesn’t mean we will understand it.


Photo: Markus Spiske

If the numerous tech scandals of recent years have taught us anything, it’s that tech companies hold a truly terrifying amount of data about us all.

This data can be outright dangerous when it falls into the wrong hands.

Europe’s response to that risk, as part of the General Data Protection Regulation (GDPR), is the “Right of Access”, which says that, when requested, any company should be prepared to provide you with your personal data.

They should provide it in a way that’s easy to read, in a timely manner, and with enough background information for you to understand how they got it and how they use it.

The thinking is that once you know what data a company holds about you, you can use it to make informed decisions.

The problem is companies can often be really stingy about actually providing this data.

After all, if your service is essentially “forcing consent” (as Google was recently fined millions for doing), you might not want your users to easily see how much personal data you’re collecting.

I decided to test the “Right of Access” offered by four of the biggest tech companies operating in the EU: Apple, Amazon, Facebook, and Google.

What I found suggested that while you can certainly get the raw data, actually understanding it is another matter.

Everything you need to know about GDPR

According to the UK’s data protection regulator, the ICO, companies must provide all personal data on request.

It must be provided in a “concise, transparent, intelligible and easily accessible form, using clear and plain language” in a “commonly used electronic format.”

Both Google’s and Apple’s data download services let you pick and choose what data you want to download.

Facebook’s doesn’t, but all three services are easy to find on their respective websites, and the data arrives quickly.

Amazon, meanwhile, does not present the request as an easy-to-find option on its site: getting a single link to all of your Amazon data relies on you digging through the site’s “Contact Us” page to find the option hidden at the end of the list.

Once I requested it, it took the full 30 days (the limit imposed by the regulation) to receive a link to download my data.

When trying to look at the data I’d received, however, things got messy.

Some files were ambiguously labelled, while others were stored in formats that tested the limits of what constitutes “commonly used.”

Google’s location-tracking data was particularly hard to understand.

The company has been repeatedly criticised for tracking Android users, even when they’ve turned off the main location-tracking option in the operating system.

This information is very difficult to view and understand.

All of my location data from Google was contained within a single 61MB JSON file, and opening it with Chrome revealed a bewildering array of fields labelled “timestampMs,” “latitudeE7,” “longitudeE7,” and estimations about whether I was sitting still or in some kind of transport (I assume).

I don’t doubt that this is all the location history information that Google has associated with my account, but without context, this data is meaningless.

If the purpose of the GDPR is to allow people to have more control and understanding of what data is collected from them, this part of Google’s download has little to offer.
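For readers who want to make sense of those fields themselves, here is a minimal Python sketch of how one such record might be decoded. It rests on assumptions the download itself does not spell out: that “timestampMs” is a Unix timestamp in milliseconds stored as a string, that the “E7” fields are degrees multiplied by ten million, and that the longitude field is spelled “longitudeE7”. The sample record is invented for illustration, not taken from a real export.

```python
from datetime import datetime, timezone


def decode_location(record):
    """Turn one raw location record into human-readable values.

    Assumes timestampMs is Unix time in milliseconds (as a string)
    and that the E7 fields are degrees multiplied by 10**7.
    """
    seconds = int(record["timestampMs"]) / 1000
    return {
        "time": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "latitude": record["latitudeE7"] / 1e7,
        "longitude": record["longitudeE7"] / 1e7,
    }


# Hypothetical record, shaped like the fields described above.
sample = {
    "timestampMs": "1546300800000",
    "latitudeE7": 515074000,
    "longitudeE7": -1278000,
}

decoded = decode_location(sample)
print(decoded["time"].isoformat(), decoded["latitude"], decoded["longitude"])
```

If those assumptions hold, the sample decodes to a point in central London at midnight UTC on 1 January 2019, which is exactly the kind of context the raw file leaves out.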

When it came to other files, it wasn’t even clear what data I was looking at in the first place.

The most confusing files out of the entire data download are also the most important.

They contain the kinds of personal information that potential advertisers would kill for, and Google should make more of an effort to explain what they are.

Apple fared better than Google in the way it presented its data, although there were still problems.

The majority of the data Apple provided was in file types that were easy to read and understand.

But once you get into these files, there’s still a lot of information that’s difficult to understand.

The creepiness of being able to listen to all my Alexa requests notwithstanding, Amazon did far better with how it presented its data, although this may just have been because of how comparatively little it holds about me.

For the most part, files and folders were clearly labelled, although the company still has some work to do on labelling the contents of its spreadsheets.

Ironically enough, Facebook actually had the most comprehensible data of the four services.

For starters, every single file Facebook gives you is an HTML file.

Each is sorted into its own clearly labelled folder, and an index file gives you an overview of what each document contains.

The files themselves are clearly laid out and formatted, and browsing them feels almost like browsing a page on Facebook itself.

It’s still terrifying to see the amount of data Facebook has stored on you (and that’s not even getting into the instances of people having found records of all their old calls and SMS messages), but at least you’re well-informed about what exactly this information is, rather than having to guess based on the contents of each file.

At the end of my experiment, I’m left with just under 138 GB of data across the four services I contacted.

After attempting to sift through and understand it all, it’s clear that these companies, and the GDPR rules that govern them, have a long way to go if they want to give us real control over our data.

Being able to download it is one thing, but making it useful means working harder to ensure that what’s downloaded is easier for the average person to understand.

At a minimum, that means providing a better index to tell you what data is contained in what file, but it also means organising the contents of those files in a way that allows them to make better sense by themselves.

* Jon Porter is a reporter for The Verge. He tweets at @JonPorty.

This article first appeared at www.theverge.com.

