Large Amounts of Personal Data in Excel Files

Another week, another catastrophic personal data leak story (this time, one that’s not from the UK):

http://www.koreatimes.co.kr/www/news/biz/2008/09/123_30635.html

What I always find surprising about these stories is that people are using spreadsheets to handle datasets with millions of records in them.

I use Excel every day at work and have done so, on and off, for years. When I’ve not been using Excel, it’s been OOo. Spreadsheets are ubiquitous and flexible, and people seem to be able to work with them without a great deal of training.

But as a way of handling the data for 11 million people, how well do they perform compared to database systems?

Let’s assume that these big companies have professionally administered and expensive database systems that collect their millions of customers’ data. This data is probably fairly secure. Accessing it would involve sophisticated hacking (in the mainstream press sense of the word).

Non-technical employees of the company extract data for some or all of the customers as a spreadsheet and use the spreadsheet to do market research or what-have-you. The security of the whole system is compromised if one of those employees accidentally or deliberately leaves a CD or pendrive somewhere public. In terms of security, such a system is evidently not fit for purpose.

But how useful is it? As someone who’s spent a fair amount of my professional life working with scripting languages like PHP, Perl, Bash and what-not, I tend to see spreadsheets as a rough draft of a more automated system. Shouldn’t office employees be trained to write scripts in VBA or similar rather than cobble stuff together with spreadsheets?

I am reminded of an epigram that the writer and engineer Nevil Shute used:

An engineer is a man who can make something for five bob that any bloody fool can make for a quid!

If you have a lot of employees, a Heath Robinson arrangement of spreadsheets is workable. A well-trained programmer, though, might be able to automate a lot of what they are doing.
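
As a rough sketch of the kind of automation I have in mind, here is a small Python script that produces a per-region customer count from an extract, so that only the totals ever need to leave the database machine. The column names (name, email, region) and the sample rows are made up for illustration; a real extract would look different.

```python
import csv
import io
from collections import Counter

# Made-up sample standing in for a database extract.
# The column names are assumptions, not taken from any real system.
SAMPLE = """name,email,region
Alice,alice@example.com,North
Bob,bob@example.com,South
Carol,carol@example.com,North
"""


def summarise_by_region(rows):
    """Count customers per region, keeping no personal fields."""
    counts = Counter()
    for row in rows:
        # Only the region is read; names and emails are never stored.
        counts[row["region"]] += 1
    return dict(counts)


if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE))
    print(summarise_by_region(reader))
```

The point is that the person doing the market research gets a handful of numbers rather than a file with a row for every customer, so a lost pendrive would hold nothing personal.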

It’s not that I want to be disrespectful to or underestimate the intelligence and learning of people who work with spreadsheets. My feeling is that companies would probably achieve significant gains in terms of productivity and improvements in security if scripting were seen as a core office skill.
