You have probably seen the splashes on the news pages. The British government are considering a database that logs a degree of internet traffic. There is a report here if you missed it.
What are they considering logging? Well, let us look at what is currently logged. Details of the times, dates, duration and locations of mobile phone calls, numbers called, websites visited and addresses e-mailed are already stored by telecoms companies for 12 months. Any of these details are surrendered to an appropriate agency on request. The proposal is that these records should now be held for 2 years and be held directly by the government.
Jacqui Smith went on to say: “There are no plans for an enormous database which will contain the content of your emails, the texts that you send or the chats you have on the phone or online.”
Hmmm… let us consider what is being said here. Not the content then. What reasonable use would there be in storing the email header information only? Well, you would have the IP address it was sent from, the email account that it was sent from and the time that it was sent. That is no great trick for SMTP since it is sent in plain text by default. SMTP (mail) protocols are really just special purpose TCP/IP chatter on port 25. This stuff is defined in RFC 821 and 822. It is easy enough to log that stuff if you can record any packet on a network. You can do similar things for IMAP and POP3. So, to do this effectively, you would need to be sitting on the email servers to record it. Ok. The UK government can enforce this on UK servers if they want to – you can’t fight city hall… but what if the email is not on a UK server? Hotmail is not based in the UK and I am willing to bet that it doesn’t internally use SMTP or IMAP – when sending a message from one Hotmail user to another, you are effectively doing a database operation, and that is how I would implement it if I were them. I bet that most web based email services such as Yahoo, Gmail and so on work that way. The UK government could ask Google to send it this data but would they? It seems unlikely. How about imail.ru (a Russian free webmail) or maktoob.com, which is in Jordan? Now, Jordan and the UK get on pretty well but would they reasonably hand over that sort of data to the UK government? I don't think so. The Russians? Even less chance. There are hundreds of web email providers.
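To make "header information only" concrete, here is a minimal sketch in Python of what such a log entry might hold. The message, the addresses, the IP and the traffic_record function are all made up for illustration (the addresses use reserved example domains), and I am not suggesting that this is how any real logging system works. It only shows that the sender, recipient and time can be pulled from the plain-text headers without touching the body.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Hypothetical message: every address and the IP are reserved example values.
RAW = """\
Received: from mail.example.org ([192.0.2.10]) by mx.example.net; Wed, 15 Oct 2008 09:30:00 +0100
From: Alice <alice@example.org>
To: bob@example.net
Date: Wed, 15 Oct 2008 09:29:58 +0100
Subject: lunch?

The body is never read by the code below.
"""


def traffic_record(raw):
    """Return only the header fields a traffic log might keep (an assumption)."""
    msg = message_from_string(raw)
    return {
        "from": parseaddr(msg["From"])[1],            # bare sender address
        "to": parseaddr(msg["To"])[1],                # bare recipient address
        "sent": parsedate_to_datetime(msg["Date"]),   # timezone-aware datetime
        "received_via": msg["Received"],              # relay line, includes the IP
    }


record = traffic_record(RAW)
print(record["from"], "->", record["to"], "at", record["sent"].isoformat())
```

Note that everything in that record comes from text that the sending side supplied.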
Oh, and here is something else that makes me wonder. You know why the industry doesn’t chase down the people who send the SPAM? Well, how would you tell who they were? It is trivial to fake an SMTP header and that is what the spammers do. There is nothing to stop the terrorists doing the same.
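A small sketch of why the header proves so little: the From line is just a text field that the sender fills in. The build_message helper and the addresses below are hypothetical, and nothing here sends any mail. It simply builds a message object with Python's standard email library and shows that the library, like plain SMTP, accepts whatever sender it is given.

```python
from email.message import EmailMessage


def build_message(claimed_sender, recipient, text):
    """Build a message; the From value is accepted without any verification."""
    msg = EmailMessage()
    msg["From"] = claimed_sender   # plain text chosen by whoever writes the message
    msg["To"] = recipient
    msg.set_content(text)
    return msg


# Hypothetical addresses on reserved example domains.
msg = build_message("someone.else@example.com", "bob@example.net", "hello")
print(msg["From"])
```

A log of header fields would faithfully record the claimed sender, whoever actually wrote the message.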
How about SMS messages? Well, they are a bit different because the whole message is sent as a packet. Longer messages are sent as multiple messages and stitched back together later, it seems. The message and the header are all in the same packet. I suppose that a scheme could overwrite the message content before recording the packet to a log but I would be surprised if that were done. The Multimedia Messaging Service protocols are more complex and more problematic.
Logging all phone numbers and times of calls and location of the caller? Well, that is pretty powerful if you know who the number represents. More than 75% of the UK population have a mobile phone. What other government can claim to be able to track 75% of their population at any time? Of course, pay as you go phones can be a problem. Pop into Tesco with some cash and you can buy a phone and some air time. Name? You are not required to give it. You want a free SIM card? You can have a dozen. Companies want to give them away. Why would a terrorist use the same one twice? This measure strikes me as an excellent way of monitoring the honest and the stupid but a rotten way of monitoring the intelligent and devious. There is also the question of the sheer volume of data, as there is with emails. There are roughly 60 million people in the UK. About 75% have a mobile. That is 45 million mobiles to track. Some of those belong to teenagers who send dozens of texts a day. At an average of 10 texts per mobile, that could easily be 450 million texts per day. That is more than 160 billion texts per year. Good luck analysing that many. As for emails, that boggles the mind. There are more than 100 billion SPAM emails per day. Britain punches above her weight here because computer ownership is common. Let us say that 5% of these are in the UK. So, 5 billion SPAM emails per day. That is 1.8 trillion emails per year. Good luck in storing and scanning all those.
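For anyone who wants to check the arithmetic, here it is as a few lines of Python. The inputs are the rough figures used above. The figure of 10 texts per mobile per day is my assumption, and it is simply what 450 million texts a day works out to across 45 million phones.

```python
# Rough inputs taken from the estimates above.
UK_POPULATION = 60_000_000
MOBILE_PERCENT = 75
TEXTS_PER_MOBILE_PER_DAY = 10      # assumption implied by 450 million texts a day
GLOBAL_SPAM_PER_DAY = 100_000_000_000
UK_SPAM_PERCENT = 5                # the "let us say" guess from the text
DAYS_PER_YEAR = 365

# Integer arithmetic keeps the results exact.
mobiles = UK_POPULATION * MOBILE_PERCENT // 100
texts_per_day = mobiles * TEXTS_PER_MOBILE_PER_DAY
texts_per_year = texts_per_day * DAYS_PER_YEAR

uk_spam_per_day = GLOBAL_SPAM_PER_DAY * UK_SPAM_PERCENT // 100
uk_spam_per_year = uk_spam_per_day * DAYS_PER_YEAR

print(f"mobiles:          {mobiles:,}")
print(f"texts per day:    {texts_per_day:,}")
print(f"texts per year:   {texts_per_year:,}")
print(f"UK spam per day:  {uk_spam_per_day:,}")
print(f"UK spam per year: {uk_spam_per_year:,}")
```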
Hmmm… what websites were visited? That could be a useful one. In the course of writing this post, I have been to over 100 sites and I made no attempt at all to hide where I went. I don’t mind anyone knowing that I was looking at news sources and RFCs. Had I minded, I would have used a proxy. There are over 2000 free web proxies, hardly any of which are in the UK. You could investigate everyone who uses a proxy, of course. He who would keep a secret must keep it secret that he has a secret to keep, if I may quote Carlyle. You would be looking at trillions of web addresses each year though. It would be difficult data to mine. Where would you capture the data? The DNS servers would seem to be an obvious choice but I don’t need to go via a DNS server at all – indeed, the local cache serves most of my needs and I can keep a hosts file as large as I need. I don’t have to use a UK based DNS service at all and unless data is harvested at every router along the way, I don’t see how the traffic could be recorded as it doesn’t go through a central point. Again, you can monitor those who let you but those that want to slip through the net will find it easy enough to do so.
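The hosts file point is easy to illustrate. Below is a rough sketch of a hosts-style lookup in Python. The entries, the parse_hosts and resolve functions, and the addresses are all hypothetical, and a real operating system resolver does more than this. The idea is only that a name found in a local table never generates a DNS query for anyone to log.

```python
# Hypothetical hosts-file contents using reserved documentation addresses.
HOSTS_TEXT = """\
# address     names
192.0.2.20    news.example.org
192.0.2.21    rfc.example.net rfcs
"""


def parse_hosts(text):
    """Map each host name to its address, ignoring comments and blank lines."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address
    return table


def resolve(name, table):
    """Return the local address, or None when a DNS query would be needed."""
    return table.get(name.lower())


table = parse_hosts(HOSTS_TEXT)
print(resolve("news.example.org", table))   # answered locally
print(resolve("unknown.example", table))    # None: would have to go to DNS
```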
What about other forms of communication? Instant messaging would be hard to monitor – text messages for most types go via the server but voice and data go from peer to peer via UDP. That would be hard to monitor without something very like the Bundestrojaner, a bit of software created by the Austrian government to monitor individual computers using malware type techniques. That would be politically difficult to implement widely. Audio and video data is hardest yet to capture and when you look at structures like the Skype cloud architecture where there is little centralised control, it is tempting to throw up your hands in horror.
Of course, the more data you collect, the less effective your screening is. You really want to monitor the smart and criminal ones – and you have data on the dumb and the honest. You have so much data that it could only be analysed by machine, even if you have an army of spooks. The more data you have, the lower the signal to noise ratio and the less intelligent scrutiny you can give to the signal.
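A back-of-the-envelope sketch shows how bad the signal to noise problem can get. Every number below is an assumption that I have invented for illustration: 45 million people monitored, 1,000 genuine targets among them, and an automated screen that catches 99% of the targets while wrongly flagging just 1% of everyone else. No real system is being described.

```python
# All figures are invented assumptions for illustration only.
MONITORED = 45_000_000
GENUINE_TARGETS = 1_000
HIT_RATE = 0.99          # share of genuine targets that get flagged
FALSE_ALARM_RATE = 0.01  # share of innocent people that get flagged


def share_of_flags_that_are_real(monitored, targets, hit_rate, false_alarm_rate):
    """Return (true flags, false flags, fraction of flags that are genuine)."""
    innocents = monitored - targets
    true_flags = targets * hit_rate
    false_flags = innocents * false_alarm_rate
    return true_flags, false_flags, true_flags / (true_flags + false_flags)


true_flags, false_flags, precision = share_of_flags_that_are_real(
    MONITORED, GENUINE_TARGETS, HIT_RATE, FALSE_ALARM_RATE
)
print(f"genuine targets flagged: {true_flags:,.0f}")
print(f"innocent people flagged: {false_flags:,.0f}")
print(f"flags that are genuine:  {precision:.2%}")
```

Under those made-up figures, the genuine targets are a tiny fraction of the people flagged, even though the screen itself sounds very accurate.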
The problem is actually still worse. Let us consider what data related to terrorism might look like. Would it be a message saying “On Tuesday, we will meet at the town hall at 7:30. You bring the semtex and I will bring the guns. If wet, meet in the King’s Head”? Why would it be in English? Why would it be in plain text? I could send that information as an MP3 of speech, as a JPG, as a video, as an encrypted file or hidden in a dozen ways, many of which are well known and have been used in dozens of films. We can safely assume that any terrorist worth his salt can do 20 minutes of research. Code books are old hat but they still work. No scanning program can work out whether a discussion of the health of an aged relative really means something different when decoded the old fashioned way with a lookup reference, as in the old book ciphers. There are also some cool things that you can do with steganography.
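For anyone who has not met a book cipher, here is a toy version in Python. The shared "book" is a sentence I made up, and the encode and decode functions are my own illustration of the general idea rather than any particular historical scheme. Each word of the message is replaced by the position of that word in the book, so the numbers mean nothing to anyone who does not know which book to look in.

```python
# The shared "book": a made-up sentence standing in for an agreed text.
BOOK = "the quick brown fox jumps over the lazy dog while we meet at dawn".split()


def encode(message, book):
    """Replace each word with the 1-based position of its first use in the book."""
    return [book.index(word) + 1 for word in message.lower().split()]


def decode(numbers, book):
    """Look each position up in the same book to recover the words."""
    return " ".join(book[n - 1] for n in numbers)


code = encode("meet the fox at dawn", BOOK)
print(code)
print(decode(code, BOOK))
```

A word that is not in the book raises a ValueError in this sketch, which is why the two parties have to choose a book that covers the vocabulary they need.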
So, what does this cost us if it is implemented? Well, maybe not much. If the data is mostly ignored then there is little loss of liberty and the intelligence services will not be wasting much of their time. It might be useful in a case where our friends in the Office for Security and Counter Terrorism were trying to work out who a suicide bomber had been talking to.
However, if it is misused, it will have a massive effect on civil liberties and will blind the intelligence services because there will be too much data to ever process.
There is also a problem that you always have to consider. Even if you trust this government (and I am making no statement at all on that), do you trust every government that will come after? Will none of them use this to oppress their opponents or police the ranks of their own party? Will no future government use this to control its population? Forever is a very long time. There will be a bad leader some day. I leave it to you to decide how happy you are with that thought.
Mark Long, Digital Looking Glass