AOL Data: First Searcher Identified

Techcrunch has information on the first person positively identified from the AOL data. AOL searcher number 4417749 has been identified as Thelma Arnold, a 62-year-old widow living in Lilburn, Georgia.

As you might expect, her searches are pretty innocent. Her search queries range from “numb fingers” to “60 single men” to “dog that urinates on everything.” The New York Times has a pretty in-depth article about Thelma and other, as yet unidentified, searchers.

Ms. Arnold, who agreed to discuss her searches with a reporter, said she was shocked to hear that AOL had saved and published three months’ worth of them. “My goodness, it’s my whole personal life,” she said. “I had no idea somebody was looking over my shoulder.”

In the privacy of her four-bedroom home, Ms. Arnold searched for the answers to scores of life’s questions, big and small. How could she buy “school supplies for Iraq children”? What is the “safest place to live”? What is “the best season to visit Italy”?

Wonder when we can expect the first lawsuits to be filed? Personally, I expected some yesterday. AOL had a shitty reputation before; I’d be surprised if this doesn’t end up sinking them at some point.

Web Interface for AOL Data

A commenter over at Techcrunch put together a simple little web interface to the AOL search data.

Michael Arrington from Techcrunch spoke with Andrew Weinstein over the phone last night about this. Andrew is the AOL employee who first issued the apology that can be seen over at Techcrunch. Anyway, Michael thinks Andrew is truly pissed off about what happened, as he definitely should be.

What I’d like to know is how the decision to release this data came about in the first place. This had to be a decision made from pretty high up the ladder. Another thing: AOL shouldn’t even allow access to this data in its raw format. Or, at most, very few people and only a handful of servers should be able to touch the raw data. I mean, nobody at AOL should have any reason to use such detailed data. Instead, there should be a reporting system that runs reports based on the raw search data; that way nobody can actually see the data itself, only the summarized reports.

I don’t think Jason’s idea of turning off logging is practical. It’s really quite simple: don’t allow access to the raw log data.
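To make the reporting idea a bit more concrete, here is a rough sketch in Python. Everything in it is my own assumption: the `RAW_LOG` rows are made up, `summary_report` and its `min_users` threshold are hypothetical names, and none of this reflects how AOL actually stores or reports on its logs. The point is only that a report can hand out counts while the per-user rows stay behind the wall.

```python
from collections import defaultdict

# Made-up raw log rows: (user_id, query). Purely illustrative data.
RAW_LOG = [
    (1, "weather"),
    (2, "Weather"),
    (3, "weather"),
    (1, "numb fingers"),
    (2, "cheap flights"),
    (3, "cheap flights"),
]


def summary_report(rows, min_users=3):
    """Return {query: number of distinct users}, with rare queries dropped.

    Queries searched by fewer than `min_users` distinct users are left out,
    so the report never exposes something only one or two people typed.
    """
    users_per_query = defaultdict(set)
    for user_id, query in rows:
        # Normalize so "Weather" and "weather" count as the same query.
        users_per_query[query.strip().lower()].add(user_id)
    return {
        query: len(users)
        for query, users in users_per_query.items()
        if len(users) >= min_users
    }


if __name__ == "__main__":
    print(summary_report(RAW_LOG))
```

The threshold is the design choice that matters here: the report consumer sees which queries are popular, but never a user ID and never a query unique to one person. It is not a guarantee of anonymity, just a much smaller surface than the raw logs.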

Philipp Lenssen has some pretty good commentary over at Google Blogoscoped. He’s taken some time to see what individuals are searching for, pretty amusing:

At 10:08 PM, 28963 looks for “porn sites”. 28963 quickly amends the search query to read “freee porn sites”. (Two days later, 28963 shows a sudden interest in genital warts.)

He’s got a lot more of them, so head over to Google Blogoscoped for more amusement. Garett Rogers at the Googling Google blog at ZDnet has some commentary too.

This is the type of news that will reach every single AOL user. People will be boycotting the company because of its blatant disregard for the privacy of users. As my fellow Canadians would understand, this could be the TSN turning point.

Markus Frind has put together a nice post detailing how one AOL user likes searching for ways to commit murder. Some of his commenters are upset, but Markus asks some good questions:

Users in the comments are pissed off at the idea that people can be arrested for planning a crime like murder, calling it minority report like. I ask you why is it that americans have no problems arresting people that are planning or researching how to conduct terrorist attacks? Yet if a person plans on killing his wife that is ok, until he actually does it? How many people do you have to plan on killing before its ok for a company like AOL to hand your records over to the government? I am not taking sides, I’m just pointing out the obvious double standard. This story will open a can of worms, and will decide just how private your data online really is.

AOL Releases Private Data

So, AOL released a bunch of search data. Doesn’t sound so bad, right? Well, it is, because AOL tagged every search with a per-user ID, so basically you can see who has been searching for what. The data spans a three-month period. It even gives information as to which links were clicked on the search results page. No usernames are included, but user IDs are, which can be linked back to usernames with little trouble. From Techcrunch:

The utter stupidity of this is staggering. AOL has released very private data about its users without their permission. While the AOL username has been changed to a random ID number, the abilitiy to analyze all searches by a single user will often lead people to easily determine who the user is, and what they are up to. The data includes personal names, addresses, social security numbers and everything else someone might type into a search box.
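Here is a rough sketch of why a random ID number doesn’t help much. The column names (`AnonID`, `Query`, `ClickURL`), the tab-separated layout, and the IDs in `SAMPLE` are my assumptions for illustration, not a confirmed description of the released files; the queries are ones quoted earlier in this post, with invented IDs attached. Grouping rows by ID is all it takes to turn scattered searches into one person’s history.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in an assumed tab-separated layout. The IDs are invented.
SAMPLE = (
    "AnonID\tQuery\tClickURL\n"
    "101\tnumb fingers\t\n"
    "202\tbest season to visit italy\thttp://example.com\n"
    "101\tdog that urinates on everything\thttp://example.org\n"
)


def searches_by_user(text):
    """Group queries by the anonymous ID, keeping their original order."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    history = defaultdict(list)
    for row in reader:
        history[row["AnonID"]].append(row["Query"])
    return dict(history)


if __name__ == "__main__":
    for anon_id, queries in searches_by_user(SAMPLE).items():
        print(anon_id, queries)
```

With three months of rows per ID instead of two, a history like that tends to include a town, a surname, or a medical worry, which is exactly what the quote above is warning about.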

The original download has since been taken offline. However, there are plenty of mirrors. The data in its compressed form weighs in at around 439 MB; uncompressed, it reaches just over 2 gigs.

UNEASYsilence has taken time to look through some of the data. Some of what they saw actually frightened them.

There are some truly scary things in this database.

There are hundreds of searches from people looking to kill themselves and even more scary are searches from users that seem to be looking to commit murder.

People are fucked up. Really though, some good could come of this. With all this super detailed search data, certain groups of people could be targeted. For example, those searching for “boylove” or “child love” constantly could be assumed to be some sort of pedophile. I could see groups like The War Against Nambla using this information to find new sickos to target.

UPDATE: AOL is now saying this was a screw-up. Initially, the data was reported to have been released to the public for research purposes. Jason Calacanis, an AOL employee, is suggesting that AOL “NOT KEEP LOGS of our search data.”