A Day of Big Data
After recording a day’s worth of all of the possible contributions to big data that I made, I found that a great deal of my online activity could be giving away more personal information than I had originally thought. The site that I believe gets the least data out of me is, ironically, the one I probably use the most. At the start of my day, at approximately 9 a.m., I logged onto the social media website Tumblr. While I spent a great deal of time yesterday blogging about my various interests, I think that the sporadic nature of a Tumblr user’s posting makes it difficult to uncover any discernible patterns from their entries. The next website I used, starting at around 10 a.m. and continuing for most of the day, was YouTube. I used it while studying to listen to music that I couldn’t download in MP3 format. The “other videos you might like” feature showed me that YouTube actually has a lot of difficulty distinguishing between my various interests (either by subject matter or content). YouTube is also linked to my Google Plus account, which means that Google is getting data related to my video interests as well.
I used Facebook briefly, around 1 p.m., to message the supervisor for an internship that I’m taking part in. While I doubt that Facebook could glean any useful information from that one message, the website in general is one of the largest potential threats in terms of forming my online persona. I also received a message from a girl I’ve been speaking to on the dating app Tinder. This app uses Facebook to build a profile, meaning that the social media outlet has access to parts of my love life that I’d normally prefer to keep private. Toward the end of the day, from 3 p.m. through 6 p.m., I was using the gaming platform Steam. Along with my name, phone number, and credit card information, Steam has access to my video game interests as well. I also used Netflix briefly that day, around 8 p.m.; it gathers much the same data as Steam does, but for film and TV.
The picture that I believe these websites can paint of me is of someone who has specific interests in film, TV, video games, and other facets of popular culture. I don’t believe I have enough negative or misleading information out there to form an “incorrect” picture of me, but it’s difficult to be certain. Even if I wanted to opt out of my inclusion in big data, it would be impractical for someone like me who enjoys the multitude of benefits that come from having several digital services.
At what point will a website’s ability to read my data allow them to know what my interests and habits are before I do, with more accuracy than current versions of that service have?
I noticed that you mentioned Tinder, which is interesting to me. You also mentioned that you are a user of Netflix, Steam, Facebook, and Tumblr. I know that on dating sites where you actually make a lengthy profile, keywords give you “matches”. However, the flaw in that is that you as a person control what you put on there. For instance, someone might say “I play X, Y, and Z game”, “I love this show!”, etc., but they aren’t necessarily going to put that they have Spice Girls in their YouTube history as a guilty pleasure or that they Google a particular fetish porn. But how cool would it be if a dating site linked to your Facebook and Google accounts and matched you with people who have REAL similar interests to you? For instance, it could take into consideration your particular interests, things you may forget to include, or things you may be too embarrassed to include on a dating site. This might actually give users more accurate matching because it would take into consideration all of the guilty pleasures and weird quirks. Even if it didn’t outright display what you Googled or watched on YouTube, it could potentially give incredibly aligned matches!
If you want to have your mind blown a little bit, check out Dataclysm: Who We Are (When We Think No One’s Looking). It’s by Christian Rudder, a co-founder of OkCupid, and describes some truly fascinating experiments that have been done by dating websites. It illuminates not only how we portray ourselves, but also how we think we know what will make us happy in a partner, and how possible it is to be wrong about that.
Overall, good work on this post. I think you’ve represented well the websites you’ve visited, but you may have a blind spot for other devices that could be generating data. The GPS in your phone? Road sensors? Surveillance video from places you visited?
As far as your question goes, I’m interested in what exactly you mean by “with more accuracy.” How accurate do you consider current models? What degree of accuracy would satisfy your question?