OkCupid and digging into big data

Written by Ben Cooper, MPH, manager of the Public Health Data & Training Center at the Institute for Public Health


We’ve all heard the term “big data” tossed around a lot lately. As someone who has been working with data of all shapes and sizes for nearly two decades, I’ve learned that it isn’t the size of the data that matters so much as what can be done with it.

For example, the folks over at OkCupid have created a blog called oktrends to help understand, often in humorous ways, the behavior and preferences of the users of their online dating site.

Headed up by OkCupid’s Christian Rudder, the blog delves into topics ranging from “Race and Attraction” to “Exactly What To Say In A First Message.” The team has the luxury of large datasets, often more than a million records, which allow them to drill further into the details of an issue than other researchers would be able to.

Rudder explores these ideas further in his book Dataclysm: Who We Are (When We Think No One’s Looking). In an interview with NPR last fall, he said of the data available from the site, “All of this data … is aggregated and anonymous. … But when you put all this stuff together, you’re able to look at people in a way that people have never been able to look at people before.”