The year of big data

Campaigns fed on a steady diet of it in 2012, but what will the future hold for data-driven consultants?

It’s a fitting title for 2012—the year of big data. Massive data sets integrate information from multiple sources, sometimes produced with mining techniques and presented on mashup applications. They show a personalized picture of the voter, drawing in information from the political, commercial and even cultural spheres.

Big data figured prominently in the 2012 campaign. But this year might deserve the tag for another reason: everyone is talking about it.

The buzz around big data in campaign circles implies that it is at a minimum a key component of a winning strategy. Rich, detailed information about vast numbers of voters seems to lend itself to effective campaign decision making, especially in tasks that involve targeting.

Time will tell whether election outcomes have truly turned or will turn on big data. However, it’s not too soon to give some serious thought to who might be situated well to win, and to lose, in this new world.

Both the Romney and Obama campaigns broke new political ground with data and the practices associated with it. Team Obama’s integration of the voter file with fundraising data and social media contacts was a significant development. And the AP’s Jack Gillum offered a dramatic description of Romney’s use of mined data to identify a new pool of donors. The campaign used “a secretive data-mining project … to sift through Americans’ personal information—including purchasing history and church attendance.”

A practice like this may be fairly new to politics, but it’s been artfully perfected in the marketing world.

The Romney case is reminiscent of the much-circulated account by Charles Duhigg, in The New York Times, of the big data prowess of retail giant Target, which collects purchasing habits, demographics and a bevy of other information about customers. While the Romney campaign identified donors for a fundraising appeal, Target could eerily predict which customers were pregnant, and then follow up in its marketing to this lucrative customer cohort.

Many tactical moves in the big data era seem to be driven by a data collection motive. The much-touted summer mobile app “Mitt’s VP” promised early news of Romney’s running mate. But it also offered access to contact information for those who downloaded the app, a boon for the campaign.

Tracking the web viewing history of voters gives campaigns additional evidence about preferences and habits, critical for targeted appeals. ProPublica described an imaginative appeal on Pandora, the online subscription music site. A user listening to Garth Brooks might have been asked permission to share his email address with the Romney campaign, presumably based on predictive analytics that show that a Garth Brooks fan—or at least that particular fan—was a likely supporter.

But as with any innovation, in politics or elsewhere, it hasn’t taken long for the lines to be drawn—for the voices of peril and of promise to stake out their ground. Privacy concerns drive many of the skeptics who emphasize that people unknowingly reveal the atomic matter of big data in their day-to-day habits.

On the other hand, many of the optimists are compelled by thoughts of electoral expediency—the prospect that big data may be the key component of the successful 21st century campaign. This blatantly pragmatic view can even be cast in a more altruistic light, one that stresses the role of data in engaging voters—including those not typically interested or active in politics.

While the task of pinpointing the optimist/pessimist divide is quite easy, dealing with the winner/loser distinction is a little trickier. At first glance, it looks like some are poised to win. First among these are the campaigns that can parlay big data into electoral success.

A close second are the vendors and professionals whose products and services translate into profit and successful careers. Think about an obvious case like NGP VAN, with the lion’s share of the Democratic voter list software market. But there’s also the not-so-obvious case, like Pandora, which provides an upgraded music subscription permitting users to avoid the inconvenience of that email pitch.

Some may be well-situated to reap the benefits, but others—even those with sufficient resources to play the game—face challenges.

One challenge extends from a fundamental characteristic of big data—namely that big data and the campaign practices related to it make the implicit tradeoff of volume over accuracy. The draw of the data is that it involves a large number of cases—voters, donors, users—and a lot of information about each one. It’s an enterprise that derives value from volume and along with that tolerates some inaccuracy.

Even the best voter lists include error, but the problem isn’t isolated to lists. The browsing history of a voter might better reflect her children’s digital life than her own. But big data is premised on the assumption that success can be achieved even with imperfection in data. The conceit of big data is that it is continually updated and refined, a product or tool that will be even more valuable in the next cycle because of the information gathered and insight gleaned in the present one. It might not work that way.

Consider the possible build-up of voter concerns about privacy as they learn about and experience first-hand targeted online political advertising. The awe with which the campaign industry views the ability to track digital histories is matched by the distress of both unwavering privacy advocates and those simply unsure of where this new world of mined digital data will end.

Voters are inundated with digital ads, so many that even the most naïve user will eventually put two and two together, realizing that someone has amassed a lot of information about his preferences and activities. It’s possible that privacy concerns may dissipate, but it’s equally if not more likely that they will build steam, fueled in part by the big data practices of 2012.

The metrics that go hand-in-hand with data-driven efforts can also cut both ways. Metrics help keep the operation and staff on track and offer ready indications of the success of their efforts. This applies both in the field and in the digital realm. At best, these measurable outcomes are functional for an organization—to track performance, motivate staff and volunteers, and even to signal to the broader world (especially donors) that the organization is active and productive. At worst, the measures can represent hollow actions undertaken simply to meet a quota.

And at a minimum, many metrics fail to capture a quality dimension. A count of doors knocked glosses over the quality of the volunteer/voter interaction, just as an open rate says little about actual readership.

The challenges posed by big data may be more significant for those whose vision is long term. Those who live or die in the present, or at least in the very next election, may effectively dodge the minefield. Others who expect to be around for a while should be especially watchful. The party organization comes immediately to mind. It may need to be more concerned than the candidate campaigns with the prospect of voters alienated by tireless contact efforts and besieged with digital ads.

Likewise, any appreciable reaction to perceived breaches of privacy might be down the road, with those campaign efforts limited in duration being somewhat insulated. Even those errors tolerable at t1 might be multiplied dramatically at t2 by data integration practices.

Acknowledging the minefield need not imply a pessimistic projection for big data. Despite flaws, it can help engage and mobilize. Others take issue with this judgment; academic David Parry delivers a provocative assertion that big data reduces campaigns to social engineering competitions.

But remember that even given the prevalence of big data, it’s far from the only game in town. Furthermore, it probably involves a few too many flaws to constitute real engineering, social or otherwise. At the same time, Parry’s warning is well taken, reminding everyone that the ramifications associated with information and tools extend well beyond winners and losers in a given election contest.

If Parry represents a strong voice of concern, Tony Hey is a voice of unbridled optimism about the general prospects of big data, writing about “how data mashups can help save the world.” It’s reasonable to expect that the political outlook falls somewhere between these extremes. Still, it would be unwise for campaign professionals to overlook the flaws and deficiencies of big data as they move forward.

Barbara Trish is an associate professor and chair of the political science department at Grinnell College.
