“It’s going to be a landslide,” I assured liberal friends and family in the weeks approaching the election. As they expressed concern over Hillary’s resurgent email controversy, or their sneaking suspicion that more people would vote for Trump on election day than would readily admit it, I scoffed at their ignorance.
“Have you checked out The Upshot?” I’d ask, like a kindergarten teacher talking to a kid with a runny nose. “It’s by the New York Times. It has pretty much the best data viz team in the world. And it has Hillary with an 85% chance to win. It’s right on top of the site.”
“Do you know FiveThirtyEight?” I’d float with a particular glee at the confounding, esoteric name they’d never remember well enough to Google later. “It’s the data site by Nate Silver. He’s pretty much the guy who called Obama in 2008.”
By Election Day, I’d reached peak data-smug, Facebooking that Hillary would have this whole thing in the bag by 9pm ET. Maybe 9:30. (Disregard that no polls even closed on the half hour.)
But then the opposite happened. As Trump racked up wins that night, I made sense of it by returning to my favorite graphic from the season–what I came to call FiveThirtyEight’s intestines visualization–to supplement the experience of watching CNN. Above all else, this was the one thing that had me sure of Hillary’s win. Trump might take Florida? I checked the viz. Florida is on the end of the political GI tract, a sphincter of voters defecating well into Trump territory. Florida was never a sure thing. Trump might take Michigan? Wait. Michigan? A mix of mental surgery (“if I slice the New Hampshire intestine here, and then I put a rubber band around Michigan there . . .”), electoral college finger counting, and sheer panic ensued as my brain struggled to accept the results.
All the smartest people were wrong. Or were they?
Silver’s tweet demonstrates just how tricky the world of statistics and probability can be. A 30% chance of Trump victory seems unfathomable. A 30% chance of earthquake seems inevitable.
The results of the election blindsided us, but the culprit wasn’t simply our collective inability to understand statistics. We were blindsided because the data was wrong, the actual margin of error behind that data was likely underestimated, and design made it all worse. When probability mixes with news mixes with data visualization, science is inherently editorialized. And in 2016, the best data publications in the world ultimately catered to our need for simplified narratives–to the point that even lauded data designers got lost inside it all.
“To see this sort of thing suddenly reverse itself was . . . wait everybody was wrong? All these people who’d been right this whole time with their giant brains and models,” says Eric Rodenbeck, founder of the lauded data visualization firm Stamen. “It took me a couple days even to just recover from that.”
The fundamental truth is that all political projections are based upon polling–which is really just a fancy term for a process in which specific groups of people are asked what they think about various candidates or issues via telephone, online, or in person. And 2016’s polling data was wrong in the Midwest and Northeast. While we can blame that on whatever phenomenon we like–say, the fact that many Republican voters were reluctant to admit to supporting a bigoted candidate, or something far more pedestrian–polling always comes down to people surveying other people, scaling a small subset of voters who are answering a few questions to ultimately speak for the actions of the majority on election day.
“Let’s just stop thinking data is infallible,” says Giorgia Lupi, co-founder of data-driven research firm Accurat. “It’s not. Data is primarily human-made, and reflects our errors in judgement.”
In 2008, the difference between state polling and the true final vote was a mere 1.7 points. In 2016, that difference more than doubled to 3.9 points in ten states. As The Upshot put it in an election post-mortem:
It was the biggest polling miss in a presidential election in decades.
Yet in many ways, it wasn’t wholly out of the ordinary.
Here, The Upshot left us with a narrative centered around probability: “Who could have seen this crazy thing coming? Except that crazy things happen every day and you should always see them coming!” Of course, just because something has low odds of happening doesn’t mean it won’t happen. Even a 10% chance of something going wrong will still screw one out of every 10 people–which is a lot, really. But that’s a truth that’s very hard for the human mind to grasp.
“This is what we need to think about. How do we represent these things?” says Rodenbeck. “It does happen. The Giants have swept the World Series. It’s statistically improbable but it does happen.”
Uncertainty is a problem with which the data media industry continues to grapple. It’s ultimately unfair to pin it on the populace when they don’t understand probability–as all designers know, you can’t change people, you have to change design for them. That’s especially true when we’ve learned that much of the source data was wrong to begin with. In fact, since the election we’ve learned that the difference between polling and voting was beyond the industry standard margin of error of ~3.5% in ten states.
Yet it’s unfair to simply write off 2016 as a polling outlier for which no one is culpable. Polls have stunk for a long time, and someone did see this whole mess coming. Earlier this year, Columbia University researcher and statistician Andrew Gelman published a paper that called the margin of error of polling into question. After analyzing 4,221 polls regarding 608 state-level elections over the past 26 years, his team found we were vastly underestimating margin of error in polling. While a baseline of ~3.5% is standard, he proposed margin of error should be doubled to 7%. That would mean that if a poll concluded that 52% of Americans would vote for Hillary Clinton, accounting for margin of error, the true, reliable finding was that anywhere from 45% of Americans to 59% of Americans would vote for Hillary. Because a 7% margin of error actually equates to a 14-point range.
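For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The function name poll_range is my own invention for illustration, and it only does the simple plus-or-minus subtraction described above, not anything resembling Gelman’s actual analysis.

```python
def poll_range(share_pct, margin_pct):
    """Return the (low, high) range implied by a poll share and a +/- margin of error.

    Hypothetical helper for illustration only: it applies the margin
    symmetrically, exactly as the paragraph above describes.
    """
    return share_pct - margin_pct, share_pct + margin_pct


# The example from the text: 52% support with a 7-point margin of error.
low, high = poll_range(52, 7)
width = high - low
print(f"{low}% to {high}% (a {width}-point range)")

# The same poll read with the standard ~3.5-point margin, for comparison.
std_low, std_high = poll_range(52, 3.5)
print(f"{std_low}% to {std_high}% (a {std_high - std_low}-point range)")
```

Doubling the margin doubles the width of the range, which is why a 52% result stops looking like a safe majority once the low end drops to 45%.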
As The Upshot wrote of this research in early October of this year, “The implication? Even if you see a poll in early November that has Donald J. Trump up by three points or Mrs. Clinton up by five, you should still not be so sure who is going to win the election.”
“The newspapers reported about it . . . yet none of them [seem to have] adopted it,” says Kim Rees, co-founder of “do good with data” visualization firm Periscopic. “Well, OK, you’re not really taking action. You’re not being responsible. If you really believe this is true, you should at least be considering it. That pissed me off.”
Yet on the phone, Gelman is quick to downplay the significance of his own paper when I ask about it, saying that he suspects media statisticians like Silver are accounting pretty well for such poll deviation in their models. “The polls were off . . . more than they should have been off,” he says, pointing out our societal amnesia of polling fiascos like Dewey Defeats Truman. “That happens. That’s the way it goes. You have to expect that can happen.”
Either way, adopting a wider margin of error is more than just a math problem; it has to be visualized clearly for readers. In this case, it essentially would have made it impossible to be certain of a walk-away victory for Hillary Clinton. There’s always a range. This range is listed, but often buried on sites like The Upshot and FiveThirtyEight under simpler, bite-sized infographics that rounded out Hillary’s odds of winning to a solid 85%.
As The Upshot admitted later, “Reactions to Mr. Trump’s victory suggest that, despite our efforts, we failed at explaining that an 85% chance is not a 100% chance. If we did it all again, we would probably emphasize uncertainty in a more visceral way, rather than using a simple statement of probability . . .”
And yet in a race as close as this election–Hillary won the popular vote by just seven-tenths of a percent and had four-point polling leads in many states she lost–it’s worth asking: If margin of error is greater than margin of victory, what’s even the point of visualizing or discussing this stuff at all? “If those lines are overlapping all the time, you can’t say who is winning,” says Rees.
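Rees’s point about overlapping lines can be sketched the same way. The snippet below is a rough illustration, not anyone’s published method: the function name ranges_overlap and the 48-to-44 and 55-to-44 polls are hypothetical numbers I made up, and the only figures taken from the reporting above are the ~3.5% and 7% margins.

```python
def ranges_overlap(leader_pct, trailer_pct, margin_pct):
    """Return True when the two candidates' +/- margin ranges overlap.

    Hypothetical helper for illustration: if the trailing candidate's
    high end reaches the leader's low end, the poll alone cannot say
    who is ahead.
    """
    leader_low = leader_pct - margin_pct
    trailer_high = trailer_pct + margin_pct
    return trailer_high >= leader_low


# A made-up four-point lead (48% to 44%) under the standard ~3.5-point margin.
four_point_standard = ranges_overlap(48, 44, 3.5)

# The same made-up lead under the doubled 7-point margin.
four_point_doubled = ranges_overlap(48, 44, 7)

# A made-up eleven-point lead (55% to 44%) under the standard margin.
eleven_point_standard = ranges_overlap(55, 44, 3.5)

print(four_point_standard, four_point_doubled, eleven_point_standard)
```

In this toy setup, a four-point lead overlaps under either margin, and only a much wider gap separates the two ranges cleanly.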
The design of the graphics themselves is also subject to bias. “As someone who’s in that world, and trusts that world, I remember when we started going into data visualization stuff, we had this idea that all we needed to do was go out, get the data, and represent it,” says Rodenbeck. “We forgot there was this middle step, that actions needed to be taken based upon implicit biases. If we needed anything to awaken us from that notion, it just happened.”
This year, objective red and blue election maps gave way to a new wave of clever, cute, and downright exciting graphics that had silent commentary baked right in. And both The Upshot and FiveThirtyEight topped their election pages with a tug of war graphic between Hillary and Trump that ultimately presented a singular figure–the possible chance Hillary and Trump each had to win. “There’s a certain amount of ballsiness or chutzpah people have when they feel the numbers are dramatic. It’s going to be so fucking far ahead we’re going to put it all out there,” says Rees. “It’s ‘design to that outcome we believe in’ versus ‘let’s just chart this like any other thing.’”
The New York Times coded jitter and randomness into a series of election pressure gauges it tweeted through the night of the election. To anyone with a heart, it felt like the graphics were literally about to explode, like ticking time bombs powered by angst. While two New York Times contributors have each penned articles defending this decision as a scientifically accurate means to present margin of error, the graphics’ emotional impact seemed outsized compared to their informational value. As if there’s no other means to present a range of numbers than with a prop from a MacGruber sketch.
The most striking case of data bravado–in my opinion, obviously–was in FiveThirtyEight’s aforementioned intestines graph. Take a look at it again. As Rees points out, there’s no information here that couldn’t be presented in a boring old bar chart that was nothing but data on an X and Y axis. Instead, FiveThirtyEight created the equivalent of a nostalgic board game. States twisted and turned with uncertain value to the entire election like a political process built upon Chutes and Ladders.
But the winding graph didn’t capture endless possibility like plotted data would. It captured one potential outcome. I only realized how editorialized this seemingly objective graphic was on election night itself, as soon as Hillary looked like she was losing Michigan. What did that one state loss mean for the entire graphic? How could I reprocess FiveThirtyEight’s assumptions with shifting information? I couldn’t, because this graphic is built for a blow-out with every state falling in line exactly as Silver projected.
“It was like it was designed for a Hillary victory,” says Rees.
Perhaps the most difficult part of reconciling what went wrong with 2016’s political projections is that it’s hard to know if we can trust what happens next election season.
The Upshot has published a mea culpa acknowledging it could have done a better job representing uncertainty. Meanwhile, FiveThirtyEight contends that, in fact, it did predict that Trump could win just as he did–the site just gave it a low likelihood of happening. “We basically had [Trump] as a little better chance than a team down 3-2 in the NBA finals,” explained a staffer from FiveThirtyEight. When I think of the statistics that way, I feel like an idiot for being blindsided by a Trump victory. Maybe that’s all FiveThirtyEight should have said the entire election season!
And yet, the business of political poll prognostication coupled with the social-media-powered news machine practically dictates that we demand more from political data publishers than a single grounding metaphor to comprehend the possible outcome of an election. There’s always another story to tell with the data. In 2016, we are all information junkies, and the next fix is never far away.
“The media’s responsibility is to inform the public and contribute to a productive democratic process,” wrote Danah Boyd, a data specialist from NYU and Microsoft Research, in a cautionary post. “By covering political polls as though they are facts in an obsessive way, they are not only being statistically irresponsible, but they are also being psychologically irresponsible.”
Frankly, it’s hard to critique the system when we are all more or less stuck inside it.
“Hopefully the people who are charting kind of got schooled, and are going to check themselves next time they go into something, asking, ‘is this just my bias talking, or is this an actual number?’” says Rees. In fact, during the week since the election, Rees has already organized 40 volunteers to launch what she calls a “Snopes for data charting,” essentially a collection of data specialists commenting on charts you see around the web. It might even live as a Chrome extension that anyone could install.
Over the coming months and years, data designers will undoubtedly reckon with how they approach data in the next election. Indeed, the responsibility has never been greater. “If there was ever a time that data visualization could see itself removed from rough and tumble of politics,” says Rodenbeck, “I think we’ve seen that rudely disavowed.”
The Upshot’s Amanda Cox declined to comment for this article, and FiveThirtyEight’s Nate Silver did not respond to multiple requests to comment.