Can This Clever Statistical Model Predict Olympic Medal Winners?

Interactive: Two brothers’ quest to Nate Silver the Sochi Winter Olympics

How many medals will the U.S. walk away with at this year’s Winter Olympics? What about perennial runner-up China? Two brothers, Dan and Tim Graettinger, think they have the answers, and you’ll be surprised to hear how they got them.

Dan came up with the idea to Nate Silver the Olympics while watching NBC’s nightly medal count during the 2010 Winter Games. Inspired by Google’s 20% rule, under which employees can spend 20% of their work time on personal projects, Dan pitched the idea to his brother, a data analyst. Over the next four years, the two collected more than 30 datasets and ran regression after regression until they found a model that matched the results of the past two Winter Olympics with incredible accuracy. The first chart pictured below shows which countries the brothers’ model predicts will win it all at this year’s Games.

"If he had known how long it would take to assemble the data," Dan tells Co.Design, "maybe Tim would’ve told me to work on something else."

Careful to avoid speculative biases, Dan collected every piece of information he could imagine, spanning economic, geographic, religious, and sociopolitical metrics. After Dan compiled all the data, Tim created a first-pass model that determined whether a country would leave the Games with at least one medal. Many nations, including every country from Africa, South America, and the Middle East, were relatively easy to predict because they’ve never taken home a winter medal. Other perennial losers include Iceland, Greece, and Argentina.

The strongest predictor of a country’s Winter Olympics success was its performance two years before, in the Summer Games. "It was totally unexpected," says Dan. "If a country didn’t medal in the summer, there was 100% certainty that it wouldn’t win at the next Winter Olympics." Although the Jamaican bobsled team has faced quite a bit of stress on the road to Sochi, their country’s strong 2012 performance in track and field spells good news for their chances to medal this year.

For the final model, the Graettinger brothers found that only four variables consistently predicted a country’s medal count in the Olympics: geographic size, GDP per capita, the value of its exports, and the capital city’s latitude.
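To make the structure of such a model concrete, here is a minimal sketch of how the two pieces described above might fit together: a first-pass gate based on Summer Games medals, followed by a regression on the four variables. This is not the Graettingers’ actual model. The linear form, the way the two steps are chained, the field names, and every weight value are assumptions made purely for illustration; the brothers’ coefficients are not given here.

```python
from dataclasses import dataclass


@dataclass
class Country:
    name: str
    summer_medals: int       # medals won at the preceding Summer Games
    area_km2: float          # geographic size
    gdp_per_capita: float
    exports_value: float
    capital_latitude: float  # latitude of the capital city, in degrees


def predict_medals(country: Country, weights: dict, intercept: float) -> float:
    """Two-step sketch: gate on Summer Games medals, then a linear score.

    `weights` maps each of the four variable names to a coefficient.
    Both `weights` and `intercept` are hypothetical; they would come
    from fitting a regression to past Winter Olympics results.
    """
    # Step 1 (first-pass model): no summer medal means no winter medal.
    if country.summer_medals == 0:
        return 0.0

    # Step 2 (final model): weighted sum of the four predictors.
    score = (
        intercept
        + weights["area_km2"] * country.area_km2
        + weights["gdp_per_capita"] * country.gdp_per_capita
        + weights["exports_value"] * country.exports_value
        + weights["capital_latitude"] * country.capital_latitude
    )
    # A medal count cannot be negative.
    return max(0.0, score)


if __name__ == "__main__":
    # Made-up country and made-up weights, for illustration only.
    example = Country("Exampleland", 5, 100.0, 10.0, 2.0, 50.0)
    toy_weights = {
        "area_km2": 0.01,
        "gdp_per_capita": 0.1,
        "exports_value": 1.0,
        "capital_latitude": 0.02,
    }
    print(predict_medals(example, toy_weights, intercept=1.0))
```

Keeping the gate separate from the regression mirrors the order in which the brothers describe building the model, and it keeps the "never medals" countries from distorting the fit for everyone else.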

"It’s interesting when you bring together data that doesn’t naturally co-exist," Tim says. "You don’t usually see medal counts lined up against all this economic data."

Of course, not every country fits the model perfectly, as our second graph illustrates. Although the Graettingers’ predictions landed within three medals of the actual count for more than 80% of the countries in the 2006 and 2010 Games, several countries consistently defied expectations. South Korea raked in 16 more medals than the model predicted, whereas the United Kingdom walked away with 24 fewer medals than the model predicted. What is it about these countries that makes them so exceptional, for better or worse?
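The "within three medals" yardstick and the list of outliers are both simple to compute once you have predicted and actual counts side by side. The sketch below shows one way to do it. The function names and the toy numbers in the usage example are invented for illustration and are not the brothers’ data.

```python
def residuals(actual: dict, predicted: dict) -> dict:
    """Actual minus predicted medals for each country present in both."""
    return {c: actual[c] - predicted[c] for c in actual if c in predicted}


def within_tolerance_rate(actual: dict, predicted: dict, tolerance: int = 3) -> float:
    """Share of countries whose prediction is off by at most `tolerance` medals."""
    res = residuals(actual, predicted)
    if not res:
        raise ValueError("no countries in common between actual and predicted")
    hits = sum(1 for r in res.values() if abs(r) <= tolerance)
    return hits / len(res)


def outliers(actual: dict, predicted: dict, tolerance: int = 3) -> list:
    """Countries missed by more than `tolerance`, biggest miss first.

    A positive residual means the country beat the model;
    a negative one means it fell short.
    """
    res = residuals(actual, predicted)
    flagged = [(c, r) for c, r in res.items() if abs(r) > tolerance]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    # Toy numbers, not real medal counts.
    actual = {"A": 10, "B": 2, "C": 20, "D": 0, "E": 5}
    predicted = {"A": 8, "B": 3, "C": 4, "D": 1, "E": 5}
    print(within_tolerance_rate(actual, predicted))
    print(outliers(actual, predicted))
```

Sorting by the size of the miss puts the over-performers and under-performers at the top of the table, which is where questions like the one about South Korea start.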

The bulk of South Korea’s medals came from speed skating. "How do you account for the fact that short-track speed skating is hugely popular in South Korea?" the brothers wonder in a blog post about their findings.

Home-court advantage also appears to be a very real factor. Italy and Canada overperformed in 2006 and 2010, respectively, when each hosted the Games.

"If we were going to Vegas, I think we would make some adjustments based on what we see in the outliers table," Dan says. Even if the Graettinger brothers aren’t putting money down on their predictions, an 80% success rate is far better odds than the house ever lets you play at the casino.

It’s easy to get lost in the Olympics’ human stories of dedication and perseverance, but the Graettingers’ findings suggest that a Shaun White born in Kyrgyzstan would probably be sitting at home this week, watching the Games from afar like the rest of us. What do you think? Have the Graettinger brothers pulled a Nate Silver, or did they miss a crucial variable? If you have any thoughts on how to improve their model, leave them in the comments below.

[Image: Turin, Italy via Paolo Bona / Shutterstock]