I wouldn’t be surprised if nobody in Brazil ever read Nate Silver’s fivethirtyeight.com again.
A lot of readers in the U.S. might abandon it, too.
On July 7, a day before Brazil was demolished by Germany in a game that ended with an almost unbelievable score of 7-1, Silver predicted Brazil would win, even without two of its star players. After an exhausting (and I do mean exhausting) analysis that could engage only the most diehard statisticians, Silver concluded that Brazil had a 65 percent chance of winning against Germany.
It’s hard to imagine how Silver could have been more wrong.
Unlike Silver’s U.S. election predictions, his analysis of the World Cup was more like a professional paper than a news story. Further, it relied heavily on data from elsewhere that Silver was forced to try to explain and justify. He did that at considerable length, but I wasn’t convinced. I just don’t understand enough about, say, ESPN’s Soccer Power Index to know whether or not to believe it, and I’m certainly not going to read the agate text explaining it. I have a life.
Further, Silver did some fancy hand waving to get around another problem: Who would replace the missing players? He couldn’t know, which left him in the position of trying to predict the game’s outcome without knowing who would be playing. We’re getting far away from data journalism when we read things such as, “So let’s assume that The Guardian’s panelists have it right…” That’s not data; that’s opinion. That’s punditry, which Silver has repeatedly assailed. It would be perfectly acceptable in a column by a sportswriter, but it doesn’t belong in a piece that is supposed to be based on data.
By the time Silver gets to his prediction, he has made so many assumptions and leaps and justifications that we don’t know what to believe.
I was disappointed to see this. I’m a believer in data, and I have no doubt that infusing science journalism with data will improve coverage. Like it or not, Silver is the driving force behind data journalism. And when he gets careless or sloppy, the entire field suffers.
“The Brazil-Germany Match Shows That Sports Predictions, Odds And Data Analysis Are Bullshit,” was the indelicate headline on a story by Eric Goldschein at sportsgrid.com.
“There is no accounting for extraneous variables such as tactical errors (on and off the pitch), mental fortitude (or a lack thereof), misplayed reads, and any of the other thousands of things that can tip a match’s momentum one way or the other. Soccer, which has its fair share of deflections for goals and blown calls, is particularly prone to this,” he wrote.
In a nice little piece in The Guardian, Andrew Steele, a physicist, reflected on his bet on the game–which was based on the assumption that soccer goals follow a Poisson distribution. The distribution did a nice job of describing the competition up until the Brazil-Germany game, he wrote, but “a 7-1 thrashing does something fishy to the Poisson’s predictors.” The problem, he concluded, was “the rarity of extreme events,” a problem that extends, in his view, to the use of mathematical models by investors (where the consequences can be far more substantial than a temporary loss of national pride).
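Steele’s point about the rarity of extreme events is easy to see with a few lines of arithmetic. The sketch below, assuming a purely hypothetical goal expectation of 1.6 goals per match for a strong side (a made-up figure, not one from Steele or Silver), shows how little probability a Poisson model leaves for a seven-goal outburst:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k goals when goals ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

def prob_at_least(k, lam):
    """Probability of k or more goals: 1 minus the probability of 0..k-1."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))

# Hypothetical expectation: a strong attacking side averaging 1.6 goals/match.
lam = 1.6
p7 = prob_at_least(7, lam)
print(f"P(7+ goals | lam={lam}): {p7:.6f}")
```

Under that assumption, a seven-goal game has a probability on the order of one in a thousand, which is roughly Steele’s point: a model that describes ordinary matches well can still say almost nothing useful about the tail where a 7-1 thrashing lives.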
As it happens, Silver agrees with my analysis, more or less. After the game, he wrote, “Time to eat some crow. That prediction stunk.”
The loss of the two star players “may have had a much larger impact than we accounted for,” he writes. Another factor? “Some bad luck for Brazil,” he writes.
We shouldn’t abandon data journalism because Silver blew a call. But if we see too many blown calls, data journalism will surely suffer.
Here’s a thought: How about if we dial predictions way, way down, and instead use data journalism to explain the complicated world we live in?
Predicting the future has never been an easy business. But we still have a lot to learn from the past.