You (probably) know nothing, Jon Snow…

There is currently a battle raging…

Not the usual paradigm-based Psychodynamic vs Cognitive argument this time, but something even more fundamental: how do we analyse the data? Countless undergraduate Psychology students with a keen interest in the behavioural element of the discipline have long been found poring over programs like SPSS (or, in my day, Minitab) as they get to grips with the statistical element of research methodology, arguably the area many Psychology students would identify as their greatest weakness. Once they have grasped even the most fundamental elements, such as p-values (the new A-Level specifications have returned to requiring a heightened level of knowledge of test choices and parametric vs non-parametric testing), the next stage is interpreting the conclusions drawn. (Here is a great starter site for students to pick up the fundamentals.)

Back to the battle… A new(ish) contender has entered the ring. Bayesian modelling is starting to become the statistician's weapon of choice over using probability values to decide whether to reject the null hypothesis. All examples are best explained using… Game of Thrones, of course. See below an article that uses Bayesian modelling to predict who is going to die next in Game of Thrones!


The Model

The idea is simple. Each chapter in the first five Song of Ice and Fire books is told from the point of view of a particular character, and Vale used the number of chapters dedicated to each character in each book to create a simple mathematical model to predict how many chapters might be dedicated to each character in the next two books. Of course, this method can’t predict specific storylines and plot twists. But it does allow for some educated guesses.

“Presumably, dead implies zero POV chapters,” Vale says. “So there should be a small amount of information about the potential deaths of characters if we believe the model.” For example, Vale’s predictions put the odds of Jon Snow having zero chapters in the sixth book at about 38 percent, and the odds of him having zero chapters in the seventh book a little over 67 percent. In other words, based solely on the model, it appears Snow may well be dead by the end of the sixth book.

But Vale doesn’t put much stock in his own predictions. “I am cautiously pessimistic about the model’s chances of giving a good prediction,” he says.

The Dearth of Data

That’s partially because there’s not much data available. Even at a whopping 5,216 pages, five books doesn’t give Vale much to go on. There’s also no real reason to believe there’s a predictable pattern to how many chapters a character will drive before being killed off. And, of course, the model doesn’t take the content of the previous books into account. That leads to some plainly wrong predictions.

The model says it’s possible that there will be chapters dedicated to characters already dead, for instance, and it says some characters, who are clearly alive, may not appear in any chapters. “In general, the best predictions are obtained by a combination of modelling and common sense,” Vale wrote in the paper. “Here we focus entirely on the modeling side and leave common sense behind.”

One of the big ideas of Bayesian statistics is that you can update your predictions as new data becomes available. So, once the sixth book is out, Vale could add that data to the model to make a set of updated predictions about the seventh book. But he doubts the model will actually do well enough for him to bother updating it with fresh data. Plus, he’s heard rumors that Martin will abandon the practice of writing each chapter from a different character’s point of view, which would break the entire model.
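To make the updating idea concrete, here is a minimal sketch in Python. It is not the model from Vale's paper: it assumes, purely for illustration, that a character's chapter count per book follows a Poisson distribution with a Gamma prior on the rate, and the chapter counts and prior values are made up. The function names update and prob_zero are my own.

```python
# Illustrative Gamma-Poisson sketch. This is NOT the model from the paper;
# the prior and the chapter counts below are hypothetical.

def update(alpha, beta, counts):
    """Conjugate update of a Gamma(alpha, beta) prior on a Poisson rate.

    Each observed count adds to alpha; each observed book adds 1 to beta.
    """
    return alpha + sum(counts), beta + len(counts)


def prob_zero(alpha, beta):
    """Predictive probability of zero chapters in the next book.

    With a Gamma(alpha, beta) belief about the Poisson rate, the predictive
    distribution is negative binomial, so P(0) = (beta / (beta + 1)) ** alpha.
    """
    return (beta / (beta + 1.0)) ** alpha


# Assumed weak prior and made-up chapter counts for five books.
prior_alpha, prior_beta = 1.0, 1.0
chapters_so_far = [3, 2, 0, 1, 0]

alpha, beta = update(prior_alpha, prior_beta, chapters_so_far)
print("P(zero chapters in book six):", round(prob_zero(alpha, beta), 3))

# When book six arrives, fold it in and predict book seven.
# Here we pretend the character had no chapters in book six.
alpha, beta = update(alpha, beta, [0])
print("P(zero chapters in book seven):", round(prob_zero(alpha, beta), 3))
```

The point is only the mechanics: yesterday's posterior becomes today's prior, so each new book shifts the predicted chance of a character vanishing from the page.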

Ultimately, this is probably more of a lesson in what not to do when building a mathematical model. But while the paper might not help you win any betting pools, it does help get a sense of how mathematicians approach predictions—at least when they don’t have much else to go on.

Here is an excellent article that discusses in detail the extent to which bias is endemic within Psychology. It also discusses the Bayesian approach over p-values, including reference to Pennington and Hastie’s research into story order and mock juries in general.