Sunday, December 30, 2012

Footy tipping with some help from Bayes

Christmas is the season for frivolous pursuits. For the fun of it, I thought I would adapt the Bayesian model I use to pool the polls to see how it would fare against the bookmakers in predicting NRL footy outcomes for the 2012 season.

The model I tested was very simple. It assumed that the score difference between two teams can be explained by two kinds of parameter. The first is a home-ground advantage parameter for each team. The second is a strength parameter for each team. The team strength parameters are allowed to evolve from round to round as a random walk. The model can be expressed roughly as follows.

(home_score - away_score) = home_team_advantage + home_strength - away_strength

team_strength_in_round ~ normal(team_strength_in_prev_round, team_standard_deviation)
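To make the two equations concrete, here is a minimal Python sketch that simulates them forward. It is only an illustration: the function names, the `initial_sd` spread for round-one strengths and the numbers in the demo are assumptions, not values from the fitted model.

```python
import random


def simulate_strengths(n_teams, n_rounds, walk_sd, initial_sd=5.0, seed=0):
    """Return strengths[round][team]; each team follows a Gaussian random walk."""
    rng = random.Random(seed)
    # Round-one strengths are drawn around zero (an assumption for this sketch).
    strengths = [[rng.gauss(0.0, initial_sd) for _ in range(n_teams)]]
    for _ in range(1, n_rounds):
        previous = strengths[-1]
        # This round's strength is centred on last round's strength.
        strengths.append([rng.gauss(previous[t], walk_sd) for t in range(n_teams)])
    return strengths


def expected_margin(strengths, home_advantage, rnd, home, away):
    """Expected home-minus-away score difference for one game."""
    return home_advantage[home] + strengths[rnd][home] - strengths[rnd][away]


def simulate_margin(strengths, home_advantage, rnd, home, away, noise_sd, rng):
    """One noisy margin scattered around the expected margin."""
    mean = expected_margin(strengths, home_advantage, rnd, home, away)
    return rng.gauss(mean, noise_sd)


if __name__ == "__main__":
    strengths = simulate_strengths(n_teams=4, n_rounds=3, walk_sd=1.0)
    home_advantage = [2.0, 2.0, 2.0, 2.0]  # hypothetical values, in points
    rng = random.Random(42)
    print(expected_margin(strengths, home_advantage, rnd=2, home=0, away=1))
    print(simulate_margin(strengths, home_advantage, 2, 0, 1, noise_sd=12.0, rng=rng))
```

Fitting the model runs this logic in reverse: given the observed margins, the sampler infers the strengths, home advantages and standard deviations.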

The JAGS code for this model is as follows.

    model {
        # observational model
        for( i in 1:N_GAMES ) {
            score_diff[i] <- homeAdvantage[Home_Team[i]] +
                (strength[Round[i], Home_Team[i]] - strength[ Round[i], Away_Team[i] ])
            Home_Win_Margin[i] ~ dnorm(score_diff[i], consistencyPrec)        
        }
            
        # temporal model
        for( round in 2:N_ROUNDS ) {
            for( team in 1:N_TEAMS ) {
                strength[round, team] ~ dnorm(strength[(round-1), team], strongWalkPrec[team])
            }
        }
            
        # predictive model: expected margin for games N_FROM to N_GAMES
        for( i in N_FROM:N_GAMES ) {
            prediction[i-N_FROM+1] <- score_diff[i]
        }
            
        # priors
        consistencySD ~ dunif(0.0001,100)               # vague prior - positive
        consistencyPrec <- pow(consistencySD, -2)
                
        for( team in 1:N_TEAMS ) {
            strength[1, team] ~ dnorm(0, pow(100, -2))  # vague prior
                
            homeAdvantage[team] ~ dnorm(0, pow(10, -2)) # vague prior
                
            strongWalkSD[team] ~ dunif(0.0001,4)        # uniform prior - positive, capped at 4
            strongWalkPrec[team] <- pow(strongWalkSD[team], -2)
        }
    }
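The `prediction` node holds the expected margin only, so turning posterior draws into home-win, draw and away-win probabilities needs one more step. The Python sketch below shows one way to do it, assuming the draws of the expected margin and of `consistencySD` have been exported from JAGS as two equal-length lists. The function names, the rounding to whole points and the tie-break in favour of the home side are assumptions of this sketch.

```python
import random


def outcome_probabilities(expected_margins, noise_sds, seed=0):
    """Estimate (P(home win), P(draw), P(away win)) for one game.

    expected_margins[k] and noise_sds[k] are the k-th posterior draws of the
    game's expected margin and of the game-to-game standard deviation.
    """
    if len(expected_margins) == 0 or len(expected_margins) != len(noise_sds):
        raise ValueError("need two non-empty lists of equal length")
    rng = random.Random(seed)
    home = draw = away = 0
    for mean, sd in zip(expected_margins, noise_sds):
        # Simulate one margin per posterior draw; real scores are whole numbers.
        margin = round(rng.gauss(mean, sd))
        if margin > 0:
            home += 1
        elif margin < 0:
            away += 1
        else:
            draw += 1
    n = home + draw + away
    return home / n, draw / n, away / n


def pick_winner(p_home, p_away):
    """Tip the likelier side, ignoring draws; exact ties go to the home team."""
    return "home" if p_home >= p_away else "away"


if __name__ == "__main__":
    # Hypothetical posterior draws, standing in for real JAGS output.
    rng = random.Random(1)
    margins = [rng.gauss(4.0, 3.0) for _ in range(10000)]
    sds = [rng.uniform(10.0, 14.0) for _ in range(10000)]
    p_home, p_draw, p_away = outcome_probabilities(margins, sds)
    print(p_home, p_draw, p_away, pick_winner(p_home, p_away))
```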

I tested the model with data for the 2011 and 2012 seasons. For each round in 2012 (prior to the finals), I picked the team that the JAGS model thought most likely to win and the team that the bookmakers thought most likely to win. I did not tip draws: while I estimated the probability of a draw from the JAGS samples, I only picked the larger of the home-win and away-win probabilities. For the JAGS prediction, I simulated each round 10,000 times. For the bookmaker prediction, I converted the odds to probabilities, which I then adjusted for the bookmaker's overround so that the home-win, away-win and draw probabilities summed to one.
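The overround adjustment can be sketched in a few lines of Python. This sketch assumes decimal odds and a simple proportional rescaling, which is one common way to remove the overround and not necessarily the exact adjustment used here. The function name and the odds in the demo are made up for illustration.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities that sum to one.

    The raw implied probabilities (1 / odds) sum to more than one; the excess
    is the bookmaker's overround. Dividing by the total rescales them
    proportionally.
    """
    raw = [1.0 / odds for odds in decimal_odds]
    total = sum(raw)  # 1 + overround
    return [p / total for p in raw]


if __name__ == "__main__":
    # Hypothetical odds for home win, draw and away win.
    odds = [1.60, 21.00, 2.40]
    p_home, p_draw, p_away = implied_probabilities(odds)
    tip = "home" if p_home >= p_away else "away"
    print(p_home, p_draw, p_away, tip)
```

Proportional rescaling never changes which side is the favourite, so the bookmaker's tip is simply the side with the shorter odds. The adjustment only matters when the probabilities themselves are compared.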

The end result (for such a simple model) was very close. Over the course of 2012, the JAGS model picked the winning team 121 times. The bookmakers (or more accurately, the punters collectively) got it right 122 times.

The challenge now is to refine the model and make it better than the bookmakers.
