## Saturday, July 27, 2013

### How much was Kevin Rudd worth?

I was a little surprised to see Simon Jackman suggest that Kevin Rudd had moved the two-party preferred voting intention by seven percentage points in Labor's favour. This is not consistent with my own analysis: only one pollster (Morgan) has data supporting a seven-point movement. Data from all the remaining pollsters suggest the "Rudd effect" was less than seven percentage points.

Now don't get me wrong, I have enormous respect for Professor Jackman. I purchased and read his 600-page text, *Bayesian Analysis for the Social Sciences*. It is a tour de force on Bayesian statistics, and I cannot recommend it highly enough. His understanding and knowledge in this area far surpass my own. Unashamedly, I have used Jackman's approach as the basis for my own aggregation efforts.

However, I suspect he has not noticed that the data since the second ascension of Kevin Rudd violate a number of the linear assumptions implicit in his model. In particular, some of the house effects before and after Kevin's return are radically different. I blogged on this under the rubric: When models fail us. As I noted then, the violation of the underpinning assumptions results in the model producing incorrect results.
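
One quick way to see the problem is to compare each pollster's polls with a smoothed trend fitted across all pollsters, before and after the change of leader. The sketch below is illustrative only: the data frame and column names (output.data, TPP, Date, House) match those used in the code later in this post, and the date of Rudd's return is assumed to be 26 June 2013.

```
# illustrative diagnostic: each pollster's mean deviation from a loess trend
# fitted to all polls, split at Rudd's return (date assumed below)
trend <- loess(TPP ~ as.numeric(Date), data=output.data)
output.data$deviation <- output.data$TPP - predict(trend)
output.data$era <- ifelse(output.data$Date >= as.Date('2013-06-26'), 'AR', 'PR')

# houses whose PR and AR mean deviations differ markedly are the problem cases
aggregate(deviation ~ House + era, data=output.data, FUN=mean)
```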

Revisiting the discontinuity model I initially used following Rudd's restoration, I have treated the Morgan, Galaxy and Essential data before and after the restoration as different series. I have also centred the aggregation on the assumption that the house effects for Newspoll and Nielsen sum to zero (this may turn out to be problematic, but it is sufficient for the moment). Notwithstanding some remaining doubts, I think this approach overcomes many of the problems with my earlier discontinuity model. I will cut to the results before reviewing the R and JAGS code.

The key finding is that Kevin was worth 5.6 percentage points in Labor's two-party preferred vote share.

Turning to the house effects, we can see some of the variability in the pre-Rudd (PR) and after-Rudd (AR) values.

The revised model follows. The first code block contains the R code for managing the Morgan sample size and for separating the relevant polls into pre-Rudd (PR) and after-Rudd (AR) series. The second code block contains the JAGS model. (As an aside, I have been playing with Stan lately, and may make the switch down the track.)

```
# fudge sample size for Morgan multi - adjustment for observed over-dispersion
output.data[output.data[, 'House'] == 'Morgan multi', 'Sample'] <- 1000

# treat before and after for Morgan, Galaxy and Essential as different series;
# discontinuity is the date of Rudd's return (assumed set earlier, e.g. '2013-06-26')
output.data$House <- paste(as.character(output.data$House),
    ifelse(as.character(output.data$House) %in% c('Essential', 'Morgan multi', 'Galaxy'),
        ifelse(output.data[, 'Date'] >= as.Date(discontinuity), ' AR', ' PR'), ''),
    sep='')

# Newspoll is House number one in the factor ...
l <- levels(factor(output.data$House))
n <- which(l == 'Newspoll')
l[n] <- l[1]
l[1] <- 'Newspoll'
output.data$House <- factor(output.data$House, levels=l)
```
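
Two details are worth noting. The Morgan multi sample sizes are deliberately understated because those polls have shown more variability than their reported sample sizes would imply. And Newspoll is moved to the first level of the House factor so that, in the JAGS code below, the deterministic node houseEffect[NEWSPOLL] sits outside the loop that places priors on houses 2 through HOUSECOUNT.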



```
model {
    ## Based on Simon Jackman's original model

    ## -- observational model
    for(poll in 1:NUMPOLLS) {
        y[poll] ~ dnorm(walk[day[poll]] + houseEffect[house[poll]], samplePrecision[poll])
    }

    ## -- temporal model
    for(i in 2:PERIOD) { # for each day under analysis ...
        ## the discontinuity is assumed to enter the walk on the day of
        ## Rudd's return (DISCONTINUITYDAY, a day index supplied in the data)
        centre[i] <- walk[i-1] + equals(i, DISCONTINUITYDAY) * discontinuityValue
        walk[i] ~ dnorm(centre[i], walkPrecision)
    }
    sigmaWalk ~ dunif(0, 0.01)            ## uniform prior on std. dev.
    walkPrecision <- pow(sigmaWalk, -2)   ##   for the day-to-day random walk
    walk[1] ~ dunif(0.01, 0.99)           ## uninformative prior
    discontinuityValue ~ dunif(-0.2, 0.2) ## uninformative prior

    ## -- sum-to-zero constraint on house effects
    for(i in 2:HOUSECOUNT) { ## vague normal priors for house effects
        houseEffect[i] ~ dnorm(0, pow(0.1, -2))
    }
    #houseEffect[NEWSPOLL] <- -sum(houseEffect[2:HOUSECOUNT]) ## all sum to zero
    houseEffect[NEWSPOLL] <- -houseEffect[NIELSEN] ## Newspoll and Nielsen sum to zero
    #houseEffect[NEWSPOLL] <- 0 ## centred on Newspoll as zero
}
```
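
For completeness, here is a minimal sketch of how the model might be run from R and the Rudd effect extracted. It is a sketch only: the file name (discontinuity.jags), the TPP and Sample column names, and the construction of DISCONTINUITYDAY are my assumptions, not code from the aggregation proper.

```
library(rjags)

first.day <- min(output.data$Date)
data.list <- list(
    NUMPOLLS = nrow(output.data),
    PERIOD = as.numeric(max(output.data$Date) - first.day) + 1,
    DISCONTINUITYDAY = as.numeric(as.Date(discontinuity) - first.day) + 1,
    HOUSECOUNT = length(levels(output.data$House)),
    NEWSPOLL = which(levels(output.data$House) == 'Newspoll'),
    NIELSEN = which(levels(output.data$House) == 'Nielsen'),
    y = output.data$TPP / 100,                  # proportions, not percentages
    day = as.numeric(output.data$Date - first.day) + 1,
    house = as.numeric(output.data$House),
    samplePrecision = output.data$Sample / 0.25 # 1 / (p(1-p)/n) at p = 0.5
)

jags <- jags.model('discontinuity.jags', data=data.list, n.chains=4)
update(jags, 10000)                             # burn-in
posterior <- coda.samples(jags,
    c('discontinuityValue', 'houseEffect'), n.iter=50000)
summary(posterior)
```

The posterior mean of discontinuityValue, multiplied by 100, is the percentage-point "Rudd effect" reported above, and the houseEffect nodes give the PR and AR house effects for each series.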