Abstract
Objectives: Poisson regression is now widely used in epidemiology, but researchers do not always evaluate the potential for bias in this method when the data are overdispersed, that is, when the variance of the counts exceeds their mean. If count data are overdispersed, Poisson regression will underestimate the variance, so standard errors are too small and confidence intervals too narrow. The negative binomial 2 (NB2) model can accommodate overdispersion and may be preferred for analysis of such count data. This study used simulated data to evaluate sources of overdispersion in public health surveillance data and to compare the performance of Poisson and NB2 regression on simulated overdispersed injury surveillance data.
Methods: Monte Carlo simulation was used to assess the utility of the NB2 regression model as an alternative to Poisson regression for data with several different sources of overdispersion. Simulated injury surveillance datasets were created in which an important predictor variable was omitted, and others in which an incorrect offset (denominator) was used. The simulations evaluated the ability of Poisson and NB2 regression to correctly estimate the effects of the true determinants of injury and their confidence intervals.
Results: The NB2 model was effective in reducing overdispersion, but it could not reduce bias in point estimates that resulted from omitting a confounding covariate, nor could it reduce bias from using an incorrect offset. One advantage of NB2 over Poisson regression for overdispersed data was that the confidence interval for a covariate was considerably wider under NB2, which indicated that the Poisson model did not fit well.
Conclusion: When overdispersion is detected in a Poisson regression model, the NB2 model should be fitted as an alternative. If overdispersion is no longer present, the NB2 results may be preferred. However, it is important to remember that NB2 cannot correct for bias from omitted covariates or from using an incorrect offset.
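As an illustration of one source of overdispersion named above, the following minimal Python sketch simulates counts whose mean depends on a covariate that the analyst then omits, and computes the Pearson dispersion statistic for an intercept-only Poisson model (which reduces to the sample variance divided by the sample mean). It is not the simulation design used in this study: the function names, the intercept of 0.5, the standard normal covariate, and the effect sizes are all assumptions chosen for illustration.

```python
import math
import random


def poisson_draw(rng, mu):
    """Draw one Poisson(mu) variate using Knuth's multiplication method.

    Adequate for the small-to-moderate means used in this sketch.
    """
    limit = math.exp(-mu)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def simulate_dispersion(n=20000, omitted_effect=1.0, seed=1):
    """Simulate counts and return (mean, variance, variance / mean).

    Each count is Poisson with log-mean 0.5 + omitted_effect * z, where z is a
    hypothetical standard normal covariate that is NOT available to the analyst.
    For an intercept-only Poisson model the fitted mean is the sample mean, so
    the Pearson dispersion statistic equals variance / mean.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # omitted covariate (assumed distribution)
        mu = math.exp(0.5 + omitted_effect * z)
        counts.append(poisson_draw(rng, mu))
    mean = sum(counts) / n
    variance = sum((y - mean) ** 2 for y in counts) / (n - 1)
    return mean, variance, variance / mean


if __name__ == "__main__":
    # No omitted effect: counts are truly Poisson, dispersion should be near 1.
    print("no omitted covariate:", simulate_dispersion(omitted_effect=0.0))
    # Omitted covariate with a real effect: dispersion well above 1.
    print("omitted covariate:   ", simulate_dispersion(omitted_effect=1.0))
```

A dispersion statistic well above 1 signals that the Poisson variance assumption fails. In line with the conclusion above, switching to NB2 would address the excess variance in such data but would not recover the effect of the omitted covariate.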