Confounding is a major concern in causal studies because it results in biased estimation of exposure effects. In the extreme, this can mean that a causal effect is suggested where none exists, or that a true effect is hidden. Typically, confounding occurs when there are differences between the exposed and unexposed groups in respect of independent risk factors for the disease of interest, for example, age or smoking habit; these independent factors are called confounders. Confounding can be reduced by matching in the study design but this can be difficult and/or wasteful of resources. Another possible approach—assuming data on the confounder(s) have been gathered—is to apply a statistical “correction” method during analysis. Such methods produce “adjusted” or “corrected” estimates of the effect of exposure; in theory, these estimates are no longer biased by the erstwhile confounders.
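The way a confounder can suggest an effect where none exists can be seen in a small worked example. The sketch below uses invented counts (not data from any study) in which age is the confounder: within each age group the exposed and unexposed have identical risks, but the exposed group is older. The adjusted figure is a Mantel-Haenszel risk ratio, chosen here only as one familiar stratified estimator; the names `strata`, `crude_risk_ratio` and `mantel_haenszel_risk_ratio` are illustrative.

```python
# Hypothetical counts, invented purely for illustration.
# Each stratum: (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
strata = {
    "younger": (10, 100, 40, 400),   # risk 0.10 in both exposure groups
    "older":   (120, 400, 30, 100),  # risk 0.30 in both exposure groups
}


def crude_risk_ratio(strata):
    """Risk ratio ignoring age: pool all strata before comparing risks."""
    exposed_cases = sum(s[0] for s in strata.values())
    exposed_total = sum(s[1] for s in strata.values())
    unexposed_cases = sum(s[2] for s in strata.values())
    unexposed_total = sum(s[3] for s in strata.values())
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def mantel_haenszel_risk_ratio(strata):
    """Age-adjusted risk ratio: a weighted combination of within-stratum comparisons."""
    numerator = 0.0
    denominator = 0.0
    for exposed_cases, exposed_total, unexposed_cases, unexposed_total in strata.values():
        stratum_total = exposed_total + unexposed_total
        numerator += exposed_cases * unexposed_total / stratum_total
        denominator += unexposed_cases * exposed_total / stratum_total
    return numerator / denominator


crude_rr = crude_risk_ratio(strata)
adjusted_rr = mantel_haenszel_risk_ratio(strata)
print(f"Crude risk ratio:        {crude_rr:.2f}")
print(f"Age-adjusted risk ratio: {adjusted_rr:.2f}")
```

With these counts the crude risk ratio is well above 1 even though exposure has no effect within either age group, and the age-adjusted ratio returns to 1. Reversing the age imbalance would instead mask a true effect.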
Given the importance of confounding in epidemiology, statistical methods said to remove it deserve scrutiny. Many such methods involve strong assumptions about data relationships and their validity may depend on whether these assumptions are justified. Historically, the most common statistical approach for dealing with confounding in epidemiology was based on stratification; the standardised mortality ratio is a well-known statistic using this method to remove confounding by age. Increasingly, this approach is being replaced by methods based on regression models. This article is a simple introduction to the latter methods with the emphasis on showing how they work, their assumptions, and how they compare with other methods.
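As a reminder of how the stratification approach works, a standardised mortality ratio can be sketched as observed deaths divided by the deaths expected if the study cohort had experienced the age-specific rates of a reference population. The rates, person-years and observed count below are invented for illustration, and the variable names are assumptions rather than anything defined in the article.

```python
# Hypothetical inputs, invented purely for illustration.
# Age-specific death rates in the reference population (deaths per person-year).
reference_rates = {"40-49": 0.001, "50-59": 0.005, "60-69": 0.020}

# Person-years of follow-up contributed by the study cohort in each age band.
cohort_person_years = {"40-49": 2000, "50-59": 1000, "60-69": 500}

# Total deaths actually observed in the study cohort.
observed_deaths = 34

# Expected deaths: apply each reference rate to the cohort's own person-years,
# so the comparison is made within age bands and then summed.
expected_deaths = sum(
    reference_rates[band] * cohort_person_years[band]
    for band in reference_rates
)

# SMR is often reported multiplied by 100; here it is left as a plain ratio.
smr = observed_deaths / expected_deaths
print(f"Expected deaths: {expected_deaths:.1f}")
print(f"SMR: {smr:.2f}")
```

Because the expected count is built up one age band at a time, differences in age structure between the cohort and the reference population do not by themselves move the ratio away from 1.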
Before applying a statistical correction method, one has to decide which factors are confounders. This issue, which is sometimes complex,[1–4] is not discussed in detail and for the most part the examples will assume that age is a confounder. However, the use of automated statistical procedures for choosing variables to include in a regression model …
Competing interests: none declared