Table 1

The grading criteria for the evaluation of cumulative evidence on the relationship between air pollution and biomarkers

Criteria | Categories | Proposed operationalisation
Amount of evidence
  • A: Large-scale evidence

  • B: Moderate amount of evidence

  • C: Little evidence

Thresholds may be defined based on sample size, power or false-discovery rate considerations. As a simple rule, we suggest that category A requires a sample size of over 1000 (total number in cases and controls assuming 1:1 ratio) evaluated in the least common genetic group of interest; B corresponds to a sample size of 100–1000 evaluated in this group, and C corresponds to a sample size of <100 evaluated in this group.
Replication
  • A: Extensive replication including at least one well-conducted meta-analysis with little between-study inconsistency

  • B: Well-conducted meta-analysis with some methodological limitations or moderate between-study inconsistency

  • C: No association; no independent replication; failed replication; scattered studies; flawed meta-analysis or large inconsistency

Between-study inconsistency entails statistical considerations (eg, defined by metrics such as I², where values of 50% and above are considered large, and values of 25%–50% are considered moderate inconsistency) and also epidemiological considerations for the similarity/standardisation or at least harmonisation of phenotyping, genotyping and analytical models across studies.
Protection from bias
  • A: Bias, if at all present, could affect the magnitude but probably not the presence of the association

  • B: No obvious bias that may affect the presence of the association, but there is considerable missing information on the generation of evidence

  • C: Considerable potential for or demonstrable bias that can affect even the presence or absence of the association

A prerequisite for A is that the bias due to phenotype measurement, genotype measurement, confounding (population stratification) and selective reporting (for meta-analyses) can be appraised as not being high, and that there is no other demonstrable bias in any other aspect of the design, analysis or accumulation of the evidence that could invalidate the presence of the proposed association. In category B, although no strong biases are visible, there is no assurance that major sources of bias have been minimised or accounted for, because information is missing on how phenotyping, genotyping and confounding have been handled. Because occult bias can never be ruled out completely, even category A carries the qualifier ‘probably’.
  • Adapted from Ioannidis et al. Int J Epidemiol 2008;37:120–32 (See supplementary file for references).
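
The two numeric rules in the table (the sample-size thresholds for amount of evidence and the I² bands for between-study inconsistency) can be sketched in code. This is a minimal illustration only, not part of the original grading scheme: the function names are hypothetical, an I² of exactly 50% is treated as large (following "50% and above"), and the label "little" for values below 25% is an assumption, since the table does not name that band. The I² band covers only the statistical side of inconsistency; the epidemiological considerations and the bias appraisal still require judgement.

```python
def grade_amount_of_evidence(n_least_common_group: int) -> str:
    """Grade the amount of evidence with the simple sample-size rule.

    n_least_common_group: total sample size (cases plus controls) evaluated
    in the least common group of interest.
    """
    if n_least_common_group < 0:
        raise ValueError("sample size must be non-negative")
    if n_least_common_group > 1000:   # A: over 1000
        return "A"
    if n_least_common_group >= 100:   # B: 100-1000
        return "B"
    return "C"                        # C: fewer than 100


def describe_inconsistency(i_squared_percent: float) -> str:
    """Label between-study inconsistency from I² (given in percent)."""
    if not 0 <= i_squared_percent <= 100:
        raise ValueError("I² must lie between 0 and 100 percent")
    if i_squared_percent >= 50:       # 50% and above: large
        return "large"
    if i_squared_percent >= 25:       # 25%-50%: moderate
        return "moderate"
    return "little"                   # assumption: band not named in the table


if __name__ == "__main__":
    print(grade_amount_of_evidence(1500), describe_inconsistency(30.0))
```

Note that a sample of exactly 1000 falls in category B under this reading, because category A requires a sample size of over 1000.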