Study methods & design
Validity and Coverage of Estimates of Relative Accuracy

https://doi.org/10.1016/S1047-2797(00)00043-0

Abstract

PURPOSE: Studies comparing test accuracy often restrict the confirmation procedure to subjects classified as positive by either test. Relative sensitivity (RSN) and relative false-positive rate (RFP) are two estimable comparative measures of accuracy. This article evaluates the influence of sample size, disease prevalence, and test accuracy on the validity of point estimates of RSN and RFP, and on the coverage of their confidence intervals (CI).

METHODS: For each combination of sample size, disease prevalence, test accuracy, and interdependence between tests, 1,000 samples were generated using computer simulations. The percent bias in the RSN and RFP estimates was measured by comparing the means of the 1,000 log-transformed values computed in each simulation with their theoretical values. Coverage of the estimated CI was measured by computing the proportion of intervals that actually included the theoretical values. Application of these methods was illustrated with data from a study comparing mammography and physical examination in screening for breast cancer.

RESULTS: RSN estimates were valid if the true number of diseased cases exceeded 30, and RFP estimates were valid if the number of nondiseased subjects exceeded 200. When the numbers of diseased and nondiseased subjects exceeded 150 each, the 95% CI of RSN and RFP provided adequate coverage of the parameters (95 ± 2%).

CONCLUSION: Sample size is the most important variable for the validity and coverage of RSN and RFP estimates. For small samples, validity and coverage of RSN and RFP also depend on the accuracy of each test and on the degree of interdependence between the tests.

Introduction

Sensitivity and specificity of a test can be estimated in a study only if disease status of all subjects is independently confirmed using a gold standard. A gold standard, however, rarely can be administered to all subjects because of ethical or cost considerations. Studies comparing two tests often restrict confirmation of disease status to subjects classified as positive by either test. In this context, absolute measures of accuracy (sensitivity and specificity) cannot be readily estimated because the confirmatory procedure does not lead to an assessment of the disease status of all subjects (1, 2, 3). Relative measures of test accuracy, such as relative sensitivity (the ratio of two sensitivities, RSN) and relative false-positive rate (the ratio of two false-positive rates, RFP), however, can be estimated in these circumstances (3, 4). We have described a simple asymptotic variance estimator that can be used to calculate the CI of RSN and RFP (4).

In this article, we investigate the effect of disease prevalence, sample size, test accuracy, and interdependence between tests on the validity of RSN and RFP estimates and on the coverage of their CI.

Background

We focus on a self-matched study design (i.e., the two tests to be compared are administered simultaneously to the same individuals) with dichotomous disease status and test results (Table 1).

In Table 1, ND and N are the unknown numbers of diseased and nondiseased subjects. The true disease status of test-positive subjects is ascertained using a confirmatory procedure (a gold standard) (Tables 1B and 1C).

For data in Tables 1B and 1C, estimates of RSN, RFP, and 95% CI of lnRSN and lnRFP are
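The expressions themselves are not reproduced in this excerpt. As a rough illustration only, the sketch below assumes the widely used delta-method estimator for the ratio of two paired proportions, in which the variance of the log ratio is (b + c) / ((a + b)(a + c)), where a is the count positive on both tests and b and c are the counts positive on only test 1 or only test 2. The function name `relative_ratio_ci` and its argument names are hypothetical, and the formula may differ in detail from the one given in the full text. Applied to confirmed diseased subjects (Table 1B layout) it would give RSN; applied to confirmed nondiseased subjects (Table 1C layout) it would give RFP.

```python
import math


def relative_ratio_ci(both_pos, only_test1, only_test2, z=1.96):
    """Ratio of two paired proportions with a log-scale Wald interval.

    Assumed estimator (not taken from the source text):
        ratio      = (a + b) / (a + c)
        var(ln r)  = (b + c) / ((a + b) * (a + c))
    where a = both_pos, b = only_test1, c = only_test2.
    """
    pos1 = both_pos + only_test1  # positives on test 1
    pos2 = both_pos + only_test2  # positives on test 2
    if pos1 == 0 or pos2 == 0:
        # With a zero marginal total the log ratio is undefined.
        raise ValueError("zero marginal total: log ratio is undefined")

    ratio = pos1 / pos2
    var_log = (only_test1 + only_test2) / (pos1 * pos2)
    half_width = z * math.sqrt(var_log)
    log_ratio = math.log(ratio)
    return (
        ratio,
        math.exp(log_ratio - half_width),
        math.exp(log_ratio + half_width),
    )


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(relative_ratio_ci(both_pos=40, only_test1=10, only_test2=5))
```

Note that the total number of diseased (or nondiseased) subjects does not appear in either expression, which is why these ratios remain estimable when only test-positive subjects are confirmed.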

Results

The simulation work yielded 6,400,000 empirical sets of tables with the same layout as Tables 1B and 1C. One or both marginal totals were 0 in 122 sample tables with a layout similar to Table 1B. Zero marginals were also observed in 101,896 sample tables with a layout similar to Table 1C. Because lnRSN or lnRFP and their variance estimates could not be computed for such tables, these samples were excluded from the analysis of bias and coverage.
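A minimal sketch of how one cell of such a simulation might be set up for RSN is shown below. It is not the authors' program: the function name `simulate_rsn`, the cell-probability parameterization, the bias definition (back-transformed mean of the log estimates compared with the true ratio), and the assumed variance formula (b + c) / ((a + b)(a + c)) are all illustrative assumptions. Samples with a zero marginal total are skipped, mirroring the exclusion described above.

```python
import math
import random


def simulate_rsn(n_diseased, p_both, p_only1, p_only2,
                 n_samples=1000, seed=0):
    """Percent bias and 95% CI coverage of RSN for one parameter setting.

    p_both, p_only1, p_only2 are the (hypothetical) probabilities that a
    diseased subject is positive on both tests, test 1 only, or test 2 only.
    """
    rng = random.Random(seed)
    p_neither = max(0.0, 1.0 - p_both - p_only1 - p_only2)
    weights = [p_both, p_only1, p_only2, p_neither]
    true_log = math.log((p_both + p_only1) / (p_both + p_only2))

    log_estimates = []
    covered = 0
    skipped = 0
    for _ in range(n_samples):
        # Draw one multinomial sample of diseased subjects.
        cells = rng.choices("abcd", weights=weights, k=n_diseased)
        a, b, c = cells.count("a"), cells.count("b"), cells.count("c")
        if a + b == 0 or a + c == 0:
            skipped += 1  # zero marginal: estimate undefined, exclude
            continue
        est = math.log((a + b) / (a + c))
        half = 1.96 * math.sqrt((b + c) / ((a + b) * (a + c)))
        log_estimates.append(est)
        if est - half <= true_log <= est + half:
            covered += 1

    if not log_estimates:
        raise ValueError("every sample had a zero marginal total")
    mean_log = sum(log_estimates) / len(log_estimates)
    bias_pct = 100.0 * (math.exp(mean_log - true_log) - 1.0)
    coverage = covered / len(log_estimates)
    return bias_pct, coverage, skipped


if __name__ == "__main__":
    print(simulate_rsn(n_diseased=200, p_both=0.6, p_only1=0.2, p_only2=0.1))
```

Repeating the call over a grid of sample sizes and cell probabilities would give a table of bias and coverage by setting; the same function applies to RFP if the probabilities describe nondiseased subjects instead.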

All the 122 data sets of the form of Table 1B with

Example

We applied the methods described in this paper to breast cancer screening data from the New York Health Insurance Plan study (6). Based on that study, Schatzkin and coauthors described the estimation of RSN and RFP (3). In the New York Health Insurance Plan study, 20,211 women aged 40–64 years were screened for breast cancer. Breast biopsies were obtained from 307 women in whom either mammography or physical examination was positive for a suspicious breast lesion. In 55 cases, the lesion was

Discussion

RSN and RFP are two important measures of test accuracy when confirmation is feasible only for test-positive results. Because of ethical and practical considerations, this study design has been used in a variety of settings, such as screening for breast (6), colorectal (9, 10), and prostate (11) cancers, and for the diagnosis or screening of sexually transmitted diseases such as Chlamydia trachomatis infection (12, 13). Thus, the methods described in this paper are applicable to a broad range of

References (14)

  • L. Tabar et al. Update of the Swedish two-county program of mammographic screening for breast cancer. Radiol Clin North Am. (1992)
  • P. Cole et al. Basic issues in population screening for cancer. J Natl Cancer Inst. (1980)
  • A.S. Morrison. Screening in Chronic Disease. (1992)
  • A. Schatzkin et al. Comparing new and old screening tests when a confirmatory procedure cannot be performed on all screenees. Example of automated cytometry for early detection of cervical cancer. Am J Epidemiol. (1987)
  • H. Cheng et al. Comparison of the accuracy of two tests with a confirmatory procedure limited to positive results. Epidemiology. (1997)
  • SAS® Language Reference, Version 6. (1990)
  • P. Strax et al. Mammography and clinical examination in mass screening for cancer of breast. Cancer. (1967)
