Reliability of Standard Health Assessment Instruments in a Large, Population-Based Cohort Study
Introduction
Standardized instruments are often used in survey research. Many of these instruments are devised in clinic settings where health assessment is completed by trained health care professionals. However, the prohibitive cost of clinician-administered assessment and the relative ease of self-report make participant-assessed outcome measures a more feasible approach to obtaining constructs describing functional and mental health outcomes. With these more convenient measures of health increasingly used as primary outcomes in epidemiologic studies, selecting an appropriate assessment tool involves careful review of the many standard survey instruments available. Special consideration of whether the instruments meet the requirements of the proposed application is critical to interpretation of collected data (1). Reliability and validity of these instruments are often tested thoroughly in the populations or settings in which the instrument was originally created (2, 3). However, many questionnaires incorporate standardized survey instruments in populations that may differ from those for which the instrument was intended. In these studies, it is important to establish a level of confidence in the information being ascertained before declaring the instrument appropriate for the targeted population.
The Millennium Cohort, the largest cohort study ever undertaken by the US Department of Defense, was launched in 2001 to gather health outcome information along with occupational and environmental exposures employing a longitudinal approach (4, 5). In the first panel of enrollment, more than 77,000 participants joined the 22-year-long study, filling out either a mailed survey or an identical Web-based survey. The Millennium Cohort Study questionnaire is composed of more than 60 multipart questions comprising more than 400 individual data points, including questions from standardized instruments such as the Medical Outcomes Study Short Form 36-item for Veterans (SF-36V) (6, 7), the Primary Care Evaluation of Mental Disorders (PRIME-MD) Patient Health Questionnaire (PHQ) (2, 8, 9), the Posttraumatic Stress Disorder (PTSD) Checklist–Civilian Version (PCL-C) (3, 10), and the CAGE questionnaire to assess problematic drinking behavior (11), as well as questions that target areas such as medical history, vaccinations, environmental exposures, and occupation. Although the concordance of test-retest responses and internal consistency of the standard instruments have been established (6, 7, 8, 9, 10), tests of reliability of these constructs have not been performed in a large, population-based cohort where multiple independent instruments are presented simultaneously. The purpose of this study, therefore, was to establish the reliability, as measured by concordance in a test-retest setting and by internal consistency, of several standardized instruments in a large, population-based military cohort.
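The two notions of reliability named above can be illustrated with a minimal sketch. It assumes that internal consistency is summarized by Cronbach's alpha and test-retest concordance by Cohen's kappa; these are common choices, but the excerpt here does not spell out the exact statistics used, and the function names and toy data below are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a scale.

    rows: one list of k item scores per respondent (assumed layout).
    Uses sample variances of each item and of the summed scale score.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(test, retest):
    """Cohen's kappa for paired categorical responses (test vs. retest)."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(test)
    # Observed agreement: share of respondents giving the same answer twice.
    p_o = sum(a == b for a, b in zip(test, retest)) / n
    # Agreement expected by chance from the marginal frequencies.
    categories = set(test) | set(retest)
    p_e = sum((test.count(c) / n) * (retest.count(c) / n) for c in categories)
    if p_e == 1:
        raise ValueError("kappa is undefined when only one category occurs")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data, not taken from the study.
scale = [[1, 1], [2, 2], [3, 3]]   # two perfectly correlated items
first = [1, 1, 0, 0]               # responses at the first administration
second = [1, 0, 1, 0]              # responses at the second administration
alpha = cronbach_alpha(scale)
kappa = cohen_kappa(first, second)
```

Alpha approaches 1 when items in a scale move together, and kappa is 0 when test-retest agreement is no better than chance, which is why the two statistics answer different questions about the same instrument.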
Study Population
The invited Millennium Cohort Study participants were randomly selected from all US military personnel serving in the Army, Navy, Coast Guard, Air Force, and Marine Corps as of October 1, 2000. The population-based sample represented approximately 11% of the 2.3 million men and women in service and oversampled those who had been previously deployed, US Reserve and National Guard personnel, and female service members to ensure sufficient power to detect differences in smaller
Results
Of the 77,047 Millennium Cohort Panel 1 participants, 76,742 (99.6%) had complete demographic and military characteristic data. This population included 73% men, 73% born between 1960 and 1979, 49% without any college experience, 63% married, 70% white non-Hispanic, 77% enlisted personnel, 57% active duty personnel, 48% Army, 20% working as functional support specialists, and 20% combat specialists (Table 2).
Levels of internal consistency among standardized survey scales, as measured by
Discussion
Standardized instruments are often employed to enhance the value of epidemiologic survey research. Diligence in establishing consistency and comparability to promote confidence in results will become increasingly important. While established survey instruments may be an enticing addition in the pursuit of quality health metrics, they may perform suboptimally in populations that differ from those for which they were developed. In this study, the internal consistency of well-known instruments (PHQ, SF-36V, CAGE,
References (32)
- et al. Psychometric properties of the PTSD Checklist (PCL). Behav Res Ther (1996)
- et al. Millennium cohort: enrollment begins a 21-year contribution to understanding the impact of military service. J Clin Epidemiol (2007)
- et al. Validity and utility of the PRIME-MD Patient Health Questionnaire in assessment of 3000 obstetric-gynecologic patients: the PRIME-MD Patient Health Questionnaire Obstetrics-Gynecology Study. Am J Obstet Gynecol (2000)
- et al. A reappraisal of the kappa coefficient. J Clin Epidemiol (1988)
- et al. Evaluating patient-based outcome measures for use in clinical trials. Health Technol Assess (1998)
- et al. Validation and utility of a self-report version of PRIME-MD: the PHQ Primary Care Study. Primary care evaluation of mental disorders. JAMA (1999)
- et al. The Millennium Cohort Study: a 21-year prospective cohort study of 140,000 military personnel. Mil Med (2002)
- et al. SF-36 Health Survey: manual and interpretation guide (2000)
- et al. The MOS 36-Item Short-Form Health Survey (SF-36). I. Conceptual framework and item selection. Med Care (1992)
- et al. Utility of a new procedure for diagnosing mental disorders in primary care. The PRIME-MD 1000 Study. JAMA (1994)
- Detecting alcoholism. The CAGE questionnaire. JAMA
- Validity of the Patient Health Questionnaire-9 in assessing depression following traumatic brain injury. J Head Trauma Rehabil
- An efficient method of identifying major depression and panic disorder in primary care. J Behav Med
- Health status assessments using the Veterans SF-12 and SF-36: methods for evaluating outcomes in the Veterans Health Administration. J Ambul Care Manage
- SF-36 Physical and Mental Health Summary Scales: A user's manual
Cited by (103)
Sexual health difficulties among service women: the influence of posttraumatic stress disorder
2021, Journal of Affective Disorders
Citation Excerpt: Mental disorders were assessed at Time 1. Probable PTSD was measured using the PTSD Checklist−Civilian Version (PCL-C), a validated instrument used to rate the severity of symptoms (Blanchard et al., 1996) that has demonstrated good internal consistency (Cronbach's α = 0.94) in this cohort (Smith et al., 2007). Based on criteria from the Diagnostic and Statistical Manual of Mental Disorders 4th edition (DSM-IV), probable PTSD was defined as reporting a moderate or higher level of at least one intrusion symptom, three avoidance symptoms, and two hyperarousal symptoms (Diagnostic and statistical manual of mental disorders 4th ed.
A pilot multisite study of patient navigation for pregnant women with opioid use disorder
2019, Contemporary Clinical Trials
A community pharmacy-led intervention for opioid medication misuse: A small-scale randomized clinical trial
2019, Drug and Alcohol Dependence
Citation Excerpt: The two-item pain subscale asked about level of bodily pain and pain-related physical functioning and is scored on a 0–200 scale. We assessed depression using the Patient Health Questionnaire (PHQ) depression subscale, a valid mental health assessment with demonstrated reliability (Hides et al., 2007; Smith et al., 2007; Spitzer et al., 1999, 2000). This subscale is scored on a 5-point scale (0=none-minimal; 1=mild; 2=moderate; 3=moderately severe; 4=severe).
Disclosure: This work represents Report 06-24, supported by the Department of Defense, under work unit No. 60002. The views expressed in this article are those of the authors and do not reflect the official policy or position of the Department of the Navy, Department of the Army, Department of the Air Force, Department of Defense, Department of Veterans Affairs, or the US Government. This research has been conducted in compliance with all applicable federal regulations governing the protection of human subjects in research (Protocol NHRC.2000.007).
∗ In addition to the authors, the Millennium Cohort Study Team includes Paul J. Amoroso, MD, MPH (Madigan Army Medical Center, Tacoma, WA); Edward J. Boyko, MD, MPH (Seattle Epidemiologic Research and Information Center, Department of Veterans Affairs Puget Sound Health Care System, Seattle, WA); Gary D. Gackstetter, PhD, DVM, MPH (Department of Preventive Medicine and Biometrics, Uniformed Services University of the Health Sciences, Bethesda, MD, and Analytic Services, Inc. [ANSER], Arlington, VA); Gregory C. Gray, MD, MPH (College of Public Health, University of Iowa, Iowa City, IA); Tomoko I. Hooper, MD, MPH (Department of Preventive Medicine and Biometrics, Uniformed Services University of the Health Sciences, Bethesda, MD); James R. Riddle, DVM, MPH, and Timothy S. Wells, PhD, DVM, MPH (both from Air Force Research Laboratory, Wright-Patterson AFB, OH).