
A review of functional status measures for workers with upper extremity disorders
  1. D F Salerno1,
  2. C Copley-Merriman1,
  3. T N Taylor1,
  4. J Shinogle2,
  5. R M Schulz2
  1. 1Pfizer, Inc., Ann Arbor Laboratories, 2800 Plymouth Road, Ann Arbor, MI 48105, USA
  2. 2University of South Carolina, College of Pharmacy and School of Public Health, Coker Life Sciences Building, Columbia, SC 29208, USA
  1. Correspondence to:
    Dr D F Salerno, 2800 Plymouth Road, Ann Arbor, MI 48105, USA


In order to identify functional status measures for epidemiological studies among workers with mild to moderate disorders of the neck and upper extremity, a literature search was conducted for the years 1966 to 2001. Inclusion criteria were: (1) relevance to neck and upper extremity; (2) assessment among workers; and (3) relevance to mild to moderate disorders. Of 13 instruments reviewed, six measures were tested among workers. The three best measures, depending on the purpose of research, included the standardised Nordic Musculoskeletal Questionnaire, the Upper Extremity Questionnaire, and the Neck and Upper Limb Instrument. Development of a functional protocol is regarded as a realistic enhancement for research of neck and upper extremity disorders in the workplace. For research and clinical practice, measures of functional status, sensitive enough to measure the subtle conditions in mild to moderate disorders, may provide prognostic information about the risk of developing musculoskeletal disorders in apparently healthy patients. Appropriate use of functional status questionnaires is imperative for a meaningful portrayal of health.

  • functional status
  • musculoskeletal disorders
  • workers
  • ADL, activities of daily living
  • CTS, carpal tunnel syndrome
  • DASH, Disabilities of the Arm, Shoulder, and Hand
  • DRI, Disability Rating Index
  • FSS, Functional Status Scale
  • ICC, intraclass correlation
  • MFA, Musculoskeletal Functional Assessment
  • MHQ, Michigan Hand Outcomes Questionnaire
  • NDI, Neck Disability Index
  • NMQ, Nordic Musculoskeletal Questionnaires
  • NULI, Neck and Upper Limb Instrument
  • SPADI, Shoulder Pain and Disability Index
  • SRM, standardised response mean
  • SSS, Symptom Severity Scale
  • UEQ, Upper Extremity Questionnaire
  • VAS, visual analogue scale

Functional status measures can correlate pain to performance, with direct relevance to employers and workers. Although functional status may range from full ability to severe disability, few measures have been designed for relatively healthy, active workers with upper extremity disorders,1 particularly those with mild to moderate conditions.

Over 75 functional status instruments exist for patients with disability from arthritis or diabetes,2 yet most have focused on severe disability. Thus, although useful in certain epidemiological investigations, most measures are insufficient in addressing the problem of mild to moderate conditions. However, research shows that workers without discernible medical diagnosis often report interference with activities at work or home.3 Investigators need measures capable of detecting subtle as well as pronounced musculoskeletal conditions, and the impact of these conditions on performance.

Musculoskeletal disorders are believed to represent the largest category of work related illness in Britain.4 In the United States (private sector), nearly 6 million workers experience non-fatal injuries or illnesses.5 Although musculoskeletal disorders are among the most prevalent and symptomatic complaints among workers, occupational medicine lacks measures for certain disorders, particularly in early stages.6

New research protocols typically include not only clinical laboratory tests, but also self reported questionnaires.7–9 This review identifies functional status instruments easily used in occupational health surveys among a working population of mainly healthy subjects. It identifies measures for workers with mild to moderate disorders of the neck and upper extremity. The review is limited to measures for employed adults (distinct from workers who are not employed, or workers under 18 years of age).


Functional status has been characterised as health status, activities of daily living, level of impairment, disability, or handicap. In general, the construct of “function” contains physical, emotional, and social attributes.10 For this review, a functional status measure is defined as an instrument to assess how health and strength, vitality, symptoms (for example, pain or discomfort), emotion, or desires affect performance of everyday activities, recreation, social relations, and work. In short, it assesses how physical conditions affect activity.

Stock and colleagues1 identified 12 functional domains relevant to workers: work, household and family responsibilities, self care, transportation, sexual activity, sleep, social activities, recreational activities, mood, self esteem, financial effects, and the iatrogenic effects of assessment and treatment.


Functional status measures may be generic (general health) or specific (disease related); discriminative (determining if the condition is better or worse) or evaluative (measuring whether a score has changed).11 Various classification schemas have been proposed,12,13 yet there is no consensus on the basis for stratification of measures. This review focuses on self reported measures, in contrast to clinician based or economic measures (fig 1). It is important to distinguish between instruments classifying diagnosis or pain, and those classifying functional status.

Figure 1

Framework for outcome measures.

Main messages

  • Functional status measures can correlate pain to performance.

  • Few self reported measures have been designed specifically for workers.

  • The impact of mild to moderate disorders on the workforce is unknown.

  • Three measures are identified for epidemiological studies among workers.

  • Consistent use of functional status measures is encouraged.

Policy implications

  • Uniform data collection provides the benefit of a standard reporting environment.

  • Global data standards enable consistently high quality reports, for intelligible comparisons across industries.

Ideally, functional measures would provide data on a full spectrum of function. An important challenge is to measure subtle disease entities that are troublesome to workers (and employers) but more difficult to evaluate. This new approach requires quantification of symptoms below the threshold of those traditionally measured via clinical laboratory tests or physical examination.


To identify self reported functional status instruments for neck and upper extremity disorders among workers, a Medline search was conducted for English language publications between the years 1966 and 2001. Keywords included: carpal tunnel syndrome, functional status, health surveys, musculoskeletal, occupational health, outcome measures, questionnaire, neck, upper extremity, and worker.

The following criteria were used to select self reported instruments for review: (1) relevance to neck and upper extremity conditions (indicated by question content); (2) assessment among workers; and (3) relevance to mild to moderate disorders (mild to moderate conditions could be defined by the patient, not necessarily correlated with abnormal laboratory tests or physical examination). Instruments were selected for this review if published in peer reviewed studies explicitly designed to evaluate psychometric properties of validity, repeatability, and responsiveness to change.

Psychometric properties

Psychometric properties were defined as follows.


Validity

How well an instrument measures what it is supposed to measure, how it reflects reality. A valid scale provides for accurate inferences.14 Validity is often quantified by correlation analyses, ROC curve calculations, regression models, or estimates of sensitivity and specificity. Construct validity shows how the new measure compares with other associated measures. Criterion validity compares the measure with some gold standard.15 Content or face validity is the extent to which a set of items reflects a certain domain. Evidence of validity accumulates when the instrument, in repeated use, performs as expected.


Repeatability

Consistency over time (test-retest, inter- and intrarater reliability). For dichotomous data, the odds ratio or kappa statistics are recommended. Kappa statistics account for agreement by chance.16 For ordered data, weighted kappa values are used.17 A kappa value less than 0.40 represents poor agreement beyond chance, a value between 0.40 and 0.75 is fair to good, and a value greater than 0.75 is excellent.18 For continuous measures, the intraclass correlation (ICC) is the statistical analogue, combining a measure of correlation with a test of the difference in means.19 Although Pearson product-moment correlations can be used, it is known that observations may disagree sharply yet still be correlated.20 Measures may be compared between instruments only if they have been calculated from comparable populations.
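As a minimal sketch of the chance correction described above, the following computes an unweighted kappa for paired dichotomous ratings and labels it with the cut-offs quoted in the text. The function names and the ten worker test-retest data set are hypothetical, chosen only for illustration; they are not taken from any of the reviewed instruments.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted kappa for two paired lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be paired and non-empty")
    n = len(first)
    # Proportion of pairs on which the two administrations agree.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal frequencies.
    first_counts = Counter(first)
    second_counts = Counter(second)
    expected = sum(first_counts[c] * second_counts[c] for c in first_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def describe_agreement(kappa):
    """Label a kappa value using the cut-offs quoted in the text."""
    if kappa < 0.40:
        return "poor"
    if kappa <= 0.75:
        return "fair to good"
    return "excellent"


# Hypothetical data: symptom reported (1) or not (0) by ten workers on two occasions.
test = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
retest = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

kappa = cohen_kappa(test, retest)
print(f"kappa = {kappa:.2f} ({describe_agreement(kappa)})")
```

In this example the two administrations agree on 8 of 10 workers, yet about half of that agreement would be expected by chance alone, which is why kappa is noticeably lower than the raw agreement.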

Internal consistency, the ability to measure a single concept, is measured by Cronbach alpha.21 A value of 0.70 is sufficient, 0.80 is good, and 0.90 is excellent.22
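For readers who want to see how Cronbach alpha is obtained, here is a minimal sketch using the usual formula: alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the small five respondent, three item data set are hypothetical and serve only to illustrate the calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for a list of respondents, each a list of k item scores."""
    if len(responses) < 2:
        raise ValueError("at least two respondents are required")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("each respondent needs the same k >= 2 item scores")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*responses))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: five respondents answering three items on a five point scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 4],
    [5, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach alpha = {alpha:.2f}")
```

Alpha rises when items vary together, because the variance of the total score then grows relative to the sum of the individual item variances.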

Responsiveness to change

The ability to detect change over time. Responsiveness is commonly quantified by effect size, or the standardised response mean (SRM). Cohen23 defined the effect size statistic, d, as the difference between means divided by the standard deviation of either group. An effect size of 0.20 or less is small, 0.50 is moderate, and a value of 0.80 or greater is large. The larger the effect size, the more responsive the instrument.
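The effect size calculation can be sketched as follows. Because the definition allows the standard deviation of either group, this example assumes the baseline standard deviation as the denominator; that choice, the function name, and the scores are illustrative assumptions rather than the method of any study cited here.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Difference between means divided by the baseline standard deviation."""
    if len(baseline) < 2 or not follow_up:
        raise ValueError("need at least two baseline scores and one follow up score")
    return (mean(follow_up) - mean(baseline)) / stdev(baseline)


# Hypothetical disability scores (0-100, lower is better) for five subjects.
baseline_scores = [40, 50, 60, 70, 80]
follow_up_scores = [30, 45, 50, 65, 70]

d = effect_size(baseline_scores, follow_up_scores)
print(f"effect size d = {d:.2f}")
```

On a scale where lower scores mean less disability, a negative value indicates improvement; the magnitude is what is compared against the 0.20, 0.50, and 0.80 benchmarks.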

The SRM is the mean change in scores from baseline to follow up divided by the standard deviation of changes. Interpretation of values is similar to effect size. When the correlation between baseline and follow up scores is equal to 0.5, the SRM is equal to the effect size. When the correlation is high, the SRM is greater than the effect size; when the correlation is low, the effect size is as much as 1.4 times higher than the SRM.24 Again, responsiveness depends on having similar populations being assessed between instruments.
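The relation between the two statistics can be made explicit with a short derivation. It assumes, for illustration only, that baseline and follow up scores have the same standard deviation σ and correlation r; the symbols Δ, σ, and r are introduced here and do not appear in the original text.

```latex
% Assumption: baseline X and follow up Y share the same standard deviation \sigma
% and have correlation r; \Delta is the mean change in scores.
\begin{align*}
\mathrm{ES}  &= \frac{\Delta}{\sigma}, \\
\operatorname{Var}(Y - X) &= \sigma^2 + \sigma^2 - 2r\sigma^2 = 2\sigma^2(1 - r), \\
\mathrm{SRM} &= \frac{\Delta}{\sigma\sqrt{2(1 - r)}} = \frac{\mathrm{ES}}{\sqrt{2(1 - r)}}.
\end{align*}
% r = 0.5:  \sqrt{2(1 - r)} = 1, so SRM = ES.
% r > 0.5:  \sqrt{2(1 - r)} < 1, so SRM > ES.
% r = 0:    \sqrt{2(1 - r)} = \sqrt{2} \approx 1.4, so ES \approx 1.4 \times SRM.
```

Under this assumption the factor of 1.4 quoted above is the limiting case of uncorrelated baseline and follow up scores.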


A total of 13 self reported neck and upper extremity instruments were reviewed. Based on appearance, all instruments had content validity. Seven instruments were tested among surgical patients. Six instruments had been tested among workers, including three instruments relevant for workers with mild to moderate disorders.

Functional instruments tested among surgical patients

Shoulder Pain and Disability Index (SPADI)25

This has 13 items on 10 cm visual analogue scales (VAS). Ratings include pain severity, difficulty carrying heavy objects, and placing things on a high shelf. Among 37 men, there was moderate to high correlation (Pearson r range: −0.55 to −0.80) with shoulder range of motion. Among 23 subjects in the test-retest study, ICC was 0.66. In assessing responsiveness, the group showed a mean decrease (−25.6) in scores (possible range: 0–100).

Shoulder Rating Questionnaire26

This has 21 items including a global assessment on a 10 cm VAS, questions regarding pain, daily activities, recreational activities, work, satisfaction, and area for improvement graded on five point scales. In a study among 100 patients, there was moderate to high correlation to the revised Arthritis Impact Measurement Scales.27 There were significant correlations for global assessment (Spearman r = −0.56), level of satisfaction (r = −0.56), daily activities (r = −0.84), pain (r = −0.86), and work (r = −0.89). Cronbach alpha ranged from 0.71 to 0.90; weighted kappa values were greater than 0.70. Tests for responsiveness (n = 30) showed standardised response means (SRM) between 1.1 and 1.9.

Symptom Severity Scale (SSS)28,29

This has 11 items on five point scales assessing pain severity, nocturnal occurrence of pain, frequency, duration, numbness, weakness, tingling, difficulty with grasping, and use of small objects. Testing was conducted in a three month prospective study, and a retrospective study of patients evaluated after surgery. There was significant correlation for pinch strength (Spearman r = 0.47) and grip strength (r = 0.38). Test-retest reliability was high (Pearson r = 0.91) (n = 31), as was internal consistency (Cronbach α = 0.89) (n = 67). The effect size for SSS was 1.13 in the prospective cohort (n = 26), and 1.4 in the retrospective cohort (n = 38).

Functional Status Scale (FSS)28

This has items assessing eight activities (writing, buttoning clothes, holding a book, gripping a telephone, opening a jar, household chores, carrying grocery bags, and bathing and dressing) on a five point scale. Psychometric testing was conducted in a prospective study, and a retrospective study of patients after surgery for carpal tunnel syndrome (CTS). The highest correlations were for pinch strength (Spearman r = 0.60), and grip strength (r = 0.50). Test-retest reliability was high (Pearson r = 0.93) (n = 31) as was internal consistency (Cronbach α = 0.91) (n = 67). The effect size was 0.71 in the prospective cohort (n = 26), and 0.82 in the retrospective group (n = 38).

Disabilities of the Arm, Shoulder and Hand (DASH)30–32

This has 30 items assessing symptoms: daily activities, recreation, self care, sleep, sports, family care, occupation, socialising, and self image. It includes dichotomous items, and five to six point scales. The DASH correlated highly (>−0.75) with other measures of function, disability, and pain. Both test-retest reliability and internal consistency exceeded 0.95. Optional modules are available for athletes/performing artists, and working populations. The DASH was developed by organisations for surgeons to measure disability and symptoms.

Michigan Hand Outcomes Questionnaire (MHQ)33,34

This has 37 items on six scales: overall hand function, activities of daily living (ADLs), pain, work performance, aesthetics, and patient satisfaction after surgery. In comparing three MHQ scales to the SF-12, a moderate correlation was found (range 0.54–0.79). Tests of validity showed a significant difference for patients with CTS versus arthritis. Spearman correlation ranged from 0.81 to 0.97. Cronbach α ranged from 0.86 to 0.97.

Short Musculoskeletal Functional Assessment Questionnaire35

This is based on the Musculoskeletal Function Assessment (MFA).36,37 It has 46 items in two indices: a dysfunction index and bother index (how much patients are bothered by problems) with five point scales. Among 420 patients, both indices had significant correlations with walking speed and grip strength (Pearson r ≥ 0.40), ADLs, recreational activities, and emotional function (r ≥ 0.40), and with the SF-36 subscales. Test-retest reliability was high (ICC = 0.93, dysfunction index; ICC = 0.88 bother index), as was internal consistency (Cronbach α > 0.92). A test for responsiveness showed SRMs between 0.76 and −1.14.

Functional instruments tested among workers

Six questionnaires were designed for and tested among workers, of which three measures were relevant for mild to moderate disorders: the standardised Nordic Musculoskeletal Questionnaire, the Upper Extremity Questionnaire, and the Neck and Upper Limb Instrument (table 1).

Table 1

Description of selected functional status instruments tested among workers

Neck Disability Index (NDI)38

This was tested among 48 subjects with neck pain in a chiropractic clinic. A subsequent study39 involved a larger sample of 237 neck pain patients, and showed similarly high internal consistency (Cronbach α = 0.92). Exploratory factor analysis indicated high loadings on items of work, driving, and recreational activities, measuring more of the physical aspect of pain disability.

Disability Rating Index (DRI)40

This was tested among 1092 healthy blue and white collar workers, and 366 patients with pain in the neck, shoulder, or low back with different levels of ability. Responsiveness was significant among 19 arthritis patients with median preoperative scores of 52%, and median postoperative scores of 36%.

Upper Extremity Function Scale41

This was tested among 108 patients with upper extremity disorders receiving Workers' Compensation, and 165 patients at a hand clinic. The instrument correlated highly with average pain level (Pearson r = 0.67) and fear of pain (r = 0.44). Among the CTS patients, it correlated highest with worst pain level (r = 0.54), and pinch strength (r = −0.40). Cronbach α ranged from 0.83 to 0.93 across study groups. The SRM ranged from −0.53 to −1.33 for a subgroup of 16 patients who reported being significantly better; SRM ranged from 0.39 to −1.03 for a subgroup of 55 CTS patients.

Nordic Musculoskeletal Questionnaires (NMQ)42,43

These have been tested among 27 clerical workers, 82 women in electronics manufacturing, 17 medical secretaries, 22 railway maintenance workers, and 29 safety engineers. A specific Neck and Shoulder Questionnaire assesses the severity of symptoms in terms of their effect on activities at work and during leisure time. Widely used in Europe, the NMQ was adapted by the National Institute for Occupational Safety and Health (NIOSH) in the United States.44

Upper Extremity Questionnaire (UEQ)45,46

This has been tested for test-retest reliability among 148 manufacturing workers, and 138 keyboard operators in the United States. For symptom reports among keyboard operators, most kappa values were between 0.60 and 0.89.47 Symptom severity and interference with production rates and/or usual standard of quality were less stable. Among the psychosocial measures, Perceived Stress and Job Dissatisfaction Scales were most consistent (ICC = 0.88); coworker support was least consistent (ICC = 0.44), perhaps due to workforce characteristics, such as high turnover.

Neck and Upper Limb Instrument (NULI)48

This was designed for research or clinical use among workers. In a test-retest assessment among 99 subjects, ICC was 0.88. The tool is used to evaluate effectiveness of interventions and prevention of disability.49


This review identified 13 self reported functional status measures, and the context for their use. While the measures have been used, not all have been tested among workers. None of the seven measures designed for patients undergoing surgery was tested among patients receiving conservative treatment. Of the six measures tested among workers, three measures were relevant for mild to moderate neck and upper extremity conditions: the NMQ, Upper Extremity Questionnaire, and the NULI.

Considerations for a worker population

Basic psychometric principles demand that measures be tested in the population in which they will be used. Implicit in the objective to find measures for workers is the assumption that functional status among workers is different to that in a clinical population. Floor and ceiling effects (that is, the inability of an instrument to accurately reflect patients at the ends of the spectrum) should be considered since questionnaires designed for clinical populations may not adequately address the more subtle conditions of active workers.

Age or gender distributions in the workplace can make statistical comparisons difficult if the sample size is small.50 For example, traditional gender concentrations in manufacturing and service industries pose statistical problems for generalisability.

In addition, psychosocial and vocational factors need to be considered,51,52 as factors such as job dissatisfaction, labour relations, dependents, or layoffs may affect outcomes. Selection issues arise if workers believe health status could affect their ability to receive benefits (for example, public or private disability insurance, or health insurance).53

Another concern is proper comparison of normative data among workers. Most normative values are based on convenience samples with certain limitations, namely spectrum bias.54 For example, nerve conduction studies among workers are routinely compared with normative data. Research shows that norms among workers are different from conventional clinical norms, and need attention to avoid misclassification.55

One reason that instruments have rarely been tested among active workers is the logistical difficulty of arranging access. Measures for mild to moderate conditions are a further challenge, as small changes are harder to detect. However, these conditions are gaining importance with the growth of industries that rely on static or constrained postures of the neck, and/or repetitive use of the upper extremity.

Implications for research and clinical trials

Overall, intervention for subtle conditions may prove cost effective and prevent escalating morbidity. Although minor physical damage is reversible, a cascade of biochemical and mechanical changes falls into place in response to injury.56 These changes lead to immune responses and inflammation, which can be a precursor to disability. In early stages, medical care, physical therapy, massage therapy, or pharmacological interventions may provide proper health management. Obviously, the advantage of detecting mild to moderate disorders is that conservative interventions (for example, engineering or administrative controls) may be initiated to minimise decrements in function.

Although there has been progress in standardising a core set of classification criteria in upper extremity disorders,57 major consensus on testing methods has not emerged. Clearly, standardisation of functional measures in occupational epidemiology is essential to provide a consistent database, for better comparisons across industries.

In addition to a core set of standardised questions, future research may involve worker specific assessments58 to address an individual's work related health concerns. By focusing on the important issues, investigators may apply effective interventions.59


Similar to laboratory tests, which can measure upper and lower limits of normal, investigators need questionnaires that can measure a fuller spectrum of function. Specifically, self reported measures are needed to assess mild to moderate disorders among workers to avoid more severe (and costly) consequences. Questionnaires used in clinical research confer the health benefit of detecting mild to moderate disorders. Early intervention with such disorders may allow a quicker return to normal function.

Functional status measures can quantify the impact of health on performance. Quantifying function may be useful in detecting mild to moderate disorders of the neck and upper extremity. Unfortunately, use of functional outcomes in epidemiological studies has been limited by lack of standardisation, and insufficient availability of normative data. Additionally, there is no gold standard for testing many upper extremity conditions.

Three measures were identified as most relevant for epidemiological studies among workers with mild to moderate upper extremity conditions: (1) the Nordic Musculoskeletal Questionnaire; (2) the Upper Extremity Questionnaire; and (3) the Neck and Upper Limb Instrument. Other functional measures may be relevant; however, they have not been tested successfully in field studies among workers.

Use of standardised functional measures with a wider spectrum of health is regarded as a realistic enhancement for research of neck and upper extremity disorders in the workplace. An important aspect of development is successful testing of validity, repeatability, and responsiveness among active workers. Appropriate use of functional status questionnaires is imperative for a meaningful portrayal of health.


We thank the late Charles F Cannell, who provided direction in preparation of the manuscript; Thomas J Armstrong and Alfred Franzblau for substantive comment on earlier versions; Tom Y Thuren and Eira Viikari-Juntura for insightful review.




  • This paper was presented, in part, at the Seventh Annual International Meeting of the International Society for Pharmacoeconomics and Outcomes Research, Washington DC, May 2001
