Introduction

The unprecedentedly rapid growth of transnational interdependence of capital, trade and labor markets is considered a core feature of economic globalization (Spiegel et al. 2004). Despite its beneficial effects on economic growth and social development, economic globalization carries the risk of reducing the health and well-being of sizeable parts of working populations due to increased levels of work stress (Benach et al. 2002; Landsbergis 2003). Most often, psychosocial stress at work is experienced as a result of challenges from a demanding environment that are difficult to meet and thus, due to the threat of failure, evoke intense negative emotions and alerting physiological responses (Weiner 1992). Work intensification, job insecurity, poor quality of work, and wage inequalities in competitive labor markets are among the major sources of work stress in the context of economic globalization (Moutsatsos 2008). These features are intertwined, as work intensification goes along with personnel cuts, downsizing, mergers, and outsourcing, thus increasing job instability and insecurity (Vahtera et al. 2004). Female employees in particular were shown to experience high job insecurity due to precarious job conditions (fixed-term employment) (Nurminen 2008). Workers without an alternative choice in the labor market are at increased risk of receiving low wages and of experiencing poor quality of work, and this fact supports the view that income inequality and labor market segmentation are growing as a result of current economic globalization (Spiegel et al. 2004).

There is now convincing evidence available on adverse effects of these features of work and employment on the health of working people (Cartwright and Cooper 2008; Schnall et al. 2008). Yet, large parts of this evidence result from research conducted in developed Western societies, mostly Europe, the United States and Canada. Given the dynamics of economic take-off in rapidly developing countries and related labor market mobility, it seems important to monitor work stress in a comparable way across countries. To date, researchers offer a variety of interesting conceptualizations and measures of work stress with relevance to health (Antoniou and Cooper 2005; Perrewe and Ganster 2002). These approaches differ largely with regard to their specificity and theoretical focus, their contribution towards explaining health, the availability of their measurement in different languages, and the quality of their assessment in terms of psychometric properties (Tabanelli et al. 2008).

Here, we present a short generic measure of work stress that is characterized by three distinct features. First, the measure is based on a theoretical model, effort–reward imbalance, that captures a core component of work contracts globally, the relation between efforts invested into job performance and rewards received in turn (Siegrist 1996). The model claims that lack of reciprocity between efforts spent and rewards received (‘high cost/low gain’) occurs frequently under specific conditions (see below), and that failed reciprocity elicits strong negative emotions and sustained stress reactions that in the long run adversely affect physical and mental health. Importantly, rewards in this model are not restricted to wages, but include promotion prospects, job security, and performance-related esteem. In addition, while efforts spent at work mainly reflect extrinsic demands, they are often moderated by the workers’ motivations. With the assessment of ‘work-related commitment’, the model takes account of this component. Effort–reward imbalance is frequent (a) if workers have no alternative choice in the labor market, (b) if their jobs are exposed to heavy competition, and (c) if workers are overcommitted to their work, either due to their psychological profile or to informal work pressure. As these conditions often manifest themselves during economic globalization, as described above, this model may capture at least some essential parts of this process.

As a second feature, this short measure is derived from a longer original questionnaire measuring the model (Siegrist et al. 2004) that was demonstrated to contribute in a significant way to the explanation of a variety of stress-related health disorders in prospective, cross-sectional, and case-control studies, including cardiovascular and metabolic diseases, affective disorders, musculo-skeletal disorders, alcohol dependence, sickness absence, and poor self-rated health (Siegrist and Theorell 2006; Tsutsumi and Kawakami 2004; van Vegchel et al. 2005). Thus, there is considerable evidence that the model of effort–reward imbalance identifies relevant health-adverse aspects of a stressful work environment.

Third, because economic globalization affects working populations globally, it is important to provide standardized measures of work stress with documented psychometric properties in different languages. The original effort–reward imbalance questionnaire, on which this short version is based, has been translated and psychometrically tested in a number of languages, including English, German, Dutch, French, Spanish, Portuguese, Italian, Japanese, Chinese and Korean (http://www.uni-duesseldorf.de/medicalsociology). Additionally, the questionnaire was analyzed with respect to cross-cultural comparability (Tsutsumi et al. 2008), sensitivity to change (Tsutsumi et al. 2002), and factorial invariance (de Jonge et al. 2008). It may therefore serve the aim of assessing work stress across countries in the context of economic globalization.

The original questionnaire consists of 23 Likert-scaled items (Siegrist et al. 2004). There have been repeated requests for a shorter version that is more easily applicable to large-scale investigations. Therefore, we set out to develop a short version of the original questionnaire, and we present its psychometric properties in this contribution. Despite the potential relevance of this measure for comparative international investigations, we limit this report to data obtained from a single country, Germany. Yet, within this large data set, a variety of different professions and occupations is included, and socio-economic variations, including East–West differences, can be analyzed.

Methods

Study sample

This report is based on the 2006 panel wave of the German Socio-Economic Panel (SOEP), a representative longitudinal study of the German population aged 16 years and older. The study was initiated in 1984 and contains annual information about personal, economic, social and political aspects (http://www.diw.de/soep). In the 2006 survey, about half of the data (49%) was collected by personal interview (with or without a computer-assisted device) and the remaining half (51%) by standardized questionnaires. When comparing the potential effect on answers produced by these two different modes of data collection, we did not find substantial differences between scores of the scales, with the exception of a slightly higher mean score of ‘effort’ in the self-completed mode of answering. In this analysis, we restricted the 2006 sample to all men and women who were employed or self-employed at this time (N = 10,698).

Measurement

The established version of the effort–reward imbalance (ERI) questionnaire consists of 23 items representing the three scales ‘effort’ (6 items), ‘reward’ (11 items, with three subdimensions) and ‘overcommitment’ (6 items) (Siegrist et al. 2004). As this latter scale was already the result of a previous psychometrically validated reduction capturing the essence of this personal pattern of coping with work demands (‘overcommitment’) (Siegrist et al. 2004), it was included without further change into the short version. In order to achieve a reduced version of the two remaining scales ‘effort’ and ‘reward’, the following item selection criteria were applied. First, each item should represent a core aspect of the respective dimension (including the sub-components ‘job promotion/salary’, ‘job security’, and ‘esteem’ of the ‘reward’ component). Second, the frequency of the respective job content should be relatively high in a variety of occupational groups. Third, each item should meet established quality criteria and exhibit a satisfactory correlation coefficient with the respective scale. These latter decisions were based on results of exploratory factor analyses (factor loadings) drawn from a large prospective epidemiological study.

The established rating procedure of the 5-point Likert-scaled items measuring ‘effort’ and ‘reward’ consists of two steps (Siegrist et al. 2004). First, participants agree or disagree whether or not the item content describes a typical experience of their work situation. Subsequently, those who agree are asked to evaluate to what extent they usually feel distressed by this typical experience (from ‘not distressed’ to ‘very distressed’). A sum score of these ratings is calculated for each scale. Accordingly, scores of the ‘effort’ scale vary from 3 to 15 (with higher scores reflecting more stressful experience due to high effort); scores of the ‘reward’ scale vary from 7 to 35 (with lower scores reflecting more stressful experience due to low reward). A different rating procedure was applied to the scale ‘overcommitment’, where respondents were asked to answer the statements on a 4-point Likert scale (range from ‘strongly agree’ to ‘strongly disagree’; sum score 6–24, with higher scores indicating higher overcommitment). Although the psychometric properties reported here are based on the described rating procedure involving the 5-point Likert scales for ‘effort’ and ‘reward’ and the 4-point Likert scale for ‘overcommitment’, we propose to facilitate the measurement in future studies by applying a consistent 4-point Likert scale (as in the case of ‘overcommitment’) for all components of the questionnaire (see Sects. “Discussion” and “Appendix”).
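The sum scoring described above can be illustrated with a minimal sketch. The item counts and response ranges are taken from the text (3 ‘effort’ and 7 ‘reward’ items rated 1 to 5, 6 ‘overcommitment’ items rated 1 to 4); the function and variable names are hypothetical, and the sketch assumes that ratings are already coded in the direction described above, so that any reverse coding of individual items has been done beforehand.

```python
# Hypothetical scoring helper for the short ERI questionnaire.
# Item counts and rating ranges follow the text; all names are illustrative.
SCALES = {
    "effort": {"n_items": 3, "low": 1, "high": 5},          # sum score 3-15
    "reward": {"n_items": 7, "low": 1, "high": 5},          # sum score 7-35
    "overcommitment": {"n_items": 6, "low": 1, "high": 4},  # sum score 6-24
}


def sum_score(scale, ratings):
    """Return the sum score of one scale after validating the ratings."""
    spec = SCALES[scale]
    if len(ratings) != spec["n_items"]:
        raise ValueError(f"{scale}: expected {spec['n_items']} ratings")
    for rating in ratings:
        if not spec["low"] <= rating <= spec["high"]:
            raise ValueError(f"{scale}: rating {rating} out of range")
    return sum(ratings)


if __name__ == "__main__":
    print(sum_score("effort", [4, 3, 5]))
    print(sum_score("overcommitment", [2, 3, 3, 4, 1, 2]))
```

Rejecting incomplete or out-of-range ratings rather than imputing them is a design choice of this sketch only; the handling of missing items is not specified in the text.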

An appropriate fit of the data with the theoretical model should result in three moderately correlated second-order factors (‘effort’, ‘reward’, ‘overcommitment’, where ‘reward’ is further specified into three theoretically relevant sub-components) loading on a general third-order factor that represents the latent construct (see Sects. “Statistical analysis” and “Results”).

Based on theoretical arguments, ‘effort’ and ‘overcommitment’ are treated as independent factors as they represent the extrinsic (perceived demands) and intrinsic (cognitive–motivational pattern of coping with demands) components of work-related performance. In this analysis, we use the combined information (ratio) of the two scales ‘effort’ and ‘reward’ as a proxy measure of the core theoretical assumption to test discriminant and criterion validity, as done in previous reports (Siegrist et al. 2004).
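As an illustration of how such a ratio can be formed, the following sketch divides the ‘effort’ sum score by the ‘reward’ sum score and weights the denominator by the ratio of item numbers (3/7 in the short version), so that a value of 1 indicates balance. The use of this correction factor is an assumption carried over from common practice with the original questionnaire; it is not spelled out in this report, and the function name is hypothetical.

```python
def effort_reward_ratio(effort, reward, n_effort=3, n_reward=7):
    """Effort-reward ratio with an item-number correction (an assumption).

    Algebraically equal to effort / (reward * n_effort / n_reward); the
    rearranged form avoids an intermediate rounding step.
    """
    if reward <= 0:
        raise ValueError("reward sum score must be positive")
    return (effort * n_reward) / (reward * n_effort)


if __name__ == "__main__":
    # High effort combined with low reward gives a ratio above 1.
    print(effort_reward_ratio(13, 14))
```

Values above 1 then point to high effort relative to reward, values below 1 to the opposite.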

We included a number of additional measures: age, gender, self-rated health, and two indicators of socio-economic position (education and income). Self-rated health was measured by a single standard question: “Would you say your health is…very good; good; fair; bad; very bad?” (Idler and Benyamini 1997). ‘Poor health’ (bad; very bad) was defined as a binary criterion variable. Income information is based on the total monthly individual net income, which was divided into gender-specific quartiles. Education was measured as years of formal education, grouped into four categories in accordance with the German school system.
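A minimal sketch of a gender-specific quartile split is given here for illustration. It assumes that cut points are computed separately within each gender and that each person is assigned to quartile 1 to 4 accordingly. The cut-point method is the default of Python's statistics.quantiles and need not match the one used by the software applied in this study; all names and data are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import quantiles


def gender_specific_quartiles(records):
    """Map (gender, value) pairs to quartile numbers 1-4 within each gender.

    Needs at least two observations per gender. Values equal to a cut
    point are assigned to the lower quartile.
    """
    values_by_gender = defaultdict(list)
    for gender, value in records:
        values_by_gender[gender].append(value)
    # Three cut points per gender split the values into four groups.
    cuts = {g: quantiles(v, n=4) for g, v in values_by_gender.items()}
    return [bisect_left(cuts[g], value) + 1 for g, value in records]


if __name__ == "__main__":
    # Hypothetical monthly net incomes.
    sample = [("f", 900), ("f", 1400), ("f", 2100), ("f", 3000),
              ("m", 1200), ("m", 1900), ("m", 2600), ("m", 4100)]
    print(gender_specific_quartiles(sample))
```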

Statistical analysis

Psychometric information includes mean and standard deviation of each item and of the scales. Scale reliability (Cronbach’s alpha) was determined, based on item-scale correlations and inter-scale correlations. Uncorrected and corrected coefficients were interpreted according to recommendations proposed by Nunnally and Bernstein (1994).
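For readers who wish to reproduce these reliability statistics, a minimal sketch of Cronbach's alpha and of the corrected item-total correlation (the correlation of an item with the sum of the remaining items of its scale) follows. It uses the standard textbook formulas rather than the SPSS routines applied here; function names and the example ratings are hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """items: one list of ratings per item, respondents in the same order."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(items, index):
    """Correlate one item with the sum of all other items of the scale."""
    rest = [sum(row) - row[index] for row in zip(*items)]
    return pearson(items[index], rest)


if __name__ == "__main__":
    # Hypothetical ratings of three items by six respondents.
    ratings = [[1, 2, 3, 4, 5, 3], [2, 2, 3, 5, 4, 3], [1, 3, 3, 4, 5, 2]]
    print(round(cronbach_alpha(ratings), 2))
    print(round(corrected_item_total(ratings, 0), 2))
```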

To test the dimensional structure of the theoretical model, we performed confirmatory factor analysis. More specifically, we analyzed improvements in the degree of model fit with current data, starting from a model without any theoretical structure (model 0) to a model which provides the most appropriate representation of the theoretical structure (model 2). Goodness of fit was assessed by the goodness-of-fit index (GFI) and the adjusted goodness-of-fit index (AGFI) as well as the chi-square statistic and the comparative fit index (CFI). The former two indices illustrate the amount of variance and covariance explained by the model, where values >0.90 are considered satisfactory (Byrne 2001). The root mean square error of approximation (RMSEA) is a measure based on the analysis of residuals whose value should be significantly below a threshold of 0.08 (Browne and Cudeck 1993). Finally, the consistent Akaike information criterion (CAIC) informs about the model’s parsimony, based on goodness of fit, number of estimated parameters and sample size (Hu and Bentler 1995). Lower values indicate more appropriate model specifications. The maximum likelihood method was applied to estimate parameters.
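To make the role of these indices more concrete, the following sketch computes RMSEA, CFI and CAIC from a model chi-square using commonly cited textbook formulas. These are assumptions about the definitions, not a reproduction of the AMOS output: some programs divide by N instead of N − 1 in the RMSEA, and the CAIC is written here in its chi-square-based form. All input values in the example are hypothetical.

```python
from math import log, sqrt


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 convention)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative fit index relative to a null (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 if d_null == 0 else 1.0 - d_model / d_null


def caic(chi2, n_params, n):
    """Chi-square-based CAIC; lower values indicate a more parsimonious fit."""
    return chi2 + n_params * (log(n) + 1)


if __name__ == "__main__":
    # Hypothetical values, not results of this study.
    print(round(rmsea(chi2=300.0, df=100, n=1001), 3))
    print(round(cfi(300.0, 100, 4120.0, 120), 3))
    print(round(caic(300.0, 36, 1001), 1))
```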

In addition to these test-statistical models, we applied logistic regression analysis to assess one aspect of criterion-related validity, using self-rated health. Gender-specific quartiles of the scales ‘effort’, ‘reward’, and ‘overcommitment’ as well as of the ‘effort-reward ratio’ were entered separately into the models as predictors, and effects were adjusted for age, gender and socio-economic position (education and income). Odds ratios and 95% confidence intervals are provided. Analyses are based on SPSS 15.0 and AMOS 7.0.
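The unadjusted comparison of the highest with the lowest quartile can be illustrated without statistical software, because a logistic regression with a single binary predictor yields the cross-product odds ratio of the corresponding 2 × 2 table, with a Wald-type 95% confidence interval on the log scale. The sketch below covers only this unadjusted case; the adjusted models require a multivariable logistic regression. The counts are hypothetical and the function name is illustrative.

```python
from math import exp, log, sqrt


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    'Exposed' stands for the highest quartile, 'unexposed' for the lowest;
    'cases' report poor self-rated health. All four cells must be non-zero.
    """
    a, b = exposed_cases, exposed_noncases
    c, d = unexposed_cases, unexposed_noncases
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not data from this study.
    estimate, low, high = odds_ratio_ci(30, 70, 10, 90)
    print(f"OR = {estimate:.2f} (95% CI {low:.2f}-{high:.2f})")
```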

Results

First, information on the sample composition (means and percentages of socio-demographic and socio-economic characteristics) is given in Table 1. Of the 10,698 subjects in this analysis, 52.5% were men. The mean age was 41.85 years (SD ± 11.62). A majority of participants (55.9%) had 11–13 years of education.

Table 1 Sociodemographic characteristics of the study population (SOEP; N = 10,698)

Internal consistency

Item-total correlation and Cronbach’s alpha coefficients are presented in Table 2. All Cronbach’s alpha coefficients were higher than 0.70, suggesting satisfactory internal consistency (Cronbach’s alpha of 0.74 for ‘effort’, 0.79 for ‘reward’, and 0.79 for ‘overcommitment’). Further, all item-total correlation coefficients were above the threshold of 0.30 (Nunnally and Bernstein 1994), indicating considerable consistency of items defining respective scales (all corrected item-total correlation coefficients ranged from 0.40 to 0.72, results not shown).

Table 2 Mean, standard deviation (SD), Cronbach’s alpha and item-total correlation coefficients of the scales of the Shortened Effort–Reward Imbalance Questionnaire

Factorial validity

The factorial validity of the ERI scales was tested with confirmatory factor analysis using AMOS 7. First, a one-factor model was estimated proposing that all 16 items load on the same underlying dimension. Second, a model was estimated where the three factors ‘effort’, ‘reward’, and ‘overcommitment’ are represented as first-order factors, and the underlying construct ‘effort–reward imbalance’ as second-order factor. Third, a higher-order model was estimated to test the most appropriate representation of the theoretical construct (see Table 3).

Table 3 Summary of fit indices for the three competing models of confirmatory factor analysis (see Fig. 1)

Initially, a one-factor CFA model was tested. In this simple model 0, all items were assumed to load on a single latent factor. All error terms were uncorrelated. The fit statistics indicate that this model shows a poor global fit.

In model 1, three first-order factors (‘effort’, ‘reward’, and ‘overcommitment’) and one second-order factor (the underlying ERI construct), are proposed, assuming that a single second-order factor accounts for the three first-order factors. Compared to model 0, this model is improved, but still does not represent an acceptable fit with the data (see Table 3; Fig. 1).

Fig. 1 Competing models of confirmatory factor analysis testing the theoretical construct underlying the shortened Effort–Reward Imbalance Questionnaire (Models 1 and 2, see text)

In contrast, model 2 provides a satisfactory fit (see Table 3). Here, the global ‘reward’ factor was further specified by breaking it down into the three theoretically postulated components (‘esteem’, ‘job security’, ‘job promotion/salary’). ‘Effort’ and ‘overcommitment’ were represented as additional second-order factors, as was the case in model 1. The three factors were expected to load on a general third-order factor, representing the ERI construct (see Fig. 1). As shown in Table 3, the fit of this model improved considerably, compared to the former models. In summary, we can conclude that this short measure adequately represents the core theoretical structure of this model of stressful work.

Discriminant validity

To test a further relevant aspect, discriminant validity, we conducted a series of analyses of variance with predefined subgroups, in particular gender, age, socio-economic position, occupational grade, and region (East Germany vs. West Germany). Results (not shown in detail) revealed significant differences of the scales according to most of these criteria. For instance, lower occupational grades displayed lower rewards, whereas high occupational grades scored higher on ‘overcommitment’. The effort–reward ratio was higher in men than in women, and it was higher in the middle-aged group compared to the oldest and the youngest age groups.
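The logic of such a subgroup comparison can be sketched as a one-way analysis of variance, in which the F statistic relates the variation between group means to the variation within groups. The sketch returns only F and its degrees of freedom; obtaining a p value additionally requires the F distribution, which statistical packages provide. The group scores are hypothetical and the function name is illustrative.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


if __name__ == "__main__":
    # Hypothetical 'reward' sum scores in three occupational grades.
    low, middle, high = [22, 25, 24, 21], [26, 28, 27, 25], [30, 29, 31, 28]
    print(one_way_anova_f([low, middle, high]))
```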

Criterion validity

Self-rated health was the most appropriate health measure in the 2006 SOEP Survey. Thus, we tested one aspect of criterion validity of the short ERI measure, assuming that employees who scored high on the scales of the construct were at elevated risk of experiencing poor self-rated health, compared to those with the lowest scores. Three logistic regression analyses were run, an unadjusted and two adjusted analyses (age and gender; additionally socio-economic position). As can be seen from Table 4, participants with scores in the highest quartile were about three times as likely to experience poor self-rated health, compared to those in the lowest quartile. Effects were of similar size for single components and for the composite measure, the ‘effort-reward ratio’. Although some ‘dose-response’ relationship according to quartiles of exposure is observed, those in the highest quartile are clearly at elevated risk. These results suggest that the proposed short measure adequately captures groups of employees at risk of poor self-rated health.

Table 4 Associations of the scales of the shortened Effort–Reward Imbalance Questionnaire with poor self-rated health [logistic regression analyses (odds ratios and 95% confidence intervals)]

Discussion

Results of this psychometric test of a shortened version of the questionnaire measuring effort–reward imbalance at work confirm that the original theoretical construct is well represented. Based on a large representative sample of 10,698 employed men and women in Germany, data reveal satisfactory internal consistency of the scales, and confirmatory factor analyses result in a good model fit of the data with the most appropriate representation of the theoretical structure of ERI. Moreover, each scale is associated with an elevated risk of poor self-rated health, thus indicating satisfactory criterion validity in terms of this health measure. Our results are in line with earlier findings of confirmatory factor analyses of the original ERI questionnaire (Rödel et al. 2004; Siegrist et al. 2004). Concerning discriminant validity, our results support previous investigations that found statistically significant differences in mean scores of effort, reward and overcommitment according to gender and gender role orientation (Starke and Niedhammer 2002), age, socio-economic position and occupational group (Siegrist et al. 2004). In addition, prospective and cross-sectional reports documented associations of the model’s components with poor self-rated health (Godin and Kittel 2004; Niedhammer et al. 2004).

Despite these merits, several limitations of the present study need to be addressed. First, we cannot exclude some bias due to common method variance, given the fact that both work-related and health-related questions are based on subjective evaluations reported within the same assessment. Yet, in several previous publications, the control of effects of personal response styles (e.g., negative affectivity) did not invalidate associations of effort–reward imbalance at work with health (Bosma et al. 1998; Joksimovic et al. 2002; Stansfeld et al. 1999). A second limitation concerns the cross-sectional study design with restriction to a single exposure assessment. Thus, neither the stability over time of this shortened measure of work stress, nor the direction of its association with self-reported health can be further examined. Third, we applied the maximum likelihood method to evaluate the model fit of the data. Application of this method requires a multivariate normal distribution of the variables under study. As this assumption was not met by all variables, we repeated the estimation of the models by an alternative, distribution-free approach, the Unweighted Least Squares (ULS) method (Marcoulides and Hershberger 1997). The fit-indices (GFI and AGFI) were not substantially different from those estimated by maximum likelihood method. Moreover, it was shown that this latter method is quite robust against the violation of the normality assumption (Boomsma and Hoogland 2001; Curran et al. 1996; Muthén and Muthén 2002).

An important limitation is given by the original rating procedure of the Likert-scaled items measuring ‘effort’ and ‘reward’. As mentioned in the Sect. “Methods”, we originally applied a two-step procedure in answering the questions, while reducing the information subsequently into a uniform scaling procedure, with values from 1 to 5 for each item (Siegrist et al. 2004). Recent methodological analyses revealed that this assumption is not well supported by the data (Tsutsumi et al. 2008). In particular, response categories 1 (not applicable) and 2 (applicable, but not distressed) may lead to misclassification. Based on positive experience with (a) the rating procedure of the items measuring overcommitment, and (b) an identical rating procedure introduced recently in the measurement of ERI in a large prospective epidemiological investigation (Siegrist et al. 2006), we now propose to answer all 16 Likert-scaled items of the short version of this questionnaire in an identical way (strongly disagree, disagree, agree, strongly agree) (see Sect. “Appendix”). One methodological step towards this decision consists in collapsing the two response categories 1 and 2 mentioned above and rerunning all psychometric analyses of the current data. We have completed this step and observed no significant deviations, neither with regard to model fit indices nor with regard to the results of logistic regression analyses (data available on request).

Although the short measure is intended to strengthen comparative research on work and health in a globalized economy, this report is restricted to data from one single country. Yet, preliminary data from a large Swedish sample reveal similar psychometric properties of this shortened questionnaire, thus supporting previous findings of cross-cultural comparability of measurement of effort–reward imbalance at work (Siegrist et al. 2004; Tsutsumi et al. 2008).

In conclusion, despite these limitations, this research demonstrates that the effort–reward imbalance model can be measured in a reliable and valid way by a shortened version of the original questionnaire. Providing a short validated generic measure of work stress that captures some important health-adverse aspects of work and employment in a globalized economy is considered a useful step towards strengthening policy measures of health-promoting work conditions.