Original article
Longitudinal measurement invariance of the effort-reward imbalance scales in the Young Finns study
  1. Maria Törnroos1,
  2. Liisa Keltikangas-Järvinen1,
  3. Taina Hintsa1,
  4. Christian Hakulinen1,
  5. Laura Pulkki-Råback1,
  6. Markus Jokela1,2,
  7. Nina Hutri-Kähönen3,
  8. Mirka Hintsanen1,4
  1. 1Unit of Personality, Work and Health Psychology, IBS, University of Helsinki, Helsinki, Finland
  2. 2Department of Psychology, University of Cambridge, UK
  3. 3Department of Pediatrics, University of Tampere and Tampere University Hospital, Tampere, Finland
  4. 4Helsinki Collegium for Advanced Studies, University of Helsinki, Helsinki, Finland
  1. Correspondence to Professor Mirka Hintsanen, Unit of Personality, Work and Health Psychology, IBS, University of Helsinki, P.O. Box 9, Helsinki FIN-00014, Finland; mirka.hintsanen{at}


Objectives In order to make valid conclusions about individual change in work-related risk factors it is important to examine whether these factors are measurement invariant over time. We tested the measurement invariance of the effort-reward imbalance (ERI) scales using the ERI Questionnaire (ERI-Q). Additionally, we examined the criterion validity of the ERI scales.

Methods The sample used in this study was population-based and comprised 2128 participants (56.6% women) in full-time employment. Data on effort, reward and self-reported general stress were collected in 2007 and 2012. Measurement invariance was assessed separately for the effort and reward scales, with reward treated as a first-order and as a second-order variable. Criterion validity of the ERI scales was also examined using a single-item measure of general stress.

Results Effort and reward were found to be measurement invariant over time, that is, they measured the same latent variable across both time points. Furthermore, ERI and its components showed adequate criterion validity, and effort was additionally found to prospectively predict general stress 5 years later (β=0.072, 95% CI 0.013 to 0.131).

Conclusions Our results indicate that changes in the scores of the ERI scales are more likely caused by changes in perceptions of work characteristics than by changes in the construct of the scales. Additionally, the results support the criterion validity of ERI and its components.

  • Invariance
  • Effort-Reward Imbalance
  • ERI-Q

What this paper adds

  • Measurement of change in work-related stress may be confounded if the scales used are not measurement-invariant over time.

  • Previous research on measurement invariance of the effort-reward imbalance is contradictory.

  • In our study, the effort-reward imbalance scales were shown to be invariant over time.

  • Change in stressful work characteristics can reliably be measured with the effort-reward imbalance scales, and changes in the parameters of the scales reflect true changes in work characteristics.


Work-related stress is a considerable problem in industrialised countries. In 2005, 22% of the workforce in the 15 EU countries reported experiencing work-related stress,1 and stressful working conditions have been shown to have severe consequences on health in the form of depression, sickness absence and coronary heart disease.2–4

The model on effort-reward imbalance at work has been used to assess stressful working conditions and their adverse outcomes on health and well-being.5–8 The effort-reward imbalance model distinguishes two work characteristics, high effort and lack of reward, as indicators of stressful working conditions. Effort is interpreted as the demands and obligations the employee is faced with and reward as the money, esteem and career opportunities (or job security) the employee subsequently expects.5 Lack of reciprocity in the exchange of efforts and rewards, that is, high efforts combined with low rewards, is suggested to cause emotional distress and heightened autonomic arousal with unfavourable health consequences, such as cardiovascular events.5

The effort-reward imbalance model is hypothesised to be sensitive to changes in the labour market and to reflect the predominant macroeconomic labour-market situation.9 The surveillance of changes in work-related stress is important for organisations because these changes might reflect reactions to organisational restructuring or outcomes of stress intervention.10 ,11 However, the analysis of change may be confounded if the scale used to assess change is not invariant over time.12 That is, the scale needs to measure stressful work characteristics in the same way (ie, the same latent variable) across repeated measurement times. Therefore, the invariance of a measure is crucial because it allows researchers to make comparisons between constructs over time, knowing that the operationalisation of the construct has not changed.12 ,13

Effort-reward imbalance and its components are most commonly assessed with the Effort-Reward Imbalance Questionnaire (ERI-Q), a self-report measure developed by Siegrist.5 ,14 The scales have been validated in several studies and the unidimensionality of the effort and the reward scales has been confirmed.9 ,15 ,16 However, knowledge on the measurement invariance of the scales is scarce. In a Dutch panel study on 383 (first wave) and 267 (second wave) healthcare workers (80–90% women) with a 1–2 year follow-up, the factor loadings of the effort-reward imbalance scales were found not to be invariant over time, but the changes were relatively small and may have been related to the use of a two-step format of the ERI-Q.17 A Finnish study on 758 white-collar professionals (14–17% women), on the other hand, showed that the effort-reward imbalance scales were invariant across time (4-year follow-up time).18 The discrepancies in the results of previous research might be due to the differences in the samples, and the uneven distribution of gender and occupational groups. Therefore, more research is needed using population-based samples, a wider range of occupational groups, and an even gender distribution. It is also important to examine whether effort-reward scales have measurement invariance over longer time lags than 4 years.

The aim of our study is to assess whether the effort-reward imbalance scales show invariance over time, that is, whether the scales measure the same underlying concepts at two different time points. Additionally, we aim to test the predictive power of the effort-reward imbalance scales on a prospective measure of self-reported general stress. In a previous cross-sectional study on Swiss healthcare workers, the measure of general stress used in our study has been associated with effort-reward imbalance and work-life conflict.19 It has also been recognised as a valid measure of stress symptoms on group level.20 Investigating the impact of effort-reward imbalance on general stress adds valuable information to the work stress literature on criterion validity of the effort-reward imbalance measures. The sample used in our study is large, population-based, consists of a wide range of occupations, and contains data on effort-reward imbalance collected at two time points, 5 years apart (in 2007 and 2012), using the original questionnaire (ERI-Q).



The data for our study were from the ongoing population-based Young Finns study, which started in 1980.21 ,22 The original sample consisted of 3596 participants from six age cohorts. Data for our study were collected in 2007 and 2012. In 2007, the participants were 30–45 years old and, thus, the sample represents working-aged adults. The sample for our study consisted of 2128 participants (1204 women, 56.6%), who had data on the effort and reward scales from either 2007 or 2012. The number of participants varied according to the analyses; there were 1228 and 1177 participants in the analyses of invariance for effort and reward, respectively. In the regression analyses on the association of effort-reward imbalance and its components with general stress, there were 1237 (unadjusted model) participants with no more than 50% missing values in the effort and reward scales in 2007, and 1083 (fully adjusted model) participants with additionally no missing data in the covariates.

Effort-reward imbalance

In 2007 and 2012, effort (α=0.76) was measured with five items and reward (2007: α=0.82; 2012: α=0.83) with 11 items from the original scales of the ERI-Q.5 The components of reward were esteem (5 items; α=0.84), promotion (4 items; 2007: α=0.63; 2012: α=0.62) and security (2 items; 2007: α=0.68; 2012: α=0.62). The responses for effort and reward were given on a scale from 1 (does not apply) to 5 (does apply). Mean scores for effort and reward were calculated for those participants who met the prerequisite of being included in the study. Effort-reward imbalance was calculated by dividing the mean scores of the effort component by the mean scores of the reward component.9
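The scoring described above can be illustrated with a minimal Python sketch. It assumes that item responses are stored as lists with `None` for missing answers, that reward items are coded so that higher values mean more reward, and that a scale mean is computed only when no more than 50% of the items are missing (the inclusion rule reported for the regression sample). The function names and the example respondent are hypothetical and are not taken from the study's analysis scripts.

```python
from typing import Optional, Sequence


def scale_mean(items: Sequence[Optional[float]],
               max_missing: float = 0.5) -> Optional[float]:
    """Mean of answered items, or None if too many items are missing."""
    answered = [x for x in items if x is not None]
    if not items or (len(items) - len(answered)) / len(items) > max_missing:
        return None
    return sum(answered) / len(answered)


def eri_ratio(effort_items: Sequence[Optional[float]],
              reward_items: Sequence[Optional[float]]) -> Optional[float]:
    """Effort-reward imbalance as mean effort divided by mean reward."""
    effort = scale_mean(effort_items)
    reward = scale_mean(reward_items)
    if effort is None or reward is None:
        return None
    # Responses range from 1 to 5, so the reward mean is never zero.
    return effort / reward


# Hypothetical respondent: 5 effort items and 11 reward items, one missing each.
effort_items = [4, 5, 3, None, 4]
reward_items = [2, 3, 2, 2, 3, 2, None, 3, 2, 2, 3]
ratio = eri_ratio(effort_items, reward_items)
print(ratio)
```

Because both components are means on the same 1 to 5 response scale, a ratio above 1 in this sketch indicates that reported effort exceeds reported reward.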

General stress

General stress was measured in 2007 and 2012 by a one-item question on stress symptoms from the Occupational Stress Questionnaire (‘Stress means a situation in which a person feels tense, restless, nervous or anxious or is unable to sleep at night because his/her mind is troubled all the time. Do you feel this kind of stress these days?’).23 Response was given on a scale from 1 (not at all) to 5 (very much). The validity of the single-item measure has been shown previously.20 General stress in 2012 was used as an outcome criterion.

Control variables

In addition to age and gender, educational level and occupational status were included in the regression analyses since they are potential confounders.24 Educational level was classified as (1) low (comprehensive school), (2) intermediate (secondary education), or (3) high (academic; graduated from a polytechnic or a university). Occupational status was based on the Central Statistical Office of Finland: (1) manual, (2) lower non-manual, and (3) upper non-manual. The occupational status of entrepreneurs was determined based on their educational level (low, intermediate and high education corresponding to manual, lower non-manual and upper non-manual, respectively).

Statistical analyses

Longitudinal measurement invariance was assessed with a series of confirmatory factor analyses differentiating four types of measurement invariance: configural invariance, weak (metric) invariance, strong (scalar) invariance and strict (residual) invariance.13 Configural invariance is a baseline model, where all parameters are allowed to vary. Weak invariance is assessed when the factor loadings are restricted to be equal over time, but the intercepts and residuals are allowed to vary. Strong invariance is assessed by additionally restricting the intercepts to be equal, but allowing the residuals to vary. Strict invariance is tested when all the parameters (ie, factor loadings, intercepts and residuals) are restricted to be equal over time.
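The nesting of the four invariance levels can be written down as a small data structure. The sketch below is only an illustration of which parameter types are held equal across the two time points at each level; the names are hypothetical and it does not fit any model (the analyses themselves were run in Mplus).

```python
# Parameter types that can be constrained equal across the two time points.
ALL_PARAMETERS = frozenset({"loadings", "intercepts", "residuals"})

# Each level adds constraints to the previous one (least to most restrictive).
INVARIANCE_LEVELS = {
    "configural": frozenset(),
    "weak": frozenset({"loadings"}),
    "strong": frozenset({"loadings", "intercepts"}),
    "strict": frozenset({"loadings", "intercepts", "residuals"}),
}


def free_parameters(level: str) -> frozenset:
    """Parameter types still allowed to differ between time points."""
    return ALL_PARAMETERS - INVARIANCE_LEVELS[level]


def is_nested_sequence(levels: dict) -> bool:
    """True if every level's constraints strictly contain the previous level's."""
    sets = list(levels.values())
    return all(a < b for a, b in zip(sets, sets[1:]))


for name in INVARIANCE_LEVELS:
    print(name, "constrained:", sorted(INVARIANCE_LEVELS[name]),
          "free:", sorted(free_parameters(name)))
```

Because each level's constraint set contains the previous one, the models form a nested sequence, which is what allows their fit to be compared step by step.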

In order to test the invariance of effort and reward, we tested the two dimensions separately following the theoretical model which states that effort-reward imbalance is the ratio of effort to reward.9 Indeed, previous research has confirmed the factorial validity of the effort-reward imbalance scales when testing effort and reward separately instead of examining the complete effort-reward imbalance model simultaneously.9 ,25 Reward was examined in two ways, as a first-order factor (ie, the sum score of all the components) and as a second-order factor consisting of the first-order factors esteem, promotion and security.

Because the same individual items were measured at both time points, their residual variances were allowed to correlate in all models. In the effort scale, two sets of two items had a correlation over 0.4 in addition to being theoretically very similar. Thus, we allowed them to correlate in all models. In the first-order reward scale, seven items were allowed to correlate as shown in figure 1. The structural model for reward as a second-order factor is shown in figure 2. Factor loadings of the effort and reward items in 2007 are shown in figure 1 (effort and first-order reward) and figure 2 (second-order reward). The items are labelled according to the original labelling.9

Figure 1

Correlations between conceptually similar items and factor loadings in the effort and reward scales in 2007 (all factor loadings shown were significant at p<0.001).

Figure 2

Factor loadings of items in the second-order reward scale with first-order factors esteem, promotion and security in 2007 (all factor loadings shown were significant at p<0.001).

Model fit was evaluated based on the Comparative Fit Index (CFI), the Root Mean Square Error of Approximation (RMSEA) index, and the Bayesian Information Criterion (BIC). Neither CFI nor RMSEA is affected by model complexity, and CFI is also independent of sample size.26 CFI values above 0.95 and RMSEA values below 0.05 indicate good fit. BIC was used to compare models: the lower the BIC, the better the model fit.
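For readers less familiar with these indices, the following Python sketch shows their commonly used textbook definitions computed from model chi-square values and the log-likelihood. It is an assumption that these match the exact variants reported here: software packages differ in details (for example, using N rather than N − 1 in the RMSEA denominator, or applying robust corrections), and all input values in the usage lines are hypothetical.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root Mean Square Error of Approximation (N - 1 variant)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int,
        chi2_base: float, df_base: int) -> float:
    """Comparative Fit Index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def bic(log_likelihood: float, n_params: int, n: int) -> float:
    """Bayesian Information Criterion; lower values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n)


# Hypothetical inputs, for illustration only.
print(round(rmsea(chi2=100.0, df=50, n=201), 4))
print(round(cfi(100.0, 50, 1060.0, 60), 3))
print(round(bic(log_likelihood=-100.0, n_params=10, n=100), 3))
```

The BIC penalty grows with the number of free parameters, so a more constrained model can obtain a lower BIC than a less constrained one whenever the added equality constraints cost little in likelihood.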

The associations of effort-reward imbalance and its components with general stress were examined by linear regression analyses. First, we adjusted for age and gender. Second, we additionally adjusted for occupational status and educational level. In the third model, stress at baseline in 2007 was also entered as a confounder into the analyses. We used STATA (V.12), IBM SPSS (V.21) and Mplus (V.7) for the statistical analyses.
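The stepwise adjustment strategy can be sketched with ordinary least squares on simulated data. Everything below is an assumption made for illustration: the data are synthetic (not Young Finns data), education and occupational status are entered as simple numeric codes, the coefficients are unstandardised, and the small `ols` helper is a hypothetical stand-in for the regression routines of the statistical packages actually used.

```python
import random


def ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Synthetic data: baseline stress depends on effort, and follow-up stress
# depends mostly on baseline stress plus a small direct effect of effort.
random.seed(0)
data = []
for _ in range(2000):
    effort = random.gauss(0, 1)
    stress07 = 0.5 * effort + random.gauss(0, 1)
    data.append({
        "effort": effort,
        "age": random.randint(30, 45),
        "female": random.randint(0, 1),
        "education": random.randint(1, 3),
        "occupation": random.randint(1, 3),
        "stress07": stress07,
        "stress12": 0.1 * effort + 0.5 * stress07 + random.gauss(0, 1),
    })

MODELS = {
    "model 1": ["effort", "age", "female"],
    "model 2": ["effort", "age", "female", "education", "occupation"],
    "model 3": ["effort", "age", "female", "education", "occupation", "stress07"],
}

y = [d["stress12"] for d in data]
effort_coef = {}
for name, predictors in MODELS.items():
    rows = [[d[p] for p in predictors] for d in data]
    effort_coef[name] = ols(rows, y)[1]  # index 0 is the intercept
    print(name, round(effort_coef[name], 3))
```

In this simulated setting the effort coefficient shrinks once baseline stress is entered, which is the kind of attenuation the third model is designed to reveal.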


The descriptive statistics for the sample are shown in table 1. The mean age of the participants was 37.6 years in 2007, and slightly more women than men participated (56.6%). Of the participants, 95.4% had secondary education or higher, and the largest occupational group was upper non-manual (47.8%).

Table 1

Sample descriptives

Longitudinal measurement invariance

Table 2 shows a summary of the fit indices for the invariance models. For effort, the strict model fit the data best (RMSEA=0.048, CFI=0.977, BIC=33 418.078). RMSEA and BIC values were lower in the strict model than in the strong (RMSEA=0.051, BIC=33 448.412) or weak (RMSEA=0.054, BIC=33 474.448) model. For first-order reward, all models showed adequate fit, but examination of the BIC revealed that the strict model (RMSEA=0.056, CFI=0.916, BIC=66 166.819) fit the data better than the strong (BIC=66 222.601) or the weak (BIC=66 227.953) model. Likewise, when reward was treated as a second-order factor, the strict model had the best fit (RMSEA=0.052, CFI=0.924, BIC=66 038.978) when comparing the BIC to the strong (BIC=66 101.982) or the weak (BIC=66 097.109) model.

Table 2

Summary of goodness-of-fit for the invariance models
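The model comparison reported in the text can be retraced with a few lines of Python. The sketch uses only the BIC, RMSEA and CFI values quoted in the Results text together with the cut-offs stated in the Methods (CFI above 0.95 and RMSEA below 0.05 for good fit); the dictionary layout and function names are hypothetical.

```python
# BIC values as quoted in the Results text.
BIC = {
    "effort": {"weak": 33474.448, "strong": 33448.412, "strict": 33418.078},
    "reward (first-order)": {"weak": 66227.953, "strong": 66222.601,
                             "strict": 66166.819},
    "reward (second-order)": {"weak": 66097.109, "strong": 66101.982,
                              "strict": 66038.978},
}

# RMSEA and CFI of the strict models, as quoted in the Results text.
STRICT_FIT = {
    "effort": {"rmsea": 0.048, "cfi": 0.977},
    "reward (first-order)": {"rmsea": 0.056, "cfi": 0.916},
    "reward (second-order)": {"rmsea": 0.052, "cfi": 0.924},
}


def best_by_bic(models: dict) -> str:
    """Name of the model with the lowest BIC."""
    return min(models, key=models.get)


def good_fit(cfi: float, rmsea: float) -> bool:
    """Good-fit rule from the Methods: CFI > 0.95 and RMSEA < 0.05."""
    return cfi > 0.95 and rmsea < 0.05


for scale, models in BIC.items():
    fit = STRICT_FIT[scale]
    print(scale, "| lowest BIC:", best_by_bic(models),
          "| strict model meets good-fit cut-offs:",
          good_fit(fit["cfi"], fit["rmsea"]))
```

Applied to these values, the strict model has the lowest BIC for all three scale specifications, while only the effort scale meets both good-fit cut-offs.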

Criterion validity

The results for the associations of effort-reward imbalance and its components in 2007 with general stress in 2012 are shown in table 3. High general stress was associated with high effort (β=0.269, 95% CI 0.214 to 0.324), low reward (β=−0.129, 95% CI −0.186 to −0.072) and high effort-reward imbalance (β=0.294, 95% CI 0.242 to 0.346). These associations were not attenuated after adjusting for educational level and occupational status. Additionally adjusting for baseline general stress decreased the estimates so that only high effort was associated with higher general stress (β=0.072, 95% CI 0.013 to 0.131).

Table 3

Results of linear regression analyses on effort-reward imbalance and its components in 2007 with general stress in 2012
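A reported coefficient and its 95% CI can be converted back into an approximate standard error and test statistic. The sketch below assumes a symmetric, normal-based interval (with more than 1000 participants a t-based interval would be nearly identical); the function name is hypothetical, and the inputs are the effort estimates quoted in the text.

```python
import math


def wald_from_ci(beta: float, lower: float, upper: float,
                 z_crit: float = 1.959964):
    """Approximate SE, z statistic and two-sided p value from a 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Effort, adjusted for age and gender, and effort additionally adjusted
# for education, occupational status and baseline general stress.
for label, beta, lo, hi in [("effort, model 1", 0.269, 0.214, 0.324),
                            ("effort, model 3", 0.072, 0.013, 0.131)]:
    se, z, p = wald_from_ci(beta, lo, hi)
    print(f"{label}: SE={se:.4f}, z={z:.2f}, p={p:.4g}")
```

Under these assumptions the fully adjusted effort estimate remains distinguishable from zero, although with a much smaller test statistic than in the first model.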


Changes in work-related stress over time may be caused by personal experiences, organisational restructuring and societal changes in labour markets. The identification of such changes requires measurement instruments that are not confounded by instability in repeated measurements over time. Our results show that the effort and reward scales of ERI-Q14 achieved strict measurement invariance, indicating that the effort and reward scales measure the same latent variable over time. Our results are important because they show that scores on effort-reward imbalance from different time points can be reliably compared with each other, and that changes in the parameters of the scales reflect true changes in work characteristics rather than a lack of invariance in the constructs.

Furthermore, we found that high effort, low reward and high effort-reward imbalance were associated with higher risk for general stress 5 years later. Thus, our results indicate that effort-reward imbalance and its components have adequate criterion validity. Moreover, based on the analyses that controlled for the baseline general stress, high effort seems to be a valid prospective risk indicator for higher general stress 5 years later. This goes beyond the scope of what is usually demanded as evidence for adequate criterion validity. These results are in accordance with the effort-reward imbalance model, which states that effort is spent as a part of a contract, where sufficient rewards are expected in return.9 If the rewards do not match the effort, this lack of reciprocity is considered particularly stressful.5 Current findings also support previous studies that have shown associations of effort-reward imbalance with decreased health and well-being.6 ,27 ,28

Previous studies assessing measurement invariance of the ERI-Q have used samples of only one occupational group and an uneven distribution of men and women.17 ,18 It has been suggested that future studies should use more heterogeneous, nation-representative samples.18 The sample used in our study was large, population-based and consisted of various occupations. Additionally, the gender distribution was fairly even. Thus, our results can be more reliably generalised to the general population. The time lag between measurement points in our study was 5 years, compared with 1–4 years in previous studies, which provides information on the long-term time invariance of the effort-reward imbalance scales.

The present study has some limitations that should be taken into account. First, the RMSEA and CFI values of the invariance models for the first-order and the second-order reward scale showed adequate fit, rather than good fit. However, we were able to obtain strict measurement invariance in all scales, which indicates that the constructs are indeed measurement-invariant over time. Second, the outcome variable for assessing criterion validity was measured with only one item. However, the single-item measure of stress has been validated and shown to be a sensitive indicator of well-being at work.20 Third, effort-reward imbalance and general stress were measured using self-reports, which might inflate the results due to response bias. However, it has been suggested that common method variance is not automatically a source of bias, especially in organisational research.29–31 Fourth, the participants of our study were more likely to report having a non-manual occupation than a manual occupation. Therefore, our results might reflect white-collar professionals slightly better than blue-collar workers.

In conclusion, we showed with prospective longitudinal data that the effort-reward imbalance scales are measurement-invariant over time and can, thus, be reliably measured at different time points with the ERI-Q. Additionally, our results indicate that effort-reward imbalance and its components have criterion validity as measures of experienced stress. Due to the wide use of the effort-reward imbalance model to study and depict stressful working conditions, establishing measurement invariance of the scales has implications for a large number of longitudinal organisational studies. Additionally, knowing that changes in the scores reflect true changes in stressful work characteristics enables occupational health services to target stress management and intervention appropriately.



  • Contributors MT: performed the majority of the statistical analyses, interpreted the analyses and drafted the manuscript. LK-J: contributed to the design and data collection. MH, TH, LP-R and MJ: contributed to the data collection. MH, TH and MJ: revised the manuscript critically and substantially contributed to the interpretation of the results. CH: performed some of the statistical analyses, substantially contributed to the interpretation of the results and read and critically commented on the manuscript. LP-R, NH-K and LK-J: read and critically commented on the manuscript. All authors have approved the final version submitted for publication.

  • Funding This study has been financially supported by the National Doctoral Programme of Psychology (Academy of Finland) (MT), Academy of Finland projects 258578 (MH), 258711 (LK-J) and 265869 (LK-J), the Emil Aaltonen Foundation (MH), the Signe & Ane Gyllenberg Foundation (MH), the Alli Paasikivi Foundation (MH), The Finnish Cultural Foundation (CH), and the Juho Vainio Foundation (LP-R). The Young Finns Study has been financially supported by the Academy of Finland: grants 134309 (Eye), 126925, 121584, 124282, 129378 (Salve), 117797 (Gendi), and 41071 (Skidi), the Social Insurance Institution of Finland, Kuopio, Tampere and Turku University Hospital Medical Funds, the Juho Vainio Foundation, the Paavo Nurmi Foundation, the Finnish Foundation of Cardiovascular Research and the Finnish Cultural Foundation, the Tampere Tuberculosis Foundation and the Emil Aaltonen Foundation.

  • Competing interests None.

  • Patient consent Obtained.

  • Ethics approval The study has been approved by local ethics committees.

  • Provenance and peer review Not commissioned; externally peer reviewed.
