Abstract
If policy makers and employers are to take health issues into account when making decisions that will impact on work practices and work environments, they will need accurate information concerning the impact that change in psychosocial working conditions has on health status. Although research is increasing in this area, a variety of different methods have been used to define when change in work conditions has occurred. The present paper considers various issues related to the accurate assessment of change in psychosocial working conditions, focusing on research designs that involve the collection of data at baseline and a single follow-up time point. The aim is to inform investigators about these methodological issues so they can be considered in the design of studies, the analysis of data and the interpretation of research findings.
It is well established that groups exposed to negative psychosocial work conditions have worse health outcomes than groups who are not exposed.1–4 In spite of the large body of evidence linking exposure to negative psychosocial work environments (eg, low job control and high job demands) to poor health, there is currently little evidence that changing these exposures will necessarily lead to differences in health risk and level of health outcomes.5
As more researchers start to investigate the health effects of changes in working conditions, there is a need for greater attention to be given to the methodological issues involved in the analysis of change among individuals over time. In addition, there is a need to understand differences in the approaches to examining change over time (eg, using a difference score versus using a regression model), as differences in these methods can lead to different results, even when using the same data set.6–8
Defining who has and who has not changed, at an individual level, is a methodologically challenging task.9–12 While change could be identified in many ways (eg, retrospective recall of change by individuals at follow-up), most often it is determined using reports of the psychosocial work environment on two separate occasions. If psychosocial work constructs could be measured without error, respondents who experience a change in their work environment over time could be identified using a simple difference score calculation. However, like most constructs, the measurement of psychosocial work stress is subject to error, with estimates of work stress (xt1) being composed of the true work stress score (Xt1) plus some error (∊), both random and systematic, which needs to be taken into consideration when measuring change.
The purpose of this paper is to review the measurement of change at the individual level, as it applies to psychosocial work constructs. Our primary focus is on situations where researchers are seeking to include changes at the level of the individual as an independent variable in a model to predict a health outcome of interest, that is, defining the amount of change between time points is the primary interest. We present and discuss four main methodological considerations related to defining change between time points. These are:
- the day-to-day variability in scores;
- the reliability of a difference score when it is the unit of analysis;
- adjusting for the effect of initial scores on change; and
- the stability of the construct over time.
From the outset, we do not claim the information in this paper is new. However, based on the papers we are aware of that have examined the effects of change in psychosocial exposures on health outcomes,5 13–21 these considerations have not been widely incorporated into the examination of change in working conditions. Thus, a review of these challenges seems prudent as longitudinal data on working conditions become more prevalent.
Throughout this paper we provide examples using data from two different sources: the longitudinal file of the Canadian National Population Health Survey (NPHS), an ongoing biennial survey of approximately 17 000 Canadians (psychosocial work exposures have been measured in the NPHS over 6- and 2-year periods using an abbreviated version of the Job Content Questionnaire (JCQ); see appendix for a list of questions)22 23; and employees of a multi-national Canadian company manufacturing foam parts used by the automotive industry (in this sample work stress was measured twice, 12 months apart, using the full JCQ, before and after an ergonomic intervention; see appendix for questions).24 Our preponderance of examples using the JCQ is only a function of data availability; the challenges described here apply equally to all self-reported measures of the work environment.
Issue 1. The day-to-day variability in scores
Researchers measuring change should first determine whether observed fluctuations in scores (or movement between categories of work stress) between time points are actual changes or just the day-to-day variability (measurement error) in the instrument being used. The day-to-day variability of a measure is estimated by a test-retest reliability, although this reliability estimate has limited interpretability until it is translated back into the scale of the measure used. Jacobson25 26 has proposed a reliable change (RC) index to estimate the variation in scores between time periods attributable to measurement error:
RC = (x2 − x1) / (σt1√2√(1 − ρ))

where:
x2 − x1 = the difference between follow-up and baseline scores,
σt1 = the standard deviation of the measure at baseline,
ρ = the reliability of the measure (ideally a test-retest correlation).
Jacobson suggests that, 95% of the time, respondents with no real change would not exceed an RC value of ±1.96 (ie, p<0.05). This formula can be rearranged to calculate the expected change (x2−x1) for a given value of RC′ (eg, 1.96)26:
x2 − x1 = RC′ × σt1√2√(1 − ρ)

where, 95% of the time, difference scores among respondents with no change will not exceed x2 − x1. This can be referred to as the MDC95 (the minimum detectable change at a 95% confidence level).27 28 Minimum detectable change (MDC) using different thresholds can easily be calculated by substituting a different value for RC′ in the equation above (eg, an MDC90, a threshold which respondents with no change would not exceed 90% of the time, would use RC′ = 1.645; similarly, an MDC99 would use RC′ = 2.576). Ideally, the test-retest (T-RT) correlation (ρ) should be taken from the same, or a similar, population over a time period short enough to be confident that no real change has occurred but long enough to ensure that responses are not based on recall of previous responses alone. Streiner and Norman recommend a time period of 2–14 days,29 although this may not be appropriate for all work stress measures. For example, real fluctuations in constructs like job demands may occur over shorter time periods.
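The RC and MDC calculations above can be sketched in Python. The inputs used below (a baseline standard deviation of 3.9 and a test-retest reliability of 0.86, which together give an MDC95 of about ±4) are illustrative values chosen to mirror the NPHS job control example discussed later, not parameters taken from the original analyses.

```python
import math

def reliable_change(x1, x2, sd_baseline, reliability):
    """Jacobson's reliable change (RC) index: the observed difference
    divided by the standard error of a difference score."""
    sem = sd_baseline * math.sqrt(1 - reliability)   # standard error of measurement
    se_diff = math.sqrt(2) * sem                     # standard error of the difference
    return (x2 - x1) / se_diff

def minimum_detectable_change(sd_baseline, reliability, rc_critical=1.96):
    """Smallest change exceeding day-to-day variability at the chosen
    threshold (1.645 -> MDC90, 1.96 -> MDC95, 2.576 -> MDC99)."""
    return rc_critical * math.sqrt(2) * sd_baseline * math.sqrt(1 - reliability)

# Illustrative values chosen to approximate the NPHS job control example:
mdc95 = minimum_detectable_change(3.9, 0.86)   # roughly 4 points
```

With these inputs only respondents whose scores move by more than about 4 points would be classified as having changed more than day-to-day variability.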
With large sample sizes (300+ respondents) one may be tempted to use Cronbach’s alpha (α), an alternate measure of reliability based on item homogeneity, to estimate the reliability of the instrument.30 We suggest always using the T-RT, as the α may underestimate the T-RT estimate if relatively few items are being used to capture a relatively broad concept,31 32 as is often the case with psychosocial work stress measures (see example below).
Application in research on changes in psychosocial working conditions
We are aware of only two T-RT estimates for the JCQ over a time period where no change is likely to have occurred. The first, from the Ontario Child Health Survey (OCHS), reports T-RT correlations for various components of the full JCQ (skill discretion = 0.895, decision authority = 0.84 and job control = 0.885) using a subsample of 48 participants over a 2-week period.i A second study, by Head and colleagues,16 reported a T-RT correlation of 0.902 for job control (15 items) in a subsample of 267 participants over a 1-month period in the Whitehall II study (obtained through personal communication with Dr Head).
Using the T-RT information from the OCHS, we estimated the T-RT for the five job control items contained in the NPHS sample (T-RT ICC = 0.86), giving a minimum detectable change estimate (MDC95) between 1994 and 2000 job control scores of ±4. That is, 95% of the time, respondents with only day-to-day variability in their scores will not change by more than ±4. We then estimated the per cent of the labour force who reported changes in job control above this threshold. These results, for both the 1994–2000 and 2000–2002 NPHS samples, are presented in table 1, with 25.5% of respondents classified as changing in self-reported job control greater than error between 1994 and 2000, and 18.3% of the sample reporting changes between 2000 and 2002.
Ideally, estimates of test-retest reliability should be from a subset of the same sample being analysed. Given the limited information on the T-RT correlations for work stress instruments in general, we encourage researchers designing studies to consider retesting a subsample of their study population to accurately estimate the day-to-day variability in their measures.
We are aware of only one paper that has incorporated T-RT information into the estimate of change.16 Another assigned a change less than 10% to indicate day-to-day variability,17 with another using movement of more than 0.5 of a standard deviation in baseline score to indicate change.15 Using either of the latter two methods would have overestimated the number of respondents whose job control scores had changed in our sample.
Measuring change in working conditions as either movement between job strain categories5 13 14 or by creating categories of change based on the distribution of difference scores (eg, creating tertiles or quartiles of the difference score and assuming the middle group(s) have not changed19 21) may lead to the inference that change has taken place when it is no more than day-to-day variability, in samples with relatively stable working conditions. Conversely, in samples where psychosocial work stress is relatively unstable over time, no change may be inferred when in fact change has occurred. Further, forming groups based on the distribution of difference scores may mislabel respondents if change in the working condition of interest occurs in only one direction.
Issue 2. The reliability of a difference score when it is the unit of analysis
Instead of categorising who has and who has not changed, investigators may want to estimate the magnitude of change between time points. In other words, the difference between time two and time one estimates is the variable of interest. If change is measured with a difference score, then we become reliant on the precision with which we can estimate the differences between time points, which is in turn dependent upon the precision of each measure. The formula to calculate the precision of a difference score, as given by Traub,33 is:
ρD = (σ²t1ρt1 + σ²t2ρt2 − 2σt1,t2) / (σ²t1 + σ²t2 − 2σt1,t2)

where:
ρD = the reliability of the difference score,
ρt1 = the reliability of scores at time one,
ρt2 = the reliability of scores at time two,
σ²t1, σ²t2 = the variances of scores at time one and time two,
σt1,t2 = the covariance between time one and time two.
A high correlation between time one and time two values may indicate the measure is unable to detect which respondents have changed, lowering the reliability of the difference score.11 33 Conversely, as true variation in change between respondents increases, the reliability of the difference score increases. Low cross-sectional precision at either time point (as measured by Cronbach’s α) will also decrease the precision of the difference estimate. Note that for ρt1 and ρt2, Cronbach’s α, not the T-RT, should be used, given we are interested in how precisely the construct is measured by scale items at a given time point, not the day-to-day fluctuations in the measure.
Streiner and Norman34 suggest that a difference score should be used only when its reliability exceeds 0.50, although higher reliability may be desirable for some researchers. Table 2 presents the reliability of change measurements for the NPHS data (1994–2000 and 2000–2002) and job control scores for the 1-year follow-up at the Canadian foam manufacturing plant. The low reliability of the difference score estimate in the NPHS data is driven by the low reliability of time one and time two job control measures, most probably due to the use of the abbreviated JCQ measure in this sample. Given an α of 0.80 (the usual α estimate for the full JCQ35), the corresponding reliability for the change estimate would increase to 0.61 and 0.52 in 1994–2000 and 2000–2002, respectively, giving the difference score acceptable reliability, according to Streiner and Norman’s criterion.
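Traub's formula can be applied directly in Python. The variances, reliabilities and covariance used below are hypothetical illustrations (standardised scores with unit variances, correlated 0.4 over time), not the NPHS or foam-plant values, but they reproduce the pattern described above: raising the cross-sectional α lifts the reliability of the difference score.

```python
def difference_score_reliability(var_t1, var_t2, rel_t1, rel_t2, cov_t12):
    """Traub's formula: true-score variance of the difference divided
    by the total variance of the difference."""
    true_var = var_t1 * rel_t1 + var_t2 * rel_t2 - 2 * cov_t12
    total_var = var_t1 + var_t2 - 2 * cov_t12
    return true_var / total_var

# Hypothetical standardised scores (unit variances) correlated 0.4 over time:
low_alpha = difference_score_reliability(1.0, 1.0, 0.7, 0.7, 0.4)   # 0.50
high_alpha = difference_score_reliability(1.0, 1.0, 0.8, 0.8, 0.4)  # about 0.67
```

In this illustration, raising α at each time point from 0.7 to 0.8 lifts the reliability of the difference score from 0.50 to about 0.67, echoing the improvement reported for the NPHS data above.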
Application in research on changes in psychosocial working conditions
Neither of the papers that we identified that measured change in psychosocial work stress using a difference score reported the reliability of their change score.18 20 Note that even if the reliability of a difference score is equal to Streiner and Norman’s criterion, still only 50% of the variance in this score is attributable to true variance between respondents. If wanting to measure change between time points with a difference score, researchers should be cautious about using abbreviated measures, as these will usually result in estimates with lower precision. In general, the use of an unreliable estimate of the difference between time points, as with other measures, will attenuate the associations found between this measure and an outcome, provided the error in the score is randomly distributed. When the reliability of the difference score is low, we suggest that change should instead be defined as groups of respondents whose work stress has increased, remained stable or decreased (based on movement greater than day-to-day variability).
Issue 3. Adjusting for the effect of initial scores on change
Regression to the mean (RTM) refers to the tendency for extreme observations in a distribution at baseline to move closer to the mean at follow-up.36 When measuring change in working conditions there are two separate aspects of RTM that need to be considered.
The first is the RTM that one might expect when there has been no actual change in the measure of interest. RTM in this instance is due to scores at the positive end of a distribution tending to include observations that have positive errors of measurement on the first test, and vice versa for scores at the bottom of the distribution.6 Therefore, examination of the changes in scores at the second testing, when no change has occurred, should be estimated. Similar to day-to-day variability, outlined earlier, the lower the reliability of the measure (ρ), the greater the RTM one can expect.
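A small simulation, under the classical test theory assumptions used throughout this paper (observed score = true score + random error, with no true change between time points), illustrates how much regression to the mean measurement error alone can produce. The reliability of 0.7 below is an illustrative value, not an estimate from any of the study samples.

```python
import random

random.seed(1)
rho = 0.7        # illustrative test-retest reliability
n = 20000

# No true change: true-score variance rho and error variance (1 - rho)
# give observed variance 1 and a time1-time2 correlation of rho.
pairs = []
for _ in range(n):
    true = random.gauss(0, rho ** 0.5)
    x1 = true + random.gauss(0, (1 - rho) ** 0.5)
    x2 = true + random.gauss(0, (1 - rho) ** 0.5)
    pairs.append((x1, x2))

# Follow the respondents with extreme high scores at baseline:
top = [p for p in pairs if p[0] > 1.0]
mean_x1 = sum(p[0] for p in top) / len(top)
mean_x2 = sum(p[1] for p in top) / len(top)
# With no real change, E[x2 | x1] = rho * x1, so the group mean drifts
# back towards the overall mean at follow-up.
```

The lower the reliability, the larger the gap between the baseline and follow-up means of the extreme group, matching the point made above.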
The second area relates to methods used to account or adjust for RTM in observational study designs. In an observational study, researchers may use regression modelling techniques to estimate the predicted follow-up score for a given score at baseline (after centering the baseline variable), as a method for adjusting their analysis for RTM. However, in this situation researchers are no longer measuring actual change in scores, as the residual score from the regression indicates who has changed more, or less, than expected based on their baseline score.7–9 Analyses using a difference score and analyses using a regression will therefore often lead to different conclusions. This situation, where different conclusions about change result from the same data set depending upon the analysis chosen, is commonly referred to as Lord’s paradox.7 8 37 The correct analysis to undertake will depend on the assumptions about expected change in work stress scores over time (see the following section for an example).
Application in research on changes in psychosocial working conditions
A scatter plot of job control scores using the OCHS test-retest data (over a time period when no change was thought to have occurred) used previously is presented in fig 1. In this figure the dashed line represents where the time one score is equal to the time two score, and the solid line represents the predicted time two score based on baseline score (ie, the regression line). Figure 1 demonstrates little difference between the two lines, suggesting minimal RTM due to differences in the direction of errors of measurement at each end of the job control distribution, over this short time period.
Alternatively, fig 2 presents the scatter plot of job control scores from a Canadian foam manufacturer over a 12-month period, where we would expect that real changes in job control scores have occurred. In this situation there are larger differences between the line of equality and the line based on the regression analysis. In this figure there appears to be a greater tendency for low scores at time one to increase and high scores to decrease; however, part of this movement towards the mean will include real changes in job control, and therefore it should not be attributed to measurement error alone.
This figure also demonstrates the differences in classification when using difference scores versus a regression analysis. Observations where job control at time one is equal to job control at time two (ie, scores lying on the dashed line) would be classified as having changed based on the regression line when they are at the ends of the distribution, leading to differences in the classification of who has changed and who has not between these two methods. When examining differences in change between groups, if group membership is related to initial score, this will produce different conclusions about changes in scores and group membership, demonstrating Lord’s paradox as described above.
In general, given the assumption that job control scores are unlikely to increase or decrease without a catalyst (ie, without some change in actual working conditions or change in perception) and the interest is in the amount of change in the scores between time points, the correct method of analysis would be to use the difference score as opposed to the residual score.7 8 However, this decision may change depending on the reasons for examining change scores and how respondents are allocated to groups (eg, randomly or non-randomly).7 8 Finally, when looking at changes in scores over time, it is also important not to attribute changes that are restricted to one end of the distribution as RTM. For example, an intervention that is particularly effective among participants with low baseline values for a construct may lead to a systematic increase in low values at baseline, while high values remain relatively stable.38
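The classification differences between the two approaches can be shown with a small hypothetical data set: an observation whose score is identical at both time points (a difference score of zero) can still receive a large residual from the regression line when it lies at the extreme of the distribution. The scores below are invented for illustration only.

```python
# Hypothetical job control scores (time one, time two):
data = [(2, 3), (3, 4), (4, 4), (5, 6), (6, 5), (8, 7), (9, 8), (10, 9)]
n = len(data)
mx1 = sum(x1 for x1, _ in data) / n
mx2 = sum(x2 for _, x2 in data) / n

# Ordinary least squares slope and intercept for predicting the time two
# score from the time one score (the regression line in figs 1 and 2):
b = (sum((x1 - mx1) * (x2 - mx2) for x1, x2 in data)
     / sum((x1 - mx1) ** 2 for x1, _ in data))
a = mx2 - b * mx1

def difference(x1, x2):
    """Change as a simple difference score."""
    return x2 - x1

def residual(x1, x2):
    """Change as the deviation from the follow-up score predicted by
    the baseline score (a residualised change score)."""
    return x2 - (a + b * x1)
```

A stable respondent at the high end of the distribution (scoring 10 at both time points) has a difference score of zero but a positive residual of more than one point, so the two methods classify this respondent differently, as described above.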
Issue 4. The stability of the construct over time
Consistency of the underlying construct being measured – in terms of definition, the meaning of response options and the calibration of values – is a basic prerequisite for the measurement of change.9 10 39 One normally assumes these criteria are met if the same question is asked at each time point. However, using identical wording does not guarantee comparability in measures, particularly with self-reported items.
Building on the work of Golembiewski,40 Schwartz and Sprangers41 differentiated between four types of change that may occur between time points. These are real change and three within-individual shifts: recalibration, revaluation and reconceptualisation. They suggest that these within-person shifts usually occur as the result of a catalyst (eg, exposure to a workplace intervention) and are moderated by external (eg, social support) and individual (eg, personality) factors.42
4.1 Recalibration
Recalibration refers to shifts in the respondent’s internal standards and may be caused by a change in awareness or understanding of the concept being measured.43 Awareness of recalibration is particularly important when examining the effects of workplace interventions, as these often, either directly or indirectly, increase awareness or knowledge of a particular construct.43
Several ways to detect recalibration have been proposed, including administering “then-tests”ii at follow-up43–45 or asking respondents to describe their measurement anchors during baseline interviews (see Armenakis46 or Schwartz and Sprangers41 for reviews of these and other methods). However, none of these methods are without flaws47 48 and restrictions on interview time or questionnaire length may make them impractical in some situations.41 46
4.2 Revaluation
Revaluation refers to a shift in the order of factors influencing the overall appraisal of a construct. Along with reconceptualisation (see next section), identifying revaluation is challenging because the internal dimensions used by respondents to rank themselves shift, but no recalibration may occur.
In observational study designs, ways to detect revaluation include the use of advanced statistical techniques (such as multi-group confirmatory factor analysis across time points) to investigate whether the relationships between the individual items that comprise the overall measure vary across time periods; for example, are the factor loadings of individual job control questions on overall job control consistent over time?49 (See Schwartz and Sprangers41 and Armenakis46 for complete reviews of these and other methods.)
4.3 Reconceptualisation
Reconceptualisation refers to a change in the respondent’s definition of what the construct being measured means. For example, a respondent may redefine what it means to “use their skills” between measurement points. Along with revaluation, reconceptualisation is particularly important to investigate when using multidimensional constructs.50
Statistical methods to identify reconceptualisation are similar to revaluation and include examination of factor structures across time points,40 the coefficients of congruence method51 and analysis of covariance structures.52 More research is needed to look at similarities between each of these methods, and in particular, to define what level of reconceptualisation significantly impacts on change estimates.
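As one concrete example of these methods, Tucker's coefficient of congruence compares two sets of factor loadings, with values near 1 often taken to suggest the factor structure is stable across time points. A Python sketch with hypothetical loadings (the loading values below are invented for illustration):

```python
import math

def congruence(loadings_t1, loadings_t2):
    """Tucker's coefficient of congruence between two sets of factor
    loadings; values near 1 suggest a stable factor structure."""
    num = sum(a * b for a, b in zip(loadings_t1, loadings_t2))
    den = math.sqrt(sum(a * a for a in loadings_t1)
                    * sum(b * b for b in loadings_t2))
    return num / den

# Hypothetical loadings of three job control items at two time points:
stable = congruence([0.7, 0.6, 0.8], [0.7, 0.6, 0.8])      # identical -> 1.0
similar = congruence([0.7, 0.6, 0.8], [0.6, 0.7, 0.75])    # high congruence
```

As the text notes, what level of congruence is low enough to materially affect change estimates remains an open question.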
Application in research on changes in psychosocial working conditions
Given that psychosocial work stress is often assessed through self-reports, all three types of within-person shifts listed above may occur. Respondents’ internal scales may “recalibrate” between time points as a result of personal exposure to more positive or negative working environments, or through discussions about working conditions with friends, family or colleagues. Similarly, respondents may use revaluation and reconceptualisation as coping mechanisms, similar to the process through which life satisfaction changes among cancer patients.53
Apart from survey redesign, there are ways to identify within-person shifts: comparing changes in self-reported measures with more objective outcomes or using methods such as examining factor structures between time points when using multi-item constructs. More objectively worded questions may also help minimise potential response shifts. For example, recalibration is more likely to occur on an item worded: “I am often required to work more hours than I would like to” (with response options from strongly agree to strongly disagree) than on: “How many extra hours, above what you are paid for, do you work on a regular basis?” (see Spector and Fox54 for an example of a self-reported autonomy scale based on “factually verifiable” information).
Investigators may also consider including items such as “then-tests” or having respondents describe their anchors at baseline and follow-up. Although not foolproof, these measures will provide some information about the likelihood of possible response shifts in items.
Table 3 compares change in job control over a 6-year period in the NPHS with an objective measure of change in working conditions (change in occupational skill level requirements). Respondents reporting negative changes in job control (greater than error) were more likely to also have a negative change in occupations and vice versa for respondents with positive changes in job control. Secondly, for respondents with no change in occupational skill requirements, changes in sense of mastery were associated with changes in job control; however, we are limited in our ability to discern whether these changes are due to within-person shifts or actual changes in working conditions in the same occupation. Increasing education was related to both positive and negative changes in job control. Negative changes in perceptions of job control could be the result of recalibration in perceptions because of increased education and thus increased expectations about job content. To assess the likelihood of reconceptualisation or revaluation in this study sample, we also compared the factor structure of the individual job control items using a multi-group confirmatory factor analysis across time points. Constraining the factor loadings to equality for the same questions at each time point did not substantially alter the model fit, as assessed by the χ2 statistic, suggesting minimal reconceptualisation or revaluation (results not presented but available on request).
Apart from de Lange and colleagues5 who examined the relationship between change in job (a relatively objective measure of possible change in working conditions) and reports of changes in the psychosocial working environment, we are not aware of any other studies that have examined possible within-person shifts in the assessment of working environments over time.
If response shifts are identified, investigators should discontinue using change scores and adopt other methods to answer research questions.9 Possible alternative methods include path analysis techniques, which allow modelling of separate direct paths between time one and time two scores and the outcome of interest.55 We are not suggesting that a change in a respondent’s perception of his or her work environment is not important, as individual perceptions lead to physiological and neurological responses which in turn affect health. However, discriminating between actual change in working conditions and within-person shifts is important in the development and implementation of effective interventions as the health effects of these changes may differ (eg, work redesign versus stress management).
CONCLUSIONS
We have outlined four areas that investigators measuring change in psychosocial working conditions across two time points should consider. We offer the following concluding remarks (in no particular order).
(1) The measurement of change requires measures with high reliability. When change is measured using scales with low reliability the variability in scores due to measurement error will increase, the reliability of the difference score will likely decrease and the amount of regression to the mean will increase. This is important for surveys using abbreviated questionnaires, which commonly have lower reliability scores than the original questionnaires on which they were based. Whenever possible, re-administering questionnaires to a subgroup of the study sample, over a period of time where change will not have occurred, should be performed so that the amount of day-to-day variability in the measure being used, as well as the potential effect of regression to the mean, can be accurately estimated.
(2) Researchers regressing time two scores on time one scores should not interpret the residual as a “corrected” change score. Residualised scores can only be used to estimate which individuals have changed more, or less, than expected, given the baseline scores and other items included in the regression model. If researchers are only interested in the amount of change in scores, the difference between time two and time one scores is the most appropriate method, once the researcher is sure that the changes observed are greater than day-to-day variability.
(3) Study designs that measure change between two time points should attempt to include more objective questions or other objective measures related to work stress constructs, or perform post-hoc statistical tests, to examine if within-person shifts have occurred. If within-person shifts are suspected, change measurement should no longer be performed as assumptions of congruence between time one and time two measures are not valid.
Key points
- Measuring change is a methodologically challenging task that requires measures with high reliability, which contain items that are likely to detect change if it occurs.
- Researchers regressing time two scores on time one scores should not interpret the residual as a “corrected” change score.
- Study designs that measure change between two time points should attempt to include more objective questions or other objective measures related to work stress constructs, or perform post-hoc statistical tests, to examine if within-person shifts have occurred.
Key references
- Lord FM. Elementary models for measuring change. In: Harris CW, ed. Problems in measuring change. Proceedings of a conference sponsored by the Committee on Personality Development in Youth of the Social Science Research Council, 1962. Madison, WI: University of Wisconsin Press, 1963:21–38.
- Cronbach L, Furby L. How should we measure “change” - or should we? Psychol Bull 1970;74:68–80.
- Rogosa D. Myths and methods: “myths about longitudinal research” plus supplemental questions. In: Gottman JM, ed. The analysis of change. Mahwah, NJ: Lawrence Erlbaum Associates, 1995:3–66.
- Burr JA, Nesselroade J. Change measurement. In: von Eye A, ed. Statistical methods in longitudinal research. Boston: Academic Press, 1990:3–34.
- Jacobson N, Truax P. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 1991;59:12–19.
- Golembiewski R, Billingsley K, Yeager S. Measuring change and persistence in human affairs: types of change generated by OD designs. J Appl Behav Sci 1976;12:133–57.
- Streiner D, Norman G. Measuring change. In: Health measurement scales: a practical guide to their development and use. 2nd edn. New York: Oxford University Press, 1995:163–80.
- Wright D. Comparing groups in a before-after design: when t-tests and ANCOVA produce different results. Br J Educ Psychol 2006;76:663–75.
- Schwartz C, Sprangers M. Methodological approaches for assessing response shift in longitudinal health-related quality-of-life research. Soc Sci Med 1999;48:1531–48.
- Armenakis A. A review of research on the change typology. In: Woodman RW, Pasmore WA, eds. Research in organizational change and development. Vol 2. Greenwich, CT: JAI Press, 1988;68–80.
QUESTIONS (SEE ANSWERS ON PAGE 285)
Which statements are true and which are false?
- What should researchers do as a first step when looking to examine change in working conditions between two time points?
  - Make sure that changes in scores over time are greater than the expected measurement error in the instrument
  - Calculate the difference between time two and time one scores and divide it into groups
  - Assume anyone who has changed more than 10% has had a true change in score between time points
- How should a residual estimate, when a time two score is regressed on a time one score, be interpreted?
  - As a corrected difference (change) score
  - As a method to identify who had changed more, or less, than expected at time two, given the time one score
  - Time two scores should never be regressed on a time one score
- What factors will impact the amount of day-to-day variability in a multi-item measure?
  - The number of items in the scale
  - The variance in the scores at baseline
  - The reproducibility of a score over a time period when no change has occurred
  - All of the above
- What refers to the process where an individual changes their overall definition of a concept between two time periods?
  - Reconceptualisation
  - Recalibration
  - Revaluation
  - None of the above
Acknowledgments
A version of this paper was presented at the 2nd ICOH International Conference on Psychosocial Factors at Work. Thanks to the measurement group at the Institute for Work & Health for previous discussions on this topic. Peter Smith was supported by a strategic training research fellowship from the Canadian Institutes of Health Research Strategic Training Program in the Transdisciplinary Approach to the Health of Marginalized Populations. Dorcas Beaton is supported by a Canadian Institute of Health Research (CIHR) New Investigator’s award.
Appendix
Self-reported job control questions in the Canadian National Population Health Survey (NPHS)
All work stress questions are coded (1 = strongly agree, 2 = agree, 3 = neither agree nor disagree, 4 = disagree, 5 = strongly disagree)
Decision latitude
1. Your job requires you learn new things
2. Your job requires a high level of skill
3. Your job requires that you do things over and over
Decision authority
1. Your job allows you freedom to decide how you do your job
2. You have a lot to say about what happens in your job
Self-reported job control questions in Canadian foam manufacturing company
All work stress questions are coded (1 = strongly agree, 2 = agree, 3 = neither agree nor disagree, 4 = disagree, 5 = strongly disagree)
Decision latitude
1. My job requires I learn new things*
2. My job requires that I do things over and over*
3. My job requires me to be creative
4. My job requires a high level of skill*
5. I get to do a variety of things on my job
6. I have an opportunity to develop my own special abilities
Decision authority
1. My job allows me to make a lot of decisions on my own*
2. On my job I have very little freedom to decide how to do my work
3. I have a lot of say about what happens on my job*
*Item contained within both NPHS and Canadian Foam Manufacturing Study.
Footnotes
- Competing interests: None.
- i Note that α estimates for the total OCHS study population (n = 1611) for the same work stress dimensions were generally lower than T-RT scores (skill discretion = 0.80, decision authority = 0.74 and job control = 0.85).
- ii The then-test simply asks respondents, at follow-up, to rate their level on a particular domain at baseline (eg, considering how you are now, how would you rate your level of X at baseline?)