Original article
CONSTANCES: a general prospective population-based cohort for occupational and environmental epidemiology: cohort profile
  1. Marcel Goldberg1,2,
  2. Matthieu Carton1,
  3. Alexis Descatha1,3,
  4. Annette Leclerc1,
  5. Yves Roquelaure4,
  6. Gaëlle Santin1,
  7. Marie Zins1,2,
  8. the CONSTANCES team
  1. 1Population-based Epidemiological Cohorts Unit, INSERM UMS 11, Villejuif, France
  2. 2Paris Descartes University, Paris, France
  3. 3Occupational Health Unit, Raymond Poincaré Hospital, Garches, France
  4. 4LEEST, Medical School, Angers University, Angers, France
  1. Correspondence to Professor Marcel Goldberg, INSERM UMS 11, Population-based Epidemiological Cohorts Unit, 16 avenue Paul Vaillant Couturier, Villejuif F-94807, France; marcel.goldberg{at}


Why was the cohort set up? CONSTANCES is a general-purpose cohort with a focus on occupational and environmental factors.

Cohort participants CONSTANCES was designed as a randomly selected sample of French adults aged 18–69 years at inception; 200 000 participants will be included.

Data collection phases At enrolment, the participants are invited to complete questionnaires and to attend a health screening centre (HSC) for a health examination. A biobank will be set up. The follow-up includes a yearly self-administered questionnaire, a periodic visit to an HSC and linkage to social and national health administrative databases.

Main types of data collected Data collected for participants include social and demographic characteristics, socioeconomic status, life events and behaviours. Regarding occupational and environmental factors, a wealth of data on organisational, chemical, biological, biomechanical and psychosocial lifelong exposure, as well as residential characteristics, are collected at enrolment and during follow-up. The health data cover a wide spectrum: self-reported health scales, reported prevalent and incident diseases, long-term chronic diseases and hospitalisations, sick-leaves, handicaps, limitations, disabilities and injuries, healthcare usage and services provided, and causes of death.

Control of selection effects To take into account non-participation and attrition, a random cohort of non-participants was set up and will be followed through the same national databases as participants.

Data access Inclusions began at the end of 2012 and more than 110 000 participants had been included by September 2016. Several projects on occupational and environmental risks have already been submitted in response to a public call for nested research projects.

  • Population-based cohort
  • Occupational epidemiology
  • Chronic diseases
  • Aging

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


What this paper adds

  • Historical cohort studies have been instrumental in the identification of numerous occupational hazards but have some limitations.

  • There is a need for prospective population-based cohorts for studying diseases induced by occupational and environmental exposures.

  • The CONSTANCES population-based cohort is composed of a randomly selected sample of the French adult population and prospectively collects data on exposure to a variety of chemical, physical and biological agents, on postural, mechanical and organisational constraints and stress at work, and on numerous health-related outcomes.

  • Thanks to the geocoding of residential addresses, CONSTANCES allows for the study of environmental and contextual factors.

Why was the cohort set up?

There is a long record of using cohorts to study diseases induced by occupational factors. First used in the 1950s to investigate cancer risks among workers in specific trades, the historical cohort study is still the gold standard for assessing the possible carcinogenicity to humans of workplace exposures.1 Since then, the use of cohorts has expanded to the study of many other occupational diseases and health conditions.

During the past 50 years, prospective occupational cohort studies have been the cornerstone of investigation for outcomes such as asthma, chronic bronchitis, carpal tunnel syndrome, myocardial infarction or hearing loss.2 Alternatively, if the factors of exposure are frequent, a general prospective population-based cohort may be drawn from a well-defined population. Some general prospective population-based cohorts were specifically designed to study occupationally induced diseases,3–5 while others had a larger scope but also included the collection of occupational factors.6–14 CONSTANCES was also designed to cover a large scope of topics, with a special focus on occupational and environmental factors.

CONSTANCES is a large prospective population-based cohort designed as a research infrastructure. It relies on the experience of the GAZEL cohort conducted by our group, which currently supports more than 80 different nested research projects, including many occupational and environmental studies.14–16

CONSTANCES should contribute to the study of the effects of occupational and environmental factors on cancer,17 and of the short-term and long-term effects of biomechanical and psychosocial factors on musculoskeletal disorders.18 Exposure to chemicals may induce chronic respiratory diseases,19 neurodegenerative diseases and may affect cognitive functioning.20 Psychosocial factors at work may induce cardiovascular diseases,21 and mental health disorders.22 Owing to the economic context, new employment patterns and risks in young populations, as well as workability in early old age and other determinants of early exit from the labour force, are an important concern,23 as working conditions and occupational exposures are major determinants of premature ageing.24–26 While working conditions may lead to impairments and disease, loss of work can also be deleterious for health, and many studies have investigated how unemployment influences mental and physical health and health-related behaviours,27 and how health may influence the risk of becoming unemployed.28 Because CONSTANCES was designed as a large random sample of the French adult population, it will also provide descriptive information on the health of the French population.

Who is in the cohort?

The general design of CONSTANCES has been detailed elsewhere;29,30 here we summarise the main characteristics of the cohort participants.

We plan to gradually enrol 200 000 participants aged 18–69 years at inception in 22 selected health screening centres (HSCs) located in 20 ‘départements’ in the principal regions of France. For practical reasons, the source population of CONSTANCES is restricted to persons living in one of the 20 CONSTANCES ‘départements’ and affiliated to the National Health Insurance Fund, that is, salaried workers, whether they are professionally active, unemployed or retired and their family (more than 85% of the French population), thus excluding agricultural and self-employed workers.

Eligible persons are selected from the national social security database, which includes all persons living in France, according to a random sampling scheme stratified on age, gender, socioeconomic status (SES) and region of France in order to be representative of the source population. They receive an invitation at home, and those who volunteer receive questionnaires to complete and attend an HSC for a health examination.

After a field pilot, recruitment started in late 2012 and the cohort will be completed in 2018. Currently (September 2016), more than 112 000 participants are enrolled in the cohort, with a participation rate of 7.3%. The data already available show a diverse distribution of the main socioeconomic variables, as shown in table 1.

Table 1

CONSTANCES cohort: main sociodemographic characteristics of the sample*

Control of selection effects

To take into account selection effects due to voluntary participation, we draw a ‘control cohort’ from a random sample of 400 000 non-participants, that is, persons who were invited but did not participate, for whom we prospectively collect social and occupational, health and healthcare usage data from the same administrative databases as for the participants. Auxiliary data are known for the whole sample, whereas data from the other sources are known only for respondents.

Statistical analysis was conducted in two steps: (1) selection of the variables used to model the probability of response to the survey from the auxiliary data, using logistic regression with response status as the dependent variable; (2) estimation of the prevalence of variables of interest after correction for non-response by a reweighting technique where the final weight is equal to the product of the inverse probability of participation (original sampling weight) and a correction factor for non-participation. Details of the method are described elsewhere.31
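The two steps can be illustrated with a minimal sketch on toy data. It is not the CONSTANCES implementation: the field names, the single auxiliary variable `group` and all numbers are hypothetical. With a single categorical auxiliary variable, the fitted probabilities of a logistic regression equal the observed response rate in each category, so step 1 is reduced here to computing those rates; the actual method uses several auxiliary variables.

```python
from collections import defaultdict

def response_rates(invited):
    """Step 1 (simplified): estimated probability of response per auxiliary group."""
    n_invited = defaultdict(int)
    n_responded = defaultdict(int)
    for person in invited:
        n_invited[person["group"]] += 1
        n_responded[person["group"]] += person["responded"]
    return {g: n_responded[g] / n_invited[g] for g in n_invited}

def final_weights(respondents, rates):
    """Step 2: sampling weight multiplied by the correction factor 1 / p_hat."""
    return [r["sampling_weight"] / rates[r["group"]] for r in respondents]

def weighted_prevalence(respondents, weights):
    """Weighted share of respondents with the (hypothetical) exposure flag set."""
    return sum(w * r["exposed"] for r, w in zip(respondents, weights)) / sum(weights)

# Hypothetical invited sample: group A responds more often than group B.
invited = [
    {"group": "A", "sampling_weight": 10.0, "responded": 1, "exposed": 0},
    {"group": "A", "sampling_weight": 10.0, "responded": 1, "exposed": 0},
    {"group": "A", "sampling_weight": 10.0, "responded": 0, "exposed": None},
    {"group": "A", "sampling_weight": 10.0, "responded": 0, "exposed": None},
    {"group": "B", "sampling_weight": 10.0, "responded": 1, "exposed": 1},
    {"group": "B", "sampling_weight": 10.0, "responded": 0, "exposed": None},
    {"group": "B", "sampling_weight": 10.0, "responded": 0, "exposed": None},
    {"group": "B", "sampling_weight": 10.0, "responded": 0, "exposed": None},
]
respondents = [p for p in invited if p["responded"]]
rates = response_rates(invited)

# Prevalence with sampling weights only, then with the non-response correction.
p1 = weighted_prevalence(respondents, [r["sampling_weight"] for r in respondents])
p2 = weighted_prevalence(respondents, final_weights(respondents, rates))
```

In this toy sample the under-responding group B carries the exposure, so the corrected estimate `p2` is higher than `p1`.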

We checked whether our reweighting method yields improved prevalence estimates. As an example, figure 1 shows the estimates of the prevalence of intensive work (panel A) and of carrying heavy loads (panel B) among men according to age. Curves P1 correspond to the prevalence with only sampling weights, while curves P2 show the prevalence with sampling weights further adjusted for age, sex, socioeconomic status, geographic area and health and professional variables. Fully adjusted prevalence estimates were markedly higher when health and professional variables were used, yielding more accurate estimates of the prevalence in the general population.

Figure 1

The estimates of the prevalence of intensive work (A) and of carrying heavy loads (B) among men according to age. P1: prevalence with sampling weights based only on the stratification variables. P2: prevalence with sampling weights adjusted for age, sex, socioeconomic status, geographic area and health and professional variables from SNIIRAM and CNAV.

How have people been followed?

After enrolment, participants are followed up by an annual self-administered questionnaire sent to their homes (paper or web-based) and they are invited for a new health examination every 5 years. The annual response rate to the follow-up questionnaire among participants who were included from 2012 to 2014 was higher than 80%.

They are also followed up by annual linkage with national social and health administrative databases that permanently record data: the retirement pension database collects social and occupational data from employers' annual reports and from social welfare organisations;32 the ‘Système national d'information interrégimes de l'Assurance Maladie’ (SNIIRAM) contains medical data about reimbursements, serious chronic diseases and hospital discharge records;33 the National Death Registry-CepiDc records the cause of death.34

What has been measured?

The main data regularly collected include social and demographic characteristics, health-related behaviour, disease history, self-reported health scales, incident and prevalent diseases, sick leaves, impairments, limitations, disabilities, injuries, healthcare usage and cause of death. In the HSC examination, weight, height, waist-hip ratio, blood pressure, electrocardiogram, vision, hearing, lung function and laboratory tests are measured. For those aged 45 years and older, cognitive and physical functioning tests are conducted by neuropsychologists. The collection of biological samples (blood and urine) from half of the cohort will start in 2017.

We placed particular focus on a multifaceted view of working conditions and occupational exposure, as well as environmental factors, using different sources. From the questionnaires, we collect, at inclusion and during follow-up, data on current and lifelong occupational exposure to chemical, physical and biological agents, postural, mechanical and organisational constraints and stress at work.

Table 2 lists the main data on occupational exposures collected at inclusion, and table 3 presents the number of participants exposed or having been exposed in their professional career among the participants already included for whom data are currently available.

Table 2

Main occupational exposures collected from questionnaires

Table 3

Main lifelong occupational exposures among CONSTANCES participants*

Full job histories are collected at inclusion from a specific questionnaire: for each work episode of 6 months or more, participants are asked to detail start and end dates, occupation and type of contract. These data are needed for the coding of jobs according to the international ISCO 1968,36 and ISIC rev.2,37 classifications and the French national PCS and NAF38 nomenclatures. More than 600 000 jobs held by the CONSTANCES participants are expected. Coded job histories allow for linkage with available job-exposure matrices (JEMs) developed by the Occupational Health Department of the National Institute for Health Surveillance in the Matgene project.39 Available JEMs concern chlorinated solvents, fuels and petroleum solvents, asbestos, mineral wool fibres (glass, rock, slag), refractory ceramic fibres, flour dust, leather dust, respirable dust of free crystalline silica and respirable cement dust; other JEMs are under development (noise, biomechanical factors) and will be used when they become available. We also plan to use the CANJEM JEM, in the final stages of development in Montreal, which includes about 250 different occupational exposures.40 Coding of occupational histories began in early 2016, and it is expected that linkage with the Matgene JEMs will be performed by the end of 2016 and later with CANJEM.

From the retirement pension database, full job histories since the first job are also collected at enrolment and are prospectively updated each year during the follow-up. The following data are extracted for each job episode: employer, work periods, interruptions (for disease or accident, disability, maternity, military service, unemployment, retirement), salary and type of contract.
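As an illustration of how a coded job history can be linked to a JEM, the sketch below sums an exposure index over job episodes. It is a generic example, not the Matgene or CANJEM procedure: the occupation codes, the agent name, the `probability` and `intensity` fields and the index (years × probability × intensity) are all assumptions chosen for the example, and real JEMs are often also specific to calendar period and industry.

```python
def cumulative_exposure(job_history, jem, agent):
    """Sum years x probability x intensity over the job episodes found in the JEM."""
    total = 0.0
    for job in job_history:
        entry = jem.get((job["occupation_code"], agent))
        if entry is None:
            # Occupation not covered by the matrix: counted as unexposed in this sketch.
            continue
        years = job["end_year"] - job["start_year"]
        total += years * entry["probability"] * entry["intensity"]
    return total

# Hypothetical matrix keyed by (occupation code, agent).
jem = {
    ("OCC-A", "silica"): {"probability": 0.5, "intensity": 2.0},
    ("OCC-B", "silica"): {"probability": 0.0, "intensity": 0.0},
}

# Hypothetical coded job history for one participant.
history = [
    {"occupation_code": "OCC-A", "start_year": 2000, "end_year": 2010},
    {"occupation_code": "OCC-B", "start_year": 2010, "end_year": 2015},
    {"occupation_code": "OCC-C", "start_year": 2015, "end_year": 2016},
]

silica_index = cumulative_exposure(history, jem, "silica")
```

Only the first episode contributes here; treating uncovered occupations as unexposed is a simplification that a real analysis would need to justify.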

Regarding environmental factors, residential addresses are collected and x/y geocoded. This is carried out prospectively from the enrolment of the participants; we are also currently collecting and geocoding all the addresses of the 10 years before enrolment. This allows for assessing the residential exposure of the participants to various parameters available at an ecological scale, such as maps of air pollution41 and climate, electromagnetic fields from powerlines or exposure to radiofrequency fields from mobile phone base stations and broadcast transmitters, water pollution or noise. Residential histories can also be used to study available contextual socioeconomic parameters, such as urbanisation, population density, collective equipment,38 deprivation,42 or an index of accessibility to medical resources,43 already used by several nested research projects in CONSTANCES.
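A minimal sketch of how geocoded residential histories might be combined with an ecological exposure map is given below. It assumes a hypothetical regular grid of 1000 m cells and a hypothetical pollutant value per cell, and computes a time-weighted average over the addresses; the actual maps and exposure models used in CONSTANCES are not described in this text.

```python
def grid_cell(x, y, cell_size=1000.0):
    """Map projected x/y coordinates (metres) to the index of a square grid cell."""
    return (int(x // cell_size), int(y // cell_size))

def time_weighted_exposure(residences, exposure_map, cell_size=1000.0):
    """Average the map value over addresses, weighted by years lived at each one."""
    weighted_sum = 0.0
    total_years = 0.0
    for home in residences:
        value = exposure_map.get(grid_cell(home["x"], home["y"], cell_size))
        if value is None:
            # Address outside the map: left out of the average in this sketch.
            continue
        weighted_sum += home["years"] * value
        total_years += home["years"]
    return weighted_sum / total_years if total_years else None

# Hypothetical exposure map (for example, an annual mean pollutant level per cell).
exposure_map = {(1, 2): 20.0, (3, 0): 10.0}

# Hypothetical residential history for one participant.
residences = [
    {"x": 1500.0, "y": 2500.0, "years": 6.0},
    {"x": 3200.0, "y": 400.0, "years": 4.0},
]

average_exposure = time_weighted_exposure(residences, exposure_map)
```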

Data access

Any research group in France or in other countries is entitled to apply to develop a nested project within CONSTANCES and to access its database. Projects are evaluated twice a year by the CONSTANCES Scientific Committee on feasibility and scientific quality criteria. A Charter describes the rules that have been established for using the CONSTANCES infrastructure. The material needed for applying can be downloaded from the CONSTANCES website.

A call for proposals has been open since 2014. More than 60 projects covering a wide range of topics have already been proposed and approved by the Scientific Committee; the full list is available at:

What has been found?

As the cohort is still young and not yet fully constituted, only preliminary cross-sectional findings, most of them not yet published, are available. A first set of studies addressed the estimation of the prevalence of some conditions in the French population aged 30–69 years, using the reweighting method described above. The prevalence of overweight and obesity in France in 2013 was estimated: the prevalence of overweight was 41.0% and 25.3% in men and women, respectively. The prevalence of obesity was 15.8% for men and 15.6% for women. The prevalence of abdominal obesity was higher, at 41.6% and 48.5% for men and women, respectively.

The estimation of airway obstruction (AO) prevalence showed that the prevalence of AO defined by a forced expiratory volume in 1 s/forced vital capacity ratio <0.70 was 5.6% (men 7.7%, women 3.8%).

We also looked at the prevalence of musculoskeletal disorders in the working general population aged 30–69 years. The prevalence of persistent pain ranged from 14% (elbows) to 35% (back) in women, and from 9% to 24% in men for the same locations; the prevalence of spinal pain was 35% in female workers against 22% in female executives, and 35% and 25%, respectively, among men.

Other cross-sectional studies were conducted. A study investigated the association between self-perceived diet and compliance with nutritional guidelines from the French National Nutrition and Health Program (PNNS); each additional point on the PNNS score increased perceived dietary balance (rated from 1 to 8) by 0.23 (0.22–0.24), with men and women perceiving their diet as equally balanced. The participants who declared a limited consumption of snacks and ready-prepared meals also perceived their diet as more balanced.

We also looked at tobacco and electronic cigarette (EC) use in 2014 and at trajectories over a 1-year follow-up. The use of EC is very rare among non-smokers and slightly more frequent among ex-smokers (1%), and mixed use is two times more frequent (2%). Prevalence is similar among men and women, and decreases with age among ex-smokers and mixed users. Frequency seems lower among employees and blue collar workers. Mixed users show the lowest prevalence of ‘very good’ or ‘good’ self-rated health and the highest prevalence of depression. There is a clear gradient in EC use according to the number of pack-years of tobacco smoking. The follow-up shows that no exclusive EC user had become a smoker 1 year later.44

Another study examined the ‘effect sizes’ of different cognitive function determinants in 11 711 participants aged 45–75 years; the Free and Cued Selective Reminding Test (FCSRT), Verbal Fluency Tasks, Digit Symbol Substitution Test (DSST) and Trail Making Test (TMT), parts A and B, were measured, and the effect sizes of sociodemographic (age, sex, education), lifestyle (alcohol, tobacco, physical activity), cardiovascular (diabetes, blood pressure) and psychological (depressive symptomatology) variables were computed as semipartial omega-squared coefficients (ω2; the part of the variation of a neuropsychological score that is independently explained by a given variable). This set of variables explained from R2=10% (semantic fluency) to R2=26% (DSST) of the total variance. In all tests, sociodemographic variables accounted for the greatest part of the explained variance. Age explained from ω2=0.5% (semantic fluency) to ω2=7.5% (DSST) of the total score variance, gender from ω2=5.2% (FCSRT) to a negligible part (semantic fluency or TMT) and education from ω2=7.2% (DSST) to ω2=1.4% (TMT-A). Behavioural, cardiovascular and psychological variables influenced the cognitive test results only slightly (all ω2<0.8%, most ω2<0.1%).45

Regarding occupational epidemiology, a preliminary study by our group using data from the field pilot described the prevalence of occupational biomechanical exposures and persistent musculoskeletal pain according to social position and employment status. In men and women, there was a clear gradient between social position and prevalence of occupational exposure to physical effort, squatting position, working with arms up or hand-screwing movements. In men, the prevalence of persistent back and knee pain increased with lower social position. In women, there was a clear inverse gradient in the prevalence of pain with social position for all sites except the neck.46

Some of the external applications for nested studies from the call for proposals also addressed occupational factors. The main topics related to musculoskeletal disorders, chronic respiratory diseases, cognitive ageing and occupational health surveillance or to the health of teachers and researchers in relation to their working conditions. There were also projects on environmental factors such as air pollution or mobile phone use. Table 4 lists the project titles and the affiliation of the Principal Investigators (PIs) who applied for access to the cohort.
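For readers unfamiliar with this coefficient, a commonly used estimator of the semipartial omega-squared for a variable j in a linear model is shown below. The text does not state which estimator the study used, so this should be read as an assumption: SS_j and df_j denote the sum of squares and degrees of freedom uniquely attributable to variable j, MS_error the residual mean square and SS_total the total sum of squares.

```latex
\omega^2_j \;=\; \frac{SS_j - df_j \, MS_{\mathrm{error}}}{SS_{\mathrm{total}} + MS_{\mathrm{error}}}
```

Because the denominator refers to the total variation, the coefficients of different variables can be compared on the same scale as the overall R2.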

Table 4

Current research projects on occupational and environmental epidemiology within CONSTANCES

What are the main strengths and weaknesses?

Designed as a general-purpose cohort covering a broad scope of health conditions and determinants, the CONSTANCES cohort places a special focus on occupational and environmental factors and is intended as a powerful tool for research in this field. CONSTANCES will also provide indicators describing the health of workers in France according to occupations and economic sectors, including estimates of population attributable fractions for many occupational factors and outcomes; it also allows the study of the relationships between employment trajectories (including unemployment) and health.

CONSTANCES has several strengths. It is a large randomly sampled cohort, including participants living and working in either urban or rural neighbourhoods (84% and 16%, respectively), diversified in terms of jobs and socioeconomic status. Numerous data are prospectively and repeatedly collected.

CONSTANCES also has some limitations. First, it covers only salaried workers, excluding agricultural and self-employed workers; however, a collaboration was established with the COSET project managed by the Occupational Health Department of the National Institute for Health Surveillance, which is currently setting up two complementary cohorts, one of agricultural workers and one of self-employed persons, designed so that data sharing with CONSTANCES will be possible.

Another limitation is the voluntary participation of the cohort members and the low participation rate (7.3%), which is of the same order of magnitude as in other similar cohorts, such as UK Biobank, in which the participation rate is 5.47%. However, the use of reweighting techniques relying on the ‘non-participants cohort’ allows for correcting for selection effects.




  • Contributors MG and MZ planned the study. MC, AD, AL and YR planned the collection of the occupational factors. GS designed the reweighting methods. MG drafted the manuscript, and MG and MZ are responsible for the overall content of the manuscript.

  • Funding The CONSTANCES cohort is supported by the Caisse Nationale d'Assurance Maladie des travailleurs salariés-CNAMTS, and was funded in its pilot phase by the Direction générale de la santé of the Ministry of Health (CPO 2007–2009), and by the Institut de Recherche en Santé Publique-Institut Thématique Santé Publique, and the following sponsors: Ministère de la santé et des sports, Ministère délégué à la recherche, Institut national de la santé et de la recherche médicale, Institut national du cancer and Caisse nationale de solidarité pour l'autonomie (AMC10003LSA). CONSTANCES is accredited as a ‘National Infrastructure for Biology and health’ by the governmental Investissements d'avenir programme and was funded by the Agence nationale de la recherche (ANR-11-INBS-0002 grant). CONSTANCES also receives funding from MSD, AstraZeneca and Lundbeck managed by INSERM-Transfert. CONSTANCES is conducted in partnership with the National Health Insurance Fund administered by CNAMTS, and with the National Retirement Insurance Fund administered by the Caisse nationale d'assurance vieillesse-CNAV. Quality control procedures are handled by ClinSearch for the data collected in the HSCs, and by Asqualab and EuroCell for the biological data. The authors also gratefully acknowledge the major contribution to the protocol of numerous colleagues, in France and abroad, who helped in the general design of the cohort, and of the participating HSCs. The authors also express their thanks to Dominique Polton from the CNAMTS for her help, and to Christophe Albert and Joël Brulard for the sampling of eligible persons and the access to the CNAV database.

  • Competing interests None declared.

  • Patient consent Obtained.

  • Ethics approval All confidentiality, safety and security procedures were approved by the French legal authorities. According to the French regulations, the CONSTANCES Cohort project has obtained the authorisation of the National Data Protection Authority (Commission nationale de l'informatique et des libertés-CNIL). CNIL verified that before inclusion, clear information is provided to the eligible participants (presentation of CONSTANCES, type of data to be collected, ability to refuse to participate, informed consent, etc). Concrete procedures for setting up the two cohorts (participants and non-participants) ensure the confidentiality of the data at every point in its circulation as well as the anonymity of the cohort of non-participants. In addition, CONSTANCES was approved by the National Council for Statistical Information (Conseil national de l'information statistique-CNIS), the National Medical Council (Conseil national de l'Ordre des médecins-CNOM), the Institutional Review Board of the National Institute for Medical Research-INSERM and our local Committee for Persons Protection (Comité de protection des personnes).

  • Provenance and peer review Not commissioned; externally peer reviewed.
