Abstract
Surveillance of diseases and their associated exposures is a major issue in occupational health, particularly for identifying and preventing new threats to workers’ health. New complementary methods that exploit existing data, such as health insurance data, could be developed to search for signals relevant to the early detection of emerging occupational diseases. In this context, systematic data mining could be performed on the databases of the “Mutualité Sociale Agricole” (MSA), the social security system dedicated to French agricultural workers, which covers about 3 million individuals. Because this healthcare system holds a large amount of data, the MSA databases could allow “big data” analytics to be applied to the study of occupational risks among French agricultural workers. This innovative approach could thus make it possible to look for associations between diseases and occupational activities without any prior hypothesis, and it has the potential to be applied to a continuous data flow for vigilance purposes.
Authorisation from the French National Commission on Informatics and Liberty allowed the MSA databases to be cross-linked using a common anonymous identifier for each individual. The main methodological component is the programming of unsupervised analyses, in particular latent models of mixed factors, applied to the “occupational activity × disease” matrices. Because direct information on exposure is lacking, complementary work is being carried out to retrospectively estimate agricultural workers’ exposure to pesticides.
This innovative method, which will be presented, has the following advantages: 1) it offers a systematic approach, 2) it has strong statistical power, and 3) it incurs no data acquisition cost.
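As a purely illustrative sketch of the cross-linking step and of the “occupational activity × disease” matrix mentioned above, the following Python fragment joins two toy tables on a shared anonymous identifier and counts activity and disease pairs. All names (`link_records`, `observed_to_expected`) and all records are hypothetical and synthetic; they are not MSA data or the authors' code. The observed-to-expected ratio used here is a simple stand-in under an independence assumption, not the latent mixed factor models used in the study.

```python
from collections import Counter

# Synthetic toy records, keyed by a hypothetical anonymous identifier.
activities = [("a1", "viticulture"), ("a2", "viticulture"),
              ("a3", "dairy"), ("a4", "dairy")]
diagnoses = [("a1", "dermatitis"), ("a2", "dermatitis"),
             ("a3", "asthma"), ("a4", "dermatitis")]


def link_records(activities, diagnoses):
    """Join the two tables on the shared anonymous identifier."""
    activity_by_id = dict(activities)
    return [(activity_by_id[pid], disease)
            for pid, disease in diagnoses
            if pid in activity_by_id]


def observed_to_expected(counts):
    """Ratio of each observed cell count to its expectation under independence."""
    total = sum(counts.values())
    row_totals, col_totals = Counter(), Counter()
    for (activity, disease), n in counts.items():
        row_totals[activity] += n
        col_totals[disease] += n
    return {
        (activity, disease): n * total / (row_totals[activity] * col_totals[disease])
        for (activity, disease), n in counts.items()
    }


# Sparse "activity x disease" matrix: one count per observed pair.
counts = Counter(link_records(activities, diagnoses))
ratios = observed_to_expected(counts)

for cell, ratio in sorted(ratios.items()):
    print(cell, counts[cell], round(ratio, 2))
```

Ratios well above 1 flag activity and disease pairs that occur more often than independence would predict. In a hypothesis-free screening setting such cells would only be candidate signals for further review, not evidence of an occupational cause.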