Objectives Previous evaluations of algorithms to code job descriptions to standardised occupation classification (SOC) codes suggest that some jobs will need expert coding to reduce misclassification. For jobs coded using the SOCcer algorithm (http://soccer.nci.nih.gov), we evaluated the utility of several metrics for identifying discordances between expert and automated SOC assignments to develop recommendations to prioritise expert review.
Methods The SOCcer algorithm was applied to expert-coded job descriptions from three studies to obtain each job’s top ten scoring U.S. SOC-2010 codes and their ‘score’ (measure of fit; continuous 0–1). The SOCcer and expert SOC codes were linked to the CANJEM job-exposure matrix comprising exposure estimates for 258 agents (probability, intensity, exposure status: probability > 0 vs. 0). We evaluated the agreement between the expert and the top scoring SOC code (proportion of agreement), and in their agent-specific CANJEM estimates (kappa for exposure status; intra-class correlation coefficient, ICC, for probability and intensity) in subsets of jobs stratified by metrics derived from the SOCcer score and CANJEM. We describe the overall patterns.
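As an illustration of the agreement measures named above, the following minimal sketch computes the proportion of agreement and a kappa statistic for binary exposure status. It assumes unweighted Cohen's kappa, which the abstract does not specify, and the function names and toy inputs are hypothetical rather than taken from the study.

```python
from typing import Sequence


def proportion_agreement(expert: Sequence[int], soccer: Sequence[int]) -> float:
    """Share of jobs where the two assignments are identical."""
    if len(expert) != len(soccer) or len(expert) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(e == s for e, s in zip(expert, soccer)) / len(expert)


def cohens_kappa_binary(expert: Sequence[int], soccer: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for 0/1 exposure status (assumed variant)."""
    n = len(expert)
    p_observed = proportion_agreement(expert, soccer)
    p_expert = sum(expert) / n   # share exposed under the expert code
    p_soccer = sum(soccer) / n   # share exposed under the SOCcer code
    # Agreement expected by chance from the two marginal distributions.
    p_chance = p_expert * p_soccer + (1 - p_expert) * (1 - p_soccer)
    if p_chance == 1.0:
        # Both raters put every job in the same category; kappa is undefined.
        return float("nan")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical exposure status (1 = probability > 0) for four jobs.
expert_status = [1, 1, 0, 0]
soccer_status = [1, 0, 0, 0]
print(proportion_agreement(expert_status, soccer_status))
print(cohens_kappa_binary(expert_status, soccer_status))
```

In the study this comparison would be repeated per agent and within each stratum of jobs; the sketch covers a single agent only.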
Results Moderate agreement was usually achieved for jobs with a maximum score ≥0.3. Higher agreement was observed for jobs with a SOCcer score distance between the top two scoring SOC codes of ≥0.1 versus <0.1. Combining these two characteristics, kappas and ICCs were 0.7–0.8 for jobs with a maximum score ≥0.3 and a score distance ≥0.1 (36–53% of all jobs), compared to 0.3–0.5 for jobs that did not meet both thresholds. We also found higher agreement for jobs where the top two scoring SOC codes had the same versus different exposure status.
Conclusions When applying SOCcer to un-coded jobs, we found that expert review would be most informative (reduce misclassification) for jobs with maximum scores < 0.3 and for jobs where the top two ranked SOC codes had score distances < 0.1 or differing exposure estimates.
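The review criteria above can be sketched as a simple triage rule. This is a minimal illustration, not part of SOCcer itself: the `ScoredJob` class, its field names, and the reason labels are hypothetical, and exposure status (probability > 0 vs. 0) for a single agent stands in for "differing exposure estimates".

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_SCORE_THRESHOLD = 0.3       # maximum score below this suggests review
SCORE_DISTANCE_THRESHOLD = 0.1  # top-two distance below this suggests review


@dataclass
class ScoredJob:
    """Hypothetical container for one job's top two SOCcer results."""
    top_score: float
    second_score: float
    # Exposure status of each code for one agent; None if not linked.
    top_exposed: Optional[bool] = None
    second_exposed: Optional[bool] = None


def review_reasons(job: ScoredJob) -> List[str]:
    """Return the criteria that flag a job for expert review (empty if none)."""
    reasons = []
    if job.top_score < MAX_SCORE_THRESHOLD:
        reasons.append("low_max_score")
    if job.top_score - job.second_score < SCORE_DISTANCE_THRESHOLD:
        reasons.append("small_score_distance")
    if (
        job.top_exposed is not None
        and job.second_exposed is not None
        and job.top_exposed != job.second_exposed
    ):
        reasons.append("differing_exposure_status")
    return reasons


# Hypothetical jobs: one clear assignment, one ambiguous assignment.
clear_job = ScoredJob(0.62, 0.20, top_exposed=True, second_exposed=True)
ambiguous_job = ScoredJob(0.25, 0.22, top_exposed=True, second_exposed=False)
print(review_reasons(clear_job))
print(review_reasons(ambiguous_job))
```

Returning the list of reasons rather than a single flag lets a reviewer see which criterion triggered and prioritise jobs that fail several at once.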