A search strategy for occupational health intervention studies
- J Verbeek1,4,
- J Salmi1,
- I Pasternack1,2,
- M Jauhiainen3,
- I Laamanen3,
- F Schaafsma4,
- C Hulshof4,
- F van Dijk4
- 1Cochrane Occupational Health Field, Finnish Institute of Occupational Health, Department of Research and Development of Occupational Health Services, Kuopio, Finland
- 2FinOHTA, STAKES, Helsinki, Finland
- 3Information Centre, Finnish Institute of Occupational Health, Helsinki, Finland
- 4Coronel Institute for Occupational and Environmental Health, AMC, Amsterdam, the Netherlands
- Correspondence to: Dr J Verbeek, Cochrane Occupational Health Field, Finnish Institute of Occupational Health, Department of Research and Development of Occupational Health Services, PO Box 93, 70701 Kuopio, Finland
- Accepted 10 May 2005
Background: As a result of low numbers and diversity in study type, occupational health intervention studies are not easy to locate in electronic literature databases.
Aim: To develop a search strategy that facilitates finding occupational health intervention studies in Medline, both for researchers and practitioners.
Methods: A gold standard of articles was created by going through two whole volumes of 19 biomedical journals, both occupational health specialty and non-occupational health journals. Criteria for occupational health intervention studies were: evaluating an intervention with an occupational health outcome and a study design with a control group. Each journal was searched independently by two of the authors. Search terms were developed by asking specialists and counting word frequencies in gold standard articles.
Results: Out of 11 022 articles published we found 149 occupational health intervention studies. The most sensitive single terms were work*[tw] (sensitivity 71%, specificity 88%) and effect*[tw] (sensitivity 75%, specificity 63%). The most sensitive string was (effect*[tw] OR control*[tw] OR evaluation*[tw] OR program*[tw]) AND (work*[tw] OR occupation*[tw] OR prevention*[tw] OR protect*[tw]) (sensitivity 89%, specificity 78%). The most specific single terms were “occupational health”[tw] (sensitivity 22%, specificity 98%) and effectiveness[tw] (sensitivity 22%, specificity 98%). The most specific string was (program[tw] OR “prevention and control”[sh]) AND (occupational[tw] OR worker*[tw]) (sensitivity 47%, specificity 98%).
Conclusion: No single search terms are available that can locate occupational health intervention studies sufficiently. The authors’ search strings have acceptable sensitivity and specificity to be used by researchers and practitioners respectively. Redefinition and elaboration of keywords in Medline could greatly facilitate the location of occupational health intervention studies.
The idea of evidence based occupational health is nowadays supported by an increasing number of practitioners and researchers in the field.1 This has created a need for an overview of the evidence for effectiveness of occupational health interventions. To this end we recently started an occupational health field within the Cochrane Collaboration.2 The basic evidence is formed by original studies on evaluation of effectiveness of occupational health interventions. Both researchers and practitioners of occupational health will nowadays search for evidence on problems arising from practice in literature databases that are widely available through the internet, such as Medline through PubMed. For them it is most helpful to develop search methods that are effective and efficient.
A search should yield as much information as is available on a specific topic and lead to as few articles as possible that are not related to the search topic. In epidemiological terms this would mean a search that is both sensitive and specific.
For clinical medicine, special search strategies have been developed to search efficiently for articles on therapy, diagnosis, or other clinically important topics. These tools for clinicians cannot be transposed easily to occupational health or occupational medicine. Evidence on effectiveness for therapy is mostly based on evaluating treatment with drugs, usually in randomised controlled trials. This is reflected in the search strategies in PubMed which are made up of words like “double blind” or “randomised controlled trial [pt]”.3 However, if one searches for randomised controlled trials (RCTs) in the field of occupational health, this yields only a few articles. This is partly caused by the nature of evidence that exists in the field of occupational health, where there are few RCTs but many studies on interventions, programme evaluations, or other study designs. For community intervention programmes it has been argued as well that evidence should be extended beyond the classical RCT.4 After all, on many occasions it is impossible or not feasible to randomise in trials on community or occupational health interventions. In this case it would be unwise to restrict oneself to RCTs and conclude that there is no evidence on interventions. This means that we need to develop literature search strategies that are sensitive to and specific for a broad range of study designs that are used in evaluating occupational health interventions.
Search strategies are usually developed by first compiling a gold standard of articles that should be found if one is searching for relevant articles. Next, one can search Medline or other literature databases with relevant search words. Comparison of the yield of search words or a combination thereof with the gold standard will show what proportion of articles from the gold standard are found and not found. Put in other words, it will show the sensitivity and specificity of the search words for the studies defined by the gold standard.
In this article we want to report on the qualities of various search words to find occupational health intervention studies.
Construction of a gold standard
To develop the gold standard we searched the tables of contents and full volumes of several categories of highly ranked biomedical journals published in 2000 and 2001 for articles that fulfilled our criteria of an occupational health intervention study. This is called hand searching. In a previous article we defined occupational health intervention studies as studies that intend to eliminate or control hazards at work or in organisations related to health and disability, change health and disability related behaviour and skills among workers, or prevent or better treat diseases and related disabilities. We translated these into criteria for intervention studies and criteria for occupational health outcomes (table 1). To be included in the gold standard, studies had to fulfil both one of the intervention criteria and one or more of the occupational health criteria.
The journals listed in table 2 were hand searched. First we searched the eight journals in the field of occupational health that were reported as containing most information on occupational health problems.5 We expected that this would reveal the most articles on occupational health interventions. Next, we hand searched three general medical journals with the expectation that high quality research would be published in highly ranked medical journals. In addition we hand searched four specialty journals in fields with important occupational health problems. To also find articles in fields more remote from the occupational health field we searched three more diverse specialty journals. Altogether, we hand searched 19 different journals and in total 11 022 original research articles from publication years 2000 and 2001. All journals were searched independently by two of the authors. The agreement between the two independent hand searchers yielded a kappa of 0.76. Discrepancies between search results were resolved by discussion and compromise, which led to the gold standard. Agreement between the hand searchers and the final gold standard was 0.88.
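The chance-corrected agreement between two hand searchers can be illustrated with a minimal sketch. It assumes the kappa reported above is Cohen's kappa for two raters making include/exclude decisions; the function name `cohens_kappa` and all counts are hypothetical and are not taken from the study.

```python
def cohens_kappa(both_yes, only_first, only_second, both_no):
    """Cohen's kappa for two raters making include/exclude decisions."""
    n = both_yes + only_first + only_second + both_no
    # Proportion of articles on which the two raters agree.
    observed = (both_yes + both_no) / n
    # Agreement expected by chance, from each rater's own inclusion rate.
    p1_yes = (both_yes + only_first) / n
    p2_yes = (both_yes + only_second) / n
    expected = p1_yes * p2_yes + (1 - p1_yes) * (1 - p2_yes)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 1000 screened articles (NOT the study's data).
kappa = cohens_kappa(both_yes=40, only_first=10, only_second=10, both_no=940)
```

Because relevant articles are rare, raw agreement is high almost by default; kappa corrects for this by subtracting the agreement expected by chance.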
Development of search words
Based on the definitions of occupational health intervention studies we conceived as many search words as possible. We used the network of the Cochrane Occupational Health Field to ask for additions to this list of search words. A third strategy to collect search words was by analysing the frequency of words in title, abstract, and keywords in the articles of the gold standard by means of the reference management programme Endnote (Thomson Scientific, Stamford, CT, USA). The search words from the first two strategies were all contained in the third strategy. Therefore, we used words that occurred at least twice in titles, 10 times in abstracts, or four times in keywords for further analysis. From this list we chose the most frequent words that best met the occupational health intervention criteria. We made separate lists for search words for “occupational health” and for “intervention studies”. To develop a more sensitive search we truncated certain words; for example, “worker”, “working”, and “workers” will all be found by the truncated search term “work*”. In some cases we could also truncate search words for the specific search without losing too much of the specificity. In cases where this was possible, the same search words were used in different forms by using so-called tags for searching in the title and in the abstract (tag:[tiab]) only or as a Medical Subject Heading only (tag:[MeSH]).
In PubMed we selected all original articles that were published in the journals and volumes that we had hand searched and downloaded those into a SPSS file (search string: “all relevant journal names” [jour] AND 2000:2001[dp] AND “journal article”[pt] NOT review[pt]). In the SPSS file we indicated which articles belonged to the gold standard. Then, we used the most frequently occurring search words to search with PubMed in Medline in the journals and volumes that we had hand searched. The yield of these searches was downloaded and merged with the existing SPSS file which contained all journal articles and the gold standard. This enabled us to construct two by two tables from which we could calculate the appropriate test properties for each search word and combinations thereof.
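The merge described above can be sketched with set operations. The study used SPSS; the Python below is only a stand-in. The function name `two_by_two` and the article identifiers are hypothetical, and the cell labels a to d are assumed to follow the usual diagnostic test layout (a true positives, b false positives, c false negatives, d true negatives).

```python
def two_by_two(all_ids, gold_ids, retrieved_ids):
    """Cross-tabulate one search's yield against the gold standard."""
    all_ids = set(all_ids)
    gold_ids = set(gold_ids) & all_ids
    # Ignore hits that fall outside the hand-searched journals and volumes.
    retrieved_ids = set(retrieved_ids) & all_ids
    a = len(retrieved_ids & gold_ids)            # found, in gold standard
    b = len(retrieved_ids - gold_ids)            # found, not in gold standard
    c = len(gold_ids - retrieved_ids)            # missed gold standard articles
    d = len(all_ids - gold_ids - retrieved_ids)  # correctly not retrieved
    return a, b, c, d


# Hypothetical identifiers for ten articles (NOT real PubMed IDs).
all_articles = range(1, 11)
gold_standard = {1, 2, 3}
search_yield = {2, 3, 4, 5}
a, b, c, d = two_by_two(all_articles, gold_standard, search_yield)
```

The four cells always sum to the number of articles in the hand-searched sample, which is a useful check after each merge.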
The outcome of a search strategy is defined in the same terms as any diagnostic test (table 3). A sensitive search will find a high proportion of the articles that belong to the gold standard, resulting in the smallest number of false negatives. A specific search will exclude the highest proportion of articles that do not belong to the gold standard and yield the smallest number of false positives. Sensitivity and specificity are inversely related, which means they always have to be considered in combination.
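As a small worked example, sensitivity and specificity follow directly from the cells of the two by two table. The cell labels (a true positives, b false positives, c false negatives, d true negatives) are assumed, and the counts are illustrative: they were chosen to add up to the study's totals (149 gold standard articles among 11 022) and to roughly match the figures reported for work*[tw], but the actual cell counts are not given in this text.

```python
def sensitivity(a, c):
    """Proportion of gold standard articles that the search retrieves."""
    return a / (a + c)


def specificity(b, d):
    """Proportion of non-gold-standard articles that the search excludes."""
    return d / (b + d)


# Illustrative counts only (assumption, not reported data).
a, b, c, d = 106, 1305, 43, 9568
sens = sensitivity(a, c)
spec = specificity(b, d)
```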
From a practical point of view we discern two different users for the search strategy: researchers and practitioners. Researchers who are reviewing the literature would like to be sure that a search yields all available studies and they would have time as part of the research project to go through relatively high numbers of studies. This indicates a search strategy with a high sensitivity and low number of false negatives. However, in our experience it is a very tedious job to go through thousands of article titles and abstracts. Numbers beyond 2000 will become unfeasible even for researchers. Practitioners would like to have a limited number of articles that precisely fit their question from practice. They would not have time to go through a great number of studies. In our experience, 10 hits in a total of 100 articles to go through would be the upper limit of feasibility for them. This indicates a search strategy with a high specificity and a rather low number of false positives. Some authors formulated the inverse of the positive predictive value as the number of titles needed to read to find one hit.6
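The practitioner's limit of 10 hits in 100 articles can be expressed as a positive predictive value and its inverse, the number needed to read. The sketch below assumes cell a holds the true positives and cell b the false positives; the function names are hypothetical.

```python
def positive_predictive_value(a, b):
    """Share of retrieved articles that belong to the gold standard."""
    return a / (a + b)


def number_needed_to_read(a, b):
    """Retrieved articles per relevant hit (inverse of the PPV)."""
    return (a + b) / a


# The practitioner limit from the text: 10 hits among 100 retrieved articles.
ppv = positive_predictive_value(a=10, b=90)
nnr = number_needed_to_read(a=10, b=90)
```

Under this reading, a practitioner-oriented search should keep the number needed to read at 10 or below.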
Following Haynes,3 we defined our outcome in two ways:
- the most sensitive search strategy with a specificity of at least 90%
- the most specific search strategy with a sensitivity of at least 50%.
For all the selected search words we constructed two by two tables and calculated the sensitivity and specificity and their 95% confidence intervals.
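The text does not state which method was used for the confidence intervals. As one possible approach, the sketch below uses the Wilson score interval for a proportion; this choice, the function name `wilson_interval`, and the example count are assumptions.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Illustrative only: a search word finding 106 of the 149 gold standard
# articles (106 is a hypothetical count, not a reported one).
low, high = wilson_interval(106, 149)
```

With only 149 gold standard articles, intervals around sensitivity are much wider than those around specificity, which rests on more than 10 000 articles.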
We made lists of the most sensitive and the most specific single search words that covered either the content “occupational health” or “intervention”. The sensitive words had to have a specificity of at least 50%, and the specific words a sensitivity of at least 20%.
To find the most sensitive search term combination we used all search words with a sensitivity of more than 15% and a specificity of more than 75%. Next we made a search string with the combination that yielded the most sensitive search by subsequently adding those search words (with the OR operator) that found most additional articles from the gold standard with least false positive results. A similar search string for specificity was made by starting with search words with specificity of at least 90% and sensitivity of more than 20% and adding search words (with the OR operator) that increased sensitivity with the least lowering of specificity. Finally, occupational health terms and intervention terms were combined by the AND operator.
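The stepwise procedure above resembles a greedy selection, which can be sketched as follows. This is an approximation under stated assumptions, not the procedure as it was actually carried out (the choices were partly made by hand): the function name `greedy_or_string`, the tie-breaking rule, the stopping rule, and all article identifiers are hypothetical.

```python
def greedy_or_string(term_hits, gold, max_terms=4):
    """Greedily OR together terms that add the most gold standard articles,
    breaking ties in favour of the fewest additional false positives."""
    chosen, covered = [], set()
    remaining = dict(term_hits)
    while remaining and len(chosen) < max_terms:
        def gain(term):
            new = remaining[term] - covered
            return (len(new & gold), -len(new - gold))
        best = max(sorted(remaining), key=gain)
        if gain(best)[0] == 0:  # no term adds a gold standard article
            break
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Hypothetical yields: article identifiers retrieved by each search word.
gold = {1, 2, 3, 4}
occupational_terms = {
    "work*": {1, 2, 3, 10, 11},
    "occupation*": {3, 4, 12},
    "protect*": {4, 12, 13, 14},
}
chosen, occupational_hits = greedy_or_string(occupational_terms, gold)

# Combining with the intervention group by AND is a set intersection.
intervention_hits = {1, 2, 4, 12, 20}
combined = occupational_hits & intervention_hits
```

In this toy example "protect*" is never added, because after the first two words it contributes only false positives.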
The total number of occupational health intervention studies retrieved by means of hand searching was 149 out of 11 022 articles, yielding a prevalence of 1.35 occupational health intervention studies per 100 articles published in our sample. Most were found in occupational health specialty journals and fewest in non-related specialty journals (table 2). Very few occupational health intervention studies are published in the major high quality journals.
We used 60 search words from titles, 78 from the abstracts, and 41 from the keywords, together with additional truncations of these, in our analysis. The most sensitive and the most specific single search words are listed in tables 4 and 5. The most sensitive and the most specific search strings are given in table 6.
Surprising results were found for “occupational health”. Used as a MeSH term it had a sensitivity of 14%; with the text word tag [tw] the sensitivity was 22%, and with the title and abstract tag [tiab] it was only 5%. The same applied to the search word “occupational”: searched as a text word its sensitivity was 57%, but searched in title and abstract only it was 26%. The best MeSH term was the subheading “prevention and control”, with a sensitivity of 46% and a specificity of 92%.
We did not find a single search word that could easily locate occupational health intervention studies. In addition, our search strings fell a little short of the sensitivity and specificity that we formulated in advance. It was not possible to develop a search string that was both sensitive and specific at the same time. However, the properties of the search strings that we developed are still sufficient to increase the effectiveness and efficiency of searching for both practitioners and researchers. As the search strings are general strings they will almost always be combined with a more specific topic, such as a specific disease. This will increase the specificity of the search and thereby the feasibility in practice.
The strength of our study is that we were able to create a gold standard of those articles on occupational health interventions that we wanted to find. This made it possible to calculate diagnostic test properties for a great number of relevant search words. Other studies on search strategies for occupational health did not have this possibility and can draw less straightforward conclusions.5,7,8
One of the weaknesses of our study is that we were hampered by the lack of a computer program that could easily calculate test characteristics for different combinations of search words, like Haynes described.9 Although we tried to develop the search strategy as systematically as possible, we had to rely on subjective choices of search words to test, instead of being able to test all different combinations of search words. However, the method we used—by simply adding search words to a string and making an assessment of the extra benefit—seemed sufficiently reliable. We feel that we did not miss any important combination that would have changed our results substantially. We also used a logistic regression analysis to predict the best outcome from a combination of search words. It was difficult to interpret because it did not give us the opportunity to select sensitive and specific terms separately and it did not add to the strategy that we used already.
Searching the medical literature for evidence of effectiveness is an important aspect of evidence based occupational health.
Occupational health intervention studies are not prevalent and therefore not easy to locate in medical databases.
A search strategy for occupational health studies should be sensitive and specific.
The most sensitive search string for occupational health studies has a sensitivity of 89% and a specificity of 78%.
The most specific search string has a specificity of 98% and a sensitivity of 47%.
Another difficulty was the broad definition of the occupational health field and of intervention studies, incorporating several different outcomes. This made it difficult to find single search terms that satisfied our criteria, which is reflected in the relatively low sensitivity of the most specific search terms. As far as we know, our study is the first to document the development of a search strategy for intervention studies other than RCTs. This entails several difficulties, such as which study designs to include and the lack of uniform tagging in databases. We decided to include a broad range of intervention studies because we felt that this is most appropriate for the occupational health field. It remains to be seen how useful this evidence is compared with RCTs, because at the moment the methodology to properly synthesise these study results is lacking.
The interobserver agreement with a kappa of 0.77 was reasonable, especially given the sometimes very vague research designs that were used. Most discussions on inclusion between hand searchers were in this area, where it is sometimes difficult to discern an evaluation study from a prognostic study. This was also a learning process for some of us who were less experienced in classifying articles by research design. That the hand searching still provides a valid result is also supported by using “randomised controlled trials[pt]” as a search word and trying to find only the RCTs in the gold standard. Thirty of the 32 RCTs present were then found (the two that were not found were erroneously not tagged as RCT by Medline). This means that nearly all the studies that we found and categorised as RCTs were also tagged as such by the Medline staff.
Another comment on our gold standard is that our choice of journals, even though it is based on previous research, is still somewhat arbitrary. One could argue that occupational psychology and occupational safety journals are lacking. This could have increased the number of, for example, stress intervention studies and safety interventions and thus, possibly, have generated other search words. We recommend further research into efficient search strategies for these more specific areas.
Our results are broadly comparable with those of other studies on search strategies. Finding occupational health intervention studies is apparently more difficult than finding diagnostic studies but similar to finding prognostic studies. Search strategies for diagnostic studies yield higher sensitivity than ours: 98% in one study,6 and 98% combined with 74% specificity in another.9 Wilczynski found a sensitivity of 90% with a specificity of 80% for prognostic studies.10 This compares well with our 89% and 78%.
Searching medical databases for occupational health intervention studies is important for practitioners and researchers in occupational health.
Using best search strings helps in locating occupational health intervention studies.
One difficulty that has not been resolved in studies on the development of optimal search strategies is the generalisability of the results. Because all such studies, ours included, select journals to be hand searched, the population of articles studied will differ from that in Medline as a whole. First, the test characteristics of the search strategy could therefore be different if the selection of journals differs very much from those in Medline in general. Next, the precision will be much lower when searching in the whole of Medline over a number of years because the precision depends on the prevalence of the studies to be found. Our study material was also different because we selected occupational health journals and original studies. It is unclear what this means for the test characteristics. On the other hand, it is almost impossible to select study material at random from Medline. It would entail an enormous variety of journals to hand search, which would be impossible to get access to.
Because the prevalence of occupational health articles in the general medical literature is low, the number of articles that will be retrieved will depend mostly on the specificity. In the two by two table this means that the numbers in cells b and d are much greater than the numbers in cells a and c. A decrease of specificity of a few percentage points will result in many hundreds or thousands of false positive articles, increasing the number needed to read. Some authors used the number needed to read as the main outcome measure of their search strategy.6,11 However, because this is so much dependent on the prevalence of articles in the total sample we did not use this measure.
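A short calculation shows why specificity dominates at low prevalence. The sensitivity (47%) and specificity (98%) are those of the most specific search string; the database size of one million articles, the lower specificity of 95%, and the reuse of the sample prevalence of 1.35% are assumptions made purely for illustration, and the function name is hypothetical.

```python
def expected_counts(n_total, prevalence, sensitivity, specificity):
    """Expected true and false positives for a search over n_total articles."""
    relevant = n_total * prevalence
    irrelevant = n_total - relevant
    true_positives = sensitivity * relevant
    false_positives = (1 - specificity) * irrelevant
    return true_positives, false_positives


# Hypothetical database of 1 000 000 articles at the sample's prevalence.
tp_98, fp_98 = expected_counts(1_000_000, 0.0135, 0.47, 0.98)
tp_95, fp_95 = expected_counts(1_000_000, 0.0135, 0.47, 0.95)

# Number needed to read = all retrieved articles per true positive.
nnr_98 = (tp_98 + fp_98) / tp_98
nnr_95 = (tp_95 + fp_95) / tp_95
```

Under these assumptions, lowering specificity by three percentage points adds tens of thousands of false positives and roughly doubles the number needed to read, while the number of relevant articles found stays the same.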
A major step forward would be a more uniform tagging system and practice by Medline staff of occupational health articles with both an “occupational health” tag and an “intervention study” tag. This would mean that the current tags have to be redefined in line with the occupational health studies we defined. Until that time the search strategies that we developed are the best means to retrieve occupational health studies from Medline.
This work has been made possible through grants from the Dutch Ministry of Social Affairs and Employment and the Finnish Ministry of Health and Social Affairs.