DOI: 10.3414/ME14-01-0046
Frequency Analysis of Medical Concepts in Clinical Trials and their Coverage in MeSH and SNOMED-CT[*]
Publication History
Received: 22 April 2014
Accepted: 05 October 2014
Publication Date: 22 January 2018 (online)
Summary
Background: Eligibility criteria (EC) of clinical trials play a key role in selecting appropriate study candidates and in ensuring the validity of trial outcomes. However, in most cases EC are provided in unstandardised forms such as free text, which poses significant challenges for machine readability.
Objectives: To establish a semantically annotated list of the most frequent medical concepts in clinical trials. This concept list contributes to the standardisation of EC and identifies relevant data items in electronic health records (EHRs) for clinical research. The coverage of the list in two major clinical vocabularies, MeSH and SNOMED-CT, will be assessed.
Methods: Four hundred and twenty-five clinical trials conducted between 2000 and 2011 at a German university hospital were analysed. A total of 6671 EC were manually annotated by a medical coder using Concept Unique Identifiers (CUIs) provided by the Unified Medical Language System. Two physicians performed a semi-automatic CUI code revision. Concept frequency was analysed and clusters of concepts were manually identified. A binomial significance test was applied to quantify coverage differences of the most frequent concepts in MeSH and SNOMED-CT.
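The summary does not specify how the binomial test was set up. The following is a minimal sketch of one common choice, an exact two-sided binomial (sign) test on discordant concepts, i.e. concepts found in one vocabulary but not in the other. The function name `binomial_two_sided_p` and the counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are at most as likely
    as the observed one (the usual "minimum likelihood" definition).
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so that ties survive floating-point rounding.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical discordant counts (assumption, not reported in the summary):
only_in_snomed = 9  # concepts covered by SNOMED-CT but missing in MeSH
only_in_mesh = 3    # concepts covered by MeSH but missing in SNOMED-CT
p_value = binomial_two_sided_p(only_in_snomed, only_in_snomed + only_in_mesh)
print(f"two-sided p-value: {p_value:.4f}")
```

Under the null hypothesis that both vocabularies cover the list equally well, a discordant concept is equally likely to fall on either side, which is why `p` defaults to 0.5.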
Results: Based on manual medical coding of 425 clinical trials, 7588 concepts were identified, of which 5236 were distinct. A top 100 list containing the 101 most frequent medical concepts was established. The concepts of this list cover 25% of all concept occurrences in all analysed clinical trials. This list reveals six missing entries in SNOMED-CT and 12 in MeSH. The median number of EC per trial has increased throughout the trial years (2000–2005: 8 EC/trial, 2011: 14 EC/trial).
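The frequency ranking and the coverage figure can be illustrated with a short sketch. It assumes the annotations are available as a flat list of CUI strings, one per concept occurrence. It also assumes, which the summary does not state, that a "top 100" list can hold more than 100 entries because concepts tied with the 100th rank are kept. The names `top_concepts` and `coverage` and the sample codes are hypothetical.

```python
from collections import Counter


def top_concepts(cuis, n=100):
    """Return (cui, count) pairs for the n most frequent concepts.

    Concepts tied with the n-th entry are kept, so the result may
    contain more than n items.
    """
    ranked = Counter(cuis).most_common()
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]
    return [(cui, count) for cui, count in ranked if count >= cutoff]


def coverage(cuis, top):
    """Share of all concept occurrences accounted for by the top list."""
    total = len(cuis)
    covered = sum(count for _, count in top)
    return covered / total if total else 0.0


# Hypothetical annotations: placeholder codes, not real CUIs from the study.
annotations = ["CUI_A"] * 3 + ["CUI_B"] * 2 + ["CUI_C"] * 2 + ["CUI_D"]
top = top_concepts(annotations, n=2)
print(top)
print(f"coverage: {coverage(annotations, top):.1%}")
```

Counting occurrences rather than distinct concepts in `coverage` matches the wording of the result, which refers to the share of all concept occurrences.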
Conclusions: Relatively few concepts cover one quarter of concept occurrences that represent EC in recent studies. Therefore, these concepts can serve as candidate data elements for integration into EHRs to optimise patient recruitment in clinical research.
* Supplementary material published on our website www.methods-online.com