Methods Inf Med 2015; 54(01): 41-44
DOI: 10.3414/ME13-02-0027
Focus Theme – Original Articles
Schattauer GmbH

An Eligibility Criteria Query Language for Heterogeneous Data Warehouses[*]

R. Bache (1, 2), A. Taweel (1, 2), S. Miles (1), B. C. Delaney (2)
1   Department of Informatics, King’s College London, London, UK
2   Department of Primary Care and Public Health Sciences, King’s College London, London, UK

Publication History

Received: 15 June 2013
Accepted: 07 May 2014
Published online: 22 January 2018

Summary

Introduction: This article is part of the Focus Theme of Methods of Information in Medicine on “Managing Interoperability and Complexity in Health Systems”.

Objectives: The increasing availability of electronic clinical data provides great potential for finding eligible patients for clinical research. However, data heterogeneity makes it difficult for clinical researchers to interrogate sources consistently. Existing standard query languages are often not sufficient to query across diverse representations. Thus, a higher-level domain language is needed so that queries become data-representation agnostic. To this end, we define a clinician-readable computational language for querying whether patients meet eligibility criteria (ECs) from clinical trials. This language supports the temporal semantics required by many ECs and can be automatically evaluated on heterogeneous data sources.

Methods: By reference to standards and examples of existing ECs, a clinician-readable query language was developed. Using a model-based approach, it was implemented to transform captured ECs into queries that interrogate heterogeneous data warehouses. The query language was evaluated on two types of data sources, each different in structure and content.
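The idea of separating a representation-agnostic criterion from the source-specific access layer can be illustrated with a minimal Python sketch. This is not the query language or implementation described in the article; the `Criterion` class, the two adapter functions, the sample records and the concept code "E11" are all hypothetical and serve only to show one criterion with a simple temporal window being evaluated unchanged over two differently structured sources.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Neutral form of one clinical fact: (patient id, concept code, date recorded).
Fact = Tuple[str, str, date]


@dataclass(frozen=True)
class Criterion:
    """Hypothetical criterion: a concept recorded within a look-back window."""
    concept: str
    within_days: int

    def matches(self, facts: Iterable[Fact], as_of: date) -> Set[str]:
        # Temporal constraint: the fact must fall inside [as_of - window, as_of].
        earliest = as_of - timedelta(days=self.within_days)
        return {pid for pid, code, when in facts
                if code == self.concept and earliest <= when <= as_of}


def adapt_flat_rows(rows: List[Dict[str, str]]) -> Iterator[Fact]:
    """Adapter for an assumed flat, row-per-event source with ISO date strings."""
    for row in rows:
        yield row["patient"], row["code"], date.fromisoformat(row["date"])


def adapt_nested(records: Dict[str, List[Tuple[str, date]]]) -> Iterator[Fact]:
    """Adapter for an assumed nested, patient-keyed source with date objects."""
    for pid, events in records.items():
        for code, when in events:
            yield pid, code, when


# Illustrative data only; neither structure mirrors a real warehouse schema.
warehouse_a = [
    {"patient": "A1", "code": "E11", "date": "2013-03-01"},
    {"patient": "A2", "code": "E11", "date": "2010-01-01"},
]
warehouse_b = {
    "B1": [("E11", date(2013, 5, 20))],
    "B2": [("I10", date(2013, 5, 1))],
}

criterion = Criterion(concept="E11", within_days=365)
as_of = date(2013, 6, 15)

# The same criterion object is evaluated against both sources.
eligible_a = criterion.matches(adapt_flat_rows(warehouse_a), as_of)
eligible_b = criterion.matches(adapt_nested(warehouse_b), as_of)
print(sorted(eligible_a), sorted(eligible_b))
```

In this sketch only the adapters know how each source is laid out, so the criterion itself never refers to a table, column or date format.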

Results: The query language raises the level of abstraction so that researchers can construct their ECs with no prior knowledge of the data sources. It was evaluated on two types of semantically and structurally diverse data warehouses. This query language is now used to express ECs in the EHR4CR project. A user survey showed that the majority of users perceived it to be useful, easy to understand and unambiguous.

Discussion: An EC-specific language enables clinical researchers to express their ECs as a query such that the user is insulated from the complexities of heterogeneous clinical data sets. More generally, the approach demonstrates that a domain query language has potential for overcoming the problems of semantic interoperability and is applicable where the nature of the queries is well understood and the data are conceptually similar but held in different representations.

Conclusions: Our language provides a strong basis for use across different clinical domains for expressing ECs by overcoming the heterogeneous nature of electronic clinical data whilst maintaining semantic consistency. It is readily comprehensible by target users. This demonstrates that a domain query language can be both usable and interoperable.

* Supplementary material published on our website www.methods-online.com


 