DOI: 10.3414/ME14-01-0137
Exploiting Distributed, Heterogeneous and Sensitive Data Stocks while Maintaining the Owner’s Data Sovereignty
Publication History
Received: 10 December 2014
Accepted: 30 May 2015
Publication Date: 22 January 2018 (online)
Summary
Background: To achieve statistical significance in medical research, biological or data samples from several bio- or databanks often need to be complemented by those of other institutions. For that purpose, IT-based search services have been established to locate datasets matching a given set of criteria in databases distributed across several institutions. However, previous approaches require data owners to disclose information about their samples, raising a barrier for their participation in the network.
Objective: To devise a method to search distributed databases for datasets matching a given set of criteria while fully maintaining their owner’s data sovereignty.
Methods: As a modification to traditional federated search services, we propose the decentral search, which gives the data owner a high degree of control. Relevant data are loaded into local bridgeheads, each under their owner’s sovereignty. Researchers can formulate criteria sets along with a project proposal using a central search broker, which then notifies the bridgeheads. The criteria are, however, treated as an inquiry rather than a query: instead of responding with results, bridgeheads notify their owner and wait for his/her decision regarding whether and what to answer based on the criteria set, the matching datasets and the specific project proposal. Without the owner’s explicit consent, no data leaves his/her institution.
Results: The decentral search has been deployed in one of the six German Centers for Health Research, comprising eleven university hospitals. In the process, compliance with German data protection regulations has been confirmed. The decentral search also marks the centerpiece of an open source registry software toolbox aiming to build a national registry of rare diseases in Germany.
Conclusions: While the sacrifice of real-time answers impairs some use-cases, it leads to several beneficial side effects: improved data protection due to data parsimony, tolerance for incomplete data schema mappings and flexibility with regard to patient consent. Most importantly, as no datasets ever leave their institution, owners can reject projects without facing potential peer pressure. By its lower barrier for participation, a decentral search service is likely to attract a larger number of partners and to bring a researcher into contact with the right potential partners.
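The inquiry workflow described under Methods can be illustrated with a minimal sketch. It is not the deployed software: the class names (`Inquiry`, `Bridgehead`, `SearchBroker`), the dictionary-based criteria format and the `owner_decision` callback are assumptions made purely for illustration. The sketch shows only the central idea that the broker distributes inquiries, matching happens locally, and nothing is returned unless the owner consents.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Inquiry:
    """Criteria set plus project proposal, as formulated by a researcher."""
    proposal: str
    criteria: dict  # hypothetical format: attribute name -> required value


@dataclass
class Bridgehead:
    """Local data store that stays under its owner's sovereignty."""
    site: str
    datasets: list  # list of dicts; this class never sends them anywhere
    # Hypothetical hook standing in for the human owner: receives the inquiry
    # and the local matches, returns a consented reply or None to stay silent.
    owner_decision: Callable[[Inquiry, list], Optional[str]]
    pending: list = field(default_factory=list)

    def receive(self, inquiry: Inquiry) -> None:
        """Queue the inquiry for the owner; nothing is answered automatically."""
        self.pending.append(inquiry)

    def local_matches(self, inquiry: Inquiry) -> list:
        """Evaluate the criteria locally; the result stays inside the site."""
        return [d for d in self.datasets
                if all(d.get(k) == v for k, v in inquiry.criteria.items())]

    def review_pending(self) -> list:
        """Let the owner decide on each queued inquiry; return consented replies."""
        replies = []
        for inquiry in self.pending:
            decision = self.owner_decision(inquiry, self.local_matches(inquiry))
            if decision is not None:
                replies.append((self.site, inquiry.proposal, decision))
        self.pending.clear()
        return replies


class SearchBroker:
    """Central broker: distributes inquiries and holds no datasets."""

    def __init__(self, bridgeheads):
        self.bridgeheads = list(bridgeheads)

    def publish(self, inquiry: Inquiry) -> None:
        for bridgehead in self.bridgeheads:
            bridgehead.receive(inquiry)


# Usage: one owner chooses to answer with a count, the other declines.
def willing_owner(inquiry, matches):
    return f"{len(matches)} match(es)" if matches else None


def declining_owner(inquiry, matches):
    return None


site_a = Bridgehead("A", [{"diagnosis": "C50"}, {"diagnosis": "C18"}], willing_owner)
site_b = Bridgehead("B", [{"diagnosis": "C50"}], declining_owner)
broker = SearchBroker([site_a, site_b])
broker.publish(Inquiry("Breast cancer tissue study", {"diagnosis": "C50"}))
replies = site_a.review_pending() + site_b.review_pending()
print(replies)
```

In this sketch, site B holds a matching dataset but reveals nothing, not even that a match exists, which mirrors the point that owners can decline a project without disclosing information about their samples.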