Improved Record Linkage for Encrypted Identifying Data
Authors: C. Pang and D. Hansen
Date: August 2006
Abstract:
The health data integration project at the E-Health Research Centre is researching ways of improving the integration of health and health-related data while maintaining the privacy and security of the data. One such way is to improve the mechanisms for matching patients across databases when the identifying information must not be revealed, even during the linkage step.
Background: With health-related data spread across many administrative and clinical databases, the ability to bring the data together dynamically is important, whether to support clinical decision making, administrative reporting or clinical research.
Objectives: Mechanisms for blindfolded record linkage have already been published. One way to further strengthen the security and privacy of these algorithms is to encrypt the identifying data, such as name and date of birth, before performing the linkage step. However, due to the nature of encryption algorithms, encrypted data can only be matched exactly, which limits the ability to allow for errors in the data. This work presents a mechanism that allows encrypted data to be matched even when the data may contain errors.
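The exact-match limitation can be seen in a minimal sketch. It assumes a keyed hash (HMAC-SHA-256) as a stand-in for whatever encryption scheme the custodians actually use; the key, the names and the protect function are illustrative only and are not taken from the paper.

```python
import hashlib
import hmac


def protect(value: str, key: bytes) -> str:
    """Return a keyed digest of value (an assumed stand-in for encryption)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key assumed to be shared by both data custodians.
KEY = b"demo-key-shared-by-custodians"

# Identical inputs always give identical digests, so exact matching works.
same_spelling_matches = protect("katherine", KEY) == protect("katherine", KEY)

# A one-letter spelling variation gives an unrelated digest, so the two
# values can no longer be recognised as near matches once protected.
variant_spelling_matches = protect("katherine", KEY) == protect("katharine", KEY)

print(same_spelling_matches, variant_spelling_matches)
```

Because the digests of similar names carry no usable similarity, any tolerance for errors has to be introduced before the data is protected.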
Methods: A public reference table which is common to both data custodians is used. Each value in the original data is compared to data in the public reference table using an edit distance function. Names from the reference table which are within a given distance of the original data are sent to the linker. The data from the two data custodians are then compared to decide the likelihood of two records being a match.
Results: The method described in this paper performs better than other methods which support matching of encrypted data, such as exact matching or matching using Soundex.
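The reference-table step described under Methods can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: Levenshtein distance is assumed as the edit distance function, the selected reference names are assumed to be protected with a keyed hash before being sent to the linker, and the linker's likelihood score is assumed to be a simple set overlap (Jaccard). The reference table, key and function names are hypothetical.

```python
import hashlib
import hmac


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def neighbours(value: str, reference: list, max_dist: int) -> set:
    """Reference names within max_dist edits of value (case-insensitive)."""
    v = value.lower()
    return {r for r in reference if edit_distance(v, r.lower()) <= max_dist}


def protect(names: set, key: bytes) -> set:
    """Keyed digests of the selected names (assumed stand-in for encryption)."""
    return {hmac.new(key, n.lower().encode("utf-8"), hashlib.sha256).hexdigest()
            for n in names}


def overlap_score(a: set, b: set) -> float:
    """Assumed linker score: Jaccard overlap of two sets of digests."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical public reference table and shared key.
REFERENCE = ["smith", "smyth", "smithe", "jones", "johns", "brown"]
KEY = b"shared-secret"

# Each custodian runs these steps locally; only the digests leave the site.
tokens_a = protect(neighbours("Smith", REFERENCE, 1), KEY)
tokens_b = protect(neighbours("Smyth", REFERENCE, 1), KEY)
tokens_c = protect(neighbours("Brown", REFERENCE, 1), KEY)

print(overlap_score(tokens_a, tokens_b))  # similar spellings share digests
print(overlap_score(tokens_a, tokens_c))  # unrelated names share none
```

In this sketch the linker sees only digests of reference-table entries, never the original values, yet two slightly different spellings still overlap because they select some of the same reference names.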
Discussion and Conclusion: The method described in this paper can be used to improve the level of record matching in tools where access to identifying data is prohibited. This method is currently being added to the HDI software tool as another mechanism of matching records between databases.
© 2006 HISA Ltd.
