Graph-Based Approaches to Record Linkage in Large Datasets
SPRING 2016 RESEARCH INCUBATION AWARDEES
PI: Jacob Bor (Global Health, SPH)
Collaborators: George Kollios (Computer Science, CAS), Katia Oleinik (IS&T), Lorenzo Orecchia (Computer Science, CAS)
This project will develop, implement, validate, and publish graph-based methods for probabilistic record linkage. Researchers will investigate different approaches to integrating information contained in the network structure of the data and assess their performance using a real-world dataset: 35 million laboratory records from South Africa’s National Health Laboratory Service. Record linkage is challenging in large populations where unique patient identifiers do not exist and identifying information is limited. This project aims to address these challenges, which are common in many developing countries facing a growing burden of chronic disease.
This work is funded by a Hariri Research Award made in June 2016.