Benefits of sharing patient data for research outweigh re-identification risks

A recent study finds a low risk of re-identification when patient data is shared for research and concludes that the health equity benefits outweigh the risks.

By admin

Nov 21, 2022, 5:25 PM

Patients, healthcare providers and regulators are perennially concerned about the privacy and security of patient data. Data breaches are common in this digital age, exposing the demographic, clinical, and administrative data of millions of patients each year.

There are real risks around hacking, phishing and ransomware at the healthcare provider level, and there are many good reasons to try locking down patient data tightly and keeping it within the four walls of the organization as often as possible.

But not all privacy and security concerns are created equal, and taking a one-size-fits-all approach to data management can cause just as many problems as it hopes to solve.

For example, fears that de-identified patient data shared with research organizations will be re-identified are likely overblown, according to an international team of researchers writing in PLOS Digital Health.

It is certainly feasible to re-identify patient data with current technology, and the possibility can never be eliminated completely. However, once data has been properly stripped of identifying demographic elements, there is little reason to believe that information is actually being traced back to its original owners and used for nefarious purposes, the authors stated.

PubMed’s library of scientific papers contains no examples of such re-identification occurring in the real world, and a search of more than 10,000 U.S. media publications revealed no reports of patients being matched back to data contributed to research registries or other databases, the authors noted.

Because there is no evidence that re-identification is a clear and present danger, healthcare organizations and their patients should reconsider their reluctance to participate in the clinical research ecosystem, especially in light of the growing global need for new therapeutic approaches to chronic diseases and rare, acute conditions.

Instead, stakeholders should actively invest in sharing patient data more broadly, which is essential for creating unbiased artificial intelligence (AI) algorithms while ensuring that clinical trials and other research initiatives are adequately diverse, inclusive and representative of real-life populations.

“The medical knowledge system that informs clinical practice worldwide has historically been based on studies primarily performed on a handful of high-income countries and typically enrolling white males,” the article pointed out. “To truly move towards a global knowledge medical system that incorporates data from all parts of the world to decrease bias and increase data fairness, data from places that historically have not had a leading role in the development of current medical standards should be included.”

Overcoming infrastructural and security barriers to sharing patient data

Privacy concerns are just one obstacle to making this vision a reality, the authors acknowledged. Developing countries continue to struggle with basic infrastructure issues, such as internet access in remote and rural areas, that make it difficult to collect complete and accurate digital data fit for research purposes. These regions need support and resources to build their digital health capabilities and join the world research community.

In countries that have already moved beyond these foundational barriers, however, members of the healthcare community need to contextualize their concerns and work toward contributing to a more equitable future for patients.

“We would argue that the cost—measured in terms of access to future medical innovations and clinical software while potentiating bias—of slowing ML progress is too great to stop sharing data through large publicly available databases for concerns over imperfect anonymization and potential linkage risks,” the team said.

“Publicly available datasets provide the fuel for widespread application and adoption of AI in healthcare and for advancing our understanding of heterogeneous and diverse patient populations globally. Slowing progress by limiting data sharing risks curtailing medical innovation and significantly impeding our ability to advance our understanding of health and global disease.”

Instead of limiting data sharing as a first line of defense against potential cybercrimes, organizations should work together to develop more comprehensive regulatory frameworks and shared governance standards to safeguard sensitive information.

“Preventing artificial intelligence’s progress towards precision medicine and sliding back to clinical practice dogma may pose a larger threat than concerns of potential patient reidentification within publicly available datasets,” the team concluded. “While the risk to patient privacy…will never be zero, society has to determine an acceptable risk threshold below which data sharing can occur—for the benefit of a global medical knowledge system.”

Jennifer Bresnick is a journalist and freelance content creator with a decade of experience in the health IT industry. Her work has focused on leveraging innovative technology tools to create value, improve health equity, and achieve the promises of the learning health system.