A Scoping Review of the Ethical, Legal, and Technical Dimensions of Privacy in Big Data Health Research

Reem Menwer Owaid Alrashdi (1), Reem Munawar Awad Al-Rashdi (2), Salihah Abdullah Saeed Alghamdi (3), Khuluod Ali Mohammed Rezgallah (3), Abdulaziz Ali Abdulaziz Alghaythar (4), Faisal Fahad Mohammed Alshammari (5), Abdullah Jaber Eissa Faqihi (6), Dhaifallah Mohammed Dhaifallah Moraya (7), Ahlam Abdullah Ibrahim Aqeel (8), Muath Mohammed Dhaifallah Moraya (8), Khloud Masead Dhaif Allah Al-Mutairi (9), Nasser Nashi Alshaibani (10), Khaled Ibrahim Muhammad Mobaraki (11), Mohammed Saleh Abdulkareem Al Juma (12), Sarah Ahmed Arif (8)
(1) Medical Records Technician, Saudi Health Center, Ministry of Health, Saudi Arabia
(2) The First Health Cluster in Riyadh, Ministry of Health, Saudi Arabia
(3) Al Imam Abdul Rahman Al Faisal Hospital, Ministry of Health, Saudi Arabia
(4) Primary Health Care Center in Aldar Albedh2, Ministry of Health, Saudi Arabia
(5) Eradah Complex and Mental Health-Hail, Ministry of Health, Saudi Arabia
(6) Javan Eradh and Mental Health Hospital, Ministry of Health, Saudi Arabia
(7) Jazan Mental Health Hospital, Ministry of Health, Saudi Arabia
(8) Ministry of Health, Saudi Arabia
(9) Specialized Dental Center in Riyadh, Ministry of Health, Saudi Arabia
(10) Sanitah General Hospital, Ministry of Health, Saudi Arabia
(11) National Guard Health Affairs, Ministry of Health, Saudi Arabia
(12) Mohammed Bin Abdulaziz Hospital, Ministry of Health, Saudi Arabia

Abstract

Background: The proliferation of big data in health research (genomic datasets, electronic health records (EHRs), wearables, and multi-omics) offers unprecedented potential for scientific discovery and personalized medicine. However, this data-driven paradigm poses profound and novel challenges to individual privacy that demand an integrated analysis of ethical, legal, and technical safeguards.

Aim: This scoping review synthesizes contemporary literature (2015-2024) to map the ethical dilemmas, legal frameworks, and technical solutions concerning privacy in big data health research.

Methods: A systematic search was conducted across PubMed, IEEE Xplore, Scopus, and Google Scholar. The literature was analyzed thematically to identify key themes, tensions, and emergent strategies across the three dimensions.

Results: The review identifies a core tension between data utility for the public good and individual privacy rights. Ethically, key issues include re-identification risk, informed consent for future unspecified research, and algorithmic bias. Legally, the global landscape is fragmented: regulations such as the GDPR provide strong protections but create compliance complexity. Technically, privacy-enhancing technologies (PETs) such as federated learning, differential privacy, and homomorphic encryption offer promising, yet imperfect, solutions.

Conclusion: Effective privacy preservation in big data health research requires a harmonized, interdisciplinary approach. A robust governance framework must interweave ethical principles, adaptable legal compliance, and state-of-the-art technical controls, fostering public trust while enabling responsible innovation.

Primary contact: Reem Menwer Owaid Alrashdi (ralrashd@moh.gov.sa)

Alrashdi, R. M. O., Reem Munawar Awad Al-Rashdi, Salihah Abdullah Saeed Alghamdi, Khuluod Ali Mohammed Rezgallah, Abdulaziz Ali Abdulaziz Alghaythar, Faisal Fahad Mohammed Alshammari, … Sarah Ahmed Arif. (2024). A Scoping Review of the Ethical, Legal, and Technical Dimensions of Privacy in Big Data Health Research. Saudi Journal of Medicine and Public Health, 1(2), 1701–1708. https://doi.org/10.64483/202412467
