On the Acceptance, Adoption, and Utility of Synthetic Data for Healthcare Innovation

Master Thesis by Robin Daniël van Hoorn


Addressing the difficulty of accessing patient data is key to advancing healthcare innovation. Consequently, healthcare organizations like Philips and the Eindhoven Medtech
Innovation Center are exploring the potential of synthetic data. This artificial data, which
seeks to mimic the statistical properties of the original data, could be used to share privacy-sensitive datasets or enlarge datasets. Yet, existing research on synthetic data focuses mainly
on technical shortcomings, overlooking non-technical factors impacting the realization of its potential.
With the help of a diverse group of experts, this thesis qualitatively explores two perspectives, namely (1) the potential utility of synthetic data for machine learning-based healthcare
innovation and (2) the (non-technical) factors influencing the acceptance and adoption of
synthetic data for machine learning-based healthcare innovation. Through semi-structured
interviews, qualitative analysis methods, and a follow-up survey, two frameworks are created
that elucidate both perspectives.
The results show that experts perceive utility in synthetic data across the research,
development, and integration & deployment stages of machine learning-based healthcare
innovation. For each stage, the proposed utility framework presents several possible use
cases verified by the research participants. In addition, the results show that six factors
primarily influence the acceptance and adoption of synthetic data for machine learning-based healthcare innovation.
The resulting frameworks can facilitate more fruitful discussions about the acceptance,
adoption, and utilization of synthetic data for machine learning-based healthcare innovation by providing a structured overview of the most important facets and considerations
applicable to the process.

