Experimenting with regulations: The role of regulatory sandboxes in governing AI for radiology
Irvine Sihlahla
The landscape of artificial intelligence (AI) and advanced data analytics is evolving rapidly, and that is a welcome development: it brings innovations, techniques, and algorithms that improve radiologists’ efficiency and accuracy. The extensive use of big data in training AI algorithms has also narrowed the timeframe from design to algorithm certification. Yet many promising new applications fail to reach clinical practice quickly. Even when applications demonstrate immense potential during the design and development phase, there is a critical rate-determining step between the development of Software as a Medical Device (SaMD; the regulatory classification of AI tools for radiology) and eventual clinical deployment: regulatory approval by bodies such as the US Food and Drug Administration (FDA) and the regulatory authorities of the European Union.
One factor contributing to the regulatory lag is the absence of established technical metrics and standards for SaMD certification within existing regulatory policies. In similar cases of scientific uncertainty or a lack of robust empirical evidence to guide policy, policymakers have frequently relied on the precautionary principle. The precautionary principle dictates that protective action be taken against potential serious environmental or human health hazards, even if scientific evidence is not fully established, resonating with core bioethical principles widely applied in healthcare to safeguard patient safety. The precautionary principle has been applied in many major international legal instruments and treaties, and some of its components are reflected in the recent EU AI Act. This act is the first enforceable law to require the establishment of regulatory sandboxes for high-risk sectors, including healthcare AI applications.
Regulatory sandboxes have emerged as a compelling mechanism to bridge this regulatory gap: they allow regulators to uphold high standards without stifling innovation.
What is a regulatory sandbox?
A regulatory sandbox is a supervised testing environment where AI developers can experiment with new technologies under temporary, flexible regulatory guidance. The Organisation for Economic Co-operation and Development describes regulatory sandboxes as mechanisms that allow regulators to “test innovative products, services or business models in a controlled environment under a regulator’s oversight” while managing associated risks. This reflects the practical implementation of the precautionary principle: regulatory sandboxes reduce regulatory delays by striking a delicate balance between enabling innovative algorithms to flourish and maintaining adequate safeguards for safety, reliability, and accuracy.
Regulatory sandboxes can be further envisioned within a radiology practice as data hubs that bring together multidisciplinary personnel, such as radiologists, regulatory agencies, policymakers, data scientists, IRBs, and AI developers. They allow collaborative multidisciplinary teams to identify previously undiscovered flaws and limitations in AI models. They are an enabling testing environment in which root cause analysis of near misses, limitations, and safety issues can be conducted without ascribing liability to any individual AI actor. This setup allows the multidisciplinary team to deploy an AI model alongside daily clinical practice and assess its accuracy, reliability, and safety. These valuable clinical metrics, which serve as real-world evidence of the generalizability, accuracy, robustness, and safety profile of AI models in radiology, can be gathered through periodic algorithmic audits and inform the regulatory agency’s decision on granting certification to the SaMD.
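The periodic algorithmic audits described above can be sketched in code as a simple routine that turns one audit period’s predictions and ground-truth labels into clinical performance metrics. This is an illustrative sketch only: the metric choices and the function name are assumptions for this example, not part of any regulator’s specification.

```python
# Illustrative sketch of one periodic algorithmic audit inside a sandbox.
# Metric choices are hypothetical assumptions, not a regulatory standard.

def audit_window(labels, predictions):
    """Compute basic clinical metrics for one audit period.

    labels, predictions: sequences of 0/1 (0 = negative, 1 = positive).
    """
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {"sensitivity": sensitivity, "specificity": specificity}

# Example: one audit period with ground-truth labels and model outputs.
metrics = audit_window(labels=[1, 1, 0, 0, 1, 0],
                       predictions=[1, 0, 0, 0, 1, 1])
```

Run over successive audit periods, such a routine yields the longitudinal performance record that a multidisciplinary sandbox team could review and a regulator could weigh when deciding on certification.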
Regulatory sandboxes as a path to regulatory approval
Regulatory sandboxes serve as a vital bridge for assessing the real-world performance of AI algorithms. Prospective and retrospective research on AI algorithms is typically conducted in tightly controlled environments, thereby enhancing a study’s internal validity. However, these controlled environments differ from uncontrolled real-world settings in subtle ways, leaving unanswered questions about how AI models will perform and integrate into existing radiology workflows. Regulatory sandboxes enable the real-world deployment of SaMD, allowing AI developers to assess the generalizability of their models and measure clinical performance. Similarly, regulators are afforded a mechanism to fully understand the safety profiles of SaMD when deployed in real-world settings and to assess the adequacy of human-in-the-loop and explainability of SaMD. Thus, testing in a regulatory sandbox provides empirical, real-world evidence to both AI developers and regulatory agencies, and facilitates certification and clinical deployment.
Beyond evaluating quantitative performance metrics, the use of a regulatory sandbox ensures that AI algorithms deployed in radiology align with the core ethical values of the radiology profession, as outlined in a recent multi-society statement: respect for human dignity, privacy, transparency, explainability, and accountability. Specifically, ethical and social impact assessments can be conducted within regulatory sandboxes to understand the impact of SaMD use on patients. For instance, algorithmic bias has been a major ethical concern. Within a regulatory sandbox, algorithmic bias can be identified and mitigated by a multidisciplinary team that includes bioethicists and patient advocates, who can assess the ethical and social impacts of SaMD on patients. This strengthens patient-centred healthcare delivery and enhances public trust in the use of algorithms in healthcare.
Importantly, ethical and social impact assessments aim to ensure that the profile of the SaMD remains within acceptable regulatory thresholds. When those thresholds are exceeded, the regulatory agency can use these real-world assessments to deny certification of the SaMD. Alternatively, the regulatory agency may approve the SaMD but impose stringent post-market surveillance measures to protect healthcare users from unnecessary harms associated with the SaMD.
The post-regulatory sandboxes path
Once an algorithm has been thoroughly evaluated in regulatory sandboxes using real-world data and the regulatory team is satisfied that its use in clinical practice guarantees safety, accuracy, and reliability, the SaMD can be deployed widely. If there are lingering concerns about the algorithm, the regulator can request that mitigating measures be put in place. Common post-market requirements include Predetermined Change Control Plans (PCCPs), which are recommended by the International Medical Device Regulators Forum (IMDRF), the umbrella body that includes the FDA and the EU medical device regulatory agencies. A PCCP is a pre-specified plan that notifies the regulatory agency of anticipated updates to the SaMD, including retraining of SaMD with continuous-learning capabilities to improve its accuracy, reliability, and generalizability. Data from radiology regulatory sandboxes are vital for this process, as they enable AI developers to plan future software upgrades in line with real-world assessments.
AI algorithms can also experience data/concept drift, leading to a degradation in predictive power over time. Guided by empirical data from regulatory sandboxes, the regulatory agency can request continuous algorithmic audits during the post-market surveillance phase to help identify concept drift. This ensures that algorithms are monitored throughout their deployment and that safety, accuracy, and reliability standards are upheld. Hence, by providing a fertile ground for SaMD model assessment and certification, regulatory sandboxes support policy-making decisions and chart a route for the safe deployment of AI algorithms in radiology.
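The continuous post-market audits described above can be sketched as a rolling check that flags a model for review when its recent accuracy falls a fixed margin below the baseline established during sandbox evaluation. The class name, window size, and tolerance below are illustrative assumptions, not a regulatory standard, and real drift monitoring would typically track richer metrics than raw accuracy.

```python
# Hypothetical sketch of post-market drift monitoring: flag the model
# when rolling accuracy drops below the sandbox-established baseline by
# more than a set tolerance. All parameter values are illustrative.

from collections import deque


class DriftMonitor:
    def __init__(self, baseline_accuracy, window=100, tolerance=0.05):
        self.baseline = baseline_accuracy          # accuracy measured in the sandbox
        self.tolerance = tolerance                 # allowed degradation margin
        self.outcomes = deque(maxlen=window)       # 1 = correct, 0 = incorrect

    def record(self, prediction, label):
        """Log one prediction against its eventual ground-truth label."""
        self.outcomes.append(1 if prediction == label else 0)

    def drift_detected(self):
        """True when rolling accuracy falls below baseline - tolerance."""
        if not self.outcomes:
            return False
        rolling = sum(self.outcomes) / len(self.outcomes)
        return rolling < self.baseline - self.tolerance


# Example: a model certified at 90% accuracy, monitored over a short window.
monitor = DriftMonitor(baseline_accuracy=0.90, window=10)
for prediction, label in [(1, 1), (0, 0), (1, 0), (1, 0), (0, 1)]:
    monitor.record(prediction, label)
```

A flag from such a monitor would not itself prove concept drift; it would trigger the kind of algorithmic audit and root cause analysis that the sandbox framework makes routine.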
Practical limitations of regulatory sandboxes
Although regulatory sandboxes offer an unparalleled opportunity for radiology to lead discussions on the regulation of AI-based SaMD in healthcare, they face challenges in their application and development. Chief among them is deciding who is liable and accountable for any patient harm that may result from the use of regulatory sandboxes. While the EU proposes that AI developers be liable when SaMDs are evaluated in regulatory sandboxes, there is currently no universal liability framework for AI-related harm. Additionally, regulatory sandboxes require skilled personnel and financial resources to set up and manage. This may limit participation in these data hubs to well-resourced AI developers and exclude small start-up firms.
Conclusions
By allowing for real-world assessment of SaMD models before widespread deployment, radiology-based regulatory sandboxes map and direct future policy on SaMD use in radiology. This empowers radiologists to minimise harm to patients and limit liability related to AI use by allowing only certified, thoroughly tested algorithms to be deployed in clinical practice. As radiology is among the leading specialities where AI algorithms for healthcare are deployed, we are at the forefront of deciding how these regulatory sandboxes can be created, managed and improved. We are in an exciting phase of AI use in radiology, and regulatory sandboxes, despite their limitations, offer an ingenious way to facilitate the certification of SaMD without compromising patient safety, reliability, or accuracy.
Irvine Sihlahla is a specialist radiologist and legal scholar with a keen interest in Bioethics and Health Law. He is currently involved in doctoral research at the University of KwaZulu-Natal, South Africa, related to artificial intelligence in health care. He serves as a member of the Trainee Editorial Board for Radiology: Artificial Intelligence.


