The seven components of trustworthy artificial intelligence for healthcare
Artificial intelligence (AI) is finding a foothold in healthcare, with nearly every new technology boasting some sort of machine learning (ML) component. The excitement around AI is palpable, and visions of a fully automated future are tempting. But there is still a long way to go before AI becomes sophisticated enough—and trustworthy enough—to truly support clinicians and patients with clinical care.
That’s because the current generation of artificial intelligence is neither wholly artificial nor, technically, intelligent. Humans are intimately involved in creating these tools, from designing the algorithms and selecting training data to using our fallible brain power to validate, display and utilize the outputs.
As a result, unintentional bias frequently creeps into AI algorithms. Untrustworthy results are simply not acceptable in the healthcare setting, where life-and-death decisions depend on having the right information for the right situation.
Fortunately, we are just at the beginning of the AI journey, and there remain ample opportunities to course correct and ensure that only the highest-quality, most trusted algorithms are put into real-world use.
To help ensure that the industry progresses in the right direction, the Coalition for Healthcare AI (CHAI) convened a number of key stakeholders to discuss what makes for good AI and how to ensure that developers and users are following best practices for algorithm development.
As part of the conversation, CHAI has identified seven top features to look for when creating a product or evaluating a solution for use in the real world.
1. Bias, equity and fairness
An algorithm should not perform differently for different types of patients. Training data should be equitable, inclusive, representative and diverse so that outcomes are not biased against any particular population.
For example, a tool designed to identify skin cancer should include training data from people with a variety of skin tones and ethnic backgrounds, since lesions appear differently on each skin type.
Algorithms should bake equity into their design by explicitly identifying and defining equity goals at the beginning of the development process.
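One common way to operationalize an equity goal is to compare a model's performance across patient subgroups before deployment. The sketch below is illustrative only, not a CHAI-specified method; the group labels, the use of accuracy as the metric, and the 0.05 tolerance are all assumptions made for the example.

```python
# Illustrative subgroup-parity check: flag any patient group whose
# accuracy lags overall accuracy by more than a chosen tolerance.
# Group names, records, and the tolerance are hypothetical.

def subgroup_accuracy(records):
    """records: list of (group, predicted, actual) tuples."""
    by_group = {}
    for group, pred, actual in records:
        hits, total = by_group.get(group, (0, 0))
        by_group[group] = (hits + (pred == actual), total + 1)
    return {g: hits / total for g, (hits, total) in by_group.items()}

def parity_gaps(records, tolerance=0.05):
    """Return groups whose accuracy falls below overall accuracy minus tolerance."""
    overall = sum(pred == actual for _, pred, actual in records) / len(records)
    per_group = subgroup_accuracy(records)
    return {g: acc for g, acc in per_group.items() if acc < overall - tolerance}
```

A check like this only surfaces a disparity; deciding what gap is acceptable, and what to do about it, is the equity-goal-setting work described above.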
2. Testability
CHAI defines “testability” as “the extent to which an ML algorithm’s performance can be verified as satisfactory.” Evaluators must consider the context in which the algorithm is being used, the maturity and design phase of the algorithm, and the criteria against which the algorithm will be judged.
Testing models under real-world conditions is essential for honestly assessing their value in the clinical context. Potential purchasers should ask detailed questions about any claims being made to understand how the models have been put through their paces.
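One way a purchaser can interrogate a performance claim is to restate it as explicit pass/fail thresholds and verify them against local data. The sketch below is a minimal illustration of that idea; the metrics chosen (sensitivity and specificity) and the threshold values are assumptions for the example, not CHAI criteria.

```python
# Minimal acceptance check: verify a model's measured sensitivity and
# specificity on local data against locally agreed thresholds.
# The counts and threshold values used here are hypothetical.

def sensitivity(true_pos, false_neg):
    """Share of actual positives the model catches."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of actual negatives the model correctly rules out."""
    return true_neg / (true_neg + false_pos)

def meets_criteria(true_pos, false_neg, true_neg, false_pos,
                   min_sens=0.90, min_spec=0.80):
    """True only if both metrics clear their thresholds."""
    return (sensitivity(true_pos, false_neg) >= min_sens
            and specificity(true_neg, false_pos) >= min_spec)
```

The useful part is not the arithmetic but the discipline: a vendor claim only counts once it has been re-measured in the purchaser's own context.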
3. Usability
Usability is a top concern for putting AI into clinical practice, especially as burnout rages among front-line clinicians. AI should make a clinician’s life easier, augmenting their unique skills and enabling the delivery of top-quality care.
Usability is in the eye of the beholder, however, and must be evaluated alongside the context of deployment, integration with other systems, and desired degree of active involvement from the user. Both patients and providers should be consulted on usability issues, CHAI recommends.
4. Safety
Safety is paramount in the clinical setting. AI must not contribute in any way to patient harm. When evaluating a model for safety, implementers should establish strong oversight and define relevant criteria that consider immediate safety events as well as downstream impacts.
Users should still have control over accepting or rejecting automated suggestions and should have insight into the logic behind what the algorithm is proposing. Tools should also include clear pathways for reporting safety concerns.
5. Transparency
Transparency is the key to avoiding bias and ensuring that a model provides explainable, trustworthy results throughout its lifecycle. Data provenance must be clearly traceable from beginning to end, and any changes should be tracked in a comprehensive manner. There should be a standardized process in place for data curation, CHAI says, with validated inclusion and exclusion criteria.
Users and patients should be fully aware of the limitations of the model and be able to understand the impact of using the tool in a manner suited to their health literacy level. Implementation guides for technical users, and simple written explanations for the non-technical, can help keep artificial intelligence use transparent, especially as machine learning blends into more and more technologies.
6. Reliability
AI algorithms aren’t always static, especially if they are designed to re-ingest results and use the data to continuously refine their outputs. Environments change and uses evolve as the technology matures. As a result, reliability over time is a crucial area of concern.
Periodic evaluation of the algorithm to ensure it still provides the desired service is crucial. Technology providers and implementers should work together to define metrics for reliability and monitor them closely throughout the tool’s lifecycle.
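A reliability metric of this kind can be as simple as windowed accuracy compared against the accuracy measured at go-live. The sketch below is one hypothetical way to implement that; the window size, baseline, and degradation margin are assumed values an implementing team would set for itself.

```python
# Illustrative reliability monitor: track outcomes over a rolling window
# and flag degradation when windowed accuracy drops more than a margin
# below the baseline measured at deployment. All parameters hypothetical.
from collections import deque

class ReliabilityMonitor:
    def __init__(self, baseline_accuracy, window=100, margin=0.05):
        self.baseline = baseline_accuracy
        self.margin = margin
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = incorrect

    def record(self, correct):
        """Log whether the latest prediction matched the observed outcome."""
        self.outcomes.append(1 if correct else 0)

    def degraded(self):
        """True once windowed accuracy falls below baseline minus margin."""
        if not self.outcomes:
            return False
        windowed = sum(self.outcomes) / len(self.outcomes)
        return windowed < self.baseline - self.margin
```

In practice a monitor like this would feed the centralized reporting and escalation process described below, rather than act on its own.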
7. Monitoring
Robust, continual monitoring is perhaps the most important thing healthcare organizations can do to make certain their AI is working as desired. All of the above factors must be monitored on an ongoing basis, and issues must be addressed quickly and comprehensively.
In larger health systems where the same tool is used in different settings or locations, leaders should implement a centralized reporting structure so that no single site unknowingly experiences poor results or veers away from best practices. The central entity should have standardized protocols defined and ready to deploy when something goes awry, and it should communicate its actions clearly to all other sites to keep users on the same page.
Future of artificial intelligence for healthcare
By investing in continual surveillance, identifying worrying issues quickly, and establishing the ability to address problems promptly, healthcare organizations can make the most of their AI tools and ensure that patients, providers, and the healthcare system as a whole can maximize the benefit of this promising category of digital innovation.
Jennifer Bresnick is a journalist and freelance content creator with a decade of experience in the health IT industry. Her work has focused on leveraging innovative technology tools to create value, improve health equity, and achieve the promises of the learning health system.