Monthly Writings

Evaluations and reviews of the latest in the field.

When Your AI is Confidently Wrong

After reading this article, you will be able to develop practical steps to address AI uncertainty.

SUMMARY:

  • Clinical uncertainty is a major challenge, especially with AI responses

  • AI algorithms are designed to respond with confidence, even when the evidence is conflicting.

  • AI tools do not disclose the degree of response certainty (or uncertainty)

  • Clinicians need to be aware of this limitation, especially as the clinical decision-making risk increases.


Certainty vs Uncertainty

Uncertainty:

  • Occurs when information is missing, vague, or ambiguous.

  • Arises when the reliable information needed to assess the situation, weigh the options, and make an informed decision is lacking.

Certainty:

  • Is NOT the absence of all unknowns

  • Is having enough information to assess the situation and make predictions with high reliability

Absolute Uncertainty & Total Certainty are Never Attainable

The “3S” of Healthcare Innovation

  • The 3S is a continuum of challenges that must be addressed to enable innovation within healthcare

  • It is unpredictable and dynamic

  • Spread: Approaches for communicating an innovation's implementation to a new setting

  • Sustainability: Implementation is a time-limited event; once an innovation is implemented, sustaining it within a clinical setting is continual, with no defined timeframe

  • Scale-Up: Innovation expanded to reach all potential beneficiaries of the innovation

There is no standardized approach to integrating a 3S innovation.

The Nonadoption, Abandonment, Scale-Up, Spread and Sustainability Model (NASSS)

  • Identifies factors influencing implementation success in digital health

  • 6 domains

  • The level of challenge in each domain may range from:

    • Simple: straightforward, predictable, with few components

    • Complicated: multiple interactive components

    • Complex: dynamic, unpredictable, interrelated components

  • NASSS Components

    • Condition: Only a small percentage of clinical conditions are low-risk or predictable enough to be suitable for technology

    • Technology:

      • The necessary inputs & outputs

      • Many models are less than optimally developed (features, size, usability)

      • Evaluate dependability: data accuracy, transparency of recommendations, interpretability

      • Knowledge needed to use the system, and how easy it is to use

      • Customization of the system for specific uses

    • Value: What is the value provided to users and health system?

      • Is it designed around the way the intended user works?

    • Organizational:

      • System capacity and readiness for uptake and scale-up

      • Budget availability

      • Leadership support

      • Support for workflow changes

    • Adopter System (Staff, Patient, Caregiver)

      • Staff Concern: Scope of practice and patient safety

      • Patient/Caregiver Concern: Training and knowledge to adequately use the system

    • Sociocultural: Health policy, fiscal policy, legal and regulatory policies

Artificial Intelligence (AI) Challenges for Certainty

  • AI Large Language Models (LLMs) generate their outputs based on statistical probability.

  • AI-LLMs do not know what they don’t know

  • Despite conflicting evidence, a confident response may be provided without disclosing the model's inability to furnish a valid prediction.
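
A toy sketch may help show why a probability-based system can sound equally sure in very different situations. It is not how any particular LLM works internally; the candidate answers, the scores, and the function names are all hypothetical. The point is only that choosing the most probable option always returns an answer, while the spread of the probabilities (summarized here as normalized entropy) is what would reveal the uncertainty.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Return a value near 0 for a one-sided distribution and 1 for a flat one."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))

def pick_answer(options, scores):
    """Return the top option, its probability, and the uncertainty that is usually hidden."""
    probs = softmax(scores)
    best = max(range(len(options)), key=lambda i: probs[i])
    return options[best], probs[best], normalized_entropy(probs)

# Hypothetical candidate answers and scores, for illustration only.
options = ["Diagnosis A", "Diagnosis B", "Diagnosis C"]
clear_case = pick_answer(options, [6.0, 1.0, 0.5])       # evidence strongly favors A
conflicted_case = pick_answer(options, [2.1, 2.0, 1.9])  # evidence is nearly tied
```

Both cases return "Diagnosis A". Only the entropy value separates the clear case from the conflicted one, and that is the kind of information a tool would need to surface to the clinician.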

PRACTICAL STEPS

1. Lack of Clinical Evidence

  • Most health systems struggle to develop proper evaluation & monitoring of AI algorithms.

  • Evaluation processes typically focus on safety & process compliance rather than effectiveness

  • SOLUTION:

    • Focus on how the innovation will be used

    • Develop clinical workflows

    • Understand the underlying technology and how it uses data – algorithm transparency

2. Criteria for Uncertainty

  • How should AI tools indicate uncertain outputs?

    • When critical information is missing

    • When there is conflicting evidence

    • When the recommendations are of low confidence

  • Greater clinical instability, acuity, and severity warrant greater clinician involvement

  • Avoid:

    • Checkbox-driven documentation

    • Automation bias: Trusting and following automated outputs without evaluation

  • SOLUTION:

    • Define Human in the Loop pathways for clinical decision risk levels of:

      • Low to Moderate

      • Moderate

      • High

      • Very High
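
The criteria and risk levels above could be sketched in code roughly as follows. This is a minimal illustration under stated assumptions, not a validated clinical rule set: it assumes the tool exposes a confidence value (which, as noted earlier, many do not), and the `AIOutput` fields, the 0.80 threshold, the pathway descriptions, and the "escalate one level when flagged" rule are all hypothetical placeholders that each health system would define for itself.

```python
from dataclasses import dataclass, field
from enum import Enum

class RiskLevel(Enum):
    """Clinical decision risk levels, as listed in the text."""
    LOW_TO_MODERATE = 1
    MODERATE = 2
    HIGH = 3
    VERY_HIGH = 4

@dataclass
class AIOutput:
    recommendation: str
    confidence: float  # 0.0 to 1.0; assumes the tool reports one
    missing_inputs: list = field(default_factory=list)
    conflicting_evidence: bool = False

CONFIDENCE_FLOOR = 0.80  # hypothetical threshold, to be set locally

def uncertainty_flags(output):
    """Return the reasons, if any, that an output should be marked uncertain."""
    flags = []
    if output.missing_inputs:
        flags.append("critical information missing: " + ", ".join(output.missing_inputs))
    if output.conflicting_evidence:
        flags.append("conflicting evidence")
    if output.confidence < CONFIDENCE_FLOOR:
        flags.append("low confidence")
    return flags

# Hypothetical Human in the Loop pathways, one per risk level.
PATHWAYS = {
    RiskLevel.LOW_TO_MODERATE: "clinician spot-check",
    RiskLevel.MODERATE: "clinician review before action",
    RiskLevel.HIGH: "clinician review and sign-off before action",
    RiskLevel.VERY_HIGH: "clinician decides; AI output is advisory only",
}

def review_pathway(risk, output):
    """Pick the pathway for the risk level; any uncertainty flag escalates one level."""
    flags = uncertainty_flags(output)
    level = risk.value
    if flags:
        level = min(level + 1, RiskLevel.VERY_HIGH.value)
    return PATHWAYS[RiskLevel(level)], flags
```

Returning the flags alongside the pathway keeps the reason for escalation visible to the reviewer, which works against both checkbox-driven documentation and automation bias.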

3. Documentation

  • Clinician(s) remain the final checkpoint and maintain accountability

  • Mandatory review before signature

  • SOLUTION:

    • Documentation should include:

      • Final decision based on full patient context

      • How the AI recommendation impacted your thinking and why         
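
One way to make "mandatory review before signature" concrete is to have the documentation record refuse a signature until the review is recorded and both documentation items are filled in. The sketch below is a hypothetical illustration, not an EHR integration; the `AIReviewNote` class and its field names are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class AIReviewNote:
    """Hypothetical record of a clinician's review of one AI recommendation."""
    clinician: str
    ai_recommendation: str
    final_decision: str   # final decision based on full patient context
    ai_influence: str     # how the AI recommendation impacted the clinician's thinking, and why
    reviewed: bool = False
    signed_at: Optional[datetime] = None

    def sign(self):
        """Sign the note only after review is recorded and required fields are present."""
        if not self.reviewed:
            raise ValueError("review the AI output before signing")
        for name in ("final_decision", "ai_influence"):
            if not getattr(self, name).strip():
                raise ValueError(f"missing documentation field: {name}")
        self.signed_at = datetime.now(timezone.utc)
        return self
```

The clinician stays the final checkpoint here: the code only blocks an incomplete signature, it never makes or alters the decision.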

4. Patient Informed Consent

  • SOLUTION:

    • Communicate with patient

    • Document informed consent

    • Discuss alternative options

CONCLUSIONS:

  • AI algorithms are not designed to provide “I don’t know” responses.

  • Clear guidelines should define when and how a clinician evaluates confidence in AI outputs.

  • Processes should be developed to define:

    • When outputs are reviewed

    • Who reviews them

    • How quickly the review occurs

    • What documentation needs to occur

    • The degree of uncertainty

AI algorithms are designed to provide confident responses, even when the output is uncertain.

AI responses should provide a level of certainty based on the complete patient context, available data, and existing evidence.

AI and clinician roles may vary based on the level of clinical decision-making risk involved:

  • Greater AI role in low-risk situations

  • Greater human role in higher-risk situations

A risk stratification process has not yet been developed or standardized.

These processes should be developed based on each health system’s needs and goals.


Erkan Hassan