(32) AI-Standardized Patient Reveals Practice Gaps After REMS-Aligned OUD Training

Friday, April 24, 2026

9:45 AM - 1:30 PM PT

Location: Harbor Foyer, Level 2

Presenter(s)

Boris Rozenfeld, MD

Chief Learning Officer
Xuron, Inc, New Jersey

Non-presenting author(s)

Aylin Madore, MD

VP, Curriculum Development
DBC Pri-Med, Massachusetts

Background & Introduction: Continuing education aligned with the FDA Opioid Analgesic Risk Evaluation and Mitigation Strategy (REMS) defines core content for clinicians involved in pain care, including recognizing opioid misuse, identifying opioid use disorder (OUD), counseling patients, and implementing safety planning. [1]

Despite expanded training availability, a persistent implementation gap remains in day-to-day practice. Clinicians often know recommended approaches but struggle to execute stigma-sensitive, patient-centered conversations that support accurate assessment, shared decision-making, and timely initiation of evidence-based care.

Traditional continuing medical education (CME) formats commonly improve knowledge, yet evidence shows mixed and inconsistent effects on clinician performance and downstream outcomes, particularly when behavior change depends on complex communication skills rather than discrete facts. [2]

AI-enabled standardized patients create an opportunity to measure applied performance directly, not just post-test knowledge, while giving learners a low-risk environment to practice and receive immediate feedback.

Aim: Evaluate changes in knowledge and confidence, and characterize observed step-level performance during an AI standardized patient encounter embedded in a REMS-aligned online CME/CE activity focused on screening, diagnosis, brief intervention, medication for OUD, and harm reduction planning. [1]

Methods:

Design: Single-arm, pre/post outcomes evaluation with performance assessment.

Intervention: A self-paced, internet-based CME/CE activity (0.5 credits) intended for primary care clinicians launched December 1, 2025. The learning pathway included a baseline pre-test, short micro-learning, an AI standardized patient encounter, then a post-test and evaluation.

Participants: 390 learners completed all required components from December 1, 2025 to January 5, 2026. Learners included physicians, nurse practitioners, physician assistants, and other health professionals practicing across outpatient, hospital, academic, and government settings.

Simulation: Learners interacted with "Jeremy Thompson," a 34-year-old warehouse supervisor from a rural community with chronic back pain who reported that his opioid medication was not lasting the full month. The encounter followed five sequenced SBIRT (Screening, Brief Intervention, and Referral to Treatment) steps aligned 1:1 to learning objectives: (1) rapport and history, (2) screening using TAPS-1 with required follow-up if positive, (3) DSM-5 diagnostic assessment, (4) brief intervention and medication for OUD discussion using motivational interviewing, and (5) follow-up planning and harm reduction. Learners received immediate step-level feedback.

Analysis: Descriptive paired pre/post comparisons (absolute percentage-point change for knowledge items, mean difference for confidence on a 1 to 5 scale). Performance reported as mean step scores (1 to 5).

Results: All 390 learners completed both pre- and post-activity assessments. Learners represented a broad range of roles and settings. Professions included physicians (MD/DO, 37.5%), nurse practitioners (34.8%), physician assistants (21.5%), and other health professionals (6.2%). Practice settings were primarily private or outpatient (54.6%), followed by community hospitals or medical centers (21.5%), academic medical centers (7.4%), and government settings (4.0%). Experience levels were well distributed, with 36.8% practicing more than 20 years and 34.1% practicing 1 to 10 years.

Baseline exposure to OUD care varied, with 59% reporting never or rarely screening for OUD prior to participation, and nearly 40% reporting that seeing patients with potential substance use concerns was not applicable to their practice.

Knowledge improved across all items. Composite correct responses increased from 58.4% pre-activity to 90.7% post-activity (+32.3 percentage points). Screening tool identification (TAPS-1) increased from 49.4% to 95.6% (+46.2 points). Recognition of the DSM-5 criteria threshold for mild OUD increased from 53.6% to 84.9% (+31.3 points). Selecting the appropriate next clinical step increased from 72.1% to 91.6% (+19.5 points).

Confidence increased across all three skills assessed. Mean confidence increased from 2.66 to 3.60 on the 1 to 5 scale (+0.94). High confidence (very or extremely confident, 4 to 5) increased from 8.0% to 37.1% (+29.1 percentage points). Item-level high-confidence changes were: screening for OUD 11.6% to 36.5%, diagnosing OUD 8.9% to 42.7%, and offering same-day buprenorphine 7.4% to 32.3%.
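The pre/post changes above are absolute differences, which can be recomputed from the reported aggregates. The following is a minimal sketch in Python, not the authors' analysis code; the function and variable names are illustrative assumptions, and only the aggregate values reported in this abstract are used (a true paired analysis would need learner-level data). It also shows the relative change for contrast, since a percentage-point gain is not the same as a percent increase.

```python
# Aggregate (pre, post) percent-correct values as reported in the Results.
# Names below are hypothetical labels, not identifiers from the study.
knowledge = {
    "composite": (58.4, 90.7),
    "taps1_identification": (49.4, 95.6),
    "dsm5_mild_oud_threshold": (53.6, 84.9),
    "next_clinical_step": (72.1, 91.6),
}


def point_change(pre: float, post: float) -> float:
    """Absolute change in percentage points (post minus pre)."""
    return round(post - pre, 1)


def relative_change(pre: float, post: float) -> float:
    """Relative change as a percent of the pre-activity value."""
    return round((post - pre) / pre * 100, 1)


# Percentage-point change for each knowledge item.
changes = {item: point_change(pre, post) for item, (pre, post) in knowledge.items()}

# Mean difference in confidence on the 1 to 5 scale.
confidence_diff = round(3.60 - 2.66, 2)

if __name__ == "__main__":
    for item, delta in changes.items():
        pre, post = knowledge[item]
        print(f"{item}: +{delta} points ({relative_change(pre, post)}% relative)")
    print(f"mean confidence: +{confidence_diff}")
```

For the composite score, the absolute gain is 32.3 percentage points, while the relative gain over the 58.4% baseline is larger, which is why the two should not be reported interchangeably.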

Observed simulation performance remained variable after one completion. Mean performance across steps was 2.8/5, with the highest score for DSM-5 diagnostic assessment (4.5/5) and moderate scores for rapport/history (3.3/5) and screening (3.3/5). Lower performance was observed for treatment discussion (2.7/5) and follow-up planning and harm reduction (2.0/5). AI-powered step-level transcript analysis across the entire cohort identified recurring gaps in collaborative motivational interviewing behaviors, addressing concerns about MOUD, and completion of naloxone education and concrete safety planning.

Learner Evaluation: Learners rated the simulation favorably (51.6% very or extremely valuable), and 35.6% reported being likely or very likely to offer same-day buprenorphine after completion.

Conclusion & Discussion: In a REMS-aligned CME/CE activity, an AI standardized patient was associated with substantial gains in knowledge and self-reported confidence, while also revealing persistent gaps in applied, conversation-dependent skills and harm reduction planning after a single completion. This pattern matters for how activities are evaluated: post-tests suggested readiness, yet observed performance showed where learners struggled when tasks required flexible communication, shared decision-making, and time management. By combining measurement of what learners know with observation of what they do, this approach addresses a known limitation of CME evaluation, namely that improvements in knowledge do not reliably translate into consistent performance in practice.

The step-level results point to high-yield targets for retraining and deliberate practice, including collaborative motivational interviewing behaviors, responding to common concerns about buprenorphine, and completing naloxone education and specific safety planning. Immediate feedback supports iterative skill development aligned to REMS educational intent.

Limitations include a single-arm design without a control group, self-selection of participants, a short follow-up window, and reliance on self-reported confidence. Future evaluation should examine repeated attempts, durability of gains, and associations with subsequent clinical behavior.

References: 1) US Food and Drug Administration. FDA Education Blueprint for Health Care Providers Involved in the Treatment and Monitoring of Patients with Pain. Published October 2023. Accessed January 2026.

2) Cervero RM, Gaines JK. The impact of CME on physician performance and patient health outcomes: an updated synthesis of systematic reviews. J Contin Educ Health Prof. 2015;35(2):131-138. doi:10.1002/chp.21290

Disclosure(s):

Boris Rozenfeld, MD: No financial relationships to disclose

Aylin Madore, MD: No financial relationships to disclose

Learning Objectives:

Upon completion, participants will be able to explain how an AI standardized patient can assess OUD counseling skills beyond post-test knowledge
Upon completion, participants will be able to interpret the relationship between knowledge gains, confidence gains, and observed performance after a single simulation
Upon completion, participants will be able to apply one step-level performance takeaway to strengthen an OUD training activity they run or support