Federal prosecutors allege that a Florida physician and research staff fabricated clinical trial records that were submitted into database systems used to evaluate investigational drugs.
Across six experiments—including a blinded, real-world ER evaluation—an OpenAI large language model outperformed physician baselines on multiple clinical reasoning tasks, though not on key safety endpoints such as cannot-miss diagnoses
A VHA study across 11 vendors finds AI-generated primary care notes score lower than clinician-written notes, with the largest deficits in thoroughness, organization, and usefulness