Study shows ChatGPT tends to overprescribe in emergencies
Researchers uncover ChatGPT’s limitations in emergency care.
A study by researchers at UC San Francisco found that ChatGPT, when applied to emergency care, often recommends unnecessary interventions such as X-rays and antibiotics, and suggests admitting patients who don’t require hospitalisation. Despite its strengths in certain areas, the AI model struggles to match the accuracy of a human doctor in more complex decision-making.
Researchers found that while ChatGPT excels at simpler tasks, such as judging which of two patients is sicker, it tends to overprescribe when faced with real emergency cases: the GPT-4 version of ChatGPT performed 8% worse than resident doctors, while the GPT-3.5 version was 24% less accurate. Such overprescribing could expose patients to unneeded treatments, raise healthcare costs, and strain hospital resources.
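To make the accuracy comparison concrete, here is a minimal Python sketch of how model recommendations could be scored against physician decisions: each case gets a yes/no call on admission, imaging, and antibiotics, and the model is marked correct wherever it agrees with the physician. The CaseDecision class, field names, and example cases below are hypothetical illustrations, not the study’s actual data or code.

```python
from dataclasses import dataclass

@dataclass
class CaseDecision:
    admit: bool        # admit to hospital?
    radiology: bool    # order imaging (e.g. an X-ray)?
    antibiotics: bool  # prescribe antibiotics?

def accuracy(model: list[CaseDecision], physician: list[CaseDecision]) -> dict[str, float]:
    """Fraction of cases where the model matches the physician, per decision type."""
    n = len(physician)
    return {
        field: sum(getattr(m, field) == getattr(p, field)
                   for m, p in zip(model, physician)) / n
        for field in ("admit", "radiology", "antibiotics")
    }

# Hypothetical example: the model over-recommends on two of three cases.
physician_calls = [CaseDecision(False, False, False),
                   CaseDecision(True,  True,  False),
                   CaseDecision(False, True,  False)]
model_calls     = [CaseDecision(True,  True,  False),  # over-admits, over-images
                   CaseDecision(True,  True,  True),   # over-prescribes antibiotics
                   CaseDecision(False, True,  False)]  # matches the physician

print(accuracy(model_calls, physician_calls))
# {'admit': 0.667, 'radiology': 0.667, 'antibiotics': 0.667} (approximately)
```

Under this kind of scoring, a model that defaults to “order the test, give the antibiotic, admit the patient” loses accuracy on exactly the cases where the physician correctly withheld an intervention, which is the overprescription pattern the study reports.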
The research highlighted that the models’ cautious behaviour reflects their training on internet text, where advising readers to seek medical care is usually the safe default. Although that bias suits general-purpose use, it is problematic in emergency settings, where unneeded treatments can harm patients. More refined frameworks are needed before AI can reliably assist in emergency departments.
Researchers are now working on better ways for AI models to evaluate clinical information in emergency care. The challenge is to strike a balance between missing a serious condition and ordering excessive, potentially harmful interventions.