AI chatbots fall short in health advice study
Participants misjudged health conditions using popular AI chatbots.

As healthcare costs rise and waiting lists grow, many people are turning to AI chatbots like ChatGPT for medical advice. However, a new Oxford-led study suggests that chatbots may not improve health decision-making and could even hinder it.
Participants using AI models such as GPT-4o, Cohere’s Command R+ and Meta’s Llama 3 often missed key health conditions or underestimated their severity.
Researchers found that users struggled to provide complete information to the chatbots and sometimes received confusing responses that mixed good and poor recommendations.
Participants performed no better than those using traditional methods like online searches or personal judgment. Experts caution that current chatbot evaluations fail to reflect the real-world complexity of human-AI interaction.
While tech giants like Apple, Amazon and Microsoft push AI-driven health tools, professionals remain wary of applying such technology to serious medical decisions. The American Medical Association advises against using chatbots for clinical decision-making.