Brittany Trang, 2025-05-12 18:08:00
OpenAI on Monday released a large dataset for evaluating how well large language models answer questions related to health care. Experts lauded the open-source data and detailed evaluation rubrics, calling them “unprecedented” in scale and breadth.
The project, HealthBench, marks OpenAI’s first foray into health care applications of AI, outside of external partnerships.
“Our mission as OpenAI is to ensure AGI is beneficial to humanity,” said Karan Singhal, who leads OpenAI’s health AI team, referring to OpenAI’s goal of developing artificial general intelligence. “One part of that is building and deploying technology. Another part of it is ensuring that positive applications like health care have a place to flourish and that we do the right work to ensure that the models are safe and reliable in these settings,” he said.