A study compared six humans with OpenAI's GPT-4 and Anthropic's Claude3-Opus on answering medical questions. The humans outperformed both models, and GPT-4 performed worse than Claude3-Opus. The questions, some 105,000 in all, were generated from a Knowledge Graph built by Kahun from peer-reviewed sources. Both models handled semantic questions better than numerical ones. While LLMs can answer some medical questions, they are not yet reliable enough to assist physicians. Kahun aims to build more transparent AI that incorporates verified sources, since doctors want to understand the basis of AI recommendations.