AI’s Next Act: New Medicines

Doug Selinger, 2025-05-09

Bringing a new drug to market is staggeringly inefficient: about 90% of drug candidates fail in clinical trials, development timelines run 10-15 years, and costs can exceed $2 billion. It’s hard to think of an endeavor more in need of a boost from AI, and the tech industry, heady from recent advances, is diving in.

But will what got us here get us there?

History teaches us that the right equation at the right time can change everything. Einstein’s E=mc² helped usher in the nuclear age. Neural networks, with enough compute capacity and training data, ignited the current AI explosion. And in the late 1990s, when it was hard to find anything on the web, Sergey Brin and Larry Page invented the PageRank algorithm that made Google (now Alphabet) one of the most valuable companies in the world.

PageRank and other so-called “centrality algorithms” may not be done transforming the world just yet. In fact, they may be the key to the next AI-driven drug discovery breakthrough.

When applied to websites, centrality algorithms identify which pages are most linked-to and therefore most relevant to a query. When applied to biomedical data, they can identify the most linked-to answers to scientific questions, highlighting which findings have the strongest experimental support. Crucially, centrality algorithms can be applied to relatively raw data, including the massive data sets generated by modern high-throughput approaches, so they can connect dots that have never been connected before, between data points spread across myriad databases and other data sources. New connections can mean novel discoveries. And multi-agent AI systems are extending these capabilities even further.
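To make the idea concrete, here is a minimal sketch of a centrality calculation: a from-scratch PageRank (power iteration) run over a tiny, invented "evidence graph" in which data points link to the genes and diseases they support. The node names and edges are hypothetical, chosen purely for illustration; a real analysis would run over graphs with billions of entries.

```python
# Minimal PageRank via power iteration over a toy "evidence graph".
# All node names and edges below are hypothetical.

def pagerank(adjacency, damping=0.85, iterations=100, tol=1e-9):
    """Compute PageRank scores for a directed graph given as {node: [linked nodes]}."""
    nodes = list(adjacency)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        new_scores = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in adjacency.items():
            if targets:
                share = damping * scores[node] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:  # dangling node: redistribute its score evenly
                share = damping * scores[node] / n
                for other in nodes:
                    new_scores[other] += share
        if max(abs(new_scores[node] - scores[node]) for node in nodes) < tol:
            scores = new_scores
            break
        scores = new_scores
    return scores

# Edges point from a piece of evidence to the entity it supports.
evidence_graph = {
    "screen_hit_A": ["GENE_X"],
    "screen_hit_B": ["GENE_X", "GENE_Y"],
    "expression_1": ["GENE_X"],
    "GENE_X":       ["disease_Z"],
    "GENE_Y":       ["disease_Z"],
    "disease_Z":    [],
}

for node, score in sorted(pagerank(evidence_graph).items(), key=lambda kv: -kv[1]):
    print(f"{node:14s}{score:.3f}")
```

In this toy example, the entities with the most converging lines of evidence (GENE_X and disease_Z) rise to the top, which is the same principle that surfaces well-supported findings in much larger biomedical graphs.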

Lots of data, too few insights

By design, scientific publications tell stories, and only a handful of stories can fit into each paper. So modern studies, with their accompanying massive data sets, leave thousands or even millions of stories untold. And when studies are combined, the number of untold stories grows further, perhaps exponentially.

This is at once a tragedy and a massive opportunity. Some of these stories may hold new strategies for curing cancers or rare diseases, or for countering serious public health threats. And we’re missing them simply because we’re not able to use the data that’s already in our virtual hands.

A quick back-of-the-envelope calculation gives a sense of how much data we’re talking about: A 2022 survey found approximately 6,000 publicly available biological databases. One of these databases, the Gene Expression Omnibus (GEO), a public repository hosted by the NCBI, currently holds close to 8 million samples. If we assume each sample has about 10,000 measurements (half of the 20,000 or so genes in the human genome), we get about 80 billion measurements. If we then take GEO as roughly representative and multiply through by 6,000 databases, we arrive at about 500 trillion total data points. That’s without counting chemistry databases, proprietary data sources, or the large-scale data sets that haven’t been deposited in central databases. Whatever the true number is, there’s no doubt that it’s large and it’s growing fast.
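The arithmetic is easy to reproduce; the figures below simply restate the article’s rough assumptions:

```python
# Reproducing the back-of-the-envelope estimate. The per-sample measurement
# count and the assumption that GEO is roughly typical of the ~6,000 public
# databases are the same simplifications made in the text.

samples_in_geo = 8_000_000           # ~8 million samples in GEO
measurements_per_sample = 10_000     # ~half of the ~20,000 human genes
public_databases = 6_000             # databases counted in the 2022 survey

geo_measurements = samples_in_geo * measurements_per_sample
total_estimate = geo_measurements * public_databases

print(f"GEO alone:       ~{geo_measurements:.0e} measurements")  # ~8e+10 (80 billion)
print(f"All public data: ~{total_estimate:.0e} data points")     # ~5e+14 (~500 trillion)
```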

The opportunity

Effective utilization of such a treasure trove of data could dramatically boost the ability of AI approaches to deliver meaningful biomedical advances. For example, by combining centrality algorithms with a construct called a “focal graph,” AI agents can leverage this data to deliver experimentally backed findings from traceable sources. Moreover, when combined with large language models (LLMs) such as OpenAI’s ChatGPT or Anthropic’s Claude, focal graph-based approaches can run autonomously, generating insights into the drivers of disease and potentially revealing new ways to treat them (a simplified sketch of the idea appears below).

In this time of breathtaking AI progress, there’s plenty of talk about a slide from the “Peak of Inflated Expectations” into the “Trough of Disillusionment” of the Gartner hype cycle. Such pronouncements are understandable, but almost certainly premature. In fact, we may be on the eve of the next breakthrough: a new combination of “old” algorithms that promises to radically accelerate the discovery and development of new medicines. Such an advance is sorely needed, and by utilizing the full breadth of available tools and data, it may finally be within reach.
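As a concrete, highly simplified illustration of the query-anchored centrality idea described above, the sketch below extracts the local neighborhood of a query entity and ranks its contents with personalized PageRank, so that scores reflect support relative to the question being asked. The use of networkx, a two-hop radius, and personalized PageRank are assumptions made for illustration only, not a description of any particular platform’s focal-graph implementation.

```python
# Hedged sketch: rank entities around a query node by centrality within its
# local subgraph. Entity names are invented for demonstration.
import networkx as nx

def focal_centrality(graph: nx.Graph, query_node: str, radius: int = 2) -> dict:
    """Rank entities near `query_node` by centrality within its local neighborhood."""
    neighborhood = nx.ego_graph(graph, query_node, radius=radius)
    # Personalized PageRank: random walks restart at the query node.
    return nx.pagerank(neighborhood, personalization={query_node: 1.0})

# Toy usage with invented entities.
g = nx.Graph()
g.add_edges_from([
    ("compound_123", "GENE_X"), ("compound_123", "GENE_Y"),
    ("GENE_X", "disease_Z"), ("GENE_Y", "disease_Z"),
    ("GENE_X", "pathway_P"), ("pathway_P", "disease_Z"),
])
for node, score in sorted(focal_centrality(g, "compound_123").items(), key=lambda kv: -kv[1]):
    print(f"{node:14s}{score:.3f}")
```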

Photo: MF3d, Getty Images


As an early pioneer of microarray technology, Doug Selinger authored some of the first publications describing experimental and computational approaches for large-scale transcriptional analyses. After completing his Ph.D. in George Church’s lab at Harvard, he joined the Novartis Institutes for Biomedical Research, where his 14-year career spanned the entire drug discovery pipeline, including significant work in target ID/validation, high-throughput screening, and preclinical safety.

In 2017, Doug founded Plex Research to develop a novel form of AI based on search engine algorithms. Plex’s unique platform has helped dozens of biotech and pharma companies accelerate their drug discovery pipelines by providing interpretable and actionable analyses of massive chemical biology and omics data sets.

This post appears through the MedCity Influencers program. Anyone can publish their perspective on business and innovation in healthcare on MedCity News through MedCity Influencers.
