AI in Pharma: Progress Made, But Data Challenges Persist

January 20, 2026

Pistoia Alliance survey reveals industry shift from efficiency gains to innovation, while data readiness remains top obstacle 

By Allison Proffitt 

January 20, 2026 | The pharmaceutical industry’s approach to artificial intelligence is maturing, with companies increasingly focused on innovation rather than mere efficiency gains, according to Christian Baber, Chief Portfolio Officer at the Pistoia Alliance. However, significant hurdles around data quality and workforce training continue to impede progress. 

Speaking about findings from the Alliance’s “Lab of the Future” survey last fall and more recent conference polling at the Pistoia Alliance Boston event, Baber noted a surprising silver lining in what might initially appear to be concerning statistics. When survey results showed that approximately 27% of respondents didn’t know the source of data used to train their AI models, Baber’s reaction was counterintuitive: “I would bet you a lot more… people really don’t know how their models are built.” 

The Data Dilemma 

The challenge of AI-ready data emerged as one of the predominant obstacles, cited by roughly half of survey respondents. This issue has persisted for years but has intensified as machine learning models require data in specific, machine-readable formats rather than traditional reports. 

“Some groups will share data historically in a report,” Baber explained. “Now it needs to be in this machine-readable form, so it can then be referred back to by these models used going forward.” 

The problem extends beyond technical formatting. Cultural barriers within organizations continue to hinder data sharing, even internally. Scientists often resist relinquishing control of their data, preferring to present conclusions in reports rather than providing raw data for broader reanalysis and AI applications. 

This is hardly a new problem in the life sciences, but Baber says now is the time for a culture shift. “You’re employed to create [data] and share it and use it to drive decisions,” Baber emphasized. “You want it to have an impact.” 

A Strategic Shift 

Perhaps the most encouraging finding from this year’s surveys represents a fundamental change in how pharmaceutical companies approach AI implementation. Previous years showed organizations primarily pursuing efficiency gains and incremental improvements—what Baber characterizes as “doing things better.” This year revealed a pivot toward “doing better things,” with companies focusing on innovation and transformational changes rather than simply optimizing existing processes. 

This shift reflects a maturing investment strategy. Early AI adoption was characterized by a bit of a panic—companies felt compelled to “get into AI” and pursue any available use case, Baber observed. Now, investment has become more intelligent and targeted, with organizations asking where AI can genuinely transform their business rather than merely accelerate it. 

The Training Gap 

Workforce development emerged as another critical need, with education and skills training identified as major obstacles by a significant portion of respondents. But the solution isn’t one-size-fits-all. Some professionals need to build AI models, but many more just need sufficient literacy to use them appropriately. 

Understanding a model’s “applicability domain”—the range of problems it can reliably address—is the first step toward proper AI use for any user. As Baber noted, AI models “interpolate really well, but extrapolate quite poorly.” They excel when working with familiar examples but struggle with entirely novel scenarios, much like the classic “black swan” problem: a model trained only on white swans would never identify a black swan as a swan. 
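The interpolation-versus-extrapolation point can be made concrete with a minimal sketch. The example below is hypothetical and not from the Pistoia Alliance's work: it fits a simple linear model and treats the range of its training inputs as a crude "applicability domain," flagging predictions that fall outside it. Real applicability-domain definitions in chemoinformatics are far richer; this only illustrates the idea of pairing every prediction with an in-domain check.

```python
# Hypothetical sketch: a minimal "applicability domain" check for a
# one-dimensional regression model, where the domain is simply the
# range of the training inputs. All data here is invented.

def fit_linear(xs, ys):
    """Ordinary least-squares fit of y = a*x + b (pure Python)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_with_domain(model, domain, x):
    """Return a prediction plus a flag: is x inside the training range?"""
    slope, intercept = model
    lo, hi = domain
    in_domain = lo <= x <= hi
    return slope * x + intercept, in_domain

train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [2.1, 3.9, 6.2, 8.0, 9.9]   # roughly y = 2x
model = fit_linear(train_x, train_y)
domain = (min(train_x), max(train_x))

y, ok = predict_with_domain(model, domain, 3.5)   # interpolation
print(y, ok)    # in-domain: the prediction is on familiar ground
y, ok = predict_with_domain(model, domain, 50.0)  # extrapolation
print(y, ok)    # out-of-domain: a red flag for the user
```

Documenting the training range alongside the model is the simplest form of the "what should be published about the model" that Baber describes.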

“It’s the responsibility of the people building the models and also the responsibility of the people using them to look at what should be published about the model,” Baber said, including applicability domain, the types of data on which the model was trained, and governance rules. Users need to know what documentation should exist and recognize red flags when critical information is missing. 

Regulatory Evolution 

The regulatory landscape is becoming clearer, with both EU and US authorities providing more specific guidance on AI use. That guidance is emboldening the industry. In the Lab of the Future survey, only 9% of respondents saw regulation as a barrier to AI, down from 23% in 2024. 

However, in a conservative industry, internal company policies and workflows are still catching up. Regulatory agencies generally want technology to progress, but individual companies are wary of implementing anything that might face rejection. 

“Regulatory organizations within pharma tend to be very conservative,” Baber observed. “They don’t want to do anything that might be rejected or cause problems.” 

Pistoia’s Response 

Operating in the pre-competitive space, the Pistoia Alliance is addressing these challenges through several initiatives. The organization maintains a successful natural language processing use case database, recently expanded to include large language models, documenting what works and what doesn’t across various implementations. 

A new benchmarking project aims to address the challenge that roughly 50% of survey respondents identified as their biggest obstacle. The initiative will focus on creating benchmarks specifically for healthcare research and scientific domains, with the flexibility to evolve as models improve. 

The Alliance recently completed a project examining how to minimize hallucinations in natural language database querying. The conclusion: templates provide 100% accuracy but limited flexibility, while agentic AI represents the next best approach. 
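The template trade-off is easy to illustrate. The sketch below is hypothetical (the table names, questions, and SQL are invented, not from the Pistoia project): each supported question maps to a vetted SQL template, so a covered question can never be answered wrongly, but any question outside the template set simply cannot be asked at all.

```python
# Hypothetical sketch of template-based natural-language database
# querying: exact answers for supported questions, no flexibility
# beyond them. Questions, tables, and columns are all invented.

TEMPLATES = {
    "how many compounds target {gene}?":
        "SELECT COUNT(*) FROM compounds WHERE target = '{gene}'",
    "list assays run after {year}":
        "SELECT assay_id FROM assays WHERE year > {year}",
}

def to_sql(question, **params):
    """Fill a vetted template; return None if the question isn't covered."""
    template = TEMPLATES.get(question)
    if template is None:
        return None          # limited flexibility: unsupported question
    return template.format(**params)

# A supported question yields exact, pre-vetted SQL:
print(to_sql("how many compounds target {gene}?", gene="EGFR"))
# A mere paraphrase falls outside the template set:
print(to_sql("count the EGFR-targeting compounds"))  # → None
```

Agentic approaches trade this guaranteed accuracy for the ability to handle questions no one wrote a template for, which is why the project points to them as the next best option.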

Building on these findings, Pistoia has launched an agentic AI project exploring how AI agents can work specifically within life sciences and biopharma, including developing protocols for agents to announce their capabilities and communicate using life sciences-specific vocabulary. 

Practical Recommendations 

For researchers and scientists looking to advance their AI capabilities, Baber offers straightforward advice. First, record data in machine-readable formats with proper metadata from the outset. This creates value for the organization while ensuring proper attribution to data creators. 
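What "machine-readable with proper metadata" might look like in practice can be sketched in a few lines. The field names below are illustrative, not a standard schema: the point is that units, provenance, and attribution travel with the number instead of being buried in a narrative report.

```python
# Hypothetical sketch of recording a single measurement as
# machine-readable JSON with metadata. Field names and values are
# invented for illustration; they follow no particular standard.
import json
from datetime import date

record = {
    "value": 42.7,
    "units": "uM",                      # units travel with the number
    "assay": "IC50",
    "instrument": "plate-reader-03",
    "creator": "a.scientist",           # attribution to the data creator
    "recorded": date(2026, 1, 20).isoformat(),
}

serialized = json.dumps(record)
# Any downstream model or colleague can parse it back losslessly:
restored = json.loads(serialized)
print(restored["value"], restored["units"])
```

A value recorded this way can be re-analyzed or fed to a model years later without anyone re-reading the original report, which is where the attribution value Baber mentions comes from.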

Second, invest time in understanding the technology’s limitations. This doesn’t require programming expertise or deep technical knowledge but rather developing practical skills through experimentation. 

“Every scientist can use Excel,” Baber said, drawing a parallel to spreadsheet skills, even if “not every scientist is programming an Excel macro.” 

He recommends basic prompt engineering techniques, noting that simple adjustments—such as asking an AI model to show its work—can drastically reduce hallucinations. For training resources, he suggests courses from respected universities like Harvard and MIT, many available online, along with hands-on experimentation. 
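The "show its work" adjustment Baber mentions amounts to a small change in how a question is phrased. The sketch below is a hypothetical illustration only; it builds the prompt text and calls no actual AI model.

```python
# Hypothetical sketch of a "show your work" prompt adjustment:
# prepend an instruction to reason step by step and to admit
# uncertainty before answering. No real LLM API is called here.

def show_your_work(question: str) -> str:
    """Wrap a question in a simple reasoning-first prompt."""
    return (
        "Think through this step by step, showing your reasoning, "
        "and say 'I don't know' if you are unsure.\n\n"
        f"Question: {question}"
    )

print(show_your_work("Which of these compounds is a kinase inhibitor?"))
```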

“Try and make it lie to you, and work out why it’s lying and how do I get the truth,” Baber advised, describing AI models as analogous to politicians who provide confident-sounding answers whether they actually know the information or not. 

Looking Ahead 

Despite ongoing challenges, Baber maintains an optimistic outlook based on clear improvements over the past two to three years. The trajectory points toward a future where AI tools become standard equipment for virtually all pharmaceutical researchers. 

“Things are going in the right direction,” he said. The Lab of the Future survey showed that 77% of labs expect to use AI within the next two years, and AI remains the top investment area for the third year running. 

The key to realizing this future lies not in developing new algorithms—pharmaceutical companies typically adapt existing AI technologies rather than inventing them—but in addressing the human elements: data practices, organizational culture, workforce skills, and management incentives that either enable or impede AI adoption. 

As Baber succinctly put it: “It isn’t the technology that’s a problem, it’s the people.”