The Role of the Statistical Programmer is Evolving: Here’s How It’s Changing and What That Means for Pharmaceutical Companies
Contributed Commentary by Chris McSpiritt, Domino
February 6, 2026 | For decades, the role of statistical programmers in drug development was clearly defined. They were responsible for producing validated tables, listings, and figures from clinical trial data, most often in SAS, within structured workflows for regulatory submissions. This model worked because data sources were controlled, study designs were well-defined upfront, and regulatory expectations focused primarily on standardized outputs produced through established workflows.
That model is now under pressure. As data volumes grow and analytical approaches expand, the scope of the statistical programmer role is broadening. What was once a largely SAS-centric function is becoming a hybrid role that also requires fluency in open-source languages like R (and sometimes Python) alongside emerging artificial intelligence (AI) tools. Statistical programmers are now expected to act as partners in analytics, model validation, and data interpretation, with greater responsibility and strategic impact.
This shift is driven by a series of compounding changes across data, technology, and regulations. One of the most significant drivers is rising data complexity. Pharmaceutical companies are incorporating far more diverse data sources into their analyses, particularly real-world evidence (RWE), in addition to clinical trial data. Today, 82% of FDA drug submissions include RWE, and that number continues to grow year over year. External datasets, longitudinal data, and less-structured sources introduce new challenges around integration, provenance, and consistency. While this data unlocks richer insights, it also makes rigid, linear workflows more difficult to sustain.
As analytical demands have grown, traditional closed ecosystems have struggled to scale across heterogeneous data and methods. This has driven broader adoption of open-source tools, such as R, as a practical response to more complex analyses and faster iteration cycles. Open source enables greater flexibility, advanced analytics, and richer visualizations, but it also introduces more variability into how analyses are performed.
Regulatory expectations have evolved in response to that variability. As methods diversify and toolchains expand, regulators are placing greater emphasis on traceability, reproducibility, and risk-based approaches to quality. Statistical programmers are now expected to demonstrate not only analytical outputs, but also how results were generated, reviewed, and governed throughout the process.
Additionally, the proliferation of AI has opened up new possibilities while putting added pressure on statistical programmers. Pharmaceutical companies expect them to carry out a larger volume of analyses, faster than ever before, while still meeting the same standards for quality, validation, and compliance.
Taken together, these forces create real tension. Statistical programmers are being asked to take on more responsibility with more data sources, tools, and quality requirements, but often without corresponding changes to process or infrastructure. Without the right support, this adds friction and risk when organizations are under pressure to move faster.
Changing Dynamics Require New Skills for Statistical Programmers
While SAS remains foundational in many organizations, statistical programmers increasingly need proficiency in R and potentially Python as well. In fact, many prospective statistical programmers are graduating college fluent in R and Python rather than SAS. The open ecosystems of packages around these languages make them well suited for advanced analytics and visualization, particularly as data sources continue to diversify.
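To make that concrete, the short R sketch below shows the kind of treatment-arm summary a programmer might produce with open-source tooling. The dataset and column names (USUBJID, ARM, AGE, SEX) are hypothetical stand-ins that loosely follow ADaM conventions, not output from any real study.

    # Minimal, illustrative R sketch of a treatment-arm demographics summary.
    # The data are invented; column names loosely follow ADaM conventions.
    library(dplyr)

    adsl <- tibble::tibble(
      USUBJID = sprintf("SUBJ-%03d", 1:6),
      ARM     = rep(c("Placebo", "Active"), each = 3),
      AGE     = c(54, 61, 47, 58, 63, 50),
      SEX     = c("F", "M", "F", "M", "F", "M")
    )

    demog_summary <- adsl %>%
      group_by(ARM) %>%
      summarise(
        n          = n(),
        mean_age   = round(mean(AGE), 1),
        sd_age     = round(sd(AGE), 1),
        pct_female = round(100 * mean(SEX == "F"), 1)
      )

    print(demog_summary)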
Another crucial area of skill development is AI. Coding assistants allow teams to produce code more quickly, but statistical programmers must manage these tools responsibly. This includes reviewing generated code, determining when it’s ready to be checked in, and using large language models to support documentation. While statistical programmers aren’t expected to be machine learning engineers, they do need a solid understanding of how models work, the assumptions they rely on, and how their outputs should be interpreted.
As AI accelerates code production, strong code management practices become even more important. Embracing robust version control tools like Git, rather than relying on file shares, supports reproducibility, reuse across studies, and greater operational efficiency.
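As one illustration, the sketch below commits an updated analysis program to a Git repository from R using the gert package, rather than copying files to a shared drive. The repository layout, file path, and commit message are hypothetical, and the snippet assumes the working directory is already an initialized Git repository.

    # Illustrative sketch: version an updated analysis program with Git via
    # the gert R package. File path and commit message are hypothetical.
    library(gert)

    git_add("programs/adsl_summary.R")                  # stage the revised program
    git_commit("Update baseline demographics summary")  # record a versioned snapshot
    print(git_log(max = 5))                             # review recent history for traceability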
The Responsibility of Pharmaceutical Companies
Pharmaceutical companies play a critical part in enabling statistical programmers to succeed in this new role. That starts with creating structured learning pathways for upskilling and reskilling, and ensuring programmers have the time and bandwidth to learn in addition to their day-to-day responsibilities. Hiring practices must also evolve to prioritize hybrid skillsets rather than focusing exclusively on experience with a single tool.
Just as importantly, this new way of working requires the right technological foundation. Many organizations recognize the need for a modern statistical computing environment (SCE) that reduces friction and risk as open-source and AI tools become more deeply embedded in analytical workflows.
The Modern SCE: A Crucial Foundation
Modern SCEs bring people, processes, and technology together to support the evolving role of the statistical programmer. At its core, a modern SCE provides governed access and reproducible execution. It supports versioned code and data artifacts, along with audit-ready workflows across SAS, R, Python, and AI tools.
By providing a controlled yet flexible environment, modern SCEs enable collaboration while maintaining compliance. As statistical programmers increasingly use open-source and AI-powered tools, SCEs provide the traceability and auditability needed to meet regulatory requirements.
The role of the statistical programmer will continue to evolve as data complexity grows and new technologies mature. Pharmaceutical companies now face a clear inflection point. Those that invest in the right skills, processes, and environments will enable faster, higher-quality submissions with more confidence. Those that don’t risk falling behind on scalability, inspection readiness, and long-term sustainability.
Christopher McSpiritt is VP, Life Sciences Strategy at Domino Data Lab. He builds understanding of customer needs and works with product management and marketing teams to shape go-to-market approaches for the life sciences sector. Christopher began focusing on the life sciences industry when he joined a small eClinical startup in 2005. Since then, he has worked at both consulting firms and leading software companies as a project manager, business analyst, consultant, product manager, and strategist. He can be reached at christopher.mcspiritt@dominodatalab.com.


