Boltz Upgrade: Recursion Researchers Release Pipeline Combining AI Binding Model with Absolute Binding Free Energies
By Allison Proffitt
October 10, 2025 | Researchers from Recursion are further developing their work on protein-ligand docking with a new digital tool: the Boltz-ABFE pipeline.
This summer, Recursion researchers, working with researchers from MIT, released Boltz-2, an updated artificial intelligence model that can predict protein structure and binding affinity. The Boltz-2 model challenged the current standard for predicting binding affinity: physics-based simulations called Free Energy Perturbation (FEP) calculations.
FEP simulations are expensive and time-consuming. Boltz-2 cut the cost from approximately $100 and 6-12 hours per FEP prediction to just a few cents for a 20-second prediction on a single GPU. The accuracy of FEP also depends on an accurate model of the protein-ligand complex as the initial condition for the underlying molecular dynamics simulation. Unfortunately, in the early stages of the discovery process, appropriate experimental crystal structures are rarely available.
The team therefore combined Boltz-2 with their own absolute FEP protocol to build Boltz-ABFE, a robust pipeline for estimating absolute binding free energies (ABFE) in the absence of experimental crystal structures. They describe the work in a new arXiv preprint (DOI: 10.48550/arXiv.2508.19385).
“This research … tackles one of the key barriers that has kept rigorous physics-based methods like ABFE from being used earlier in drug discovery,” explained Therence Bois, Vice President of Strategy and Translational Research at Valence Labs, Recursion’s AI research engine, in an email to Bio-IT World.
“Until now, ABFE simulations were largely restricted to the lead-optimization stage, since they required experimental crystal structures of protein-ligand complexes, a major bottleneck that limited their applicability. By removing that dependency, our pipeline makes it possible to bring gold-standard, physics-based free energy calculations much earlier into the discovery process, enabling rational ligand selection and optimization even at the hit-identification and hit-to-lead stages,” he said.
Modeling Challenges
During lead optimization, a single reference crystal structure is often used as a template for docking algorithms, serving as the foundation for all subsequent 3D property predictions, the authors write. In the early stages of drug discovery, they continue, “the situation is even more challenging, as even single reference crystal structures are unavailable or the target may not yet be identified at all. This renders 3D prediction of properties such as potency infeasible.”
AlphaFold3 and Boltz-2 predict confidence scores for the protein-ligand binding pose, but these confidence scores do not correlate with measures of affinity. The researchers therefore proposed a hybrid approach that uses predicted protein-ligand complexes as starting points for physics-based free energy calculations. In the paper, they described their approach:
“Boltz predicted protein-ligand complexes can be used to initialize simulations that accurately estimate the free energy of binding, provided that some care is taken when choosing which structure prediction is taken forward for use in [molecular dynamics (MD)] simulations. To exemplify the approach and the necessary choices, we present a robust pipeline that prepares Boltz-predicted structures for MD by automating the removal of common defects in the predicted structures such as overlapping atoms and incorrect ligand stereochemistry.”
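As a rough illustration of the kind of defect check such a pipeline might automate, the sketch below flags pairs of atoms that sit implausibly close together. The function name, the plain coordinate-tuple input, and the 0.8 Å cutoff are assumptions made for this example; none of them are taken from the Boltz-ABFE code.

```python
from itertools import combinations
from math import dist


def find_overlapping_atoms(coords, min_distance=0.8):
    """Return index pairs of atoms closer than min_distance (angstroms).

    coords is a list of (x, y, z) tuples. The 0.8 cutoff is a hypothetical
    value chosen for illustration, not a published threshold.
    """
    clashes = []
    # Brute-force check of every atom pair; fine for a small example.
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if dist(a, b) < min_distance:
            clashes.append((i, j))
    return clashes


# Toy coordinates: atoms 0 and 1 are only 0.3 angstroms apart.
atoms = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (1.5, 0.0, 0.0)]
print(find_overlapping_atoms(atoms))
```

The all-pairs loop scales quadratically with the number of atoms, so a real structure-preparation step would more plausibly use a spatial index or a neighbor list.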
The team applied the Boltz-ABFE pipeline to four proteins from the FEP+ benchmark (CDK2, TYK2, JNK1, and P38) and compared the resulting ABFE predictions against experimental values to evaluate the effect of different structure prediction models. They found that their pipeline corrects defects of predicted structures and performs 15 free energy simulations without requiring experimentally determined protein-ligand complex structures.
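Comparing predicted binding free energies with experiment requires both in the same units. When experimental affinities are reported as dissociation constants, the standard thermodynamic relation ΔG = RT ln(Kd), with Kd expressed relative to a 1 M standard state, converts them. The sketch below applies that relation and computes a root-mean-square error. All numbers are invented for illustration and are not results from the preprint.

```python
from math import log, sqrt

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_to_dg(kd_molar, temperature=298.15):
    """Convert a dissociation constant (mol/L) to a binding free energy (kcal/mol)."""
    return R_KCAL * temperature * log(kd_molar)


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return sqrt(sum(squared) / len(squared))


# Invented example data, NOT taken from the preprint.
experimental_kd = [1e-9, 5e-8, 2e-6]   # mol/L
predicted_dg = [-11.5, -10.4, -7.1]    # kcal/mol

experimental_dg = [kd_to_dg(kd) for kd in experimental_kd]
print([round(dg, 2) for dg in experimental_dg])
print(round(rmse(predicted_dg, experimental_dg), 2))
```

Tighter binders have smaller Kd values and therefore more negative ΔG; a 1 nM binder corresponds to roughly -12.3 kcal/mol at 298.15 K.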
The Boltz-ABFE pipeline performed similarly to the Boltz-2 affinity model on these four proteins, but Bois explained that the two approaches are more complementary than interchangeable.
“Boltz-2 is a top-down model, trained exclusively on experimental binding affinity data. This gives it strong predictive power, especially on well-studied protein families like kinases but also means its performance can vary significantly for less-represented targets, as we observed in internal benchmarks. ABFE, on the other hand, is a bottom-up, physics-based approach. Because it relies on first-principles simulations rather than data availability, its performance is expected to be more robust and consistent across diverse targets, even in underexplored regions of chemical space,” he said.
“Combining the two therefore offers the best of both worlds: Boltz-2 can rapidly pre-screen large libraries with high throughput, while ABFE provides an orthogonal, physics-grounded layer of validation and re-ranking,” he added.
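The two-stage workflow Bois describes can be sketched as a cheap pre-screen followed by a costly re-ranking of the survivors. In the sketch below, the function name, the ligand labels, and the toy scores are hypothetical stand-ins. Neither scorer calls Boltz-2 or an ABFE engine, and the convention that lower scores mean stronger predicted binding is an assumption of this example.

```python
def two_stage_rank(ligands, fast_score, slow_score, keep=2):
    """Pre-screen with a cheap scorer, then re-rank survivors with a costly one.

    Both scorers are assumed to return lower values for stronger binding.
    The costly scorer is only ever called on the `keep` best pre-screen hits.
    """
    shortlist = sorted(ligands, key=fast_score)[:keep]
    return sorted(shortlist, key=slow_score)


# Toy lookup tables standing in for a fast ML model and a slow simulation.
fast = {"lig_a": -9.0, "lig_b": -7.5, "lig_c": -8.8, "lig_d": -6.0}
slow = {"lig_a": -8.1, "lig_c": -9.4}

ranked = two_stage_rank(list(fast), fast.get, slow.get, keep=2)
print(ranked)  # ['lig_c', 'lig_a']
```

Here the fast scorer narrows four candidates to two, and the slow scorer then reverses their order, which is the kind of orthogonal re-ranking the quote refers to.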
Next Steps
There’s significant opportunity to further enhance the pipeline’s accuracy and impact, Bois said, outlining several next steps for the team.
“First, we plan to continue advancing the Boltz-2 structure prediction model itself, with the goal of closing the remaining accuracy gap to experimental crystal structures. Achieving that would further strengthen the foundation of the entire workflow. Second, our current ABFE component was used largely ‘as is’ from previous work. By refining and tailoring the protocol specifically for new use cases, such as target deconvolution and early-stage hit-to-lead discovery and by accounting for the noise introduced by predicted rather than experimental structures, we can further boost both accuracy and throughput.”
Bois highlighted new opportunities made possible by the platform, including target deconvolution, where multiple interactions can be modeled to predict which combination is most likely responsible for a phenotypic effect.
“Together,” Bois predicts, “these efforts will push the approach closer to a robust, end-to-end solution for structure-guided drug discovery from the earliest stages onward.”


