Ring My BEL: Selventa Releases Biological Expression Language

By Kevin Davies

May 23, 2012 | With a recent rebranding and evolving business model, Selventa—the company formerly known as Genstruct—has decided to release a key knowledge engineering asset to the scientific community.

“Selventa is a discoverer of companion diagnostics and a stratifier of patient populations,” says Ted Slater, a former Genstruct scientist who now consults for the company from Broad Reach Strategic Advising LLC. As the BEL (Biological Expression Language) platform has become less critical to its discovery efforts, Selventa has decided to release the asset to create a widespread knowledge ecosystem. “Selventa anticipates that the adoption of BEL as a standard, computable, scientific language will enable the community-at-large,” says Slater.

As discussed at the OpenBEL Summit hosted by Selventa last month, the company has opted to give BEL and some associated tools (the BEL Framework) away to the community as OpenBEL, to ensure that the standard flourishes and facilitates innovation. To that end, Selventa is forming the OpenBEL Consortium.

Slater calls BEL “a very expressive triple-based language for representing causal, correlative relationships between entities—enzymes, genes, diseases, species, many different things.” Several groups in academia, pharma, text mining, and publishing are already applying BEL in various areas, as was demonstrated at the summit.

Slater started working for Genstruct in 2002, back when the company pursued an in silico biology model. His former colleague Dexter Pratt invented a scripting language designed to represent semantic causal relationships between biological molecular entities in a graph format (for example, ‘enzyme A cleaves protein B’).
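To make the triple idea concrete, here is a minimal Python sketch of how statements such as ‘enzyme A cleaves protein B’ could be stored as subject-relation-object triples and assembled into a graph. It is not BEL syntax or any part of the BEL Framework: the `Triple` type, the relation names, the example entities, and the `build_graph` and `downstream` helpers are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject --relation--> obj (hypothetical structure)."""
    subject: str
    relation: str
    obj: str


# Illustrative statements written as plain strings, not real BEL terms.
TRIPLES = [
    Triple("enzyme A", "cleaves", "protein B"),
    Triple("protein B", "increases", "process C"),
    Triple("gene D", "correlates with", "disease E"),
]


def build_graph(triples):
    """Index triples by subject so each entity lists its outgoing edges."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.subject].append((t.relation, t.obj))
    return dict(graph)


def downstream(graph, start):
    """Return every entity reachable from `start` by following edges."""
    seen = []
    stack = [start]
    while stack:
        node = stack.pop()
        for _relation, target in graph.get(node, []):
            if target not in seen:
                seen.append(target)
                stack.append(target)
    return seen


graph = build_graph(TRIPLES)
print(graph["enzyme A"])
print(downstream(graph, "enzyme A"))
```

Once the statements are held as a graph, a question such as "what lies downstream of enzyme A?" becomes a simple traversal, which is the sense in which triples extracted from the literature are computable.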

“Back then, there was no database for causal relationships and there pretty much still isn’t,” says Slater. “Genstruct knew they had to extract the information from the primary literature and needed a way to best capture it. The solution, invented by Pratt, was to represent it in a scripting language called BEL.” Selventa used BEL for more than ten years, but only internally.

Under the leadership of David de Graaf, CEO of Selventa, the company decided that it was better to release BEL as an open-source platform and “let the community contribute to it in ways we perhaps would never have thought of, and at the same time do a lot of good for the entire community by providing a common knowledge platform for everyone to use for free,” says de Graaf. “We envision this platform to simply enable an innovative scientific community, including Selventa. We want to share in the innovation, share in the ability to use more content in a meaningful way, and defray the development costs by sharing the platform through a consortium.” BEL is freely available at http://openbel.org.

Selventa regarded the open-source visualization tool Cytoscape as a precedent of sorts. Slater calls that software “very beneficial to the community.”

In addition to releasing BEL, Selventa is also providing a tool to validate BEL, so that users can confirm that what they’ve written is valid BEL (similar to existing tools for RDF and OWL). The BEL Framework helps users with BEL knowledge acquisition and maintenance, and compiles BEL scripts into computable knowledge bases that can support search, visualization, and inference. “It helps you write ‘good BEL’ scripts and create KAMs (Knowledge Assembly Models) from those scripts,” says Slater.

Working Model

The point of the April workshop was to bring together researchers from pharma, academia, publishing, and other segments of the market to discuss the standard and highlight recent progress. Among the organizations represented at the summit were the Broad Institute, Pfizer, the Philip Morris Institute, Accenture, IDBS, Linguamatics, and Thomson Reuters.

Researchers and collaborators demonstrated a number of use cases involving BEL. For example, William Hayes, who recently joined Selventa from Biogen-Idec, demonstrated the BEL Workbench and a Cytoscape plug-in for BEL network visualization, which is also freely available. Pfizer’s Daniel Ziemek described cases of causal reasoning using BEL. Robin Munroe (IDBS) showed how to use BEL in the company’s E-WorkBook suite for next-gen sequencing applications.

Paul Milligan from UK text-mining firm Linguamatics and Sam Ansari (Philip Morris Institute) demonstrated the extraction of entities (such as genes, species, and drugs) and their causal relationships from the scientific literature into the BEL format.

BEL creator Dexter Pratt, who is now chief technology officer of Segterra, demonstrated a free visualization tool called Triptych.js, used to display 3-D graphs of BEL networks. And Eric Neumann (PanGenX) demonstrated the ability for users to translate from BEL to RDF and back (complete with SPARQL querying). “We’d like BEL to be a very scientist-friendly way to write RDF, which we don’t have right now,” says Slater. “It’s a little arcane for the average bench scientist [right now].”

Finally, Slater is also excited about the potential for BEL to impact scientific publishing. Chris Bouton (Entagen), the creator of an RDF visualization tool called Triple Map, is working with Andreas Matern (Thomson Reuters) on a project that Slater says “could change scientific publishing.”

The idea is to enable a researcher studying a particular signaling pathway, for example one involved in diabetes, to decorate a node in that graph of relationships with a little symbol, just as the iPhone signals new updates or texts. This symbol would signify that there are new “nanopublications”: scientific papers relevant to that particular node or enzyme that have been turned into small graphs of computable information extracted from those papers.

“In time, like iTunes, you can look at those graphs and decide if you want them, purchase, download, and make them part of your graph immediately. I think that’s just too hot!” says Slater. “Publishers are incredibly important to adoption—Selventa wants the whole information flow to be in a common knowledge representation format. It has to begin with primary literature.”

At the workshop, Selventa distributed details about the consortium charter and prospectus, which can be found at http://openBEL.org. The scientific steering committee will be led by German researcher Martin Hofmann-Apitius of the Fraunhofer Institute.

There will be a membership fee for consortium members, which goes towards developing and maintaining the BEL language standard and the BEL Framework.