Open source software is gaining popularity at biotech companies, but thorny issues, such as intellectual property, remain. Do the benefits outweigh the risks?
By Curtis Franklin Jr.
June 11, 2002 | For an industry in which 15 years can elapse between startup and first revenue, the allure of "free" software isn't hard to fathom. Biotech companies, like those in many other industries, have embraced the use of software distributed under free or open source licenses.
Indeed, researchers in academia and government — the seed ground for many biotech companies — have long relied on open source software, sometimes for cost reasons, but often because there were no commercial tools available to meet their needs. Developing homegrown software and passing it around freely to peers has become an integral part of the research culture.
Now, the emergence of biotech as an industry and its growing dependence on IT-based tools have stirred debate over the use of open source software. Can software obtained without a proprietary imprimatur truly be trustworthy for the most demanding applications? Commercial vendors maintain that "you get what you pay for," but the user community has other ideas.
Last spring, for example, it was widely reported that Microsoft Corp. tried to persuade the Department of Defense (DOD) to dump its open source software. This turned out to be neither possible nor desirable, according to a study by Mitre Corp., which provides IT support to the government. "Banning open source would have immediate, broad, and strongly negative impacts on the ability of many sensitive and security-focused DOD groups to protect themselves against cyber attacks," states the study.
According to The Washington Post, "the Mitre report concluded that open source often results in more secure, less expensive applications and that, if anything, its use should be expanded."
Many bioscience researchers agree. "Almost all projects in my lab use open source software," says Lincoln Stein, a prominent Cold Spring Harbor Laboratory researcher who is developing Web-enabled databases, data analysis tools, and user interfaces to organize, manage, and visualize a vast body of genome information. "This includes the WormBase, the GenomeKnowledgeBase, and the SNP Consortium Web sites. These use a combination of Apache, Linux, Perl, MySQL, ACeDB, and BioPerl [open source] libraries. The main exception is the Gramene Web site, which uses Oracle underneath. Oracle was chosen because at the time it provided transaction support, while MySQL did not. Now that MySQL 4 supports transactions, the decision would probably be different."
The practical truth is that open source and proprietary software will have to coexist. But that doesn't mean there aren't thorny issues to contend with when using open source "products." Ensuring software quality, particularly for applications in drug development or diagnostics, is one. Finding adequate documentation is another. Nevertheless, the arguments favoring open source software are powerful.
Cost Leads the List
Lower upfront cost is open source software's most obvious advantage, says Jeff Chang, a doctoral candidate in medical informatics at Stanford University Medical School. "It makes it much easier to try different tools." Although the overwhelming majority of open source software is virtually free — available for the cost of downloading it from the Internet — a few companies, such as Red Hat Inc., do sell it.
Another attraction is unfettered access to source code — something that's required by most free and open source licenses. "Having source code available allows the software to be run on many more platforms," Chang says. "For various reasons, I run my code on Solaris, IRIX, Windows, Linux, FreeBSD, and Darwin. If the source code is available, it can usually be compiled with little or no effort."
Stein sees another benefit. "Surprisingly, I find the single major advantage of open source is its stability," he explains. "Not in the sense of 'bug-free,' but in the sense that I can rely on its being around for the long haul. I have been burned in the past by relying on proprietary software libraries and tools, only to see the company that provided [them] go under or be purchased by another company. Just ask any developer who relied on the Netscape LiveWire series of development tools."
Spurred by the freedom to tweak and share code, zealous user communities have sprouted around many open source and free products. Many scientists view this collective development process as a form of peer review that leads to better software. There is even a grassroots movement, called The Open Informatics Petition, that seeks to require that all software developed in federally funded programs be open source. No action seems imminent, but the movement has provoked vigorous debate.
In general terms, most open or free software is distributed under one of two types of licenses: the GNU General Public License (GPL), named for the GNU Project, whose software was the first to be distributed under it; or licenses that conform to the Open Source Initiative (OSI) guidelines. In many ways, the difference between the two is more philosophical than practical, despite the religious fervor of each camp's adherents, and doesn't materially affect users. Among the key distinctions:
The GPL, first released in 1989 and overseen by the Free Software Foundation, is the older of the two; the GNU Project from which it grew dates to 1983. It states that software and source code must be freely available; that no restrictions may be placed on modifying the source code; and that any software that is a modification of the original program must also abide by the terms of the license (www.fsf.org).
The OSI doesn't have a single license, but instead controls the Open Source Definition (OSD), which defines what can and cannot be in an OSD-compliant license. The OSD includes all the major points of the GPL (the Open Source Initiative takes pains to say that the GPL is in compliance with the OSD), but adds that use of the software cannot be restricted by person, industry, purpose, or country (www.opensource.org).
"We believe that the users of software have certain fundamental, inalienable rights to copy, share, modify, and redistribute computer software," says Bradley M. Kuhn, executive director of the Free Software Foundation. On the other hand, he says, "The open software movement started in 1998 with the purpose of marketing free software to the business community."
Deciphering the differences between the two camps isn't easy, and may not be worth the effort.
Does Danger Lurk?
Critics of open source software focus on three potential problems: software quality, technical support, and the intellectual property constraints related to open source license requirements.
Putting aside the first two for now, the intellectual property issue has sounded alarms inside academic and corporate walls. One concern is that discoveries and inventions made using GPL and OSD software may not be protectable as intellectual property. Rather, some observers argue, they must be treated as "open source" themselves and made freely available. Commercial software vendors sometimes promote this position.
Kuhn, however, says there is no danger of the GPL "infecting" the patentability or licensing of a drug or gene sequence that was created using GPL-distributed software. "We clearly assert [that] the output of the program isn't covered by the license." In fact, says Kuhn, if the user is not modifying or redistributing the software, then he or she need not agree to the GPL at all.
Stephanie A. Gore, an intellectual property law specialist and assistant professor at the University of Texas School of Law, concurs, and says there is no language in either the GPL or the OSD that would attach to anything other than software. "Just because you use open source software, [that] doesn't mean that the licensing requirements apply to the new product, especially if the product isn't software."
A related sore spot for users of open source software is compliance with their own employment contracts. Typically, a company or university claims ownership of all of a researcher's work. It may even have a master software agreement that specifically prohibits the use of open source software. This is a problem for researchers who use open source code, tweak it, and then distribute the result with its source code, as licenses such as the GPL require whenever modified code is passed along. In so doing, such researchers are often violating their employment agreements.
One bioinformatics researcher, Steve Brenner of the University of California at Berkeley, renegotiated his employment contract after discovering the problem. The university was willing to resolve the matter, but the process took time. Many researchers simply don't realize there is a problem.
Software quality issues mirror those in the commercial software world. Each camp has good and bad software. On balance, the success of open source packages such as Linux and the Apache Web server (the dominant server software on the Web) has convinced many users that there are no broad qualitative differences between open source and proprietary software.
Obtaining solid technical support, however, is more complicated. To wit:
"There's no 24-hour support hotline for open source, and users have to understand that there's plenty of low-quality open source software out there," says Cold Spring Harbor Laboratory's Stein. "Users should look foremost at the volume on the software's support mailing list, and [assure] themselves that end users' questions are answered promptly and respectfully."
"I have found that support for open source can far exceed the free (and sometimes also the paid) support for commercial software," says Stanford's Chang. For example, "[One] time I called the tech support of a well-known database company, and the support representative had not heard of SGI computers."
"Cisco uses a fair bit of open source software, but under license with someone who swears to maintain it. For example, our C compiler is the [open source] GCC and GCC++ program, with support from Cygnus," says Fred Baker, a Cisco Systems Inc. fellow specializing in quality of service networking issues.
Users of open source software often speak highly of support provided by user groups and Web sites. They also maintain that talent pools containing hundreds or thousands of open source programmers result in software that's as robust and feature-packed as that available from commercial software firms.
At Home in the Lab
R. Scott Rowland, director of molecular modeling at BioCryst Pharmaceuticals Inc., which designs and develops small-molecule drugs, has made a substantial bet on open source software. The company uses Linux on graphics workstations for desktop modeling and in larger modeling clusters. Rowland also uses Python, an open source programming tool used to develop data-intensive applications.
"When I was looking at the tools available, Python was the best tool to get the job done," he says. "The fact that it's free may enter into it a little bit."
So far, support hasn't been a problem. When an issue arose with the source code for a program that was managing jobs on BioCryst's Linux cluster, Rowland says he "e-mailed the user group and got an answer right back." He advocates choosing open source software with active development groups — something easily determined via simple Internet research.
Patent problems never worried Rowland. "I don't see any legal issues with using the software," he says. "To my knowledge, there haven't been any examples of open source software causing troubles for products discovered using the tools."
Not surprisingly, BioCryst will continue using open source software. "If you look at bioinformatics, we've really embraced open source. I'm a big fan of Python. It's let me do things in my work that would have been very difficult with other tools."
Curtis Franklin Jr. is a consultant and writer working in Gainesville, Fla. He can be contacted at firstname.lastname@example.org.
ILLUSTRATION BY DAVE PLUNKERT