Accelerating Intuition: Reynders on Big Challenges for Informatics


By Catherine Varmazis

May 7, 2008 | BOSTON | Integration and convergence were the main themes of John Reynders’ keynote at the 2008 Bio-IT World Conference & Expo last week. Both have profound implications for informatics and the future of personalized medicine, he said.

Reynders, VP and CIO at Johnson & Johnson (J&J), said the “absolutely insane” amount of data being generated in the life sciences, and the heterogeneity of that data, require new ways of working and connecting with people. Specialists in all fields need to work in the “white space” between fields, and to collaborate more with colleagues within, and partners beyond, their organizations. “The company that thinks it’s sufficient to connect great minds behind its walls – well, it won’t be a great company for very long,” Reynders said.

Reynders identified four critical layers of convergence, some already under way and some yet to happen: 1) converging the data; 2) converging information semantically; 3) converging people; and 4) converging knowledge into a reasoning capability.

Converge the Data
The first convergence layer involves the capture, processing, filtering and management of data. When he worked at Los Alamos National Laboratory, Reynders ran a research center that had the largest dedicated unclassified supercomputer in the U.S. -- a 1-teraflop system with 7 terabytes of spinning disk for storing data. When he moved to Celera, which was assembling the human genome, its system had the same computational capacity, but the spinning disk was 150 terabytes – roughly 20 times greater.

That amount of data creates storage challenges. Unable to get federal funding for petabytes of storage, Celera built data visualization corridors, but Reynders said these can leave researchers “data-rich but information-poor.”

Hierarchical approaches to data storage are needed because “we can’t keep all the data all the time.” Such approaches involve not just technology solutions but also business models. For example, if you temporarily need a petabyte of storage, do you have the option to lease it?

Reynders illustrated the concept with the image of bears waiting on a riverbank as salmon leap upstream: “The bears grab just the salmon they need. They don’t want all the water, just the fish.” This concept goes by various terms, but “cloud computing” is the latest buzzword. Reynders argued that we must find ways to temporarily store vast amounts of data, mine it and extract the specific information required – then “drain the pool” and start over.
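The bear-and-salmon idea can be sketched in a few lines of Python. This is only an illustration of the general pattern (stream the data past a filter, keep the few records of interest, discard the rest), not anything Reynders presented; the `river` generator, the `score` field and the threshold are all hypothetical stand-ins.

```python
from typing import Callable, Iterable, Iterator


def catch_salmon(stream: Iterable[dict], wanted: Callable[[dict], bool]) -> Iterator[dict]:
    """Yield only the records of interest; everything else flows past and is dropped."""
    for record in stream:
        if wanted(record):
            yield record


def river() -> Iterator[dict]:
    # Hypothetical transient data set: records are generated one at a time
    # and never held in memory all at once.
    for i in range(100_000):
        yield {"id": i, "score": (i * 37) % 1000}


# Keep only the rare high-scoring records; the "pool" drains as the generator is consumed.
kept = list(catch_salmon(river(), lambda r: r["score"] >= 999))
print(len(kept), "records kept out of 100000")
```

Because both functions are generators, memory use stays proportional to the number of records kept rather than to the size of the stream.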

Find Meaningful Connections
Data exponentials are growing in two dimensions -- size and scale -- as well as in terms of heterogeneity, said Reynders. The challenge facing biomedical scientists is how to converge all this data semantically and find meaningful connections between different kinds of data. Traditional means of finding semantic relationships in text data include latent semantic indexing (LSI) and natural language processing (NLP). While useful, Reynders said, such text-oriented tools fall short now that we are storing all kinds of data, including text, compounds, and genetic data.

Ontologies are promising platforms for forming semantic relationships across arbitrary data, he said, but only get you so far. “This is the other shoe dropping. Not only do we have an immense amount of data, but the data is very heterogeneous.”

Reynders said the data integration challenge the life science community faces is similar to that of the intelligence community, which is also grappling with heterogeneous data from disparate sources (satellite data, multi-language text data, signal intelligence, human intelligence) and has to put it all together and make sense of it. “Some of the algorithms that are needed for this very heterogeneous data integration, which typically go beyond ontology to a very large graph-type problem, are very relevant to our challenges,” he said. “It’s not enough that we can navigate in one of these domains because you cannot find those connections if you’re only looking at one class of information.”
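To make the “graph-type problem” concrete, here is a minimal sketch, assuming a toy graph whose nodes are tagged with their data class (compound, gene, paper, disease). The node names and edges are invented for illustration, and breadth-first search stands in for whatever algorithms the intelligence or life science communities actually use.

```python
from collections import defaultdict, deque

# Hypothetical links between entities from different data classes.
edges = [
    (("compound", "C-001"), ("gene", "GENE-A")),
    (("gene", "GENE-A"), ("paper", "P-17")),
    (("paper", "P-17"), ("disease", "D-9")),
    (("compound", "C-002"), ("gene", "GENE-B")),
]

# Build an undirected adjacency map over all node types at once.
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)


def find_path(start, goal):
    """Breadth-first search; returns the shortest chain of nodes, or None if unconnected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(find_path(("compound", "C-001"), ("disease", "D-9")))
```

The link from compound C-001 to disease D-9 only appears because gene and literature nodes sit in the same graph; searching any single class of data would miss it, which is the point of the quote above.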

Open Innovation
Reynders’ third layer of convergence is relatively new, he said: that of converging people. Innovation requires bringing all the integrated data not just to one scientist, but to a team of scientists in different domains working together. “Where is the MySpace for scientists? Or that in silico watering hole where clinicians can share their ideas?” he asked.

Some organizations are harnessing the wisdom within their own ranks. J&J, for example, has built the LINK (Leverage INternal Knowledge) expert locator system to enable collaboration among 14,000 enrollees across far-flung J&J subsidiaries. LINK scans the emails of everyone who signs in, searching for keywords. “You have a problem, you throw in these terms, and LINK connects you with the person in [say] Belgium who can help you.”
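How a keyword-based expert locator might work can be sketched as an inverted index from words to the people who use them. This is a guess at the general technique, not a description of how LINK is built; the sample messages, names and scoring rule below are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (author, message text) pairs from people who opted in.
messages = [
    ("alice", "protein folding simulation results"),
    ("bram", "assay validation for kinase inhibitors"),
    ("bram", "kinase selectivity panel"),
    ("chen", "clinical trial enrollment"),
]

# Inverted index: word -> how many of each person's messages mention it.
index = defaultdict(Counter)
for person, text in messages:
    for word in set(text.lower().split()):
        index[word][person] += 1


def find_experts(query, top=3):
    """Rank people by the total number of their messages matching the query terms."""
    scores = Counter()
    for term in query.lower().split():
        scores.update(index.get(term, {}))
    return [person for person, _ in scores.most_common(top)]


print(find_experts("kinase assay"))
```

A production system would need stemming, stop-word handling and privacy controls; the sketch only shows the lookup step of throwing in terms and getting back a ranked list of people.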

Other open innovation business models include Eli Lilly’s InnoCentive website. Originally focused on chemical compounds, it now includes a wider range of problems. NineSigma lets seekers post RFPs and solvers bid against them. YourEncore is a space for retired scientists and engineers who want to get back to solving problems. Online IP markets such as yet2.com help innovators who have great ideas navigate all the steps of locking down their IP.

“More and more often, innovation will be coming from outside the organization,” said Reynders, “so paying close attention to how these open innovation models are evolving is going to be very critical to all of us.”

Reasoning Layer: Accelerate Intuition
It was surprising to hear Reynders assert that one bottleneck in convergence is the human brain's ability to absorb and connect across multiple complex representations of information. For example, in 1997, IBM’s Deep Blue beat world chess champion Garry Kasparov, even though it ranked only in the middle of the TOP500 list of supercomputers and had less computing power than a modern laptop. How could a machine with such modest computational power defeat the human brain?

Reynders cited futurist Ray Kurzweil, who predicts that within ten years or so, the “Turing test” will be passed. “You’re [instant messaging] on two computer terminals,” said Reynders. “On the other side of the conversation, in another room, there’s a person at one terminal and a computer at the other, and you can’t tell which is which. That’s passing the Turing test.”

Doctors and clinicians of the future will not be replaced by computers, he said, but they will have the very best digital reasoning capacity available to help them sort through immense amounts of heterogeneous information to find the needle in the haystack. That reasoning layer will be the final layer, said Reynders. And it’s much more than “connecting the dots.” It involves connecting information and forming neural circuitry between concepts that can be traversed by any kind of reasoning platform.

“We can’t even start to think about the next layer until this capacity has been formed,” Reynders concluded. “This kind of convergence in the future – Ah, it gets exciting.”

_________________________________________________

This article appeared in Bio-IT World Magazine. 