Taking a World Genography Test


Interview by Kevin Davies

June 14, 2005 | In April, IBM announced that it was partnering with the National Geographic Society to provide the computational and IT infrastructure for the Genographic Project. The 5-year, $40-million program is intended to type DNA samples from indigenous populations around the world, thereby helping to recreate the movement and development of human populations around the globe over the past 100,000 years. The IBM Program Director is Kristopher Lichter. An immunologist by training, Lichter started in biotech before joining IBM six years ago. As Kevin Davies discovered, Lichter hopes his combination of life science and IBM experience will enable him to manage IBM’s multifaceted contributions to the project.

Q: How did IBM get involved in the National Genographic Project?
A: National Geographic approached us. Spencer Wells had finished his pilot project and was looking to open the results to the world and get a larger sample population. They realized that, to be as comprehensive and as scientifically valid as possible, they needed an IT partner — someone with life science expertise who could understand why IT systems could be applicable. Their first choice was to come to us, over a year ago. We had many discussions, but we felt we were compelled to do this. We were instantly excited.

Why so much excitement for such a purely scientific project?
It really represents the type of innovative project that impacts the world in a positive manner — whether it’s through our life science expertise, computational biology center and advanced algorithms, or general IT solutions to make sure the system is as advanced, secure, and flexible as it should be. This project will be evolving over five years. This is exactly what we’re about. It’s in our DNA!

What’s the response been like inside the company?
This is why I work with IBM, to move the dial in a positive way. We’ve received hundreds of e-mails from colleagues. A note went out from [CEO] Sam Palmisano encouraging people, and the response from IBM employees has been overwhelming.

Was there any internal debate about what IBM gets out of this?
Of course, there’s been discussion, but we actually grew the course of our participation over that discussion. The National Genographic Project had a less expansive role in mind for us originally, but during that discussion, we said, we have this Computational Biology Center (CBC), and that inspired the internal and external decision-making process. There was never any doubt we would do this. It advances our application of technologies from the CBC. We get to learn, and learning is part of who we are. We can apply these advanced technologies. And you never know what you’ll learn along the way.

What is the nature of IBM’s contribution?
IBM will be participating through IBM Research, Carol Kovac’s Life Sciences organization, and our CIO’s office, which provides the IT design systems. It takes a lot of coordination. We have to work well with the National Geographic Society and principal investigators around the world. It needs to move on with the integrity it requires. Also, through the IBM Foundation, we are directly funding the project, along with the IT systems, people, and time to make the project come together: [CBC Director] Ajay Royyuru, the CBC’s time, the technologies, and so forth.
We’re setting up a project that’s at least five years long. We intend to turn this data out to the public; it’s owned by the world at large. People will come along after us. We’re just trying to plant a very strong sapling.

Which IT and informatics systems are you contributing to the project?
We’re developing a state-of-the-art analytical capability, and as the field evolves, we’ll work to make those technologies better. As the IT partner, IBM will always be providing a system one step ahead of where the scientists need to be. We’ll look to apply Blue Gene if it’s relevant. The central repository will house all electronic samples. We have the data collection systems of each of the principal investigators, and the communication protocols to make sure the data are securely transferred and stored at the National Geographic Society headquarters [in Washington, DC]. Those IBM systems include a Linux-based BladeCenter, eServer xSeries linked with DB2 and WebSphere, as well as the MQ product for communication. There’s over 1 terabyte of attached storage.

What type of DNA data are you collecting?
It’s specifically Y-chromosome and mitochondrial [paternally and maternally inherited] DNA, to make sure we keep to specific nonmedical markers that are strictly built around descent. This helps people understand the scope of the project.

What’s the purpose of the $99.95 public participation kits?

I do think people are fascinated to know about common roots. You can become an associate researcher in this project. It’s a way for the public to get involved while learning about their deep ancestry. The response has been overwhelming. It adds to the sample base and the science. It complements the indigenous populations we’re working with. There’s a growth to the project. There’s also a feeling among the team that the legacy project needed to be done. Proceeds [from sale of the public participation kits] go back to critical infrastructure to help indigenous populations. Everybody wins. It helps the public get involved.

How will the data be shared with the research community?
Once we’ve done the initial scans, the data will be turned out to the broader community. It’s hard to know when that will be; it depends on sample collection, analysis, et cetera, but it will become public domain, a commons ownership. We’re not patenting or owning the data. We have anthropologists, paleontologists, and linguists on the advisory board to keep the broader picture in mind. The first results will be published by the Genographic Project team, and that will be the first glimpse of the data.

With this mountain of data, isn’t there a temptation to apply it for medical use, in a sort of global biobank?

Absolutely not. The integrity of the [Genographic] project is enough. There’s a tremendous amount to be gained by doing valid science for the world, letting that be owned by everyone. Let it be the legacy for the planet. It [medical use] doesn’t compute.

How are you avoiding the controversy and pitfalls that befell the Human Genome Diversity Project (HGDP) a decade ago?
In general, there are high-level goals that are similar [to the HGDP], using genetics to understand humanity better. That’s common, but there are specific differences. The specifics of this project, the clarity of the mission — we’re just studying the human journey; there’s no medical research, no intention of that at all. We won’t be owning the data.

We have indigenous leaders on the advisory board; it’s very important to be working with them right off the bat. There’s no Genographic Project without the advice and leadership of those leaders. It’s that simple. There was never a point at which we weren’t going to be applying cultural sensitivity. The institutional review board at the University of Pennsylvania, to which we applied, stressed that this was an important part of the project. How we do the sample collection, how we articulate the project (on a strictly voluntary basis), or how the funding gets applied: we look to the indigenous leaders to tell us.

Spencer Wells has done a lot of very positive groundwork in his previous studies — there is trust there. In general, the response has been extremely positive. Scientists in regions not directly involved, in the field, say we’ve done a very solid job of teaching the scope of the project. It’s been extremely positive.

For reprints and/or copyright permission, please contact The YGS Group, 1808 Colonial Village Lane, Lancaster, PA; (717) 399-1900 ext. 125, or via email to Ashley.Zander@theYGSgroup.com.