The platform's most important feature is its massive scalability, according to Isaac Bentwich, Rosetta's founder, chairman, and CEO. "If you want to scale up a mainframe-based system, you simply can't line up 100 mainframes next to each other. Working with databases, this is quite possible," Bentwich explains. "Our platform facilitates easy chaining of additional servers to multiply the processing power at approximately one-fifth the cost." That represents a huge savings for extremely heavy genomic data-mining tasks, which may cost as much as $10 million.
Microsoft Corp.'s SQL Server 2000 was originally built to handle 2 billion records. Rosetta Genomics approached the software giant and presented it with real-life customer applications that would require a 100-billion-record capacity. Together, they overcame the technological barriers to increasing the size and performance of the database, which is now capable of performing functions such as simulation of the genome and execution of complex queries on billions of records.
Rosetta Genomics has also developed a genomic query accelerator that improves complex query times by as much as 900 percent, as well as a method of indexable genomic data compression that achieves 40 percent compression while still allowing the data to be fully indexable.
"Currently, we have one database instance measured at 2TB and 20 billion individual records, partitioned into 16 data file groups, each comprising eight files," Bentwich says. "There is also one temporary database file group, comprising four files, and one log file group comprising four files. Primary data is stored in one file group, comprising four files."
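The figures Bentwich quotes lend themselves to a quick back-of-the-envelope check. The sketch below is illustrative only: it assumes decimal terabytes (10^12 bytes) and treats the 2TB as spread evenly over the 20 billion records, so the per-record average includes whatever index and storage overhead the real database carries. The function names are invented for this example.

```python
# Figures quoted in the article.
DATA_FILE_GROUPS = 16
FILES_PER_DATA_GROUP = 8
TEMP_FILES = 4
LOG_FILES = 4
PRIMARY_FILES = 4

DB_SIZE_BYTES = 2 * 10**12   # assumption: 1TB = 10**12 bytes
RECORD_COUNT = 20 * 10**9    # 20 billion records


def total_files() -> int:
    """Count every file across the data, temp, log, and primary file groups."""
    data_files = DATA_FILE_GROUPS * FILES_PER_DATA_GROUP
    return data_files + TEMP_FILES + LOG_FILES + PRIMARY_FILES


def avg_bytes_per_record() -> float:
    """Average on-disk footprint per record, overhead included."""
    return DB_SIZE_BYTES / RECORD_COUNT


def projected_size_tb(records: int) -> float:
    """Naive linear projection of database size for a given record count."""
    return records * avg_bytes_per_record() / 10**12


if __name__ == "__main__":
    print("files:", total_files())
    print("bytes per record:", avg_bytes_per_record())
    print("TB at 100 billion records:", projected_size_tb(100 * 10**9))
```

Under these assumptions the layout comes to 140 files and roughly 100 bytes per record, so a linear projection to 100 billion records lands at about 10TB. Real growth would depend on indexing and compression, which this estimate ignores.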
Bentwich says that the company will be able to scale to 100 terabytes on high-volume storage (such as two EMC Symmetrix machines) working with two 16-processor servers or four eight-processor servers, the Microsoft Windows 2000 Datacenter Server operating system, and possibly Distributed Partitioned Views technology.
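The idea behind distributed partitioned views in SQL Server 2000 is that one logical table is split by key range across member servers, and a query touches only the servers whose ranges it needs. The sketch below mimics that routing in plain Python. It is not Rosetta's design: the class, the server names, and the split points are hypothetical, and the four-server layout is simply borrowed from the "four eight-processor servers" option mentioned above.

```python
from bisect import bisect_right


class PartitionedView:
    """Toy model of range-based routing across member servers (hypothetical)."""

    def __init__(self, boundaries, servers):
        # boundaries: sorted split points; server i holds keys below boundaries[i],
        # and the last server holds everything from the final boundary upward.
        if len(servers) != len(boundaries) + 1:
            raise ValueError("need exactly one more server than boundaries")
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted")
        self.boundaries = list(boundaries)
        self.servers = list(servers)

    def server_for(self, key):
        """Return the single server responsible for one record key."""
        return self.servers[bisect_right(self.boundaries, key)]

    def servers_for_range(self, low, high):
        """Return the servers a query over keys low..high (inclusive) must visit."""
        first = bisect_right(self.boundaries, low)
        last = bisect_right(self.boundaries, high)
        return self.servers[first:last + 1]


# Hypothetical layout: 100 billion record keys split evenly over four servers.
BILLION = 10**9
view = PartitionedView(
    boundaries=[25 * BILLION, 50 * BILLION, 75 * BILLION],
    servers=["srv1", "srv2", "srv3", "srv4"],
)

if __name__ == "__main__":
    print(view.server_for(0))
    print(view.servers_for_range(10 * BILLION, 30 * BILLION))
```

A lookup for a single key goes to one server, while a range query that straddles a split point fans out only to the servers covering that range, which is what allows capacity to grow by adding machines rather than enlarging one.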
Rosetta Genomics also aspires to be a gene discovery company, not only supplying the technology but also using it to discover novel disease-related genes.
—Hillel Alpert