Earlham Institute Presents Sequence Alignment Using Optalysys’ Optical Correlator Computing System

September 18, 2017

By Benjamin Ross

2017 Best Practices Awards | The Earlham Institute, a life science research institute based in Norwich, England, is working with Optalysys, a start-up based in Yorkshire, England, to apply Optalysys’ optical computing system to life science applications. The first release of the computing system is scheduled for early 2018, and both organizations say it will be capable of performing a number of DNA-based local sequence alignment applications.

Optalysys' optical computing system, a mobile solution the size of a graphics card, uses a series of electro-optic components, including lasers, Spatial Light Modulators (SLMs), lenses, and cameras, to correlate patterns within pairs of images. The camera captures each correlation as a point of high-intensity light.
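To make the idea of correlation concrete, here is a minimal digital sketch of what a correlator computes when the "images" are DNA sequences. It is an illustration only, not Optalysys' implementation: the function names (`one_hot`, `correlate`, `peaks`), the one-hot encoding of bases, and the threshold are all assumptions made for this example, and the optical system performs the equivalent operation with light rather than with loops.

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as four 0/1 channels, one per base (assumed encoding)."""
    return {b: [1 if c == b else 0 for c in seq] for b in BASES}

def correlate(query, target):
    """Slide the query along the target and count matching bases at each offset.

    Each score is the cross-correlation of the one-hot channels, summed over
    the four bases. A high score plays the role of a bright spot on the camera.
    """
    q, t = one_hot(query), one_hot(target)
    n_offsets = len(target) - len(query) + 1
    scores = []
    for off in range(n_offsets):
        score = sum(
            q[b][i] * t[b][off + i]
            for b in BASES
            for i in range(len(query))
        )
        scores.append(score)
    return scores

def peaks(scores, threshold):
    """Return (offset, score) pairs whose score reaches the threshold."""
    return [(off, s) for off, s in enumerate(scores) if s >= threshold]

# Hypothetical data: the query differs from the target region at one base.
target = "GGATCCACGTTGCAAT"
query = "ACGTAGCA"
scores = correlate(query, target)
print(peaks(scores, threshold=6))
```

Because the score is a count of matching positions rather than an exact-match test, a single mismatch lowers the peak slightly instead of hiding it, which is one way to picture the sensitivity described in the article.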

Optalysys hopes to provide high computational performance with low power consumption, especially compared with high-performance computing data centers. The device should enable supercomputing performance on the desktop, without the need for additional cooling.

Both organizations have worked to apply the technology to the task of aligning sequences. Early tests indicate the technology has high sensitivity, meaning it can detect alignments that other approaches would miss. According to both organizations, this is important for a whole range of bioinformatics tasks, such as taxonomic classification, long-read alignment, and inferring evolutionary relationships between distantly related species.

In a press release, Optalysys said that sequence alignment is the first of many application areas being targeted. The company envisions the technology being integrated with standard computer systems across a range of demanding industrial and scientific areas that are becoming limited by the slowing pace of progress in electronic computing.

From the Earlham Institute’s perspective, a technology that approaches computing in a whole new way was an interesting prospect. In 2015 the Earlham Institute partnered with Optalysys to form Project GENESYS, a UK government-funded (Innovate UK) collaboration with the objective of producing a working prototype for sequence alignment.

“Optalysys were invited to the Earlham Institute with the opportunity to present their optical processing technology,” Daniel Mapleson, the Analysis Pipeline Project Leader at the Earlham Institute, said when speaking with Bio-IT World. “We could see immediately that [Optalysys' technology] offers the ability to do pattern matching at scale and speed… The idea we had in mind originally was that we could replace some very slow-running, and energy consuming, alignment processes with this technology.”

Mapleson pointed to the BLAST sequence alignment tool as an example of the limitations in electronic computing. “BLAST has been around a long time, since the 90s,” he said. “It historically offered a good compromise between speed, sensitivity and flexibility. Indeed, the reason it was developed was because existing methods, such as the Smith-Waterman algorithm were too slow. However, the last decade has seen the proliferation of Next Generation Sequencing devices, capable of producing huge amounts of data, and BLAST is simply not able to process the data fast enough.”

Because BLAST could not keep up with advances in sequencing technology, Mapleson says, companies looked for even faster tools that could process the data in a reasonable amount of time. The speed they gained, though, came at the cost of sensitivity in the data analysis.

“The optical correlator has some advantages over traditional approaches,” Nick New, CEO of Optalysys, told Bio-IT World. “First, it is tolerant towards mismatches and indels, allowing us to find alignments that other approaches would miss. Second, it is inherently parallel allowing us to find seeds for multiple queries simultaneously. Finally, target sequence indexing is not required.” The optical computing system could also be applied to many types of sequence alignment, such as protein or transcript alignment, and isn't restricted to any particular organism.

The Future Of Wheat

One such organism is bread wheat, which was the subject of the Earlham Institute’s project that won the 2017 Bio-IT World Best Practices Award for IT Infrastructure/HPC this past May in Boston.


The Earlham Institute successfully assembled the bread wheat genome for the first time in 2015 by exploiting “leading edge” HPC infrastructure, according to the Earlham Institute’s Best Practices entry form.

The wheat genome, which is five times bigger than the human genome, is complex and full of repetitive elements, making it difficult for the Earlham Institute to tap into its genomic sequence. The Earlham Institute was exploring new variations of wheat that exhibit the traits that will help improve the crop's durability in the face of disease and climate change.

The project included what the Earlham Institute called one of the largest single-system deployments of Intel NVMe SSD storage worldwide, as well as the first UK instance of the Edico Genome DRAGEN bio-IT processor to perform mapping on the wheat reference in record time.

Mapleson sees optical processing as being able to handle the same obstacles researchers faced when tackling the wheat genome, although he doesn’t want to get too far ahead of himself. “Optical processing could be one such strategy,” Mapleson said via email. “However, while processing genomes the size of wheat is out of scope for our current Innovate UK project, we still have some work to do to ensure the technology can scale to datasets of that size.”