
By Michael Athanas, Ph.D.

May 19, 2004 | One of the nastiest obstacles to making effective use of scalable computing infrastructures is enabling applications and workflows to execute in a parallel environment. What follows is a real recipe to quickly transform a cluster into a digital rendering farm. To try this, you will need a configured Unix cluster (Linux, OS X, etc.) with load-management software such as Sun GridEngine or Platform LSF installed.

This project is an explicit example of taking an inherently monolithic task and parallelizing it across a cluster. The monolithic task must be decomposed into individual pieces that can be computed independently of one another, and those pieces must then be distributed across the cluster for processing. This approach is referred to as parameter search parallelization; here it is applied to the digital rendering of a molecular structure. In this example, we use protein structure data from the RCSB Protein Data Bank (PDB) to render an MPEG movie that allows us to "fly" around the molecule.

Before diving into the parallelization steps, we have to prepare the input data. This requires a protein structure record from PDB. You can obtain one from the PDB home page. In the search box, enter a protein name or keyword, such as "hemoglobin," and then download the structure file to your cluster.

Ppovit is a simple but elegant command-line tool from Michigan State University's Department of Chemistry that interprets a PDB file and converts it into something our ray-tracing program can interpret. (You can obtain ppovit at

3-D MOVIE: With the right rendering tools, this hemoglobin becomes a rotating object.

To process a PDB structure file, execute:

ppovit mystructure.pdb > mystructure.pov

We will be using the command-line capabilities of Persistence of Vision Raytracer (Povray), "a high-quality, totally free tool for creating stunning three-dimensional graphics." Povray is available for most platforms. (For OS X, use the Unix source distribution to compile a command-line version of the binary.) Povray must be either installed on or accessible from each node in your cluster.

A slight modification to the mystructure.pov file we produced earlier is necessary in order to enable a fly-by view of the protein structure. Use your favorite editor and replace the camera declaration stanza with the following:

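#declare phi = 2*pi*clock/360.0;

camera {
  // The first vector component did not survive in the source;
  // (CameraDistance)*sin(phi) is an assumed reconstruction.
  location <(CameraDistance)*sin(phi), -20, -(CameraDistance)*cos(phi)>
  look_at <0., 0., 0.>
}
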

To tell Povray to ray-trace a 12-second movie of 360 frames using default resolution:

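povray +FP +KFI1 +KFF360 +KI1 +KF360 mystructure.pov

In this case, each frame is traced sequentially. +KFI and +KFF set the initial and final frame numbers, while +KI and +KF set the initial and final values of the clock parameter used in the camera stanza. Without +KI and +KF, Povray runs the clock from 0 to 1 over the whole animation; setting both pairs to 1 and 360 makes the clock equal to the frame number, so each frame advances the camera by one degree. As the clock parameter increases, we rotate left around the molecule. On my G5 workstation, it takes a little more than three seconds per frame, which adds up to about 20 minutes to ray-trace the whole 12-second movie.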

As previously stated, the goal is to parallelize the ray-tracing across the cluster for improved turnaround time. In this case, we will decompose the monolithic ray-tracing into individual-frame ray-tracing that can be executed on each CPU in the cluster.

Most load-management software suites provide mechanisms for parameter searches using job arrays. With a job array, a single command can launch a large number of jobs for simultaneous execution in a cluster. The load-management system issues a sequential job index number that may be used to set the search parameter. With Sun GridEngine, the environment variable SGE_TASK_ID is the search parameter index.

To launch a parallelized ray-trace across the cluster in a single command using Sun GridEngine:

qsub -cwd -o stdout -e stderr -t 1-36 <enter>

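povray +FP +KFI${SGE_TASK_ID}0 +KFF${SGE_TASK_ID}9 +KI${SGE_TASK_ID}0 +KF${SGE_TASK_ID}9 mystructure.pov


In this case, the first line defines the job submission parameters. The -t flag defines the range of values for the parameter search index. After you press <enter>, type the command line that is to be executed. In this example, Povray traces 10 frames for each job fragment: task 1 renders frames 10 through 19, and task 36 renders frames 360 through 369. The +KI and +KF switches set the clock to the frame number within each fragment, so the camera angle follows the frame index. Close and submit by pressing Control-D.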
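The same submission can also be kept in a script file, which is easier to rerun and to test. The following is a minimal sketch, not a definitive implementation: the file name render_chunk.sh and the helper functions are hypothetical, and the +KI and +KF switches are included on the assumption that the clock should equal the frame number. Outside an array task, the script only prints the command it would run.

```shell
#!/bin/bash
# render_chunk.sh (hypothetical name): job-array script for Sun GridEngine.
# Lines starting with "#$" are read by qsub as submission options.
#$ -cwd
#$ -o stdout
#$ -e stderr
#$ -t 1-36

# Map a task index to its first and last frame: task 1 -> 10 19, task 36 -> 360 369.
frames_for_task() {
    echo "${1}0 ${1}9"
}

# Build the Povray command line for one task.
# +KFI/+KFF set the frame range; +KI/+KF set the clock to the same values
# (an assumption, so that clock equals the frame number).
render_command() {
    local first last
    read -r first last <<< "$(frames_for_task "$1")"
    echo povray +FP "+KFI$first" "+KFF$last" "+KI$first" "+KF$last" mystructure.pov
}

case "${SGE_TASK_ID:-}" in
    ''|*[!0-9]*)
        # Not running as an array task: preview the command for task 1.
        render_command 1
        ;;
    *)
        # Running under the scheduler: execute the command for this task.
        read -r -a cmd <<< "$(render_command "$SGE_TASK_ID")"
        "${cmd[@]}"
        ;;
esac
```

Under these assumptions, the job array would be submitted with `qsub render_chunk.sh`, and running the script by hand shows the command for the first fragment without starting a render.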

To compare execution speeds, I ran this job on a 16-node G5 cluster. The turnaround time was almost three minutes.
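To put the two timings side by side, here is a rough estimate of the speedup. The inputs are approximations taken from the text: 3.3 seconds per frame stands in for "a little more than three seconds," and 3 minutes stands in for "almost three minutes." The function name is hypothetical.

```shell
#!/bin/bash
# Estimate the sequential run time and the speedup on the cluster.
# Arguments: frame count, seconds per frame, cluster turnaround in minutes.
estimate_speedup() {
    awk -v f="$1" -v s="$2" -v c="$3" 'BEGIN {
        serial = f * s / 60          # sequential time in minutes
        printf "serial_min=%.1f speedup=%.1f\n", serial, serial / c
    }'
}

# Assumed inputs: 360 frames, ~3.3 s per frame, ~3 min on the cluster.
estimate_speedup 360 3.3 3
```

With these inputs the estimate comes to roughly 20 minutes sequentially and a speedup of between six and seven times. The gap between that figure and the size of the cluster reflects job startup, scheduling, and file-system overhead, which this simple decomposition does not try to reduce.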

For the final step, the Povray output must be converted into an easily viewable format. Many tools are available for this. A simple command-line tool called mpeg2encode is available from the MPEG Software Simulation Group. The result is an MPEG movie that is viewable from practically any Web browser.
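Before encoding, it is worth confirming that every job fragment actually produced its frames, since one failed task leaves a gap in the movie. The sketch below lists missing frame numbers. It assumes the frames are PPM files named with the scene name followed by the frame number, with or without zero padding; check the names your Povray version writes. The function name is hypothetical.

```shell
#!/bin/bash
# Print every frame number in [first, last] that has no image file.
# Arguments: directory, file-name prefix, first frame, last frame.
missing_frames() {
    local dir=$1 prefix=$2 first=$3 last=$4
    local n f num found
    for ((n = first; n <= last; n++)); do
        found=0
        for f in "$dir/$prefix"*.ppm; do
            [ -e "$f" ] || continue
            num=${f##*/}              # drop the directory part
            num=${num#"$prefix"}      # drop the scene-name prefix
            num=${num%.ppm}           # drop the extension
            case $num in ''|*[!0-9]*) continue ;; esac
            # 10# forces base 10 so zero-padded numbers are not read as octal.
            if ((10#$num == n)); then found=1; break; fi
        done
        ((found)) || echo "$n"
    done
}
```

For the job array in this example, `missing_frames . mystructure 10 369` would print nothing when all 360 frames are present.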

The parameter search decomposition technique can be used with applications where the execution can be restricted based upon input parameters. For example, the decomposition of a BLAST based upon individual input queries works quite well for job arrays.
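As an illustration of the BLAST case, the input queries can be split into numbered chunk files so that each array task picks up the chunk matching its index. This is a sketch under stated assumptions: the queries are in a single multi-sequence FASTA file, and the function name and output file names are hypothetical. The BLAST invocation itself is left out.

```shell
#!/bin/bash
# Split a multi-sequence FASTA file into chunks of at most N records.
# Arguments: input file, records per chunk, output prefix.
# Writes <prefix>.1.fa, <prefix>.2.fa, ... (hypothetical naming).
split_fasta() {
    local input=$1 per_chunk=$2 outprefix=$3
    awk -v n="$per_chunk" -v p="$outprefix" '
        /^>/ {
            # Start a new chunk file every n header lines.
            if (count % n == 0) {
                if (out != "") close(out)
                out = sprintf("%s.%d.fa", p, int(count / n) + 1)
            }
            count++
        }
        # Copy each line to the current chunk; skip anything before the first header.
        out != "" { print > out }
    ' "$input"
}
```

After `split_fasta queries.fa 10 chunk`, a job array submitted with -t 1-N, where N is the number of chunk files, could have each task read chunk.${SGE_TASK_ID}.fa as its input.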

There are many optimizations to accelerate Povray on a cluster beyond the simple one-line decomposition shown above. A better understanding of your data, your algorithms, and your compute resources will guide effective acceleration techniques. For additional resources, see

Michael Athanas is a founding partner and principal investigator at The BioTeam. E-mail: 
