Visualizing the Semantic Web


By Eric K. Neumann
August 8, 2007 | When working with data, visualizing it properly is just as important as processing it, especially for aligned (columnar) collections of records such as HTS assay, microarray, and clinical data. Today’s researchers have an arsenal of visualization tools at their disposal, ranging from spreadsheets, statistical graphs, and dynamic web pages to network views, heat maps, sector maps, and many more. In most cases these tools work best with homogeneous, flat data. But in the world of the Semantic Web, data is often heterogeneous and complex in structure, so the current ensemble of tools will work only for select slices of RDF data. The real trick is to take advantage of highly interconnected, typed data and to visualize it in a way that lets users easily choose the perspective that makes the most sense.

Many people assume that gaining semantics means losing visual understanding, which is why most of us like web pages. It’s as if we worry that making data machine-readable will cost us human comprehensibility. Yet there are many ways to see data and work with it, so in principle, if semantics can help describe the data, then viewing options should only get more flexible and compelling. What is really needed is a standard way for browsers to know how to handle certain kinds and sets of semantic data that come in the form of RDF, the same way style sheets are used for HTML and XML files.

One tool that offers an intuitive and natural way to view semantic data is EXHIBIT (http://simile.mit.edu/exhibit/), developed as part of MIT’s SIMILE project (http://simile.mit.edu/). EXHIBIT has two noteworthy features: it does not require a back-end database (relational or RDF triple-store) that an author must intricately connect to, and it allows anyone to render RDF data in a standard browser using simple HTML data templates and a set of pre-defined viewing modalities: tile views, tables, timelines, scatter plots, and bar graphs. What’s so cool about EXHIBIT is that once you’ve pointed it to your RDF file or data feed, it dynamically creates interactive ‘facets’ that you can use on the web page to select subsets of the data. Facets are selectable lists of the values that a data record may contain, such as “SMOKER” vs. “NON-SMOKER” for SMKCLASS or “MILD” vs. “SEVERE” for “ADVERSE EVENT”. Facets offer a fast and easy way to filter for records that have the desired combinations of values. Although EXHIBIT is based on JavaScript, it is self-contained, so you don’t have to call or write any additional JavaScript; just include a template tag whose attributes name the desired data properties (columns) contained in your RDF data records:

<!-- Tabular view: one column per RDF property listed in ex:columns -->
<div ex:role="exhibit-view"
     ex:viewClass="Exhibit.TabularView"
     ex:label="Subject Demography"
     ex:columns=".label, .AGE, .sex, .wt, .ht, .smkclass, .AE"
     ex:columnLabels="Subject, Age, Sex, Weight, Height, Smoker, Adverse Events"
     ex:sortColumn="1"
     ex:possibleOrders=".AGE"
     ex:sortAscending="true">
</div>

 

From this set of cues, EXHIBIT generates a fully styled HTML table of all your data, neatly formatted in rows and columns, with the RDF predicates you want as columns specified under ex:columns. To create facets, all you need to specify is:

 

<td width="20%" style="font-size:10px">
    <div ex:role="facet" ex:expression=".sex"></div>
    <div ex:role="facet" ex:expression=".smkclass"></div>
    <div ex:role="facet" ex:expression=".AE"></div>
    <div ex:role="facet" ex:expression=".treatment"></div>
</td>
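To make the idea of faceted filtering concrete, here is a minimal Python sketch of the logic a facet panel performs: count the values of a property, then keep only the records that match the selected values. It is not EXHIBIT's implementation, which runs as JavaScript in the browser. The records and the helper names `facet_counts` and `apply_facets` are hypothetical; only the field names echo the facet expressions above.

```python
from collections import Counter

# Hypothetical records; the field names mirror the ex:expression values above.
records = [
    {"label": "S001", "sex": "F", "smkclass": "SMOKER", "AE": "MILD"},
    {"label": "S002", "sex": "M", "smkclass": "NON-SMOKER", "AE": "SEVERE"},
    {"label": "S003", "sex": "F", "smkclass": "NON-SMOKER", "AE": "MILD"},
    {"label": "S004", "sex": "M", "smkclass": "SMOKER", "AE": "MILD"},
]


def facet_counts(items, field):
    """Count how many items carry each value of one field (one facet list)."""
    return Counter(item[field] for item in items if field in item)


def apply_facets(items, selections):
    """Keep items that match every facet; an empty selection does not restrict."""
    return [
        item
        for item in items
        if all(
            not chosen or item.get(field) in chosen
            for field, chosen in selections.items()
        )
    ]


# Select female subjects with mild adverse events, then recount another facet.
selected = apply_facets(records, {"sex": {"F"}, "AE": {"MILD"}})
print([r["label"] for r in selected])
print(facet_counts(selected, "smkclass"))
```

Choices within one facet are treated as alternatives and different facets are combined with AND, which is one common convention for faceted browsing and an assumption here.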

 

A useful companion technology is SIMILE’s Babel translator (http://simile.mit.edu/babel/), an online converter that can take your data, even spreadsheet files, and convert it into authentic RDF. I have used it to take several heterogeneous files of (synthetic) clinical trials data and merge them into one RDF document that can now be easily viewed using EXHIBIT. Using this approach, I can combine standard table documents from clinical trials and view them in a variety of pivotable ways (http://eneumann.org/exhibit/clinicaldemo/). Babel can also handle other formats, such as JSON (JavaScript Object Notation), though I should mention that it seems to work best with the Firefox browser. One can also call Babel via a web request, circumventing the need to go through the web page manually.
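The kind of conversion described here can be illustrated with a short Python sketch that turns tabular rows into RDF triples in N-Triples syntax and merges two files by taking the union of their triples. This is not Babel's code or API. The `http://example.org/trial/` namespace, the sample rows, and the function names are all assumptions made for illustration.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace and sample rows; neither comes from the article's data.
BASE = "http://example.org/trial/"

DEMOGRAPHY_CSV = """subject,AGE,sex
S001,54,F
S002,61,M
"""

EVENTS_CSV = """subject,AE
S001,MILD
"""


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_triples(csv_text, key_column):
    """Yield one N-Triples line per non-empty cell; the key column names the subject."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}subject/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            # Skip the key itself, overflow cells (column is None), and blanks.
            if column is None or column == key_column or not value:
                continue
            predicate = f"<{BASE}prop/{quote(column, safe='')}>"
            yield f'{subject} {predicate} "{escape_literal(value)}" .'


# Merging is a set union of triples, because both files share subject IRIs.
merged = sorted(
    set(rows_to_triples(DEMOGRAPHY_CSV, "subject"))
    | set(rows_to_triples(EVENTS_CSV, "subject"))
)
print("\n".join(merged))
```

Because every cell becomes a statement about a subject IRI, files with different columns merge without any schema alignment step, which is what makes heterogeneous tables easy to combine once they are expressed as RDF.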

Another system for viewing RDF data is TABULATOR (www.w3.org/2005/ajar/tab), an ongoing project by Tim Berners-Lee and his team to explore linked semantic data. This tool can be added to Firefox browsers as a plug-in, and it allows users to explore complex sets of RDF data in a clickable, navigational way that downloads and renders all the linked relations as an expanding tree. Since Semantic Web objects form a graph, the tree view often descends to a point that links back to something at the top; many find this a bit confusing. TABULATOR can create Google Maps views if location data is part of the data set, and it supports SPARQL queries. It is a tool designed for the technically savvy Semantic Web surfer.
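The back-link behavior of a tree view over graph data can be seen in a small Python sketch. It expands a node into an indented outline and flags any link that points back to a node already on the current path instead of expanding it again. This is not TABULATOR's implementation. The tiny graph, the `expand` function, and the depth limit are hypothetical.

```python
# Hypothetical graph: each subject maps to a list of (predicate, object) pairs.
graph = {
    "trial:T1": [("hasSubject", "subj:S001"), ("sponsor", "org:Acme")],
    "subj:S001": [("enrolledIn", "trial:T1"), ("sex", "F")],
}


def expand(node, path=(), depth=0, max_depth=5):
    """Return indented outline lines for node, flagging links back up the path."""
    lines = []
    indent = "  " * (depth + 1)
    for predicate, obj in graph.get(node, []):
        line = f"{indent}{predicate} -> {obj}"
        if obj == node or obj in path:
            # The object is an ancestor in this branch: mark it, do not recurse.
            lines.append(line + " (already shown above)")
            continue
        lines.append(line)
        if obj in graph and depth + 1 < max_depth:
            lines.extend(expand(obj, path + (node,), depth + 1, max_depth))
    return lines


outline = ["trial:T1"] + expand("trial:T1")
print("\n".join(outline))
```

Tracking the path from the root is enough to stop infinite expansion along a cycle. A node can still appear in several separate branches, which mirrors how a graph gets duplicated when it is flattened into a tree.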

Lastly, there is an effort to develop visualization standards for the Semantic Web called Fresnel, pronounced fre-nel’ (www.w3.org/2005/04/fresnel-info/). Fresnel defines a vocabulary for specifying how to select, render, and view RDF data so that it will work in different browser environments. To date, several browsing systems implement Fresnel: Cardovan (IBM), IsaViz (W3C/INRIA), Longwell / Piggy Bank (SIMILE/W3C/MIT), Arago (DERI), and Horus (Freie Universität Berlin). Semantic Web visualization is still a topic of research, but as more semantic data comes online, the use of and demand for such tools will grow proportionally. For now, there are some practical, easy-to-use prototypes that will enable researchers to better understand the advantages of Semantic Web information.

Eric K. Neumann is senior director product strategy at Teranode. E-mail: eneumann@teranode.com.

