A CAREER Award for CSE's Prof. Heflin
To sports car enthusiasts, football fans and wildlife specialists, the word jaguar connotes highly discrete entities.
Realtors and home buyers argue over titles and plots, as do book lovers and moviegoers.
If the English language, with millions of shades of meaning, can baffle the wisest of scholars, how much more does it confound an artificially intelligent computer search engine that must find links to thousands of Web sites in an eyeblink?
Especially, says Jeff Heflin, when the Web's frontiers expand hourly by leaps and bounds and are governed by no standard rules of searching.
Heflin, an assistant professor of computer science and engineering, is part of an international effort to build a "Semantic Web," a term coined by Tim Berners-Lee, inventor of the World Wide Web.
By developing languages and tools that make it easier for computers to understand web pages, says Heflin, computer scientists hope to upgrade the Web from a "web of links" to a "web of meaning."
Heflin recently received a five-year, $500,000 CAREER Award from the National Science Foundation to study distributed ontologies that could bring the Semantic Web closer to reality.
Ontology is defined in the dictionary as a theory concerning the kinds of entities, specifically abstract entities, that should be admitted into a language system.
Researchers in artificial intelligence have proposed to make ontologies explicit, says Heflin. In computer science, an ontology encodes knowledge about the world in a form that software can reason over, so that a program can determine what is implied and find answers without explicit instructions. Ontologies can be used by people, and by databases and other applications that need to share information about domains, that is, specific subject or knowledge areas such as cars, medicine or real estate.
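To make "determining what is implied" concrete, here is a minimal sketch in Python. It is an illustration only and is not drawn from Heflin's work or from any ontology language: the class names and the `SUBCLASS_OF`, `ancestors` and `is_a` names are hypothetical. The toy ontology states only direct parent classes, and the program infers the rest by following those links.

```python
# Hypothetical toy ontology: each class maps to its directly stated parents.
SUBCLASS_OF = {
    "JaguarXK": {"SportsCar"},
    "SportsCar": {"Car"},
    "Car": {"Vehicle"},
}


def ancestors(cls, subclass_of):
    """Return every class implied for cls by following subclass links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_a(cls, target, subclass_of=SUBCLASS_OF):
    """True if cls is target or can be inferred to be a kind of target."""
    return cls == target or target in ancestors(cls, subclass_of)
```

Nothing in the data says outright that a JaguarXK is a Vehicle, yet `is_a("JaguarXK", "Vehicle")` is true because the chain of stated facts implies it.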
But because the Web is such a vast network, with so many people posting and searching for so many different kinds of information, agreeing on one single ontology that applies to everyone would require too large a standardization effort. Instead, the Web overflows with overlapping and often contradictory ontologies.
"No formal theory," Heflin wrote in his proposal to NSF, "has considered how ontologies can be integrated and how they may change, or the role of trust in integration."
This gap, he says, can frustrate those who are conducting a search on the Web.
"Searching can be difficult if you're a novice," says Heflin, "if you're looking for really hard-to-find things, or if you're looking for an answer that does not exist on one single page."
Search engines on the Semantic Web, says Heflin, can better infer from a query the site that would be most helpful to the user. They will also be able to combine different pieces of information from multiple sites in order to find an answer to a question.
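Combining pieces of information from multiple sites can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any real Semantic Web search engine: the two "sites", the subject-predicate-object facts they publish, and the `objects` and `universities_for` helpers are all hypothetical.

```python
# Hypothetical facts, written as (subject, predicate, object) triples,
# as if published by two separate web sites.
SITE_A = [("prof_smith", "teaches", "course_101")]
SITE_B = [("course_101", "offered_by", "example_university")]


def objects(triples, subject, predicate):
    """Return all objects recorded for a given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def universities_for(professor, *sources):
    """Answer 'where does this professor teach?' by chaining facts."""
    merged = [triple for source in sources for triple in source]
    result = set()
    for course in objects(merged, professor, "teaches"):
        result |= objects(merged, course, "offered_by")
    return result
```

Neither site alone holds the answer: `universities_for("prof_smith", SITE_A)` returns an empty set, while passing both `SITE_A` and `SITE_B` yields the university.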
Heflin is an invited expert in the Web Ontology Working Group, which was formed by the World Wide Web Consortium (W3C). Directed by Berners-Lee, the W3C develops standards for the Web. The working group designed the Web Ontology Language, or OWL. Although this name is not a proper acronym, its inventors decided that OWL rolls off the tongue more readily, and connotes more wisdom, than does WOL.
The success of a web search, says Heflin, now depends on the rarity of a name or topic or on how popular a web page is. Google, for example, takes into account the number of pages that point to a given page when ranking the results of a search.
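The idea of ranking by the number of pages that point to a page can be sketched in a few lines. This is a deliberately crude illustration that only counts inbound links; it is not Google's actual algorithm, and the page names and the `rank_by_inbound_links` function are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "a.example": ["c.example", "b.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example", "c.example"],
}


def rank_by_inbound_links(links):
    """Order pages by how many distinct other pages link to them."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        # Count each linking page once and ignore self-links.
        for target in set(targets):
            if target != source:
                counts[target] += 1
    # Most-linked pages first; ties are broken alphabetically.
    return sorted(counts, key=lambda page: (-counts[page], page))
```

A ranking like this reflects popularity only; it says nothing about what a page means, which is the gap the Semantic Web effort aims to fill.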
"We have a language in which to write an ontology," says Heflin, "but we don't know how to properly combine [distributed] ontologies, or what to do when ontologies are contradictory. We need an environment that people can search in that resolves contradictions meaningfully."
Heflin wants to look at ways of partitioning the Web into useful subsets, so that users with a query can determine which ontology to use and be pointed to the web page best suited to the perspective of their search.
"I want to develop an underlying theory so we can understand and build a system that can handle large amounts of data," says Heflin. "That system should be able to look at medicine from the point of view of a patient, a doctor or a pharmaceutical manufacturer, or to search universities from a professor's or a student's point of view."
Posted on Friday, October 01, 2004