Adding openness to a closed world
Publication date: Feb 09, 2010
Traditionally information architects model information using a closed world assumption (when one models using relational or OO principles). Increasingly www-inspired description languages (e.g. OWL) are used, which are based on open world assumptions. These two worlds will have to be bridged to take full advantage of the new possibilities.
This is a guest contribution by Peter Hendler MD, practicing physician, computer scientist and chair of the HL7 RIMBAA working group. Peter just started working on a project to move SNOMED-CT to OWL, which inspired him to write this column.
The Closed World
Software developers fall into different groups according to how they look at the world. By “look at the world”, I mean, how they picture the model of the world in their minds when they program some domain. The approach they take to modeling a domain is heavily influenced by how they perceive the world and the tools they use. Much like the old saying, “if you only know a hammer, everything looks like a nail”.
The historically oldest group, and probably still the most prominent group, are the relational database people. They think of the world as something that can be described by tables of columns and rows. Within HL7 the impact of relational models can be seen in the structure of the HL7 version 2 standard.
The second large group of programmers/modelers are the object oriented (OO) folks. They think of modeling the world in Classes and Objects. For example, a Cat is a class that is a subclass of Mammal, which is a subclass of Vertebrate, which is a subclass of Animal. A Class can inherit properties from its ancestors. For example, the sex of a Cat is not specific to Cat; it is more accurately added at the level of Animal. An “object” is an “instance” of a Class. So Fluffy and Boots are Objects of the class Cat. The HL7 version 3 standard uses object oriented principles.
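The class hierarchy described above can be sketched in a few lines of Python. This is only an illustration of the OO view; the constructor arguments and attribute names are assumptions made for the example.

```python
class Animal:
    def __init__(self, name, sex):
        self.name = name
        # "sex" is declared once, at the most general level, and inherited below.
        self.sex = sex


class Vertebrate(Animal):
    pass


class Mammal(Vertebrate):
    pass


class Cat(Mammal):
    pass


# Fluffy and Boots are objects (instances) of the class Cat.
fluffy = Cat("Fluffy", "female")
boots = Cat("Boots", "male")
```

Cat itself declares nothing, yet every Cat object carries the sex property because it is inherited from Animal.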
There are many programmers who are comfortable in both of these worlds. And Object to Relational mapping is a big subject in programming today. Much has been written about the "impedance mismatch" when you map relational databases to object oriented structures.
In these worlds, nothing is assumed. Something is true only if you say it is. This is called a “closed world assumption”. When writing a program or a database application, the developer is basically like god. He can declare what is true and what is not true in his model. He is the sole creator of his model. If the model doesn’t declare something as true, then you can safely assume it’s false. The closed world assumption holds when one person or organization controls the creation of a data model.
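A minimal sketch of what the closed world assumption means for a query, assuming a toy fact store made of (subject, relation, object) tuples. The store and the function name are hypothetical and only serve the illustration.

```python
# Everything the model declares to be true.
facts = {
    ("Fluffy", "is_a", "Cat"),
    ("Boots", "is_a", "Cat"),
}


def holds_closed_world(fact):
    # Closed world: a fact that was not declared is simply false.
    return fact in facts
```

Asking whether ("Fluffy", "is_a", "Dog") holds yields False, not "unknown", because the model never said so.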
The Open World
The World Wide Web (www) and its nature is a prime example of a world where the "closed world assumption" doesn't hold. In the World Wide Web anybody can make a web page. This drastically changes the rules and basically means that anyone can say anything about any subject.
It’s quite possible that two web developers who have never met would both be modeling something similar. It’s also unlikely that they would both come up with the same names for their classes. For example, someone might call a class Client and another programmer might model the same thing but call it Customer instead. The important point is that there is no one governing or coordinating the way things are named or modeled. In a closed world (database or Object model), you can assume that things with different names are different things. On the other hand, in the “open world” on the www, you can’t make that assumption at all. This is what is meant by the “open world assumption”.
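The difference in how names are treated can be sketched as follows. Both functions are hypothetical helpers written for this illustration; in the open-world version, None stands for "unknown".

```python
from typing import Optional


def same_thing_closed(a: str, b: str) -> bool:
    # Closed world: different names are taken to mean different things.
    return a == b


def same_thing_open(a, b, same_as, different_from) -> Optional[bool]:
    # Open world: names alone prove nothing; only explicit statements count.
    pair = frozenset({a, b})
    if a == b or pair in same_as:
        return True
    if pair in different_from:
        return False
    return None  # nobody has said either way
```

With no statements at all, same_thing_open("Client", "Customer", set(), set()) returns None: the two names might or might not denote the same class.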
There is a new kind of information engineer that uses models based on the "open world assumption". These are the Description Logic (DL) people. They are currently rare. These DL people also talk of Classes, but the behavior of these “Classes” is very different from the Object Oriented idea of a Class.
When you talk to programmers, many of them are familiar to some degree with Relational Databases and OO (e.g. the RIM and Java). It’s pretty rare to find people who live in an open world. For the most part the superstars of the open world are not in medicine at all; they are the ones who write the reasoners and specify the OWL language (a generic DL). They are members of the World Wide Web Consortium (W3C). They don’t know much about clinical medicine, but they sure know how the open world works.
It takes a few years for a developer to understand and be comfortable in any or each of these views of the world (relational, OO, or DL). If you are a good object oriented programmer, you may get badly burned when you start to play in the open world.
For example, an object oriented programmer might look at a terminology that is based on a Description Logic, such as SNOMED-CT. In a closed world you could model one term as "Infectious Disorder of Lung", and then you could model another term as "Neoplastic Disorder of Lung". In the object oriented world, if you were to say, "give me a Non Infectious Disorder of the Lung", you would logically expect to get the answer "Neoplastic Disorder of Lung". But in SNOMED (or in OWL) that would not happen. Why? Because you are now in the open world. The names themselves have no meaning to a machine.
The machine doesn’t have any way of knowing whether Neoplastic and Infectious are just two words for the same thing, just like Customer and Client are two words for the same thing.
So the machine says to itself: “For all I know, a Neoplastic Disease may be exactly the same thing as an Infectious Disease”. In order for the classifier to get the expected answer you have to do something to “close” this aspect of the world. You have to tell the model that Neoplasm is “disjoint” or different from Infection. Then you will get the right answer. Until you understand how this works, you will tear your hair out.
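The effect of the disjointness statement can be sketched with a toy three-valued check. This is not how a real DL classifier works internally, and the class names below are hypothetical labels rather than actual SNOMED-CT content; None stands for "cannot be concluded".

```python
from typing import Optional

# Stated superclasses of each term (toy data, assumed for this example).
SUPERCLASSES = {
    "InfectiousDisorderOfLung": {"DisorderOfLung", "InfectiousDisorder"},
    "NeoplasticDisorderOfLung": {"DisorderOfLung", "NeoplasticDisorder"},
}


def is_non_infectious_closed(term) -> bool:
    # Closed world: not stated to be infectious means not infectious.
    parents = SUPERCLASSES[term]
    return "DisorderOfLung" in parents and "InfectiousDisorder" not in parents


def is_non_infectious_open(term, disjoint_pairs) -> Optional[bool]:
    # Open world: "not infectious" must be provable from stated axioms.
    parents = SUPERCLASSES[term]
    if "DisorderOfLung" not in parents:
        return None
    if "InfectiousDisorder" in parents:
        return False
    for parent in parents:
        if frozenset({parent, "InfectiousDisorder"}) in disjoint_pairs:
            return True  # a parent is declared disjoint from InfectiousDisorder
    return None  # for all we know, the two could be the same thing


no_axioms = set()
with_disjointness = {frozenset({"NeoplasticDisorder", "InfectiousDisorder"})}
```

With no_axioms, the open-world check on "NeoplasticDisorderOfLung" returns None, so the term would not show up as a non-infectious lung disorder. With with_disjointness it returns True, which matches what the closed-world check assumed all along.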
Interestingly, the strangeness of the open world is practically invisible until you add negation. Most SNOMED modelers are not aware of this strange behavior. I’ve been modeling SNOMED for years and still got bitten.
Bridging both worlds
So where does the bridging of the worlds take place? In HL7 all information objects are based on the closed world object oriented model called the HL7 v3 Reference Information Model (RIM). SNOMED and OWL live in the open world of the www and Description Logic, where anyone can say anything about anything. How will we bridge these worlds?
You can use SNOMED-CT terms as the values in some properties of the RIM model. In other words, you take the closed world OO model of the RIM, but you have little slots in the RIM that you are allowed to fill up with data, and one of the flavors of data you are allowed to use in some of these slots are open world flavored SNOMED terms. This is how you bridge the OO world with DLs.
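The idea of a closed-world object with an open-world "slot" can be sketched as below. The two dataclasses are heavily simplified stand-ins written for this illustration and are not the actual RIM class definitions; the code value is a placeholder, not a real SNOMED-CT identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedValue:
    code: str          # placeholder below, NOT a real SNOMED-CT identifier
    code_system: str   # names the terminology the code comes from
    display_name: str


@dataclass(frozen=True)
class Observation:
    # Structural attributes fixed by the closed-world OO model.
    class_code: str
    mood_code: str
    # The slot that may be filled with a term from an open-world terminology.
    value: CodedValue


finding = Observation(
    class_code="OBS",
    mood_code="EVN",
    value=CodedValue("0000000", "SNOMED-CT", "Infectious Disorder of Lung"),
)
```

The shape of Observation is fully under the modeler's control, while the meaning of the value it holds is defined elsewhere, in the terminology.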
The RIM contains some features which will need to be taken care of. One can express a semantic concept (example: 'emergency laparoscopic appendicectomy') using either 1 RIM class and a SNOMED-CT term, or 3 RIM classes with act relationships and 3 SNOMED-CT terms. One will have to have knowledge of RIM class structures if one wishes to test semantic equivalence of these variant structures. RIM classes allow for negation (the activity did NOT take place), something which isn't supported by SNOMED-CT. Negation can however be supported by OWL, which is one of the reasons to move SNOMED to OWL.
By merging the two older worlds already commonly used in medicine with the newer SNOMED/OWL world, exciting things are bound to happen. One benefit that is already well known is the fact that in the open world, the computer can make "inferences". In other words, it can deduce facts logically that you did not tell it, or even know yourself. The two older worlds lack this ability to deduce new facts from the stated facts.
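A toy sketch of such an inference, assuming classes are defined by the attribute values their members must have. Real DL reasoners do far more than this subset test, and all class and attribute names here are hypothetical.

```python
# Defined classes: anything with ALL of these attribute values belongs to the class.
DEFINITIONS = {
    "DisorderOfLung": {("finding_site", "Lung")},
    "BacterialDisorder": {("causative_agent", "Bacteria")},
}

# Stated facts about individual terms. No parent class is stated anywhere.
STATED = {
    "BacterialPneumonia": {("finding_site", "Lung"), ("causative_agent", "Bacteria")},
    "LungNeoplasm": {("finding_site", "Lung"), ("morphology", "Neoplasm")},
}


def inferred_parents(term):
    # A term is classified under every defined class whose requirements it meets.
    attributes = STATED[term]
    return sorted(
        name for name, required in DEFINITIONS.items() if required <= attributes
    )
```

Nobody stated that "BacterialPneumonia" is a "DisorderOfLung"; inferred_parents deduces it from the stated attributes alone.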
Update: see also his blog post about Thinking like an OWL reasoner.
PermaLink to this page: http://www.ringholm.com/column/HL7_OWL_adding_openness_to_a_closed_world.htm
About Ringholm bv
Ringholm bv is a group of European experts in the field of messaging standards and systems integration in healthcare IT. We provide the industry's most advanced training courses and consulting on healthcare information exchange standards.
See http://www.ringholm.com or call +31 33 7 630 636 for additional information.
Rene is the Tutor-in-chief of Ringholm.