Class hierarchies and Ontologies

I've recently been looking at high-level ontologies, such as Cyc and SUMO, for a project I'm working on. Clearly, there is a connection between techniques used for organising class hierarchies and those used for organising ontologies (class hierarchies are ontologies, at least superficially). Going further, there seems to be an overlap between languages' standard class libraries and these high-level "common sense" ontologies. Each describes quite general concepts like "objects", "processes", "sets" and "classes", and then provides more specific concepts for particular domains. With RDF and OWL becoming popular (or not?), I wonder if there will be a merging of these two similar areas.

Are there any languages that come with a standard ontology? Are there languages whose tools for modelling class hierarchies more closely resemble those of ontology authoring tools? For instance, supporting more than just "subtype" and "instance" relations (i.e., more like first-order logic or a description logic). Haskell's type-classes come to mind. Is there any restriction on the kinds of relations between types that can be expressed by type-classes? Are there techniques from ontology modelling that can be applied to type/class-level organisation?

Ontology

In philosophy, ontology is understood as the theory about the nature of being.

Could you please provide a definition of what exactly you mean by 'ontology' in the context of programming languages? Surely not a bunch of symbols collected in 'sets', 'classes' or some other data structures?

The usual Computer Science

Ontology

I am sorry, but that:

"In computer science an ontology is a data model that represents a domain and is used to reason about the objects in that domain and the relations between them"

does not make any obvious sense.

Could you expand on the definition and explain just how 'ontology' is a better term than a simple 'data model'? What does it mean for a 'data model' to 'represent a domain' ontologically?

Also, how exactly is the computer 'ontology' related to the philosophical one?

My goal here is to understand whether the usage is motivated by something more substantial than just trying to impress friends and colleagues.

Pretty Weak

Yeah, that definition seems pretty weak to me, too. I try not to use the word "ontology" unless I'm working with a Description Logic system and there will actually be inferences drawn based on the concepts and properties in the ontology, in the context of a relatively rich logic (see here for some brief overviews of the various families representing the current state of the art in Description Logics).

Definition

"I try not to use the word "ontology" unless I'm working with a Description Logic system and there will actually be inferences drawn based on the concepts and properties in the ontology..."

How do you define such an ontology?

A description logic is a

A description logic is a fragment of ordinary first order logic, which usually works by drastically limiting the way you can use predicates, relations, and quantifiers in order to get decidability and fast (hopefully) logical entailment. An "ontology" would be the set of predicates and relations and the axioms that hold between them.
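To make that last sentence concrete, here is a minimal sketch (in Python, with entirely hypothetical concept names) of an "ontology" as a set of subsumption axioms, where entailment is just reachability over the axioms. Real DL reasoners handle far richer constructors, but the shape is the same:

```python
# Toy sketch (not a real DL reasoner): an "ontology" as a set of
# atomic concepts plus subsumption axioms, with entailment computed
# as the transitive closure of the axiom set.

def entails(axioms, sub, sup):
    """Does the axiom set entail that `sub` is subsumed by `sup`?"""
    if sub == sup:
        return True
    seen, frontier = set(), [sub]
    while frontier:
        concept = frontier.pop()
        for (a, b) in axioms:
            if a == concept and b not in seen:
                if b == sup:
                    return True
                seen.add(b)
                frontier.append(b)
    return False

# Hypothetical axioms: Dog is subsumed by Mammal, Mammal by Animal.
axioms = {("Dog", "Mammal"), ("Mammal", "Animal")}
```

Here `entails(axioms, "Dog", "Animal")` holds, while `entails(axioms, "Animal", "Dog")` does not; decidability is trivial because the search space is finite.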

There's, um, less than meets the eye here.

Big fragment, little fragment

Description logic can be seen either as a notation for the bounded fragment of first-order logic, or as a multi-modal logic equipped with fixpoints. It's less than an arbitrary theory of first-order logic, but it's decidable, and it's pretty expressive as decidable logics go. Given that the kind of questions you want to ask of computer ontologies normally ought to be decidable, it's a pretty good framework for formulating ontologies.

I've disagreed with practically nothing neelk has said, but then again neelk disagreed with practically nothing Paul said. Logic does often seem to have less to it than meets the eye, but then again, doing logic well is still hard and worthwhile.

Confused

I thought neelk had clarified the matter, but it seems you think there is more to the 'o' word:


Given that the kind of questions you want to ask of computer ontologies normally ought to be decidable, it's a pretty good framework for formulating ontologies.

It appears that DL is just a tool to work with 'ontology'; if so, what is 'ontology' then?

I agree with neelk, more or less

He said: An "ontology" would be the set of predicates and relations and the axioms that hold between them.

I'd elaborate that somewhat to say An "ontology" is given by a collection of entities that you are interested in, and predicates giving properties of them, and relations specifying how they are structured. An ontology will further provide a notion of sentence that allows complex propositions to be built out of these elements, and a semantics that determines when these sentences are true.

My response to neelk was motivated by my agreement with Paul that there isn't a very clear distinction between ontology and "formal language" (formulated normally as either a modal logic or a predicate logic, or both). In most of the cases where ontology makes sense, one wants (i) a decidable language, and (ii) a well-studied foundation, which description logic provides: I'd like it if people mostly followed Paul's lead. I hoped I'd made it clear that I had no substantive quarrel with what neelk said.

Definition

Let's take a closer look at this:


I'd elaborate that somewhat to say An "ontology" is given by a collection of entities that you are interested in, and predicates giving properties of them, and relations specifying how they are structured. An ontology will further provide a notion of sentence that allows complex propositions to be built out of these elements, and a semantics that determines when these sentences are true.

We have collections or sets, predicates presumably defining some interesting subsets of such sets, and relations which are in effect subsets of some other interesting set products.

Now, we have some language with its formation rules and a set of axioms, the two together defining a theory. The notion of a sentence, or a well-formed formula, is given by the language's formation rules. Such a theory can have a model, which is its semantics; the model consists of the earlier-mentioned sets, relations, etc., along with an interpretation function. What is the 'o' word doing here?

The "ontology" part comes

The "ontology" part comes from the analysis of the real-world domain that you must do in order to decide which predicates, relations and connectives you need. Roughly:

1. you have some real-world domain you need to analyze,
2. you figure out the vocabulary you need to talk about it
3. you figure out the relationships that hold between the bits of your vocabulary
4. you encode this knowledge in a deductive system

Formal logic really only comes into the picture in a big way at step 4, but to get there your informal conceptualization (steps 1-3) must be clear and rigorous in order to permit formal description. The priority on getting the informal understanding right is what distinguishes making ontologies from doing research in formal logic.
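The four steps above can be sketched end-to-end in a few lines. Assuming an entirely hypothetical "library" domain, the vocabulary and relationships from steps 1-3 appear as plain data, and step 4 is a single deductive rule (transitivity of a made-up `located_in` relation) run to a fixpoint:

```python
# Steps 1-3 as data: a hypothetical domain vocabulary and its relations.
facts = {
    ("Book", "located_in", "Shelf"),
    ("Shelf", "located_in", "Room"),
    ("Room", "located_in", "Building"),
}

def saturate(facts):
    """Step 4: apply the transitivity rule for 'located_in' to a fixpoint."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(facts):
            for (c, r2, d) in list(facts):
                if r1 == r2 == "located_in" and b == c:
                    derived = (a, "located_in", d)
                    if derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts
```

After saturation, the derived fact `("Book", "located_in", "Building")` is present even though it was never stated. The hard part, as the comment says, is choosing the vocabulary and rules, not running them.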

Still, as Charles Stewart observed, this is a blurry line: for example, modal logic can involve the formalization of informal understandings of concepts like time and obligation.

database modeller


The "ontology" part comes from the analysis of the real-world domain that you must do in order to decide which predicates, relations and connectives you need. Roughly:

1. you have some real-world domain you need to analyze,
2. you figure out the vocabulary you need to talk about it
3. you figure out the relationships that hold between the bits of your vocabulary
4. you encode this knowledge in a deductive system

It's interesting to note that your list roughly describes what a typical [deductive or relational] database modeller does daily and has been doing for decades, with (1), (2) and (3) being arguably the most important part of his/her job. It used to be called building a conceptual model of some real-world system (bank, factory, etc). I am afraid that, without understanding how the human brain functions, trying to formalize (1-3) is a pipe dream; just calling (1-3) by some fancy name won't do the trick. (4), of course, is quite doable today.

Termin-ology

I am afraid without understanding how the human brain functions, trying to formalize (1-3) is a pipe dream, just calling (1-3) by some fancy name won't do the trick.

No-one has claimed that using the term alone will solve anything. Like any term, it's useful if it identifies something which aids communication between people who want to talk about that thing.

What is it that makes the name "fancy", anyway? It's only 8 letters long, so hardly qualifies for the earlier claim of being a "big word". Is it that it ends in "ology"?

More seriously, I think you're focusing far too strongly on trying to map the term to formal concepts you're already comfortable with, while ignoring the various senses of the term, and what's communicated by those senses. Confusing matters is the fact that the term has multiple senses, which are related but are not the same as the original philosophical sense.

For example, in the semantic web sense, an ontology serves as a shared vocabulary between multiple systems and languages. That sense goes beyond the usual one of a data model or conceptual model or formal model for a single system or theory. The different systems ("agents") which share a common ontology may use different languages and theories to deal with that ontology, and don't share the same knowledge about the ontology. All they share is the specification of the ontology itself, which is deliberately restricted to support this purpose. In this context, an ontology is a particular kind of model, with particular characteristics, which serves a particular purpose, and those particularities justify the use of an identifying term.

Terminology


Like any term, it's useful if it identifies something which aids communication between people who want to talk about that thing.

That's what I find hard to believe. So far, I've seen many varied and contradictory definitions of the word. That hardly makes communication easier. A presumably technical term's definition should be as precise and unambiguous as possible in order to be usable. I am not trying to be difficult, but am rather curious about what technical or scientific purpose such nebulous terminology might serve (I am aware of other, non-technical reasons for such usage).

I daresay the 'semantic' web is another beast of the same kind, as what kind of meaning ('semantics'), except when interpreted by a human being, can, say, a graph/network have?

If you find this kind of discussion not quite appropriate for the forum, I apologize and will stop right here ;)

Technical vs non-technical.

A presumably technical term's definition should be as precise and unambiguous as possible in order to be usable. I am not trying to be difficult, but am rather curious about what technical or scientific purpose such nebulous terminology might serve (I am aware of other, non-technical reasons for such usage).

I think this is the sticking point... You seem to be very ready to presume that all terminology should have a technical definition. I think the terms under discussion are used by the relevant communities almost exclusively for the "other, non-technical" reasons you mention. After a bit of investigation, maybe it's best just to accept that. Also, I don't think these discussions are inappropriate here, as long as they don't go on too long. This one has probably gone on just about long enough at this point, and I doubt there's much more to be gained.

Interesting Question!

vc: I daresay the 'semantic' web is another beast of the same kind, as what kind of meaning ('semantics'), except when interpreted by a human being, can, say, a graph/network have?

As you suggest, in and of themselves, none. But human beings have become quite accomplished at not only ascribing meaning to various bits of data, but also in automating the interpretation of bits of data by... um... other bits of data, which are themselves interpreted according to some definition of "computation."

OK, that's admittedly exactly as hand-wavey as you've expressed a concern about, so let me at this point refer you to this resource on semantic networks, the SNePS site, and the Description Logics page for further information.

One man's jargon...

So far, I've seen many varied and contradictory definitions of the word. That hardly makes communication easier. A presumably technical term's definition should be as precise and unambiguous as possible in order to be usable.

But you're making that judgement from outside the fields in question (afaict). If two people who both work with semantic web systems, or who both work with Cyc, etc., use the term "ontology", they have a pretty good idea of what they mean by it. That alone is a reasonable justification for a term. There's no requirement for the term to be instantly understandable outside those fields, or to have the same meaning in every context, or to have a definition that makes immediate sense when taken out of the context of the field in question.

To an outsider, the term may seem like unwarranted jargon. And some people may abuse it because they think it sounds "fancy", whereas other people might hate it for the same reason. That's just the nature of terminology.

I am not trying to be difficult, but am rather curious about what technical or scientific purpose such nebulous terminology might serve (I am aware of other, non-technical reasons for such usage).

Note that "technical" is not the same as "formal". If the term's definitions are fuzzy, it might be a sign of an immature discipline, which makes sense in this case. I'm reminded of a Perlis quote, "One can't proceed from the informal to the formal by formal means."

I came across a paper which addresses this point to some extent: Ontology Theory by Christopher Menzel.

On the skepticism side, at least relative to the semantic web, this looks interesting: Ontology is Overrated: Categories, Links, and Tags by Clay Shirky.

I also liked the look of Ontology Development 101: A Guide to Creating Your First Ontology, which attempts to answer the "why" and "what" questions.

As for the appropriateness of this discussion, I think it's fine to discuss such questions, but ultimately it's difficult to effectively object to a term without fully understanding what it means to the people who use it (hence the links above). Just reading a bunch of definitions doesn't necessarily provide that understanding, because they depend on a shared context (one might almost say, a shared ontology).

I daresay the 'semantic' web is another beast of the same kind, as what kind of meaning ('semantics'), except when interpreted by a human being, can, say, a graph/network have?

That one is easier to nail down: the idea is that the current web consists primarily of resources intended to be rendered monolithically to human readers, with the machine blindly translating e.g. HTML to a human-friendly rendered form. The semantic web, OTOH, is supposed to consist of resources which machines will be more easily able to destructure and recompose on their own, due to having better knowledge of the "semantic" structure of the resources. "Semantic" in this case refers to some aspect of human understanding about the resource, expressed in machine-processable form.

The notion of "semantic markup" provides a trivial example of this: it's more useful to wrap quoted text in an HTML "blockquote" element than it is to use a "div" element and specify its rendering style directly with, say, "style='font-style: italic; margin-left: 0.5in; border-left: solid 1px grey'". The latter approach doesn't provide a machine-usable clue as to the semantics of the document as understood by a human, whereas the use of the BLOCKQUOTE element is a huge clue, by comparison. The semantic web simply extends this concept (arguably to its breaking point ;)
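A small sketch of what that "machine-usable clue" buys a program, using Python's standard html.parser (the markup itself is made up): extracting quotations from semantic markup is a few lines, while a presentationally styled div would give the same extractor nothing to go on.

```python
# Find quotation text by relying on the semantic BLOCKQUOTE element.
from html.parser import HTMLParser

class QuoteFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside <blockquote> elements
        self.quotes = []    # extracted quotation text

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "blockquote":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.quotes.append(data.strip())

semantic = "<p>He said:</p><blockquote>Less is more.</blockquote>"
finder = QuoteFinder()
finder.feed(semantic)
# finder.quotes now holds ["Less is more."]; the same text wrapped in
# <div style="font-style: italic"> would leave the extractor empty-handed.
```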

In this case, a term like "semantic" is being used as a kind of pun: resources have semantics which a human understands, and you can hardly argue with the use of the term in that context. But if we provide hints to the machine about the human-level semantics of a resource, then the machine has some notion of those semantics, even if it's so restricted that it becomes possible to argue about whether the term "semantics" is still appropriate. But it does remain appropriate, because of the presence of a mapping to the semantics in the human realm.

The term "ontology" is used in these contexts in very much the same way: the full philosophical notion of ontology becomes much more restricted when projected into a machine-readable context, but it retains a relationship with the corresponding human-level ontology.

At the risk of stretching a simple example too far, humans and machines each have some understanding of what a BLOCKQUOTE is, and its relationship to other markup elements (e.g. that it can contain other elements). The part of this understanding that is shared between humans and machines is an ontology: a shared domain-specific vocabulary, with relationships between the terms.

But on the human side, BLOCKQUOTE has many other connotations: one could imagine developing a philosophical-style ontology of documents which attempts to describe the various elements of documents and their relationships, as understood by humans. This is something that can't be formally specified. Would it be OK to call that an ontology? If so, then that's why it's OK to call its machine equivalent an ontology. If not, then you're going to have to take up this discussion with your local philosophy department.

Ontology Theory


Note that "technical" is not the same as "formal". If the term's definitions are fuzzy, it might be a sign of an immature discipline, which makes sense in this case.

That's a good point.


I came across a paper which addresses this point to some extent: Ontology Theory by Christopher Menzel.

Thank you. The article is quite interesting. Although I remain unconvinced as to how fruitful the suggested approach is, that is a very good and solid exposition.


That one is easier to nail down [...]

I am not so sure.


"Semantic" in this case refers to some aspect of human understanding about the resource, expressed in machine-processable form.

That's exactly what I think is quite impossible to accomplish just yet (expressing meaning so that the machine could somehow understand it). Consider your slightly modified markup example:

<name>John</name><age>40</age>

One could say that the XML fragment is 'self-describing' or even 'semantical'. It is not. It is self-describing to the human eye, but not to the computer. It would look just as good to the machine if it were 'marked up' as

<sfasd>John</sfasd><w3e>40</w3e>

and the computer was programmed to recognize the tags.
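That point can be made mechanical with Python's standard ElementTree (the enclosing `<person>` root below is added only to make the fragments well-formed):

```python
# Both fragments parse to the same structure; the tag names are opaque
# strings, and any "meaning" lives in the programs written against them.
import xml.etree.ElementTree as ET

readable = ET.fromstring("<person><name>John</name><age>40</age></person>")
opaque   = ET.fromstring("<person><sfasd>John</sfasd><w3e>40</w3e></person>")

texts_readable = [child.text for child in readable]
texts_opaque   = [child.text for child in opaque]
# Both lists are ["John", "40"]: the parser sees no difference beyond
# the labels themselves.
```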


At the risk of stretching a simple example too far, humans and machines each have some understanding of what a BLOCKQUOTE is, and its relationship to other markup elements (e.g. that it can contain other elements). The parts of this understanding that are shared between humans and machines is an ontology

Machines have zero understanding -- they just blindly react to strings of symbols in the ways the programmer designed them to. Saying that the machine can somehow discern semantics in a data structure, however complicated, is, well, wishful thinking at best.


resources have semantics which a human understands, and you can hardly argue with the use of the term in that context.

It depends on what you mean by 'resources'.


But if we provide hints to the machine about the human-level semantics of a resource, then the machine has some notion of those semantics

It has not.

Anthropomorphizing the machine is an old and utterly failed game (recall the AI debacle of the late '80s and early '90s), though it did produce a few useful algorithms. Words like 'conceptual domain' and 'semantic web' look like creatures transported from that era. It's quite presumptuous of computer scientists to imagine that they can 'teach' the machine to reason without first having a deep understanding of how the human brain functions.

Machine understanding

Machines have zero understanding -- they just blindly react to strings of symbols in the ways the programmer designed them to.

Everyone involved knows that. In the semantic web case at least, when terminology is used in a seemingly anthropomorphic sense, you should think more in terms of the "pun" that I tried to explain. I'll try to clarify my explanation without resorting to anthropomorphic terms.

Saying that the machine can somehow discern semantics in a data structure, however complicated, is, well, wishful thinking at best.

Again, no-one is claiming otherwise. The point is that the semantic entities that are meaningful to humans are marked in a way that lets the machine identify them and associate a specific entity with specific properties. Whether a tag is named "blockquote" or "sfasd" is not the point. The point is that if the "sfasd" tag is only used to mark entities which represent what a human recognizes as quotations, that's an improvement in the semantic content of a document compared to marking quotations only with, say, presentational instructions (italics, indentation, etc.). The tag is said to have "semantics" in the same sort of way that constructs in a programming language are said to have semantics.

resources have semantics which a human understands, and you can hardly argue with the use of the term in that context.

It depends on what you mean by 'resources'.

Since I was talking about the semantic web, I was thinking of the term as used in the web context, i.e. the things that are identified by Uniform Resource Identifiers. Many such resources are documents, which have rich structure and (human) semantics.

But if we provide hints to the machine about the human-level semantics of a resource, then the machine has some notion of those semantics

It has not.

The point is that the machine can recognize a semantic entity that's meaningful to humans, and programs can associate behavior ("semantics") with those entities. In the semantic web case, the machine semantics and human semantics are different, and no-one is seriously trying to pretend that the machine is currently "understanding" anything in the AI sense.

Anthropomorphizing the machine is an old and utterly failed game (recall the AI debacle of the late '80s and early '90s), though it did produce a few useful algorithms. Words like 'conceptual domain' and 'semantic web' look like creatures transported from that era. It's quite presumptuous of computer scientists to imagine that they can 'teach' the machine to reason without first having a deep understanding of how the human brain functions.

As an entirely separate issue, I'm not sure that understanding the function of the human brain is a prerequisite to the development of more intelligent machines; that makes an unwarranted anthropocentric assumption about the importance of the human brain as a model for the mechanics of intelligence. Our brains may represent a particularly inefficient way to achieve intelligence (I know mine does). It's a little like arguing that we need to understand the human walking process before producing cars.

But that's beside the point. In the semantic web case, they're not really talking about teaching the machine to reason in the way that humans do. That's more of a goal in the case of Cyc and similar systems. In that case, you're indeed dealing with the intellectual children of the AI era. In that context, the marketing can get a bit aggressive (how are you going to get funding for a project like Cyc without grandiose claims?), but I don't think most of the researchers involved are under any illusions about what their terminology really implies.

Re: Machine understanding

I'm not sure that understanding the function of the human brain is a prerequisite to the development of more intelligent machines [...]

Indeed. Also, trying to create intelligent machines and failing gives us valuable clues as to why the human brain is the way it is.

Understanding

I largely agree.

Just a couple of remarks:


The point is that if the "sfasd" tag is only used to mark entities which represent what a human recognizes as quotations, that's an improvement in the semantic content of a document compared to marking quotations only with, say, presentational instructions (italics, indentation, etc.)

It's an improvement all right, but from the human's point of view only. We made the language more expressive, but the interpretation ('semantics') is still in the person's head; it was not magically transferred to the computer 'brain'.


The tag is said to have "semantics" in the same sort of way that constructs in a programming language are said to have semantics.

If that's the understanding and we use the word 'semantics' in the narrow sense of, for example, a predicate meaning being the subset of some set, then a trivial "Hello world" program contains as much semantics as any more complicated creature out there ("semantic web"), denotationally speaking.


As an entirely separate issue, I'm not sure that understanding the function of the human brain is a prerequisite to the development of more intelligent machines; that makes an unwarranted anthropocentric assumption about the importance of the human brain as a model for the mechanics of intelligence. Our brains may represent a particularly inefficient way to achieve intelligence (I know mine does). It's a little like arguing that we need to understand the human walking process before producing cars.

I do not think it would be productive to discuss this issue, as it may almost inevitably lead to some semi-religious exchange, so I'll refrain from doing so ;)

Thanks.

Connecting interpretations

It's an improvement all right, but from the human's point of view only. We made the language more expressive, but the interpretation ('semantics') is still in the person's head; it was not magically transferred to the computer 'brain'.

The point I wanted to make is that there are two distinct sets of interpretations: one set in the heads of all the people dealing with the data (human semantics), and another set of interpretations encoded in all of the programs which operate on the data (machine semantics). Those two sets of interpretations are connected via semantic cues which include markup and resource types, backed up by shared definitions in the form of ontologies and resource metadata.

The semantic web might be described as a network of resources whose machine interpretation matches their human interpretation more closely than in previous systems — it connects at more levels, and in more places, even if the connections are limited to being merely structural or otherwise relatively superficial. The computer's ability to automatically deal with data in ways that seem appropriate to humans is improved, as a consequence of a richer semantics (in both the machine and human senses) being explicitly encoded in the data.

If that's the understanding and we use the word 'semantics' in the narrow sense of, for example, a predicate meaning being the subset of some set, then a trivial "Hello world" program contains as much semantics as any more complicated creature out there ("semantic web"), denotationally speaking.

I don't understand this point - what measure of semantic content ("as much") are you using? Data containing semantic cues can be interpreted by an unbounded set of programs. Each of those programs is an interpreter which assigns its own specific semantics, in the PL sense, to the semantic cues. The denotation of a particular semantic cue relative to a particular program is comparable to that of a procedure call in an ordinary language. But the full meaning of a given cue is the set of all denotations given by all programs capable of interpreting that cue. This goes well beyond the semantic content of a "Hello world" program.
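A toy rendering of "an unbounded set of programs, each assigning its own semantics" (all names below are hypothetical): the same "quote" cue denotes one thing to a renderer and another to an extractor.

```python
# One shared semantic cue ("quote"), two programs, two denotations.
doc = [("quote", "Less is more."), ("text", "said the architect.")]

def render_html(doc):
    """One interpreter: the cue denotes a piece of markup."""
    return "".join(
        f"<blockquote>{body}</blockquote>" if tag == "quote" else body
        for tag, body in doc
    )

def extract_quotes(doc):
    """Another interpreter: the cue denotes membership in a list."""
    return [body for tag, body in doc if tag == "quote"]
```

Neither program exhausts the cue's meaning; a third program (an indexer, a citation checker) would assign yet another denotation to the same data.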

Machine semantics


The point I wanted to make is that there are two distinct sets of interpretations: one set in the heads of all the people dealing with the data (human semantics), and another set of interpretations encoded in all of the programs which operate on the data (machine semantics).

That's where I disagree. I think that the notion of meaning (semantics) does not even arise until the human is involved. The machine does not interpret, it simply blindly transforms one data structure into another, so its behavior is syntactic rather than semantic. To clarify, with denotational semantics, for example, you have:

1. the syntactic world (a collection of formulas with their rules of formation);
2. the semantic world (a collection of mathematical structures of some kind, like integers, booleans, etc.);
3. semantic valuation functions mapping things from the syntactic world to the semantic one.

Now, the formal semantic world (booleans, integers, abstract trees, etc.) can be thought of as existing in some Platonic universe, or alternatively in the human mind. One can argue that the semantic valuation function, as well as the semantic world, can be represented so that the machine could manipulate them as well as or even better than the human. In this case, however, the notion of meaning is lost, because what the machine does is no different from purely syntactic manipulation; the semantic valuation function becomes indistinguishable from transformation/inference rules. It's only the human who can draw the line and say "this is a syntactic universe, and these are 'real' things".
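The three-part picture can be sketched directly (a toy expression language; Python's integers stand in for the mathematical ones, which is exactly the conflation the comment is worried about):

```python
# Syntax: nested tuples like ("+", 2, ("*", 3, 4)).
# Semantic domain: Python integers standing in for the mathematical integers.
# Valuation function: maps the first to the second.

def meaning(expr):
    """Semantic valuation function: syntax -> semantic domain."""
    if isinstance(expr, int):   # a numeral denotes an integer
        return expr
    op, left, right = expr
    if op == "+":
        return meaning(left) + meaning(right)
    if op == "*":
        return meaning(left) * meaning(right)
    raise ValueError(f"unknown operator: {op}")

# The formula ("+", 2, ("*", 3, 4)) denotes the integer 14.
```

Inside the machine, `meaning` is itself just more symbol shuffling, which is the point at issue: the line between the syntactic and semantic worlds is drawn by the human reading the program, not by the program.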


Those two sets of interpretations are connected via semantic cues which include markup and resource types, backed up by shared definitions in the form of ontologies and resource metadata.

Sorry, I do not understand this.


The semantic web might be described as a network of resources whose machine interpretation matches their human interpretation

[The following was slightly edited to improve readability]

In my opinion, the 'semantic web' locution can make sense only if it includes people as the 'semantic' component of the 'semantic web'; but if so, 'semantic' is clearly redundant, and one might as well talk about a 'semantic' screwdriver or a 'semantic' bike ;)

Interpretation of interpretation

I think that the notion of meaning (semantics) does not even arise until the human is involved. The machine does not interpret, it simply blindly transforms one data structure into another, so its behavior is syntactic rather than semantic.

There is always a human involved. The machine does some operations, and then we humans make an interpretation of the machine. You can make the interpretation that "it simply blindly transforms one data structure into another", but you could also make the interpretation that the machine itself interprets. The difference is not in the machine but in the human interpretation of it. You cannot say that one view is right and one wrong; they are just different choices.

But one could note that most systems are designed to be viewed from a specific perspective.

Machine interpretation


but [one] could also make the interpretation that the machine itself interprets

Only if one is confused, or speaks metaphorically, as in "my kettle decided to retire" ;)

Don't interpreters interpret?

What is the purpose of the programs we call "interpreters", if not to interpret? It's just a different, but quite closely related, sense of the term.

Interpreters


What is the purpose of the programs we call "interpreters", if not to interpret? It's just a different, but quite closely related, sense of the term.

Whether you compile and execute a program or 'interpret' it is an implementation detail, quite irrelevant to the program's semantics.

So, in this context the word 'interpreter' is used metaphorically.

Metaphor vs. projection

I was thinking of "interpreters" more in the sense that Reynolds used in his Definitional Interpreters paper, in which interpreters are used to specify language semantics, which has a close connection to a denotational approach. The point is that this "metaphorical" use of terminology is not unusual in this area, and in fact if you look closely, I think there's more than mere metaphor going on. This is part of what I was arguing above. I see it as more of a projection: terms which apply to or depend on human understanding have correlates in the more restrictive and mechanical formal space, and the overloading of the terminology is not accidental or wrong, it's meaningful.
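Reynolds's sense of "interpreter" can be made concrete with a toy definitional interpreter. The following is my own minimal sketch (not anything from the paper), in which the meaning of each expression form is given by the clause that handles it:

```python
# A toy definitional interpreter: the semantics of each expression
# form is *defined* by the Python clause that handles it.

def interpret(expr, env):
    """Evaluate a tiny expression language represented as nested tuples."""
    if isinstance(expr, int):       # literals denote themselves
        return expr
    if isinstance(expr, str):       # variables denote their binding
        return env[expr]
    op, *args = expr
    if op == "add":
        return interpret(args[0], env) + interpret(args[1], env)
    if op == "let":                 # ("let", name, bound_expr, body)
        name, bound, body = args
        return interpret(body, {**env, name: interpret(bound, env)})
    raise ValueError(f"unknown form: {op}")

# ("let", "x", 2, ("add", "x", 3)) means: let x = 2 in x + 3
print(interpret(("let", "x", 2, ("add", "x", 3)), {}))  # → 5
```

In this style the interpreter *is* the semantics of the little language, which is the sense in which the "metaphorical" use of "interpret" is more than metaphor.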

Projection


I was thinking of "interpreters" more in the sense that Reynolds used in his Definitional Interpreters paper, in which interpreters are used to specify language semantics, which has a close connection to a denotational approach.

Operational semantics (interpreters) is provably equivalent to denotational semantics.


terms which apply to or depend on human understanding have correlates in the more restrictive and mechanical formal space, and the overloading of the terminology is not accidental or wrong, it's meaningful

If you mean that there is meaning in what the machine does, independent of the human brain interpretation, then I disagree. The machine performs mindless symbol manipulations that may or may not have some meaning only in the human mind.

Terminology overloading (anthropomorphizing) can be pretty innocuous in some cases, especially when backed up by theory as in the case of interpreters<-->operational semantics, or it can be outright confusing and obscure as in the case of 'semantic web' or 'ontology'.

Mechanically encodable semantics

That's where I disagree. I think that the notion of meaning (semantics) does not even arise until a human is involved. The machine does not interpret; it simply blindly transforms one data structure into another, so its behavior is syntactic rather than semantic.

I don't disagree with this (as long as we don't have intelligent machines). I think I expressed myself poorly. What I called "machine semantics" are the kind of semantics which we can encode mechanically, the kind which formal semantics formalizes. What I called "human semantics" was intended to refer to the full human understanding of the data in question, which cannot be completely captured by any (known) formal semantics. The goal of the semantic web is to support a better connection between the blind transformations which machines can perform, and the semantics of the data as a human understands it.

In my opinion, the 'semantic web' locution can make sense only if it includes people as the 'semantic' component of the 'semantic web', but if so 'semantic' is clearly redundant, one might as well talk about a 'semantic' screwdriver or a 'semantic' bike ;)

'Semantic' is not so much redundant as it is being used in a specific sense. The current web is full of data which is interspersed with low-level metadata, dealing with issues such as presentation. That low-level metadata has semantics, but those semantics only really matter to e.g. authors of web pages and web browsers. The 'semantic' in semantic web refers to semantics which have a closer connection to the meaning of the data as understood by an end user consuming the data. The consequence of this is supposed to be to make it possible for the machine to process the data in ways that are more useful to those end users.

In this sense, a semantic bike wouldn't have a saddle, it would have a chair, which you could detach and use in your car or your living room. Most bikes aren't 'semantic' in this sense, because they expose all sort of details which should only be relevant to bike mechanics, such as mechanical gears, a chain which requires oiling, etc. (I have always found this quite inconvenient — if I could buy a solid-state bike, I would.)

Kind regards to Wittgenstein

But you're making that judgement from outside the fields in question (afaict). If two people who both work with semantic web systems, or who both work with Cyc, etc., use the term "ontology", they have a pretty good idea of what they mean by it.

The first funny experience I had in industry after studying maths was with a specific usage of terminology. When I asked people for precise definitions of words or the meanings of the many abbreviations they used in daily work, they couldn't provide an answer. "Kay, did you solve the DUR?" "What is a DUR?" "It is a change request. I'll show you the input form..." So almost nobody knew what the three letters D-U-R stood for, beyond that they referred to a specific change-request procedure. I'm not even sure whether any other organization used the term.

Agreement

I don't mean to suggest it's trivial work! Designing a logic to ensure it has desirable properties is definitely both hard and worthwhile.

The "less than meets the eye" was intended to refer to the dodge you typically take when crafting an ontology. You don't come up with a semantic model of the domain in question, and then prove the logic sound against that model. Instead, you specify the theory axiomatically, as a set of predicates and axioms that hold between them, and justify it informally (but hopefully still rigorously). (And for real-world relationships, like family relationships, it's not even clear to me what having a semantic model would mean.)

Thanks


A description logic is a fragment of ordinary first order logic, which usually works by drastically limiting the way you can use predicates, relations, and quantifiers in order to get decidability and fast (hopefully) logical entailment. An "ontology" would be the set of predicates and relations and the axioms that hold between them

Thank you very much.

I honestly do not see a point in using big words like 'conceptual domain' or 'ontology' for something described clearly and succinctly just as you did.
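neelk's characterisation — a set of predicates and relations and the axioms that hold between them — can be illustrated with a minimal sketch. The following is my own toy example (not drawn from CYC, SUMO, or OWL): facts are triples, and the "axioms" are closure rules applied to a fixed point.

```python
# A toy 'ontology': relations as triples, axioms as closure rules.
facts = {
    ("Dog", "subclass_of", "Mammal"),
    ("Mammal", "subclass_of", "Animal"),
    ("fido", "instance_of", "Dog"),
}

def entail(facts):
    """Apply two axioms until nothing new is derived:
    1. subclass_of is transitive;
    2. instances of a class are instances of its superclasses."""
    facts = set(facts)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if p == q == "subclass_of" and b == c:
                    new.add((a, "subclass_of", d))
                if p == "instance_of" and q == "subclass_of" and b == c:
                    new.add((a, "instance_of", d))
        if new <= facts:
            return facts
        facts |= new

closure = entail(facts)
print(("fido", "instance_of", "Animal") in closure)  # → True
```

Real description-logic reasoners are of course far more sophisticated (and carefully restricted to stay decidable), but the shape — predicates plus axioms plus entailment — is the same.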

Because it's way shorter?

Because it's way shorter?

Napkin Definition

I suspect, but have no citation for the belief, that it stems from the "napkin definition" of "ontology" in philosophy: "The study of that which can be known." Representation of "that which can be known" is richer than what we find in even current object-oriented languages, let alone object-oriented languages available for comparison in the early days of AI research, so "ontology" might have suggested itself as a term differentiable from "class hierarchy" or "object soup" if the system being compared to was Object Lisp, which was prototype-, rather than class-based.

I Beg Your Pardon?

neelk: An "ontology" would be the set of predicates and relations and the axioms that hold between them

vc: Thank you very much. I honestly do not see a point in using big words like 'conceptual domain' or 'ontology' for something described clearly and succinctly just as you did.

Sentence 3 of the W3C's What is An Ontology (link provided earlier) states

"Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them."

vc, you found the first explanation above satisfactory, but the W3C's explanation (which goes into much more history and detail) not acceptable. Why?

Pardon

Because every word in this:


set of predicates and relations and the axioms that hold between them

... has a clear and simple definition unambiguously understood by everyone familiar with at least a bit of math, while the W3C 'definition' is vague and ambiguous to the point of being meaningless. What are 'computer-usable definitions of basic concepts in the domain and the relationships among them'? I don't know. Does 'o' include something else besides that, as the sentence seems to imply?

Could you expand on the

Could you expand on the definition and explain just how 'ontology' is a better term than a simple 'data model' ? What does it mean for a 'data model' to 'represent a domain' ontologically ?

You can use the term "data model" if you prefer it. Usually, "ontology" refers to a substantial model such as the CYC and SUMO projects I linked to in the original post, typically backed up by some logical semantics (description logics seem popular now, but first order, higher order, and modal logics have all been used).

Also, how exactly is the computer 'ontology' related to the philosophical one ?

It is almost entirely unrelated. The use of the term is unfortunate, but well established.

I don't really care whether you accept or reject the term "ontology". The intent of my questions was to observe that there is a line of technology stretching back through semantic nets and frame systems, through a variety of more-or-less logical knowledge representation formats, which today seems to be embodied in the various technologies surrounding the "Semantic Web" efforts (description logics, OWL, etc.). At various points in this development, some huge standard "upper level" ontologies (CYC, SUMO, etc.) have been developed that attempt to describe a large number of general and domain-specific "concepts".

On the other hand, we have language environments like Java and .NET shipping with vast class libraries that seem to describe many similar concepts from a slightly different point of view. In particular, in the latter case, almost all relations between classes/objects are described procedurally via methods, with typically only "instance-of", "subclass-of", and "contains-part" relations being explicit. So, are there languages that provide tools for declaring more sophisticated relationships between classes, and then reasoning about those relationships? I don't have a specific use-case in mind; I'm just curious about any cross-pollination between the two fields.

Still unclear

Usually, "ontology" refers to a substantial model such as the CYC and SUMO projects I linked to in the original post, typically backed up by some logical semantics (description logics seem popular now, but first order, higher order, and modal logics have all been used).

That raises two questions: 1. what is a 'substantial model' vs. just a model? 2. more interestingly, what do you mean by "backed up by some logical semantics", especially "first order"?

I don't really care whether you accept or reject the term "ontology".

I neither reject nor accept the term, I just want to know what people mean by it. I have similar qualms with respect to words like 'semantic Web' or 'domain-specific concept', however, to a lesser degree.

Logical semantics

By "substantial model" I just mean a large model, covering lots of concepts, such as those I've referred to previously (SUMO, CYC). I'm not trying to define "ontology", or even defend the use of the word. Rather, just to characterise how it seems to be used (which is vague, I agree).

By "backed up by some logical semantics" I mean that the languages used to describe these ontologies are very closely related to particular logics. For example, KIF (a variation of which is used in SUMO) is essentially a Lisp-like syntax for first-order predicate logic, as I understand it. By "first-order" I mean that variables in the language can range over individuals but not predicates. I'm sure there are more precise definitions.
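By way of illustration (my own sketch, not actual KIF syntax or semantics): in a first-order setting, a quantified axiom like "every Dog is a Mammal" ranges its variable over individuals in the domain; it cannot quantify over the predicates Dog and Mammal themselves. Over a finite domain such an axiom can be checked directly:

```python
# First-order quantification over a finite domain: the variable x
# ranges over individuals only; the predicates themselves are fixed.
domain = {"fido", "rex", "tweety"}
dog = {"fido", "rex"}       # extension of the Dog predicate
mammal = {"fido", "rex"}    # extension of the Mammal predicate

# Roughly: (forall (?x) (=> (Dog ?x) (Mammal ?x)))
axiom_holds = all((x not in dog) or (x in mammal) for x in domain)
print(axiom_holds)  # → True
```

A higher-order language would additionally let a variable stand in for `dog` or `mammal` themselves, which is precisely what "first-order" rules out.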

Also, how exactly is the

Also, how exactly is the computer 'ontology' related to the philosophical one ?

the philosophical viewpoint:

In philosophy, ontology is the most fundamental branch of metaphysics. It studies being or existence and their basic categories and relationships, to determine what entities and what types of entities exist. [http://en.wikipedia.org/wiki/Ontology]

slightly altered:

In computer science, ontology is the most fundamental branch of data modelling. It studies types and instances and their basic categories and relationships, to determine what entities and what types of entities exist.

Ontology describes techniques to model data type hierarchies and infer properties of those hierarchies. Ontology languages use different kinds of logical calculi to allow specification of the hierarchy and automated inference over it.

Both philosophical and computer-science ontology describe the same class of fundamental aspects of their respective realities (our Reality(tm) and virtual systems); hence the shared terminology.

Definitions

"Why Not Use W3C's Definitions, Explanations"

Maybe because they are no different from the Wikipedia one?

How do you define 'ontology' ? Is it just a meaningless buzzword ?

One thing...

I heard something at one point about a project at Sun to create an RDF "ontology" (scare quotes perhaps necessary at this point) based on the structure of the JDK. From what I heard, it was a pretty large data set that included the whole class hierarchy as well as method, field, and exception data. I believe it was based purely on types (basically what you can see in the Javadoc), and from what I heard it was quite cool.

I don't know any more than this, though. I'm not sure whether this was ever publicly discussed, and it would have been second- or third-hand information when I heard it... (In fact, maybe it was IBM rather than Sun? I'm not sure.)

This might be a step in your direction...
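For flavour, here is a hedged sketch of the same idea (my own, not the Sun/IBM project) applied to Python's standard library: introspect a class and emit its hierarchy and methods as RDF-style triples. The vocabulary names (`ex:subClassOf`, `ex:hasMethod`) are invented for illustration.

```python
import inspect

def class_triples(cls):
    """Emit (subject, predicate, object) triples describing a class:
    its direct superclasses and its public methods (including inherited)."""
    triples = []
    for base in cls.__bases__:
        triples.append((cls.__name__, "ex:subClassOf", base.__name__))
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if not name.startswith("_"):
            triples.append((cls.__name__, "ex:hasMethod", name))
    return triples

class Animal:
    def speak(self): ...

class Dog(Animal):
    def fetch(self): ...

for triple in class_triples(Dog):
    print(triple)
# Emits the subClassOf link to Animal plus the fetch and speak methods.
```

A data set like this is "purely types" in the sense described above: everything visible in the documentation, with none of the procedural behaviour.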

Thanks

Thanks, that looks like the sort of thing I was thinking about. I'll see if I can find it.

Also...

Ok, I admit I haven't read all the comments, so apologies if someone already mentioned these oddments:

* Gruber's definition is pretty concise: "An ontology is an explicit specification of a conceptualization."

* Quite a lot of work has been done on mapping between RDF/OWL and UML (and vice versa).

* Much of the information in Cyc and SUMO can be expressed in RDF/OWL (and has been), and these big ontologies certainly have their uses in the context of the Semantic Web. SemWeb technologies are all about globally sharable ontologies, but there are common misconceptions nearby, best cleaned up by Tim Berners-Lee in these slides -

* "The Semantic Web is about making one big ontology"
  The semantic web is about a fractal mess of interconnected ontologies....
* "The semantic web ontologies must all be consistent"
  Only the parts I am using together

Amusing

Does anyone else find it ironic that most of this discussion has been about what the 'definition' of Ontology is?

Not alone

Clay Shirky does:

It is a rich irony that the word "ontology", which has to do with making clear and explicit statements about entities in a particular domain, has so many conflicting definitions.

Adding to the irony, Clay proceeds to mainly critique simple hierarchical classification mechanisms, which is one of the least interesting kinds of things which might be called an ontology.