Pal gov.tutorial4.session8 1.ontologymodelingchallenges

34
1 PalGov © 2011 1 PalGov © 2011 فلسطينيةلكترونية الديمية الحكومة ا أكاThe Palestinian eGovernment Academy www.egovacademy.ps Tutorial 4: Ontology Engineering & Lexical Semantics Session 8.1 Ontology Modeling Challenges Dr. Mustafa Jarrar University of Birzeit [email protected] www.jarrar.info

Transcript of Pal gov.tutorial4.session8 1.ontologymodelingchallenges

Page 1: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

1PalGov © 2011 1PalGov © 2011

أكاديمية الحكومة اإللكترونية الفلسطينيةThe Palestinian eGovernment Academy

www.egovacademy.ps

Tutorial 4: Ontology Engineering & Lexical Semantics

Session 8.1

Ontology Modeling Challenges

Dr. Mustafa Jarrar

University of Birzeit

[email protected]

www.jarrar.info

Page 2: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

2PalGov © 2011 2PalGov © 2011

About

This tutorial is part of the PalGov project, funded by the TEMPUS IV program of the

Commission of the European Communities, grant agreement 511159-TEMPUS-1-

2010-1-PS-TEMPUS-JPHES. The project website: www.egovacademy.ps

University of Trento, Italy

University of Namur, Belgium

Vrije Universiteit Brussel, Belgium

TrueTrust, UK

Birzeit University, Palestine

(Coordinator )

Palestine Polytechnic University, Palestine

Palestine Technical University, PalestineUniversité de Savoie, France

Ministry of Local Government, Palestine

Ministry of Telecom and IT, Palestine

Ministry of Interior, Palestine

Project Consortium:

Coordinator:

Dr. Mustafa Jarrar

Birzeit University, P.O.Box 14- Birzeit, Palestine

Telfax:+972 2 2982935 [email protected]

Page 3: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

3PalGov © 2011 3PalGov © 2011

© Copyright Notes

Everyone is encouraged to use this material, or part of it, but should

properly cite the project (logo and website), and the author of that part.

No part of this tutorial may be reproduced or modified in any form or by

any means, without prior written permission from the project, who have

the full copyrights on the material.

Attribution-NonCommercial-ShareAlike

CC-BY-NC-SA

This license lets others remix, tweak, and build upon your work non-

commercially, as long as they credit you and license their new creations

under the identical terms.

Page 4: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

4PalGov © 2011

Tutorial Map

Topic Time

Session 1_1: The Need for Sharing Semantics 1.5

Session 1_2: What is an ontology 1.5

Session 2: Lab- Build a Population Ontology 3

Session 3: Lab- Build a BankCustomer Ontology 3

Session 4: Lab- Build a BankCustomer Ontology 3

Session 5: Lab- Ontology Tools 3

Session 6_1: Ontology Engineering Challenges 1.5

Session 6_2: Ontology Double Articulation 1.5

Session 7: Lab - Build a Legal-Person Ontology 3

Session 8_1: Ontology Modeling Challenges 1.5

Session 8_2: Stepwise Methodologies 1.5

Session 9: Lab - Build a Legal-Person Ontology 3

Session 10: Zinnar – The Palestinian eGovernmentInteroperability Framework

3

Session 11: Lab- Using Zinnar in web services 3

Session 12_1: Lexical Semantics and Multilingually 1.5

Session 12_2: WordNets 1.5

Session 13: ArabicOntology 3

Session 14: Lab-Using Linguistic Ontologies 3

Session 15: Lab-Using Linguistic Ontologies 3

Intended Learning ObjectivesA: Knowledge and Understanding

4a1: Demonstrate knowledge of what is an ontology,

how it is built, and what it is used for.

4a2: Demonstrate knowledge of ontology engineering

and evaluation.

4a3: Describe the difference between an ontology and a

schema, and an ontology and a dictionary.

4a4: Explain the concept of language ontologies, lexical

semantics and multilingualism.

B: Intellectual Skills

4b1: Develop quality ontologies.

4b2: Tackle ontology engineering challenges.

4b3: Develop multilingual ontologies.

4b4: Formulate quality glosses.

C: Professional and Practical Skills

4c1: Use ontology tools.

4c2: (Re)use existing Language ontologies.

D: General and Transferable Skills

d1: Working with team.

d2: Presenting and defending ideas.

d3: Use of creativity and innovation in problem solving.

d4: Develop communication skills and logical reasoning

abilities.

Page 5: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

5PalGov © 2011 5PalGov © 2011

Outline and Session ILOs

This session will help student to:

4a2: Demonstrate knowledge of ontology engineering and

evaluation.

4b1: Develop quality ontologies.

4b2: Tackle ontology engineering challenges.

Page 6: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

6PalGov © 2011 6PalGov © 2011

Reading

Guarino, Nicola and Chris Welty. 2002. Evaluating Ontological Decisions with

OntoClean. Communications of the ACM. 45(2):61-65. New York: ACM Press.

http://www.loa.istc.cnr.it/Papers/CACM2002.pdf

Page 7: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

7PalGov © 2011 7PalGov © 2011

Ontology Modeling

• Building ontologies is still arcane art form.

• One maybe a good ontology engineer, but he does not know why!

• An ontology might be better than another, but it is difficult to know

why!

• We need a methodology to guide us not only on what kinds of

ontological decisions we should make, but on how these

decisions can be evaluated.

OntoClean provides a set of modeling ―guidelines‖ in this

direction, but it is still not meant to be a comprehensive

methodology for the all ontology modeling phases.

Page 8: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

8PalGov © 2011 8PalGov © 2011

OnToClean

Nicola Guarino

CNR Institute for Cognitive Sciences and Technologies,

Laboratory for Applied Ontology (LOA) in Trento, Italy

Research Scientist at the IBM T.J. Watson Research Center in

New York.

Chris Welty

A methodology for ontology-driven conceptual analysis, developed

by:

Page 9: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

9PalGov © 2011 9PalGov © 2011

Modeling mistakes

• Many people misuse the subsumption relation, what they do is not

subsumptions.

• Which are correct/wrong here?

Educational Institute

University

Faculty of Law

Birzeit University

Time Duration

Time Interval

Person

Student worker

Person

Female Male

Role

Student worker

Page 10: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

10PalGov © 2011 10PalGov © 2011

Modeling mistakes

• Many people misuse the subsumption relation, what they do is not

subsumptions.

• Which are correct/wrong here?

Educational Institute

University

Faculty of Law

Birzeit University

Time Duration

Time Interval

Person

Student worker

Person

Female Male

Role

Student worker

How to know that your modeling choices are right?

What makes your ontology better than my ontology?

Page 11: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

11PalGov © 2011 11PalGov © 2011

Subsumption (The Subtype Relation)

• The subsumption (also called is-a relationship) forms a hierarchy

– A subsumes B if all instances of B are necessarily instances of A

– Attributes of a supertype are inherited by it subtypes along the hierarchy.

• The subsumption relation is often misused

– People are typically confuses the subtype relation with the part-of and

instance-of relations, and types with roles.

Page 12: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

12PalGov © 2011 12PalGov © 2011

Subsumption misused with Instantiation

Mustafa

Human

Mammal Species

X

X

Mustafa

Human

Mammal Species

SubtypeOf

InstanceOf

InstanceOf

Is A

Is A

Is A

Instances are sometimes confused with types!

To distinguish between them you should ask what are the instances of

―Mustafa‖? Instances of ―Human‖? Instance of Species? Etc.

How to distinguish between same/different instances of a type, two things

are same entity (identity criteria)?

Page 13: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

13PalGov © 2011 13PalGov © 2011

Subsumption misused with Part/Whole

Engine

Car

SubtypeOf X PartOf

Engine

Car

It is often difficult for beginners in ontological analysis to distinguish between the part-of and the

subclass relation. This is due to the fact that subclass is analogous to subset, and a subset of a

set is a part of it. This confusion can be overcomed when we realize the difference between the

parts of a set and the parts of its members

Among the essential properties of a car there are some functional properties, like being ableto accommodate people. An engine has also certain functional properties as essential

properties, like being able to crank and generate a rotational force. Since, however, the

essential properties of cars do not apply to engines, one cannot subsume the other. The

proper relationship here is part and subsumption is not part. [GW02].

Page 14: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

14PalGov © 2011 14PalGov © 2011

Subsumption misused with Disjunction

BetterEngine

Car Part

SubtypeOf X

This type of mistake is particularly common in object-oriented, where value restrictions are an

important part of modeling. These anti-rigid classes are created to satisfy a modeling need to

represent disjunction, for instance, ―any car part is either an engine, or a wheel, or a seat, …‖

It should be clear that this is different from saying ―all engines are car parts,‖ since, in fact,

they are not. Of course, most modeling systems do not provide for disjunction, so modelers

believe they are justified using these tricks, but if the intention is to make meaning as clear as

possible then subsumption is not disjunction. [GW02]

Car Part

Engine or Wheel or Seat

SubtypeOf But this is still not an

intuitive class

To see how this is incorrect, rigidity analysis can be most useful. No instance of a car part is

necessarily a car part (we could take an engine from a car and put it in a boat, making it no

longer a car part but a boat part), so we have to make that class anti-rigid. The class engine

itself is rigid, however, since we can’t imagine an entity that is an engine becoming a non-

engine. Being an engine is essential to it. An anti-rigid class, such as car part, can not

subsume a rigid one, and so we have a conflict.

Page 15: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

15PalGov © 2011 15PalGov © 2011

Subsumption misused with Constitution

An instance of Ocean is the

“the Atlantic Ocean,”

Oceans are made up of

amounts of water

Ocean

Water

SubtypeOf

An instance of Water is

an amount of waterOcean

Water

PartOf

X

Page 16: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

16PalGov © 2011 16PalGov © 2011

Subsumption misused with Polysemy

• A term may have multiple meanings (Polysemy),

• For example ―Table‖ means:

A piece of furniture having a smooth flat top that is usually supported by

one or more vertical legs.

A set of data arranged in rows and columns

Table

Furniture

SubtypeOf?

Array

SubtypeOf

A polysemous class might be placed below both meanings

Country

PoliticalEntity

SubtypeOf?

GeographicArea

SubtypeOf

Page 17: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

17PalGov © 2011 17PalGov © 2011

OntoClean

OntoClean helps you to:

• evaluate misuses of subsumption and inconsistent modeling choices.

• provides a formal basis for why they’re wrong.

• Focuses on taxonomy

A subsumes B iff for all x, x instance of B implies x instance of A

Page 18: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

18PalGov © 2011 18PalGov © 2011

OntoClean

• Considers general properties

– being an apple

– being a table

– being a person

– being red

• A Class is then the set of entities that exhibit a property in a possible

world.

• Members of the Class are instances of the property.

• The terminology can be slightly confusing

– These are not properties in the sense of OWL properties.

based on [Bechhofer]

Page 19: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

19PalGov © 2011 19PalGov © 2011

OntoClean

• Metaproperties describe particular characteristics of the properties.

• Essence

• Rigidity

• Identity

• Unity

• Associating metaproperties with properties (i.e. classes) helps

characterize aspects of the properties.

• Constraints on the metaproperties enforce restrictions on the

taxonomy and help to highlight and evaluate the choices made.

based on [Bechhofer]

Page 20: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

20PalGov © 2011 20PalGov © 2011

Essence (الجوهر)

• A property (i.e. class) is essential to an entity if it must hold for it.

– Not just things that accidently happen to be true all the time.

لكينونة ما إذا كان إجبارية جوهريةتكون الصفة

.لكينونة ما إذا كانت عرضية، لوقت معين وليس طوال الوقتغير جوهرية تكون الصفة

Page 21: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

21PalGov © 2011 21PalGov © 2011

Rigidity ( صفة اللزوم)

• A rigid property is a property that is essential to all of its instances.

تكون الصفة الزمة إذا كانت جوهرية لجميع حامليها، مثل صفة إنسان ألن من يحمل هذه الصفة ال

.يمكنه التخلي عنها في أي وقت أو ظرف– Every entity that can exhibit the property must do so.

– Every entity that is a person must be a person

– There are no entities that can be a person but aren’t.

• An anti-rigid property is one that is never essential

تكون الصفة غير الزمة إذا لم تكن جوهرية لجميع حامليها، مثل صفة طالب ألن من يحمل هذه الصفة

.يمكنه التخلي عنها– For example every instance of student isn’t necessarily a student -

students may cease to be students at some point without ceasing to exist

or changing their identity.

• A semi-rigid property is one that is essential to some instances but not

to others

فهي جوهرية للبعض مثل قاسي مليها، مثل صفة اإذا كانت جوهرية لبعض ح شبه الزمةتكون الصفة

.القاسي‖ اإلسفنج―وغير جوهرية للبعض اآلخر مثل ‖ مطرقة―

Page 22: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

22PalGov © 2011 22PalGov © 2011

Rigidity ( صفة اللزوم)

• A rigid property is a property that is essential to all of its instances.

تكون الصفة الزمة إذا كانت جوهرية لجميع حمليها، مثل صفة إنسان ألن من يحمل هذه الصفة ال

.يمكنه التخلي عنها في أي وقت أو ظرف– Every entity that can exhibit the property must do so.

– Every entity that is a person must be a person

– There are no entities that can be a person but aren’t.

• An anti-rigid property is one that is never essential

تكون الصفة غير الزمة إذا لم تكن جوهرية لجميع حمليها، مثل صفة طالب ألن من يحمل هذه الصفة

.يمكنه التخلي عنها– For example every instance of student isn’t necessarily a student -

students may cease to be students at some point without ceasing to exist

or changing their identity.

• A semi-rigid property is one that is essential to some instances but not

to others

فهي جوهرية للبعض مثل قاسي إذا كانت جوهرية لبعض حمليها، مثل صفة شبه الزمةتكون الصفة

.القاسي‖ اإلسفنج―وغير جوهرية للبعض اآلخر مثل ‖ مطرقة―

Each concept in the Ontology should be

labeled with Rigid (+R) , Anti-rigid (-R), or

Semi-rigid (~R).

As we will see later, these metaproperties impose

constraints on the subsumption relation, which can be

used to check the ontological consistency of taxonomic

links.

One of these constraints is that anti-rigid properties

cannot subsume rigid properties. Thus Student cannot

subsume Person.

Page 23: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

23PalGov © 2011 23PalGov © 2011

Identity

• Identity criteria allows us to recognize individual entities in the world.

– Is that a dog?

Necessary and sufficient criteria in terms of definitions

– Is that my dog?

• Time Duration

– One hour, two hours etc.

• Time Interval

– 11:00-12:00 on Tuesday 26th, 17:00-18:45 on Saturday 17th

Time Interval

Time Duration

SubtypeOf Component OfXTime Interval

Time Duration

When we say “all time intervals are time durations” we really mean “all time intervals have a

time duration”;

Page 24: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

24PalGov © 2011 24PalGov © 2011

Identity

Identity refers to the problem of being able to recognize individual

entities in the world as being the same (or different).

Identity criteria are conditions used to determine equality (sufficient

conditions) and that are entailed by equality (necessary conditions).

For example, how do we recognize a person we know as the same

person even though they may have changed?

Thinking about concepts’ identities, while building an ontology help us

avoid/discover mistakes.

For example:

Time Interval

Time Duration

SubtypeOfInstances like: “One hour”, “two hours”, “one day” etc.

Instances like: “1:00–2:00 next Tuesday”,“2:00–3:00 next Wednesday”, etc.

since all Time Intervals are also Time Durations.

Page 25: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

25PalGov © 2011 25PalGov © 2011

Identity

According to the identity critera for time durations, two durations of the same

length are the same duration. In other words, all one hour time durations are

identical—they are the same duration and therefore there is only one ―one

hour‖ time duration.

According to the identity criteria for time intervals, two intervals occurring at the

same time are the same, but two intervals occurring at different times, even if

they are the same length, are different. Therefore, the two example intervals

given would be different intervals, but the same duration.

This creates a contradiction: if all instances of time interval are also instances

of time duration (as implied by the subclass relationship), how can they be two

instances under one class and a single instance under another?

Time Interval

Time Duration

SubtypeOfInstances like: “One hour”, “two hours”, “one day” etc.

Instances like: “1:00–2:00 next Tuesday”,“2:00–3:00 next Wednesday”, etc.

since all Time Intervals are also Time Durations.

Page 26: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

26PalGov © 2011 26PalGov © 2011

Identity

When we say ―all time intervals are time durations‖ we really mean ―all time

intervals have a time duration‖; the duration is a component of an interval, but

it is not the interval itself. Therefore, we cannot model the relationship as

subclass.

Time Interval

Time Duration

SubtypeOf

since all Time Intervals are also Time Durations.

X Component Of

Time Interval

Time Duration

X

Page 27: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

27PalGov © 2011 27PalGov © 2011

Identity

When we say ―all time intervals are time durations‖ we really mean ―all time

intervals have a time duration‖; the duration is a component of an interval, but

it is not the interval itself. Therefore, we cannot model the relationship as

subclass.

Time Interval

Time Duration

SubtypeOf

since all Time Intervals are also Time Durations.

X Component Of

Time Interval

Time Duration

X

A property will inherit identity criteria from parents

A property may also supply its own additional identity criteria

Page 28: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

28PalGov © 2011 28PalGov © 2011

Unity

Unity refers to the problem of describing the way the parts of an

object are bound together.

Unity criteria identify whether or not a property is intended to

describe whole objects.

For some classes, all their instances are wholes (like Car), for

others none of their instances are wholes (like Oil).

• A property carries unity if all its instances exhibit a common unity

criterion (e.g., Ocean)

• A property carries no unity if its instances are all wholes, but with

different unity criteria (Legal Entity, as it includes companies and people

• A property carries anti-unity if all of its instances are not necessarily

whole

Page 29: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

29PalGov © 2011 29PalGov © 2011

Unity

Thinking about concepts’ identities, while building an ontology helps

us avoid/discover mistakes. For example:

Ocean

Water

SubtypeOf

Instance is an amount of water, but it is not a whole, since it is

not recognizable as an isolated entity.

Instances like the “Atlantic Ocean”, is recognizable as a

single entity

If we claim that instances of the latter are not wholes, and instances of the

former always are, then we have a contradiction. Problems like this

again stem from the ambiguity of natural language, oceans are not ―kinds

of ―water‖, they are composed of water.

Page 30: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

30PalGov © 2011 30PalGov © 2011

Unity

Thinking about concepts’ identities, while building an ontology helps

us avoid/discover mistakes. For example:

Ocean

Water

SubtypeOf

If we claim that instances of the latter are not wholes, and instances of the

former always are, then we have a contradiction. Problems like this

again stem from the ambiguity of natural language, oceans are not ―kinds

of ―water‖, they are composed of water.

XComposed Of

Water

Ocean

Page 31: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

31PalGov © 2011 31PalGov © 2011

OntoClean’s support!

• Recognizing identity and unity criteria is typically very difficult, and

needs philosophical/analytical skill.

• Well, of course good ontologies are not easy and need deep thinking!

• These difficulties are not simplified much by OntoClean, but there is

no better methodology!

• More examples and practice help you become a good ontology

modeler! ….. then your salary will be very high! ;-)

Page 32: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

32PalGov © 2011 32PalGov © 2011

How to apply OntoClean?

B

ASubtypeOf

?

1. If B is anti-rigid, then A must be anti-rigid

2. If B carries an identity criterion, then A must carry the same criterion

3. If B carries a unity criterion, then A must carry the same criterion

4. If B has anti-unity, then A must also have anti-unity

5. If B is externally dependent, then A must be

Given two classes A and B, where B subsumes A,

the following rules must hold:

• Analysis of a hierarchy using these rules can help identify problematic

modeling choices.

• Conceptual backbone of rigid properties

Rules

based on [Bechhofer]

Page 33: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

33PalGov © 2011 33PalGov © 2011

Summary

• OntoClean helps a modeler to justify and analyze the choices made in

defining a subsumption hierarchy.

• Metaproperties :

– Rigidity

– Identity

– Unity

– Dependent

• Application of constraints on metaproperties

– Highlights potential inconsistencies in the modelling

based on [Bechhofer]

Page 34: Pal gov.tutorial4.session8 1.ontologymodelingchallenges

34PalGov © 2011 34PalGov © 2011

References

Guarino, Nicola and Chris Welty. 2002. Evaluating Ontological Decisions with

OntoClean. Communications of the ACM. 45(2):61-65. New York: ACM Press.

Sean Bechhofer: OntoClean. COMP30412. University of Manchester

http://www.cs.man.ac.uk/~seanb/teaching/COMP30412/OntoClean.pdf