Pal gov.tutorial4.session8 2.stepwisemethodologies

Page 1: Pal gov.tutorial4.session8 2.stepwisemethodologies

PalGov © 2011

أكاديمية الحكومة الإلكترونية الفلسطينية

The Palestinian eGovernment Academy

www.egovacademy.ps

Tutorial 4: Ontology Engineering & Lexical Semantics

Session 8.2

Stepwise Methodologies

Dr. Mustafa Jarrar

University of Birzeit

[email protected]

www.jarrar.info

Page 2: Pal gov.tutorial4.session8 2.stepwisemethodologies

About

This tutorial is part of the PalGov project, funded by the TEMPUS IV program of the Commission of the European Communities, grant agreement 511159-TEMPUS-1-2010-1-PS-TEMPUS-JPHES. The project website: www.egovacademy.ps

Project Consortium:

University of Trento, Italy

University of Namur, Belgium

Vrije Universiteit Brussel, Belgium

TrueTrust, UK

Birzeit University, Palestine (Coordinator)

Palestine Polytechnic University, Palestine

Palestine Technical University, Palestine

Université de Savoie, France

Ministry of Local Government, Palestine

Ministry of Telecom and IT, Palestine

Ministry of Interior, Palestine

Coordinator:

Dr. Mustafa Jarrar

Birzeit University, P.O.Box 14- Birzeit, Palestine

Telfax:+972 2 2982935 [email protected]

Page 3: Pal gov.tutorial4.session8 2.stepwisemethodologies

© Copyright Notes

Everyone is encouraged to use this material, or part of it, but should properly cite the project (logo and website) and the author of that part.

No part of this tutorial may be reproduced or modified in any form or by any means without prior written permission from the project, which holds the full copyright on the material.

Attribution-NonCommercial-ShareAlike

CC-BY-NC-SA

This license lets others remix, tweak, and build upon your work non-commercially, as long as they credit you and license their new creations under the identical terms.

Page 4: Pal gov.tutorial4.session8 2.stepwisemethodologies

Tutorial Map

Topic Time

Session 1_1: The Need for Sharing Semantics 1.5

Session 1_2: What is an ontology 1.5

Session 2: Lab- Build a Population Ontology 3

Session 3: Lab- Build a BankCustomer Ontology 3

Session 4: Lab- Build a BankCustomer Ontology 3

Session 5: Lab- Ontology Tools 3

Session 6_1: Ontology Engineering Challenges 1.5

Session 6_2: Ontology Double Articulation 1.5

Session 7: Lab - Build a Legal-Person Ontology 3

Session 8_1: Ontology Modeling Challenges 1.5

Session 8_2: Stepwise Methodologies 1.5

Session 9: Lab - Build a Legal-Person Ontology 3

Session 10: Zinnar – The Palestinian eGovernment Interoperability Framework 3

Session 11: Lab- Using Zinnar in web services 3

Session 12_1: Lexical Semantics and Multilinguality 1.5

Session 12_2: WordNets 1.5

Session 13: ArabicOntology 3

Session 14: Lab-Using Linguistic Ontologies 3

Session 15: Lab-Using Linguistic Ontologies 3

Intended Learning Objectives

A: Knowledge and Understanding

4a1: Demonstrate knowledge of what is an ontology, how it is built, and what it is used for.

4a2: Demonstrate knowledge of ontology engineering and evaluation.

4a3: Describe the difference between an ontology and a schema, and an ontology and a dictionary.

4a4: Explain the concept of language ontologies, lexical semantics and multilingualism.

B: Intellectual Skills

4b1: Develop quality ontologies.

4b2: Tackle ontology engineering challenges.

4b3: Develop multilingual ontologies.

4b4: Formulate quality glosses.

C: Professional and Practical Skills

4c1: Use ontology tools.

4c2: (Re)use existing language ontologies.

D: General and Transferable Skills

d1: Working in a team.

d2: Presenting and defending ideas.

d3: Use of creativity and innovation in problem solving.

d4: Develop communication skills and logical reasoning abilities.

Page 5: Pal gov.tutorial4.session8 2.stepwisemethodologies

Outline and Session ILOs

This session will help students to:

4a1: Demonstrate knowledge of what is an ontology, how it is built, and what it is used for.

4b1: Develop quality ontologies.

Page 6: Pal gov.tutorial4.session8 2.stepwisemethodologies

Methodology

Let’s discuss where to start if you want to build an ontology for:

• E-government

• E-Banking

• E-Health

• Bioinformatics

• Multilingual search engine

• …

What are the phases of the ontology development lifecycle, taking into account that ontologies might be built collaboratively by many people?

Page 7: Pal gov.tutorial4.session8 2.stepwisemethodologies

Methodological Questions

– Which tools and techniques to use?

– Which languages should be used in which circumstances, and in which order?

– What quality measures should we care about?

– What things can be reused?

– Which people should be assigned which tasks?

– ....

• Many methodologies exist, but none is universally good, because each project, application, and domain is different, and so are the backgrounds of the people involved.

• We will overview some common steps in this lecture. Try to learn from them smartly, and don’t follow these steps literally. You should have your own methodology for each ontology.

Page 8: Pal gov.tutorial4.session8 2.stepwisemethodologies

Most methodologies propose these phases:

1- Identify Purpose and Scope

2- Building the Ontology

2.1- Ontology Capture

2.2- Ontology Coding

3- Integrating existing ontologies

4- Evaluation

5- Documentation

Page 9: Pal gov.tutorial4.session8 2.stepwisemethodologies

1- Purpose and Scope

• There is no single, ideal ontology for a given domain

– There are always alternatives, each abstracting different things, and for different usages.

• What should be included in the ontology (concepts and relations) should be smartly determined, taking into account (if possible) many application scenarios.

– Interoperability between systems.

– Improving search quality.

– Communication between people and organizations (important).

– Future extensions should be anticipated.

Page 10: Pal gov.tutorial4.session8 2.stepwisemethodologies

1- Purpose and Scope

• When you specify the purpose and scope, you should specify the following:

1- What is the domain that the ontology will cover?

The notion of context, in the double articulation theory, is part of the Purpose and Scope. That is, the scope within which the vocabulary interpretation should be valid.

For example: the scope of the Legal-Person ontology is the set of all laws, regulations, and repositories in the state.

2- What are we going to use the ontology for?

Give a sufficient description of the application scenarios that are taken into account.

Be careful with the ontology usability/reusability trade-off.

Page 11: Pal gov.tutorial4.session8 2.stepwisemethodologies

2- Building the Ontology

2.1- Ontology Capture

– Identify key concepts and relationships.

– Produce clear text definitions for these concepts (i.e., glosses).

– Identify terms that refer to these concepts.

– Reach Consensus (Consensus is an indication of correctness).

You may apply, to some extent, the 7 steps for building an ORM schema.

2.2- Ontology Coding/Specification/Characterization

– Explicit representation of the “conceptualization” in some formal language.
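As an illustration of the coding step, here is a minimal Python sketch that writes a few captured concepts (name, gloss, parent) as RDF Schema statements in Turtle syntax. The concept names, the glosses, and the `http://example.org/legal#` namespace are hypothetical. The sketch also assumes that glosses contain no double quotes or backslashes. It shows one possible formal language, not the one prescribed by this tutorial.

```python
# Turtle prefixes; the "ex" namespace is a made-up placeholder.
PREFIXES = (
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix ex: <http://example.org/legal#> .\n"
)

def to_turtle(concepts):
    """Turn {name: (gloss, parent_or_None)} into Turtle text."""
    lines = [PREFIXES]
    for name, (gloss, parent) in concepts.items():
        parts = [f"ex:{name} a rdfs:Class", f'    rdfs:comment "{gloss}"']
        if parent:
            parts.append(f"    rdfs:subClassOf ex:{parent}")
        lines.append(" ;\n".join(parts) + " .")
    return "\n".join(lines)

# Hypothetical captured concepts with their glosses.
concepts = {
    "LegalPerson": ("An entity that has legal rights and duties.", None),
    "Company": ("A legal person registered to do business.", "LegalPerson"),
}
turtle = to_turtle(concepts)
print(turtle)
```

Keeping the gloss next to the formal statement (here as `rdfs:comment`) lets reviewers read both together.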

Page 12: Pal gov.tutorial4.session8 2.stepwisemethodologies

2.1- Ontology Capture: Scoping

• Brainstorming

– Produce all potentially relevant terms and phrases.

• Nouns form the basis for concept names

• Verbs (or verb phrases) form the basis for property names.

This step can be semi-automated to some extent, as candidate concepts and relations can be extracted automatically from relevant documents, laws, forms, DB schemas, etc.

• Organize candidate concepts into groups

Group related terms together.

– Exclude some terms if not relevant (w.r.t. the purpose and scope)

– Keep notes of these decisions.

– Group similar terms and potential synonyms together.
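The semi-automated part of brainstorming can be sketched roughly as follows. This is a naive frequency count using only the Python standard library. The stop-word list and the sample text are made up for illustration, and real term extraction would normally use part-of-speech tagging to separate nouns from verbs. The output is only a list of candidates for people to review.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"a", "an", "the", "of", "and", "or", "to", "in", "is", "are",
              "by", "for", "with", "must", "be", "each"}

def candidate_terms(text, min_count=2):
    """Return (term, count) pairs for words seen at least min_count times."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [(term, n) for term, n in counts.most_common() if n >= min_count]

# Made-up sentences standing in for a law or a form.
sample = (
    "A company is registered by the ministry. "
    "Each company must have an owner. "
    "The owner of a company is a person or another company."
)
terms = candidate_terms(sample)
print(terms)
```

Terms below the threshold are not discarded for good. Whether to exclude them is still a decision to take against the purpose and scope, and to note down.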

Page 13: Pal gov.tutorial4.session8 2.stepwisemethodologies

2.1- Ontology Capture: Produce Definitions

• Use a suitable meta-ontology

– i.e., use modeling primitives in a consistent manner (e.g. type, role, entity, instance, relationship...)

• When several people are involved, each might be responsible for a group of terms

– Semantic overlap with others must be resolved in the first place; otherwise there will be a lot of redundant rework.

• Terms: Produce definitions/glosses in a middle-out fashion

– Define a gloss for each term. This helps get a deeper understanding of the domain.

– These glosses will have to be revised later, after defining the relationships/subsumptions between concepts.

– This is called middle-out, rather than top-down or bottom-up (discussed later).

Page 14: Pal gov.tutorial4.session8 2.stepwisemethodologies

Define Taxonomy

• Relevant terms must be organized in a taxonomic hierarchy (i.e., subsumptions)

– Opinions differ on whether it is more efficient to do this in a top-down or a bottom-up fashion.

• Ensure that the hierarchy is indeed a taxonomy:

– If A subsumes B, then every instance of B must also be an instance of A (compatible with the semantics of rdfs:subClassOf).

– Ensuring the correctness of subsumptions needs philosophical thinking (apply the OntoClean methodology).

• The semantics of subsumption demands that whenever A subsumes B, every property that holds for instances of A must also apply to instances of B (called inheritance).

– It makes sense to attach properties to the highest class in the hierarchy to which they apply.
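To make subsumption and inheritance concrete, here is a small Python sketch. The class and property names are hypothetical, and the sketch assumes each class has at most one direct parent, which is a simplification. A subclass collects the properties attached to every class that subsumes it, and a cycle in the hierarchy is reported as an error.

```python
# child -> parent, read as "child is subsumed by parent" (hypothetical classes).
PARENT = {
    "NaturalPerson": "LegalPerson",
    "Company": "LegalPerson",
    "LegalPerson": "Thing",
}

# Each property is attached to the highest class to which it applies.
OWN_PROPERTIES = {
    "LegalPerson": {"hasName", "hasAddress"},
    "Company": {"hasRegistrationNumber"},
    "NaturalPerson": {"hasBirthDate"},
}

def ancestors(cls):
    """Return the classes that subsume cls, nearest first; fail on a cycle."""
    seen = []
    while cls in PARENT:
        cls = PARENT[cls]
        if cls in seen:
            raise ValueError(f"cycle in taxonomy at {cls}")
        seen.append(cls)
    return seen

def subsumes(a, b):
    """True if every instance of b must also be an instance of a."""
    return a == b or a in ancestors(b)

def all_properties(cls):
    """Own properties plus those inherited from all subsuming classes."""
    props = set(OWN_PROPERTIES.get(cls, set()))
    for anc in ancestors(cls):
        props |= OWN_PROPERTIES.get(anc, set())
    return props

print(sorted(all_properties("Company")))
```

Because `hasName` is attached to `LegalPerson`, both `Company` and `NaturalPerson` receive it without it being repeated.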

Page 15: Pal gov.tutorial4.session8 2.stepwisemethodologies

Define Properties

• Determine the relevant properties for each concept. Such properties must be either essential (to describe the meaning) or relevant to the applications.

• While attaching properties to concepts, it is useful to determine the range of each property (its datatype/value, or its relations with other concepts).

Page 16: Pal gov.tutorial4.session8 2.stepwisemethodologies

Add Rules and Restrictions

• Cardinality Restrictions

• Which properties should be unique, mandatory, subject to disjunctions, restricted in their values, etc.

• Relational Characteristics

– symmetry, transitivity, inverse properties, functional values

Avoid adding rules that are really DB integrity constraints.

Some or all rules should be verbalized in pseudo-natural-language sentences, to enable other people to review them and give feedback.
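Verbalization can be sketched with simple sentence templates. The rule kinds, the template wording, and the example rules below are assumptions made for illustration. They are not the verbalization patterns of any particular tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    kind: str      # "mandatory", "unique", or "symmetric"
    subject: str   # concept the rule is about
    prop: str      # property, written as a verb phrase
    obj: str = ""  # concept on the other side, if any

# Hypothetical pseudo-natural-language templates, one per rule kind.
TEMPLATES = {
    "mandatory": "Each {subject} must {prop} at least one {obj}.",
    "unique": "Each {subject} must {prop} at most one {obj}.",
    "symmetric": "If a {subject} {prop} another {subject}, "
                 "then the second {prop} the first.",
}

def verbalize(rule):
    """Render one rule as a sentence a domain expert can review."""
    return TEMPLATES[rule.kind].format(
        subject=rule.subject, prop=rule.prop, obj=rule.obj
    )

rules = [
    Rule("mandatory", "Company", "have", "Owner"),
    Rule("unique", "Company", "have", "Registration Number"),
    Rule("symmetric", "Person", "is married to"),
]
sentences = [verbalize(r) for r in rules]
for sentence in sentences:
    print(sentence)
```

Reviewers can then comment on sentences such as "Each Company must have at least one Owner." without reading the formal notation.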

Page 17: Pal gov.tutorial4.session8 2.stepwisemethodologies

Define Some Important Instances

• Some important instances may be added to the ontology, if needed. Such entities can be:

– Country: Palestine

– Person: Arafat

– Capital: Jerusalem

• In case of a large number of instances, it is more convenient to keep them separately.

- See the Entity and Address servers in Zinnar

Page 18: Pal gov.tutorial4.session8 2.stepwisemethodologies

Advantages of the Middle-out Approaches

• A bottom-up approach results in a high degree of detail

– increases overall effort

– makes it difficult to spot commonality between related concepts.

– increases risk of inconsistencies and re-work.

• Top-down allows better control of the degree of detail

– risk of arbitrary high-level categories

– risk of limited stability

• Middle-out strikes a compromise: it allows the ontology to evolve gradually, although you need to come back to some steps.

• The higher-level concepts arise naturally and are thus more likely to be stable.

Page 19: Pal gov.tutorial4.session8 2.stepwisemethodologies

Reaching Agreement: Some suggestions

Ontologies are made to be agreed upon and shared, thus it is VERY important to make sure that people agree on them.

How to facilitate reaching agreement?

• Produce natural-language text definitions.

- Ask domain experts to review the context, glosses, verbalized rules, and the ontology itself in a graphical/diagrammatic form.

• Ensure consistency with terms already in use

– use existing thesauri and dictionaries

– avoid introducing new terms in the definitions

• Indicate relationships with other commonly used terms

– synonyms, variants, such as those referring to different dimensions

• Give examples

Page 20: Pal gov.tutorial4.session8 2.stepwisemethodologies

Integrating Existing Ontologies

• Check overlap with existing ontologies

• Establish formal links

– Produce mappings to existing concept definitions

– Import and extend existing ontologies

• Avoid re-inventing the wheel!
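A first check for overlap with an existing ontology could look like the sketch below. It only proposes mapping candidates by comparing normalized labels. The term lists are hypothetical. Matching labels do not prove matching meaning, so each candidate still has to be confirmed by comparing glosses.

```python
def normalize(label):
    """Lowercase a label and drop separators, so 'Legal_Person' matches 'legal person'."""
    return "".join(ch for ch in label.lower() if ch.isalnum())

def candidate_mappings(own_terms, existing_terms):
    """Map each of our terms to existing terms that share its normalized label."""
    index = {}
    for term in existing_terms:
        index.setdefault(normalize(term), []).append(term)
    return {t: index[normalize(t)] for t in own_terms if normalize(t) in index}

# Hypothetical term lists: ours, and those of an existing ontology.
own = ["Legal Person", "Company", "Address"]
existing = ["legal_person", "Organization", "address"]
mappings = candidate_mappings(own, existing)
print(mappings)
```

Terms without a candidate, such as `Company` here, may still overlap with a differently named concept (for example `Organization`), which label matching alone cannot detect.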

Page 21: Pal gov.tutorial4.session8 2.stepwisemethodologies

Ontology Evaluation

Several types of evaluation:

1. Usability evaluation: Validate whether the ontology produced satisfies (at least) the intended applications’ requirements.

2. Syntax evaluation: Validate whether the ontology is well-formed w.r.t. the language used.

3. Logical evaluation: Validate whether the ontology has axioms contradicting or implying each other.

4. Ontological evaluation: Validate whether the ontology has concepts that should be instances, sub-concepts that should be roles, etc. (The OntoClean methodology is very good for this evaluation.)

Page 22: Pal gov.tutorial4.session8 2.stepwisemethodologies

Check for Implications and Contradictions

Some tools can automatically check logical correctness (i.e., detect contradictions and implications), depending on the ontology language used (e.g., DogmaModeler for ORM, Racer for OWL).
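The kind of check such tools perform can be illustrated with a toy Python sketch over subclass and disjointness axioms only. It is not a description of how DogmaModeler or Racer work, and real reasoners cover far more constructs. The class names are hypothetical. It reports stated subclass axioms that are already implied by the others, and classes that cannot have instances because they fall under two disjoint classes.

```python
def superclasses(cls, subclass_of):
    """All classes reachable upward from cls (excluding cls itself)."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for child, parent in subclass_of:
            if child == current and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def implied_axioms(subclass_of):
    """Stated subclass axioms that already follow from the remaining ones."""
    implied = []
    for axiom in subclass_of:
        rest = [a for a in subclass_of if a != axiom]
        if axiom[1] in superclasses(axiom[0], rest):
            implied.append(axiom)
    return implied

def unsatisfiable_classes(subclass_of, disjoint):
    """Classes subsumed by two classes that are declared disjoint."""
    bad = set()
    for cls in {child for child, _ in subclass_of}:
        supers = superclasses(cls, subclass_of) | {cls}
        if any(a in supers and b in supers for a, b in disjoint):
            bad.add(cls)
    return bad

# Hypothetical axioms: (child, parent) pairs and disjoint pairs.
subclass_of = [
    ("Student", "Person"), ("Person", "Agent"), ("Student", "Agent"),
    ("Robot", "Agent"), ("RoboStudent", "Student"), ("RoboStudent", "Robot"),
]
disjoint = [("Person", "Robot")]

implied = implied_axioms(subclass_of)
contradictory = unsatisfiable_classes(subclass_of, disjoint)
print(implied, contradictory)
```

An implied axiom is redundant rather than wrong, while an unsatisfiable class usually signals a modeling error worth discussing with the domain experts.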

Page 23: Pal gov.tutorial4.session8 2.stepwisemethodologies

Some Guidelines

Clarity: The ontology engineer should communicate effectively with the domain experts (= ask the right questions):

– Natural language definitions.

– Give examples, alternatives, and contradictions to elicit knowledge.

– Emphasize distinctions.

Coherence: The ontology should be internally consistent

– Syntactically correct.

– Logically consistent.

– Ontologically consistent.

Extensibility: modularize the ontology so that it is easy to build, understand, and maintain. What should be in a module?

Reusability and Usability: be innovative in making this trade-off smartly.
