Post on 21-Dec-2015
Contents
Introduction – Ontologies, Ontology learning
Technical description
Ontology learning in the Semantic Information description
Ontology Learning – Process
Ontology Learning - Architecture
Ontology Learning data sources
Methods used in ontology learning
Tools of ontology learning
Uses of ontology learning
Ontologies
Provide a formal, explicit specification of a shared conceptualization of a domain that can be communicated between people and heterogeneous, widely spread application systems.
They have been developed in Artificial Intelligence and Machine Learning to facilitate knowledge sharing and reuse.
Unlike knowledge bases, ontologies have “all in one”:
formal or machine-readable representation
full and explicitly described vocabulary
full model of some domain
consensus knowledge: common understanding of a domain
easy to share and reuse
Ontology learning - General
Machine learning of ontologies
Main task: to automatically learn complicated domain ontologies
Explores techniques for applying knowledge discovery to different data sources (HTML documents, dictionaries, free text, legacy ontologies, etc.) in order to support the task of engineering and maintaining ontologies
Ontology learning – Technical description
The manual building of ontologies is a tedious task, which can easily result in a knowledge acquisition bottleneck. In addition, modeling by hand by a human expert is biased, error-prone and expensive
Fully automatic machine knowledge acquisition remains in the distant future
Most systems are semi-automatic and require human (expert) intervention and balanced cooperative modeling for constructing ontologies
Ontology learning – Process (2/2) Stages analysis:
Merging existing structures or defining mapping rules between these structures allows importing and reusing existing ontologies
Ontology extraction models major parts of the target ontology, with learning support fed from various input sources
The target ontology’s rough outline, which results from import, reuse and extraction is pruned to better fit the ontology to its primary purpose
Ontology refinement profits from the pruned ontology but completes the ontology at a fine granularity (in contrast to extraction)
The target application serves as a measure for validating the resulting ontology
The ontology engineer can begin this cycle again, for example to include new domains in the ontology under construction or to maintain and update its scope
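The order of the stages above can be sketched in a few lines of Python. This is only a toy illustration: the ontology is reduced to a set of concept names, the corpus is a hypothetical word list, and the thresholds and function names are assumptions made for the example, not part of any tool described in these slides.

```python
from collections import Counter

def import_and_reuse(*existing):
    """Merge the concept sets of existing ontologies into a rough outline."""
    merged = set()
    for concepts in existing:
        merged |= set(concepts)
    return merged

def add_frequent_terms(ontology, counts, min_count):
    """Add every corpus term seen at least min_count times."""
    return ontology | {term for term, n in counts.items() if n >= min_count}

def prune(ontology, counts):
    """Drop concepts that never occur in the domain corpus."""
    return {c for c in ontology if counts[c] > 0}

def validate(ontology, required):
    """Fraction of the concepts needed by the target application that are covered."""
    return len(ontology & required) / len(required)

# Hypothetical domain corpus, already tokenized.
corpus = "hotel room hotel price room hotel suite price".split()
counts = Counter(corpus)

ontology = import_and_reuse({"hotel", "airline"}, {"room"})   # import / reuse
ontology = add_frequent_terms(ontology, counts, min_count=2)  # extraction (coarse)
ontology = prune(ontology, counts)                            # pruning
ontology = add_frequent_terms(ontology, counts, min_count=1)  # refinement (fine)
score = validate(ontology, required={"hotel", "room", "suite"})
```

In a semi-automatic setting each stage would produce proposals for the ontology engineer rather than change the ontology directly.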
Ontology learning – Architecture (2/5)
Ontology Engineering Workbench: A sophisticated means for manual modeling and refining of the final ontology. The ontology engineer can browse the resulting ontology from the ontology learning process and decide to follow, delete or modify the proposals as the task requires.
Ontology learning – Architecture (3/5)
Management component: The ontology engineer uses the management component to select input data, that is, relevant resources such as HTML and XML documents, DTDs, databases or existing ontologies that the discovery process can further exploit. Then, using the management component, the engineer chooses from a set of resource-processing methods available in the resource-processing component and from a set of algorithms available in the algorithm library.
Ontology learning – Architecture (4/5)
Resource processing component: Depending on the available data, the engineer can choose various strategies for resource processing:
Index and reduce HTML documents to free text
Transform semi-structured documents such as dictionaries into a predefined relational structure
Handle semi-structured and structured schema data by following different strategies for import
Process free natural text
After first preprocessing the data according to one of these or similar strategies, the resource processing module transforms the data into an algorithm-specific relational representation.
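As an illustration of the first strategy (reducing HTML documents to free text), here is a minimal sketch using only Python's standard `html.parser` module. The class and function names are assumptions made for this example; the slides do not say how the resource processing component implements this step.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    """Reduce an HTML document to a single line of free text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

page = ("<html><head><style>p {color: red}</style></head>"
        "<body><h1>Hotels</h1><p>A hotel offers rooms.</p></body></html>")
text = html_to_text(page)
```

The resulting free text can then be handed to the same pipeline that processes natural-language input.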
Ontology learning – Architecture (5/5)
Algorithm library: A collection of various algorithms that work on the ontology definition and the preprocessed input data. Although specific algorithms can vary greatly from one type of input to the next, a considerable overlap exists for underlying learning approaches such as association rules, formal concept analysis or clustering.
Ontology Learning from Natural Language
Natural language texts exhibit morphological, syntactic, semantic, pragmatic and conceptual constraints that interact in order to convey a particular meaning to the reader. Thus, the text transports information to the reader and the reader embeds this information into his background knowledge
Through the understanding of the text, data is associated with conceptual structures and new conceptual structures are learned from the interacting constraints given through language
Tools that learn ontologies from natural language exploit the interacting constraints on the various language levels (from morphology to pragmatics and background knowledge) in order to discover new concepts and stipulate relationships between concepts
Ontology Learning from Semi-structured Data
HTML data, XML data, XML DTDs, XML Schemata and the like add more or less expressive semantic information to documents
A number of approaches treat ontologies as a common generalizing level that mediates between the various data types and data descriptions. Ontologies play a major role in allowing semantic access to these vast resources of semi-structured data
Learning ontologies from these data and data descriptions may considerably advance the application of ontologies and thus facilitate access to these data
Ontology Learning from Structured Data
The learning of ontologies from metadata, such as database schemata, in order to derive a common high-level abstraction of underlying data descriptions can be an important precondition for data warehousing or intelligent information agents
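One simple way to picture learning from a database schema is to map tables to concepts, columns to attributes and foreign keys to relations. The sketch below does this for an in-memory SQLite database; the schema and the mapping are assumptions chosen for illustration, not a method taken from the cited work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE room (
        id INTEGER PRIMARY KEY,
        price REAL,
        hotel_id INTEGER REFERENCES hotel(id)
    );
""")

def schema_to_ontology(conn):
    """Map tables to concepts, columns to attributes, foreign keys to relations."""
    concepts, relations = {}, []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        concepts[table] = [col[1] for col in columns]  # col[1] is the column name
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            # fk[2] is the referenced table, fk[3] the referencing column.
            relations.append((table, fk[3], fk[2]))
    return concepts, relations

concepts, relations = schema_to_ontology(conn)
```

Here `relations` records that a `room` refers to a `hotel` through `hotel_id`, which an engineer could later rename to a meaningful relation.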
Methods for learning ontologies (1/8)
Clustering
The elaboration of any clustering method involves the definition of two main elements: a distance metric and a classification algorithm
The Mo’K workbench supports the development of conceptual clustering methods for the (semi-)automatic construction of ontologies of a conceptual hierarchy type from parsed corpora
Methods for learning ontologies (2/8)
Clustering
Ontologies are organized as multiple hierarchies that form an acyclic graph, where nodes are term categories described by intension and links represent inclusion.
Learning through hierarchical classification of a set of objects can be performed in two main ways: top-down, by incremental specialization of classes, and bottom-up, by incremental generalization
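The two elements named above, a distance metric and a classification algorithm, can be shown with a small bottom-up example. The sketch assumes Jaccard distance over the sets of contexts in which each term occurs, and a plain agglomerative loop that merges the closest pair. Both are illustrative choices, not the methods implemented in Mo’K, and the term contexts are invented.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """Distance between two non-empty context sets: 1 - |a & b| / |a | b|."""
    return 1.0 - len(a & b) / len(a | b)

def agglomerate(contexts):
    """Bottom-up clustering: repeatedly merge the two closest clusters.

    contexts maps each term to the set of contexts it was observed in.
    Returns the merges in order, each as a pair of frozensets of terms.
    """
    clusters = {frozenset([term]): set(ctx) for term, ctx in contexts.items()}
    merges = []
    while len(clusters) > 1:
        # Sort the clusters so that ties are broken deterministically.
        ordered = sorted(clusters, key=sorted)
        a, b = min(combinations(ordered, 2),
                   key=lambda p: jaccard_distance(clusters[p[0]], clusters[p[1]]))
        merges.append((a, b))
        # The merged cluster is described by the union of its members' contexts.
        clusters[a | b] = clusters.pop(a) | clusters.pop(b)
    return merges

contexts = {
    "car": {"drive", "park"},
    "truck": {"drive", "load"},
    "apple": {"eat", "peel"},
    "pear": {"eat", "peel"},
}
merges = agglomerate(contexts)
```

Each merge corresponds to one generalization step: the merged cluster becomes a parent node whose children are the two clusters it was built from.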
Methods for learning ontologies (4/8)
Information Extraction Rules
We start with:
An initial hand-crafted seed ontology of reasonable quality, which already contains the relevant types of relationships between ontology concepts in the given domain
An initial set of documents that exemplify (informally) substantial parts of the knowledge represented in the seed ontology
Methods for learning ontologies (5/8)
Information Extraction Rules
Compared to other ontology learning approaches, this technique is not restricted to learning taxonomy relationships; it can also learn arbitrary relationships in an application domain.
A project that uses this technique is the FRODO project.
Methods for learning ontologies (6/8)
Association Rules
Association-rule-learning algorithms are used for prototypical applications of data mining and for finding associations that occur between items in order to construct ontologies (extraction stage)
‘Classes’ are expressed by the expert as a free text conclusion to a rule. Relations between these ‘classes’ may be discovered from existing knowledge bases and a model of the classes is constructed (ontology) based on user-selected patterns in the class relations
This approach is useful for solving classification problems by creating classification taxonomies (ontologies) from rules
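To make the idea of finding associations between items concrete, the sketch below computes support and confidence for single-item rules `x -> y` over a handful of transactions. The transactions (sets of concepts that co-occur in one document) and the thresholds are hypothetical. It illustrates generic association-rule mining, not the rule-based class discovery described in the last two points.

```python
from itertools import permutations

def association_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for rules x -> y."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for x, y in permutations(items, 2):
        both = sum(1 for t in transactions if x in t and y in t)
        with_x = sum(1 for t in transactions if x in t)
        support = both / n          # how often x and y occur together
        confidence = both / with_x  # how often y occurs when x does
        if support >= min_support and confidence >= min_confidence:
            rules.append((x, y, support, confidence))
    return rules

# Hypothetical transactions: concepts that co-occur in one document.
transactions = [
    {"hotel", "room"},
    {"hotel", "room", "price"},
    {"hotel", "price"},
    {"airline", "price"},
]
rules = association_rules(transactions, min_support=0.5, min_confidence=1.0)
```

With these inputs the only rule that passes both thresholds is `room -> hotel`, which an engineer might take as a hint for a relation between the two concepts.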
Methods for learning ontologies (7/8)
Association Rules – Example
A classification knowledge-based system with experimental results based on medical data (Suryanto & Compton, Australia)
Ripple Down Rules (RDR) were used to describe classes and their attributes:
Conclusion: Satisfactory lipid profile previous raised LDL noted
Condition: ((LDL <= 3.4) AND (Triglyceride is NORMAL) AND (Max(LDL) > 3.4)) OR ((LDL is NORMAL) AND (Triglyceride is NORMAL) AND (Max(LDL) is HIGH))
Experts were allowed to modify or add conclusions in order to correct errors
The conclusions of the rules formed the classes of the classification ontology
Methods for learning ontologies (8/8)
Association Rules – Example
Ontology learning methodology used:
First, class relations between rules were discovered. There were three basic relations: subsumption/intersection, mutual exclusivity and similarity
Second, more compound relations that appeared interesting were specified using the three basic relations
Finally, instances of these compound relations or patterns were extracted and the class model was assembled
Problems that occurred:
Very similar conclusions were sometimes identified as mutually exclusive in cases where there were different values for the same attribute
The method did not consider any other information about the classes themselves
Ontology learning tools – ASIUM (1/8)
Acronym for "Acquisition of Semantic knowledge Using Machine learning method"
The main aim of Asium is to help the expert in the acquisition of semantic knowledge from texts and to generalize the knowledge of the corpus
Asium provides the expert with an interface which first helps him or her to explore the texts and then to learn knowledge that is not in the texts
During the learning step, Asium helps the expert to acquire semantic knowledge from the texts, like subcategorization frames and an ontology. The ontology represents an acyclic graph of the concepts of the studied domain. The subcategorization frames represent the use of the verbs in these texts
Ontology learning tools – ASIUM (2/8)
Methodology: The input for Asium is syntactically parsed text from a specific domain. Asium extracts triplets of the form (verb, preposition or function if there is no preposition, lemmatized head noun of the complement). Next, using factorization, Asium groups together all the head nouns occurring with the same (verb, preposition/function) couple. These lists of nouns are called basic clusters. They are linked with the (verb, preposition/function) couples they come from.
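The factorization step can be pictured as grouping triplets by their (verb, preposition/function) couple. The sketch below is a simplified illustration with hypothetical triplets: it keeps each basic cluster as a plain set of nouns, and it makes no claim about how Asium stores its clusters.

```python
from collections import defaultdict

def basic_clusters(triplets):
    """Group head nouns by the (verb, preposition/function) couple they occur with."""
    clusters = defaultdict(set)
    for verb, slot, noun in triplets:
        clusters[(verb, slot)].add(noun)
    return dict(clusters)

# Hypothetical triplets as they might come out of a syntactic parser.
triplets = [
    ("travel", "by", "car"),
    ("travel", "by", "train"),
    ("drive", "object", "car"),
    ("drive", "object", "truck"),
    ("travel", "by", "car"),
]
clusters = basic_clusters(triplets)
```

Each key keeps the link back to the couple the nouns came from, so a cluster such as `{"car", "train"}` can still be traced to "travel by".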
Ontology learning tools – ASIUM (3/8)
Methodology: Asium then computes the similarity between all pairs of basic clusters. The nearest ones are aggregated, and this aggregation is suggested to the expert as a new concept. The expert defines a minimum threshold for gathering clusters into concepts. A learned concept can contain noise (e.g. parsing mistakes), sub-concepts the expert wants to identify, or over-generalization due to aggregation, so the expert’s contribution is necessary.
Ontology learning tools – ASIUM (4/8)
Methodology: After this, Asium will have learned the first level of the ontology. Asium computes similarity again, this time among all the clusters, both old and new, in order to learn the next level of the ontology. The cooperative process runs until there are no more possible aggregations. The output of the learning process is an ontology and subcategorization frames. The ontology represents an acyclic graph of the concepts of the studied domain. The subcategorization frames represent the use of the verbs in these texts.
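The level-by-level loop can be sketched as follows. The similarity used here is plain Jaccard overlap between noun sets, a stand-in for Asium's own measure, which these slides do not define. The `accept` callback is a hypothetical placeholder for the expert's validation.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap between two noun sets (a stand-in similarity measure)."""
    return len(a & b) / len(a | b)

def learn_level(clusters, threshold, accept=lambda a, b: True):
    """Propose one level of concepts by aggregating sufficiently similar clusters.

    accept(a, b) stands in for the expert, who may reject a proposal.
    """
    concepts = []
    for a, b in combinations(clusters, 2):
        if jaccard(a, b) >= threshold and accept(a, b):
            merged = a | b
            if merged not in clusters and merged not in concepts:
                concepts.append(merged)
    return concepts

def learn_ontology(basic, threshold):
    """Repeat level by level until no new aggregation is possible."""
    levels, clusters = [], list(basic)
    while True:
        new = learn_level(clusters, threshold)
        if not new:
            return levels
        levels.append(new)
        clusters += new  # old and new clusters are compared at the next level

basic = [
    frozenset({"car", "train"}),
    frozenset({"car", "truck"}),
    frozenset({"apple", "pear"}),
]
levels = learn_ontology(basic, threshold=0.3)
```

With this input a single level is learned, containing the concept that groups car, train and truck. The fruit cluster shares nothing with the others and stays as it is.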
Ontology learning tools – ASIUM (5/8)
Methodology
The advantages of this method are twofold:
First, the similarity measure identifies all concepts of the domain and the expert can validate or split them. Next the learning process is, for one part, based on these new concepts and suggests more relevant and more general concepts.
Second, the similarity measure will offer the expert aggregations between already validated concepts and new basic clusters in order to get more knowledge from the corpus.
Ontology learning tools – ASIUM (6/8)
The interface
This window allows the expert to validate the concepts learned by Asium.
Ontology learning tools – ASIUM (7/8)
The interface
This window displays the list of all the examples covered by the learned concept. This display allows the expert to visualize all the sentences which will be allowed if this class is validated.
Ontology learning tools – ASIUM (8/8)
The interface
This window displays the ontology as it actually is in memory, i.e. learned concepts and concepts to be proposed for a level (each blue circle represents a class).
Ontology learning tools – TEXT-TO-ONTO (1/8)
It provides semi-automatic ontology learning from text
It tries to overcome the knowledge acquisition bottleneck
It is based on a general architecture for discovering conceptual structures and engineering ontologies from text
Ontology learning tools – TEXT-TO-ONTO (4/8)
Architecture - Main components Text & Processing Management Component
The ontology engineer uses this component to select the domain texts exploited in the further discovery process. The engineer can choose among a set of text (pre-)processing methods available on the Text Processing Server and among a set of algorithms available in the Learning & Discovering component. The Text Processing Server returns text annotated with XML, and this XML-tagged text is fed to the Learning & Discovering component
Ontology learning tools – TEXT-TO-ONTO (5/8)
Architecture - Main components Text Processing Server
It contains a shallow text processor based on the core system SMES. SMES is a system that performs syntactic analysis on natural language documents
It is organized in modules, such as a tokenizer, morphological and lexical processing, and chunk parsing, that use lexical resources to produce mixed syntactic/semantic information
The results are stored in annotations using XML-tagged text
Ontology learning tools – TEXT-TO-ONTO (6/8)
Architecture - Main components Lexical DB & Domain Lexicon
SMES accesses a lexical database with more than 120,000 stem entries and more than 12,000 subcategorization frames that are used for lexical analysis and chunk parsing
The domain-specific part of the lexicon associates word stems with concepts available in the concept taxonomy and links syntactic information with semantic knowledge that may be further refined in the ontology
Ontology learning tools – TEXT-TO-ONTO (7/8)
Architecture - Main components Learning & Discovering component
Uses various discovering methods on the annotated texts, e.g. term extraction methods for concept acquisition.
Ontology learning tools – TEXT-TO-ONTO (8/8)
Architecture - Main components Ontology Engineering Environment-ONTOEDIT
Supports the ontology engineer in semi-automatically adding newly discovered conceptual structures to the ontology
Internally stores modeled ontologies using an XML serialization
Uses of ontology learning – Knowledge sharing (1/2)
Identifying candidate relations between expressive, diverse ontologies using concept cluster integration in multi-agent systems
Agents with diverse ontologies should be able to share knowledge by automated learning methods and agent communication strategies
Agents that do not know the relationships of their concepts to each other need to be able to teach each other these relationships (ontology learning)
Uses of ontology learning – Knowledge sharing (2/2)
Concept representation and learning on each agent:
Process: an agent sends a query to another agent and receives a response with new concepts. A new category is created from these concepts. The agent re-learns the ontology rules and if the new concept relation rules are verified, they are stored in the agent.
Uses of ontology learning – Interest matching (1/2)
Designing a general algorithm for interest matching is a major challenge in building online community and agent-based communication networks.
These algorithms can be applied in user categorization for an online community. Users’ behavior can be analyzed and matched against other users to provide collaborative categorization and recommendation services to tailor and enhance the online experience.
The process of finding similar users based on data from logged behavior is called interest matching.
Uses of ontology learning – Interest matching (2/2)
User interests can be described by ontologies as weighted tree hierarchies of concepts
Each node has a weight attribute to represent the importance of the concept
These weights can be used to calculate similarities between users
Learning process: a standard ontology is used and the websites the user visits can be classified and entered into the standard ontology to personalize it – if a user frequents websites of a category (instance of a class) it is likely he is interested in other instances of the class
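A minimal sketch of this idea follows. It assumes a tiny hypothetical standard ontology, gives every visit a weight of one that is propagated to the ancestors of the visited category, and compares users with cosine similarity. The slides do not specify the actual weighting or similarity function, so all three are illustrative choices.

```python
import math
from collections import Counter

# Hypothetical standard ontology: child concept -> parent concept.
PARENT = {
    "football": "sport",
    "tennis": "sport",
    "jazz": "music",
    "sport": "root",
    "music": "root",
}

def personalize(visited_categories):
    """Weight each concept by visits, propagating every visit up to its ancestors."""
    weights = Counter()
    for category in visited_categories:
        node = category
        while node != "root":
            weights[node] += 1
            node = PARENT[node]
    return weights

def similarity(w1, w2):
    """Cosine similarity between two users' concept-weight vectors."""
    dot = sum(w1[c] * w2[c] for c in w1.keys() & w2.keys())
    norm = (math.sqrt(sum(v * v for v in w1.values()))
            * math.sqrt(sum(v * v for v in w2.values())))
    return dot / norm if norm else 0.0

alice = personalize(["football", "football", "tennis"])
bob = personalize(["tennis", "jazz"])
carol = personalize(["jazz", "jazz"])
```

Because weights are propagated to parent concepts, two users who visit different sports sites still overlap on the shared `sport` node, which mirrors the assumption that interest in one instance of a class suggests interest in the others.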
Uses of ontology learning – Web Directory Classification
Ontologies and ontology learning can be used to create information extraction tools for collecting general information from the free text of web pages and classifying them in categories
The goal is to collect indicator terms from the web pages that may assist the classification process. These terms can be derived from the directory headings of a web page as well as its content.
The indicator terms along with a collection of interpretation rules can result in a hierarchy (ontology) of web pages.
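As a rough illustration, the sketch below collects indicator terms from a page's heading and content and applies a small table of interpretation rules to place each page in a two-level hierarchy. The rules, page data and function names are all hypothetical.

```python
import re

# Hypothetical interpretation rules: indicator term -> category path.
RULES = {
    "hotel": ("Travel", "Accommodation"),
    "flight": ("Travel", "Transport"),
    "recipe": ("Food", "Cooking"),
}

def indicator_terms(heading, content):
    """Collect known indicator terms from a page's directory heading and content."""
    words = re.findall(r"[a-z]+", f"{heading} {content}".lower())
    return [w for w in words if w in RULES]

def classify(pages):
    """Build a nested dict (category -> subcategory -> [urls]) from the rules."""
    hierarchy = {}
    for url, heading, content in pages:
        # Each distinct indicator term files the page under one category path.
        for term in set(indicator_terms(heading, content)):
            top, sub = RULES[term]
            hierarchy.setdefault(top, {}).setdefault(sub, []).append(url)
    return hierarchy

pages = [
    ("a.example", "Hotel deals", "Book a hotel near the beach"),
    ("b.example", "Cheap flight search", "Compare prices"),
    ("c.example", "Dinner", "A quick recipe for pasta"),
]
hierarchy = classify(pages)
```

A page that contains several indicator terms is filed under every matching category in this sketch; a real system would need rules for resolving such conflicts.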
Uses of ontology learning – E-mail classification (1/2)
KMi Planet
A web-based news server for communication of stories between members of the Knowledge Media Institute
Main goal: To classify an incoming story, obtain the relevant objects within the story, deduce the relationships between them and to populate the ontology
Integrate a template-driven information extraction engine with an ontology engine to supply the necessary semantic content
Uses of ontology learning – E-mail classification (2/2)
KMi Planet
There are three tools: PlanetOnto, MyPlanet and an IE tool
PlanetOnto supports several activities. One of them is ontology editing, which is where ontology learning comes in.
A tool called WebOnto provides Web-based visualisation, browsing and editing support for the ontology. The “Operational Conceptual Modelling Language”, OCML, is a language designed for knowledge modeling. WebOnto uses OCML and allows the creation of classes and instances in the ontology, along with easier development and maintenance of the knowledge models
Bibliography
M. Sintek, M. Junker, L. van Elst, A. Abecker, Using Information Extraction Rules for Extending Domain Ontologies, German Research Center for Artificial Intelligence (DFKI)
M. Vargas-Vera, J. Domingue, Y. Kalfoglou, E. Motta, S. Buckingham Shum, Template-Driven Information Extraction for Populating Ontologies, Knowledge Media Institute (UK)
G. Bisson, C. Nedellec, Designing clustering methods for ontology building, University of Paris
A. Maedche, S. Staab, The TEXT-TO-ONTO Ontology Learning Environment, University of Karlsruhe
A. Maedche, S. Staab, Ontology Learning for the Semantic Web, University of Karlsruhe
H. Suryanto, P. Compton, Learning classification taxonomies from a classification knowledge based system, University of New South Wales (Australia)
Proceedings of the First Workshop on Ontology Learning OL'2000, Berlin, Germany, August 25, 2000
Proceedings of the Second Workshop on Ontology Learning OL'2001, Seattle, USA, August 4, 2001
ASIUM web page http://www.lri.fr/~faure/Demonstration.UK/Presentation_Demo.html