Seamless semantics - avoiding semantic discontinuity
Transcript of Seamless semantics - avoiding semantic discontinuity
Steffen Staab · Seamless Semantics · Institute for Web Science and Technologies · University of Koblenz-Landau, Germany & Web and Internet Science Group · ECS · University of Southampton, UK
Seamless Semantics: Avoiding Semantic Discontinuity
Steffen Staab
University of Southampton & Universität Koblenz-Landau
A professorial colleague:
"I have bought myself a 30-inch screen, because half of my work is re-typing existing material."
Why do we have to re-type at all?
Why do we have to retype at all?
Examples:
• CVs
• Bibliographies
• Visa entry forms
• Addresses
• Purchase orders
• ...
Reasons:
• No semantics
• Other semantics
• Other formats
• Other importance ranking
• (partial) incompleteness
"Solution 1": Taylorism (aka workflow)
https://www.flickr.com/photos/treborscholz/2857596696
„Solution 2“: Conversion tools
Why do we know the "how" (see talk by Maria-Esther & Axel) but not the "what"?
Traditional Information System
Business Logic
Structured Data · Unstructured Data
Presentation and Interaction
Characteristics:
• Processes known
• Data structures known
• Meaning of data in schema and implicit in code
Information ecosystems nowadays
Examples
• Open Data
• 1000s of DBs per company
• Ad-hoc data
Characteristics
• Some structure
• Late structure
• Social context
• Meaning of data most important
How does data receive meaning?
Explicit:
• Formal schema/ontology
  – By someone else?
Implicit:
• Names are just used for describing
Social:
• Communities converge
  – By discussion
  – By emergence
Meaning?
Talking many languages...
Sub languages
• For consumer
  – title, ...
• Global retailers
  – barcode
• US food industry
  – serving size, calories, ...
• Producer
  – batch number
...
https://www.youtube.com/watch?v=ga1aSJXCFe0
Depending on who you are – you encounter the (un)expected, the (un)known, the (un)understandable,...
The Unknown
https://www.flickr.com/photos/wrobel/8175902444/
We know a bit....
1. URIs as identifiers
2. HTTP lookup
3. RDF (triples)
4. relations, also to other locations
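The four points above are the Linked Data principles. The following is a minimal Python sketch of how they fit together, not a real client: the dictionary `WEB` stands in for HTTP lookups, and all example.org / example.net URIs and the data are made up for illustration.

```python
# Hypothetical in-memory "web": dereferencing a URI returns the triples
# published at that location. A real client would issue an HTTP GET instead.
Triple = tuple[str, str, str]

WEB: dict[str, list[Triple]] = {
    "http://example.org/artist/1": [
        ("http://example.org/artist/1", "foaf:name", "Example Artist"),
        # Relation pointing to a resource at another location (principle 4).
        ("http://example.org/artist/1", "foaf:made", "http://example.net/record/7"),
    ],
    "http://example.net/record/7": [
        ("http://example.net/record/7", "dc:title", "Example Record"),
    ],
}


def lookup(uri: str) -> list[Triple]:
    """Stand-in for an HTTP lookup (principle 2); unknown URIs yield no triples."""
    return WEB.get(uri, [])


def follow_links(start: str) -> set[Triple]:
    """Collect all triples reachable from `start` by dereferencing URI objects."""
    seen: set[str] = set()
    todo = [start]
    found: set[Triple] = set()
    while todo:
        uri = todo.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for s, p, o in lookup(uri):
            found.add((s, p, o))
            if o.startswith("http://"):  # the object is itself an identifier
                todo.append(o)
    return found
```

Calling `follow_links("http://example.org/artist/1")` also picks up the record title, although the starting document never mentioned it; this is the "discovering new sources by following links" idea in miniature.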
What is/should Linked Data be good for?
• Data integration is (relatively) easy
  – Migrating different data sources to linked data is (relatively) easy
• Late schema is easy
  – Just add some more fields
• Ignoring data is easy
  – Think of crisps
• Serendipitous use
  – Discover new information & new sources by following links
• Data repurposing / pointing
  – Use what others have done at both schema and data level
Dealing with the unknown data and data schema
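Three of these claims (integration, late schema, ignoring data) can be illustrated with plain sets of triples. This is only a sketch under simplifying assumptions: the `ex:` names and values are invented, and there are no blank nodes, so merging two graphs reduces to set union.

```python
# Two hypothetical sources describing the same bag of crisps.
source_a = {("ex:crisps1", "ex:title", "Salted Crisps")}
source_b = {
    ("ex:crisps1", "ex:barcode", "0000000000000"),
    ("ex:crisps1", "ex:batchNumber", "B-17"),
}

# Data integration: graphs are sets of triples, so merging is set union.
merged = source_a | source_b

# Late schema: a new "field" is just one more triple, no migration needed.
merged.add(("ex:crisps1", "ex:calories", "150"))


def view(graph, subject, understood):
    """Ignoring data: keep only the predicates this consumer understands."""
    return {p: o for s, p, o in graph if s == subject and p in understood}


# A consumer reads title and calories and silently skips barcode and batch number.
consumer_view = view(merged, "ex:crisps1", {"ex:title", "ex:calories"})
```

The producer and the retailer would call `view` with a different set of predicates on the same merged graph.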
Issue: From Data Publishing to Understanding
?
De-contextualization Re-contextualization
Publishing data whose structure you know is easier than understanding data you do not know
Agenda
1. Reducing language friction
2. Reducing re-use friction
3. Reducing information loss
Reducing Language Friction
Italian
Spanish
French
Language Dimensions (in the Semantic Web)
1 Lexicalization
2 Modularization
3 Generalization/Specialization
4 Sophistication
1 Lexicalization
http://img.remastersys.com/nimg/c1/a4/20dc0bfd21b1eac7c08889238b38-300x300-0/recyclable_laminated_plastic_potato_chips_bag_with_back_side_sealing.jpg
[Cimiano et al]
https://www.flickr.com/photos/theimpulsivebuy/11056507874
2 Modularization
Multimedia (@WeST)
• FOAF
• F event ontology
• COMM
• ..
Sensors (@Galway)
• SSN ontology
• COMM
• F
Italian
Spanish
French
[Scherp et al][Leggieri et al]
2 Pattern as Micro-Module for Image Tagging
[Scherp&Saathoff, WWW-2010][Troncy et al 2007]
3 Understanding via generalization
Fracture of femur → Fracture of bone
The femur is the bone in your upper leg
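The femur example can be read as a lookup along a generalization hierarchy: a consumer who does not know the specific term can still fall back to a more general one. A minimal sketch, assuming a hand-written hierarchy `SUBCLASS_OF` with invented class names (not taken from any real medical ontology):

```python
# Hypothetical generalization hierarchy: each class maps to its direct superclass.
SUBCLASS_OF = {
    "FractureOfFemur": "FractureOfBone",  # holds because a femur is a bone
    "FractureOfBone": "Injury",
}


def generalizations(cls):
    """Return all superclasses of `cls`, from most specific to most general."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def most_specific_known(known_terms, cls):
    """Return the most specific term the consumer understands, or None."""
    for candidate in [cls] + generalizations(cls):
        if candidate in known_terms:
            return candidate
    return None
```

A consumer that only knows `FractureOfBone` still gets a usable, if less precise, reading of data typed as `FractureOfFemur`.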
3 Generalization/Specialization
Ontology modules: DOLCE · Descriptions & Situations · Ontology of Plans · Ontology of Information Objects · Core Software Ontology · Core Ontology of Software Components · Core Ontology of Web Services
Diagram labels: specificity · generic core · reused ontology modules · contributed ontology modules
http://cos.ontoware.org
4 Sophistication
4 Ontology API Model for Image Tagging
4 Automatically Generated Ontology API
4 Comparing the two structures
4 OntoMDE Workflow
Model of Ontologies (MoOn)
• Adding declarative layer: structuring the ontologies into semantic units
Ontology API Model (OAM)
• Adding declarative layer: structuring pragmatic units specifying how entities are to be used together
Reducing Re-use Friction: Semantic Programming
Example scenario: Jamendo
Data about license-free music
• ~ 1 million triples
• classes and predicates from 18 different ontologies
  – FOAF, Tag ontology, music ontology, …
Simple programming task:
• List for every music artist, all the records they made
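To make the programming task concrete, here is a minimal sketch over a handful of in-memory triples. The data is invented, and the use of `foaf:made` to link an artist to a record is an assumption about the data set's modelling; the real Jamendo data would have to be explored to confirm which predicate applies.

```python
# Invented sample data in the style of the scenario (not actual Jamendo triples).
triples = [
    ("ex:artist1", "rdf:type", "mo:MusicArtist"),
    ("ex:artist1", "foaf:made", "ex:record2"),
    ("ex:artist1", "foaf:made", "ex:record1"),
    ("ex:artist2", "rdf:type", "mo:MusicArtist"),
    ("ex:record1", "rdf:type", "mo:Record"),
    ("ex:record2", "rdf:type", "mo:Record"),
]


def records_by_artist(graph):
    """List, for every music artist, all the records they made."""
    artists = {s for s, p, o in graph if p == "rdf:type" and o == "mo:MusicArtist"}
    result = {artist: [] for artist in sorted(artists)}
    for s, p, o in graph:
        # Assumed link from artist to record; adjust after exploring the source.
        if p == "foaf:made" and s in result:
            result[s].append(o)
    return {artist: sorted(records) for artist, records in result.items()}
```

Even in this toy version, the programmer has to know the class name and the linking predicate in advance, which is the schema exploration effort the following slides address.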
Software Development Process Overview
Process steps:
1. Creation of initial data model
2. Exploration of the data source
3. Creation of model in code
4. Query design / implementation
5. Mapping of query results
Artifacts: data model design · revised data model design · data model prototype · data queries · final data model
Accessing Artists Using Apache Jena
From artists to songs
Observations
• SPARQL queries are strings
• Results are strings
• Requires good understanding of the data source
RDF typing is lost
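These observations can be seen in a small sketch of the usual access pattern. The slides use Apache Jena; Python is used here only for illustration. The query is an opaque string, the `RESPONSE` value is a hand-written example in the W3C SPARQL JSON results format, and the `MusicArtist` dataclass is a hypothetical hand-coded type.

```python
from dataclasses import dataclass

# The query is just a string: the host language cannot check it.
QUERY = """
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist ?name WHERE { ?artist a mo:MusicArtist ; foaf:name ?name . }
"""

# Hand-written stand-in for what an endpoint would return for QUERY.
RESPONSE = {
    "head": {"vars": ["artist", "name"]},
    "results": {"bindings": [
        {"artist": {"type": "uri", "value": "http://example.org/artist/1"},
         "name": {"type": "literal", "value": "Artist One"}},
    ]},
}


@dataclass(frozen=True)
class MusicArtist:
    """Type written by hand; nothing ties it to the mo:MusicArtist class."""
    uri: str
    name: str


def to_artists(response):
    """Map string-valued result rows onto the hand-written type."""
    # A misspelt key such as row["artsit"] would only fail here, at run time.
    return [
        MusicArtist(uri=row["artist"]["value"], name=row["name"]["value"])
        for row in response["results"]["bindings"]
    ]
```

The mapping in `to_artists` is exactly the step where the RDF typing has to be re-established manually.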
Programming Language Support for RDF Access
Static Typing
• Errors detected before execution
  – Misspelling discovered by compiler!
  – Anecdote: 2nd place because of misspelt code
• Static types are a form of documentation
  – Less knowledge about data source required
• Better IDE integration / autocompletion
• Code generation
  – Sommer
  – Winter
  – OntoMDE
Dynamic Typing
• E.g. ActiveRDF (Oren et al 2007)
• "convention over configuration"
• dynamic metaprogramming allows for slick code
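The trade-off on the dynamic side can be sketched with Python's `__getattr__` hook. This is written in the spirit of dynamic approaches such as ActiveRDF but is not its API; the `Resource` class, the default `foaf:` prefix convention, and the data are assumptions made for illustration.

```python
class Resource:
    """Dynamic wrapper: attribute names are resolved against the data at run time."""

    def __init__(self, uri, graph, prefix="foaf:"):
        self._uri = uri
        self._graph = graph
        self._prefix = prefix  # convention: attribute `name` means `foaf:name`

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        predicate = self._prefix + name
        values = [o for s, p, o in self._graph if s == self._uri and p == predicate]
        if not values:
            raise AttributeError(f"{self._uri} has no {predicate}")
        return values


graph = [("ex:artist1", "foaf:name", "Artist One")]
artist = Resource("ex:artist1", graph)
names = artist.name  # slick: no class definition or code generation needed
```

The code is short, but a misspelling such as `artist.nmae` is only reported as an `AttributeError` when that line executes, which is the weakness static typing is meant to remove.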
Programming with Linked Data
Programming with Linked Data
Tasks of the Programmer
1 Schema exploration
2 Programming code types
3 Programming queries
4 Programming procedures for creating, manipulating, and persisting objects
Node Path Query Language Using Autocompletion
Exploration of classes
Node Path Query Language Using Autocompletion
Exploration of classes
Exploration of relations
Node Path Query Language: Query Formulation
Exploration of classes
Exploration of relations
Querying for instances
Type set of mo:MusicArtist
No definition or declaration needed
Node Path Query Language for Code Development
Exploration of classes
Exploration of relations
Querying for instances
Developing code with queries
All translated into SPARQL queries at
• Development time
• Type inference at compile time (but also as part of IDE)
• Querying again at run time
One language to bind them all
Node Path Query Language for Code Development
Exploration of classes
Exploration of relations
Querying for instances
Developing code with queries
Developing code with new classes
All translated into SPARQL queries at
• Development time
• Run time update
• Persistence!
NPQL
NPQL (Node Path Query Language)
• Intensional Queries: describing RDF classes and properties for reuse in IDE and in host language metaprogramming
• Extensional Queries: class instances and property instances
• Compilation to SPARQL for reuse of existing endpoints
Ongoing discussion about details of NPQL
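The idea of compiling a node path to SPARQL can be sketched as follows. This is not NPQL syntax or its actual compilation scheme; `NodePath`, `follow`, and the two `*_sparql` methods are hypothetical names, and prefix declarations are omitted from the generated strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePath:
    """Hypothetical path: a start class followed by a chain of predicates."""
    start_class: str
    steps: tuple[str, ...] = ()

    def follow(self, predicate: str) -> "NodePath":
        """Extend the path by one property step."""
        return NodePath(self.start_class, self.steps + (predicate,))

    def _patterns(self) -> list[str]:
        lines = [f"?n0 a {self.start_class} ."]
        for i, predicate in enumerate(self.steps):
            lines.append(f"?n{i} {predicate} ?n{i + 1} .")
        return lines

    def instances_sparql(self) -> str:
        """Extensional-style query: the nodes at the end of the path."""
        target = f"?n{len(self.steps)}"
        return f"SELECT DISTINCT {target} WHERE {{ " + " ".join(self._patterns()) + " }"

    def properties_sparql(self) -> str:
        """Schema-exploration query: predicates used on the end nodes,
        e.g. to feed autocompletion in an IDE."""
        end = f"?n{len(self.steps)}"
        patterns = self._patterns() + [f"{end} ?p ?o ."]
        return "SELECT DISTINCT ?p WHERE { " + " ".join(patterns) + " }"


records = NodePath("mo:MusicArtist").follow("foaf:made")
query = records.instances_sparql()
```

Because both kinds of query end up as ordinary SPARQL strings, an existing endpoint can answer them without any server-side changes.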
LITEQ
NPQL (Node Path Query Language)
• Intensional Queries
• Extensional Queries
• Compilation to SPARQL
LITEQ (Language Integrated Types, Extensions and Queries)
• Implementation of NPQL as F# Type Provider in Visual Studio
• Autocompletion using NPQL queries
• Automatic typing of extensional query results by intensional queries
Outlook: Programming with Linked Data
• More expressive query languages
  – Derived data types in tractable description logics!
• More precise combined type inference
  – (derived) type from data source
  – type inference in programming language
• Programming across data sources
  – Federated queries
  – Link-traversal-based queries (the unknown sources)
• Integration of schema induction
  – Low quality of schema/ontologies
• Improved autocompletion
Conclusion
Issue: From Data Publishing to Unknown Data Understanding
Cognition · Storytelling · Pragmatics
Ontology Patterns · Conceptual Modeling · Metamodels · ...
Quantity · Pertinence · Manner
What is missing?
...a lot...
• Indexing
• Search
• Data and schema quality
• Pragmatics
• ...
Semantic Web
Social Web & Web Retrieval
Interactive Web & Human Computing
Web & Economy
Software & Services
Computational Social Science
Thank You!