Semantic Wiki: Social Semantic Web In Action

Post on 11-May-2015

description

My tech talk for Tsinghua University alumni in the Seattle area, given for the centennial celebration

Transcript of Semantic Wiki: Social Semantic Web In Action:

Semantic Wikis: Social Semantic Web In Action

2011-03-25

Specially prepared for Tsinghua University alumni in the greater Seattle area for the centennial celebration

About Me: Jesse Wang 王 (嘉 )欣

1996

1997 2005

1988 1998

Who is Vulcan

What does Vulcan do

Vulcan Inc. was established in 1986 by investor and philanthropist Paul G. Allen, co-founder of Microsoft, to manage his business and philanthropic efforts. Allen is chairman of Vulcan and his sister, Jody Allen, is president and CEO.

It all began with a vision…

Now the Vision Continues as Project Halo

Project Halo is a staged, long-range research effort by Vulcan Inc. towards the development of a "Digital Aristotle", a reasoning system capable of answering novel questions and solving advanced problems in a broad range of scientific disciplines and related human affairs. The project focuses on creating two primary functions: a tutor capable of instructing and assessing students in those subjects, and a research assistant with broad, interdisciplinary skills to help scientists and others in their work.

Automatic Question Answering System

Project Halo’s Focus Areas

AURA
• Automated User-Centered Reasoning and Acquisition System
• Textbook you can talk to

SILK
• Semantic Inference with Large Knowledge-base
• Non-monotonic rule system / RIF

SMW+
• Semantic MediaWiki+
• Knowledge authoring with SMEs

Plus other related semantic technologies and commercial efforts

Question Interpretation

Advanced Reasoning

Knowledge Acquisition

Project Halo’s Goals

Address the core problems in Knowledge Bases
– scale
– brittleness

Have high impact

[Chart: KB effort (cost, people, …) versus KB size (number of assertions, complexity, …), with labels Vulcan, Now, and Future]

Crowdsourcing for Better Knowledge Acquisition

Wiki as a Crowdsourcing Tool

Consensus

This distinguishes wikis from other publication tools

Consensus in Wikis Comes from

Collaboration
– ~17 edits/page on average in Wikipedia (with high variance)
– Wikipedia’s Neutral Point of View

Convention
– Users follow customs and conventions to engage with articles effectively

Software Support Makes Wikis Successful

• Trivial to edit by anyone
• Tracking of all changes, one-step rollback
• Every article has a “Talk” page for discussion
• Notification facility allows anyone to “watch” an article
• Sufficient security on pages, logins can be required
• A hierarchy of administrators, gardeners, and editors
• Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors

Success of Wikis

One of humanity’s greatest inventions

Wikis are Great, But…

Wiki Clock?

How About Hidden Goodies in the Wiki?

Wikipedia has articles about…

• … all cities
• … their populations
• … their mayors
• … the skyscrapers

So can I ask for a list of the world’s 5 largest cities with a female mayor?
Or skyscrapers in Shanghai with 50+ floors and built after 2000?

Enter Semantics…

To answer questions like:
• The female mayors of the top 10 cities, sorted by population, starting year, age…
• All skyscrapers in China (Japan, Thailand,…) of 50 (40/60/70) floors or more, and built in year 2000 (2001/2002) and after, sorted by built year, floors…, grouped by cities, regions…
• Median (average) base annual salary of CEOs of Fortune 100 companies in America (Europe, Asia,…)
• All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
• Sci-Fi movies made after year 2000 that cost less than $10M and grossed more than $30M
• A map showing where all Mercedes-Benz vehicles are manufactured
• And many more

What is a Semantic Wiki

A wiki that has an underlying model of the knowledge described in its pages.

To allow users to make their knowledge explicit and formal

Semantic Web Compatible

Semantic Wiki

Two Perspectives

Wikis for Metadata

Metadata for Wikis

Characteristics of Semantic Wikis

Semantic Wikis

List of Semantic Wikis

AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi – Knowledge in a Wiki
Knoodl – Semantic Collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence

Basics of Semantic Wikis

Still a wiki, with regular wiki features
– Category/Tags, Namespaces, Title, Versioning, ...

Typed Content (built-ins + user created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …

Typed Links (e.g. properties)
– “capital_of”, “contains”, “born_in”…

Querying Interface Support
– E.g. “[[Category:Member]] [[Age::<30]]” (in SMW)

SMW Markup Syntax

[[Property::Value | Display]]

Tsinghua is a university located in [[Has location::Beijing]], with [[Has population::27,000]] students.

In page "Property:Has location":

[[Has type::Page]]

In page "Property:Has population":

[[Has type::number]]
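
The markup above can be read as subject-property-value triples. The following is only a minimal Python sketch of that idea, not how Semantic MediaWiki itself parses pages: the function name `extract_annotations` and the regular expression are assumptions made for illustration, and only the simple `[[Property::Value | Display]]` form from the slide is handled.

```python
import re

# Matches [[Property::Value]] and [[Property::Value | Display]].
# Illustrative pattern only; real SMW parsing handles many more cases.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_annotations(page_title, wikitext):
    """Return (subject, property, value) triples found in the wikitext."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        prop = match.group(1).strip()
        value = match.group(2).strip()
        triples.append((page_title, prop, value))
    return triples


text = ("Tsinghua is a university located in [[Has location::Beijing]], "
        "with [[Has population::27,000]] students.")
triples = extract_annotations("Tsinghua", text)
for triple in triples:
    print(triple)
```

Plain links and category links such as `[[Category:Member]]` use a single colon, so this pattern leaves them alone.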

Define Classes

On page Beijing

One possible solution:
– Beijing is a [[Is a::city]]

Beijing is a city in [[Has country::China]], with population [[Has population::2,200,000]].

[[Category:Cities]]

Categories are used to define classes because they are better for class inheritance.

The Jin Mao Tower (金茂大厦 ) is an 88-story landmark supertall skyscraper in …

[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]

Category:Skyscrapers in China Category: Skyscrapers by country
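
To illustrate why categories suit class inheritance, here is a small Python sketch that walks a category hierarchy. The parent links in `parents` are an assumption inferred from the category names on the slide, and `ancestors` and `belongs_to` are hypothetical helper names, not part of MediaWiki.

```python
# Assumed subcategory -> parent links, inferred from the names on the slide.
parents = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}


def ancestors(category, parents):
    """Return all categories reachable by following parent links upward."""
    seen = []
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


def belongs_to(page_categories, target, parents):
    """True if any of the page's categories is the target or inherits from it."""
    return any(
        cat == target or target in ancestors(cat, parents)
        for cat in page_categories
    )


jin_mao_categories = ["Skyscrapers in Shanghai", "Hotels in Shanghai"]
print(belongs_to(jin_mao_categories, "Skyscrapers in China", parents))
```

A page tagged only with "Skyscrapers in Shanghai" is thus also found when asking for "Skyscrapers in China".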

Database-style Query over Wiki Data

{{#ask:[[Category:Skyscrapers]][[Located in::China]][[Floor count::>50]][[Year built::<2000]] …

}}

Example: Skyscrapers in China higher than 50 stories, built before 2000

ASK/SPARQL query target

Data via DBpedia
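
The query above behaves like a conjunction of filters over page properties. The Python sketch below mimics that over an in-memory dictionary; it is not the SMW query engine. The names `matches` and `ask` are hypothetical, the comparators are treated as strict for simplicity (the real engine's comparator semantics may differ), and only the Jin Mao Tower values come from the slides, while "Placeholder Tower" is invented test data.

```python
import operator

# Illustrative page data. Jin Mao Tower values are taken from the slides
# (88 stories, 1998); "Placeholder Tower" is a made-up entry.
pages = {
    "Jin Mao Tower": {"Category": "Skyscrapers", "Located in": "China",
                      "Floor count": 88, "Year built": 1998},
    "Placeholder Tower": {"Category": "Skyscrapers", "Located in": "China",
                          "Floor count": 30, "Year built": 2005},
}

OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def matches(props, conditions):
    """True if the properties satisfy every (property, op, value) condition."""
    return all(
        prop in props and OPS[op](props[prop], value)
        for prop, op, value in conditions
    )


def ask(pages, conditions):
    """Return the sorted titles of all pages that satisfy the conditions."""
    return sorted(t for t, props in pages.items() if matches(props, conditions))


conditions = [
    ("Category", "=", "Skyscrapers"),
    ("Located in", "=", "China"),
    ("Floor count", ">", 50),
    ("Year built", "<", 2000),
]
result = ask(pages, conditions)
print(result)
```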

What is the Promise of Semantic Wikis?

Semantic Wikis promise Consensus over Data

Combine low-expressivity data authorship with the best features of traditional wikis

User-governed, user-maintained, user-defined

Easy to use as an extension of text authoring

The ultimate data aggregator

One Key Helpful Feature of Semantic Wikis

Semantic Wikis are “Schema-Last”

Databases require DBAs and schema design; Semantic Wikis develop and maintain the schema in the wiki
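
One way to picture "schema-last" is a store that accepts facts first and lets the schema be declared afterwards. The Python sketch below is an illustration under that assumption only; the class `SchemaLastStore` and its methods are hypothetical and do not reflect how SMW stores data.

```python
class SchemaLastStore:
    """Toy fact store: annotations come first, type declarations come later."""

    def __init__(self):
        self.facts = []            # (page, property, value) tuples
        self.declared_types = {}   # property -> type name, filled in over time

    def annotate(self, page, prop, value):
        # No up-front schema check: any property can be used immediately.
        self.facts.append((page, prop, value))

    def declare_type(self, prop, type_name):
        # Comparable to editing a "Property:..." page in the wiki.
        self.declared_types[prop] = type_name

    def schema(self):
        """Properties in use, with their declared type or 'undeclared'."""
        used = {prop for _, prop, _ in self.facts}
        return {p: self.declared_types.get(p, "undeclared") for p in sorted(used)}


store = SchemaLastStore()
store.annotate("Beijing", "Has country", "China")
store.annotate("Beijing", "Is a", "city")
before = store.schema()
store.declare_type("Has country", "Page")
after = store.schema()
print(before)
print(after)
```

The schema is derived from what users have already written, and it can be refined at any time without a migration step.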

Semantic MediaWiki in 2010

• Open source (GPL)
• Well documented
• Active mailing list
• Commercial support available
• World-wide community
• Regular conferences
– Next SMWCon 4/28-30, 2011 Arlington, VA

http://semantic-mediawiki.org/

Very stable SMW core

Mature while still growing, slowly but steadily

SMW Extensions

Data I/O
• Halo Extensions, Semantic Forms, Semantic Notification, …

Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Enhanced Retrieval, Search…

Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts…

Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules…
• Semantic WikiTags and Subversion Integration extensions
• Upcoming Linked Data Extension, with R2R and SILK from F.U.Berlin

Wikis Can Help Information Management

• Business Intelligence
• Finding Expertise
• Internal Encyclopedia
• Documentation
• Enterprise Search

Crowd Sourcing is a Great Solution!

Research = Locate and Find Data ?

Example I: KnowIT in Johnson & Johnson

Most Frequently Asked Questions: (J&J example)
– What are the directions between two J&J sites?
– What is the meaning of KOL? HLM? DRU?
– What data sources can we use to compare biological pathways?
– Can you give us a list of R&D applications, related servers and stakeholders and send us an update every six months?

Capture Facts About Things
– Definitions, concepts, questions
– Locations
– Data sources
– Organizations and people
– Technologies and systems

System Architecture

Example II: Knowledge Encapsulation Framework

Allow modelers to exploit the ‘information resources’ they have and discover new, potentially relevant material across new media types

KEF aims to provide:
– an effective method for storing, retrieving, reviewing and annotating your documents
– an environment where you can share these materials with team members and discuss
– a mechanism to discover new, related information for social and traditional media
– a means to link this material to model representations to aid analysis and game-play

Achieved by a semantic wiki enabled with an NLP pipeline

Example III: Ultrapedia – An Analytical Semantic Wikipedia

Ultrapedia: An SMW demo built to explore general knowledge acquisition in a wiki

Wikipedia merged with the power of a database
– Data extracted from Wikipedia Infobox and Table data; stored in RDF
– For Authors: tools to create more compelling articles
  • Great visualizations: charts, tables, timelines, photos, analytics
  • Always up-to-date across the Encyclopedia
  • Encourage data consistency and find data errors
  • Link in other web data sources
– For Readers:
  • Enhanced articles and data interaction
  • Faceted navigation
  • Sophisticated queries (both standing and ad-hoc)

Maintenance via the Wikipedia update process
– Data is from the article text, with simple ways for article authors to maintain and extend it.
– Authors and readers always in the loop for merging, updating, validating, mapping

Graph Views of the Acceleration Data

Dynamic Mapping and Charting

Information Discovery via Visualization

Video: Semantic Wikis for A New Problem

Social tag-based characterization

Keyword search over tag data

Inconsistent semantics

Easy to engineer

Increasing technical complexity → ← Increasing User Participation

Algorithm-based object characterization

Database-style search

Consistent semantics

Extremely difficult to engineer

Social database-style characterization

Database search + wiki text search

Semantic consistency via wiki mechanisms

Easy to engineer

Semantic Entertainment Wiki

Semantic Seahawks Football Wiki

Based on Simple Templates and Forms

Semantic Entertainment: Query Result Highlight Reel

Commercial Look/Feel

Play-by-play video search

Highlight reel generation

Search on crowd-defined patterns (“touchdowns with big hits”)

Tree-based navigation widget

Very favorable economics

Demo

The Inspiration

We started with a wiki

We built a web site

We now have an application

We CAN Build Applications (Fairly) Easily

With all the extensions of Semantic MediaWiki.

Data I/O
• Halo Extensions, Semantic Forms, Semantic Notification, …

Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Enhanced Retrieval, Search…

Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts…

Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules…
• Semantic WikiTags and SVN Integration extensions
• Upcoming Linked Data Extension, with R2R and SILK from FUB

Social Semantic Web Applications

Collaborative Proposal Management at BT with SMW+

Active Bid Viewer Service Desk Selector

Social Semantic Web Applications

Omitting x examples, y pictures and z lines of text…

Case Study 2 and Demo: Project Management with SMW+

• Automatically populate tables
• Just the data you want, at the level you want
• Calendars and timelines
• Workflows
• Personal menus
• Form-oriented inputs
• Notifications via email/RSS
• MS Office integration
• SVN integration

Vulcan Project Management Wiki (Story)

Template and style sheet customizations

Related content automatically included

Vulcan Project Management Wiki (Task)

Color codes to indicate types and status

SVN integration automatically marks a task “Completed” and relates it to the repository

Vulcan Project Management Wiki (Visualizations)

Demo

Screenshot of a Sprint page

http://wiking.vulcan.com/dev/index.php/Sprint_101020

Data automatically generated via template queries on page

Requirements for Wiki “Developers”

One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin

Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
  • Customize templates, forms, styles, skin, etc.

The bar to build applications is dramatically lowered
– “Source code” is part of the open content of the wiki too!

Effectiveness of SMW as a Platform Choice

Packaged Software
☺ Very quick to obtain
☹ Hard to customize
☹ Expensive
Examples: Microsoft Project, Version One, Microsoft SharePoint

Custom Development
☹ Slow to develop
☺ Extremely flexible
☹ High cost to develop and maintain
Examples: .NET Framework, J2EE, …, Ruby on Rails

SMW + Extensions
☺ Still quick to program
☺ Easy to customize
☺ Low-moderate cost
Examples: Vulcan Project Wiki, B.L.S., RPI map

Conclusions

Semantic MediaWiki+ (http://smwforum.ontoprise.com)
– Open-source, growing semantic wiki software system
– Wiki-style text + semantic markups
– Collaborative, user-governed subject models and data curation
– Simple and extensible data models with easy import/export

SMW+ has many government and industry users
– People built applications with it

Knowledge management via crowds can work
– A way to leverage and exploit web-collected data
– A lightweight collaborative knowledge management tool

A new platform for lightweight web application development

[Chart: KB effort (cost, people, …) versus KB size (number of assertions, complexity, …), with labels Vulcan, Now, and Future]

Acknowledgement

Paul Allen

Mark Greaves

Andrew J Cowell

Laurent Alquier

Li Ding and Bao Jie

University of Karlsruhe

Tommy Lu

Ontoprise GmbH

William Smith

Ed Swing

TeamMersion LLC

Jesse Wang

Thank you!

Backup slides start here

(End of Slides)

Case Study: Battle-space Luminary System

Discover when New Information represents a change in understanding of entities
– Discovery of explicit entity links, implicit relationships

Large Volumes of Data in various formats
– Unstructured news articles
– Tactical Reports, Field Intelligence
– Structured Database Information

Use Wiki Pages to represent current knowledge about an entity – “what we know”

Domain Ontology to represent domain of information – “what we want to know”

Issue Alerts when Significant Events occur
– New information according to category
– Changing information on topics of interest
– Need to send information to various devices – cell phones, email, etc.

System Design

Wiki Configuration
– Semantic MediaWiki: Large developer community, active development, open source. Wikipedia uses MediaWiki, so scalability and performance are important.
– Semantic Result Formats: Provides various rich media displays of semantic information, including graphs, timelines, maps
– Semantic Forms: Provides convenient user interface for entering semantic data into wiki, avoiding cumbersome wikitext
– Semantic Notifications: Enables sending of notifications when the results of a semantic query change.

Domain Ontology
– Created OWL Ontology for Terrorism

Semantic Parsing, Extraction, Reasoning
– Java Process using various Open-Source Toolkits
– Rapid plugin of new technologies
– Multiple Data Sources supported
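
The notification idea listed under Wiki Configuration (send a message when the results of a semantic query change) can be sketched as a comparison of two result snapshots. The Python below is a hedged illustration only: `diff_results` and `notify_if_changed` are hypothetical names, the page titles are placeholders, and the actual Semantic Notifications extension is not shown.

```python
def diff_results(previous, current):
    """Return (added, removed) page titles between two query result snapshots."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    return added, removed


def notify_if_changed(previous, current, send):
    """Call send(message) only when the result set has changed."""
    added, removed = diff_results(previous, current)
    if added or removed:
        send(f"Query results changed: added={added} removed={removed}")
        return True
    return False


# 'sent' stands in for an email or SMS gateway in this sketch.
sent = []
changed = notify_if_changed(["Page A"], ["Page A", "Page B"], sent.append)
unchanged = notify_if_changed(["Page A"], ["Page A"], sent.append)
print(sent)
```

Passing the delivery function in as `send` keeps the comparison logic independent of the device being notified.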

Sample Content Page

Wiki Content Design

Use Templates to Ensure Consistent Look-and-Feel
– Templates Correspond to Ontology Classes
– Fields within Templates correspond to Properties within Ontology
– Rich Content Visualizations derived in consistent way

Hierarchical Categories match Class Hierarchy within Ontology
– Ensures Validity for Properties
– Category included on each Template page to ensure consistency

Forms Provide ability for users to enter data directly into wiki without knowing Wiki Text
– Each form corresponds to a Template
– Fields within forms correspond to the fields/properties within the Template
– GUI can include auto-completion
– Created Page immediately linked semantically to rest of Wiki
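
The correspondence described above (template per ontology class, template field per property, category on each template) can be sketched as a small generator. This Python example is an assumption-laden illustration: `template_wikitext` is a hypothetical helper, and the class name "Organization" and property "Has name" are invented, not taken from the system described in the slides.

```python
def template_wikitext(class_name, field_to_property):
    """Build template wikitext that annotates each field with its property."""
    lines = []
    for field, prop in field_to_property.items():
        # {{{Field|}}} is the MediaWiki template parameter, empty by default.
        lines.append("* " + field + ": [[" + prop + "::{{{" + field + "|}}}]]")
    # The category ties every page using the template to the ontology class.
    lines.append("[[Category:" + class_name + "]]")
    return "\n".join(lines)


# Hypothetical class and property names, for illustration only.
wikitext = template_wikitext("Organization", {"Name": "Has name"})
print(wikitext)
```

Because every page of a class goes through the same template, the annotations and the category stay consistent without users writing wikitext by hand.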

Sample Visualizations

Visualizations automatically created w/o user edit (tables, timelines, maps, social networks…)

UI enables notifications based on results of query – message sent when visualization changes

Wikipedia for Porsches (Acceleration Data Example)

Information Need: All Porsche models that accelerate 0-100kph in under 5, 6, and 7 seconds

More Porsche Acceleration Data in Wikipedia

Ultrapedia Main Page

Tree View Control Abstract/Summary quick preview

Semantics for Improved Wiki Navigation

The Porsche 996 Acceleration Table In Ultrapedia

Same Table as a Query

Dynamically-Generated Tables for Queries

Which Porsches accelerate fast?

Information Need: All Porsche models that accelerate 0-100kph in under 5, 6, and 7 seconds

Graph Views of the Acceleration Data

External Data via a Live Ebay Query

Linking to External Ebay Data

Photos in Wiki Articles as Data

Mercedes-Benz E-class W212 Gallery Section

Timelines from Data

Volkswagen Production Timeline View

Dynamic Mapping and Charting

Editing Wiki Data In Place