MiningTheSocialWeb.Ch2.Microformat

Ch.2 Micro-formats (Semantic Markup and Common Sense Collide) 2011. 10. 22 chois79 Saturday, October 22, 2011


Page 1: MiningTheSocialWeb.Ch2.Microformat

Ch.2 Micro-formats (Semantic Markup and Common Sense Collide)

2011. 10. 22 chois79

Saturday, October 22, 2011

Page 2: MiningTheSocialWeb.Ch2.Microformat

Introduction

• A Web-based approach to semantic markup

• Important step forward

• Provide an effective mechanism for embedding “smarter data” into web pages

• Easy for content authors to implement

• e.g. XFN, geo, hRecipe, hReview...

• Increased role in social data mashups

Page 3: MiningTheSocialWeb.Ch2.Microformat

Micro-formats example

Page 4: MiningTheSocialWeb.Ch2.Microformat

XFN and Friends

• XFN (XHTML Friends Network)

• Most popular micro-format

• Identifies relationships between people through a few keywords in the rel attribute of an anchor tag

• Commonly used in blogs and blogrolls

• example
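The example image from this slide did not survive in the transcript. As a minimal sketch of the idea, the block below shows what XFN-annotated links might look like and how the rel keywords could be pulled out. The names and URLs are hypothetical, and the sketch uses Python's standard html.parser rather than the BeautifulSoup package mentioned later, only so that it runs without extra installs.

```python
from html.parser import HTMLParser

# Hypothetical blogroll; the people and URLs are illustrative only.
SAMPLE_HTML = """
<div>
  <a href="http://example.org/matthew" rel="me">Matthew</a>
  <a href="http://example.com/users/jc" rel="friend met">J.C.</a>
  <a href="http://example.com/users/abe" rel="friend met co-worker">Abe</a>
  <a href="http://example.net/about">About this site</a>
</div>
"""

# Relationship keywords defined by XFN.
XFN_VALUES = {
    "contact", "acquaintance", "friend", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


class XFNParser(HTMLParser):
    """Collects (href, [xfn keywords]) for every anchor that carries XFN data."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        rel = attr_map.get("rel") or ""
        href = attr_map.get("href")
        # rel is a space-separated list; keep only the XFN keywords.
        keywords = [value for value in rel.split() if value in XFN_VALUES]
        if href and keywords:
            self.links.append((href, keywords))


def extract_xfn(html):
    parser = XFNParser()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    for href, keywords in extract_xfn(SAMPLE_HTML):
        print(href, keywords)
```

The last anchor in the sample has no rel attribute, so it is skipped: only links that declare a relationship contribute to the social graph.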

Page 5: MiningTheSocialWeb.Ch2.Microformat

A Breadth-First crawl of XFN Data(1/4)

• Pseudocode for a breadth-first search

• Useful in social settings

• e.g. to identify mutual friends
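The pseudocode shown on this slide is not in the transcript. The following is a rough sketch of a level-by-level breadth-first crawl, not the slide's original listing. To keep it runnable offline, the function get_links is a hypothetical stand-in for "fetch a page and return its XFN links", and FAKE_WEB is made-up data.

```python
def breadth_first_crawl(root, get_links, max_depth=2):
    """Crawl outward from root one level at a time.

    Returns a dict mapping each processed URL to the list of URLs it links to.
    get_links is assumed to be a callable: url -> iterable of neighbor URLs.
    """
    graph = {}
    seen = {root}
    current_level = [root]
    depth = 0
    while current_level and depth < max_depth:
        next_level = []
        for url in current_level:
            neighbors = list(get_links(url))
            graph[url] = neighbors
            for neighbor in neighbors:
                # Queue each neighbor only once, the first time it is seen.
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_level.append(neighbor)
        current_level = next_level
        depth += 1
    return graph


def mutual_friends(graph, a, b):
    """URLs that both a and b link to."""
    return set(graph.get(a, [])) & set(graph.get(b, []))


# Made-up link structure standing in for real pages.
FAKE_WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}

if __name__ == "__main__":
    graph = breadth_first_crawl("a", lambda url: FAKE_WEB.get(url, []), max_depth=2)
    print(graph)
    print(mutual_friends(graph, "a", "b"))
```

Processing a whole level before moving to the next is what makes the crawl breadth-first, and the max_depth cutoff keeps it from expanding indefinitely.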

Page 6: MiningTheSocialWeb.Ch2.Microformat

A Breadth-First crawl of XFN Data(2/4)

• BeautifulSoup package

• easy_install BeautifulSoup

Page 7: MiningTheSocialWeb.Ch2.Microformat

A Breadth-First crawl of XFN Data(3/4) - IMG

Page 8: MiningTheSocialWeb.Ch2.Microformat

A Breadth-First crawl of XFN Data(4/4) - IMG

Page 9: MiningTheSocialWeb.Ch2.Microformat

Brief analysis of breadth-first techniques(1/3)

• Two considerations for an algorithm

• Efficiency and effectiveness

• Or performance and quality

• Standard performance analysis

• Worst case time and space complexity

• The breadth-first crawl is essentially a breadth-first search, just without a specific target node

Page 10: MiningTheSocialWeb.Ch2.Microformat

Brief analysis of breadth-first techniques(2/3)

• Breadth-first search

• Both the time and space complexity can be bounded in the worst case by bᵈ

• b: the branching factor of the graph

• d: depth
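To get a feel for how quickly bᵈ grows, the short sketch below counts nodes in an idealized graph where every node has exactly b unseen neighbors. That uniform-branching assumption is a simplification for illustration; real social graphs are irregular, so this is an upper-bound intuition rather than a prediction.

```python
def nodes_at_depth(b, d):
    """Number of nodes exactly d hops from the root when every node has b children."""
    return b ** d


def total_nodes(b, d):
    """Number of nodes in all levels 0..d, i.e. 1 + b + b^2 + ... + b^d."""
    return sum(b ** level for level in range(d + 1))


if __name__ == "__main__":
    for b in (2, 3, 4, 5):
        row = [nodes_at_depth(b, d) for d in range(1, 6)]
        print(f"b={b}: nodes at depth 1..5 = {row}, total to depth 5 = {total_nodes(b, 5)}")
```

Because the last level dominates the sum, the total is of the same order as bᵈ, which is why both the time spent and the frontier held in memory are bounded by that term.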

Page 11: MiningTheSocialWeb.Ch2.Microformat

Brief analysis of breadth-first techniques(3/3)

• Mission accomplished?

• The final consideration in the analysis is the overall quality of the results

• Slight variations in URLs can result in multiple nodes appearing in the graph for the same person

• e.g. http://example.com/~matthew, http://www.example.com/~matthew
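One way to reduce such duplicates is to normalize each URL before using it as a node key. The sketch below is one possible normalization, not something the slides prescribe: it lowercases the host, drops a leading "www.", trims a trailing slash, and discards the fragment. Treating "www." as insignificant is an assumption that can be wrong for some sites.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Return a canonical form of url suitable for use as a graph node key."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Assumption: the www. and bare hosts serve the same person's page.
    if host.startswith("www."):
        host = host[4:]
    if parts.port:
        host = f"{host}:{parts.port}"
    # Treat /path and /path/ as the same resource; keep "/" for an empty path.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


if __name__ == "__main__":
    a = normalize_url("http://example.com/~matthew")
    b = normalize_url("http://www.example.com/~matthew")
    print(a, b, a == b)
```

Calling normalize_url on every link before adding it to the graph collapses the two example URLs into a single node.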

Page 12: MiningTheSocialWeb.Ch2.Microformat

Geo-coordinates: A Common Thread for Just About Anything

• Geo data is ubiquitous and plays a powerful part in many social mashups

• The divide between “real life” and “life on the web” continues to close

Page 13: MiningTheSocialWeb.Ch2.Microformat

Wikipedia Articles + Google Maps = Road Trip(1/2)

• Geo is one of the simplest and most widely used micro-formats

• Two techniques for describing Geo

• A slew of popular sites use geo and other micro-formats to expose structured data in their pages

• Wikipedia, Yahoo! Local, MapQuest Local ...
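The slide's illustration of the two techniques is missing from the transcript. As commonly described for the geo micro-format, one form marks latitude and longitude with separate class names inside a geo element, and the other packs both into a single "latitude; longitude" string. The sketch below parses both forms with Python's standard html.parser; the sample coordinates are illustrative, and the parser assumes well-formed markup with no unclosed void tags such as a bare br inside the geo element.

```python
from html.parser import HTMLParser

# Illustrative markup showing both ways of expressing a geo point.
SAMPLE_HTML = """
<p>
  <span class="geo">
    <span class="latitude">36.166</span>
    <span class="longitude">-86.784</span>
  </span>
  <span class="geo">36.166; -86.784</span>
</p>
"""


class GeoParser(HTMLParser):
    """Collects (latitude, longitude) tuples from class="geo" elements."""

    def __init__(self):
        super().__init__()
        self.stack = []        # class list of every currently open element
        self.current = None    # data collected for the geo element in progress
        self.geo_depth = None  # stack depth at which that geo element opened
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append(classes)
        if "geo" in classes and self.current is None:
            self.current = {"text": ""}
            self.geo_depth = len(self.stack)

    def handle_data(self, data):
        if self.current is None:
            return
        classes = self.stack[-1]
        if "latitude" in classes:
            self.current["lat"] = data.strip()
        elif "longitude" in classes:
            self.current["lon"] = data.strip()
        elif len(self.stack) == self.geo_depth:
            # Text directly inside the geo element (the "lat; lon" form).
            self.current["text"] += data

    def handle_endtag(self, tag):
        if self.current is not None and len(self.stack) == self.geo_depth:
            geo = self.current
            if "lat" not in geo and ";" in geo["text"]:
                lat, lon = geo["text"].split(";", 1)
                geo["lat"], geo["lon"] = lat.strip(), lon.strip()
            if "lat" in geo and "lon" in geo:
                self.results.append((float(geo["lat"]), float(geo["lon"])))
            self.current = None
            self.geo_depth = None
        if self.stack:
            self.stack.pop()


def extract_geo(html):
    parser = GeoParser()
    parser.feed(html)
    return parser.results


if __name__ == "__main__":
    print(extract_geo(SAMPLE_HTML))
```

Both forms yield the same coordinate pair, which is the point of the format: a consumer such as a mapping mashup does not need to care which variant a page author chose.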

Page 14: MiningTheSocialWeb.Ch2.Microformat

Wikipedia Articles + Google Maps = Road Trip(2/2)

Page 15: MiningTheSocialWeb.Ch2.Microformat

Slicing and Dicing Recipes - Food Network

• hRecipe

• micro-format for food recipes

• http://microformats.org/wiki/hrecipe

Page 16: MiningTheSocialWeb.Ch2.Microformat

Collecting Restaurant Reviews - Yelp

• hReview

• micro-format for reviews

• http://microformats.org/wiki/hreview

Page 17: MiningTheSocialWeb.Ch2.Microformat

Summary

• Remember that micro-formats are a way of decorating markup to expose specific types of structured information

• HTML5’s micro-data

• A mechanism in the HTML5 specification for nesting semantics within existing content on web pages

• http://www.w3.org/TR/html5/microdata.html
