석사 2 차 지 애 띠

INHA UNIVERSITY, INCHON, KOREA http://eslab.inha.ac.kr PocketLens: Toward a Personal Recommender System. B.N. Miller, J.A. Konstan, J. Riedl, ACM Transactions on Information Systems, Vol. 22, No. 3, July 2004.


Transcript of 석사 2 차 지 애 띠

Page 1: 석사  2 차  지 애 띠

INHA UNIVERSITY

INCHON, KOREA

http://eslab.inha.ac.kr

PocketLens: Toward a Personal Recommender System

B.N. Miller, J.A. Konstan, J. Riedl, ACM Transactions on Information Systems, Vol. 22, No. 3, July 2004

M.S. course, 2nd semester: 지애띠

Page 2: 석사  2 차  지 애 띠

OUTLINE

INTRODUCTION

RELATED WORK

POCKETLENS

POCKETLENS ARCHITECTURES

SIMULATION RESULTS

DISCUSSION

Page 3: 석사  2 차  지 애 띠

INTRODUCTION

Two key problems in recommenders

• Portability – recommendations should be available on small devices such as palmtop computers
• Trust – reliability and security

Page 4: 석사  2 차  지 애 띠

INTRODUCTION – Research Goals

Portability
• Users should be able to receive recommendations wherever they are, on whatever client they are using, even when the client is disconnected from the Internet.

Palmtop
• Users should be able to run the recommender on palmtop-size machines.

User Control
• Users should control their profile and ratings information.

Accuracy
• The system should provide recommendations that are as good as those provided by the best known algorithm.

Page 5: 석사  2 차  지 애 띠

INTRODUCTION - Contributions

To introduce PocketLens, a peer-to-peer collaborative filtering algorithm that
• runs on client devices as small as palmtop computers, and
• enables users to choose to share only some of their ratings with other users.

To provide a comparison of five architectures for distributing ratings among recommender clients.

To evaluate the architectures with respect to
• whether they meet the accuracy and performance goals, and
• the tradeoffs among security, implementation complexity and performance.

Page 6: 석사  2 차  지 애 띠

RELATED WORK

User-item collaborative filtering

Model-based collaborative filtering

Peer-to-peer
• Large-scale computation – SETI@home
• One-to-one communication – AIM, ICQ and other messengers
• File sharing – Gnutella, OceanStore and Freenet
• Content distribution – Publius, Tangler, Free Haven, Intermemory and Mojo Nation
• Collaborative filtering

Intelligent agents - Yenta

Page 7: 석사  2 차  지 애 띠

POCKETLENS - benefits

Portability
• A model is created while the user is online and can then be used to make recommendations while the user is offline.

Palmtop compatibility
• The computation of the model is distributed across many processors, and the resulting model is small.

User control
• The user can choose which part of her profile she wants to share with other members of the peer-to-peer community.

Accuracy
• Assessed through experimental evaluation.

Page 8: 석사  2 차  지 애 띠

POCKETLENS – Algorithm(1)

[Fig.1] Overview of the PocketLens architecture for building a similarity model and making recommendations.

Page 9: 석사  2 차  지 애 띠

POCKETLENS – Algorithm(2)

Similarity model creation
• The Find Neighbor module searches the p2p network to find neighbors.
• The Update Model module incorporates the ratings from that neighbor into the similarity model.

Recommendation
• The Recommend module examines the similarity model to find the n items that are most similar to some or all of those that the model owner has already rated.
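The update and recommend steps on the Algorithm(2) slide can be illustrated with a minimal sketch. It assumes cosine similarity over co-rating neighbors and a simple sum-of-similarities score; the slides do not state either choice, and the class and method names (`SimilarityModel`, `update`, `recommend`) are hypothetical. Only running sums are stored, so a neighbor's ratings can be discarded once they have been folded in.

```python
from collections import defaultdict
from math import sqrt


class SimilarityModel:
    """Item-item model grown one neighbor at a time (illustrative sketch)."""

    def __init__(self, owner_ratings):
        self.owner = dict(owner_ratings)   # item -> the model owner's rating
        # Running sums per (owner item, other item) pair over co-rating neighbors.
        self.dot = defaultdict(float)      # sum of rating products
        self.sq_a = defaultdict(float)     # sum of squared ratings of the owner item
        self.sq_b = defaultdict(float)     # sum of squared ratings of the other item

    def update(self, neighbor_ratings):
        """Update-model step: fold one neighbor's ratings into the running sums."""
        for a in self.owner:
            if a not in neighbor_ratings:
                continue
            ra = neighbor_ratings[a]
            for b, rb in neighbor_ratings.items():
                if b != a:
                    self.dot[a, b] += ra * rb
                    self.sq_a[a, b] += ra * ra
                    self.sq_b[a, b] += rb * rb

    def similarity(self, a, b):
        """Cosine similarity of items a and b over the neighbors seen so far."""
        if (a, b) not in self.dot:
            return 0.0
        denom = sqrt(self.sq_a[a, b] * self.sq_b[a, b])
        return self.dot[a, b] / denom if denom else 0.0

    def recommend(self, n):
        """Recommend step: the n unrated items most similar to the owner's items."""
        scores = defaultdict(float)
        for a, b in self.dot:
            if b not in self.owner:
                scores[b] += self.similarity(a, b)
        return sorted(scores, key=lambda item: (-scores[item], item))[:n]


model = SimilarityModel({"A": 5})
model.update({"A": 5, "B": 5, "C": 1})   # first neighbor found on the network
model.update({"A": 4, "B": 4, "C": 5})   # second neighbor
top_items = model.recommend(2)
```

In this toy run item B is rated in step with A by both neighbors, so it ranks ahead of C.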

Page 10: 석사  2 차  지 애 띠

POCKETLENS – Algorithm(3)

Page 11: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(1)

Central Server
• Each user has a unique persistent identifier.
• The ratings are stored at the central server, but each client builds its own model and computes its own recommendations.

[Fig.2] Central server: storage central, computation distributed

Page 12: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(2)

Random Discovery
• Each user remains anonymous.
• Uses the ping/pong mechanism of the underlying Gnutella protocol, which finds only the neighbors that happen to be available on the network.

[Fig.3] Random Discovery: storage and computation distributed
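As a rough illustration of random discovery, the sketch below simulates a ping flooded through an overlay with a hop limit (TTL) and collects the peers that would answer with a pong. The overlay dictionary, the peer names and the function name `ping` are assumptions made for the example; real Gnutella messaging involves descriptors and routing details that are not modeled here.

```python
from collections import deque


def ping(overlay, start, ttl):
    """Return the peers that would answer a ping flooded from `start`."""
    seen = {start}
    frontier = deque([(start, ttl)])
    pongs = []
    while frontier:
        peer, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # TTL exhausted: this peer does not forward the ping
        for nxt in overlay.get(peer, ()):
            if nxt in seen:
                continue  # a peer answers a given ping only once
            seen.add(nxt)
            pongs.append(nxt)
            frontier.append((nxt, hops_left - 1))
    return pongs


# Hypothetical overlay: each peer lists the peers it is currently connected to.
overlay = {"me": ["a", "b"], "a": ["c"], "b": ["c", "d"], "c": ["e"]}
reachable = ping(overlay, "me", ttl=2)
```

Only peers that are online and within the hop limit show up, which is why the neighbors found this way are effectively a random sample of the community.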

Page 13: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(3)

[Fig.4] The Gnutella PING/PONG protocol

Page 14: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(4)

Transitive Traversal
• Improves upon the random discovery architecture by incrementally learning who the most similar neighbors are each time a new user is encountered.
• Maintains a queue of “neighbors’ neighbors”.

[Fig.5] Transitive Traversal: storage and computation distributed
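One way to picture the queue of neighbors' neighbors is the sketch below. It is an assumption-laden illustration, not the paper's procedure: each contacted peer is scored with a caller-supplied similarity function, the best `keep` peers are remembered, and that peer's own neighbor list is appended to the queue. All names and the sample data are hypothetical.

```python
from collections import deque


def transitive_traversal(seeds, neighbors_of, similarity_to_owner, keep):
    """Walk neighbors' neighbors and return the `keep` most similar peers."""
    queue = deque(seeds)   # candidates still to contact
    visited = set()
    best = []              # (similarity, peer) pairs, most similar first
    while queue:
        peer = queue.popleft()
        if peer in visited:
            continue
        visited.add(peer)
        best.append((similarity_to_owner(peer), peer))
        best.sort(reverse=True)
        del best[keep:]    # remember only the closest peers found so far
        queue.extend(neighbors_of.get(peer, ()))
    return [peer for _, peer in best]


# Hypothetical neighbor lists and similarities to the model owner.
neighbors_of = {"a": ["b", "c"], "b": ["d"]}
similarity = {"a": 0.2, "b": 0.9, "c": 0.1, "d": 0.7}
closest = transitive_traversal(["a"], neighbors_of, similarity.get, keep=2)
```

Starting from a weakly similar seed, the traversal still reaches the more similar peers "b" and "d" through the neighbor lists.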

Page 15: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(5)

[Fig.6] The Gnutella Query protocol

Page 16: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(6)

Content Addressable
• Based on recent advances in architectures for p2p file-sharing networks such as Chord, CAN and Pastry.
• Their advantage over Gnutella is that they impose a deterministic overlay routing system on the network.

[Fig.7] Content Addressable: storage and ratings distributed, community model

Page 17: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(7)

[Fig.8] Chord lookup

[Fig.9] Chord join

Page 18: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(8)

[Fig.10] II-Chord stores each row of the item-item matrix at a node
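The idea of storing each row of the item-item matrix at a node can be sketched with the successor rule used by Chord: a key belongs to the first node whose identifier is equal to or follows the key on the ring. The sketch assumes a global sorted list of node identifiers, so it shows only the key-to-node mapping and not Chord's finger-table routing; the 16-bit ring size and the helper names are choices made for the example.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 16  # toy identifier space; chosen only to keep numbers readable


def ring_id(name):
    """Hash a node address or an item name onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


def successor(node_ids, key):
    """First node at or after `key` on the ring; `node_ids` must be sorted."""
    index = bisect_left(node_ids, key)
    return node_ids[index % len(node_ids)]  # wrap around past the largest id


def row_owner(node_ids, item):
    """Node that would hold the item-item matrix row for `item`."""
    return successor(node_ids, ring_id(item))


node_ids = [10, 200, 4000]
```

Because the mapping is deterministic, every client that hashes the same item name reaches the same node.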

Page 19: 석사  2 차  지 애 띠

POCKETLENS ARCHITECTURES(9)

Secure Blackboard
• It must be possible to encrypt users’ profiles, add the profiles without decryption, and have the community decrypt the final result and use the model.
• Makes use of a write-once, read-many (WORM) blackboard and a secure source of random bits.
• Based on Cramer’s secure online voting protocol.
• Each vote is encrypted using the ElGamal encryption algorithm.

[Fig.11] Secure Blackboard: encrypted partial results written to WORM blackboard, community model
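The requirement that profiles be added without decryption can be illustrated with exponential ElGamal, where multiplying two ciphertexts yields an encryption of the sum of the plaintexts. This is a toy sketch with a tiny prime and a single key holder; it is not secure and omits the shared decryption and validity proofs that a voting protocol needs. The parameters and function names are assumptions for the example.

```python
import random

P = 2039  # toy safe prime, P = 2 * Q + 1 (far too small for real use)
Q = 1019  # prime order of the subgroup generated by G
G = 4     # generator of the order-Q subgroup


def keygen(rng):
    x = rng.randrange(1, Q)          # private key
    return x, pow(G, x, P)           # (private, public)


def encrypt(public, value, rng):
    r = rng.randrange(1, Q)
    return pow(G, r, P), (pow(G, value, P) * pow(public, r, P)) % P


def add(c, d):
    """Combine two ciphertexts; the result encrypts the sum of the values."""
    return (c[0] * d[0]) % P, (c[1] * d[1]) % P


def decrypt(private, c, max_value):
    # c[1] * c[0]^(-private) equals G^value; recover value by a small search.
    g_value = (c[1] * pow(c[0], P - 1 - private, P)) % P
    for value in range(max_value + 1):
        if pow(G, value, P) == g_value:
            return value
    raise ValueError("sum is outside the searched range")


rng = random.Random(0)               # fixed seed so the example is repeatable
private, public = keygen(rng)
ratings = [4, 5, 3]                  # one item rated by three users
total = encrypt(public, 0, rng)
for rating in ratings:
    total = add(total, encrypt(public, rating, rng))
rating_sum = decrypt(private, total, max_value=50)
```

Each contribution stays encrypted on the blackboard; only the aggregate is ever decrypted.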

Page 20: 석사  2 차  지 애 띠

SIMULATION RESULTS(1)

Data set
• Users were sampled at random; each sampled user had to have rated at least 20 movies.
• The sample contains 96,000 ratings for 3,775 movie titles from a population of 1,000 users.
• The sample data set was used to randomly create 10 trial data sets.
• To simulate different community sizes, groups of 500, 1,000 and 2,000 users were selected at random from the full database of six million ratings for 5,700 movies and 68,000 users.

Page 21: 석사  2 차  지 애 띠

SIMULATION RESULTS(2)

Experimental Methods
• To test the Central, Random and Transitive architectures, the authors alternated between training and testing while adding neighbors to the neighborhood.
• During each training cycle, five new neighbors were added to the model.
• After each training cycle, a test cycle was performed to get a set of recommendations for the model owner.
• To test the II-Chord and SBB architectures, they randomly reserved one or more ratings from each user as part of the test set.
• After the model was built, test recommendations were generated for each user’s reserved item(s).
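The alternation between training and test cycles can be written as a short loop. This is a sketch under the assumption that model updates and evaluation are supplied as callables; `alternate_train_test`, `update` and `evaluate` are hypothetical names, and only the batch size of five comes from the slide.

```python
def alternate_train_test(neighbors, update, evaluate, batch_size=5):
    """Add neighbors in batches and record one evaluation after each batch."""
    results = []
    for start in range(0, len(neighbors), batch_size):
        for neighbor in neighbors[start:start + batch_size]:
            update(neighbor)                     # training cycle
        added = min(start + batch_size, len(neighbors))
        results.append((added, evaluate()))      # test cycle
    return results


# Stand-ins for a real model: "training" just records the neighbor,
# and "testing" reports how many neighbors the model has seen.
seen = []
curve = alternate_train_test(list(range(12)), seen.append, lambda: len(seen))
```

Each entry in `curve` pairs a neighborhood size with the measurement taken at that size, which is the shape of data plotted against neighborhood size.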

Page 22: 석사  2 차  지 애 띠

SIMULATION RESULTS(3)

Metrics
• Average Similarity is a measure of how close a group of neighbors is to the model owner.
• Mean Absolute Error (MAE) is a measure of how accurately we can predict a user’s rating for an item.
• Recall is a measure of how often a list of recommendations contains an item that the user has actually rated.
• Coverage is a measure of the percentage of items for which a recommendation system can provide predictions.
• Memory Usage is a measure of how large the model grows as more items and neighbors are incorporated.
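Three of these metrics are easy to state as code. The definitions below are common textbook forms and are assumptions here, since the slides give only verbal descriptions; in particular, recall is taken as the fraction of withheld rated items that appear in the recommendation list.

```python
def mean_absolute_error(predicted, actual):
    """Average |prediction - rating| over items that have both values."""
    common = predicted.keys() & actual.keys()
    if not common:
        raise ValueError("no items with both a prediction and a rating")
    return sum(abs(predicted[i] - actual[i]) for i in common) / len(common)


def recall(recommended, withheld):
    """Fraction of withheld rated items that appear in the recommendations."""
    withheld = set(withheld)
    if not withheld:
        raise ValueError("no withheld items to look for")
    return len(set(recommended) & withheld) / len(withheld)


def coverage(predicted, all_items):
    """Fraction of items for which a prediction could be produced."""
    all_items = set(all_items)
    if not all_items:
        raise ValueError("empty item set")
    return len(predicted.keys() & all_items) / len(all_items)


predicted = {"a": 4.0, "b": 3.0}
actual = {"a": 5.0, "b": 3.0, "c": 2.0}
```

With these sample values the predictions miss one rating by a full point and match the other exactly, and only two of the items can be predicted at all.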

Page 23: 석사  2 차  지 애 띠

SIMULATION RESULTS(4)

Neighborhood Similarity

[Fig.12] Average similarity of users in the model as neighborhood size grows

Page 24: 석사  2 차  지 애 띠

SIMULATION RESULTS(5)

Coverage

[Fig.13] Coverage as neighborhood size grows

Page 25: 석사  2 차  지 애 띠

SIMULATION RESULTS(6)

MAE

[Fig.14] MAE as neighborhood size grows

[Fig.15] MAE for community-built models

Page 26: 석사  2 차  지 애 띠

SIMULATION RESULTS(7)

Recall

[Fig.16] Recall as neighborhood size grows

Page 27: 석사  2 차  지 애 띠

DISCUSSION(1)

To increase portability, recommendations should be available to users whenever and wherever they want.

[Table1] Effect of model truncation in the PocketLens algorithm on size, speed, coverage and MAE

Page 28: 석사  2 차  지 애 띠

DISCUSSION(2)

To increase trust, the system enables users to decide how much of their information to share while preserving their anonymity.

[Fig.17] Comparison of different architectures with respect to data security and complexity of model building

Page 29: 석사  2 차  지 애 띠

CONCLUSION & FUTURE WORK

Conclusions
• The PocketLens algorithm is portable enough to run on disconnected palmtop computers, can protect the user’s privacy, and can provide trusted recommendations.
• The quality of recommendations is as good as the best previously reported results [Sarwar et al. 2001].
• Among the five architectures, none is perfect, but the architectures provide fast, portable recommendations with privacy protection and good quality.

Future Work
• To turn these architectures into working systems.
• To investigate interface issues for helping users manage and control their profile information.
• To prevent shilling attacks.