INHA UNIVERSITY
INCHON, KOREA
http://eslab.inha.ac.kr
PocketLens: Toward a Personal Recommender System
B.N. Miller, J.A. Konstan, J. Riedl, ACM Transactions on Information Systems, Vol. 22, No. 3, July 2004
Master's program, 2nd term: 지애띠
OUTLINE
INTRODUCTION
RELATED WORKS
POCKETLENS
POCKETLENS ARCHITECTURES
SIMULATION RESULTS
DISCUSSIONS
INTRODUCTION
Two key problems in recommenders
• Portability – recommendations should be available on small devices such as palmtop computers
• Trust – reliability and security
INTRODUCTION – Research Goals
Portability.
• Users should be able to receive recommendations wherever they are, on whatever client they are using, even when the client is disconnected from the Internet.
Palmtop.
• Users should be able to run the recommender on palmtop-size machines.
User Control.
• Users should control their profile and ratings information.
Accuracy.
• To provide recommendations that are as good as those provided by the best known algorithm.
INTRODUCTION - Contributions
To introduce PocketLens, a peer-to-peer collaborative filtering algorithm.
• To run on client devices as small as palmtop computers.
• To enable users to choose to share only some of their ratings with other users.
To provide a comparison of five architectures for distributing ratings among recommender clients.
To evaluate the architectures.
• Whether they meet the accuracy and performance goals.
• The tradeoffs among security, implementation complexity and performance.
RELATED WORK
User-item collaborative filtering
Model based collaborative filtering
Peer-to-peer
• Large scale computation – SETI@home
• One-to-one communication – AIM, ICQ and other messengers
• File sharing – Gnutella, OceanStore and Freenet
• Content distribution – Publius, Tangler, Free Haven, Intermemory and Mojonation
• Collaborative filtering
Intelligent Agents - Yenta
POCKETLENS - benefits
Portability
• Creating a model while the user is online that can be used to make recommendations while the user is offline.
Palmtop compatibility
• The computation of the model is distributed across many processors, and the resulting model is small.
User control
• Allowing the user to choose which part of her profile she wants to share with other members of the peer-to-peer community.
Accuracy
• Experimental evaluation
POCKETLENS – Algorithm(1)
[Fig.1] Overview of the PocketLens architecture for building a similarity model and making recommendations.
POCKETLENS – Algorithm(2)
Similarity model creation
• Find neighbor module searches the p2p network to find neighbors.
• Update model module incorporates the ratings from that neighbor into the similarity model.
Recommendation
• Recommend module examines the similarity model to find the n items that are most similar to some or all of those that the model owner has already rated.
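The update-model and recommend modules above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name SimilarityModel, the use of cosine similarity over running sums, and the similarity-weighted scoring rule are all assumptions made for the example.

```python
import math
from collections import defaultdict


class SimilarityModel:
    """Hypothetical item-item model owned by one user (names are illustrative)."""

    def __init__(self, owner_ratings):
        self.owner_ratings = dict(owner_ratings)  # item -> owner's rating
        # Running sums per (owner_item, other_item) pair, so a neighbor's
        # ratings can be folded in and then discarded.
        self.dot = defaultdict(float)
        self.norm_i = defaultdict(float)
        self.norm_j = defaultdict(float)

    def update(self, neighbor_ratings):
        """Update-model step: incorporate one neighbor's ratings."""
        for i in self.owner_ratings:
            if i not in neighbor_ratings:
                continue
            r_i = neighbor_ratings[i]
            for j, r_j in neighbor_ratings.items():
                if j == i:
                    continue
                self.dot[i, j] += r_i * r_j
                self.norm_i[i, j] += r_i * r_i
                self.norm_j[i, j] += r_j * r_j

    def similarity(self, i, j):
        """Cosine similarity of items i and j over the neighbors seen so far."""
        denom = math.sqrt(self.norm_i[i, j] * self.norm_j[i, j])
        return self.dot[i, j] / denom if denom else 0.0

    def recommend(self, n):
        """Recommend step: top-n unrated items by similarity-weighted score."""
        scores = defaultdict(float)
        for (i, j) in list(self.dot):
            if j not in self.owner_ratings:
                scores[j] += self.similarity(i, j) * self.owner_ratings[i]
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [item for item, _ in ranked[:n]]
```

Because only sums are stored, the model stays small and the neighbor's raw ratings need not be kept after update() returns, which matches the portability goal stated earlier.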
POCKETLENS – Algorithm(3)
POCKETLENS ARCHITECTURES(1)
Central Server
• Each user has a unique persistent identifier.
• The ratings are stored at the central server, but each client builds its own model and computes its own recommendations.
[Fig.2] Central server: storage central, computation distributed
POCKETLENS ARCHITECTURES(2)
Random Discovery
• Each user remains anonymous.
• Uses the underlying protocol of Gnutella, that is, the ping/pong mechanism, to find only the neighbors that happen to be available on the network.
[Fig.3] Random Discovery: storage and computation distributed
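A rough sketch of ping/pong discovery, under stated assumptions: the overlay is modeled as an in-memory dict (links), availability as a set (online), and the flood is bounded by a hop count (ttl). The function name ping and these parameters are hypothetical; real Gnutella messages carry more fields than shown here.

```python
from collections import deque


def ping(origin, links, online, ttl):
    """Return peers that would answer a PING flooded from `origin`.

    links: peer -> list of directly connected peers (assumed overlay graph).
    online: set of peers currently reachable. ttl: maximum hop count.
    """
    pongs = []
    seen = {origin}
    frontier = deque([(origin, ttl)])
    while frontier:
        peer, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # TTL exhausted: do not forward any further
        for nxt in links.get(peer, []):
            if nxt in seen or nxt not in online:
                continue
            seen.add(nxt)
            pongs.append(nxt)  # nxt replies with a PONG
            frontier.append((nxt, hops_left - 1))
    return pongs
```

Only peers that are online and within ttl hops are found, which is why this architecture sees whichever neighbors happen to be available rather than the most similar ones.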
POCKETLENS ARCHITECTURES(3)
[Fig.4] The Gnutella PING/PONG protocol
POCKETLENS ARCHITECTURES(4)
Transitive Traversal
• This improves upon the random discovery architecture by incrementally learning who the most similar neighbors are each time it encounters a new user.
• Maintains a queue of “neighbors’ neighbors”.
[Fig.5] Transitive Traversal: storage and computation distributed
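The queue of neighbors' neighbors might look like the sketch below. The priority rule (a peer's neighbors are queued with that peer's similarity to the owner), the Jaccard-style overlap function, and the in-memory dicts standing in for network lookups are all assumptions for illustration, not details taken from the paper.

```python
import heapq


def overlap(a, b):
    """Assumed similarity: fraction of items rated by both users (Jaccard)."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


def transitive_traversal(owner, start, ratings, known_neighbors, limit):
    """Visit up to `limit` peers, preferring neighbors of similar peers.

    ratings: user -> {item: rating}; known_neighbors: user -> list of users.
    Both dicts stand in for what a real client would fetch over the network.
    """
    queue = [(0.0, start)]  # min-heap of (negated priority, user)
    seen = {owner, start}
    visited = []
    while queue and len(visited) < limit:
        _, user = heapq.heappop(queue)
        visited.append(user)
        sim = overlap(ratings[owner], ratings[user])
        # Enqueue this peer's neighbors, ranked by how similar the peer was.
        for nxt in known_neighbors.get(user, []):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(queue, (-sim, nxt))
    return visited
```

With this ordering, a neighbor of a highly similar peer is contacted before a neighbor of a dissimilar one, which is the incremental learning effect the slide describes.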
POCKETLENS ARCHITECTURES(5)
[Fig.6] The Gnutella Query protocol
POCKETLENS ARCHITECTURES(6)
Content Addressable
• Based on recent advances in architectures for p2p file sharing networks such as Chord, CAN and Pastry.
• Their advantage over Gnutella is that they impose a deterministic overlay routing system on the network.
[Fig.7] Content Addressable: storage and ratings distributed, community model
POCKETLENS ARCHITECTURES(7)
[Fig.8] Chord lookup
[Fig.9] Chord join
POCKETLENS ARCHITECTURES(8)
[Fig.10] II-Chord stores each row of the item-item matrix at a node
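One way to picture how a row of the item-item matrix finds its node is the successor rule of consistent hashing. The sketch below is an assumption-laden illustration: the 16-bit ring, the SHA-1 hash, the node names and the "item:42" key format are invented for the example, and a sorted list with bisect replaces Chord's finger tables.

```python
import bisect
import hashlib

RING_BITS = 16  # assumed small identifier space for illustration


def ring_id(key):
    """Hash a string key onto the identifier ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


class Ring:
    """Hypothetical stand-in for a Chord ring; no finger tables or joins."""

    def __init__(self, node_names):
        self.nodes = sorted((ring_id(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def successor(self, key):
        """Return the node responsible for `key`: first node id >= hash(key)."""
        idx = bisect.bisect_left(self.ids, ring_id(key))
        return self.nodes[idx % len(self.nodes)][1]  # wrap around the ring
```

Every client that hashes the same item key reaches the same node, so the row for an item can be updated and read without flooding the network.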
POCKETLENS ARCHITECTURES(9)
Secure Blackboard
• It has to be possible to encrypt users’ profiles and add the profiles without decryption, and the community should be able to decrypt the final result and use the model.
• Makes use of a write once, read many (WORM) blackboard and a secure source of random bits.
• Cramer’s secure online voting protocol
• Each vote is encrypted using the ElGamal encryption algorithm.
[Fig.11] Secure Blackboard: encrypted partial results written to WORM blackboard, community model
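The requirement to add profiles without decrypting them can be illustrated with an additively homomorphic variant of ElGamal, where the message is placed in the exponent. This is a toy sketch under loud assumptions: the parameters are far too small to be secure, a single party holds the key (no shared or threshold decryption), and nothing here reproduces the voting protocol itself.

```python
import secrets

# Toy parameters (assumed for illustration only; far too small to be secure).
P = 467  # safe prime, P = 2*Q + 1
Q = 233  # prime order of the subgroup generated by G
G = 4    # generator of the order-Q subgroup


def keygen():
    """Return (private key, public key)."""
    x = secrets.randbelow(Q - 1) + 1  # private key in [1, Q-1]
    return x, pow(G, x, P)


def encrypt(m, h):
    """Exponential ElGamal: the message sits in the exponent of G."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P


def add(c, d):
    """Multiplying ciphertexts componentwise adds the hidden plaintexts."""
    return (c[0] * d[0]) % P, (c[1] * d[1]) % P


def decrypt(c, x, max_m=Q - 1):
    """Recover G^m, then find m by brute force (fine for small totals)."""
    gm = (c[1] * pow(c[0], P - 1 - x, P)) % P  # c2 * c1^(-x) mod P
    for m in range(max_m + 1):
        if pow(G, m, P) == gm:
            return m
    raise ValueError("plaintext out of range")
```

Usage: each user encrypts a contribution with the community public key, the ciphertexts are combined with add() as they are written to the blackboard, and only the final sum is decrypted.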
SIMULATION RESULTS(1)
Data set
• A random sample in which each user had to have rated at least 20 movies.
• 96,000 ratings for 3,775 movie titles and a population of 1,000 users.
• The sample data-set was used to randomly create 10 trial data-sets.
• To simulate different community sizes, they selected groups of 500, 1,000 and 2,000 users at random from the full database consisting of six million ratings for 5,700 movies and 68,000 users.
SIMULATION RESULTS(2)
Experimental Methods
• To test the Central, Random and Transitive architectures, they alternated between training and testing while adding neighbors to the neighborhood.
• During each training cycle, five new neighbors were added to the model.
• After each training cycle, a test cycle was performed to get a set of recommendations for the model owner.
• To test the II-Chord and SBB architectures, they randomly reserved one or more ratings from each user to be part of the test set.
• After the model was built, test recommendations were generated for each user’s reserved item(s).
SIMULATION RESULTS(3)
Metrics
• Average Similarity is a measure of how close a group of neighbors is to the model owner.
• Mean Absolute Error is a measure of how accurately we can predict a user’s rating for an item.
• Recall is a measure of how often a list of recommendations contains an item that the user has actually rated.
• Coverage is a measure of the percentage of items for which a recommendation system can provide predictions.
• Memory Usage is a measure of how large the model grows as more items and neighbors are incorporated.
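Three of the metrics above can be written down directly. The function names and the dict/list input shapes are assumptions for this sketch, and the exact evaluation protocol used in the paper may differ in detail.

```python
def mean_absolute_error(predicted, actual):
    """predicted, actual: dicts item -> rating; scored on items in both."""
    common = [i for i in actual if i in predicted]
    if not common:
        raise ValueError("no overlapping items")
    return sum(abs(predicted[i] - actual[i]) for i in common) / len(common)


def recall(recommended, held_out):
    """Fraction of held-out rated items that appear in the recommendation list."""
    held_out = set(held_out)
    if not held_out:
        return 0.0
    return len(held_out & set(recommended)) / len(held_out)


def coverage(predictable_items, all_items):
    """Fraction of all items for which a prediction can be produced."""
    all_items = set(all_items)
    if not all_items:
        return 0.0
    return len(all_items & set(predictable_items)) / len(all_items)
```

For example, predictions of 4.0 and 2.0 against true ratings of 5 and 2 give an MAE of 0.5.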
SIMULATION RESULTS(4)
Neighborhood Similarity
[Fig.12] Average similarity of users in the model as neighbor size grows
SIMULATION RESULTS(5)
Coverage
[Fig.13] Coverage as neighborhood size grows
SIMULATION RESULTS(6)
MAE
[Fig.14] MAE as neighborhood size grows
[Fig.15] MAE for community built models
SIMULATION RESULTS(7)
Recall
[Fig.16] Recall as neighborhood size grows
DISCUSSION(1)
To increase portability, recommendations can be available to users whenever and wherever they want.
[Table1] Effect of model truncation in the PocketLens algorithm on size, speed, coverage and MAE
DISCUSSION(2)
To increase trust, the system enables users to decide how much of their information to share while preserving their anonymity.
[Fig.17] Comparison of different architectures with respect to data security and complexity of model building
CONCLUSION & FUTURE WORK
Conclusions
• The PocketLens algorithm is portable enough to run on disconnected palmtop computers, and can protect the user’s privacy and provide trustworthy recommendations.
• The quality of recommendations is as good as the best previously reported results [Sarwar et al. 2001].
• Among the five architectures, none is perfect, but the architectures provide fast, portable recommendations with privacy protection and good quality.
Future Works
• To turn these architectures into working systems.
• To investigate interface issues for helping users manage and control their profile information.
• To prevent shilling attack problems.