Social Audio Features: An Intuitive Guide to the Music Galaxy (Michael Kuhn, ETH Zurich, Distributed Computing Group)

  • Slide 1
  • Social Audio Features: An Intuitive Guide to the Music Galaxy. Michael Kuhn, Distributed Computing Group (DISCO), ETH Zurich. [email protected]
  • Slide 2
  • Today, I would like to listen to something cheerful. Something like Lenny Kravitz would be great. Who can help me to discover my collection?
  • Slide 3
  • Half of the time I spend skipping songs...
  • Slide 4
  • On my shelf, AC/DC is next to ZZ Top...
  • Slide 5
  • Similar or different???
  • Slide 6
  • Cover flow looks better...
  • Slide 7
  • ...but does not represent perceived similarity well. (Figure labels: Miles Davis, Beatles, Fatboy Slim, Avril Lavigne.)
  • Slide 8
  • Slide 9
  • We want to have something that reflects perceived music similarity well and is as convenient to use as an audio feature space: Social Audio Features.
  • Slide 10
  • socially derived music similarity + mapping into Euclidean space = Social Audio Features
  • Slide 11
  • Advantages of a Feature Space: similar songs are close to each other; quickly find nearest neighbors; span (and play) volumes; create smooth playlists by interpolation; visualize a collection; low memory footprint; well suited for the mobile domain. In short: a convenient basis to build music software.
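
Two of the advantages listed on this slide, nearest-neighbor lookup and playlists by interpolation, can be illustrated with a minimal sketch. The song names and 2-D coordinates are made up for illustration (the spaces discussed later in the talk have 10 or 32 dimensions), and the function names are hypothetical, not part of any software shown in the talk.

```python
import math

def nearest_neighbors(space, query, k=3):
    """Return the k song names closest to `query` by Euclidean distance."""
    others = [name for name in space if name != query]
    return sorted(others, key=lambda n: math.dist(space[n], space[query]))[:k]

def interpolated_playlist(space, start, end, steps=3):
    """Walk a straight line from start to end and pick, at each step,
    the nearest song that is not yet in the playlist."""
    playlist = [start]
    for s in range(1, steps):
        t = s / steps
        point = [a + t * (b - a) for a, b in zip(space[start], space[end])]
        candidates = [n for n in space if n not in playlist and n != end]
        if not candidates:
            break
        playlist.append(min(candidates, key=lambda n: math.dist(space[n], point)))
    playlist.append(end)
    return playlist

# Hypothetical coordinates, invented for this example only.
songs = {
    "rock_a": (0.0, 0.0),
    "rock_b": (1.0, 0.0),
    "pop_a": (2.0, 0.0),
    "pop_b": (3.0, 0.0),
    "jazz_a": (0.0, 5.0),
}

neighbors = nearest_neighbors(songs, "rock_a", k=2)
playlist = interpolated_playlist(songs, "rock_a", "pop_b", steps=3)
```

A linear scan is enough for a toy collection; for hundreds of thousands of songs a spatial index would be the more realistic choice.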
  • Slide 12
  • Creating Social Audio Features, Method 1: Collaborative Filtering and MDS
  • Slide 13
  • Slide 14
  • Users who listen to Muse also listen to Oasis... Pairwise similarity is derived from the number of common users (co-occurrences), set in relation to the occurrences of song A and the occurrences of song B. Problem: only pairwise similarity, but no global view!
  • Slide 15
  • Getting a global view... (Figure: graph of pairwise similarities with two edges of length 1 and an unknown distance d = ?)
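
A minimal sketch of a co-occurrence similarity built from the three quantities named on this slide. The exact normalization is not recoverable from the transcript; dividing by the square root of the product of the occurrence counts (a cosine-style normalization) is an assumption, as are the function name and the toy listening histories.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_similarity(listening_histories):
    """listening_histories: dict user -> set of songs.
    Returns dict (song_a, song_b) -> similarity for pairs sharing a user."""
    occurrences = Counter()   # number of users per song
    common = Counter()        # number of common users per song pair
    for songs in listening_histories.values():
        occurrences.update(songs)
        for a, b in combinations(sorted(songs), 2):
            common[(a, b)] += 1
    # Assumed normalization: co-occurrences / sqrt(occ(A) * occ(B)).
    return {
        (a, b): n / math.sqrt(occurrences[a] * occurrences[b])
        for (a, b), n in common.items()
    }

# Toy data, invented for illustration.
histories = {
    "user1": {"Muse", "Oasis"},
    "user2": {"Muse", "Oasis"},
    "user3": {"Muse"},
    "user4": {"Oasis", "Miles Davis"},
}
sim = cooccurrence_similarity(histories)
```

Pairs that never share a user, such as Muse and Miles Davis here, get no entry at all, which is exactly the "no global view" problem the slide points out.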
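
One common way to estimate the unknown distance d between two songs that are only indirectly linked is to sum edge lengths along the shortest path in the similarity graph. The transcript does not state that the talk does exactly this, so the sketch below is an assumption; the node names and edge lengths are illustrative.

```python
import heapq

def shortest_path_distances(edges, source):
    """Dijkstra's algorithm on an undirected graph.
    edges: list of (node_a, node_b, length). Returns dict node -> distance."""
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, []):
            candidate = d + length
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

# Two edges of length 1, as in the figure; A and C are not directly linked.
edges = [("A", "B", 1.0), ("B", "C", 1.0)]
dist = shortest_path_distances(edges, "A")
```

The path length is only an upper bound on the true distance, which is one reason an embedding step is still needed afterwards.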
  • Slide 16
  • Classical Multidimensional Scaling (MDS). Principal Component Analysis (PCA): project onto the hyperplane that maximizes variance; computed by solving an eigenvalue problem. Basic idea of MDS: assume that the exact positions y_1, ..., y_N in a high-dimensional space are given. It can be shown that, knowing only the distances d(y_i, y_j) between the points, we can calculate the same result as by applying PCA to y_1, ..., y_N. Problem: complexity O(n^2 log n), so use an approximation: LMDS [da Silva and Tenenbaum, 2002].
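
The basic idea of classical MDS can be sketched in a few lines: square the distances, double-center them, and take the leading eigenvectors. This is a plain-Python illustration of the exact method, not of the LMDS approximation the slide recommends for large collections; power iteration with deflation is used here only to keep the example dependency-free, and the function names are hypothetical.

```python
import math
import random

def _top_eigenpair(matrix, iterations=500, seed=0):
    """Power iteration: dominant eigenvalue and eigenvector of a symmetric matrix."""
    n = len(matrix)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * mv[i] for i in range(n)), v

def classical_mds(distances, dims=2):
    """Embed n points in `dims` dimensions from an n x n distance matrix."""
    n = len(distances)
    sq = [[d * d for d in row] for row in distances]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering turns squared distances into inner products.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[] for _ in range(n)]
    for k in range(dims):
        lam, v = _top_eigenpair(b, seed=k)
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i].append(scale * v[i])
        # Deflation: remove the component just found.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords

# Distances between the corners of a 3 x 4 rectangle; positions are not passed in.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
D = [[math.dist(p, q) for q in points] for p in points]
X = classical_mds(D, dims=2)
```

The recovered coordinates match the original ones only up to rotation and reflection, which is all that matters for similarity search.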
  • Slide 17
  • Problem: some links erroneously shortcut certain paths. Approach: use the embedding as an estimator for distance, remove the edges that get stretched most, and re-embed.
  • Slide 18
  • Figure: original embedding compared with the embedding after 30 rounds of iterative embedding.
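
One round of the edge-removal step might look like the sketch below. Measuring stretch as the ratio of embedded distance to edge length, and removing a fixed fraction of edges per round, are both assumptions made for illustration; the transcript only says that the most stretched edges are removed before re-embedding.

```python
import math

def prune_most_stretched(edges, coords, fraction=0.1):
    """edges: list of (node_a, node_b, length); coords: dict node -> point.
    Returns (kept_edges, removed_edges) after dropping the most stretched ones."""
    def stretch(edge):
        a, b, length = edge
        # Assumed stretch measure: embedded distance relative to edge length.
        return math.dist(coords[a], coords[b]) / length

    ranked = sorted(edges, key=stretch, reverse=True)
    n_remove = max(1, int(len(edges) * fraction))
    removed = ranked[:n_remove]
    kept = [e for e in edges if e not in removed]
    return kept, removed

# Hypothetical embedding in which the edge A-D is a suspicious shortcut.
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0), "D": (10.0, 0.0)}
edges = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "D", 1.0)]
kept, removed = prune_most_stretched(edges, coords)
```

In an iterative scheme the kept edges would be fed back into the distance computation and the embedding recomputed, repeating for a number of rounds.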
  • Slide 19
  • Example Neighborhoods in 10D Space (0.5M songs). 10 dimensions give a reasonable quality. Neighborhood 1: Pink Floyd - Time; Pink Floyd - On the Run; Pink Floyd - Any Colour You Like; Pink Floyd - The Great Gig in the Sky; Pink Floyd - Eclipse; Pink Floyd - Us and Them; Pink Floyd - Brain Damage; Pink Floyd - Speak to Me; Pink Floyd - Money; Pink Floyd - Breathe; Pink Floyd - One of These Days. Neighborhood 2: Miles Davis - So What; Horace Silver - Song For My Father; Bill Evans - All of You; Miles Davis - Freddie Freeloader; Nat King Cole - The More I See You; Miles Davis - So Near; Miles Davis - Flamenco Sketches; Charles Mingus - Eat That Chicken; Jimmy Smith - On the Sunny Side; Julie London - Daddy; Bill Evans - My Man's Gone Now.
  • Slide 20
  • Creating Social Audio Features, Method 2: Social Tags and PLSA
  • Slide 21
  • Slide 22
  • Social tags: meaningful labels, but sparse data. Usage data: good similarity information, but no labels. Let's combine this information.
  • Slide 23
  • Combining Usage Data and Social Tags
  • Slide 24
  • Probabilistic Latent Semantic Analysis (PLSA). Generative model for each word of a document d: 1) select a latent class z with probability P(z|d); 2) select a word w with probability P(w|z). PLSA: find the probabilities that best approximate the observed word distribution. (Figure: example word groups: art, painting, artist, music, collection; approach, psychology, feeling, female, subjective; audio, signal, music, beat, timbre.)
  • Slide 25
  • Probabilistic Latent Semantic Analysis (PLSA). Example document: "Everyone has a photographic memory; some just don't have film." 1) Select a latent class z with probability P(z|d). 2) Select a word w with probability P(w|z). PLSA: find the probabilities that best approximate the observed word distribution.
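
Finding the probabilities P(z|d) and P(w|z) is usually done with expectation-maximization (EM). The sketch below is a minimal, unoptimized EM loop on a toy count matrix; the function name, the number of iterations, the random initialization, and the counts are all assumptions for illustration, not the setup used in the talk.

```python
import random

def plsa(counts, n_classes, iterations=50, seed=0):
    """counts[d][w] = number of times word w occurs in document d.
    Returns (p_z_d, p_w_z): P(z|d) per document and P(w|z) per class."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        total = sum(row)
        if total <= 0.0:
            return [1.0 / len(row)] * len(row)  # fall back to uniform
        return [x / total for x in row]

    p_z_d = [normalized([rng.random() + 1e-3 for _ in range(n_classes)])
             for _ in range(n_docs)]
    p_w_z = [normalized([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_classes)]

    for _ in range(iterations):
        new_z_d = [[0.0] * n_classes for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_classes)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_classes)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(n_classes):
                    weight = counts[d][w] * joint[z] / total
                    new_z_d[d][z] += weight
                    new_w_z[z][w] += weight
        # M-step: renormalize the expected counts.
        p_z_d = [normalized(row) for row in new_z_d]
        p_w_z = [normalized(row) for row in new_w_z]
    return p_z_d, p_w_z

# Toy counts: 4 documents (rows) over a vocabulary of 4 words (columns).
counts = [
    [4, 3, 0, 0],
    [5, 2, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 2, 5],
]
p_z_d, p_w_z = plsa(counts, n_classes=2)
```

Each row of `p_z_d` is a probability vector over the latent classes, so it can be read directly as the coordinates of one document.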
  • Slide 26
  • PLSA: Interpretation as Space. The class probabilities of a document (its P(z|d) values over the K latent classes) can be seen as a vector that defines a point in space [Hofmann, 1999]. K small: dimensionality reduction. In the music setting: documents are songs, latent classes are music style classes, and words are tags.
  • Slide 27
  • Applying PLSA to Music and Tags. Example songs and their tags: Green Day - Basket Case: rock, punk, pop-punk; Madonna - Like a Prayer: pop, dance, female vocalists; Beatles - Hey Jude: 60s, classic rock, british. 32 latent classes (= dimensions), 1.1M songs.
  • Slide 28
  • Evaluation: artist clustering; comparison to collaborative filtering; tag consistency.
  • Slide 29
  • LMDS vs. PLSA Space. Advantages of LMDS: same accuracy at lower dimensionality (10 vs. 32). Advantages of PLSA: natural meaning of tags; (probabilistic) assignment of tags to songs. Current sizes (approx.): LMDS: 600K tracks; PLSA: 1.1M tracks.
  • Slide 30
  • Using the Social Audio Features
  • Slide 31
  • High-dimensional!
  • Slide 32
  • Visualization in 2D: identify relevant tags; find the centroids of these tags in the high-dimensional space; apply Principal Component Analysis (PCA) to these centroids.
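
The three steps on this slide can be sketched as follows, assuming that each relevant tag comes with the coordinates of the songs carrying it. The tag names, the 3-D coordinates (standing in for the 10- or 32-dimensional space), and the helper names are invented for illustration; PCA is done with power iteration only to avoid external dependencies.

```python
import math
import random

def centroid(points):
    """Component-wise mean of a list of equally long vectors."""
    return [sum(p[i] for p in points) / len(points) for i in range(len(points[0]))]

def principal_axes(rows, n_axes=2, iterations=500, seed=0):
    """Return (mean, axes): the top principal axes of `rows` via power iteration."""
    dim = len(rows[0])
    mean = centroid(rows)
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / len(rows) for j in range(dim)]
           for i in range(dim)]
    rng = random.Random(seed)
    axes = []
    for _ in range(n_axes):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim))
        axes.append(v)
        for i in range(dim):          # deflation
            for j in range(dim):
                cov[i][j] -= lam * v[i] * v[j]
    return mean, axes

def project(point, mean, axes):
    """Coordinates of `point` in the plane spanned by the principal axes."""
    return [sum((x - m) * a for x, m, a in zip(point, mean, axis)) for axis in axes]

# Hypothetical tags with the coordinates of their songs.
tag_songs = {
    "rock": [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    "jazz": [[0.0, 10.0, 0.0], [0.0, 12.0, 0.0]],
    "pop": [[5.0, 5.0, 1.0], [5.0, 5.0, 3.0]],
}
centroids = {tag: centroid(pts) for tag, pts in tag_songs.items()}
mean, axes = principal_axes(list(centroids.values()))
tag_2d = {tag: project(c, mean, axes) for tag, c in centroids.items()}
```

Running PCA on the tag centroids rather than on all songs keeps the chosen tags well separated on screen; individual songs can then be placed with the same `project` call.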
  • Slide 33
  • Slide 34
  • What people chose during the Researchers' Night in Zurich
  • Slide 35
  • YouJuke: The YouTube Jukebox
  • Slide 36
  • YouTube as media source; Social Audio Features to create smart playlists.
  • Slide 37
  • Slide 38
  • Slide 39
  • www.youjuke.org, apps.facebook.com/youjuke
  • Slide 40
  • Half of the time I spend skipping songs
  • Slide 41
  • I only want to listen to songs that match my mood...
  • Slide 42
  • After only a few skips, we know pretty well which songs match the user's mood.
  • Slide 43
  • Work in Progress: Who is Dancing? (Figure labels: AC/DC, Beatles, Prodigy.)
  • Slide 44
  • Browsing Covers: on my shelf, AC/DC is next to ZZ Top...
  • Slide 45
  • www.museek.ethz.ch
  • Slide 46
  • Video
  • Slide 47
  • Selected Comments from museek Users: "Your software is a pathetic piece of crap!"; "[...] Does a good job learning my tastes [...]"; "[...] easy browse and make playlists. Auto play related music is very good. ui !"; "[...] Love the ability to automatically play similar music. [...]"; "Good potential, but album art is tiny & blurry [...]"; "Just got it and want to put more music on my sd card now. Pretty cool once you get the hang of it."; "The algorithm that selects playlists according to how your mood evolves is a real gem. Congratulations [...]" (translated from French); "[...] Awesome app beating the ipod genius feature and coverflow. [...]"
  • Slide 48
  • Questions? Thanks to: Lukas Bossard, Mihai Calin, Matthias Flückiger, Olga Goussevskaia, Michael Lorenzi, Roger Wattenhofer, Samuel Welten, Martin Wirz. URLs: www.museek.ethz.ch, www.youjuke.org, apps.facebook.com/youjuke. E-Mail: [email protected] (Michael Kuhn)
  • Slide 49
  • Publications. (1) Sensing Dance Engagement for Collaborative Music Control. Michael Kuhn, Martin Wirz, Matthias Flückiger, Roger Wattenhofer, Gerhard Tröster. (accepted at ISWC 2011) (2) Social Audio Features for Advanced Music Retrieval Interfaces. Michael Kuhn, Roger Wattenhofer, and Samuel Welten. ACM Multimedia, Florence, October 2010. (3) Visually and Acoustically Exploring the High-Dimensional Space of Music. Lukas Bossard, Michael Kuhn, and Roger Wattenhofer. IEEE International Conference on Social Computing (SocialCom), Vancouver, Canada, August 2009. (4) From Web to Map: Exploring the World of Music. Olga Goussevskaia, Michael Kuhn, Michael Lorenzi, and Roger Wattenhofer. IEEE/WIC/ACM International Conference on Web Intelligence (WI), Sydney, Australia, December 2008. (5) Exploring Music Collections on Mobile Devices. Olga Goussevskaia, Michael Kuhn, and Roger Wattenhofer. International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI), Amsterdam, Netherlands, September 2008.