Watson DevCon 2016 - Watson Beat: Making Music Cognitive

  • date post

    16-Apr-2017
  • Category

    Technology

Transcript of Watson DevCon 2016 - Watson Beat: Making Music Cognitive

  • IBM confidential IBM Corporation

    Watson Beat: New Cognitive Era

  • Future

    Current

  • Motivation:

    Why music?

    Why unsupervised learning?

    Why Watson Beat?

    Representing music by parameters is not intuitive for most people

    For example: "I like songs at 120 bpm, in B# major", etc.

    Watson Beat provides a new way to query and compose music based on reference tracks

    Ease of use: no prior knowledge of music theory etc. is required

    Provide one or more input tracks to the model, and it returns one or more new output tracks

  • Learning Music: Basic Idea

    Demo of a potential application: "I am an amateur video game developer, I need music for my game, and I like how Game of Thrones sounds"

    Feed the track through Watson Beat and get hundreds of variations on the same music

    Ability to steer compositions based on intent: slow, sad, fast, happy, vibrant

    Input track(s)

    Perturb model using creativity genes (known influences)

    Extract musical characteristics (pitch, rhythm, dynamics etc.)

    Reconstruct track by iteratively learning input musical characteristics with added perturbation

    Output track(s)

    Demo: Original Game of Thrones

    Demo: Learned Game of Thrones

  • Deep Belief Networks (DBN):

    RBM: stochastic neural net with one layer of visible units and one layer of hidden units

    Learning RBMs: Contrastive Divergence

    DBN: stack multiple Restricted Boltzmann Machines (RBMs)

    x: Visible layer

    h: Hidden layer

    W: Weights

    Input vector

    Perturbed input vector

    Unsupervised learning of NNs (RBMs, autoencoders etc.)

    Output vector
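
The RBM described on this slide can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not Watson Beat's implementation: the class name `TinyRBM`, the bias vectors `b` and `c`, the learning rate, and the toy data are all hypothetical. It uses binary units and one step of contrastive divergence (CD-1) per example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyRBM:
    """Binary RBM with visible layer x, hidden layer h, and weights W."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # W[i][j] connects visible unit i to hidden unit j.
        self.W = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases (assumed, not on the slide)
        self.c = [0.0] * n_hidden   # hidden biases (assumed, not on the slide)

    def p_h_given_x(self, x):
        return [sigmoid(self.c[j] + sum(x[i] * self.W[i][j]
                                        for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def p_x_given_h(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j]
                                        for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def sample(self, probs):
        return [1 if self.rng.random() < p else 0 for p in probs]

    def cd1_step(self, x, lr=0.1):
        """One CD-1 update for a single training vector x."""
        ph0 = self.p_h_given_x(x)
        h0 = self.sample(ph0)          # stochastic hidden state
        x1 = self.p_x_given_h(h0)      # reconstruction probabilities
        ph1 = self.p_h_given_x(x1)
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.W[i][j] += lr * (x[i] * ph0[j] - x1[i] * ph1[j])
            self.b[i] += lr * (x[i] - x1[i])
        for j in range(self.n_hidden):
            self.c[j] += lr * (ph0[j] - ph1[j])

    def reconstruct(self, x):
        return self.p_x_given_h(self.p_h_given_x(x))


def reconstruction_error(rbm, data):
    """Mean squared reconstruction error per training vector."""
    return sum((xi - ri) ** 2
               for x in data
               for xi, ri in zip(x, rbm.reconstruct(x))) / len(data)


# Toy input vectors (illustrative only).
data = [[1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1]]
rbm = TinyRBM(n_visible=6, n_hidden=2)
error_before = reconstruction_error(rbm, data)
for _ in range(200):
    for x in data:
        rbm.cd1_step(x)
error_after = reconstruction_error(rbm, data)
print(error_before, error_after)
```

The reconstruction error is only a rough progress signal; it is not the quantity that contrastive divergence optimizes.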

  • Applications: Cloud-based cognitive music service

    Watson Beat Pandora station: suggest recreated versions of songs that you like, have been listening to, etc.

    Producers, composers, music engineers: create music based on intent (slow, fast, happy, vibrant)

    Ability for retail stores and small businesses to create their own music based on original tracks

    Loop pedal mixing: pedal -> DJ Watson mixer -> amp

    https://www.youtube.com/watch?v=qX2eJsj9MiQ

  • Backup Slides

  • Training RBMs
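
The transcript keeps only the title of this slide. For orientation, the standard textbook formulation of a binary RBM and its contrastive-divergence update is given here; the bias vectors $b$, $c$ and the learning rate $\varepsilon$ are symbols assumed for this sketch and do not appear in the slides, and the deck does not confirm that exactly this variant was used.

```latex
% Energy of a joint configuration (x: visible layer, h: hidden layer, W: weights)
E(x, h) = -b^{\top} x - c^{\top} h - x^{\top} W h

% Conditional distributions, with \sigma(z) = 1 / (1 + e^{-z})
p(h_j = 1 \mid x) = \sigma\Big(c_j + \sum_i W_{ij} x_i\Big), \qquad
p(x_i = 1 \mid h) = \sigma\Big(b_i + \sum_j W_{ij} h_j\Big)

% Contrastive divergence (CD-1) weight update
\Delta W_{ij} = \varepsilon \left( \langle x_i h_j \rangle_{\text{data}}
              - \langle x_i h_j \rangle_{\text{reconstruction}} \right)
```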

  • Recreate original music using RBMs

    x: Visible layer (holds pitch information)

    C# E B

    Time: 1/16, 1/16

    h: Hidden layer (holds extracted features of the visible layer)

    x~: Learned visible layer (holds learned pitch information)

    C E# B

    Time: 1/16, 1/16

    p(h|x), p(x|h)

    Demo: Recreated "Mary Had a Little Lamb"
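
The slide says only that the visible layer holds pitch information at 1/16 time steps. One plausible encoding, assumed here purely for illustration, is a one-hot block of 12 pitch classes per time step; the names `PITCH_CLASSES`, `encode`, and `decode` are hypothetical.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def encode(notes):
    """Flatten a note sequence into a binary visible vector (one-hot per step)."""
    x = []
    for note in notes:
        step = [0] * len(PITCH_CLASSES)
        step[PITCH_CLASSES.index(note)] = 1
        x.extend(step)
    return x


def decode(x):
    """Pick the most active pitch class in each 12-unit time step.

    Works on binary vectors and on reconstruction probabilities alike.
    """
    n = len(PITCH_CLASSES)
    notes = []
    for start in range(0, len(x), n):
        step = x[start:start + n]
        notes.append(PITCH_CLASSES[step.index(max(step))])
    return notes


x = encode(["C#", "E", "B"])
print(len(x))     # 36
print(decode(x))  # ['C#', 'E', 'B']
```

With such an encoding, a learned visible layer x~ can be decoded back to notes in the same way, which is how a reconstruction could differ from the original by a note or two.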

  • Example 1: Create new music by adding perturbation

    x: Visible layer (holds initial pitch information)

    x: Visible layer (holds perturbed pitch information)

    h1: Hidden layer (holds extracted features of the visible layer)

    p1(h|x)

    h: Hidden layer (holds extracted features of the visible layer)

    x~: Learned visible layer (holds learned pitch information)

    p(h|x), p(x|h)

    Note rows along the time axis: "C# E B" (initial), "C# rand B" (perturbed), "E rand E# rand` B", "A rand`"

    Demo: Original "Mary Had a Little Lamb"

    Demo: Learned Mary (less perturbation)

    Demo: Learned Mary (more perturbation)
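
The "rand" entries in the diagram suggest that some notes of the visible layer are replaced by random values before learning. A minimal sketch of that idea follows, assuming a per-note replacement probability; the function `perturb`, the `rate` parameter, and the example melody are illustrative and not taken from the slides.

```python
import random

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def perturb(notes, rate, rng):
    """Replace each note with a random pitch class with probability `rate`."""
    return [rng.choice(PITCH_CLASSES) if rng.random() < rate else note
            for note in notes]


# Opening notes of "Mary Had a Little Lamb" in C major (illustrative input).
melody = ["E", "D", "C", "D", "E", "E", "E"]
rng = random.Random(42)
less_perturbed = perturb(melody, 0.1, rng)
more_perturbed = perturb(melody, 0.6, rng)
print(less_perturbed)
print(more_perturbed)
```

A low `rate` keeps the input close to the original melody, while a higher one gives the model a noisier starting point, which loosely corresponds to the "less" and "more" perturbation demos.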

  • Example 2: Create new music by steering learning based on emotional intent

    x: Visible layer (holds initial pitch information)

    x: Visible layer (holds perturbed pitch information)

    h1: Hidden layer (holds extracted features of the visible layer)

    p1(h|x)

    h: Hidden layer (holds extracted features of the visible layer)

    x~: Learned visible layer (holds learned pitch information)

    p(h|x), p(x|h)

    Note rows along the time axis: "C# E B" (initial), "C# Oct B" (perturbed), "E Oct E# Oct` B", "A Oct`"

    Demo: Original String Quartet

    Demo: Spooky String Quartet
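
Reading "Oct" in the diagram as an octave shift is an assumption; the slides do not define it. Under that reading, a perturbation that moves selected notes up or down by 12 semitones could look like the sketch below, which represents notes as MIDI note numbers. The function `octave_perturb` and the `rate` parameter are hypothetical.

```python
import random


def octave_perturb(midi_notes, rate, rng):
    """Shift each note by one octave (12 semitones) with probability `rate`."""
    out = []
    for note in midi_notes:
        if rng.random() < rate:
            shifted = note + rng.choice([-12, 12])
            # Keep the note unchanged if the shift would leave the MIDI range.
            if 0 <= shifted <= 127:
                note = shifted
        out.append(note)
    return out


melody = [61, 64, 71]  # C#4, E4, B4 as MIDI note numbers
rng = random.Random(7)
print(octave_perturb(melody, 0.5, rng))
```

An octave shift preserves the pitch class of every note, so the harmony stays recognizable while the register changes.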

  • Example 3: Create new music by learning from two songs and adding perturbation

    Visible layer (holds pitch + bias information)

    Hidden layer (holds extracted features of the visible layer)

    Weights

    Inputs: Song A, Song B

    Note rows along the time axis: "C# minor B", "minor E", "D#"

    Demo: Learned Willie Nelson and Miley Cyrus
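
The slide does not spell out what the "bias information" in the visible layer is. One way to let a single model learn from two songs, assumed here only for illustration, is to append a few extra units to every visible vector that tag which song a window came from. The names `visible_vectors` and `song_tag`, the window length, and the note sequences are all hypothetical.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def one_hot(note):
    step = [0] * len(PITCH_CLASSES)
    step[PITCH_CLASSES.index(note)] = 1
    return step


def visible_vectors(notes, song_tag, window=3):
    """Slide a window over the notes and append the song tag to each vector."""
    vectors = []
    for start in range(len(notes) - window + 1):
        x = []
        for note in notes[start:start + window]:
            x.extend(one_hot(note))
        vectors.append(x + list(song_tag))
    return vectors


song_a = ["C#", "E", "B", "E"]   # placeholder notes, not a real transcription
song_b = ["D#", "E", "B", "C#"]  # placeholder notes, not a real transcription
training_set = (visible_vectors(song_a, (1, 0))
                + visible_vectors(song_b, (0, 1)))
print(len(training_set), len(training_set[0]))  # 4 38
```

Training one RBM on the combined set would let its hidden units pick up features shared by both songs; the tag units could later be set by hand to lean the output toward one song or the other.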

  • Example 4: Create new music by steering learning based on emotional intent

    Demo: Original Adele

    Demo: Learned Adele (vibrant version)

    Demo: Learned Adele (mellow version)

  • x1: Visible layer for RBM1

    h1: Hidden layer for RBM1

    W1: Weights for RBM1

    x2: Visible layer for RBM2

    h2: Hidden layer for RBM2

    W2: Weights for RBM2

    x3: Visible layer for RBM3

    h3: Hidden layer for RBM3

    W3: Weights for RBM3
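
The labels above describe three stacked RBMs. A common way to train such a stack is greedy, layer by layer, feeding the hidden activations of one RBM to the next as its visible layer. The sketch below assumes that scheme; the slides do not confirm the exact procedure. `train_rbm`, `train_dbn`, the layer sizes, and the toy data are hypothetical, and hidden probabilities are used in place of sampled states to keep the example short.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_probs(x, W, c):
    return [sigmoid(c[j] + sum(xi * W[i][j] for i, xi in enumerate(x)))
            for j in range(len(c))]


def visible_probs(h, W, b):
    return [sigmoid(b[i] + sum(hj * W[i][j] for j, hj in enumerate(h)))
            for i in range(len(b))]


def train_rbm(data, n_hidden, epochs=100, lr=0.1, seed=0):
    """Train one RBM with CD-1 (mean-field variant); return (W, b, c)."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
         for _ in range(n_visible)]
    b, c = [0.0] * n_visible, [0.0] * n_hidden
    for _ in range(epochs):
        for x in data:
            h0 = hidden_probs(x, W, c)
            x1 = visible_probs(h0, W, b)
            h1 = hidden_probs(x1, W, c)
            for i in range(n_visible):
                for j in range(n_hidden):
                    W[i][j] += lr * (x[i] * h0[j] - x1[i] * h1[j])
                b[i] += lr * (x[i] - x1[i])
            for j in range(n_hidden):
                c[j] += lr * (h0[j] - h1[j])
    return W, b, c


def train_dbn(data, hidden_sizes):
    """Greedy layer-wise training: the hidden layer of RBM k feeds RBM k+1."""
    layers, layer_input = [], data
    for k, n_hidden in enumerate(hidden_sizes):
        W, b, c = train_rbm(layer_input, n_hidden, seed=k)
        layers.append((W, b, c))
        layer_input = [hidden_probs(x, W, c) for x in layer_input]
    return layers


# Toy input vectors (illustrative only).
data = [[1, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 1, 1],
        [1, 1, 0, 0, 0, 0, 1, 1]]
layers = train_dbn(data, hidden_sizes=[6, 4, 2])
for k, (W, b, c) in enumerate(layers, start=1):
    print(f"RBM{k}: {len(W)} visible units, {len(c)} hidden units")
```

Each RBM's visible size equals the hidden size of the one below it, which matches the pairing of x2 with h1 and x3 with h2 in the slide labels.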