Matrix Factorization

Post on 05-Jan-2016


Transcript of Matrix Factorization

1

Matrix Factorization

2

Recovering latent factors in a matrix

[Figure: an n-row × m-column matrix V with entries v11 … vij … vnm]

3

Recovering latent factors in a matrix

[Figure: V ≈ W H, where W is n × K (rows (x1, y1) … (xn, yn)) and H is K × m (rows (a1 … am) and (b1 … bm)); K = 2 here]
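To make the picture concrete, here is a minimal NumPy sketch (the sizes n = 6, m = 5, K = 2 are illustrative, not from the slides) showing that the product of an n × K matrix and a K × m matrix is a rank-K matrix described by far fewer numbers than the full n × m grid:

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, K = 6, 5, 2            # illustrative sizes (not from the slides)
W = rng.normal(size=(n, K))  # the n x K factor
H = rng.normal(size=(K, m))  # the K x m factor
V = W @ H                    # an exactly rank-K n x m matrix

# V has n*m = 30 entries but is described by n*K + K*m = 22 numbers,
# and its rank can never exceed K.
print(np.linalg.matrix_rank(V))  # 2
```

The storage savings of n·K + K·m versus n·m grow quickly as n and m do, which is the whole point of recovering the latent factors.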

4

What is this for?

[Figure: the same V ≈ W H factorization, with W n × K and H K × m]

5

MF for collaborative filtering

What is collaborative filtering?


11

Recovering latent factors in a matrix

[Figure: the ratings matrix V, with n users as rows and m movies as columns; V[i,j] = user i's rating of movie j]

12

Recovering latent factors in a matrix

[Figure: the n users × m movies ratings matrix factored as V ≈ W H, with W n × K and H K × m; V[i,j] = user i's rating of movie j]


14

MF for image modeling

15

MF for images

[Figure: V is a 1000 images × 10,000 pixels matrix (1000 × 10,000 = 10,000,000 entries) factored as V ≈ W H; V[i,j] = pixel j in image i. With 2 prototypes, the rows of H are the principal components PC1 and PC2]
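A small sketch of the idea, shrunk from the slide's 1000 × 10,000 matrix to toy dimensions (50 "images" of 100 "pixels", generated from 2 planted prototypes; the sizes and the SVD-based fitting are my illustration, not the slide's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the slide's 1000 x 10,000 matrix: 50 "images"
# of 100 "pixels", mixed from 2 planted prototype images.
protos = rng.normal(size=(2, 100))
coeffs = rng.normal(size=(50, 2))
V = coeffs @ protos + 0.01 * rng.normal(size=(50, 100))  # small noise

# Truncated SVD gives the best rank-2 factorization V ~ W H;
# the two rows of H play the role of the prototypes PC1 and PC2.
U, s, Vt = np.linalg.svd(V, full_matrices=False)
W = U[:, :2] * s[:2]   # 50 x 2: each image's coordinates
H = Vt[:2]             # 2 x 100: the two "eigen-image" prototypes

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.4f}")
```

Because the data really is (noisy) rank 2, two prototypes reconstruct it almost perfectly.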

17

MF for modeling text

• The Neatest Little Guide to Stock Market Investing
• Investing For Dummies, 4th Edition
• The Little Book of Common Sense Investing: The Only Way to Guarantee Your Fair Share of Stock Market Returns
• The Little Book of Value Investing
• Value Investing: From Graham to Buffett and Beyond
• Rich Dad’s Guide to Investing: What the Rich Invest in, That the Poor and the Middle Class Do Not!
• Investing in Real Estate, 5th Edition
• Stock Investing For Dummies
• Rich Dad’s Advisors: The ABC’s of Real Estate Investing: The Secrets of Finding Hidden Profits Most Investors Miss

https://technowiki.wordpress.com/2011/08/27/latent-semantic-analysis-lsa-tutorial/

TFIDF counts would be better

20

Recovering latent factors in a matrix

[Figure: an n documents × m terms doc-term matrix V factored as V ≈ W H; V[i,j] = TFIDF score of term j in doc i. In the latent space, "Investing for Real Estate" and "Rich Dad's Advisors: The ABCs of Real Estate Investment" land near each other, as do "The Little Book of Common Sense Investing" and "The Neatest Little Guide to Stock Market Investing"]
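The LSA setup can be sketched with plain NumPy; the documents below are shortened stand-ins for the book titles above, and raw counts are used rather than the TFIDF weights the slides recommend:

```python
import numpy as np

# Shortened stand-ins for the book titles above (illustrative).
docs = [
    "stock market investing guide",
    "investing for dummies",
    "value investing graham buffett",
    "real estate investing",
    "stock investing for dummies",
    "real estate investment secrets",
]
vocab = sorted({w for d in docs for w in d.split()})
col = {w: j for j, w in enumerate(vocab)}

# Raw term counts; the slides note TFIDF weights would be better.
V = np.zeros((len(docs), len(vocab)))
for i, d in enumerate(docs):
    for w in d.split():
        V[i, col[w]] += 1

# Rank-2 truncated SVD: every document and every term gets a
# 2-d latent-topic coordinate.
U, s, Vt = np.linalg.svd(V, full_matrices=False)
doc_vecs = U[:, :2] * s[:2]   # n docs x 2
term_vecs = Vt[:2].T          # m terms x 2

# Docs that share topic words end up near each other in latent space.
print(doc_vecs.shape, term_vecs.shape)
```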

24

MF is like clustering

k-means as MF

[Figure: the original data matrix X (n examples) factored as X ≈ Z M, where Z is an n × r matrix of 0/1 indicators for the r clusters and M holds the cluster means]
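A sketch of this view, assuming the hard assignments are already known (two planted clusters, r = 2; the data and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two planted clusters of 2-d points (n = 10 examples, r = 2 clusters).
X = np.vstack([rng.normal(0.0, 0.1, size=(5, 2)),
               rng.normal(5.0, 0.1, size=(5, 2))])

# Z: n x r matrix of 0/1 cluster indicators, one 1 per row.
assign = np.array([0] * 5 + [1] * 5)
Z = np.eye(2)[assign]

# With Z fixed, the best M in min ||X - Z M||^2 holds the cluster means.
M = np.linalg.lstsq(Z, X, rcond=None)[0]

# Z @ M replaces each point by its cluster mean: the k-means
# reconstruction is exactly a constrained matrix factorization X ~ Z M.
err = np.linalg.norm(X - Z @ M)
print(f"reconstruction error: {err:.3f}")
```

The only difference from generic MF is the constraint that each row of Z is a one-hot indicator rather than arbitrary real weights.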

26

How do you do it?

[Figure: the V ≈ W H factorization again, with W n × K and H K × m]

27

talk pilfered from …..

KDD 2011


29

Recovering latent factors in a matrix

[Figure: the n users × m movies ratings matrix V factored as V ≈ W H, with W n × r and H r × m; V[i,j] = user i's rating of movie j]


34

Matrix factorization as SGD

[Figure: the SGD update equations, with the step size highlighted] Why does this work?

35

Matrix factorization as SGD - why does this work? Here’s the key claim: an SGD step on a single cell (i,j) reads and updates only row i of W and column j of H, so steps on cells that share no row or column do not interfere with each other.

36

Checking the claim

Think of SGD for logistic regression:
• the LR loss compares y and ŷ = dot(w,x)
• MF is similar, but now we update both w (the user’s weights) and x (the movie’s weights)
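Here is a sketch of SGD for the squared loss on observed cells; the sizes, step size, and L2 regularization constant are illustrative choices, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; the slides don't fix these.
n_users, n_movies, K = 30, 20, 3

# Planted low-rank ratings, observed at roughly half the cells.
V = rng.normal(size=(n_users, K)) @ rng.normal(size=(K, n_movies))
obs = np.array([(i, j) for i in range(n_users) for j in range(n_movies)
                if rng.random() < 0.5])

W = 0.1 * rng.normal(size=(n_users, K))   # user factors
H = 0.1 * rng.normal(size=(K, n_movies))  # movie factors
lr, lam = 0.02, 0.01                      # step size, L2 penalty (assumed)

for epoch in range(200):
    rng.shuffle(obs)                      # visit observed cells in random order
    for i, j in obs:
        err = W[i] @ H[:, j] - V[i, j]    # error on one observed rating
        # The update for cell (i, j) touches only row i of W and
        # column j of H -- the locality the "key claim" relies on.
        grad_w = err * H[:, j] + lam * W[i]
        grad_h = err * W[i] + lam * H[:, j]
        W[i] -= lr * grad_w
        H[:, j] -= lr * grad_h

rmse = np.sqrt(np.mean([(W[i] @ H[:, j] - V[i, j]) ** 2 for i, j in obs]))
print(f"training RMSE: {rmse:.3f}")
```

As in logistic regression, each step compares one observed value to one prediction; the difference is that both sides of the dot product are learned.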

37

What loss functions are possible?

N1, N2 - diagonal matrices, sort of like IDF factors for the users/movies

“generalized” KL-divergence
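The “generalized” KL-divergence the slide names is usually written (following the NMF literature) as:

```latex
D(V \,\|\, WH) \;=\; \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} \;-\; V_{ij} \;+\; (WH)_{ij} \right)
```

and one reading of the N1, N2 weighting is a weighted squared loss \( \| N_1 (V - WH) N_2 \|_F^2 \), with the diagonal entries of N1 and N2 acting like per-user / per-movie IDF factors. Both forms are my reconstruction of the lost equation slides, not text copied from them.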


40

ALS = alternating least squares
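A minimal ALS sketch: with one factor held fixed, solving for the other is a ridge-regularized least-squares problem in closed form (the sizes and ridge constant below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, K = 20, 15, 3
V = rng.normal(size=(n, K)) @ rng.normal(size=(K, m))  # rank-K target

H = rng.normal(size=(K, m))     # random initial item factors
lam = 1e-3                      # small ridge term (assumed constant)

for _ in range(20):
    # Fix H: each row of W solves an independent ridge regression,
    # W = V H^T (H H^T + lam I)^{-1}.
    W = np.linalg.solve(H @ H.T + lam * np.eye(K), H @ V.T).T
    # Fix W: each column of H solves an independent ridge regression,
    # H = (W^T W + lam I)^{-1} W^T V.
    H = np.linalg.solve(W.T @ W + lam * np.eye(K), W.T @ V)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative error: {err:.4f}")
```

Unlike SGD, each half-step is an exact minimization, so the loss decreases monotonically in each sweep.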



45

Similar to McDonnell et al with perceptron learning

46

Slow convergence…..


53

More detail….
• Randomly permute rows/cols of matrix
• Chop V, W, H into blocks of size d × d
  – m/d blocks in W, n/d blocks in H
• Group the data:
  – Pick a set of blocks with no overlapping rows or columns (a stratum)
  – Repeat until all blocks in V are covered
• Train the SGD:
  – Process strata in series
  – Process blocks within a stratum in parallel
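The stratum construction above can be sketched as a cyclic schedule over a grid of blocks; the 4 × 4 grid and the diagonal-shift rule are one simple way to realize "no overlapping rows or columns", while the actual system permutes more generally:

```python
n_blocks = 4  # rows and columns of V each chopped into 4 chunks

# Stratum s picks, for each row-chunk r, the block in column (r + s) mod 4.
# Within a stratum no two blocks share a row-chunk or a column-chunk,
# so independent SGD workers can train them in parallel.
strata = [[(r, (r + s) % n_blocks) for r in range(n_blocks)]
          for s in range(n_blocks)]

# Processing the strata in series covers every block of V exactly once.
covered = [block for stratum in strata for block in stratum]
print(len(covered), len(set(covered)))  # 16 16
```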

54

More detail…. (note: Z in these slides denotes the matrix we have been calling V)

55

More detail….
• Initialize W, H randomly
  – not at zero
• Choose a random ordering (random sort) of the points in a stratum in each “sub-epoch”
• Pick strata sequence by permuting rows and columns of M, and using M’[k,i] as column index of row i in subepoch k
• Use “bold driver” to set step size:
  – increase step size when loss decreases (in an epoch)
  – decrease step size when loss increases
• Implemented in Hadoop and R/Snowfall
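The “bold driver” rule can be sketched in a few lines; the growth and shrink factors 1.05 and 0.5 are typical choices, not values from the slides:

```python
def bold_driver(step, prev_loss, loss, up=1.05, down=0.5):
    """Grow the step slightly after an epoch where the loss fell;
    cut it sharply after an epoch where the loss rose.
    (1.05 and 0.5 are typical constants, not from the slides.)"""
    return step * up if loss < prev_loss else step * down

step = 0.1
step = bold_driver(step, prev_loss=10.0, loss=9.0)  # loss fell
step = bold_driver(step, prev_loss=9.0, loss=9.5)   # loss rose
print(round(step, 4))  # 0.0525
```

The asymmetry (small increases, large decreases) makes the schedule quick to back off when a step size overshoots.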

[Figure: an example strata-sequence matrix M]


57

Wall Clock Time (8 nodes, 64 cores, R/snow)


62

Number of Epochs


67

Varying rank (100 epochs for all)

68

Hadoop scalability: Hadoop process setup time starts to dominate

69

Hadoop scalability
