Clustering with Quantum Annealing (量子アニーリングを用いたクラスタ分析)

35 slides

Kenichi Kurihara, Shu Tanaka, and Seiji Miyashita, "Quantum Annealing for Clustering", UAI2009. Topics: Quantum Annealing, Simulated annealing, Hybrid annealing.

Description

These slides explain the paper "Quantum Annealing for Clustering", joint work with Kenichi Kurihara (Google) and Prof. Seiji Miyashita (The University of Tokyo). The paper can be downloaded from: Quantum Annealing for Clustering http://www.cs.mcgill.ca/~uai2009/papers/UAI2009_0019_71a78b4a22a4d622ab48f2e556359e6c.pdf A commentary in Japanese is available at: 量子アニーリング法を用いたクラスタ分析 http://www.shutanaka.com/papers_files/ShuTanaka_DEXSMI_10.pdf

Transcript of 量子アニーリングを用いたクラスタ分析 (Clustering with Quantum Annealing)

  • 1. Title slide: Kenichi Kurihara, Shu Tanaka, and Seiji Miyashita, "Quantum Annealing for Clustering", UAI2009 (keywords: Quantum Annealing, Simulated annealing, Hybrid annealing).

2. Outline (figure-only slide).
3. Motivation: clustering as combinatorial optimization (figure-only slide).
4. Exhaustive search is hopeless: the number of candidate solutions grows like N! (N = 10: ~10^6; N = 20: ~10^18).
5. The optimization problem: find x* = argmin_x f(x) for x = (x1, …, xN). A table shows how the number of candidate solutions (and the time needed to enumerate them) explodes with N: N = 10: ~2×10^5, N = 15: ~4×10^8, N = 20: ~6×10^16, N = 25: ~3×10^23, N = 30: ~4×10^30.
6. The same in symbols: x = (x1, …, xN), x* = argmin_x f(x).
7. (figure-only slide)
8. The Ising model: H = -Σ_⟨i,j⟩ J_ij σ_i^z σ_j^z, where each spin takes σ_i^z = ±1.
9. Simulated annealing (SA): minimize H_opt = -Σ_⟨i,j⟩ J_ij σ_i^z σ_j^z - Σ_i h_i σ_i^z by gradually lowering the temperature [S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Science, Vol. 220, 671 (1983)].
10. Quantum annealing (QA): the same cost function H_opt, but escape from local minima is driven by quantum fluctuations instead of thermal ones [T. Kadowaki and H. Nishimori, Phys. Rev. E Vol. 58, 5355 (1998)].
11. Quantization: each σ_i^z becomes the Pauli matrix σ^z = (1 0; 0 -1), with σ^z|↑⟩ = +|↑⟩ and σ^z|↓⟩ = -|↓⟩, e.g. σ^z (1, 0)^T = +(1, 0)^T; the Hamiltonian becomes a 2^N × 2^N matrix.
12. Quantum fluctuation (transverse field): H_q = -Γ Σ_i σ_i^x with σ^x = (0 1; 1 0), which flips spins: σ^x|↑⟩ = |↓⟩, σ^x|↓⟩ = |↑⟩; its eigenstates are |±⟩ = (|↑⟩ ± |↓⟩)/√2.
13. Acting with H_q generates superpositions: for two spins, (|↑⟩ + |↓⟩) ⊗ (|↑⟩ + |↓⟩)/2 = (|↑↑⟩ + |↑↓⟩ + |↓↑⟩ + |↓↓⟩)/2, a uniform superposition of all configurations.
14. The two ingredients side by side: the diagonal cost term H_opt = -Σ_⟨i,j⟩ J_ij σ_i^z σ_j^z - Σ_i h_i σ_i^z and the off-diagonal quantum term H_q = -Γ Σ_i σ_i^x.
15. Annealing schedule: H(t) = A(t) H_opt + B(t) H_q, with A(t) raised from 0 to 1 while B(t) is lowered from 1 to 0 over the anneal.
16. Problem setup for clustering: given N data points with pairwise similarities w_ij, assign each point to one of K clusters.
17. Idea of QA for clustering (from the paper): QA adds another dimension to simulated annealing (SA) to control a model; QA iteratively decreases both T and Γ, whereas SA decreases just T. Figure 2 (illustrative explanation of QA): the left figure shows m independent SAs, and the right one is the QA algorithm derived with the Suzuki-Trotter (ST) expansion; σ denotes a clustering assignment. Local optima such as σ1 and σ2 have the majority of data points well-clustered, but some are not; a better clustering assignment can therefore be constructed by picking up well-clustered data points from many sub-optimal assignments. Such an assignment is located between the sub-optimal ones, and the proposed quantum effect H_q is designed so that QA-ST can find it. Preliminaries: we have n data points assigned to k clusters; the assignment of the i-th data point is a binary indicator vector σ_i, e.g. for k = 2, σ_i = (1, 0)^T and σ_i = (0, 1)^T denote assignment to the first and second cluster, respectively.
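The 2^N × 2^N matrices of slides 11-15 can be built explicitly for a tiny system with Kronecker products. A minimal numpy sketch, not from the slides: the couplings J_ij and the linear schedule A(t) = t/T, B(t) = 1 - t/T are illustrative assumptions (the slides only say A goes 0 → 1 and B goes 1 → 0).

```python
import numpy as np

# Pauli matrices from slides 11-12
sz = np.array([[1, 0], [0, -1]], dtype=float)
sx = np.array([[0, 1], [1, 0]], dtype=float)
I2 = np.eye(2)

def op_at(op, i, N):
    """Embed a single-spin operator at site i into the 2^N-dimensional space."""
    mats = [I2] * N
    mats[i] = op
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

N = 3
J = {(0, 1): 1.0, (1, 2): -0.5}   # illustrative couplings, h_i = 0
H_opt = -sum(Jij * op_at(sz, i, N) @ op_at(sz, j, N) for (i, j), Jij in J.items())
H_q = -sum(op_at(sx, i, N) for i in range(N))   # transverse field, Gamma = 1

def H(t, T=1.0):
    """H(t) = A(t) H_opt + B(t) H_q with a linear schedule A: 0 -> 1, B: 1 -> 0."""
    A, B = t / T, 1.0 - t / T
    return A * H_opt + B * H_q

# At t = 0, H(0) = H_q, whose ground state is the uniform superposition
# of all 2^N spin configurations (slide 13).
w, v = np.linalg.eigh(H(0.0))
gs = v[:, 0]
print(np.allclose(np.abs(gs), 1 / np.sqrt(2 ** N)))  # True
```

At t = T only the diagonal cost term survives, so reading off the smallest diagonal entry of H(T) recovers the classical optimum.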
18. (figure-only slide)
19. SA for clustering: collect the energies of all K^N assignments into the K^N × K^N diagonal matrix H_c = diag(E(σ^(1)), …, E(σ^(K^N))). The Boltzmann distribution is p_SA(σ̃; β) = ⟨σ̃|e^{-βH_c}|σ̃⟩ / Σ_σ ⟨σ|e^{-βH_c}|σ⟩ ∝ e^{-βE(σ̃)}, since ⟨σ|e^{-βH_c}|σ⟩ = e^{-βE(σ)} for a diagonal H_c. It is sampled with Gibbs updates p_SA(σ_i | σ_\i) ∝ e^{-βE(σ)}, where σ_\i = {σ_j | j ≠ i}.
20. QA for clustering: take H = H_c + H_q with H_q = -Γ Σ_{i=1}^N σ_i^x, where σ^x is built from the K×K identity E_K and the K×K all-ones matrix 1_K (σ^x ∝ E_K - 1_K); the slide shows the zero-diagonal σ^x for K = 2, 3, 4, which for K = 2 reduces to the Ising transverse-field matrix.
21. The QA equilibrium distribution: p_QA(σ̃; β, Γ) = ⟨σ̃|e^{-βH}|σ̃⟩ / Σ_σ ⟨σ|e^{-βH}|σ⟩. Because H is no longer diagonal, e^{-βH} cannot be evaluated directly; the Suzuki-Trotter expansion is used instead.
22. Suzuki-Trotter expansion: a d-dimensional quantum system is mapped to a (d+1)-dimensional classical system with m Trotter slices:
  p_QA(σ1; β, Γ) = Σ_{σ2,…,σm} p_QA-ST(σ1, σ2, …, σm; β, Γ) + O(1/m),
  p_QA-ST(σ1, σ2, …, σm; β, Γ) = (1/Z) Π_{j=1}^m p_SA(σj; β/m) e^{s(σj, σj+1) f(β, Γ)},
  s(σj, σj+1) = (1/N) Σ_{i=1}^N δ(σ_{j,i}, σ_{j+1,i}),
  f(β, Γ) = log((a + b)/b), a = e^{-βΓ/m}, b = a(a^{-K} - 1)/K,
where σ_{j,i} is the assignment of the i-th data point in the j-th Trotter slice.
23. Cluster-label permutation (Figure 4): σ1 and σ2 give the same clustering but have different cluster labels, so s(σ1, σ2) = 0; after cluster label permutation from σ2 to σ2′, s(σ1, σ2′) = 1. The purity measure s̃ gives s̃(σ1, σ2) = 1 as well, i.e. it is invariant to label permutation.
24. Picture of SA: the m slices (1, 2, …, m) evolve as independent runs with no interaction [S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Science, Vol. 220, 671 (1983)].
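The QA-ST distribution of slide 22 can be sampled with ordinary Gibbs updates over the m coupled Trotter slices. A rough sketch under stated assumptions: the toy energy function, the schedule values, and the helper names (f_coupling, qa_st_sweep) are made up for illustration; the paper's experiments use mixture-of-Gaussians and LDA energies instead.

```python
import numpy as np

rng = np.random.default_rng(0)

def f_coupling(beta, gamma, m, K):
    """f(beta, Gamma) = log((a + b)/b) with a = exp(-beta*Gamma/m)
    and b = a*(a**(-K) - 1)/K, as on the Suzuki-Trotter slide."""
    a = np.exp(-beta * gamma / m)
    b = a * (a ** (-K) - 1.0) / K
    return np.log((a + b) / b)

def s(sig1, sig2):
    """Fraction of data points with identical labels in two Trotter slices."""
    return float(np.mean(sig1 == sig2))

def qa_st_sweep(sigmas, energy, beta, gamma, K):
    """One Gibbs sweep over p_QA-ST: each site feels the classical term
    at inverse temperature beta/m plus a coupling f/N per agreeing
    Trotter neighbor.  sigmas: (m, N) int array, one assignment per slice."""
    m, N = sigmas.shape
    f = f_coupling(beta, gamma, m, K)
    for j in range(m):
        up, dn = sigmas[(j + 1) % m], sigmas[(j - 1) % m]  # periodic in Trotter direction
        for i in range(N):
            logp = np.empty(K)
            for k in range(K):
                sigmas[j, i] = k
                logp[k] = (-(beta / m) * energy(sigmas[j])
                           + (f / N) * (int(up[i] == k) + int(dn[i] == k)))
            p = np.exp(logp - logp.max())
            sigmas[j, i] = rng.choice(K, p=p / p.sum())
    return sigmas

# Toy problem (illustrative): each point "wants" a fixed label.
target = np.array([0] * 5 + [1] * 5)
energy = lambda sig: float(np.sum(sig != target))

m, N, K = 8, 10, 2
sigmas = rng.integers(0, K, size=(m, N))
# Anneal: raise beta while lowering Gamma, so the slices decouple early
# (exploration) and are pulled together late (agreement).
for beta, gamma in zip(np.linspace(1.0, 20.0, 60), np.logspace(1, -2, 60)):
    sigmas = qa_st_sweep(sigmas, energy, beta, gamma, K)
```

With f = 0 every slice is just an independent SA chain at inverse temperature β/m; the coupling term is what lets well-clustered sites propagate between slices.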
25. Picture of exchange Monte Carlo (replica exchange): replicas (1, 2, …, m) at different temperatures swap configurations every few MCS [K. Hukushima and K. Nemoto, J. Phys. Soc. Jpn. Vol. 65, 1604 (1996)].
26. Picture of the proposed method: the m Trotter slices interact through the quantum coupling [K. Kurihara, S. Tanaka, and S. Miyashita, UAI2009]; a related multi-replica idea is Quantum Annealing Correction (QAC) [K. Pudenz et al., arXiv:1408.4382].
27. Experimental setup: Trotter number m = 50; starting from t = 0, Γ is decreased as Γ(t) = Γ_0 e^{-rt}, and the slices are sampled from p_QA-ST(σ1, σ2, …, σm; β, Γ) = (1/Z) Π_{j=1}^m p_SA(σj; β/m) e^{s(σj, σj+1) f(β, Γ)}, with s(σj, σj+1) = (1/N) Σ_{i=1}^N δ(σ_{j,i}, σ_{j+1,i}) and f(β, Γ) = log((a + b)/b), a = e^{-βΓ/m}, b = a(a^{-K} - 1)/K.
28. Sketch of the claim: SA gets stuck in a single valley, while QA's coupled replicas reach better optima; two coupling choices, f2 and f*, are compared (figure slide).
29. Results on MNIST with a mixture of Gaussians (MoG): panels show min_j E(σ_j), E_j[E(σ_j)], and E_j[s(σ_j, σ_{j+1})] versus iteration for SA and for QA with r_Γ ∈ {1.02, 1.05, 1.10, 1.20} and coupling f2 or f*; QA with f* found lower energies than SA at essentially the same cost (CPU time: SA 22.0 hours, QA 21.7 hours).
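The role of the schedule on slide 27 can be checked numerically: as Γ shrinks, the Trotter coupling f(β, Γ) grows, pulling the m slices toward agreement, which matches the E_j[s(σ_j, σ_{j+1})] curves rising toward 1 in the result figures. A small sketch; the values Γ_0 = 10, r = 0.1, β = 50 are illustrative, and f is the expression from slide 22.

```python
import math

def f_coupling(beta, gamma, m=50, K=2):
    """Trotter coupling f(beta, Gamma) = log((a + b)/b),
    with a = exp(-beta*Gamma/m) and b = a*(a**(-K) - 1)/K (slide 22)."""
    a = math.exp(-beta * gamma / m)
    b = a * (a ** (-K) - 1.0) / K
    return math.log((a + b) / b)

gamma0, r, beta = 10.0, 0.1, 50.0          # illustrative schedule parameters
schedule = [(t, gamma0 * math.exp(-r * t)) for t in range(0, 101, 25)]
fs = [f_coupling(beta, g) for _, g in schedule]
for (t, g), f in zip(schedule, fs):
    print(f"t={t:3d}  Gamma={g:10.4g}  f={f:10.4g}")
```

Early on (large Γ) f is negligible and the slices behave like independent SA runs; by the end (small Γ) f is large and the slices are forced to agree on a single assignment.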
30. Results on Reuters with LDA: the same three panels (min_j E(σ_j), E_j[E(σ_j)], E_j[s(σ_j, σ_{j+1})] versus iteration) for SA and for QA with r_Γ ∈ {1.02, 1.05, 1.10, 1.20} and coupling f2 or f*; QA with f* again beat SA (CPU time: SA 10.0 hours, QA 9.9 hours).
31. Results on NIPS with LDA: likewise, QA with f* found the lowest energies (CPU time: SA 62.8 hours, QA 62.5 hours). Figure 6 (comparison between SA and QA varying the annealing schedule): r_Γ, f2, and f* in the legends correspond to the schedule parameters; the left-most column shows what SA and QA found; QA with f* always found better results than SA.
32. Discussion and future work (from the paper): it is worth trying to develop QA-based algorithms for different models, e.g. Bayesian networks, by choosing a different quantum effect H_q; the proposed algorithm looks like genetic algorithms in terms of running multiple instances, and studying their relationship is also interesting future work. References shown on the slide:
  - Tadashi Kadowaki and Hidetoshi Nishimori. Quantum annealing in the transverse Ising model. Physical Review E, 58:5355-5363, 1998.
  - S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671-680, 1983.
  - Percy Liang, Michael I. Jordan, and Ben Taskar. A permutation-augmented sampler for DP mixture models. In ICML, pages 545-552. Omnipress, 2007.
33. Summary: QA with the f* coupling found better clustering assignments than SA at essentially the same computational cost (figure slide).
34. Concluding remarks (figure-only slide).
35. Thank you! Kenichi Kurihara, Shu Tanaka, and Seiji Miyashita, "Quantum Annealing for Clustering", UAI2009.