解決正確的問題 - 如何讓數據發揮影響力?

69
解解解解解解解解解解解解解解解解解Pei-shen Wu, MD [email protected]

Transcript of 解決正確的問題 - 如何讓數據發揮影響力?

Page 1: 解決正確的問題 - 如何讓數據發揮影響力?

解決正確的問題:如何讓數據發揮影響力?Pei-shen Wu, MD

[email protected]

Page 2: 解決正確的問題 - 如何讓數據發揮影響力?

Outline• 是什麼阻礙了組織運用數據資料的力量?• 怎麼解決?從 practitioner的角度出發• 重新思考數位時代下的學習

Page 3: 解決正確的問題 - 如何讓數據發揮影響力?

資料分析需求的分佈

因果關係驗證

資料探索、假設建立、相關性探究大多數的分析需求在這

Page 4: 解決正確的問題 - 如何讓數據發揮影響力?

與資料人員的對話「近期增加了產品品項,但客戶人數不見起色,想問最近用戶的活躍狀況?」「這些用戶的狀況跟去年同期比較如何?」「這些用戶是否有地區差異?」「與這些用戶同班的用戶的狀況?」…

Page 5: 解決正確的問題 - 如何讓數據發揮影響力?

回應單一個問題需要的功夫SELECT keyID,userID,dot_email,gender,points,user_nickname,username,userEmail,userSendableEmail,userRole,userCity,userSchool,userGrade,joinedTime,userBirthdate,match_score FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,pre.userEmail AS userEmail,pre.userSendableEmail AS userSendableEmail,pre.userRole AS userRole,pre.userCity AS userCity,pre.userSchool AS userSchool,pre.userGrade AS userGrade,pre.joinedTime AS joinedTime,pre.userBirthdate AS userBirthdate,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,post.userEmail AS userEmail,post.userSendableEmail AS userSendableEmail,post.userRole AS userRole,post.userCity AS userCity,post.userSchool AS userSchool,post.userGrade AS userGrade,post.joinedTime AS joinedTime,post.userBirthdate AS userBirthdate FROM (SELECT * FROM (SELECT keyID, userID, dot_email, gender, points, user_nickname, username FROM (SELECT __key__.name AS keyID, user_id AS userID, user_email AS underline_email, current_user.email AS cu_email, user.email AS dot_email, gender, points, user_nickname, username FROM junyi_20161212.UserData_20161212)) AS pre INNER JOIN EACH (SELECT userEmail, userSendableEmail, userId AS userID, userRole, userCity, userSchool, userGrade, joinedTime, userBirthdate FROM FinalTable.UserFinalTmpInfo) AS post ON pre.userID = post.userID)) AS pre INNER JOIN EACH (SELECT keyID,match_score FROM (SELECT keyID, match_score FROM (SELECT keyID,prob1*prob2 AS match_score FROM (SELECT pre.keyID AS keyID,pre.prob1 AS prob1,post.prob2 AS prob2 FROM (SELECT * FROM (SELECT keyID, prob1 FROM (SELECT keyID, match_score AS prob1 FROM (SELECT keyID, match_score FROM (SELECT keyID, match_score_A*cdf AS match_score FROM (SELECT keyID, MAX(output) AS cdf, match_score_A FROM (SELECT keyID, IF(metric >= input,1,0) AS compare, output, match_score_A FROM (SELECT pre.keyID AS keyID,pre.comparekey AS comparekey,pre.metric AS metric,pre.match_score_A AS match_score_A,post.input AS input,post.output AS output FROM (SELECT * FROM (SELECT keyID, 1 AS comparekey, metric, match_score AS match_score_A FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT teacher_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT classID, MAX(match_score) AS match_score FROM (SELECT classID,match_score FROM (SELECT classID, MAX(match_score) AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT keyID AS student_keyID, match_score FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID)) AS post ON pre.student_keyID = post.student_keyID)) GROUP BY classID)) GROUP BY classID) AS post ON pre.classID = post.classID)) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed, DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12')) GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))) AS pre INNER JOIN EACH (SELECT input, AVG(output) AS output, 1 AS comparekey FROM ( SELECT input, output/101 AS output FROM ( SELECT input, ROW_NUMBER() OVER() AS output FROM ( FLATTEN(( SELECT QUANTILES(metric,101) AS input FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed, DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12')) GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))), input)))) GROUP BY input) AS post ON pre.comparekey = post.comparekey))) WHERE compare == 1 GROUP BY keyID, match_score_A))))) AS pre INNER JOIN EACH (SELECT keyID, prob2 FROM (SELECT keyID, match_score AS prob2 FROM (SELECT keyID, match_score FROM (SELECT keyID, timing_match_denumerator/timing_match_numerator AS match_score FROM (SELECT keyID, SUM(match_condition) OVER (PARTITION BY keyID) AS timing_match_denumerator, COUNT(match_condition) OVER (PARTITION BY keyID) AS timing_match_numerator, FROM (SELECT keyID, date, month, weekday, hour, minute, IF((hour == 9 OR hour == 10 OR hour == 11 OR hour == 12 OR hour == 13 OR hour == 14 OR hour == 15),1,0) AS match_condition FROM (SELECT keyID, date, month,

Page 6: 解決正確的問題 - 如何讓數據發揮影響力?

試圖以三週努力滿足三秒的好奇

Page 7: 解決正確的問題 - 如何讓數據發揮影響力?

回答速度趕不上問題產生的速度

Page 8: 解決正確的問題 - 如何讓數據發揮影響力?

資料人員第一線人員

Page 9: 解決正確的問題 - 如何讓數據發揮影響力?

是否可以更快?

需求提出 規格確認 初步實作 驗證除錯

Page 10: 解決正確的問題 - 如何讓數據發揮影響力?
Page 11: 解決正確的問題 - 如何讓數據發揮影響力?

Scale in complexity,not size

Page 12: 解決正確的問題 - 如何讓數據發揮影響力?

IT-centric Business Intelligence

因果關係驗證

資料探索、假設建立、相關性探究

資料人員

第一線人員

Page 13: 解決正確的問題 - 如何讓數據發揮影響力?

User-centric Business Intelligence

因果關係驗證

資料探索、假設建立、相關性探究

資料人員

第一線人員

Page 14: 解決正確的問題 - 如何讓數據發揮影響力?

市場上既已存在工具方案 (eg. Tableau)又何必自行研究?

Page 15: 解決正確的問題 - 如何讓數據發揮影響力?

Complexity is both domain + infrastructure

specific

Page 16: 解決正確的問題 - 如何讓數據發揮影響力?

How do you encode & implement this part?

Page 17: 解決正確的問題 - 如何讓數據發揮影響力?

即便是用 Google Bigquery仍是如此SQL 是

Business Logic 的組合語言

Page 18: 解決正確的問題 - 如何讓數據發揮影響力?

如果可以這樣該有多麼美好?SELECT keyID,userID,dot_email,gender,points,user_nickname,username,userEmail,userSendableEmail,userRole,userCity,userSchool,userGrade,joinedTime,userBirthdate,match_score FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,pre.userEmail AS userEmail,pre.userSendableEmail AS userSendableEmail,pre.userRole AS userRole,pre.userCity AS userCity,pre.userSchool AS userSchool,pre.userGrade AS userGrade,pre.joinedTime AS joinedTime,pre.userBirthdate AS userBirthdate,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,post.userEmail AS userEmail,post.userSendableEmail AS userSendableEmail,post.userRole AS userRole,post.userCity AS userCity,post.userSchool AS userSchool,post.userGrade AS userGrade,post.joinedTime AS joinedTime,post.userBirthdate AS userBirthdate FROM (SELECT * FROM (SELECT keyID, userID, dot_email, gender, points, user_nickname, username FROM (SELECT __key__.name AS keyID, user_id AS userID, user_email AS underline_email, current_user.email AS cu_email, user.email AS dot_email, gender, points, user_nickname, username FROM junyi_20161212.UserData_20161212)) AS pre INNER JOIN EACH (SELECT userEmail, userSendableEmail, userId AS userID, userRole, userCity, userSchool, userGrade, joinedTime, userBirthdate FROM FinalTable.UserFinalTmpInfo) AS post ON pre.userID = post.userID)) AS pre INNER JOIN EACH (SELECT keyID,match_score FROM (SELECT keyID, match_score FROM (SELECT keyID,prob1*prob2 AS match_score FROM (SELECT pre.keyID AS keyID,pre.prob1 AS prob1,post.prob2 AS prob2 FROM (SELECT * FROM (SELECT keyID, prob1 FROM (SELECT keyID, match_score AS prob1 FROM (SELECT keyID, match_score FROM (SELECT keyID, match_score_A*cdf AS match_score FROM (SELECT keyID, MAX(output) AS cdf, match_score_A FROM (SELECT keyID, IF(metric >= input,1,0) AS compare, output, match_score_A FROM (SELECT pre.keyID AS keyID,pre.comparekey AS comparekey,pre.metric AS metric,pre.match_score_A AS match_score_A,post.input AS input,post.output AS output FROM (SELECT * FROM (SELECT keyID, 1 AS comparekey, metric, match_score AS match_score_A FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT teacher_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT classID, MAX(match_score) AS match_score FROM (SELECT classID,match_score FROM (SELECT classID, MAX(match_score) AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT keyID AS student_keyID, match_score FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID)) AS post ON pre.student_keyID = post.student_keyID)) GROUP BY classID)) GROUP BY classID) AS post ON pre.classID = post.classID)) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed, DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12')) GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))) AS pre INNER JOIN EACH (SELECT input, AVG(output) AS output, 1 AS comparekey FROM ( SELECT input, output/101 AS output FROM ( SELECT input, ROW_NUMBER() OVER() AS output FROM ( FLATTEN(( SELECT QUANTILES(metric,101) AS input FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed, DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12')) GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))), input)))) GROUP BY input) AS post ON pre.comparekey = post.comparekey))) WHERE compare == 1 GROUP BY keyID, match_score_A))))) AS pre INNER JOIN EACH (SELECT keyID, prob2 FROM (SELECT keyID, match_score AS prob2 FROM (SELECT keyID, match_score FROM (SELECT keyID, timing_match_denumerator/timing_match_numerator AS match_score FROM (SELECT keyID, SUM(match_condition) OVER (PARTITION BY keyID) AS timing_match_denumerator, COUNT(match_condition) OVER (PARTITION BY keyID) AS timing_match_numerator, FROM (SELECT keyID, date, month, weekday, hour, minute, IF((hour == 9 OR hour == 10 OR hour == 11 OR hour == 12 OR hour == 13 OR hour == 14 OR hour == 15),1,0) AS match_condition FROM (SELECT keyID, date, month,

class.students <- con_relationshipByClass(dataset, outputUsers.role = "student")

class.teachers <- con_relationshipByClass(dataset, outputUsers.role = "teacher”, student_UserObj = class.students)

Regular_Preparation_Public.teachers <- UserObj_ANDmerge(con_probability_AgreaterthanB(dataset,UserObj.A = class.teachers,UserObj.B = class.students,metric = "seconds”, video = T, prob = T,window.begin = window.begin, window.end = window.end),

con_activity_timing(dataset, prob = T, video = T,complete.mission = T, assign.mission = T,hour = 9:15, window.begin = window.begin, window.end = window.end))

Regular_Preparation_Public.teachers <- view_User_FinalTable(dataset, Regular_Preparation_Public.teachers, output = "data")

Page 19: 解決正確的問題 - 如何讓數據發揮影響力?

如果可以這樣該有多麼美好?class.students <- con_relationshipByClass(dataset, outputUsers.role = "student")

class.teachers <- con_relationshipByClass(dataset, outputUsers.role = "teacher”, student_UserObj = class.students)

Regular_Preparation_Public.teachers <- UserObj_ANDmerge(con_probability_AgreaterthanB(dataset,UserObj.A = class.teachers,UserObj.B = class.students,metric = "seconds”, video = T, prob = T,window.begin = window.begin, window.end = window.end),

con_activity_timing(dataset, prob = T, video = T,complete.mission = T, assign.mission = T,hour = 9:15, window.begin = window.begin, window.end = window.end))

Regular_Preparation_Public.teachers <- view_User_FinalTable(dataset, Regular_Preparation_Public.teachers, output = "data")

有班級的使用者( 學生 ) 所屬的老師

ViewFunction

比較 A 與 B課堂時間影片跟習題的秒數回傳 A

Page 20: 解決正確的問題 - 如何讓數據發揮影響力?

以邏輯模組來操作已建置的 SQL

有班級的使用者( 學生 ) 所屬的老師

ViewFunction

比較 A 與 B課堂時間影片跟習題的秒數回傳 A

AB

Page 21: 解決正確的問題 - 如何讓數據發揮影響力?

邏輯模組的其他好處• Reusable code• Focus on business logic itself• Enabling very complex queries = Allowing rapid hypothesis generation and alignment with data

• Makes validation & debugging easier

Page 22: 解決正確的問題 - 如何讓數據發揮影響力?

Having data is not enough

Page 23: 解決正確的問題 - 如何讓數據發揮影響力?

Too much data is

Counter-productive

Lost in strange trends (not big pictures)and on numbers that lead to No action

Page 24: 解決正確的問題 - 如何讓數據發揮影響力?
Page 25: 解決正確的問題 - 如何讓數據發揮影響力?

It makes no senseif you can’t do or

no need to do anything about it.

Page 26: 解決正確的問題 - 如何讓數據發揮影響力?

7 天前來過,每次平均消費 3000NTD總共來過 4 次Recency = 7

Frequency = 4Monetary Value = 3000

Page 27: 解決正確的問題 - 如何讓數據發揮影響力?

Monetary Value

Frequency

常來 + 每次來花很多

少來 + 做不到業績

偶爾來一旦來就花很多

常來但每次來都很省

Page 28: 解決正確的問題 - 如何讓數據發揮影響力?

Monetary Value

Frequency

常來 + 每次來花很多

少來 + 做不到業績

偶爾來一旦來就花很多

常來但每次來都很省

來店有禮 吸引你常來蒞臨

Page 29: 解決正確的問題 - 如何讓數據發揮影響力?

Monetary Value

Frequency

常來 + 每次來花很多

少來 + 做不到業績

偶爾來一旦來就花很多

常來但每次來都很省

買滿千送百 給你湊數的誘因

Page 30: 解決正確的問題 - 如何讓數據發揮影響力?

同一件商品,若可以辨識你是價格不敏感者 賣高價品價格敏感者 打折賣

= “Actions we can take”

Page 31: 解決正確的問題 - 如何讓數據發揮影響力?

其他可以用來辨識「價格敏感度」的資料 ?

Page 32: 解決正確的問題 - 如何讓數據發揮影響力?

收入?學生?資產狀況小孩人數?Coupon的使用?是否是當地人?品牌忠誠度?市場熟悉程度?同類型產品,買哪種價位的?

Page 33: 解決正確的問題 - 如何讓數據發揮影響力?

Ignore coupons Offer high price

MaximizeRevenue

Theory: 消費行為受 Price-sensitivity 影響

“Data” “Action” “Outcome”

Page 34: 解決正確的問題 - 如何讓數據發揮影響力?

要倒回來想:主詞是能做什麼(Action)才能產生想要的

Outcome?

Data Action Outcome

Page 35: 解決正確的問題 - 如何讓數據發揮影響力?

Asking a relevant questionis domain specific skill

Page 36: 解決正確的問題 - 如何讓數據發揮影響力?
Page 37: 解決正確的問題 - 如何讓數據發揮影響力?

Take Action = P( 生理數據 ) 非專業人士:

Page 38: 解決正確的問題 - 如何讓數據發揮影響力?

Take Action = P( 生理數據 | 病史 ) 專業人士:

Page 39: 解決正確的問題 - 如何讓數據發揮影響力?

專業人士:P( HR<60 | 不規律呼吸 , 高血壓) = 腦壓上昇

醫療文獻稱這三者合併出現為 Cushing’s reflex

Page 40: 解決正確的問題 - 如何讓數據發揮影響力?

善用資料改善線上教育 slide (https://goo.gl/Yd6ajF)

Page 41: 解決正確的問題 - 如何讓數據發揮影響力?

善用資料改善線上教育 slide (https://goo.gl/Yd6ajF)

Page 42: 解決正確的問題 - 如何讓數據發揮影響力?

https://www.junyiacademy.org/

Page 43: 解決正確的問題 - 如何讓數據發揮影響力?

https://www.junyiacademy.org/

Page 44: 解決正確的問題 - 如何讓數據發揮影響力?

https://www.junyiacademy.org/

Page 45: 解決正確的問題 - 如何讓數據發揮影響力?

善用資料改善線上教育 slide (https://goo.gl/Yd6ajF)

Page 46: 解決正確的問題 - 如何讓數據發揮影響力?

老師及所屬學生的使用行為分析 翻轉教學成就孩子Theory: 好的 practice是可以被複製跟運用的

“Data” “Action” “Outcome”

找出領頭羊以便觀摩

Page 47: 解決正確的問題 - 如何讓數據發揮影響力?
Page 48: 解決正確的問題 - 如何讓數據發揮影響力?

醫師 老師病史詢問、各式檢查執行治療規劃治療方針衛教

評估先備知識安排學習情境、教材評估學習成效給予指導、輔導跟支持選擇教學策略

Page 49: 解決正確的問題 - 如何讓數據發揮影響力?

Data Action Outcome

What is the theory?What cause-effect is already known?

Page 50: 解決正確的問題 - 如何讓數據發揮影響力?

必須深入有關學習的理論才能有更多可能性

Page 51: 解決正確的問題 - 如何讓數據發揮影響力?

學生學習的行為 成就孩子

Learning theories

“Data” “Action” “Outcome”

協助挑選合適的教學策略

Page 52: 解決正確的問題 - 如何讓數據發揮影響力?

三種主流的觀點BehaviorismCognitivismConstructivism

數據最容易應用的思維

Page 53: 解決正確的問題 - 如何讓數據發揮影響力?

Behaviorism 行為主義刺激 行為 後果

Page 54: 解決正確的問題 - 如何讓數據發揮影響力?

刺激 行為 後果古典制約 Classical Conditioning

Page 55: 解決正確的問題 - 如何讓數據發揮影響力?

刺激 行為 後果操作制約 Operant Conditioning

Page 56: 解決正確的問題 - 如何讓數據發揮影響力?
Page 57: 解決正確的問題 - 如何讓數據發揮影響力?

行為主義的觀點Feedback loops + 合適的 Reward搭配可以系統化的影響學習行為

Page 58: 解決正確的問題 - 如何讓數據發揮影響力?

Behaviorism 行為主義• 被動的接受知識• 遺忘是因為久了沒用• 學習就是–行為機率變化 : P(刺激 ) = 行為

• Generalization是因為刺激源類似• 強調practice、 cues、 hints、 rewards …

Page 59: 解決正確的問題 - 如何讓數據發揮影響力?

行為主義的侷限認知資源低的學習 = 很強大、很有效–寫選擇題、填空、背誦、配對比對

認知資源高的學習 = 解釋力很差–學會能賴以工作的外語–做別人沒做過的事、 ill-defined problems–創作、申論題

Page 60: 解決正確的問題 - 如何讓數據發揮影響力?

What can we learn from other theories?

Cognitivism 認知主義–內在 knowledge state變化 State & Path analysis

學習者必須主動的把知識 encode到記憶中–強調新知識跟舊知識的關係 倒帶行為

Page 61: 解決正確的問題 - 如何讓數據發揮影響力?

0:10 – 0:60Peak at 0:30

2:00-2:20

3:50 -

Page 62: 解決正確的問題 - 如何讓數據發揮影響力?

「質數判別方法」影片劇本• 找出 1 to 10哪些是質數?– [0:10] 1 不是質數,也不是合數– [0:25] 合數就是除了質數以外的數– [0:40] 0也不是質數,也不是合數– [0:50] 2 是質數唯一的偶數,其他質數都是奇數

• [2:00] 再次強調,只有 2 是偶數的質數• [2:10] 13 是否為質數?• [3:50] 33 是否為質數?• [4:10] 一個數一定可以被自己跟 1整除• [4:30] 33 可以被 3整除,所以不是質數

Page 63: 解決正確的問題 - 如何讓數據發揮影響力?

把影片內容依主題切成九段

Page 64: 解決正確的問題 - 如何讓數據發揮影響力?

從頭看到尾,只看一次 只看前面的觀念,不做 13,33的練習

前面的觀念看兩次後,才做 13,33練習 重複從頭到尾看很多次

Page 65: 解決正確的問題 - 如何讓數據發揮影響力?

能對教材「倒帶」「暫停」「跳過」是少數能呈現學生對自己的學習做選擇的結果

Page 66: 解決正確的問題 - 如何讓數據發揮影響力?

學生所做的選擇反映他如何建構他自己的知識

Page 67: 解決正確的問題 - 如何讓數據發揮影響力?

Issues not addressed by Behaviorism

Context 情境Intrinsic motivation(內在動機 )–Competence = 成就感–Autonomy = 自我控制感,可以自己做選擇–Relatedness = 社群、夥伴、團隊

Page 68: 解決正確的問題 - 如何讓數據發揮影響力?

使用者 long-term engagement 關鍵1.社群的存在2.內在動機驅動3.Experience本身會 evolve over time ,而非短期的 funnel 或 habit loops

Page 69: 解決正確的問題 - 如何讓數據發揮影響力?

創造一個會隨時間逐漸 unfold 的體驗一個會隨使用者的投入與技能成長改變的體驗