O2O
2014- 2014- 4 2010- 1
21097075
POI/DEAL
O2O
smartbox
POI/DEAL
KTV
- 1-2
- 3-5
- 30%+3000w
- Top495%
[Chart: y-axis from 30.00% to 55.00%; x-axis periods 20150101-20150116, 20150131-20150215, 20150302-20150317, 20150401-20150416, 20150501-20150516, 20150531-20150615, 20150701-20150716, 20150801-20150816, 20150831-20150915, 20150930-20151015]
smartbox
-
JOIN
POI/DEAL
LABEL
-
KV
hadoop
hive
spark
-
QUERY-DEAL CTR
POI-USER
DEAL-USER
QUERY CTR CVR
DEAL CTR CVR Isnew
USER
POI WIFI DEAL
QUERY-POI
-
querybrandname
POI
POI
-
1
1
2 2
- Application
Flume Impression Log
Labeled Data
Order/Click Log
Deal DB
Model 1
Feature
Offline Training
Model 2
Online Training
Rerank Service
DEBUG
ABTEST
--
protoBufferLib
serialize
KV
deserialize
OnlineFeatureMap
-
rank
Online learning GBDT
GBDT
LR
-
CTR: P(click|show)
action: P(action|click)
P(instant)
E&E
POI/DEAL ?
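As a rough illustration of how the probabilities listed above could feed a ranking score, the sketch below estimates P(click|show) and P(action|click) from raw counts and multiplies them. The multiplicative combination, the smoothing priors, the function names, and the sample counts are all assumptions made for illustration; the slide does not say how the estimates are actually combined.

```python
def smoothed_rate(successes, trials, prior_rate=0.05, prior_weight=20.0):
    """Rate estimate pulled toward prior_rate when there is little data.

    prior_rate and prior_weight are illustrative values, not from the source.
    """
    return (successes + prior_rate * prior_weight) / (trials + prior_weight)


def expected_action_score(shows, clicks, actions):
    """P(click|show) * P(action|click), both estimated from counts."""
    ctr = smoothed_rate(clicks, shows)                     # P(click | show)
    cvr = smoothed_rate(actions, clicks, prior_rate=0.1)   # P(action | click)
    return ctr * cvr


def rank(items):
    """Sort item ids by descending score.

    items maps an id to a (shows, clicks, actions) tuple.
    """
    return sorted(items, key=lambda k: expected_action_score(*items[k]), reverse=True)


if __name__ == "__main__":
    stats = {
        "deal_a": (1000, 100, 20),
        "deal_b": (1000, 50, 5),
    }
    print(rank(stats))
```

The smoothing keeps a brand-new POI/DEAL with no impressions from getting an undefined or extreme score, which is one simple way to approach the cold-start question raised on the slide.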
-
hadoop
hive
spark
storm
ATP
hdfs
hive
hbase
tair
updater
T
-
Geohash
6
3
1
2
X
wifi
Geohash
POI
DEAL
QUERY X
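For readers unfamiliar with Geohash, the following is a minimal sketch of the standard encoding, which interleaves longitude and latitude bits and emits one base-32 character per five bits. The default precision of 6 characters is an assumption about what the "6" on the slide refers to; the function name is hypothetical.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a Geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # Geohash starts with a longitude bit, then alternates
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2.0
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid   # keep the upper half of the interval
        else:
            bits = bits << 1
            rng[1] = mid   # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


if __name__ == "__main__":
    print(geohash_encode(57.64911, 10.40744))
```

Because nearby points share a prefix, a Geohash cell can serve as a key for aggregating POI, DEAL, or QUERY statistics by location, which appears to be the use suggested by the slide.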
-
POI
-
Label
KV
1
2
3
4
N+1
N+2
N
GBDT
LR
-
2
storm
LR SVM
MinMax Standard
API Command
Chi squared
Topic Model GBDT
Mutual Information
DEBUG
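The MinMax and Standard normalizations named above can be sketched in a few lines. This is a plain-Python illustration with hypothetical function names, not the toolkit's own API; the handling of constant columns (returning zeros) is an assumption.

```python
import math


def min_max_scale(values):
    """Rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant feature: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:                     # constant feature: nothing to scale
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    print(min_max_scale([10, 20, 30]))
    print(standard_scale([1, 2, 3]))
```

Linear models such as LR and SVM are sensitive to feature scale, which is presumably why these transforms are listed next to them, whereas tree models such as GBDT are largely unaffected.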
-GBDT
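Tree 1, Tree 2, ..., Tree M

Hypothesis:
\[ h_M(x) = \sum_{m=1}^{M} \beta_m T(x; \Theta_m) \]

Loss Function:
\[ \min_{h \in H} L(h) = \min_{h \in H} \frac{1}{2} \sum_{i=1}^{N} \left( y_i - h(x_i) \right)^2 \]

Update Function:
\[ \{\beta_m, \Theta_m\} = \arg\min_{\beta_m, \Theta_m} \sum_{i=1}^{N} L\left( y_i,\ h_{m-1}(x_i) + \beta_m T(x_i; \Theta_m) \right) \]

Here T(x; \Theta_m) is the m-th tree with parameters \Theta_m, and \beta_m is its weight.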
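To make the hypothesis, loss, and update function listed above concrete, here is a minimal gradient-boosting sketch for squared loss on one-dimensional inputs. Each round fits a depth-1 tree (a stump) to the current residuals, which for squared loss are the negative gradient, and adds it with a fixed shrinkage factor in place of a fitted per-tree weight. The stump learner, the function names, and the toy data are assumptions for illustration, not the implementation behind the slides.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump: returns (threshold, left_value, right_value)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:          # candidate splits: x <= threshold
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:                                # all inputs identical: constant tree
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def train_gbdt(xs, ys, iterations=100, shrinkage=0.2):
    """Return (base, shrinkage, trees), so h_M(x) = base + shrinkage * sum_m T_m(x)."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees = []
    for _ in range(iterations):
        residuals = [y - p for y, p in zip(ys, preds)]   # negative gradient of 1/2 (y - h)^2
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        preds = [p + shrinkage * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, shrinkage, trees


def predict(model, x):
    base, shrinkage, trees = model
    return base + shrinkage * sum(predict_stump(t, x) for t in trees)


if __name__ == "__main__":
    model = train_gbdt([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5], iterations=50)
    print(predict(model, 2), predict(model, 5))
```

With a shrinkage of 0.2 each round removes only a fifth of the remaining residual, so more iterations are needed, but the fit is less prone to chasing noise in any single tree.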
-GBDT Command sample
./gbdt_train -train trainset -test testset -model model_path -conf config_file
arguments (conf/gbdt.conf)
dim=100
depth=4
iterations=100
shrinkage=0.2
fratio=1.0
dratio=1.0
maxBins=40            # bin
loss=LogLikelihood    # loglikelihood, square error
init=false
hasInitValue=false
debug=true            # debug
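A configuration file of this kind is easy to read programmatically. The sketch below assumes one key=value pair per line with optional # comments; that layout, the function names, and the type coercion rules are assumptions, since the slides do not show how gbdt_train itself parses the file.

```python
def _coerce(value):
    """Convert a raw string to bool, int, or float where possible."""
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value                                   # leave anything else as a string


def parse_conf(text):
    """Parse key=value lines, ignoring blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop trailing comment
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = _coerce(value.strip())
    return settings


if __name__ == "__main__":
    sample = """
    depth=4            # tree depth
    shrinkage=0.2
    maxBins=40
    loss=LogLikelihood
    debug=true
    """
    print(parse_conf(sample))
```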