Complex Event Processing on Ruby, Fluentd and Norikra #rubykaigi
CRUBY + JRUBY
FLUENTD
CEP
NORIKRA
MSGPACK-RPC-OVER-HTTP
LOGGING
STREAM PROCESSING
xQL
ESPER
Saturday, June 1, 2013
Complex Event Processing
on Ruby, Fluentd and Norikra
RubyKaigi 2013 (2013/06/01)
TAGOMORI Satoshi (@tagomoris)
TAGOMORI Satoshi (@tagomoris)
LINE corp.
Ruby, Perl, Node.js, Hadoop, ...
Please call me 'MORIS'!
2013/04- LINE Corporation (+NHN Japan)
2012/01- NHN Japan
-2011/12 livedoor (+NHN Japan +Naver Japan)
My mission: logging
Store access logs / application logs
Calculate & visualize service activities
Build a data warehouse for application engineers' operations
Notify on anomalous service statuses
  for system status (HTTP status, response time, ...)
  for application metrics
Our log traffic
Daily
  1.5+ TB (uncompressed)
  5.6+ billion lines / day
Peak time
  140,000+ lines / sec
  300 Mbps
What we want to do
COUNT PV, UU and others (daily/realtime)
COUNT service metrics (daily/hourly)
FIND surprising errors [4xx, 5xx] (immediately)
CHECK response times (immediately)
SEARCH logs when trouble occurs (hourly/immediately)
VISUALIZE/NOTIFY app status (realtime)
BATCHES AND STREAMS
Batches and Streams
Hadoop is for batches
  High-performance batch processing is important
  HDFS has good performance
Stream log writing and calculations are also VERY VERY IMPORTANT
Hybrid system: stream processing + batch
System Overview
[Architecture diagram. Components: Web Servers, Fluentd Cluster, Archive Storage (scribed), Fluentd Watchers, Graph Tools, Notifications (IRC), Norikra, Hadoop Cluster (HDFS, YARN), webhdfs, Huahin Manager, hiveserver, Shib, ShibUI. Path labels: STREAM, BATCH, SCHEDULED BATCH.]
Stream processing
Parsing logs
Appending flags for analysis
Counting rate/bytes
Calculating system metrics
Calculating application metrics
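As a rough illustration of the first steps listed above (parsing, flagging, counting), here is a minimal sketch in plain Ruby. The space-separated log format, the field names (`path`, `status`, `bytes`, `duration`) and the helpers `parse_line`, `flag_event` and `summarize` are all hypothetical and do not come from the talk; in the system described here these steps run inside Fluentd plugins.

```ruby
# Minimal sketch: parse log lines, append flags, count lines and bytes.
# The log format and every field name below are assumptions for illustration.

LOG_PATTERN = /\A(?<path>\S+) (?<status>\d{3}) (?<bytes>\d+) (?<duration>\d+)\z/

# Parse one line into a Hash, or return nil if the line does not match.
def parse_line(line)
  m = LOG_PATTERN.match(line.chomp)
  return nil unless m
  {
    'path'     => m[:path],
    'status'   => m[:status].to_i,
    'bytes'    => m[:bytes].to_i,
    'duration' => m[:duration].to_i # microseconds (assumed unit)
  }
end

# Append boolean flags that later analysis steps can filter on.
def flag_event(event)
  event.merge(
    'is_error' => event['status'] >= 400,
    'is_slow'  => event['duration'] >= 1_000_000
  )
end

# Count lines, bytes, and errors over a batch of raw lines.
# Malformed lines are skipped.
def summarize(lines)
  events = lines.map { |l| parse_line(l) }.compact.map { |e| flag_event(e) }
  {
    'lines'  => events.size,
    'bytes'  => events.sum { |e| e['bytes'] },
    'errors' => events.count { |e| e['is_error'] }
  }
end

SAMPLE_LINES = [
  '/index.html 200 5120 80000',
  '/api/items 500 312 1500000',
  'this line is malformed'
].freeze

SAMPLE_SUMMARY = summarize(SAMPLE_LINES)
puts SAMPLE_SUMMARY.inspect
```

Dividing the `lines` and `bytes` totals by the length of the collection interval would give the rate figures mentioned on the slide.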
Fluentd
"Fluentd" is a lightweight and flexible log collector. Fluentd receives logs as JSON streams, buffers them, and sends them to other systems like Amazon S3, MongoDB, Hadoop, or other Fluentds.
http://fluentd.org
Fluentd on CRuby
easy to install/setup (from rubygems.org)
plugins
  easy to install (from rubygems.org)
  easy to write (with ruby!)
stability (no crashes in the past year)
throughput (17500 msgs/sec)
td-agent (rpm/deb: ruby and fluentd and some plugins)
Fluentd users
Fluentd: stream aggregation
System metrics: status / response time
Fluentd: stream aggregation

### response time aggregation
<match responsetime.monitor.*>
  type numeric_monitor
  tag monitor.responsetime
  aggregate tag
  unit minute
  monitor_key duration
  percentiles 50,90,95,98,99
</match>

### response time counting
<match responsetime.counter.*>
  type numeric_counter
  tag numcount.responsetime
  aggregate tag
  unit minute
  count_key duration
  pattern1 u100ms 0 100000
  pattern2 u500ms 100000 500000
  pattern3 u1s 500000 1000000
  pattern4 u3s 1000000 3000000
  pattern5 long 3000000
</match>

### HTTP status counting
<match httpstatus.counter.*>
  type datacounter
  tag_prefix datacount.httpstatus
  output_per_tag yes
  aggregate tag
  output_messages yes
  unit minute
  count_key status
  pattern1 2xx ^2\d\d
  pattern2 3xx ^3\d\d
  pattern3 429 ^429
  pattern4 4xx ^4\d\d
  pattern5 5xx ^5\d\d
</match>
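To make the counting configuration above more concrete, the following plain-Ruby sketch imitates what such per-minute bucket counting conceptually computes. It is not the implementation of the numeric_counter or datacounter plugins. The bucket names and limits are copied from the configuration, but the boundary handling (lower bound inclusive, upper bound exclusive), the "first matching pattern wins" rule, and the output key names are assumptions.

```ruby
# Sketch of per-minute bucket counting, loosely modeled on the configuration above.
# Not the plugins' actual code; boundary handling and key names are assumptions.

# [name, low, high]: low inclusive, high exclusive (assumed); nil means unbounded.
DURATION_PATTERNS = [
  ['u100ms', 0,         100_000],
  ['u500ms', 100_000,   500_000],
  ['u1s',    500_000,   1_000_000],
  ['u3s',    1_000_000, 3_000_000],
  ['long',   3_000_000, nil]
].freeze

# [name, regexp] pairs, tested in order; the first match wins (assumed).
STATUS_PATTERNS = [
  ['2xx', /^2\d\d/],
  ['3xx', /^3\d\d/],
  ['429', /^429/],
  ['4xx', /^4\d\d/],
  ['5xx', /^5\d\d/]
].freeze

# Return the name of the duration bucket for a value in microseconds.
def duration_bucket(duration)
  found = DURATION_PATTERNS.find do |_name, low, high|
    duration >= low && (high.nil? || duration < high)
  end
  found ? found[0] : 'unmatched'
end

# Return the name of the first status pattern that matches.
def status_bucket(status)
  found = STATUS_PATTERNS.find { |_name, regexp| regexp.match?(status.to_s) }
  found ? found[0] : 'unmatched'
end

# Count the events of one time unit (for example one minute) per bucket.
def count_buckets(events)
  counts = Hash.new(0)
  events.each do |event|
    counts["duration_#{duration_bucket(event['duration'])}"] += 1
    counts["status_#{status_bucket(event['status'])}"] += 1
  end
  counts
end

MINUTE_EVENTS = [
  { 'status' => 200, 'duration' => 80_000 },
  { 'status' => 429, 'duration' => 120_000 },
  { 'status' => 404, 'duration' => 2_500_000 },
  { 'status' => 503, 'duration' => 4_000_000 }
].freeze

MINUTE_COUNTS = count_buckets(MINUTE_EVENTS)
puts MINUTE_COUNTS.inspect
```

Listing the `429` pattern before `4xx` matters under the first-match assumption: it lets rate-limited responses be counted separately from other client errors.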
break
And more: stream query
Custom plugins: not casual enough
xQL: declarative language
streams processing
for optional data fields
no more schema management
connectivity with Fluentd
Stream query: vs stored data query
No more query wait time
Immediate result for time batch
No more storages
No more query execution management
Once a query is registered, it runs forever
Norikra
Norikra
Full features of Esper on JRuby
Simple RPC: msgpack-rpc-over-http
Simple RPC server: mizuno (jetty + rack)
Simple client library: norikra-client
The same client code works on CRuby and JRuby
Norikra
[Architecture diagram. Components: Norikra Server (on JVM), Norikra Engine, Esper Instance (Query Engine), Type Definition Manager, Output Event Pool, RPC Server mizuno (Jetty + Rack), Rack RPC Handler, NorikraClient (JRuby), NorikraClient (CRuby). Link label: msgpack-rpc-over-http.]
Esper
"Esper and Event Processing Language (EPL) provide a highly scalable, memory-efficient, in-memory computing, SQL-standard, minimal latency, real-time streaming Big Data processing engine for medium to high-velocity and high-variety data."
http://esper.codehaus.org/
Norikra Query: target "sales"

goods_id:5 price:49.8 num:1 shop:"LINE"
goods_id:2 price:12.5 num:3 shop:"Cookpad"
goods_id:4 price:36.6 num:10 shop:"Cookpad"

SELECT shop, sum(price*num) AS amount
FROM sales.win:time_batch(10 minutes)
GROUP BY shop

goods_id:5 price:49.8 num:1 shop:"LINE"
goods_id:2 price:12.5 num:3 shop:"Cookpad" affiliate:"BiS"

SELECT affiliate, count(*) AS cnt
FROM sales.win:time_batch(1 hour)
GROUP BY affiliate
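To show what the first query above would be expected to yield, the sketch below recomputes its result in plain Ruby for the three sample events, under the assumption that all three arrive within the same 10-minute batch window. It only imitates the grouping and summing; it is not how Esper or Norikra evaluates queries, and the helper name `amount_by_shop` is made up for this illustration.

```ruby
# Sketch: the result of
#   SELECT shop, sum(price*num) AS amount ... GROUP BY shop
# for one batch window, recomputed in plain Ruby.
# Assumes all sample events fall into a single 10-minute window.

SALES_EVENTS = [
  { 'goods_id' => 5, 'price' => 49.8, 'num' => 1,  'shop' => 'LINE' },
  { 'goods_id' => 2, 'price' => 12.5, 'num' => 3,  'shop' => 'Cookpad' },
  { 'goods_id' => 4, 'price' => 36.6, 'num' => 10, 'shop' => 'Cookpad' }
].freeze

# Group events by shop and sum price * num within each group.
def amount_by_shop(events)
  events.group_by { |e| e['shop'] }.map do |shop, rows|
    { 'shop' => shop, 'amount' => rows.sum { |r| r['price'] * r['num'] } }
  end
end

SALES_RESULT = amount_by_shop(SALES_EVENTS)
SALES_RESULT.each { |row| puts row.inspect }
```

For the sample data this gives one row per shop, with an amount of about 49.8 for LINE and about 403.5 for Cookpad. With a stream query, a result like this is emitted at the end of every window without re-running anything.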
Norikra query: vs Fluentd custom plugin
SQL!!!
No more restart for new queries
register queries whenever we want
No more private plugins
No more fat Fluentd configurations
fluent-plugin-norikra
Fluentd plugin to use Norikra
Norikra server autostart
Automatically defined targets (similar to tables)
Pre-defined queries for each target
fluent-plugin-norikra
installation
`gem install fluent-plugin-norikra`
configuration
see DEMO
Demo: bootstrap
rbenv shell jruby-1.7.4
gem install norikra
which norikra
rbenv shell 2.0.0-pxxx
gem install fluent-plugin-norikra
vi demo.conf
fluentd -c demo.conf
Demo: query streams
some messages over fluent-cat
register queries with norikra-client
more messages over fluent-cat & norikra-client
Roadmap of Norikra
Roadmap of Norikra
Norikra is still UNDER DEVELOPMENT
Norikra feature updates (JOINs, etc)
Web GUI
query & target list management
save & restore
Distributed & orchestrated nodes
Ruby without Rails
Unbelievable to stop GC!!!!!!!!!!
JRuby
great partner for Java & Rubyists
and for JVM middleware, like Hadoop
Norikra uses Esper's internal API to parse queries
gems across platforms?
CRuby
long-running daemons on CRuby
memory usage is a big problem
SHUT THE FUCK UPAND WRITE SOME QUERY
See also:
http://fluentd.org/
http://fluentd.org/plugin/
https://github.com/tagomoris/norikra
https://github.com/tagomoris/norikra-client
https://github.com/tagomoris/fluent-plugin-norikra
http://esper.codehaus.org/
"Fluentd: The ruby based middleware across the world"
http://www.slideshare.net/tagomoris/fluentd-in-tkrk10
"Log analysis system with Hadoop in livedoor 2013 Winter"
http://www.slideshare.net/tagomoris/log-analysis-with-hadoop-in-livedoor-2013