Hadoop meets Cloud with Multi-Tenancy
![Page 1: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/1.jpg)
Treasure Data
Hadoop meets Cloud with Multi-Tenancy
Kazuki Ohta, Founder and CTO at Treasure Data, Inc.
Hadoop User Group (Hadoopユーザー会) [email protected]
@kzk_mover
Friday, April 5, 13
![Page 2: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/2.jpg)
Who are you?
Kazuki Ohta (太田一樹)
• @kzk_mover, [email protected]
Treasure Data, Inc.
• Chief Technology Officer, Founded July 2011
Hadoop User Group Japan
• One of the founders
• “Hadoop徹底入門” (“A Thorough Introduction to Hadoop”)
Open-Source Enthusiast
• Hadoop, memcached, jemalloc, MongoDB, uim, etc.
![Page 3: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/3.jpg)
[Figure: market map by data volume (threshold: 1 billion entries or 10 TB) and deployment (on-premise vs. cloud). Labels: Enterprise RDBMS, Lightweight RDBMS, DB2, Traditional Data Warehouse, Database-as-a-Service, Big Data-as-a-Service. Market sizes shown: $10B and $34B. © 2012 Forrester Research, Inc. Reproduction Prohibited]
Treasure Data = Cloud + Big Data
![Page 4: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/4.jpg)
What is the Problem?
![Page 5: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/5.jpg)
Big Data? NoSQL?
![Page 6: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/6.jpg)
Too Many Solutions
![Page 7: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/7.jpg)
Hadoop Versions
Too Many Variations (+ Ecosystem)
from http://marblejenka.blogspot.jp/2013/01/hadoop.html
![Page 8: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/8.jpg)
Current Big Data Solutions: ‘Feature Creep’
http://en.wikipedia.org/wiki/Feature_creep
![Page 9: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/9.jpg)
We need a Machete :)
“Machete Design” by James Lindenbaum, Heroku Co-Founder
http://www.youtube.com/watch?v=3BhDLm9jo5Y
EVERYTHING with ONE interface
Simple & Discoverable
![Page 10: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/10.jpg)
‘Simplicity’ itself is a feature :)
by Anand Babu Periasamy, GlusterFS Co-Founder
![Page 11: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/11.jpg)
Next Topic: Cloud?
![Page 12: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/12.jpg)
http://www.saasblogs.com/saas/demystifying-the-cloud-where-do-saas-paas-and-other-acronyms-fit-in/
![Page 13: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/13.jpg)
Battlefield of IaaS Vendors: SCM
[Figure: HW performance / price over time, comparing on-premise with IaaS vendors. Annotations: “Decrease with Moore’s Law”; “Battlefield: Supply Chain Management”.]
In the near future, most hardware buyers will be cloud providers rather than individual companies.
![Page 14: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/14.jpg)
PaaS, SaaS: IT is all about Operation
“With PaaS, you offload your development operations function and have the PaaS provider handle the tools and components required to deploy and manage applications reliably.” - EngineYard
More Sleep, More Value
![Page 15: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/15.jpg)
PaaS/SaaS Battlefield: ‘Time’ is Money
[Figure: customer value over time, starting at sign-up or PO. Curves: “Ideal Expectation” and “Reality (On-Premise)”. The on-premise curve passes through HW/SW selection, PoC, deployment, and upgrade, and becomes obsolete over time.]
![Page 16: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/16.jpg)
Introduction to Treasure Data
![Page 17: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/17.jpg)
Company Overview
[Photo: US team as of July 2012]
![Page 18: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/18.jpg)
Company Overview
Silicon Valley-based Company
• All founders are Japanese: Hironobu Yoshikawa, Kazuki Ohta, Sadayuki Furuhashi
OSS Enthusiasts
• MessagePack, Fluentd, etc.
• Cloud native
![Page 19: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/19.jpg)
Our 50+ Customers – Fortune Global 500 leaders and start-ups including:
250 billion records / month in Feb 2013
2 million jobs executed
![Page 20: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/20.jpg)
Vision: Single Analytics Platform for the World
![Page 21: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/21.jpg)
Investors
• Bill Tai
• Naren Gupta - Nexus Ventures, Director of Red Hat, TIBCO
• Othman Laraki - Former VP Growth at Twitter
• James Lindenbaum, Adam Wiggins, Orion Henry - Heroku Founders
• Anand Babu Periasamy, Hitesh Chellani - Gluster Founders
• Yukihiro “Matz” Matsumoto - Creator of Ruby
• Dan Scheinman - Director of Arista Networks
• + 10 more people
• and... Jerry Yang, Founder of Yahoo!, where Hadoop was invented :)
Check out this morning’s (2013/01/21) Nikkei newspaper (日経新聞)!
![Page 22: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/22.jpg)
Treasure Data’s Philosophy and Architecture
![Page 23: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/23.jpg)
Big Data Adoption Stages
In order of increasing intelligence sophistication, from Reporting (lower stages) to Analytics (higher stages):
• Standard Reports: What happened?
• Ad-hoc Reports: Where?
• Drill Down Query: Where exactly?
• Alerts: Error?
• Statistical Analysis: Why?
• Predictive Analysis: What’s a trend?
• Optimization: What’s the best?
Treasure Data’s FOCUS: Reporting (80% of needs)
![Page 24: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/24.jpg)
Full Stack Support for Big Data Reporting
Our best-in-class architecture and operations team ensure the integrity and availability of your data.
Data from almost any source can be securely and reliably uploaded using td-agent in streaming or batch mode.
Our SQL, REST, JDBC, ODBC and command-line interfaces support all major query tools and approaches.
You can store gigabytes to petabytes of data efficiently and securely in our cloud-based columnar datastore.
![Page 25: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/25.jpg)
Treasure Data = Collect + Store + Query
![Page 26: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/26.jpg)
Example in AdTech: MobFox
1. Europe’s largest independent mobile ad exchange.
2. 20 billion impressions/month (circa Jan. 2013)
3. Serving ads for 15,000+ mobile apps (circa Jan. 2013)
4. Needed Big Data analytics infrastructure ASAP.
![Page 27: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/27.jpg)
Two Weeks From Start to Finish!
![Page 28: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/28.jpg)
Our Value was Proven :)
[Figure: customer value over time, starting at sign-up or PO. The “Reality (On-Premise)” curve passes through HW/SW selection, PoC, deployment, and upgrade, and becomes obsolete over time. The Treasure Data curve is labeled “Simple Interface”.]
Our Value: Save Time!
![Page 29: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/29.jpg)
Architecture Breakdown
Data Collection
• Increasing variety of data sources
• No single data schema
• Lack of a streaming data collection method
• 60% of Big Data project resources are consumed here
Data Store/Analytics
• Remaining complexity in both traditional DWH and Hadoop (very slow time to market)
• Challenges in scaling data volume and expanding cost
Connectivity
• Required to ensure connectivity with existing BI/visualization/apps by JDBC, REST and ODBC
![Page 30: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/30.jpg)
1) Data Collection
• 60% of BI project resources are consumed here
• Most ‘underestimated’ and ‘unsexy’, but MOST important
• Fluentd: a lightweight but robust OSS log collector
• http://fluentd.org/
15:40~ Log analysis system with Hadoop in livedoor 2013, by Satoshi Tagomori @ NHN Japan
16:30~ How to Collect Data into Hadoop (いかにしてHadoopにデータを集めるか), by Sadayuki Furuhashi @ Treasure Data, Inc.
These talks will cover Fluentd :)
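The job of a log collector in this stage is to buffer schema-less events close to the source and forward them in batches. A minimal Python sketch of that buffer-and-flush idea follows. It is not Fluentd's implementation or API: `BufferedCollector`, `flush_size`, and the `sink` callable are hypothetical names chosen for illustration.

```python
import json
import time
from typing import Callable, Dict, List


class BufferedCollector:
    """Toy log collector: buffers schema-less events and flushes them in batches.

    Hypothetical illustration only; this is not Fluentd's API.
    """

    def __init__(self, sink: Callable[[List[str]], None], flush_size: int = 3):
        self.sink = sink              # receives a list of JSON lines per flush
        self.flush_size = flush_size  # flush once this many events are buffered
        self.buffer: List[str] = []

    def emit(self, tag: str, record: Dict) -> None:
        # Each event carries a tag and a timestamp; the record has no fixed schema.
        event = {"tag": tag, "time": int(time.time()), "record": record}
        self.buffer.append(json.dumps(event))
        if len(self.buffer) >= self.flush_size:
            self.flush()

    def flush(self) -> None:
        # Hand the batch to the sink only if there is something to send.
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.sink(batch)


batches: List[List[str]] = []
collector = BufferedCollector(sink=batches.append, flush_size=3)
for i in range(7):
    collector.emit("web.access", {"path": f"/item/{i}", "status": 200})
collector.flush()  # drain the remainder on shutdown

print([len(b) for b in batches])
```

Batching keeps the application's write path cheap and lets the sink (a file, a queue, an upload call) be swapped without touching the code that emits events.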
![Page 31: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/31.jpg)
2) Data Store / Analytics - Columnar Storage
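Why columnar storage suits reporting queries can be shown with a small Python sketch: pivot row-oriented records into one list per column, then aggregate by reading a single column. The sample rows and the `to_columns` helper are assumptions made for illustration, not Treasure Data's storage format.

```python
from typing import Any, Dict, List

# Hypothetical sample events; records need not share the same keys.
rows: List[Dict[str, Any]] = [
    {"user": "a", "country": "JP", "price": 120},
    {"user": "b", "country": "US", "price": 300},
    {"user": "c", "country": "JP", "price": 80},
]


def to_columns(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, List[Any]] = {}
    for index, record in enumerate(records):
        for key, value in record.items():
            # Pad columns that first appear late, so all columns stay aligned.
            columns.setdefault(key, [None] * index).append(value)
        for key in columns:
            if len(columns[key]) <= index:  # key absent from this record
                columns[key].append(None)
    return columns


columns = to_columns(rows)

# SELECT SUM(price): only the "price" column is read; "user" and "country" are untouched.
total_price = sum(v for v in columns["price"] if v is not None)
print(total_price)
```

A row store would have to read every field of every record to answer the same query, which is why column layouts pay off as tables grow wider.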
![Page 32: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/32.jpg)
3) Connectivity
[Figure: connectivity diagram. Components: Query API (REST API, JDBC/ODBC driver, td-command, BI apps), Query Processing Cluster, Treasure Data Columnar Storage. Arrows labeled “Query” and “Result” connect them to a web app, MySQL, and Postgres.]
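One reading of this diagram is that a query result can be written into an ordinary relational table, where existing BI tools and web apps can read it. The Python sketch below illustrates that flow under stated assumptions: an in-memory `sqlite3` database stands in for MySQL or Postgres, and `run_query`, `write_result`, and the sample data are hypothetical, not Treasure Data's API.

```python
import sqlite3
from collections import Counter
from typing import Dict, List, Tuple

# Stand-in for data held in the columnar store (hypothetical sample).
access_log: Dict[str, List] = {
    "path": ["/", "/buy", "/", "/buy", "/"],
    "status": [200, 200, 500, 200, 200],
}


def run_query(columns: Dict[str, List]) -> List[Tuple[str, int]]:
    """Equivalent of: SELECT path, COUNT(1) FROM access_log GROUP BY path."""
    counts = Counter(columns["path"])
    return sorted(counts.items())


def write_result(conn: sqlite3.Connection, result: List[Tuple[str, int]]) -> None:
    # Results land in a plain SQL table that existing tools can query.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS path_counts (path TEXT PRIMARY KEY, hits INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO path_counts VALUES (?, ?)", result)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stands in for a MySQL or Postgres target
write_result(conn, run_query(access_log))
print(conn.execute("SELECT path, hits FROM path_counts ORDER BY path").fetchall())
```

Keeping the heavy scan in the query cluster and exporting only the small aggregate means the downstream database never has to hold the raw data.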
![Page 33: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/33.jpg)
Most Difficult Challenge: Multi-Tenancy
• All customers share the Hadoop clusters (4 data centers)
• Resource sharing (burst cores), rapid improvement, ease of upgrade
[Figure: a Global Scheduler receives job submissions and plan changes and performs on-demand resource allocation across datacenters A, B, C, and D, each running a Local FairScheduler.]
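The slide names the pieces (a global scheduler, local fair schedulers, burst cores, plan changes) but not the algorithms. The Python sketch below shows one plausible policy under stated assumptions: the global step routes a job to the datacenter with the most free cores, and the local step serves each tenant's plan first and then lends idle cores. `Datacenter`, `pick_datacenter`, and `fair_share` are hypothetical names, and this is not Hadoop's FairScheduler code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Datacenter:
    name: str
    capacity: int      # total cores in this datacenter
    running: int = 0   # cores currently in use

    @property
    def free(self) -> int:
        return self.capacity - self.running


def pick_datacenter(datacenters: List[Datacenter]) -> Datacenter:
    """Global step (assumed policy): route a job to the datacenter with the most free cores."""
    return max(datacenters, key=lambda dc: dc.free)


def fair_share(capacity: int, guaranteed: Dict[str, int], demand: Dict[str, int]) -> Dict[str, int]:
    """Local step (assumed policy): serve each tenant's plan first, then lend idle cores.

    Assumes the guaranteed cores of all tenants sum to at most `capacity`.
    """
    # 1. Every tenant gets what it asks for, up to the cores its plan guarantees.
    alloc = {t: min(demand[t], guaranteed.get(t, 0)) for t in demand}
    idle = capacity - sum(alloc.values())
    # 2. Tenants that still want more "burst" into idle cores, one core at a time.
    hungry = sorted(t for t in demand if demand[t] > alloc[t])
    while idle > 0 and hungry:
        for tenant in list(hungry):
            if idle == 0:
                break
            alloc[tenant] += 1
            idle -= 1
            if alloc[tenant] == demand[tenant]:
                hungry.remove(tenant)
    return alloc


datacenters = [Datacenter("A", 10, running=9), Datacenter("B", 10, running=3)]
target = pick_datacenter(datacenters)
plan = {"tenant1": 4, "tenant2": 4}
print(target.name)
print(fair_share(10, plan, {"tenant1": 8, "tenant2": 1}))    # tenant1 bursts past its plan
print(fair_share(10, plan, {"tenant1": 10, "tenant2": 10}))  # contention: idle cores are split
```

Under this policy a plan change is only an update to the `guaranteed` map, which fits the slide's point that sharing clusters makes upgrades and plan changes cheap.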
![Page 34: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/34.jpg)
Conclusion
Big Data is too complex
• Needs Simplicity
• Machete vs. Swiss Army Knife (Feature Creep)
IT is changing
• The value of Software itself is decreasing
• Operation is the key
Treasure Data = Cloud + Big Data
• Currently Focusing on Big Data Reporting
• Instant Value with Simple Interface
![Page 35: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/35.jpg)
We’re Hiring Top Talent, please contact me :)
![Page 36: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/36.jpg)
Appendix
![Page 37: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/37.jpg)
Big Data Market Growth
Big Data Revenue Breakdown (average of IDC, Gartner and Wikibon stats)
CAGR 38%
“More than half a billion dollars in venture capital has been invested in new big data technology.” - Dan Vesset, IDC
“In 2012…BI and Analytics are rated #1 priorities.” - Ravi Kalakota, Gartner
“Big Data is the new definitive source of competitive advantage across all industries.” - Jeff Kelly, Wikibon
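To put the 38% CAGR figure in perspective, the short Python sketch below compounds it over a few horizons. The horizons (1, 3, and 5 years) are assumptions for illustration, since the slide text does not state the period the figure covers, and `compound_growth` is a hypothetical helper.

```python
def compound_growth(cagr: float, years: int) -> float:
    """Return the multiple a quantity reaches after `years` of growth at rate `cagr`."""
    return (1.0 + cagr) ** years


for years in (1, 3, 5):
    print(years, round(compound_growth(0.38, years), 2))
```

At this rate a market grows roughly fivefold in five years.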
![Page 38: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/38.jpg)
Big Data Situation
[Figure: customer value over time, starting at sign-up or PO. Curves: Treasure Data; AWS (EMR, RedShift); on-premise solutions (Software A, Software B), with obsolescence over time.]
![Page 39: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/39.jpg)
Treasure Data Service Architecture
[Figure: service architecture diagram. Data sources: user, Apache, apps, RDBMS, other data sources. Core: Treasure Data columnar data warehouse and Query Processing Cluster (MapReduce jobs). Access: Query API with Hive, Pig (to be supported), JDBC, REST, td-command, BI apps.]
![Page 40: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/40.jpg)
Our Own Open Source Technologies
We are open source natives and proud of our heritage. We’ve contributed to Hibernate, Hadoop, Cassandra, Memcached, KDE, MongoDB among others.
Our product reflects our deep commitment to the open-source community and is built on top of open source software we’ve authored and open sourced.
• Fluentd - a popular data collector daemon written in Ruby. www.fluentd.org (leading users: SlideShare/LinkedIn, One Kings Lane)
• MessagePack - a fast, compact serializer. www.msgpack.org (leading users: Pinterest, Redis)
Substantial commitment (Code, Packaging, Documentation, Sponsorship)
Tech marketing, Possible lead gen
![Page 41: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/41.jpg)
Example in Web Industry
![Page 42: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/42.jpg)
Example Use Case – MySQL to TD
![Page 43: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/43.jpg)
Example Use Case – MySQL to TD
![Page 44: Hadoop meets Cloud with Multi-Tenancy](https://reader034.fdocument.pub/reader034/viewer/2022051400/554f44b4b4c905cd048b56b5/html5/thumbnails/44.jpg)
Big Data for the Rest of Us
www.treasure-data.com | @TreasureData