Easier and Faster for hbase in HadoopCon 2014


description

A tool that makes HBase faster and easier to use

Transcript of Easier and Faster for hbase in HadoopCon 2014

Page 1: Easier and Faster for hbase in HadoopCon 2014

Faster and Easier for HBase

亦思科技 Hubert 范姜冠宇

Page 2: Easier and Faster for hbase in HadoopCon 2014

Who are we?

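• Located in Hsinchu Science Park
• Past customers were mainly the major manufacturers in the park
• Vision: to become the world's most specialized HBase software vendor
• Focus areas:
  – Provide a better user interface for HBase
  – Improve HBase performance and increase availability
  – Provide complete HBase solutions for every industry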

Page 3: Easier and Faster for hbase in HadoopCon 2014

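What have we done?

• 2010.07 Admitted to Hsinchu Science Park with an investment plan to develop cloud computing software tools
• 2011 Industry-academia collaboration with Professor Yeh-Ching Chung (鍾葉青), Department of Computer Science, National Tsing Hua University
• One of the few companies invited to take part in the international cloud computing conference IEEE CloudCom
• One of the few IT vendors with hands-on experience helping customers deploy Hadoop systems
• 2012.01 JackHare (ANSI SQL JDBC Driver)
• 2012.11 HareDB HBase Client
• 2012.12 HareDB Data Model Management
• 2013.08 Hare (High Speed Query in HBase)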

Page 4: Easier and Faster for hbase in HadoopCon 2014
Page 5: Easier and Faster for hbase in HadoopCon 2014

Who am I ?

• Name: 范姜冠宇 (Hubert)
• Company: 亦思科技
• Role:
  – Design HBase products and solutions
  – Cheer up weary R&D engineers
  – Tell lame jokes on serious occasions

Page 6: Easier and Faster for hbase in HadoopCon 2014

TALK ABOUT HBASE

Page 7: Easier and Faster for hbase in HadoopCon 2014

HBase native interface

Page 8: Easier and Faster for hbase in HadoopCon 2014

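The inconveniences of HBase

• Data loading problems
• Table management problems
  – Schema management problems
• Query problems (convenience, performance)
• Programming learning-curve problems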

Page 9: Easier and Faster for hbase in HadoopCon 2014

Data loading problems

Page 10: Easier and Faster for hbase in HadoopCon 2014

Data loading problems

Page 11: Easier and Faster for hbase in HadoopCon 2014

Data loading problems

Page 12: Easier and Faster for hbase in HadoopCon 2014

TABLE MANAGEMENT PROBLEMS

Page 13: Easier and Faster for hbase in HadoopCon 2014

Table management

Page 14: Easier and Faster for hbase in HadoopCon 2014

Query problems

Page 15: Easier and Faster for hbase in HadoopCon 2014

Query problems (UI Query)

Page 16: Easier and Faster for hbase in HadoopCon 2014

Query problems (SQL Query)

Page 17: Easier and Faster for hbase in HadoopCon 2014

Programming learning-curve problems

Page 18: Easier and Faster for hbase in HadoopCon 2014

How to lower the barrier

• ODBC/JDBC Driver
• HareSQL Driver
• Example with R

Page 19: Easier and Faster for hbase in HadoopCon 2014

ODBC DRIVER

Page 20: Easier and Faster for hbase in HadoopCon 2014
Page 21: Easier and Faster for hbase in HadoopCon 2014

ODBC Driver

• http://www.microsoft.com/en-us/download/details.aspx?id=40886

Page 22: Easier and Faster for hbase in HadoopCon 2014

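Configuring the HareDB connection

• In the `Start` menu, search for odbc administrator (as shown in the figure)
• On an x64 operating system, run the 64-bit version; on x86, run the 32-bit version.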

Page 23: Easier and Faster for hbase in HadoopCon 2014

Configuring the HareDB connection

• Click Add to create a new data source.
• Select `Microsoft Hive ODBC Driver`, then click Finish.
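Once a data source has been created this way, an application refers to it by name in an ODBC connection string. The following is a minimal sketch of building such a string in Python. The DSN name `HareDB` and the user name are hypothetical placeholders, not values taken from the slides; `DSN`, `UID`, and `PWD` are standard ODBC connection-string keywords.

```python
def build_odbc_connection_string(dsn, user=None, password=None):
    """Build an ODBC connection string that points at a configured DSN."""
    parts = [f"DSN={dsn}"]
    if user is not None:
        parts.append(f"UID={user}")
    if password is not None:
        parts.append(f"PWD={password}")
    return ";".join(parts)


# "HareDB" and "hubert" are hypothetical names used only for illustration.
conn_str = build_odbc_connection_string("HareDB", user="hubert")
print(conn_str)

# With the third-party pyodbc package installed and the DSN configured,
# the string could then be passed to the driver manager, for example:
#   import pyodbc
#   conn = pyodbc.connect(conn_str, autocommit=True)
```

Keeping the host, port, and driver settings in the DSN, as the slides do, means the application only needs the data source name.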

Page 24: Easier and Faster for hbase in HadoopCon 2014

Configuring the HareDB connection

Page 25: Easier and Faster for hbase in HadoopCon 2014

Demo: reading HareDB from Excel

Page 26: Easier and Faster for hbase in HadoopCon 2014

Demo: reading HareDB from Excel

Page 27: Easier and Faster for hbase in HadoopCon 2014

Demo: reading HareDB from Excel

Page 28: Easier and Faster for hbase in HadoopCon 2014

Demo: reading HareDB from Excel

Page 29: Easier and Faster for hbase in HadoopCon 2014

Demo: reading HareDB from Excel

Page 30: Easier and Faster for hbase in HadoopCon 2014

Demo: reading HareDB from Excel

Page 31: Easier and Faster for hbase in HadoopCon 2014

Demo: reading HareDB from Excel

Page 32: Easier and Faster for hbase in HadoopCon 2014

Demo: reading HareDB from Excel

Page 33: Easier and Faster for hbase in HadoopCon 2014

SQL STRING

Page 34: Easier and Faster for hbase in HadoopCon 2014
Page 35: Easier and Faster for hbase in HadoopCon 2014
Page 36: Easier and Faster for hbase in HadoopCon 2014

Integration with application systems: R

Page 37: Easier and Faster for hbase in HadoopCon 2014
Page 38: Easier and Faster for hbase in HadoopCon 2014
Page 39: Easier and Faster for hbase in HadoopCon 2014

FASTER

Page 40: Easier and Faster for hbase in HadoopCon 2014

High Speed?

[Architecture diagram. Labels: Client; HiveQL; HareDriver; Hive Parser; Hare Planner; Hare Optimizer; Hare Executor Coprocessor; Hare Executor Coprocessor Windup Server; HBase with Region 1 to Region 5; EndPoint Instance (shown three times, next to Regions 1 to 3).]
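The diagram labels (an executor coprocessor and an EndPoint instance per region) suggest the general HBase coprocessor-endpoint pattern: each region computes a partial result next to its own data, and the caller merges the partials. The sketch below simulates that pattern in plain Python. It is an illustration of the general idea only, not Hare's implementation; the region contents, function names, and the filter-and-sum query are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each region holds its own slice of row key -> value.
REGIONS = {
    "Region 1": {"row01": 5, "row02": 7},
    "Region 2": {"row10": 1, "row11": 4},
    "Region 3": {"row20": 9},
}


def endpoint_partial_sum(rows, min_value):
    """Stand-in for a region-side endpoint: filter and aggregate locally."""
    matching = [value for value in rows.values() if value >= min_value]
    return sum(matching), len(matching)


def scatter_gather_sum(regions, min_value):
    """Run the endpoint on every region in parallel, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(regions)) as pool:
        partials = list(
            pool.map(lambda rows: endpoint_partial_sum(rows, min_value),
                     regions.values())
        )
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total, count


total, count = scatter_gather_sum(REGIONS, min_value=4)
print(total, count)  # prints "25 4"
```

The design point is that only one small (sum, count) pair per region travels back to the caller, rather than every matching row.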

Page 41: Easier and Faster for hbase in HadoopCon 2014

Query times in seconds ("-" = no value on the slide; OOM = out of memory)

System                  SQL A   SQL B   SQL C   SQL E   SQL F   SQL G   SQL H
Impala in Hadoop          115      13      91      78       -       7       6
Impala in HBase          2925    0.26    2338    5876       -    5832     OOM
Hare (only for HBase)    1410       9    1355    1303    1283    1258    1640
Hive in Hadoop            113     107     110     161     154     163     157
Hive mr2                10694      22    9661    9462    9461    9484    9032
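To read the timings above as relative speed, one can divide another system's time by Hare's time for the same query; a ratio above 1 means Hare finished sooner. The sketch below does this for three of the rows. The numbers are copied from the slide, but placing the gap in the "Impala in HBase" row under SQL F is an assumption based on the spacing of the transcript, and the OOM entry is simply left out.

```python
# Query times in seconds, copied from the slide. The missing SQL F value for
# "Impala in HBase" is an assumed column position; the OOM entry is omitted.
TIMINGS = {
    "Impala in HBase": {"SQL A": 2925, "SQL B": 0.26, "SQL C": 2338,
                        "SQL E": 5876, "SQL G": 5832},
    "Hare": {"SQL A": 1410, "SQL B": 9, "SQL C": 1355, "SQL E": 1303,
             "SQL F": 1283, "SQL G": 1258, "SQL H": 1640},
    "Hive mr2": {"SQL A": 10694, "SQL B": 22, "SQL C": 9661, "SQL E": 9462,
                 "SQL F": 9461, "SQL G": 9484, "SQL H": 9032},
}


def speedup_over(baseline, system="Hare"):
    """Return baseline_time / system_time for each query both systems ran."""
    ratios = {}
    for query, baseline_time in TIMINGS[baseline].items():
        if query in TIMINGS[system]:
            ratios[query] = baseline_time / TIMINGS[system][query]
    return ratios


for baseline in ("Impala in HBase", "Hive mr2"):
    for query, ratio in speedup_over(baseline).items():
        print(f"Hare vs {baseline}, {query}: {ratio:.2f}x")
```

On these figures Hare is ahead of both systems on every listed query except SQL B against Impala in HBase, where the ratio falls well below 1.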

Faster

Page 42: Easier and Faster for hbase in HadoopCon 2014

One more thing ….

Page 43: Easier and Faster for hbase in HadoopCon 2014