SAP FORUM İSTANBUL 2016 - You Can Analyze Your Big Data with SAP HANA
Uploaded by: sap-turkiye
Category: Technology
SAP FORUM İSTANBUL Reimagine Business for the Digital Economy
How to Analyze Your Big Data with SAP HANA
Speaker: Ilker Tasdemir
Department: Professional Services and Service Director
© 2016 SAP SE or an SAP affiliate company. All rights reserved. 2 Internal
Agenda
MDS ap Company Introduction
Big Data Dilemma
Is Hadoop = Big Data?
SAP HANA and Hadoop Use Cases
Why Do We Need SAP?
Eastern Europe
Middle East
Africa
45+ years of Experience in IT (Since 1967)
4500+ Employees in 30 countries across 3 continents
150+ companies unified under the group
100+ top reseller awards from global IT leaders
A 3.5 billion USD Leader offering stability & high Integrity in Technology & Solutions
SAP Partner Centre of Excellence
MDS ap Tech Overview
a MIDIS Group Company
Over 24 years of in-depth experience helping customers manage, integrate, analyze and mobilize mission-critical business data across the enterprise; exceptional track record providing turnkey IT solutions across Turkey, Middle East & Europe.
A unique partnership with SAP; implementing excellence; optimizing application management
Strategic long-term partnerships with our customers; focusing on customer satisfaction and technology innovation
Help customers better use their data assets to improve business performance and make smarter decisions
The MDS ap Differentiator
MDS ap, with the best-of-breed SAP Business Analytics Platform, provides complete agile visualization and advanced analytics solutions that optimize any data variety, regardless of its structure, at real-time velocity, to deliver next-generation analytics
Customer Base Over 400 Enterprise Customers
• Turkey Customers:
• Akbank
• Albaraka Türk
• Anadolu Sigorta
• DHL
• Halk Emeklilik
• Halk Sigorta
• İETT
• ING Bank
• İş Yatırım
• Meteoroloji Genel Müdürlüğü
• PTT
• T.C. Maliye Bakanlığı
• T.C. Orman Ve Su İşleri Bakanlığı
• TEB
• Toprak Mahsulleri Ofisi
• Turkcell
• Türkiye Finans Katılım Bankası
• Türkiye İş Bankası
• VakıfBank
• Ziraat Bankası
• Regional Customers:
• Abu Dhabi Investment Authority (ADIA)
• ADIB
• Ahlibank
• Bank Dhofar
• Bank Nizwa
• BISB
• Boubyan Bank
• Emirates NBD
• Kuwait Credit Bank
• Kuwait Finance House
• Orange
• Qatar Islamic Bank
• RTA
• Saudi Arabian Monetary Agency
• Saudi Credit Bureau
Rich Ecosystem Over 25 Partners
BIG DATA ACCELERATED DRAMATICALLY THE OBSOLESCENCE OF IT LANDSCAPE
Data sources and drivers: CRM data, GPS, demand, speed, velocity, transactions, opportunities, service calls, customer, sales orders, inventory, e-mails, tweets, planning, M2M, mobile, instant messages
Big data dimensions (complex): Volume, Variety, Velocity, Value, Variability, Validity
Desktop
Hobbyist
The Future?
Internet
Big Data
Byte : one grain of rice
Kilobyte : cup of rice
Megabyte : 8 bags of rice
Gigabyte : 3 Semi trucks
Terabyte : 2 Container Ships
Petabyte : Blankets Manhattan
Exabyte : Blankets west coast states
Zettabyte : Fills the Pacific Ocean
Yottabyte : AN EARTH SIZE RICE BALL!
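The scale behind the rice analogy above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal (SI) prefixes where each unit is 1,000 times the previous one; the slide does not say whether it means decimal or binary units, and the name `to_bytes` is purely illustrative.

```python
# Decimal (SI) storage units: each step is 1,000x the previous one.
# Assumption: SI prefixes rather than binary (1,024-based) ones.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]


def to_bytes(amount: int, unit: str) -> int:
    """Convert an amount in the given unit to a number of bytes."""
    return amount * 1000 ** UNITS.index(unit)


if __name__ == "__main__":
    for unit in UNITS:
        print(f"1 {unit:<9} = 10^{3 * UNITS.index(unit)} bytes")
```

Under this assumption a yottabyte is 10^24 bytes, a trillion times a terabyte.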
How did we get here?
Timeline markers: 18,000 BC, 1928, 1991, 2008, 2012
NSA's 1,500,000 square foot data center being built outside Salt Lake City will be the first facility to house a yottabyte of data
Typical “Best-Practice” Approach
Typical “Best Practice” Approach
• Drop useful data by introducing ETL “bias”
• Potentially insightful data is lost
• Create latency as volumes increase and sources change
• Duplicate data through staging environments to support ETL
• Expensive “reactive” hardware to support processing scale requirements
Impact if we keep the current architecture
What is Hadoop?
Distributed, scalable system on commodity HW
Composed of a few parts:
HDFS – Distributed file system
MapReduce – Programming model
Other tools: Hive, Pig, Sqoop, HCatalog, HBase, Flume, Mahout, YARN, Tez, Spark, Stinger, Oozie, ZooKeeper, Storm
Main players are Hortonworks, Cloudera, MapR
WARNING: Hadoop, while ideal for processing huge volumes of data, is inadequate for analyzing that data in real time (companies do batch analytics instead)
[Diagram: a Hadoop cluster of compute & storage nodes, labelled with Core Services, Operational Services, Data Services and Load & Extract. Components shown: HDFS, Sqoop, Flume, NFS, WebHDFS, Oozie, Ambari, YARN, MapReduce, Hive & HCatalog, Pig, HBase, Falcon]
Hadoop clusters provide scale-out storage and distributed data processing on commodity hardware
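To make the MapReduce programming model mentioned on this slide concrete, here is a minimal single-process sketch in plain Python. It only imitates the map, shuffle and reduce phases on an in-memory list; it does not use Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big insight", "data in memory"]))
```

On a real cluster the map and reduce functions run in parallel on many nodes and the shuffle moves data across the network, which is where the batch latency noted in the warning above comes from.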
What does Hadoop Provide?
• A place to store unlimited amounts of data in any format inexpensively
• Allows collection of data that you may or may not use later: “just in case”
• A way to describe any large data pool in which the schema and data requirements are not defined until the data is queried: “just in time” or “schema on read”
• Complements EDW and can be seen as a data source for the EDW – capturing all data but only passing relevant data to the EDW
• Frees up expensive EDW resources (storage and processing), especially for data refinement
• Allows for data exploration to be performed without waiting for the EDW team to model and load the data
• Some processing is better done on Hadoop than in ETL tools
• Also called bit bucket, staging area, landing zone or enterprise data hub
• Typical players are Hortonworks, Cloudera, MapR
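The "schema on read" idea from this slide can be illustrated with a small sketch: raw records are stored exactly as they arrive, and a schema is applied only when a query runs. This is plain Python over an in-memory list standing in for files in HDFS; the record fields and the `read_with_schema` helper are hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Callable, Dict, Iterator, List

# Raw events stored as-is ("just in case"); no schema enforced at write time.
RAW_STORE: List[str] = [
    '{"device": "pump-1", "temp": "71.5", "ts": "2016-03-01T10:00:00"}',
    '{"device": "pump-2", "ts": "2016-03-01T10:00:05"}',
    '{"user": "alice", "tweet": "hello"}',
]

# A schema maps field name -> converter and is chosen by the reader at query time.
Schema = Dict[str, Callable[[Any], Any]]


def read_with_schema(raw: List[str], schema: Schema) -> Iterator[Dict[str, Any]]:
    """Apply the schema while reading; skip records that lack a required field."""
    for line in raw:
        record = json.loads(line)
        if all(field in record for field in schema):
            yield {field: cast(record[field]) for field, cast in schema.items()}


if __name__ == "__main__":
    temperature_schema: Schema = {"device": str, "temp": float}
    print(list(read_with_schema(RAW_STORE, temperature_schema)))
```

The same raw store can be read again later with a different schema (for example `{"user": str}`) without reloading or remodelling the data, which is the trade-off against the predefined models of an EDW.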
The Real Cost of Hadoop
http://www.wintercorp.com/tcod-report/
Hadoop versus HANA
Criterion | Hadoop | HANA
Data Architecture | Unstructured data and files on disk | Structured data in memory
Data Structures | No predefined schema (schema on read) | Predefined schemas and models
Performance | Slow data access, seconds to hours | Very fast, milliseconds to seconds
Scalability | Scale-out to hundreds and thousands of commodity nodes | Scale-up/scale-out to many servers
Data Consistency | BASE (Basic Availability, Soft State, Eventual Consistency) | ACID (Atomicity, Consistency, Isolation, Durability)
Licensing Cost | Free open source or commercial open source | Many options from cloud to enterprise
When to use Hadoop vs HANA
• Hadoop has the lowest storage cost and highest data type flexibility, but also the slowest processing speed
• SAP HANA has the highest processing speed and data conformity, but is also more limited by cost and data type
• Key is to leverage strengths of both platforms
• Hadoop + HANA = Infinite Storage and Instant Insight!
Hadoop as Flexible Data Store
Use Hadoop to capture all types of data from multiple sources
• SAP and non-SAP, internal and external sources
• Full fidelity, lowest level granularity capture, and storage of data of any type allows preservation of data for future use
• Store and retrieve very large data sets and objects
• Aggregate and consolidate OLTP data in Hadoop to create OLAP fact tables for SAP HANA
• Feed SAP HANA, SAP BusinessObjects, Predictive Analytics via Hive, or Data Services ETL
• Interactive Big Data Exploration
Flexible Data Store Example
Data | Description
Data Stream Capture | Real-time capture of high-volume data streams such as machine generated log and sensor data, real-time Web logs
Document and Multimedia storage | Very high volume storage of business documents (healthcare, insurance). Rapid high volume storage and retrieval of media and BLOBs for social and Web applications like Facebook using HBase
Social Media and Email | Real-time capture of social and email text data for sentiment analytics, email archiving
OLTP Transaction Data | Capture of high volume OLTP transactions such as call centre, inventory, and any other process transactions. Aggregate transactions and build OLAP fact tables for SAP HANA. ETL via SAP Data Services to SAP HANA
Reference Data | Copy of existing large reference data sets such as GIS, survey, industry-specific data sets can be combined with other data for analytics
Data Archive | Archive of system logs, audit data, and other data that otherwise would go to long-term, off-site storage
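The OLTP Transaction Data entry above describes aggregating raw transactions into OLAP fact tables before loading them into SAP HANA. Here is a minimal sketch of that aggregation step in plain Python, assuming hypothetical call-centre records with `day`, `region` and `duration_s` fields. In practice this would run as a Hive or MapReduce job on the cluster rather than in-process; the data and the `build_fact_table` name are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical raw OLTP records (one per call-centre call).
transactions: List[dict] = [
    {"day": "2016-03-01", "region": "TR", "duration_s": 120},
    {"day": "2016-03-01", "region": "TR", "duration_s": 300},
    {"day": "2016-03-01", "region": "AE", "duration_s": 60},
    {"day": "2016-03-02", "region": "TR", "duration_s": 90},
]


def build_fact_table(rows: List[dict]) -> List[dict]:
    """Aggregate detail rows to one fact row per (day, region)."""
    totals: Dict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])
    for row in rows:
        key = (row["day"], row["region"])
        totals[key][0] += 1                  # number of calls
        totals[key][1] += row["duration_s"]  # total talk time in seconds
    return [
        {"day": day, "region": region, "calls": calls, "total_duration_s": secs}
        for (day, region), (calls, secs) in sorted(totals.items())
    ]


if __name__ == "__main__":
    for fact in build_fact_table(transactions):
        print(fact)
```

Only the compact fact rows would then be moved to SAP HANA, while the full-detail transactions stay in Hadoop.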
Hadoop as a Processing Engine
Use Hadoop as a data processing engine for ETL rationalization to feed SAP HANA
• MapReduce programs execute process logic
• Pig for data analysis
• Mahout for data mining and machine learning
• Replicate master data to Hadoop for data processing
• Feed results to SAP HANA with Data Services and merge with conformed data model
Processing Engine Example
Data | Description
Data Cleansing and Enrichment | Fix data issues in Hadoop, enhance with additional information
ETL Rationalization | Low-latency ingestion of data from operational systems. Tiered storage: high-value data loaded and transformed in HANA in parallel, preprocessing off-loaded to Hadoop
Data Mining and Predictive Analytics | Correlation, clustering, regression analysis. Predict machine failure, correlate customer behavior across systems
Identify differences | Differences in large, but different sets of data such as DNA analysis
Risk Analysis | Fraud detection, identity risk patterns
Hadoop and SAP HANA for Analysis
How does Hadoop fit into the data analytics process with SAP HANA and BI?
• Hadoop can store such high volumes of data that it often can’t be replicated into SAP HANA in a cost-effective or timely manner
• Some of the analysis must be done in Hadoop, as well as in SAP HANA
• Queries executed in Hadoop take much longer to run than in SAP HANA
• Analysis will likely require combining data from Hadoop, SAP HANA and other data sources
Combined Analytics:
• Two Phase Analytics
• Addresses long-running Hadoop query times
• Run analysis continually on Hadoop, then send periodic updates to SAP HANA for fast interactive query response
• Federated Queries
• Split analysis into parts and run asynchronously on Hadoop, SAP HANA and other systems
• Federate results in SAP HANA or BI
Two Phase Analytics
• Hadoop runs data mining, statistical analysis, OLAP fact table generation – “slow” analytics
• SAP Data Services ETL process pushes results to SAP HANA for “fast” analytics
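Here is a minimal sketch of the two-phase pattern, assuming an in-memory SQLite database as a stand-in for SAP HANA and a plain Python function as a stand-in for the long-running Hadoop job. None of this is SAP or Hadoop API; the table name `page_hits` and all function names are hypothetical.

```python
import sqlite3
from collections import Counter
from typing import Iterable, List, Tuple


def slow_batch_phase(raw_events: Iterable[str]) -> List[Tuple[str, int]]:
    """Phase 1 (stand-in for Hadoop): scan all raw events and aggregate them."""
    counts = Counter(event.split(",")[0] for event in raw_events)
    return sorted(counts.items())


def load_results(conn: sqlite3.Connection, rows: List[Tuple[str, int]]) -> None:
    """Periodic update (stand-in for the ETL push): replace the result table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_hits (page TEXT PRIMARY KEY, hits INTEGER)"
    )
    conn.execute("DELETE FROM page_hits")
    conn.executemany("INSERT INTO page_hits VALUES (?, ?)", rows)
    conn.commit()


def fast_query_phase(conn: sqlite3.Connection, min_hits: int) -> List[Tuple[str, int]]:
    """Phase 2 (stand-in for HANA): interactive query on the small result set."""
    cur = conn.execute(
        "SELECT page, hits FROM page_hits WHERE hits >= ? ORDER BY hits DESC, page",
        (min_hits,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    raw = ["/home,alice", "/cart,bob", "/home,carol", "/home,dave", "/cart,erin"]
    conn = sqlite3.connect(":memory:")
    load_results(conn, slow_batch_phase(raw))
    print(fast_query_phase(conn, min_hits=2))
```

The interactive query never touches the raw events; it only sees whatever the last batch run pushed, so its answers are as fresh as the most recent update.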
Federated Queries
• Split Analysis into multiple queries, consolidate results
Federation Scenarios
Client Side Federation
• BI Tool queries separately and combines the results
• Only for smaller data and result sets
Query Federation:
• Server-side execution of multiple queries and results combined
• Better for large data sets
Data Federation:
• Hadoop data virtualized as a table by another database like SAP HANA
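Client-side federation, the first scenario on this slide, can be sketched as follows. Two in-memory SQLite databases stand in for SAP HANA and a Hive endpoint (an assumption made purely for illustration); the client queries each one separately and joins the small result sets itself. The table names and the `federate` helper are hypothetical.

```python
import sqlite3
from typing import Dict, List, Tuple


def make_source(ddl: str, insert_sql: str, rows: List[tuple]) -> sqlite3.Connection:
    """Create an in-memory database acting as one independent data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


def federate(hana: sqlite3.Connection, hive: sqlite3.Connection) -> List[Tuple[str, float, int]]:
    """Query both sources separately, then combine the results on the client."""
    revenue: Dict[str, float] = dict(
        hana.execute("SELECT customer, SUM(amount) FROM sales GROUP BY customer")
    )
    clicks: Dict[str, int] = dict(
        hive.execute("SELECT customer, COUNT(*) FROM weblog GROUP BY customer")
    )
    # Client-side join: only feasible because both result sets are small.
    return [(c, revenue[c], clicks.get(c, 0)) for c in sorted(revenue)]


if __name__ == "__main__":
    hana = make_source(
        "CREATE TABLE sales (customer TEXT, amount REAL)",
        "INSERT INTO sales VALUES (?, ?)",
        [("acme", 100.0), ("acme", 50.0), ("globex", 75.0)],
    )
    hive = make_source(
        "CREATE TABLE weblog (customer TEXT, url TEXT)",
        "INSERT INTO weblog VALUES (?, ?)",
        [("acme", "/a"), ("acme", "/b"), ("acme", "/c")],
    )
    print(federate(hana, hive))
```

Because the join happens in the client's memory, this approach fits the slide's caveat that it only suits smaller data and result sets; the server-side and data federation scenarios move that work into the database.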
SAP HANA and Hadoop Integration
SAP HANA can integrate with Hadoop
• Smart Data Access
• Virtual table created in SAP HANA points to remote Hive source, queries pushed down to Hive
• SAP Data Services
• Connect via Hive, HDFS
• Push MapReduce jobs to Hadoop with Pig scripts
• SAP HANA Vora
• Native Spark processing with push-down logic to Hadoop
• Vora Adapter for HANA to utilize SDA
“Typical” versus “Innovative”
• Entire “universe” of data is captured and maintained
• Mining of data via transformation on read leaves all data in place
• Refineries leverage the power of the cloud and traditional technologies
• Integration with traditional data warehousing methodologies
• Scale can be pushed to cloud for more horsepower
• Orchestration of data is a reality (less rigid, more flexible, operational)
• Democratization of predictive analytics, data sets, services and reports
What’s the meaning of Life, Universe and Everything?
In the radio series and the first novel (1978), a group of
hyper-intelligent pan-dimensional beings demand to
learn the Answer to the Ultimate Question of Life, The
Universe, and Everything from the supercomputer, Deep
Thought, specially built for this purpose.
https://youtu.be/aboZctrHfK8
It takes Deep Thought 7½ million years to compute and check the answer.
The Answer to Life, Universe and Everything is…
What’s the Meaning of 42?
“The answer seems meaningless
because the beings who instructed it
never actually knew what the Question
was”
“Deep Thought can build a machine to
calculate the real question in 10M
years”
Conclusion
“We don’t have 7 ½ million years to ask silly
questions.
We don’t have 10 million years to decide what
question to ask”
Thank you
Contact information:
Ilker Tasdemir
Professional Services and Service Director
+90 532 549 9392 / +971 50 712 9169