Amazon SageMaker · 2020. 10. 17. · © 2019, Amazon Web Services, Inc. or its Affiliates. All...
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Solutions Architect, Amazon Web Services Japan.
2019.6.18 Shoko Utsunomiya
Amazon SageMaker
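Self-introduction
Shoko Utsunomiya, Ph.D.
• Machine learning solutions architect
• Responsible for machine learning services
• Two jobs ago: quantum information researcher
• Previous job: autonomous driving development at an automotive OEM
• Areas of focus
• Autonomous driving, AI healthcare, AI games
• Favorite service
• Amazon SageMaker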
• SageMaker
• SageMaker
•
•
• SageMaker
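Machine learning is used across many industries
Media (entertainment)
Retail and distribution
Healthcare, life sciences
Finance (services, trading)
Automotive, manufacturing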
Typical machine learning workflow
Data preprocessing
Model development
Model training
Model evaluation
Deployment to production
Monitoring and evaluation
Data collection
The goal is to run this cycle as quickly as possible and with as little burden as possible
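Common challenges in the machine learning workflow
Data acquisition: high-quality labeled data at low cost
Data preprocessing: an appropriate data preprocessing pipeline
Development environment: provide every engineer with an ML environment continuously and at low cost
Training environment: a scalable training environment for large volumes of data
Job management: management and traceability of training jobs and models
Model building: adopt the latest research and development results as early as possible
Operations: get feedback from production faster and in greater volume
Inference and deployment: make endpoint hosting easy and reduce the time lag between model development and the start of operations
Security: provide an operating environment that complies with company policy and the law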
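“Undifferentiated Heavy Lifting” in machine learning
Building the development environment
• Estimating the required resources and deciding what to purchase
• Building a uniform development environment across the development team
• Installing frameworks and managing versions
Training machine learning models
• Providing hardware suited to the task, such as CPU or GPU
• Building scalable distributed training and a high-bandwidth network
Operations
• Preparing the inference environment and hosting models
• Requires a skill set different from machine learning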
Photo by Victor Freitas on Unsplash
The AWS machine learning stack
AI SERVICES (app developers with little knowledge of ML)
• Vision: Amazon Rekognition Image, Amazon Rekognition Video, Amazon Textract
• Speech: Amazon Polly, Amazon Transcribe
• Language: Amazon Translate, Amazon Comprehend
• Chatbots: Amazon Lex
• Forecasting: Amazon Forecast
• Recommendations: Amazon Personalize
ML SERVICES (ML developers and data scientists)
• Amazon SageMaker: Ground Truth, Notebooks, Algorithms, Marketplace, Unsupervised Learning, Supervised Learning, Reinforcement Learning, Optimization (Neo), Training, Hosting, Deployment
• Stages: Labeling, Model development, Training, Hosting
ML FRAMEWORKS & INFRASTRUCTURE (ML researchers and academics)
• Frameworks, Interfaces, Infrastructure
• Amazon EC2 P3 & P3DN, Amazon EC2 C5, FPGAs, AWS Greengrass, Amazon Elastic Inference, Amazon Inferentia
The AWS machine learning stack: ML SERVICES and ML FRAMEWORKS & INFRASTRUCTURE
Use managed services and focus on business value
Amazon SageMaker
API
SDK
Amazon SageMaker Notebook instance
❖ Jupyter ❖ JupyterLab
•
• Git
• SageMaker
SageMaker Python SDK
https://github.com/aws/sagemaker-python-sdk
:
• API
•
•
AWS Summit NY spot
: API
•
API1
• GPU
Elastic Inference
•
•
• A/B
A/B
2:
SageMaker
3: GPU
AWS
1:
SageMaker
SageMaker
( )
• Amazon S3
•
• API
( )
• Amazon ECR (Elastic
Container Registry)
• (TensorFlow )
…
•
SageMaker
•
S3
/opt/ml/input/data/dog1.jpgdog2.jpg
image/jpeg
application/json
Amazon SageMaker
( )
Amazon SageMaker
Figure, steps 1 to 6: AWS Cloud, Office Network, SageMaker Service (step 3: S3)
( )
S3
• 16TB
• Pandas
• EMR
S3
FILE (Default):
…
PIPE :
•
• TensorFlow, MXNet
( )
SageMaker algorithms ~ machine learning models ~
• Linear Learner
• XGBoost: XGBoost (eXtreme Gradient Boosting)
• PCA: PCA (Principal Component Analysis)
• k-means: K-means
• k-NN: K-nearest neighbors
• Factorization Machines
• Random Cut Forest: robust random cut tree
• LDA: LDA (Latent Dirichlet Allocation)
Note: the original LDA is unsupervised
https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html
SageMaker algorithms ~ deep learning models ~
• Image classification: ResNet
• Object Detection: SSD (Single Shot multibox Detector)
• Semantic Segmentation: FCN, PSP, DeepLabV3 (ResNet50, ResNet101)
• seq2seq: Deep LSTM
• Neural Topic Model: NTM, LDA
• BlazingText: Word2Vec, Text Classification
• Object2Vec: Word2Vec
• DeepAR Forecasting: Autoregressive RNN
• IP Insights: NN (IP entity ) IP
https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html
Image Classification
ILSVRC 2015 ResNet
AWS
( )
https://docs.aws.amazon.com/ja_jp/sagemaker/latest/dg/image-classification.html
dog cat
AWS Marketplace for Machine Learning
•
200
• SageMaker
•
• AWS
SageMaker
200
Amazon SageMaker
ok
AWS Marketplace
Using ML Marketplace algorithms (training)
Select an algorithm on Marketplace
Register the algorithm on SageMaker
Create and run a training job
Using ML Marketplace models (inference)
Select a model on Marketplace
Register the model package on SageMaker
Create an endpoint
SageMaker SDK AWS
CLI
AWS
ML
•
•
•
•
container
• S3 fit
SageMaker
( )
•
• Docker github
• 2018 re:Invent
SageMaker container
Deep learning
TensorFlow: Legacy mode: 1.4.1, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0; Script mode: 1.11.0, 1.12.0
Chainer: 4.0.0, 4.1.0, 5.0.0
PyTorch: 0.4.0, 1.0.0
MXNet: 1.3.0, 1.2.1, 1.1.0, 0.12.1
ML
scikit-learn: 0.20.0
https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker
TensorFlow: https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/tensorflow
Chainer: https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/chainer
PyTorch: https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/pytorch
MXNet: https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/mxnet
Sklearn: https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/sklearn
※ As of 2019-02-13
SageMaker
•
•
• PC ( )
•
• GPU Elastic Inference
• , A/B
SageMaker SDK Estimator
Chainer Estimator
fit()
Chainer
S3
deploy()
predict()
transformer.transform()
S3
S3
Chainer main
OK
SageMaker
argparse
model_fn()
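The keywords on this slide (argparse, model_fn(), fit(), deploy()) describe the shape of a training script that a SageMaker framework container runs. The following is a minimal sketch of such a script, not the one shown in the talk. It assumes the container environment variables SM_MODEL_DIR and SM_CHANNEL_TRAIN; the hyperparameter names (--epochs, --learning-rate), the file name model.json, and the dictionary used as a "model" are hypothetical stand-ins for a real framework model.

```python
import argparse
import json
import os


def parse_args(argv=None):
    """Parse hyperparameters and container paths.

    Hyperparameters arrive as command-line arguments; data and model
    locations are read from environment variables inside the container.
    """
    parser = argparse.ArgumentParser()
    # Hypothetical hyperparameters, chosen only for illustration.
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    # Fall back to the conventional /opt/ml paths when the variables are unset.
    parser.add_argument(
        "--model-dir", default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    parser.add_argument(
        "--train",
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"))
    return parser.parse_args(argv)


def train(args):
    """Placeholder training step; a real script would fit a framework model."""
    model = {"epochs": args.epochs, "learning_rate": args.learning_rate}
    os.makedirs(args.model_dir, exist_ok=True)
    # Anything written to the model directory is what gets packaged as the model.
    with open(os.path.join(args.model_dir, "model.json"), "w") as f:
        json.dump(model, f)
    return model


def model_fn(model_dir):
    """Load the artifact written by train(); used on the serving side."""
    with open(os.path.join(model_dir, "model.json")) as f:
        return json.load(f)

# A real entry point would end by calling train(parse_args())
# under an `if __name__ == "__main__":` guard.
```

With a script of this shape, fit() runs the training part and deploy() relies on model_fn() to load what training saved.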
SageMaker
• API
• ( : TensorFlow Horovod )
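# Notebook side: run train.py on two GPU instances
estimator = TensorFlow(entry_point='train.py',
                       train_instance_type='ml.p3.2xlarge',
                       train_instance_count=2, …)
# Inside train.py (needs: import horovod.tensorflow as hvd, and hvd.init() at startup)
# Scale the learning rate by the number of workers, then wrap the optimizer
opt = tf.train.AdagradOptimizer(0.01 * hvd.size())
opt = hvd.DistributedOptimizer(opt)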
• Deep Learning
•
• Dropout
•
• …
•
(HPO)
•
https://github.com/aws/sagemaker-python-sdk#sagemaker-automatic-model-tuning
https://aws.amazon.com/jp/blogs/news/amazon-sagemaker-automatic-model-tuning-becomes-more-efficient-with-warm-start-of-hyperparameter-tuning-jobs/
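As a rough illustration of hyperparameter optimization (HPO), the sketch below builds a tuning configuration in the request shape used by the boto3 SageMaker client's create_hyper_parameter_tuning_job call. It only constructs a dictionary and does not call AWS. The metric name validation:accuracy, the hyperparameter names dropout and learning_rate, and the value ranges are assumptions made for the example.

```python
def build_tuning_config(max_jobs=20, max_parallel=2):
    """Return a dict shaped like HyperParameterTuningJobConfig."""
    if max_parallel > max_jobs:
        raise ValueError("max_parallel must not exceed max_jobs")
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": "validation:accuracy",  # hypothetical metric name
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        "ParameterRanges": {
            # The API takes range bounds as strings.
            "ContinuousParameterRanges": [
                {"Name": "dropout", "MinValue": "0.1", "MaxValue": "0.5"},
                {"Name": "learning_rate", "MinValue": "0.0001", "MaxValue": "0.1"},
            ],
            "IntegerParameterRanges": [],
            "CategoricalParameterRanges": [],
        },
    }
```

Keeping max_parallel small relative to max_jobs gives the Bayesian strategy more completed jobs to learn from before it proposes new candidates.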
• Docker SageMaker
•
• train_instance_type='local' SageMaker Python SDK
Mac Book
• deploy() API
• Web API URL URL
•
( )
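• If real-time inference is not required, keeping an inference endpoint running incurs cost
• With batch inference, an endpoint is created when inference is needed and is deleted automatically once inference finishes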
EP EP
EP:
Elastic Inference
• GPU
• CPU / EC2
GPU
EIA (Elastic Inference Accelerator): 3 types
• eia1.medium: 8 TFLOPS
• eia1.large: 16 TFLOPS
• eia1.xlarge: 32 TFLOPS
•
•
• SageMakerVariantInvocationsPerInstance (invocations per minute per instance)
•
https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling.html#endpoint-auto-scaling-add-policy
https://docs.aws.amazon.com/ja_jp/autoscaling/application/userguide/application-auto-scaling-target-tracking.html
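The target-tracking setup described in the two links above can be sketched as the keyword arguments for the Application Auto Scaling calls register_scalable_target and put_scaling_policy. This is an illustrative sketch that only builds dictionaries. The endpoint name, variant name, policy name, capacity limits, target value, and cooldown values are all assumptions.

```python
def build_autoscaling_requests(endpoint_name, variant_name,
                               min_instances=1, max_instances=4,
                               target_invocations=100.0):
    """Build kwargs for register_scalable_target and put_scaling_policy."""
    resource_id = "endpoint/{}/variant/{}".format(endpoint_name, variant_name)
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    register_target = dict(
        common, MinCapacity=min_instances, MaxCapacity=max_instances)
    scaling_policy = dict(
        common,
        PolicyName="invocations-target-tracking",  # hypothetical policy name
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            # Scale so that each instance handles about this many
            # invocations per minute.
            "TargetValue": target_invocations,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
            "ScaleInCooldown": 300,   # seconds, assumed value
            "ScaleOutCooldown": 60,   # seconds, assumed value
        },
    )
    return register_target, scaling_policy
```

With boto3 available, the two dictionaries would be passed as `client.register_scalable_target(**register_target)` and `client.put_scaling_policy(**scaling_policy)` on an `application-autoscaling` client.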
A/B
• A/B
•
•
•
•
https://docs.aws.amazon.com/ja_jp/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html
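A/B testing on a single endpoint is configured through weighted production variants. The sketch below, which is an illustration rather than the configuration from the talk, builds a ProductionVariants list in the shape accepted by create_endpoint_config and computes the share of traffic each variant would receive (its weight divided by the sum of all weights). The variant names, model names, and instance type are hypothetical.

```python
def build_production_variants(models):
    """models: list of (variant_name, model_name, weight) tuples."""
    return [
        {
            "VariantName": variant_name,
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": "ml.c5.xlarge",  # hypothetical choice
            "InitialVariantWeight": float(weight),
        }
        for variant_name, model_name, weight in models
    ]


def traffic_shares(variants):
    """Fraction of requests routed to each variant, by relative weight."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {
        v["VariantName"]: v["InitialVariantWeight"] / total for v in variants
    }
```

For example, weights of 9 and 1 send roughly 90% of requests to the first variant and 10% to the second, which is a common way to expose a new model to a small slice of traffic.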
• 2018 re:Invent
•
Git
• Git
• Git
Git Integration
• SageMaker Git
clone
• JupyterLab Git extension
• CloudWatch Metrics
•
( )
(Beta)
•
•
Amazon SageMaker Neo
• TensorFlow PyTorch EC2 Greengrass
• Deep Learning 500MB-1GB
Amazon SageMaker Neo Runtime is 1MB
• Apache Software License OSS
SageMaker Python SDK Neo
from sagemaker.tensorflow import TensorFlow

# role, inputs, and output_path are assumed to be defined earlier in the notebook
mnist_estimator = TensorFlow(
    entry_point='mnist.py', role=role, framework_version='1.11.0',
    training_steps=1000, evaluation_steps=100,
    train_instance_count=2, train_instance_type='ml.c4.xlarge')
mnist_estimator.fit(inputs)

# Compile the trained model with Neo for the ml_c5 instance family
optimized_estimator = mnist_estimator.compile_model(
    target_instance_family='ml_c5', input_shape={'data': [1, 784]},
    output_path=output_path, framework='tensorflow',
    framework_version='1.11.0')

# Deploy the compiled model to an endpoint
optimized_predictor = optimized_estimator.deploy(
    initial_instance_count=1, instance_type='ml.c5.4xlarge')
Amazon SageMaker
AWS Step Functions API Connectors
• Step Functions
AWS
〜 〜
• DynamoDB: item
item
• AWS Batch:
• Amazon ECS/Fargate: ECS Fargate
• Amazon SNS: SNS
• Amazon SQS:
• AWS Glue:
• Amazon SageMaker:
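To illustrate the SageMaker connector, the sketch below builds a two-step state machine definition in Amazon States Language: a training job followed by a batch transform job, each using the `.sync` form of the service integration so the state waits for the job to finish. It is deliberately incomplete: the state names and input paths are hypothetical, and the Parameters blocks omit the other fields a real job request requires.

```python
import json


def build_state_machine():
    """Return an Amazon States Language definition as a Python dict."""
    return {
        "Comment": "Train a model, then run batch transform (sketch)",
        "StartAt": "Train",
        "States": {
            "Train": {
                "Type": "Task",
                # .sync makes the state wait until the training job completes.
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                # Partial parameters; a real request needs more fields.
                "Parameters": {"TrainingJobName.$": "$.training_job_name"},
                "Next": "BatchTransform",
            },
            "BatchTransform": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync",
                "Parameters": {"TransformJobName.$": "$.transform_job_name"},
                "End": True,
            },
        },
    }


# Serialized form, as it would be supplied when creating a state machine.
definition_json = json.dumps(build_state_machine(), indent=2)
```

Expressing the pipeline this way keeps retry and ordering logic in the workflow service instead of in notebook code.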
Apache Airflow
• Airflow 1.10.1 SageMaker Operator
• Airflow SageMaker
train_config = training_config(…)
trans_config = transform_config_from_estimator(…)
train_op = SageMakerTrainingOperator(…)
transform_op = SageMakerTransformOperator(…)
transform_op.set_upstream(train_op)
• (SSE-KMS)
•
•
•
•
• CloudTrail
• PCI DSS HIPAA
• SageMaker S3 S3 VPC
• S3
• S3
• SageMaker API PrivateLink
• SageMaker Notebook Endpoint
• SageMaker Service API
• SageMaker Runtime API
https://mlloft4.splashthat.com/
Reference
SageMaker Example Notebooks
https://github.com/awslabs/amazon-sagemaker-examples
SageMaker SDK
https://github.com/aws/sagemaker-python-sdk
(Doc : https://readthedocs.org/projects/sagemaker/)
SageMaker
https://docs.aws.amazon.com/ja_jp/sagemaker/latest/dg/whatis.html