Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

High-Performance Grid and Cloud Computing Workshop, May 20 2013, Boston. Ryousei Takano, Hidemoto Nakada, Takahiro Hirofuchi, Yoshio Tanaka, and Tomohiro Kudoh. Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Japan.


High-Performance Grid and Cloud Computing Workshop

Transcript of Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Page 1: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

High-Performance Grid and Cloud Computing Workshop, May 20 2013, Boston

Ryousei Takano, Hidemoto Nakada, Takahiro Hirofuchi, Yoshio Tanaka, and Tomohiro Kudoh

Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Japan

Ninja Migration: An Interconnect-transparent Migration for Heterogeneous Data Centers

Page 2: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Background

•  HPC cloud is a promising HPC platform.

•  VM migration is useful for improving flexibility and maintainability in cloud computing.

[Figure: VM1, VM2, and VM3 moving between hosts. Labels: maintenance, fault tolerance, energy efficient VM placement; disaster recovery.]

Page 3: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Constraints on VM migration

•  Migration with a VMM-bypass I/O device

–  It can greatly reduce the overhead of virtualization, but it is not under the control of a VMM.

[Figure: VM1, VM2, and VM3 on Infiniband.]

Page 4: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Impact of VMM-bypass I/O

[Figure: execution time [seconds] (axis 0 to 300) for BT, CG, EP, FT, and LU, with series BMM (IB), BMM (10GbE), KVM (IB), and KVM (virtio). Caption: The overhead of I/O virtualization on the NAS Parallel Benchmarks 3.3.1 class C, 64 processes. BMM: Bare Metal Machine.]

[Diagram labels, KVM (virtio): VM1, Guest OS, Guest driver, VMM, Physical driver, 10GbE NIC.]

[Diagram labels, KVM (IB): VM1, Guest OS, Physical driver, VMM, IB QDR HCA.]

•  Performance evaluation of HPC cloud

–  (Para-)virtualized I/O incurs a large overhead.

–  PCI passthrough significantly mitigates the overhead.

Page 5: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Constraints on VM migration

•  Migration with a VMM-bypass I/O device

–  It can greatly reduce the overhead of virtualization, but it is not under the control of a VMM.

•  Heterogeneity of interconnect devices

–  A VM assigned to an Infiniband device cannot migrate to an Ethernet machine.

[Figure: VM1, VM2, and VM3 with an Infiniband cluster and an Ethernet cluster.]

Page 6: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Challenge

•  Goal: Migrate a cluster of VMs between heterogeneous data centers.
   → Interconnect-transparent migration

•  Challenge: How do we realize it with the minimal overhead of virtualization during normal operation?

–  (Para-)virtualized devices suffer from the overhead.

Page 7: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Outline

•  Introduction

•  Ninja migration: interconnect-transparent migration

•  Experiment

•  Conclusion

Page 8: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Interconnect-transparent migration

[Figure: VM1, VM2, and VM3 on an Infiniband cluster (normal operation) and on an Ethernet cluster (fallback operation). Arrows labelled fallback migration and recovery migration connect the two clusters.]

Use cases: transparent fail-over to another cluster for maintenance, evacuation from a disaster-stricken data center, etc.

Page 9: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Requirements

1.  Detach VMM-bypass I/O devices only when VM migration is required.

2.  Global coordination among distributed VMs before migration.

3.  Change an application's transport protocol for the available device after migration.

→ Our approach: leverage the knowledge of an application to ensure cooperation between migration and a communication layer inside the guest OS.
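The three requirements can be read as a small decision procedure. The sketch below is illustrative only and is not the authors' implementation: `plan_migration`, `MigrationPlan`, the device names, and the transport names `openib` and `tcp` are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical device classes and transport names, chosen for illustration only.
BYPASS_DEVICES = {"infiniband"}  # devices assumed to be attached via VMM-bypass I/O
TRANSPORT_FOR_DEVICE = {"infiniband": "openib", "ethernet": "tcp"}


@dataclass
class MigrationPlan:
    detach_before: bool   # requirement 1: detach a bypass device only for migration
    coordinate: bool      # requirement 2: coordinate all VMs before migrating
    transport_after: str  # requirement 3: transport for the destination device


def plan_migration(src_device: str, dst_device: str) -> MigrationPlan:
    """Derive the steps for moving a VM from src_device to dst_device."""
    if dst_device not in TRANSPORT_FOR_DEVICE:
        raise ValueError(f"unknown destination device: {dst_device}")
    return MigrationPlan(
        detach_before=src_device in BYPASS_DEVICES,
        coordinate=True,
        transport_after=TRANSPORT_FOR_DEVICE[dst_device],
    )


if __name__ == "__main__":
    print(plan_migration("infiniband", "ethernet"))  # fallback direction
    print(plan_migration("ethernet", "infiniband"))  # recovery direction
```

The point of the sketch is that the bypass device stays attached during normal operation, so its cost is paid only at migration time.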

Page 10: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

SymVirt: Symbiotic Virtualization

Existing VM migration (black-box approach)
Pro: portability

[Diagram labels: VM, Application, VMM, NIC; Migration, Global coordination, Device setup.]

VM migration w/ SymVirt (gray-box approach)
Pro: performance

[Diagram labels: VM, Application, VMM, NIC, VMM-bypass I/O; Migration, Global coordination, Device setup; Cooperation.]

Page 11: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Ninja migration

[Diagram: three stacks, each labelled Application, MPI system, VM, VMM, and NIC. Ninja migration covers Migration, Global coordination, and Device setup.]

In conjunction with VM migration, the MPI system is in charge of:

•  global coordination among MPI processes

•  changing a transport protocol
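The global coordination step can be pictured as a barrier: every MPI process reaches a quiescent point before the VMs are migrated, and all of them resume afterwards. The sketch below uses Python threads in place of MPI processes. `coordinate_and_migrate` and the event names are hypothetical, and the code does not reproduce the Open MPI coordination protocol.

```python
import threading


def coordinate_and_migrate(num_ranks: int) -> list:
    """Run num_ranks worker threads that pause at a barrier around a migration step."""
    events = []
    lock = threading.Lock()

    def migrate():
        # Called exactly once, by one thread, after every rank has arrived
        # at the barrier and before any rank is released.
        with lock:
            events.append("migrate")

    barrier = threading.Barrier(num_ranks, action=migrate)

    def rank_main(rank: int):
        with lock:
            events.append(f"ready:{rank}")    # rank has quiesced its communication
        barrier.wait()                        # global coordination point
        with lock:
            events.append(f"resumed:{rank}")  # rank reconnects and continues

    threads = [threading.Thread(target=rank_main, args=(r,)) for r in range(num_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    print(coordinate_and_migrate(4))
```

The order of the ranks within the "ready" and "resumed" groups varies between runs, but the migration step always falls between the two groups.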

Page 12: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Implementation

[Diagram labels: Application, MPI runtime, SymVirt coordinator (SELF component), SymVirt controller/agent; Guest OS mode, VMM mode. Sequence labels in order of appearance: confirm, detach, migration, re-attach, confirm, linkup. Phase labels: Global coordination, Device setup, Migration.]

•  No modification to either the MPI system or the applications

•  Open MPI user-level checkpoint/restart framework (SELF)

–  Global coordination protocol

–  Re-establishes connections among MPI processes after migration

Page 13: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Outline

•  Introduction

•  Ninja migration: interconnect-transparent migration

•  Experiment

•  Conclusion

Page 14: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Experiment

•  The overhead of Ninja migration

–  We used 8 VMs on a cluster.

–  We migrated VMs once during a benchmark execution.

•  Fallback and recovery migration between an Infiniband cluster and an Ethernet cluster

–  Infiniband: VMM-bypass I/O (PCI passthrough)

–  Ethernet: Para-virtualized I/O (virtio_net)

•  Two benchmark programs written in MPI

–  memtest: a simple memory intensive benchmark

–  NAS Parallel Benchmarks (NPB) version 3.3.1

Page 15: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Experimental Setting

We used a 16 node Infiniband cluster.

Blade server      Dell PowerEdge M610
CPU               Intel quad-core Xeon E5540 2.53GHz x2
Chipset           Intel 5520
Memory            48 GB DDR3
InfiniBand        Mellanox ConnectX (MT26428)
10 GbE            Broadcom NetXtreme II (BMC57711)

Blade switch
InfiniBand        Mellanox M3601Q (QDR 16 ports)
10GbE             Dell M8024

Host machine environment
OS                Debian 7.0
Linux kernel      3.2.18
QEMU/KVM          1.1-rc3
MPI               Open MPI 1.6
OFED              1.5.4.1
Compiler          gcc/gfortran 4.4.6

VM environment
VCPU              8
Memory            20 GB

Page 16: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Result: memtest

•  The overhead of Ninja migration

–  The migration time depends on the memory footprint.

–  Both hotplug and link-up times are almost constant.

[Figure: stacked bars of execution time [seconds] (axis 0 to 100) against memory footprint (2GB, 4GB, 8GB, 16GB), with series migration, hotplug, and linkup. Data labels, one row per series in order of appearance:
28.5, 28.5, 28.5, 28.6
14.6, 13.5, 12.5, 11.3
35.9, 38.7, 44.2, 53.7]
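The height of each stacked bar is the sum of its three components, so it can be recomputed from the data labels. The sketch below only adds the values column by column. The transcript does not say which row of labels belongs to which legend entry, so the rows are left unnamed, and `total_time` is a hypothetical helper.

```python
FOOTPRINTS = ["2GB", "4GB", "8GB", "16GB"]

# Data labels as they appear in the transcript, one row per series.
# Which row is migration, hotplug, or linkup is not recoverable from the text.
SERIES_ROWS = [
    [28.5, 28.5, 28.5, 28.6],
    [14.6, 13.5, 12.5, 11.3],
    [35.9, 38.7, 44.2, 53.7],
]


def total_time(rows):
    """Sum the per-series values column by column, rounded to 0.1 s."""
    return [round(sum(column), 1) for column in zip(*rows)]


if __name__ == "__main__":
    for footprint, total in zip(FOOTPRINTS, total_time(SERIES_ROWS)):
        print(f"{footprint}: {total} s")
```

The sums come to 79.0, 80.7, 85.2, and 93.6 seconds for 2GB, 4GB, 8GB, and 16GB, so the total grows with the memory footprint even though two of the three rows stay flat or shrink.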

Page 17: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Result: link-up time

src. device -> dest. device     hotplug    link-up    [seconds]
Infiniband -> Infiniband        3.88       29.9
Ethernet -> Infiniband          1.15       29.8
Ethernet -> Ethernet            0.13       0.00
Infiniband -> Ethernet          2.80       0.00

•  Focus on the link-up time.

–  Note: the source and the destination are the same node.

•  If the destination has an Infiniband device, the link-up time is not a negligible overhead.
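The observation about the destination device can be checked directly against the four measured rows. The sketch below is a minimal illustration; `device_overhead`, `linkup_dominated`, and the 1-second threshold are assumptions introduced here, while the numbers are the ones reported on the slide.

```python
# (source device, destination device) -> (hotplug [s], link-up [s]), as reported on the slide.
MEASURED = {
    ("Infiniband", "Infiniband"): (3.88, 29.9),
    ("Ethernet", "Infiniband"): (1.15, 29.8),
    ("Ethernet", "Ethernet"): (0.13, 0.00),
    ("Infiniband", "Ethernet"): (2.80, 0.00),
}


def device_overhead(src: str, dst: str) -> float:
    """Hotplug plus link-up time for one source/destination pair, in seconds."""
    hotplug, linkup = MEASURED[(src, dst)]
    return round(hotplug + linkup, 2)


def linkup_dominated(threshold: float = 1.0):
    """Pairs whose link-up time exceeds the (arbitrary) threshold in seconds."""
    return sorted(pair for pair, (_, linkup) in MEASURED.items() if linkup > threshold)


if __name__ == "__main__":
    for pair in MEASURED:
        print(pair, device_overhead(*pair))
    print(linkup_dominated())
```

Only the two pairs with an Infiniband destination pass the threshold, which matches the slide's remark that the link-up time matters when the destination has an Infiniband device.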

Page 18: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Result: NPB (64 proc., Class D)

[Figure: stacked bars of execution time [seconds] (axis 0 to 1200) for BT, CG, FT, and LU, each with a baseline bar and a migration bar; series migration, hotplug, linkup, and application. Overhead annotations in order of appearance: +8%, +14%, +37%, +11%.]

Transferred Memory Size during VM Migration [MB]

BT       CG       FT       LU
4417     3394     15678    2348

There is no overhead during normal operations.

The overhead is proportional to the memory footprint.
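One way to see why the overhead tracks the memory footprint is a back-of-envelope lower bound on transfer time: size divided by line rate. The sketch below is not a measurement. The 10 Gbit/s link rate and decimal megabytes are assumptions made for the example, and `min_transfer_seconds` is a hypothetical helper. Only the transferred sizes come from the slide.

```python
# Transferred memory size during VM migration [MB], as reported on the slide.
TRANSFERRED_MB = {"BT": 4417, "CG": 3394, "FT": 15678, "LU": 2348}


def min_transfer_seconds(size_mb: float, link_gbps: float = 10.0) -> float:
    """Lower bound on transfer time: size / line rate, ignoring protocol overhead.

    link_gbps is an assumed link rate; 1 MB is taken as 10**6 bytes.
    """
    megabytes_per_second = link_gbps * 1000 / 8  # e.g. 10 Gbit/s -> 1250 MB/s
    return size_mb / megabytes_per_second


if __name__ == "__main__":
    for name, size_mb in sorted(TRANSFERRED_MB.items(), key=lambda item: item[1]):
        print(f"{name}: {size_mb} MB, at least {min_transfer_seconds(size_mb):.1f} s")
```

Whatever link rate is assumed, the bound scales linearly with the transferred size, so FT is expected to pay the largest migration cost and LU the smallest.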

Page 19: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Fallback/Recovery migration: memtest

[Figure, two panels: execution time [seconds] per step (steps 1 to 31), split into Overhead and Application. Panel titles in order: Total 4 proc. (1 proc. / VM) and Total 32 proc. (8 proc. / VM); axes 0 to 100 and 0 to 200. Both panels are annotated with the phases 4 hosts (IB), 2 hosts (TCP), 4 hosts (IB), 4 hosts (TCP).]

Page 20: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Outline

•  Introduction

•  Ninja migration: interconnect-transparent migration

•  Experiment

•  Conclusion

Page 21: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Related Work

•  Heterogeneous VM migration

–  Vagrant supports live migration across heterogeneous VMMs. [P. Liu '08]

–  Ninja migration provides interconnect-transparent migration.

•  VM migration with VMM-bypass I/O devices

–  Driver-level: shadow driver [A. Kadav '09], Nomad [W. Huang '07]

–  Runtime-level: Ninja migration

Page 22: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Conclusion

•  We propose an interconnect-transparent migration mechanism to migrate a bunch of VMs between heterogeneous data centers.

•  We demonstrate the implementation called Ninja migration.

–  VMs can migrate between an IB cluster and an Ethernet cluster without restarting an application.

–  It has no performance overhead during normal operations.

Page 23: Ninja Migration: An Interconnect transparent Migration for Heterogeneous Data Centers

Future work

•  Demonstrate the scalability.

•  Investigate the very long link-up time of Infiniband.

•  Design a runtime-agnostic (MPI free) implementation.

This work was partly supported by JSPS KAKENHI Grant Number 24700040.