How to Build a Mesos Scheduler (So Simple Even a Monkey Can Follow)
About me
Name: Wally (ワリ)
Twitter: https://twitter.com/wallyqs
Github: https://github.com/wallyqs
From Mexico :)
Interests: infrastructure, distributed systems, next-generation PaaS
I like fast deploys with high scalability & availability
Literate programming: Org mode heavy user
Org mode activity: Org mode Ruby parser
Used at GitHub for rendering .org files! Added syntax highlighting support and many other improvements/bug fixes
Agenda
- Why Mesos? Why implement our own Scheduler?
- What does a Mesos Scheduler do?
- Communication flow between components
- Basic scheduler implementation styles
- Examples
Full Stack PaaS means a fork is almost unavoidable
- Hard to keep following the community: priorities mismatch
- Platform too tightly coupled: can only deploy web workloads
- Conway's Law
- Etc…
A set of APIs instead of a Full Stack approach
- We can implement a scheduler with exactly the logic we need
- No vendor lock-in
- No losing the ability to follow the OSS community because of a fork
- No roadmap mismatch issue
Basic components of a Mesos cluster
- Some Mesos Masters
- Many Mesos Slaves
- Schedulers (also called frameworks)
- Executors
Example:
- Master running at 192.168.0.7:5050
- Slave running at 192.168.0.7:5051
- Scheduler running at 192.168.0.7:59508
- Executor running at 192.168.0.7:58006
Discovery between Master and Slaves: Slaves announce themselves to the Master
Master pings slave:

POST /slave(1)/PING HTTP/1.0
User-Agent: libprocess/slave-observer(1)@192.168.0.7:5050
Connection: Keep-Alive
Transfer-Encoding: chunked

Slave pongs back:

POST /slave-observer(1)/PONG HTTP/1.0
User-Agent: libprocess/slave(1)@192.168.0.7:5051
Connection: Keep-Alive
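Under the hood these are plain HTTP POSTs between libprocess endpoints: the sender's PID travels in the User-Agent header and the URL path names the receiving process and message. A minimal sketch of constructing such a request with Go's stdlib (the address and PIDs are the example values from this deck, and the header layout is inferred from the captures above, not from the libprocess spec):

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// buildPing constructs a libprocess-style PING request like the one
// the Master sends to a Slave above. The sender PID is encoded as
// "libprocess/<pid>" in the User-Agent header.
func buildPing() (*http.Request, error) {
	req, err := http.NewRequest("POST",
		"http://192.168.0.7:5051/slave(1)/PING", strings.NewReader(""))
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", "libprocess/slave-observer(1)@192.168.0.7:5050")
	req.Header.Set("Connection", "Keep-Alive")
	return req, nil
}

func main() {
	req, err := buildPing()
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
	fmt.Println("sender:", req.Header.Get("User-Agent"))
}
```

Sending it with `http.DefaultClient.Do(req)` would only succeed against a real slave, of course; the point is that no special wire protocol is involved.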
Scheduler starts and registers with the Master

POST /master/mesos.internal.RegisterFrameworkMessage HTTP/1.1
Host: 192.168.0.7:5050
User-Agent: Go 1.1 package http
Content-Length: 44
Connection: Keep-Alive
Content-Type: application/x-protobuf
Libprocess-From: scheduler(1)@192.168.0.7:59508
Accept-Encoding: gzip
Master ACKs the registration to the Scheduler

POST /scheduler(1)/mesos.internal.FrameworkRegisteredMessage HTTP/1.0
User-Agent: libprocess/master@192.168.0.7:5050
Connection: Keep-Alive
Transfer-Encoding: chunked
Then the Master starts offering resources to the Scheduler

POST /scheduler(1)/mesos.internal.ResourceOffersMessage HTTP/1.0
User-Agent: libprocess/master@192.168.0.7:5050
Connection: Keep-Alive
Transfer-Encoding: chunked

cpu: 2
slave(1)@192.168.0.7:5051
Scheduler accumulates offers and launches tasks via the Master

The Master will give a Slave's resources to run the job.

POST /master/mesos.internal.LaunchTasksMessage HTTP/1.1
Host: 192.168.0.7:5050
User-Agent: Go 1.1 package http
Content-Length: 260
Connection: Keep-Alive
Content-Type: application/x-protobuf
Libprocess-From: scheduler(1)@192.168.0.7:59508
Accept-Encoding: gzip
Master submits the job from the Scheduler to the Slave

POST /slave(1)/mesos.internal.RunTaskMessage HTTP/1.0
User-Agent: libprocess/master@192.168.0.7:5050
Connection: Keep-Alive
Transfer-Encoding: chunked
Executor is started and registers back with the Slave

POST /slave(1)/mesos.internal.RegisterExecutorMessage HTTP/1.0
User-Agent: libprocess/executor(1)@192.168.0.7:58006
Connection: Keep-Alive
Transfer-Encoding: chunked
Slave ACKs to the Executor that it is aware of it

POST /executor(1)/mesos.internal.ExecutorRegisteredMessage HTTP/1.0
User-Agent: libprocess/slave(1)@192.168.0.7:5051
Connection: Keep-Alive
Transfer-Encoding: chunked

Then the Slave submits the job to the Executor

POST /executor(1)/mesos.internal.RunTaskMessage HTTP/1.0
User-Agent: libprocess/slave(1)@192.168.0.7:5051
Connection: Keep-Alive
Transfer-Encoding: chunked
The Executor continuously reports task status to the Slave

POST /slave(1)/mesos.internal.StatusUpdateMessage HTTP/1.0
User-Agent: libprocess/executor(1)@192.168.0.7:58006
Connection: Keep-Alive
Transfer-Encoding: chunked

Then the Slave forwards the status to the Master

POST /master/mesos.internal.StatusUpdateMessage HTTP/1.0
User-Agent: libprocess/slave(1)@192.168.0.7:5051
Connection: Keep-Alive
Transfer-Encoding: chunked
Responsibilities of the Scheduler and Executor

Scheduler:
- Receive resource offers and launch tasks
- Process status updates about the tasks

Executor:
- Run tasks
- Update the status of the tasks
Basic Example: CommandScheduler
Since even a monkey has to be able to follow along, we'll use Go instead of Scala. :P

A super-simple CommandScheduler (like mesos-exec in C++, but in Go). The default Mesos Executor's functionality is enough:
https://github.com/mesos/mesos-go
/usr/local/libexec/mesos/mesos-executor
Usage:
go run command_scheduler.go -address=192.168.0.7:5050 -task-count=2 -cmd="while true; do echo helloworld; done"
Imports

package main

import (
	"flag"
	"fmt"
	"net"
	"strconv"

	"github.com/gogo/protobuf/proto"
	mesos "github.com/mesos/mesos-go/mesosproto"
	util "github.com/mesos/mesos-go/mesosutil"
	sched "github.com/mesos/mesos-go/scheduler"
)
CommandScheduler type: implement the Scheduler interface

type CommandScheduler struct {
	tasksLaunched int
	tasksFinished int
	totalTasks    int
}
The Scheduler interface: implementing ResourceOffers and StatusUpdate is enough; the other methods can be left pending for now.

- ResourceOffers
- StatusUpdate
- Registered
- Reregistered
- Disconnected
- OfferRescinded
- FrameworkMessage
- SlaveLost
- ExecutorLost
- Error
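Leaving the rest of the callbacks as no-ops looks like this. The sketch below is simplified and standalone: the signatures are reduced to strings, whereas the real mesos-go methods take sched.SchedulerDriver and protobuf arguments such as *mesos.OfferID.

```go
package main

import "fmt"

// StubScheduler leaves every callback we don't care about yet as a
// no-op. Signatures are simplified here; in mesos-go each method
// receives the driver plus protobuf-typed arguments.
type StubScheduler struct {
	lastError string
}

func (s *StubScheduler) Registered(frameworkID string)  { fmt.Println("registered:", frameworkID) }
func (s *StubScheduler) Reregistered()                  {}
func (s *StubScheduler) Disconnected()                  {}
func (s *StubScheduler) OfferRescinded(offerID string)  {}
func (s *StubScheduler) FrameworkMessage(msg string)    {}
func (s *StubScheduler) SlaveLost(slaveID string)       {}
func (s *StubScheduler) ExecutorLost(executorID string) {}
func (s *StubScheduler) Error(err string)               { s.lastError = err }

func main() {
	s := &StubScheduler{}
	s.Registered("framework-1")
	s.Error("something went wrong")
	fmt.Println("last error:", s.lastError)
}
```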
Implementing ResourceOffers: the Master will keep sending the Scheduler offers. Some of the important information contained within an offer:
- Resources available: disk, cpu, mem
- Id of the Slave that holds those resources
Code

func (sched *CommandScheduler) ResourceOffers(driver sched.SchedulerDriver, offers []*mesos.Offer) {
	for _, offer := range offers {
		cpuResources := util.FilterResources(offer.Resources, func(res *mesos.Resource) bool {
			return res.GetName() == "cpus"
		})
		cpus := 0.0
		for _, res := range cpuResources {
			cpus += res.GetScalar().GetValue()
		}

		memResources := util.FilterResources(offer.Resources, func(res *mesos.Resource) bool {
			return res.GetName() == "mem"
		})
		mems := 0.0
		for _, res := range memResources {
			mems += res.GetScalar().GetValue()
		}

		fmt.Println("Received Offer <", offer.Id.GetValue(), "> with cpus=", cpus, " mem=", mems)
		remainingCpus := cpus
		remainingMems := mems
Code

Point #0: the Scheduler is responsible for using the resources correctly
Point #1: the TaskId needs to be unique somehow
Point #2: for a task to run, it needs the SlaveId contained in the offer

		var tasks []*mesos.TaskInfo
		for sched.tasksLaunched < sched.totalTasks &&
			CPUS_PER_TASK <= remainingCpus && // Point #0
			MEM_PER_TASK <= remainingMems {

			sched.tasksLaunched++ // Point #1
			taskId := &mesos.TaskID{
				Value: proto.String(strconv.Itoa(sched.tasksLaunched)),
			}
			task := &mesos.TaskInfo{
				Name:    proto.String("go-cmd-task-" + taskId.GetValue()),
				TaskId:  taskId,
				SlaveId: offer.SlaveId, // Point #2
				Resources: []*mesos.Resource{
					util.NewScalarResource("cpus", CPUS_PER_TASK),
					util.NewScalarResource("mem", MEM_PER_TASK),
				},
				Command: &mesos.CommandInfo{
					Value: proto.String(*jobCmd),
				},
			}
			fmt.Printf("Prepared task: %s with offer %s for launch\n", task.GetName(), offer.Id.GetValue())
			tasks = append(tasks, task)
			remainingCpus -= CPUS_PER_TASK
			remainingMems -= MEM_PER_TASK
		}
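The Point #0 accounting can be exercised on its own. This standalone helper (a sketch independent of mesos-go; the constant values are assumptions, since the real example defines its own CPUS_PER_TASK and MEM_PER_TASK) mirrors the loop above: count how many tasks fit in an offer, subtracting the per-task resources each time.

```go
package main

import "fmt"

const (
	CPUS_PER_TASK = 1.0   // assumed per-task sizes; the real
	MEM_PER_TASK  = 128.0 // scheduler defines its own constants
)

// packTasks mirrors the offer loop above: keep launching tasks while
// the offer still has room, decrementing the remaining resources.
func packTasks(cpus, mems float64, wanted int) int {
	launched := 0
	remainingCpus, remainingMems := cpus, mems
	for launched < wanted &&
		CPUS_PER_TASK <= remainingCpus &&
		MEM_PER_TASK <= remainingMems {
		launched++
		remainingCpus -= CPUS_PER_TASK
		remainingMems -= MEM_PER_TASK
	}
	return launched
}

func main() {
	// An offer with cpus=4 mem=2812 (as in the sample run later)
	// fits all 4 requested tasks.
	fmt.Println(packTasks(4, 2812, 4)) // 4
	// With only 2 cpus offered, just 2 of the 4 tasks fit; the rest
	// wait for the next offer.
	fmt.Println(packTasks(2, 2812, 4)) // 2
}
```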
Code

Point #0: use the taskId (status.TaskId.GetValue()) to decide what to do
Point #1: in this example, the scheduler stops if one task dies.

func (sched *CommandScheduler) StatusUpdate(driver sched.SchedulerDriver, status *mesos.TaskStatus) {
	// Point #0: status.TaskId.GetValue()
	fmt.Println("Status update: task", status.TaskId.GetValue(),
		" is in state ", status.State.Enum().String())

	if status.GetState() == mesos.TaskState_TASK_FINISHED {
		sched.tasksFinished++
	}

	if sched.tasksFinished >= sched.totalTasks {
		fmt.Println("Total tasks completed, stopping framework.")
		driver.Stop(false)
	}

	if status.GetState() == mesos.TaskState_TASK_LOST ||
		status.GetState() == mesos.TaskState_TASK_KILLED || // Point #1
		status.GetState() == mesos.TaskState_TASK_FAILED {
		fmt.Println(
			"Aborting because task", status.TaskId.GetValue(),
			"is in unexpected state", status.State.String(),
			"with message", status.GetMessage(),
		)
		driver.Abort()
	}
}
Finally, main: the only thing we need to do is pass the scheduler to the driver configuration.

func main() {
	fwinfo := &mesos.FrameworkInfo{
		User: proto.String(""),
		Name: proto.String("Go Command Scheduler"),
	}

	bindingAddress := parseIP(*address)

	config := sched.DriverConfig{
		Scheduler: &CommandScheduler{
			tasksLaunched: 0,
			tasksFinished: 0,
			totalTasks:    *taskCount,
		},
		Framework:      fwinfo,
		Master:         *master,
		BindingAddress: bindingAddress,
	}
	driver, err := sched.NewMesosSchedulerDriver(config)
	if err != nil {
		fmt.Println("Unable to create a SchedulerDriver:", err)
		return
	}
	driver.Run()
}
Done!

go run examples/command_scheduler.go -address="192.168.0.7" -master="192.168.0.7:5050" -logtostderr=true -task-count=4 -cmd="ruby -e '10.times { puts :hellooooooo; sleep 1}'"

Initializing the Command Scheduler...
Framework Registered with Master &MasterInfo{Id:*20150225-174751-117483712-5050-13334,Ip:*117483712,Port:*5050,Pid:*master@192.168.0.7:5050,Hostname:*192.168.0.7,XXX_unrecognized:[],}
Received Offer < 20150225-174751-117483712-5050-13334-O0 > with cpus= 4 mem= 2812
Prepared task: go-cmd-task-1 with offer 20150225-174751-117483712-5050-13334-O0 for launch
Prepared task: go-cmd-task-2 with offer 20150225-174751-117483712-5050-13334-O0 for launch
Prepared task: go-cmd-task-3 with offer 20150225-174751-117483712-5050-13334-O0 for launch
Prepared task: go-cmd-task-4 with offer 20150225-174751-117483712-5050-13334-O0 for launch
Launching 4 tasks for offer 20150225-174751-117483712-5050-13334-O0
Status update: task 1 is in state TASK_RUNNING
Status update: task 3 is in state TASK_RUNNING
Status update: task 2 is in state TASK_RUNNING
Status update: task 4 is in state TASK_RUNNING
What about containers? Since Mesos 0.20, ContainerInfo can be used too. Example:
task := &mesos.TaskInfo{
	Name:    proto.String("go-cmd-task-" + taskId.GetValue()),
	TaskId:  taskId,
	SlaveId: offer.SlaveId,
	// Executor: sched.executor,
	Resources: []*mesos.Resource{
		util.NewScalarResource("cpus", CPUS_PER_TASK),
		util.NewScalarResource("mem", MEM_PER_TASK),
	},
	Command: &mesos.CommandInfo{
		Value: proto.String(*jobCmd),
	},
	Container: &mesos.ContainerInfo{ // Point
		Type: mesos.ContainerInfo_DOCKER.Enum(),
		Docker: &mesos.ContainerInfo_DockerInfo{
			Image: proto.String(*dockerImage),
			// Network: mesos.ContainerInfo_DockerInfo_BRIDGE.Enum(),
			// PortMappings: []*ContainerInfo_DockerInfo_PortMapping{},
		},
	},
}
Example:

sudo docker ps

CONTAINER ID  IMAGE         COMMAND               CREATED         STATUS         PORTS  NAMES
1a8b3c964c3e  redis:latest  "\"/bin/sh -c redis-  17 minutes ago  Up 17 minutes         mesos-88de0870-b613-4bda-9ed4-30995834ccab
First, the Scheduler needs to keep track of all the tasks' info too

type FaultTolerantCommandScheduler struct {
	tasksLaunched int
	tasksFinished int
	totalTasks    int
	tasksList     []*mesos.TaskInfo
}
ResourceOffers handler: for a task to be valid, it needs a SlaveID. In ResourceOffers we only launch the ones without a SlaveID.
var tasksToLaunch []*mesos.TaskInfo

for _, task := range sched.tasksList {
	// Check whether it is already running (has a SlaveID) or not
	if task.SlaveId == nil {
		fmt.Println("[OFFER ] ", offer.SlaveId, "will be used for task:", task)
		task.SlaveId = offer.SlaveId
		remainingCpus -= CPUS_PER_TASK
		remainingMems -= MEM_PER_TASK
		tasksToLaunch = append(tasksToLaunch, task)
	}
}

if len(tasksToLaunch) > 0 {
	fmt.Println("[OFFER] Launching ", len(tasksToLaunch), "tasks for offer", offer.Id.GetValue())
	driver.LaunchTasks([]*mesos.OfferID{offer.Id}, tasksToLaunch,
		&mesos.Filters{RefuseSeconds: proto.Float64(1)})
}
StatusUpdate handler: when a StatusUpdate arrives we can handle it; the task will then be rescheduled when the next ResourceOffers comes in.
if status.GetState() == mesos.TaskState_TASK_KILLED {
	taskId, _ := strconv.Atoi(*status.GetTaskId().Value)
	fmt.Println("[STATUS] TASK_KILLED: ", taskId)
	sched.tasksList[taskId-1].SlaveId = nil
}

if status.GetState() == mesos.TaskState_TASK_FAILED {
	taskId, _ := strconv.Atoi(*status.GetTaskId().Value)
	fmt.Println("[STATUS] TASK_FAILED: ", taskId)
	sched.tasksList[taskId-1].SlaveId = nil
}
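The SlaveId bookkeeping above is the whole fault-tolerance trick: a task with no SlaveId is pending, and clearing the SlaveId on failure makes it pending again. A standalone sketch of that state machine, using a simplified local Task type (not the real mesos.TaskInfo):

```go
package main

import "fmt"

// Task is a simplified stand-in for mesos.TaskInfo: an empty SlaveId
// means the task has not been scheduled yet (or was cleared after a
// failure and should be rescheduled).
type Task struct {
	Name    string
	SlaveId string
}

// pending returns the tasks that the next ResourceOffers round
// should launch, i.e. those without a SlaveId.
func pending(tasks []*Task) []*Task {
	var out []*Task
	for _, t := range tasks {
		if t.SlaveId == "" {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	tasks := []*Task{{Name: "task-1"}, {Name: "task-2"}}

	// First offer: everything pending gets a slave assigned.
	for _, t := range pending(tasks) {
		t.SlaveId = "slave(1)@192.168.0.7:5051"
	}
	fmt.Println("pending after launch:", len(pending(tasks))) // 0

	// A TASK_FAILED update clears the SlaveId, so the task gets
	// picked up again on the next offer.
	tasks[0].SlaveId = ""
	fmt.Println("pending after failure:", len(pending(tasks))) // 1
}
```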
Conclusions
- Not so complicated to create your own custom schedulers
- Easy to extend and wrap around HTTP APIs to build the desired logic
- Good pluggable solution!