Nimrod & NetSolve
Sathish Vadhiyar

Nimrod
Sources/Credits: Nimrod web site & papers

Background
• For execution of parametric experiments across distributed computers
• User writes a plan file that declares the parameters
• Parametric studies: a range of different simulations calculated using the same program
• Need for a Grid
    • 3 variables with 4 values each give 4^3 = 64 experiments
    • Each experiment takes several hours
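
The experiment count follows from taking every combination of parameter values. A minimal Python sketch; the parameter names and values are hypothetical, since the slide gives only the counts:

```python
from itertools import product

# Hypothetical parameters: the slide only states 3 variables with 4 values each.
parameters = {
    "pressure": [1, 2, 3, 4],
    "temperature": [10, 20, 30, 40],
    "angle": [0, 15, 30, 45],
}

# One experiment per combination of values (the Cartesian product).
experiments = [dict(zip(parameters, combo)) for combo in product(*parameters.values())]

print(len(experiments))  # 4 * 4 * 4 = 64
```

The runs are independent of one another, so they can be sent to different machines, which is what makes a grid attractive when each run takes hours.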

Sample plan file

    parameter iseed integer range from 100 to 4000 step 100;
    parameter thick label "BUC thickness" float range from 1.1 to 2.0 step 0.1;
    parameter jseed integer compute thick*1000;

    task nodestart
        copy ccal.$OS node:./ccal
        copy dummy node:.
        copy ccal.dat node:.
        copy skel.inp node:.
    endtask

    task main
        node:substitute skel.inp ccal.inp
        node:execute ./ccal
        copy node:ccal.op ccalout.$jobname
    endtask
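
To see how many jobs this plan describes, the parameter declarations can be expanded by hand. The Python sketch below is not part of Nimrod; it assumes that `range from A to B step S` includes both endpoints and that `jseed` is derived from `thick` rather than swept independently.

```python
# Assumption: "range from A to B step S" includes both endpoints.
iseed_values = list(range(100, 4000 + 1, 100))

# Integer tenths avoid floating-point drift when stepping by 0.1.
thick_values = [tenths / 10 for tenths in range(11, 20 + 1)]

# jseed is computed from thick, so it adds no extra combinations.
jobs = [
    {"iseed": iseed, "thick": thick, "jseed": round(thick * 1000)}
    for iseed in iseed_values
    for thick in thick_values
]

print(len(iseed_values), len(thick_values), len(jobs))  # 40 10 400
```

Under these assumptions the plan yields 40 x 10 = 400 jobs, which matches the 400 tasks quoted for the ionization chamber calibration experiment.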

Phases of Computational Experiment
1. Experiment pre-processing, when data is set up for the experiment;
2. Execution pre-processing, when data is prepared for a particular execution;
3. Execution, when the program is executed for a given set of parameter values;
4. Execution post-processing, when data from a particular execution is reduced;
5. Experiment post-processing, when results are processed, for example by running data interpretation or visualization software.
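
The five phases nest naturally: phases 1 and 5 run once per experiment, while phases 2 to 4 repeat for every set of parameter values. A small Python sketch of that control flow; the function and argument names are illustrative and do not come from Nimrod:

```python
def run_experiment(parameter_sets, setup, prepare, execute, reduce_output, summarize):
    """Run the five phases in order; the three middle phases repeat per execution."""
    shared = setup()                            # 1. experiment pre-processing
    reduced = []
    for params in parameter_sets:
        job_input = prepare(shared, params)     # 2. execution pre-processing
        raw = execute(job_input)                # 3. execution
        reduced.append(reduce_output(raw))      # 4. execution post-processing
    return summarize(reduced)                   # 5. experiment post-processing


# Toy usage with stand-in callables instead of a real simulation.
total = run_experiment(
    parameter_sets=[1, 2, 3],
    setup=lambda: 10,
    prepare=lambda shared, p: shared + p,
    execute=lambda x: x * x,
    reduce_output=lambda raw: raw,
    summarize=sum,
)
print(total)  # 121 + 144 + 169 = 434
```

In a distributed run the loop body is what gets shipped to remote nodes; the sequential loop here only shows the ordering.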

Illustration

Nimrod Architecture

Architecture

Architecture
• Components
    • Client
    • Parametric engine
    • Scheduler
    • Dispatcher
    • Job wrapper

Components
• Parametric engine
    • Persistent job service
    • Interacts with the client, schedule advisor and dispatcher
    • Takes a declarative plan from the user
• Scheduler
    • Objectives: meet deadlines, minimize cost
• Dispatcher
    • Starts a remote component called the job wrapper
    • Reports task status to the parametric engine
• Job wrapper
    • Responsible for staging in, execution and staging out

Cost Model
• Cost / priority matrix defined from specifications given by resource providers
• Nimrod/G scheduler performs discovery and allocation of resources based on specified execution times and cost constraints
• Cost of an experiment varies depending on the load

Scheduling Heuristic
1. Discovery
    1. Initial filtering of resources based on cost specifications
    2. Identification of lowest-cost set of resources able to meet deadlines
2. Allocation
    1. Jobs allocated from the queue to the resources identified in step 1
3. Monitoring
    1. Completion time of jobs monitored
    2. Execution rate established
4. Refinement
    1. Execution rate used to update expected completion times of remaining jobs
    2. Revisit steps 1 and 2
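
The loop above can be sketched in Python. This is a simplified illustration, not the Nimrod/G scheduler: the `Resource` fields, the per-job pricing and the greedy cheapest-first selection are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cost_per_job: float    # hypothetical price charged per job
    jobs_per_hour: float   # current estimate of the execution rate


def select_resources(resources, jobs_left, hours_left, max_cost_per_job):
    """Discovery: filter by cost, then add the cheapest resources until their
    combined estimated capacity covers the remaining jobs before the deadline."""
    affordable = sorted(
        (r for r in resources if r.cost_per_job <= max_cost_per_job),
        key=lambda r: r.cost_per_job,
    )
    chosen, capacity = [], 0.0
    for r in affordable:
        if capacity >= jobs_left:
            break
        chosen.append(r)
        capacity += r.jobs_per_hour * hours_left
    return chosen, capacity >= jobs_left


def refine_rate(resource, jobs_done, hours_elapsed):
    """Monitoring and refinement: replace the estimated rate with the measured one."""
    if hours_elapsed > 0:
        resource.jobs_per_hour = jobs_done / hours_elapsed


pool = [
    Resource("a", cost_per_job=1.0, jobs_per_hour=2.0),
    Resource("b", cost_per_job=2.0, jobs_per_hour=4.0),
    Resource("c", cost_per_job=5.0, jobs_per_hour=10.0),
]

# Initial plan: 50 jobs, 10 hours, price cap of 3 per job.
chosen, feasible = select_resources(pool, jobs_left=50, hours_left=10, max_cost_per_job=3.0)
print([r.name for r in chosen], feasible)  # ['a', 'b'] True

# After 2 hours, resource "a" has finished only 1 job, so its rate is revised
# and the selection is repeated for the 41 jobs and 8 hours that remain.
refine_rate(pool[0], jobs_done=1, hours_elapsed=2)
chosen, feasible = select_resources(pool, jobs_left=41, hours_left=8, max_cost_per_job=3.0)
print([r.name for r in chosen], feasible)  # ['a', 'b'] False
```

When the refined estimate shows the deadline can no longer be met, repeating the discovery step with a higher price cap brings a dearer resource into the set. This is why cost can vary with load.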

Experiments

Ionization chamber calibration
• Chamber response to front wall thickness
• ion-pair =
• 400 tasks
• Each model run took about 40 to 140 minutes
• 3 experiments with deadlines of 10, 15 and 20 hours

No. of resources vs time

Cost vs Time (4 slides)

Another Experiment

Experimental setup
• 165 CPU jobs, each 5 minutes in duration
• Deadline: 2 hours
• Budget: 396000
• 2 strategies
    • Optimize time
    • Optimize cost
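
A quick arithmetic check of this setup, as a sketch. It assumes resources that each run one job at a time at the nominal 5 minutes per job, and that the budget is spent per CPU-second; the slide does not state the pricing unit, so the last figure is illustrative only.

```python
import math

jobs = 165
job_minutes = 5
deadline_minutes = 2 * 60
budget = 396000

# Jobs one resource can finish before the deadline if it runs them back to back.
jobs_per_resource = deadline_minutes // job_minutes

# Fewest resources that can finish all jobs within the deadline.
min_resources = math.ceil(jobs / jobs_per_resource)

# Total work, and the average price per CPU-second the budget can sustain
# (assumption: the budget is charged per CPU-second).
cpu_seconds = jobs * job_minutes * 60
max_avg_price = budget / cpu_seconds

print(jobs_per_resource, min_resources, cpu_seconds, max_avg_price)  # 24 7 49500 8.0
```

Under these assumptions a cost-optimizing run needs at least 7 resources to meet the deadline, while a time-optimizing run may use more resources as long as the average price stays within the budget.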

Results

Time Optimization

Cost Optimization

Scheduling
• Adaptive scheduling algorithms
    • Time minimization with a limited budget (etime optimal)
    • Time minimization with an unlimited budget (etime highoptimal)
    • Cost minimization limited by the deadline (ecost optimal)
    • No minimization, limited time and cost (etime + ecost optimal)

Nimrod/O
• Optimization of parameters to minimize an objective function
• Case study: optimize the shape and angle of attack of an airfoil to maximize the lift-to-drag ratio
• Design optimization problem
• Objective function can be non-linear, contain noise, and be continuous or discrete
• No single optimization algorithm gives the best result for every problem
• Nimrod/O supports a range of algorithms

Nimrod/O (contd.)
• Search algorithms
    • P-BFGS
    • Simplex
    • Divide-and-conquer
    • Simulated annealing
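
As an illustration of one of the listed methods, here is a generic simulated annealing loop in Python. It is a textbook-style sketch, not Nimrod/O's implementation: the cooling schedule, step size and toy objective are assumptions, and in a real design study each call to `objective` would be a full simulation run.

```python
import math
import random


def simulated_annealing(objective, start, steps=2000, step_size=0.5, t0=1.0, seed=0):
    """Minimize a 1-D objective; returns the best point seen and its value."""
    rng = random.Random(seed)
    current, current_value = start, objective(start)
    best, best_value = current, current_value
    for k in range(steps):
        temperature = t0 * (1 - k / steps) + 1e-9  # linear cooling, never zero
        candidate = current + rng.uniform(-step_size, step_size)
        value = objective(candidate)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if value <= current_value or rng.random() < math.exp(
            (current_value - value) / temperature
        ):
            current, current_value = candidate, value
            if value < best_value:
                best, best_value = candidate, value
    return best, best_value


# Toy objective with its minimum at x = 3 (stands in for a simulation).
def toy_objective(x):
    return (x - 3.0) ** 2


best_x, best_f = simulated_annealing(toy_objective, start=0.0)
print(best_x, best_f)
```

Accepting some worse moves lets the search escape local minima, which is one reason a noisy or non-linear objective may favour this method over a gradient-based one.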

Plan file modified by Nimrod/O

References
• Abramson, D., Lewis, A., Peachey, T. and Fletcher, C. "An Automatic Design Optimization Tool and its Application to Computational Fluid Dynamics", SuperComputing 2001, Denver, Nov 2001.
• Abramson, D., Sosic, R., Giddy, J. and Cope, M. "The Laboratory Bench: Distributed Computing for Parametised Simulations", 1994 Parallel Computing and Transputers Conference, Wollongong, Nov 1994, pp. 17-27.
• Abramson, D., Sosic, R., Giddy, J. and Hall, B. "Nimrod: A Tool for Performing Parametised Simulations using Distributed Workstations", The 4th IEEE Symposium on High Performance Distributed Computing, Virginia, August 1995.
• Abramson, D., Giddy, J. and Kotler, L. "High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?", International Parallel and Distributed Processing Symposium (IPDPS), pp. 520-528, Cancun, Mexico, May 2000.
• Buyya, R., Abramson, D. and Giddy, J. "Nimrod/G: An Architecture of a Resource Management and Scheduling System in a Global Computational Grid", HPC Asia 2000, May 14-17, 2000, pp. 283-289, Beijing, China.
• Abramson, D., Buyya, R. and Giddy, J. "A Computational Economy for Grid Computing and its Implementation in the Nimrod-G Resource Broker", Future Generation Computer Systems, Volume 18, Issue 8, Oct 2002.
• Buyya, R., Giddy, J. and Abramson, D. "An Evaluation of Economy-based Resource Trading and Scheduling on Computational Power Grids for Parameter Sweep Applications", Workshop on Active Middleware Services (AMS 2000) (in conjunction with the Ninth IEEE International Symposium on High Performance Distributed Computing), Kluwer Academic Press, August 1, 2000, Pittsburgh, USA.

Junk !!

Nimrod Architecture

Components
• Generator
    • Input: plan file
    • Processes the plan file and gives the user choices regarding parameters
    • Output: run file (description of a job)
• Dispatcher
    • Input: run file
    • Stages files to remote resources
    • Runs jobs on remote resources

Nimrod-G Architecture
• Origin
    • Implements scheduling and monitoring
    • Exists for the entire duration of the experiment
    • Responsible for execution of the experiment within specified time and cost constraints
• Client
    • User interacts with the Origin process through the client
    • Multiple clients can connect to a single Origin process and monitor the same experiment

Nimrod Components
• Nimrod Resource Broker (NRB)
    • Origin process spawns the NRB on the remote site
    • Interacts with GRAM
    • Capabilities beyond GRAM, including file staging, creation of jobs and process control

Experiments
• 90-second jobs over 10 simulated queues with different access costs (Q1 = 10, Q2 = 12, etc.)
• 100 jobs, 9000 seconds of work in total
• With 10 queues, the optimal completion time is 900 seconds
• Deadlines: 990, 1980 and 2970 seconds
• Costs: 252000, 171000, 126000
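
The timing figures on this slide can be checked with a few lines of arithmetic. The sketch assumes each queue runs one 90-second job at a time, which the slide implies but does not state; the queue prices are left out because only the first two are given.

```python
import math

jobs = 100
job_seconds = 90
queues = 10
deadlines = (990, 1980, 2970)

# Total work and the best possible completion time with all queues busy.
total_work = jobs * job_seconds          # 9000 seconds
ideal_makespan = total_work // queues    # 900 seconds

# For each deadline: jobs one queue can finish in time, and the fewest
# queues needed to finish all jobs before that deadline.
queues_needed = {}
for deadline in deadlines:
    per_queue = deadline // job_seconds
    queues_needed[deadline] = math.ceil(jobs / per_queue)

print(total_work, ideal_makespan, queues_needed)
# 9000 900 {990: 10, 1980: 5, 2970: 4}
```

A looser deadline lets the scheduler leave the dearer queues idle, so the number of queues needed drops from 10 to 5 to 4 across the three deadlines.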