neos & HTCondor: Optimizing Your World
neos & HTCondor
neos: Network-Enabled Optimization System
http://www.neos-guide.org/content/bar-crawl-optimization
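Mathematical Formulation
Maximize $\sum_{j \in V} b_j y_j - \alpha \sum_{(i,j) \in A} d_{ij} x_{ij}$
Subject to exiting node on tour: $\sum_{j:(i,j) \in A} x_{ij} - y_i = 0, \quad \forall i \in V$
Subject to entering node on tour: $\sum_{i:(i,j) \in A} x_{ij} - y_j = 0, \quad \forall j \in V$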
AMPL Model
set V;
set LINKS := {i in V, j in V: i <> j};
param alpha >= 0;
param d{LINKS} >= 0;
param b{V} >= 0;       # default benefit of visiting a bar
param c{V} >= 0;       # default cost of one drink
param B default 30;    # default maximum budget for drinks
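To make the formulation concrete, here is a minimal Python sketch that evaluates the objective and checks the two tour constraints for one candidate tour. The bar names, benefits, distances, and the value of alpha are made-up illustration data, not values from the neos case study, and the helper names (`tour_to_vars`, `objective`, `is_balanced`) are hypothetical.

```python
# Toy instance: every number below is hypothetical illustration data.
V = ["A", "B", "C", "D"]
b = {"A": 5.0, "B": 3.0, "C": 4.0, "D": 1.0}        # benefit of visiting each bar
LINKS = [(i, j) for i in V for j in V if i != j]     # mirrors the AMPL set LINKS
d = {(i, j): 1.0 + abs(ord(i) - ord(j)) for (i, j) in LINKS}  # link distances
alpha = 0.5                                          # weight on travel distance


def tour_to_vars(tour):
    """Turn an ordered closed tour (at least two bars) into x and y indicators."""
    y = {i: int(i in tour) for i in V}
    x = {link: 0 for link in LINKS}
    for k, i in enumerate(tour):
        j = tour[(k + 1) % len(tour)]  # wrap around to close the tour
        x[(i, j)] = 1
    return x, y


def objective(x, y):
    """Total benefit of visited bars minus alpha times distance travelled."""
    benefit = sum(b[j] * y[j] for j in V)
    travel = sum(d[link] * x[link] for link in LINKS)
    return benefit - alpha * travel


def is_balanced(x, y):
    """Check that each bar is exited and entered exactly y times."""
    exits_ok = all(sum(x[(i, j)] for j in V if j != i) == y[i] for i in V)
    enters_ok = all(sum(x[(i, j)] for i in V if i != j) == y[j] for j in V)
    return exits_ok and enters_ok
```

Calling `tour_to_vars(["A", "B", "C"])` and passing the result to `objective` and `is_balanced` scores the tour A, B, C, A and confirms that every visited bar has one incoming and one outgoing link.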
neos
Browser
Custom
Kestrel (AMPL or GAMS)
Optimization Job
neos
Solver Pool at UW-Madison
Off-Site Solvers: Argonne National Lab, Arizona State University, University of Klagenfurt, University of Minho
Results
Solver Names
AlphaECP, ASA, BARON, BDMLP, BiqMac, BLMVM, bnbs, Bonmin, bpmpd, Cbc, Clp,
concorde, CONDOR, CONOPT, Couenne, CPLEX, csdp, ddsip, DICOPT, Domino, DSDP,
feaspump, FilMINT, filter, filterMPEC, Gurobi, icos, Ipopt, KNITRO, LANCELOT,
L-BFGS-B, LINDOGlobal, LOQO, LRAMBO, MILES, MINLP, MINOS, MINTO, MOSEK,
MUSCOD-II, NLPEC, NMTR, nsips, OOQP, PATH, PATHNLP, PENBMI, PENSDP, PGAPack,
proxy, PSwarm, QSopt_EX, RELAX4, SBB, scip, SD, SDPA, SDPLR, SDPT3, SeDuMi,
SNOPT, SYMPHONY, TRON, XpressMP
Solver Categories
bco, co, cp, go, kestrel, lno, lp, milp, minco, miocp, nco, ndo, sdp, sio,
slp, socp, uco
Solver Inputs
AMPL, C, CPLEX, Fortran, GAMS, jpg, LP, MATLAB_BINARY, MOSEL, MPS, OSIL,
RELAX4, SDPA, SDPLR, SMPS, SPARSE, SPARSE_SDPA, TSP, ZIMPL
Web
XMLRPC
neos Workflow
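As an illustration of the custom XMLRPC path in this workflow, the sketch below builds a job document from a category, a solver name, and an input format, then hands it to the server. The endpoint URL, the `<document>` layout, and the `submitJob` method name reflect my understanding of the public neos XML-RPC interface and should be treated as assumptions to verify against the neos documentation; `build_job_xml` and `submit` are hypothetical helper names.

```python
import xmlrpc.client
from xml.sax.saxutils import escape

NEOS_URL = "https://neos-server.org:3333"  # assumed endpoint; verify before use


def build_job_xml(category, solver, input_method, model, data="", commands=""):
    """Assemble a job document; model text must not contain the CDATA terminator."""
    parts = [
        "<document>",
        f"<category>{escape(category)}</category>",
        f"<solver>{escape(solver)}</solver>",
        f"<inputMethod>{escape(input_method)}</inputMethod>",
        f"<model><![CDATA[{model}]]></model>",
        f"<data><![CDATA[{data}]]></data>",
        f"<commands><![CDATA[{commands}]]></commands>",
        "</document>",
    ]
    return "\n".join(parts)


def submit(job_xml, url=NEOS_URL):
    """Send the job to the server; assumed to return (job number, password)."""
    server = xmlrpc.client.ServerProxy(url)
    job_number, password = server.submitJob(job_xml)
    return job_number, password
```

A client would call `submit(build_job_xml("milp", "CPLEX", "AMPL", model_text))`, where the three strings come from the category, solver, and input lists above and `model_text` holds the AMPL model.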
[Chart: vertical axis 0 to 140,000; series: Minho, Klagenfurt, Arizona State, Argonne, UW Madison]
Jobs Per Month
Rackmount Dell Servers running RedHat
Managed via puppet: http://puppet.com/
Usage Reports in Ganglia: http://ganglia.info/
1 Central Manager
1 Submit Node
5 Execute Nodes with specialized solvers
Our Pool
CPLEX, CPLEX, CPLEX
Started Homegrown
neos
Which Solver?
Incoming
scip ASU
CPLEX
neos Solver-2
neos Solver-1
neos
Queue
Solver Pool at UW-Madison
(March 2014)
Resource Allocation
  3 GB memory per job
  8 hour maximum
  1-4 cores dependent on solver
  Adjust priority by length of job
Rapid solver additions with load balancing
  New node has over 2x CPU and memory
  Smaller servers were overloaded
Problems Solved
# Original: don't overload one by one
# APPEND_RANK = ( -1 * TARGET.TotalLoadAvg )
# Updated (slightly overloads largest)
APPEND_RANK = (TARGET.TotalCpus - TARGET.TotalLoadAvg)
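The effect of the two rank expressions can be sketched in Python by scoring a few machines and picking the highest score, on the assumption that a higher rank value marks the preferred machine. The machine names, core counts, and load averages are invented for illustration and do not describe the actual pool.

```python
# Hypothetical snapshot of three execute nodes (not real pool data).
machines = [
    {"name": "small-1", "TotalCpus": 8, "TotalLoadAvg": 2.0},
    {"name": "small-2", "TotalCpus": 8, "TotalLoadAvg": 5.0},
    {"name": "large-1", "TotalCpus": 32, "TotalLoadAvg": 20.0},
]


def original_rank(m):
    """Mirrors -1 * TARGET.TotalLoadAvg: least-loaded machine wins."""
    return -1 * m["TotalLoadAvg"]


def updated_rank(m):
    """Mirrors TARGET.TotalCpus - TARGET.TotalLoadAvg: most idle cores wins."""
    return m["TotalCpus"] - m["TotalLoadAvg"]


def preferred(rank):
    """Return the name of the machine with the highest rank value."""
    return max(machines, key=rank)["name"]
```

With this data, `preferred(original_rank)` picks `small-1` because it has the lowest load regardless of size, while `preferred(updated_rank)` picks `large-1` because it has the most unused cores. That matches the slide's note that the updated expression leans on the largest node.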
Wishlist
  Provide additional resources and priority to some users
  Allow users to track and easily access jobs
  Better analytics, tracking, problem resolution
  Monetize?
Integrated user authentication code
  Name, Username, Password, Email
  LDAP, MySQL backend
  PHP/JavaScript handler headers on existing pages
  Server hooks to connect user to job, give priority
User Authentication
8 hour job limit
Partial Results have value
Solutions and Diminishing Returns
[Chart: horizontal axis ticks 0, 1, 2, 4, 8, 24; vertical axis 0 to 1]
1. Streaming results
   Only on stdout, stderr
   Doesn't pass files created by solvers
2. Spool on evict
   Undocumented debugging option for evicted jobs
   Didn't work through multiple HTCondor version updates
3. Transfer Output
   First solution of Research Computing Facilitators
   Works!
Problem: Partial Results

stream_output = true
stream_error = true

SpoolOnEvict = true

when_to_transfer_output = ON_EXIT_OR_EVICT
want_graceful_removal = true
+SpoolOnEvict = false
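One way to keep the working settings in one place is to render the submit description from a Python dictionary. The sketch below only produces text: the executable name `run_solver.sh` and the `render_submit` helper are hypothetical, and the three settings are copied from the slide as-is rather than checked against a particular HTCondor version.

```python
def render_submit(executable, settings):
    """Render a submit description as text, one 'key = value' line per setting."""
    lines = [f"executable = {executable}"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    lines.append("queue")
    return "\n".join(lines)


# Settings from the third (working) approach on the slide.
partial_results = {
    "when_to_transfer_output": "ON_EXIT_OR_EVICT",
    "want_graceful_removal": "true",
    "+SpoolOnEvict": "false",
}

# "run_solver.sh" is a placeholder executable name, not part of neos.
submit_text = render_submit("run_solver.sh", partial_results)
```

Printing `submit_text` gives a description that could be written to a file and passed to `condor_submit`. Keeping the settings in a dictionary makes it simple to add per-solver values later.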
HTCondor
  Migrate to Python bindings
  Tie job to neos job
General
  Adding Authenticated User to XMLRPC
Ongoing Projects