Hard & Soft Skills to Avoid Outages by Pascal-Louis Perez
Hard & Soft Skills to Avoid Outages
@pascallouis from @SquareNY
Code | git rm | Profit! | Ship | Maintain | Bless
Tactics
• Fighting mixing ids
• Entity bound ids (e.g. Id<T>)
• Textual ids MWDN-YP89-OLVL-USER
• Testable configurations
• etc.
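One way to picture entity bound ids is a small wrapper that carries the entity type alongside the raw value, so a user id cannot be passed where a merchant id is expected. The sketch below is only an illustration: the `Id`, `User`, `Merchant`, and `load_user` names are hypothetical, and the textual form (`7-USER`) is an assumed format, not the scheme behind the id shown on the slide.

```python
from dataclasses import dataclass
from typing import Generic, Type, TypeVar

T = TypeVar("T")


class User:
    """Hypothetical entity type."""


class Merchant:
    """Hypothetical entity type."""


@dataclass(frozen=True)
class Id(Generic[T]):
    """An id bound to the entity type it identifies."""
    entity: Type[T]
    value: int

    def __str__(self) -> str:
        # Assumed textual form: the entity kind is readable in the id itself.
        return f"{self.value}-{self.entity.__name__.upper()}"


def load_user(user_id: Id[User]) -> str:
    # Runtime guard: reject ids that belong to another entity.
    if user_id.entity is not User:
        raise TypeError(f"expected a User id, got {user_id}")
    return f"user #{user_id.value}"
```

Because the entity is part of the value, `Id(User, 7)` and `Id(Merchant, 7)` compare unequal even though the raw number is the same, and a static type checker can flag the mix-up before the runtime guard does.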
TDD
• Not controversial (anymore)
• Living code documentation
• Enables collaboration
• Technique to encode invariants
Gold Tests
• Tests which can be changed by a (small) subset of engineering
• Enforced via policy or technology
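Enforcement "via technology" could, for instance, be a pre-merge check that rejects changes to gold tests unless a designated owner approved them. This is a minimal sketch under assumptions: the `tests/gold/*` path, the owner names, and the `gold_violations` function are all hypothetical, and the talk does not say how the enforcement was actually done.

```python
from fnmatch import fnmatch

# Hypothetical configuration: where gold tests live and who may change them.
GOLD_PATTERNS = ["tests/gold/*"]
GOLD_OWNERS = {"alice", "bob"}


def gold_violations(changed_files, approvers):
    """Return gold test files changed without approval from a gold owner."""
    touched = [
        path for path in changed_files
        if any(fnmatch(path, pattern) for pattern in GOLD_PATTERNS)
    ]
    # Any single owner approval is treated as sufficient in this sketch.
    if touched and not (set(approvers) & GOLD_OWNERS):
        return sorted(touched)
    return []
```

A CI job would fail the build when the returned list is non-empty; the policy variant is the same rule applied in code review instead.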
Expressive Tests
• “Change your language and you change your thoughts” (Karl Albrecht)
• Can be implementation agnostic
...
Given feed PaymentEventFeedListener receives:
"""
{
  "payment_id": "EPT-300",
  "isTivoReplay": false,
  "merchant": { "token": "m-1" },
  ...
}
"""
Then expect table balance_changing_events order by id:
  | event_type | status      | process_attempts |
  | HOLD       | UNPROCESSED | 1                |
  | CAPTURE    | UNPROCESSED | 0                |
When then the time is 2012-01-06 17:10:00
And balance changing event queue processes items
Then expect table balance_changing_events order by id:
  | event_type | status      | process_attempts |
  | HOLD       | UNPROCESSED | 2                |
  | CAPTURE    | PROCESSED   | 1                |
[Figure: quality over time, automated vs. manual ("Oups!")]
Code Analysis
• In theory: static vs dynamic
• In practice: pre vs post-production
Pre Analysis
• Type Checking
• Testing, CI
• Linters
• Forbidden Call Analysis
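Forbidden call analysis can be approximated by walking a program's syntax tree and flagging calls to names on a deny list. The sketch below uses Python's standard `ast` module; the deny list and the `forbidden_calls` helper are assumptions made for illustration, not the tooling described in the talk.

```python
import ast

# Hypothetical deny list of call names that must not appear in the codebase.
FORBIDDEN = {"eval", "exec", "os.system"}


def dotted_name(node):
    """Rebuild a dotted name such as 'os.system' from an AST node, or None."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def forbidden_calls(source):
    """Return sorted (line, name) pairs for every forbidden call in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in FORBIDDEN:
                found.append((node.lineno, name))
    return sorted(found)
```

The source is only parsed, never executed, so the check is safe to run before production. It matches names textually, so an aliased import (`from os import system`) would slip through; a real analyzer would need to resolve imports.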
Post Analysis
• Logging
• Metrics
• Invariant Checking
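Invariant checking in production can be as simple as a periodic job that scans live data for rows that should never exist and logs each one. The invariant below (a PROCESSED event must have at least one process attempt) is a hypothetical example that borrows the column names from the expressive test shown earlier; the function name and the row shape are assumptions.

```python
import logging

log = logging.getLogger("invariants")


def check_processed_events(events):
    """Return events that violate: PROCESSED implies process_attempts >= 1."""
    violations = [
        event for event in events
        if event["status"] == "PROCESSED" and event["process_attempts"] < 1
    ]
    for event in violations:
        # Logged so the violation feeds the usual alerting or reporting path.
        log.warning("invariant violated: %r", event)
    return violations
```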
Speaking of Alerts: Metrics vs Checks
[Figure: check states (?, OK, WARNING) on a 0 to 1 axis; metric on a 0ms to 200ms axis]
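The distinction can be sketched in code: a metric is a continuous measurement, while a check collapses it into a small set of states. In this illustration the `CheckState` enum, the `latency_check` function, and the use of 200ms as a warning threshold are all assumptions; on the slide, 200ms appears only as an axis value.

```python
from enum import Enum
from typing import Optional


class CheckState(Enum):
    OK = "OK"
    WARNING = "WARNING"
    UNKNOWN = "?"


def latency_check(latency_ms: Optional[float],
                  warn_above_ms: float = 200.0) -> CheckState:
    """Turn a latency metric into a discrete check state."""
    if latency_ms is None:
        # No measurement available: the check cannot say OK or WARNING.
        return CheckState.UNKNOWN
    if latency_ms > warn_above_ms:
        return CheckState.WARNING
    return CheckState.OK
```

The check is easy to alert on but discards information; keeping the raw metric alongside it preserves the trend that led up to a state change.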
Alerting & Reporting
                      Precise signal    Imprecise signal
Immediate response    Alert             Oups!
Deferred response     Report            Report
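Read as a routing rule, the grid suggests that only precise signals should trigger an immediate response, and everything else belongs in a report. The sketch below encodes that reading; the `route` function and its return values are hypothetical, and treating the "Oups!" quadrant as something to downgrade is an interpretation of the slide, not a statement from it.

```python
def route(precise: bool, needs_immediate_response: bool) -> str:
    """Decide whether a signal should alert someone now or go into a report."""
    if precise and needs_immediate_response:
        return "alert"
    # Imprecise signals are never sent as alerts here: interrupting someone
    # for a noisy signal is the "Oups!" quadrant this sketch tries to avoid.
    return "report"
```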
Fix It Weeks
• Time set aside, monthly or quarterly
• No top-down mandate except “fix it”
Post-Mortem
• When: Anytime there are issues!
• Why: Learn and avoid mistakes of the past
• How: Blameless
Post-Mortem
• Go through the timeline
• The Good, The Bad and the Ugly
• Action Items
Root Cause Analysis
Proportional Investing
• When you lose N hours to maintenance, you spend an equivalent N hours on improving things.
Safety drives productivity and unleashes creativity.
Technology, sure. But it’s mostly about culture and people.
Many layers of defense, lots of ways to do it: find what’s right for your team.