SplunkLive! Utrecht 2016 - Exact
Copyright © 2015 Splunk Inc.
DevOps - Lower mean time to resolution while facilitating growth
André van de Graaf
Principal, Quality Assurance
Exact Software
© 2015 EXACT
We build business software aimed at SMBs
• Dutch-based company, founded in 1984 by Dutch students
• 2005: launched Exact Online, a SaaS-based version of the product
• 350,000 companies
• 7 countries
• 5 datacenters
Exact Infrastructure & Operations
Team of 7 engineers running a platform supporting 350,000 companies
Splunk team:
• 0.5 FTE: setup and configuration
• 0.5 FTE: data import, dashboards, alerts, reports
Exact’s ambition: Exponential growth
For exponential growth we need to automate. Splunk will facilitate this.
Last quarter: 250 new companies added on a daily basis
Situation before Splunk?
• Support department was our alert system
• 2 datacenters
• 4 countries
• Weekly builds
• Manually analyzing logs
• At least 1 war room session per month
Splunk Implementation
• Operational Visibility
• Business Insight
• Proactive Monitoring
• Search and Investigation
Search and investigate: Detailed perfmon counters
• We log performance counters every 5 seconds, so we can pin an issue down to specific moments: when it started and when it ended. Logging per minute is not detailed enough for us.
• For statistics, we aggregate performance counters per hour.
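The hourly roll-up described above can be illustrated outside Splunk with a minimal Python sketch. It is not Exact's actual setup: the function name `aggregate_hourly`, the sample data, and the choice of average and maximum as the hourly statistics are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def aggregate_hourly(samples):
    """Group (timestamp, value) samples by hour.

    Returns {hour_start: (average, maximum)}. Hypothetical helper,
    not part of Splunk or of the original presentation.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Truncate the timestamp to the start of its hour.
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: (mean(vals), max(vals)) for hour, vals in sorted(buckets.items())}


# Made-up data: a counter sampled every 5 seconds for two hours,
# flat at 20.0 except for a single spike at 10:08:20.
start = datetime(2016, 1, 1, 10, 0, 0)
samples = [(start + timedelta(seconds=5 * i), 20.0) for i in range(1440)]
samples[100] = (samples[100][0], 95.0)

hourly = aggregate_hourly(samples)
for hour, (avg, peak) in hourly.items():
    print(hour, round(avg, 2), peak)
```

In this sketch the hourly average barely moves because of the spike, while the raw 5-second samples still show exactly when it happened. That is the trade-off the slide describes: fine-grained data for investigation, hourly aggregates for statistics.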
Search + Investigation
Bug report: Splunk queries show how many customers are affected by a bug across all countries. This helps development teams prioritize the bug intake.
After deployment we can use the same Splunk query to see whether the bug is really fixed.
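The idea behind such a query, counting distinct affected customers per country and rerunning the same count after a deployment, can be sketched in Python. This is an illustration only: the event fields (`country`, `customer_id`, `message`), the error text, and the function `affected_customers` are hypothetical and do not come from the presentation.

```python
from collections import defaultdict


def affected_customers(events, error_signature):
    """Count distinct customers per country whose events contain the signature.

    Hypothetical helper; field names are assumptions for this sketch.
    """
    customers = defaultdict(set)
    for event in events:
        if error_signature in event["message"]:
            customers[event["country"]].add(event["customer_id"])
    return {country: len(ids) for country, ids in sorted(customers.items())}


# Made-up log events before a fix is deployed.
events = [
    {"country": "NL", "customer_id": "c1", "message": "NullReferenceException in InvoiceExport"},
    {"country": "NL", "customer_id": "c1", "message": "NullReferenceException in InvoiceExport"},
    {"country": "NL", "customer_id": "c2", "message": "NullReferenceException in InvoiceExport"},
    {"country": "BE", "customer_id": "c3", "message": "NullReferenceException in InvoiceExport"},
    {"country": "BE", "customer_id": "c4", "message": "Login succeeded"},
]

before = affected_customers(events, "InvoiceExport")
print(before)
```

Running the same function over events collected after the deployment should return an empty result if the fix worked, which mirrors reusing one query for both prioritization and verification.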
Where are we as of today?
• From 2 to 5 datacenters
• From 4 to 7 countries
• From weekly to daily builds
• Adding 250 companies on a daily basis
• Data size increased by 100% in 1 year
Where are we as of today?
• Fully operational in the DevOps team
• Lowered mean time to resolve by 75%
• Proactively inform the support department
• Scale the platform, not the team
• Splunk is part of the delivery process of new Exact Online functionality
What Did We Learn?
• Start documentation from the beginning
• Rubbish in = rubbish out. Fix the source!
• Implement 1 naming convention which applies to all datacenters
• 1 naming convention within Splunk: reports, lookups, etc.
What Did We Learn?
• Do not hard-code stuff.
• The end result of an incident is a new alert and/or dashboard. Continuous improvement. Part of the RCA process.
• Analyze your imported sources. Is everything useful to import?
Obstacles to overcome
• No ADFS support (supported as of Splunk 6.4)
• Resources: small team
Next Steps
• Roll out Splunk to our Development and Support teams
• Upgrade to Splunk 6.5
• Implementation of Splunk IT Service Intelligence (ITSI)
• POC Machine Learning