20171027 モニタリング勉強会 (Monitoring Study Group)
Uploaded by paul-traylor
![Page 1: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/1.jpg)
Operating Prometheus
モニタリング勉強会 (Monitoring Study Group), 2017/10/27
@kfdm
![Page 2: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/2.jpg)
Self Introduction
• Paul Traylor
• LINE Fukuoka 開発室 (Development Office)
• Currently responsible for updating the monitoring environment at LINE Fukuoka
• https://github.com/line/promgen
• https://promcon.io/2017-munich/talks/prometheus-as-a-internal-service/
![Page 3: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/3.jpg)
Operating Prometheus at LINE Fukuoka
• 4 HA Pairs
• ~2000 targets per machine
• ~800k samples per machine
• ~3.5 million samples
• ~7000 exporters
https://github.com/line/promgen
![Page 4: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/4.jpg)
Scaling Prometheus ‒ HA
• Run multiple Prometheus instances with the same targets
• Alerts are de-duplicated by Alertmanager
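
The de-duplication idea can be sketched as follows. This is a simplified illustration, not Alertmanager's actual implementation: it assumes alerts are plain dictionaries of labels and that two alerts count as the same when their label sets match. The `fingerprint` and `deduplicate` helpers are hypothetical names.

```python
from typing import Dict, Iterable, List, Tuple

Labels = Dict[str, str]


def fingerprint(labels: Labels) -> Tuple[Tuple[str, str], ...]:
    """Identity of an alert: its label set, independent of key order."""
    return tuple(sorted(labels.items()))


def deduplicate(alerts: Iterable[Labels]) -> List[Labels]:
    """Keep the first alert seen for each distinct label set."""
    seen: Dict[Tuple[Tuple[str, str], ...], Labels] = {}
    for labels in alerts:
        seen.setdefault(fingerprint(labels), labels)
    return list(seen.values())


# Both members of an HA pair scrape the same targets, so they fire
# alerts with identical labels (the label values here are made up).
from_replica_a = [{"alertname": "HighErrorRate", "job": "myjob"}]
from_replica_b = [{"job": "myjob", "alertname": "HighErrorRate"}]

unique = deduplicate(from_replica_a + from_replica_b)
```

The sketch also shows why both replicas need to send identical labels: any label that differs between them would produce two distinct alerts.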
![Page 5: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/5.jpg)
Scaling Prometheus ‒ Shard
• Split targets across multiple servers
• Alertmanager de-duplicates alerts
• Use a proxy or remote read to query across shards
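
One way to split targets is a stable hash of the target address modulo the number of shards. Prometheus offers a `hashmod` relabeling action for this purpose; the Python below only illustrates the idea and makes no claim to produce the same shard numbers. The target addresses and the helper names `shard_for` and `split_targets` are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List


def shard_for(target: str, shard_count: int) -> int:
    """Map a target address to a shard index with a stable hash."""
    digest = hashlib.md5(target.encode("utf-8")).digest()
    return int.from_bytes(digest[-8:], "big") % shard_count


def split_targets(targets: Iterable[str], shard_count: int) -> Dict[int, List[str]]:
    """Group targets by the shard that should scrape them."""
    shards: Dict[int, List[str]] = defaultdict(list)
    for target in targets:
        shards[shard_for(target, shard_count)].append(target)
    return dict(shards)


# Hypothetical target addresses, for illustration only.
targets = [f"host{i}.example.com:9100" for i in range(10)]
assignment = split_targets(targets, shard_count=4)
```

Because the hash depends only on the address, every server can compute the same assignment independently, and each target is scraped by exactly one shard.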
![Page 6: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/6.jpg)
Prometheus 1.8 ‒ Storage Format
https://promcon.io/2016-berlin/talks/the-prometheus-time-series-database/
http://labs.gree.jp/blog/2017/10/16614/
• One series per file
• Rewrites may have to touch millions of files
• Queries may also touch millions of files
• No easy way to back up
![Page 7: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/7.jpg)
Prometheus 2.0 ‒ New Storage Format
https://promcon.io/2017-munich/slides/storing-16-bytes-at-scale.pdf
https://fabxc.org/blog/2017-04-10-writing-a-tsdb/
• Chunks stored in buckets by time
• Chunks past retention setting are just deleted
• Easier to back up
• Easier to compress
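
The bucketing and retention points can be illustrated with a toy model. This is a sketch of the idea only, not the TSDB's on-disk format: the two-hour block width, the in-memory dictionaries, and all function names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

BLOCK_SECONDS = 2 * 60 * 60  # assumed block width for this sketch

Sample = Tuple[int, float]  # (unix timestamp in seconds, value)


def block_start(timestamp: int) -> int:
    """Start of the time bucket that contains the timestamp."""
    return timestamp - (timestamp % BLOCK_SECONDS)


def bucket_samples(samples: List[Sample]) -> Dict[int, List[Sample]]:
    """Group samples into blocks keyed by block start time."""
    blocks: Dict[int, List[Sample]] = defaultdict(list)
    for ts, value in samples:
        blocks[block_start(ts)].append((ts, value))
    return dict(blocks)


def apply_retention(
    blocks: Dict[int, List[Sample]], now: int, retention_seconds: int
) -> Dict[int, List[Sample]]:
    """Drop whole blocks that ended at or before the retention cutoff."""
    cutoff = now - retention_seconds
    return {
        start: data
        for start, data in blocks.items()
        if start + BLOCK_SECONDS > cutoff
    }


samples = [(0, 1.0), (3600, 2.0), (7200, 3.0), (20000, 4.0)]
blocks = bucket_samples(samples)
kept = apply_retention(blocks, now=21600, retention_seconds=7200)
```

Retention here never rewrites data: an expired block is dropped as a unit, which is the property the slide contrasts with the one-file-per-series layout.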
![Page 8: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/8.jpg)
Prometheus 2.0 ‒ Backups

    ├── 01BX40G8TA6T1MNSS8JJE7ENPY/
    │   ├── chunks/
    │   ├── index
    │   ├── meta.json
    │   └── tombstones
    ├── 01BX5Y9SSE10VBZK4CMZ86WDR6/
    │   ├── chunks/
    │   ├── index
    │   ├── meta.json
    │   └── tombstones
    ├── lock
    └── wal/
        ├── 000760
        └── 000761

• https://github.com/Gouthamve/agni
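
Since each block is a self-contained directory, a backup can in principle work block by block. The sketch below is an illustration under assumptions, not a recommended backup procedure: it treats any directory containing `meta.json` as a block, ignores `lock` and `wal/`, and runs against a throwaway directory rather than a live server. Copying from a running instance may not give a consistent result. The helper names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def list_blocks(data_dir: Path) -> List[Path]:
    """Return block directories, identified here by a meta.json file."""
    return sorted(
        p for p in data_dir.iterdir()
        if p.is_dir() and (p / "meta.json").is_file()
    )


def copy_blocks(data_dir: Path, backup_dir: Path) -> List[str]:
    """Copy each block directory into backup_dir; return copied names."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for block in list_blocks(data_dir):
        shutil.copytree(block, backup_dir / block.name)
        copied.append(block.name)
    return copied


# Build a throwaway directory that mimics the layout on the slide.
root = Path(tempfile.mkdtemp())
data_dir = root / "data"
for name in ("01BX40G8TA6T1MNSS8JJE7ENPY", "01BX5Y9SSE10VBZK4CMZ86WDR6"):
    (data_dir / name / "chunks").mkdir(parents=True)
    (data_dir / name / "meta.json").write_text("{}")
(data_dir / "wal").mkdir()

copied = copy_blocks(data_dir, root / "backup")
```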
![Page 9: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/9.jpg)
Prometheus 2.0 ‒ Flag Changes
• Most flags move from single dash to double dash
• Many storage settings move to tsdb settings
• -config.file -> --config.file
• -storage.local.path -> --storage.tsdb.path
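
A small helper can apply the two changes above to an existing argument list. This is a minimal sketch: the rename table contains only the mapping shown on the slide, the file paths are made up, and other 1.x storage flags may have no direct 2.0 equivalent, so the output would still need checking by hand.

```python
from typing import List

# Only the rename shown on the slide; other storage flags are not
# covered here and would need checking against the 2.0 documentation.
RENAMED = {"storage.local.path": "storage.tsdb.path"}


def convert_flag(arg: str) -> str:
    """Convert one 1.x style '-name=value' argument to 2.0 style."""
    if arg.startswith("--") or not arg.startswith("-"):
        return arg  # already double dash, or not a flag at all
    name, sep, value = arg[1:].partition("=")
    return "--" + RENAMED.get(name, name) + sep + value


def convert_args(args: List[str]) -> List[str]:
    """Convert every argument in a command line."""
    return [convert_flag(a) for a in args]


# Hypothetical 1.x command-line arguments.
old = ["-config.file=/etc/prometheus.yml", "-storage.local.path=/data"]
new = convert_args(old)
```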
![Page 10: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/10.jpg)
Prometheus 2.0 ‒ Rule Format Changes
https://www.robustperception.io/converting-rules-to-the-prometheus-2-0-format/

    groups:
    - name: alert.rules
      rules:
      - alert: HighErrorRate
        expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
        for: 10m
        annotations:
          summary: High request latency
      - alert: DailyTest
        expr: vector(1)
        for: 1m
        annotations:
          summary: Daily alert test

• ./promtool update rules /path/to/rules
![Page 11: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/11.jpg)
Prometheus 2.0 ‒ Migration
![Page 12: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/12.jpg)
Prometheus 2.0 ‒ Remote Read
• Prometheus 1.8 (Read)
• InfluxDB (Read and Write)
• Graphite (Write)
• OpenTSDB (Write)
• TimescaleDB (Read and Write)
• https://prometheus.io/docs/operating/integrations/
• https://github.com/prometheus/prometheus/tree/master/documentation/examples/remote_storage/remote_storage_adapter
![Page 13: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/13.jpg)
Open Metrics
• https://github.com/RichiH/OpenMetrics
• https://github.com/RichiH/OpenMetrics/blob/master/CONTRIBUTORS.md
![Page 14: 20171027 モニタリング勉強会](https://reader031.fdocument.pub/reader031/viewer/2022031518/5a65610d7f8b9a06748b4705/html5/thumbnails/14.jpg)
Questions?