# TODOs

## HDD power status

Log if HDD is active/idle or spun-down.

## Public IP address logging

Log the public IP address. Reuse `netcup-dns` Python functions.

## Use Grafana to visualize metrics

One can use Prometheus + Grafana to collect and visualize server metrics.

> https://geekflare.com/best-open-source-monitoring-software/
>
> This list won’t be complete without including two fantastic open-source solutions – Prometheus and Grafana. Its DIY solution where you use Prometheus to scrape the metrics from server, OS, applications and use Grafana to visualize them.

As we already collect logs, we should research how to import that data into Grafana.

### Time series

* https://grafana.com/docs/grafana/latest/fundamentals/timeseries/#introduction-to-time-series
  * E.g. CPU and memory usage, sensor data.
* https://grafana.com/docs/grafana/latest/fundamentals/timeseries/#time-series-databases
  * A time series database (TSDB) is a database explicitly designed for time series data.
  * Some supported TSDBs are:
    * Graphite
    * InfluxDB
    * Prometheus

### Installation

* https://grafana.com/docs/grafana/latest/setup-grafana/installation/docker/#alpine-image-recommended
* https://grafana.com/docs/grafana/latest/setup-grafana/installation/docker/#install-official-and-community-grafana-plugins
* https://grafana.com/grafana/plugins/marcusolsson-csv-datasource/?tab=installation
* https://grafana.github.io/grafana-csv-datasource/
* https://grafana.com/grafana/plugins/marcusolsson-json-datasource/?tab=installation
* https://grafana.github.io/grafana-json-datasource/

```shell
sudo docker run --rm \
  -p 3000:3000 \
  --name=grafana \
  -e "GF_INSTALL_PLUGINS=marcusolsson-json-datasource,marcusolsson-csv-datasource" \
  grafana/grafana-oss
```

TODO: Test the CSV and JSON data import tools.

## Netdata - Can be exported to Grafana

* https://github.com/netdata/netdata/blob/master/docs/getting-started/introduction.md

## Monit - An existing monitoring service

### General notes and links

* Monit is a widely used service for system monitoring.
  * OPNsense uses Monit: https://docs.opnsense.org/manual/monit.html
* Short slideshow presentation: https://mmonit.com/monit/#slideshow
* https://wiki.ubuntuusers.de/Monit/
* Excellent configuration and usage summary in the Arch Linux Wiki: https://wiki.archlinux.org/title/Monit
* Examples
  * https://mmonit.com/wiki/Monit/ConfigurationExamples
  * One can use the return code or stdout of an executed shell script
    * https://mmonit.com/wiki/Monit/ConfigurationExamples#HDDHealth

      ```
      check program HDD_Health with path "/usr/local/etc/monit/scripts/sdahealth.sh"
        every 120 cycles
        if content != "PASSED" then alert
        # if status > 0 then alert
        group health
      ```
* Documentation
  * Event queue - Store events (notifications) if the mail server is not reachable
    * https://mmonit.com/monit/documentation/monit.html#Event-queue

      ```
      set eventqueue basedir /var/monit
      ```
  * https://mmonit.com/monit/documentation/monit.html#SPACE-USAGE-TEST

    ```
    check filesystem rootfs with path /
      if space usage > 90% then alert
    ```
  * https://mmonit.com/monit/documentation/monit.html#PROGRAM-STATUS-TEST

    ```
    check program myscript with path /usr/local/bin/myscript.sh
      if status != 0 then alert
    ```
  * https://mmonit.com/monit/documentation/monit.html#PROGRAM-OUTPUT-CONTENT-TEST
  * https://mmonit.com/monit/documentation/monit.html#Link-upload-and-download-bytes

    ```
    check network eth0 with interface eth0
      if upload > 500 kB/s then alert
      if total downloaded > 1 GB in last 2 hours then alert
      if total downloaded > 10 GB in last day then alert
    ```
  * https://mmonit.com/monit/documentation/monit.html#MANAGE-YOUR-MONIT-INSTANCES

### Monitoring all your Monit instances

* Monit itself only monitors the current system
* Multi-server monitoring is a paid extra service called M/Monit :/
* But there are other open source services for this
  * https://github.com/monmon-io/monmon#why-did-you-create-monmon

### Setup

Install and start:

```shell
sudo pacman -S --needed monit lm_sensors smartmontools
sudo systemctl start monit
sudo systemctl status monit | grep 'Active: active (running)'
```

Print default configuration:

```shell
sudo cat /etc/monitrc | grep -v '^#'
#=> set daemon 30
#=>   - A cycle is 30 seconds long.
#=> set log syslog
#=>   - We will overwrite this config value later on.
#=> set httpd port 2812
#=>   - Only listen on localhost with username admin and pwd monit.
```

Include `monit.d`:

```shell
sudo mkdir -p /etc/monit.d/
# Append the include line only if /etc/monitrc has no include directive yet
! sudo cat /etc/monitrc | grep -q '^include' && echo 'include /etc/monit.d/*' | sudo tee -a /etc/monitrc
```

Log to file:

```shell
sudo install -m700 /dev/stdin /etc/monit.d/log <<< 'set log /var/log/monit.log'
sudo systemctl restart monit
# tail -f /var/log/monit.log
```

System:

```shell
sudo install -m700 /dev/stdin /etc/monit.d/system <<< 'check system $HOST
  if filedescriptors >= 80% then alert
  if loadavg (5min) > 2 for 4 cycles then alert
  if memory usage > 75% for 4 cycles then alert
  if swap usage > 50% for 4 cycles then alert'
sudo systemctl restart monit
```

Filesystem:

```shell
sudo install -m700 /dev/stdin /etc/monit.d/fs <<< 'check filesystem rootfs with path /
  if space usage > 80% then alert'
sudo systemctl restart monit
```

SSL options:

* https://mmonit.com/monit/documentation/monit.html#SSL-OPTIONS

```shell
sudo install -m700 /dev/stdin /etc/monit.d/ssl <<< '# Enable certificate verification for all SSL connections
set ssl options {
  verify: enable
}'
sudo systemctl restart monit
```

Mailserver, alerts and eventqueue:

* https://mmonit.com/monit/documentation/monit.html#Setting-a-mail-server-for-alert-delivery
* https://mmonit.com/monit/documentation/monit.html#Setting-an-error-reminder
* https://mmonit.com/monit/documentation/monit.html#Event-queue
  * If no mail server is available, Monit can queue events in the local file-system for retry until the mail server recovers.
  * By default, the queue is disabled and if the alert handler fails, Monit will simply drop the alert message.

```shell
sudo install -m700 /dev/stdin /etc/monit.d/mail <<< 'set mailserver smtp.mail.de
  port 465
  username "langbein@mail.de"
  password "qiXF6cUgfvSVqd0pAoFTqZEHIcUKzc3n"
  using SSL
  with timeout 20 seconds

set mail-format {
  from: langbein@mail.de
  subject: $SERVICE - $EVENT at $DATE
  message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION.
}

set alert daniel@systemli.org with reminder on 10 cycles

set eventqueue basedir /var/monit'
sudo systemctl restart monit
sudo monit -v | grep 'Mail'
```

Test alert:

* https://wiki.ubuntuusers.de/Monit/#E-Mail-Benachrichtigungen-testen
* It is enough to restart Monit. It will send an email saying that its state has changed (stopped/started).
* But if desired, one can also create a test for a non-existing file:

```shell
sudo install -m700 /dev/stdin /etc/monit.d/alerttest <<< 'check file alerttest with path /.nonexistent.file'
sudo systemctl restart monit
```

Example script - run a speedtest:

```shell
sudo pacman -S --needed speedtest-cli
sudo install -m700 /dev/stdin /etc/monit.d/speedtest <<< 'check program speedtest with path /usr/bin/speedtest-cli
  every 120 cycles
  if status != 0 then alert'
sudo systemctl restart monit
```

Check config syntax:

```shell
sudo monit -t
```

### TODOs

* See Firefox bookmark folder 20230219_monit.
* Disk health
* BTRFS balance
* Save disk usage and temperatures to CSV log file
  * e.g. by using a `check program check-and-log-temp.sh` Monit configuration
  * Or: do the checks with Monit and run `check program log-system-info.sh` every couple of minutes

### Monit behind Nginx

TODO: Nginx reverse proxy with basic authentication.
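
### CSV logging sketch

The TODO item about saving disk usage to a CSV log file could start from something like the following. This is only a minimal sketch: the function name `log_system_info`, the default path `/var/log/system-info.csv`, and the column names are assumptions made for illustration, and temperatures are left out because the output of `sensors` differs between machines.

```shell
# Hypothetical helper for the log-system-info.sh idea from the TODO list.
# Appends one row "timestamp,rootfs_usage_percent" to a CSV file.
log_system_info() {
    # Target file; the default path is an assumption, not an existing convention.
    local csv="${1:-/var/log/system-info.csv}"
    local timestamp usage

    # Write the header once, when the file is missing or empty.
    [ -s "$csv" ] || echo 'timestamp,rootfs_usage_percent' > "$csv" || return 1

    # Timestamp in ISO-8601-like form, e.g. 2023-02-19T12:00:00+0100.
    timestamp="$(date +%Y-%m-%dT%H:%M:%S%z)"

    # Column 5 of `df -P` is the usage percentage; strip the trailing "%".
    usage="$(df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')"

    printf '%s,%s\n' "$timestamp" "$usage" >> "$csv"
}
```

To use it as a script, one would save it (for example as `/usr/local/bin/log-system-info.sh`, a hypothetical location), add `log_system_info "$@"` as the last line, and reference it from a `check program ... with path ...` entry with an `every N cycles` clause, in the same way as the speedtest example.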