tutorial:adm:server_monitoring [2019/03/25 10:21] fiserp [Monitoring of server with CzechIdM] |
tutorial:adm:server_monitoring [2019/03/25 10:24] fiserp [Monitored parameters] |
^Service/Parameter ^Probe binary ^Name in NRPE ^Warning threshold ^Critical threshold ^Check frequency ^Notification frequency ^
|HOST UP | N/A | not implemented on the target machine | N/A or ping RTT threshold | high ping RTT or host is not pingable at all | every 5 minutes | every 6 hours |
|swap used space | check_swap | check_swap | 50% swap free | 10% swap free | every 5 minutes | every 24 hours |
|disk free space | check_disk | check_disk | 90% used | 95% used | every 5 minutes | every 24 hours |
|system load | check_load | check_load | 4,3.5,3 | 6,5.5,5 | every 5 minutes | every 24 hours |
|used memory | check_mem | check_mem | 90% used | 95% used | every 5 minutes | every 24 hours |
|process count | check_procs | check_procs | 300+ | 500+ | every 5 minutes | every 24 hours |
|zombie process count | check_procs | check_zombies | 1+ | 5+ | every 5 minutes | every 24 hours |
|system time | check_ntp_time | check_time | skew >1min | skew >5min | every hour | every 24 hours |
|CzechIdM is running | check_http | check_idm | N/A | CzechIdM not running | every 5 minutes | every 24 hours |
|HTTPD is running | check_http | check_httpd | response time >1s | HTTPD is not running | every 5 minutes | every 24 hours |
|HTTPS certificate expiration | check_http | check_httpd_cert | less than 30 days | less than 7 days | once a day | every 24 hours |
|PostgreSQL is running | check_pgsql | check_postgres | response time >0.5s | response time >1s or not running at all | every 5 minutes | every 24 hours |
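
To make the threshold columns concrete, the following is a minimal Python sketch of how a warning/critical pair from the table maps to a monitoring state. It is not the code of the Nagios probes themselves. The function names ''load_state'' and ''used_percent_state'' are hypothetical, and the strict "greater than" comparison for load and "greater than or equal" comparison for usage percentages are assumptions made for illustration. The load triples are read as 1-, 5- and 15-minute averages.

```python
# Illustrative sketch only: maps measured values to Nagios-style states
# using the thresholds from the table above. Not the actual probe code.

OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plugin exit codes


def load_state(averages, warn=(4.0, 3.5, 3.0), crit=(6.0, 5.5, 5.0)):
    """Return the state for a (1 min, 5 min, 15 min) load average triple.

    Assumption: any single average above its threshold is enough to
    raise the state; critical takes precedence over warning.
    """
    if any(a > c for a, c in zip(averages, crit)):
        return CRITICAL
    if any(a > w for a, w in zip(averages, warn)):
        return WARNING
    return OK


def used_percent_state(used_percent, warn=90.0, crit=95.0):
    """Return the state for a 'percent used' value (disk or memory rows)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    print(load_state((1.2, 0.9, 0.7)))
    print(load_state((4.5, 2.0, 1.0)))
    print(used_percent_state(96.0))
```

On a Unix host, the triple passed to ''load_state'' could come from ''os.getloadavg()''. The sample values in the ''%%__main__%%'' block are made up.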
===== Implementation =====