When a device goes down, we get a notification message from every sensor that has a notification configured, saying that it can't be read. For example, we'd get notifications for: CPU, Memory, Uptime, Volume C, Volume F, Volume G, Volume H, etc.
When each device has more than 10 sensors, we get a lot of notifications, and that can hide the actual issue.
This became an issue when three servers went down the other day due to a problem at our hosting provider and we got 40 or so notifications. We couldn't easily tell from the notifications that the problem affected just three servers because we were bombarded with messages.
It would be nice if we could say that device X is down when we fail to connect to Y sensors, then send out a single "device X is down" notification and suppress the others until the device comes back up.
Hello Steve,
Setting a so-called dependency on the Ping sensor would be the way to go here. Head to the "Settings" tab of the Ping sensor, section "Schedules, Dependencies...", and set it to "Master for parent object". This will pause all other sensors on the device as soon as Ping goes down.
As soon as Ping is working again, the other sensors will resume as well. Since a working Ping does not necessarily mean the device is fully booted, and therefore that all other sensors work right away, you can set a delay for this. You will find it on the "Settings" tab of the device, where the dependency is shown.
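To illustrate the idea, here is a minimal Python sketch of how a master dependency with a resume delay cuts down the notifications. It is only a simulation of the behaviour described above, not PRTG code or the PRTG API; the `Device` class, its fields, and the message texts are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Device:
    """Hypothetical model of a device whose Ping sensor is the master."""
    name: str
    sensors: List[str]
    resume_delay: float = 0.0            # seconds to wait after Ping recovers
    ping_up: bool = True
    recovered_at: Optional[float] = None  # time Ping last came back up

    def set_ping(self, up: bool, now: float) -> None:
        # Remember when Ping recovered so the delay can be applied.
        if up and not self.ping_up:
            self.recovered_at = now
        if not up:
            self.recovered_at = None
        self.ping_up = up

    def paused(self, now: float) -> bool:
        # Dependent sensors are paused while Ping is down
        # and for resume_delay seconds after it comes back.
        if not self.ping_up:
            return True
        if self.recovered_at is None:
            return False
        return now - self.recovered_at < self.resume_delay

    def notifications(self, failing: Set[str], now: float) -> List[str]:
        # Only the master reports while Ping is down.
        if not self.ping_up:
            return [f"{self.name}: Ping is down"]
        # Still inside the resume delay: dependent sensors stay quiet.
        if self.paused(now):
            return []
        return [f"{self.name}: {s} cannot be read"
                for s in self.sensors if s in failing]


dev = Device("server1", ["CPU", "Memory", "Volume C"], resume_delay=60)
dev.set_ping(False, now=0)
print(dev.notifications({"CPU", "Memory", "Volume C"}, now=10))
dev.set_ping(True, now=100)
print(dev.notifications({"CPU"}, now=130))
print(dev.notifications({"CPU"}, now=160))
```

With Ping down, the three failing sensors produce a single message for the device instead of three. After Ping recovers, nothing is reported during the 60 second delay, and individual sensors are reported again only once the delay has passed.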
See also: PRTG Manual - Dependencies
Kind regards,
Erhard
Jun, 2018