Alert after N periods

In order to avoid false alarms, it would be great to have a per-check variable to define how many failed pings to wait before alerting.

3 votes

Miguel Medina shared this idea · August 17, 2016

declined

Admin Adrien Rey-Jarthon (Founder, updown.io) responded · August 17, 2016

We already do 2 double checks before notifying, which is enough to eliminate any false positives. If you want to ignore real but short downtimes, you should increase the check period.

Scott Wheeler commented · June 03, 2020 22:24

Just jumping in as a company that's currently evaluating updown and switching away from Pingdom. We have on-server monitoring systems that try to restart things when there's a problem. Our preferred behavior is to get email alerts on single failures, but to wait 3 minutes (i.e. three checks) before getting SMS alerts. Basically, we treat text alerts as "Stop what you're doing, or wake up and fix this...", whereas the lower-priority notifications let us keep an eye on general app health. A dropdown next to each notification type, with a threshold for each, is something we'd consider useful.
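To make the request above concrete, here is a minimal sketch of per-channel thresholds. It is not how updown.io works; the class name `ThresholdNotifier`, the channel names, and the threshold values are all hypothetical and only illustrate "email on the first failure, SMS after three consecutive failures".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ThresholdNotifier:
    """Counts consecutive failed checks and reports which channels should fire.

    Hypothetical sketch: channel names and thresholds are illustrative only.
    """
    thresholds: Dict[str, int]       # channel -> consecutive failures required
    consecutive_failures: int = 0

    def record(self, check_ok: bool) -> List[str]:
        """Record one check result and return the channels to notify now."""
        if check_ok:
            # Any successful check resets the streak.
            self.consecutive_failures = 0
            return []
        self.consecutive_failures += 1
        # Each channel fires exactly once per streak, when its threshold is hit.
        return sorted(
            channel
            for channel, needed in self.thresholds.items()
            if needed == self.consecutive_failures
        )


notifier = ThresholdNotifier({"email": 1, "sms": 3})
results = [notifier.record(ok) for ok in [False, False, False, True, False]]
print(results)  # [['email'], [], ['sms'], [], ['email']]
```

With one check per minute, the SMS would go out on the third failed check, while a single failure that recovers on the next check only produces an email.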

Admin Adrien Rey-Jarthon (Founder, updown.io) commented · August 17, 2016 20:36

You're totally right! That's why our double checks are not performed at the same time. When your check frequency is 30s, after a first down check, the two double checks will be performed at t+15s (from another server) and t+30s (from any server). This way you're sure to have a downtime confirmed from at least two locations and lasting at least 30s. That's why I said that if you want to be more tolerant of short downtimes (say 2 min), you have to set a check frequency of at least 2 min. And that will also save you a lot of money ;)
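The timing described above can be sketched as follows. This is only an illustration, not updown.io's implementation: the function names are made up, and generalizing the 30s example to "half the period, then the full period" is an assumption based on the 2 min remark. The sketch also assumes the outage duration is measured from the first failed check.

```python
from typing import List, Tuple


def confirmation_schedule(period_s: float) -> List[Tuple[float, str]]:
    """Offsets (seconds after the first failed check) of the two double checks.

    Assumption: offsets scale with the check period (period/2 and period),
    extrapolated from the 30s example (t+15s and t+30s).
    """
    return [
        (period_s / 2, "another server"),
        (float(period_s), "any server"),
    ]


def would_alert(outage_s: float, period_s: float) -> bool:
    """True if the site is still down at both double checks.

    outage_s is how long the site stays down after the first failed check.
    """
    return all(offset <= outage_s for offset, _ in confirmation_schedule(period_s))


print(would_alert(outage_s=20, period_s=30))    # False: back up before t+30s
print(would_alert(outage_s=45, period_s=30))    # True: still down at t+15s and t+30s
print(would_alert(outage_s=90, period_s=120))   # False: back up before t+120s
```

Under these assumptions, a 90s slowdown triggers an alert with a 30s period but not with a 2 min period, which matches the advice to lengthen the check period to tolerate short downtimes.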

We think it's our job to provide the fastest and most accurate monitoring, and that's why we don't allow you to tweak these settings but instead give you the best setup so you don't have to worry about this.

Miguel Medina commented · August 17, 2016 18:32

Thanks, Adrien, for your quick answer. In my experience, a double check from another location is not the same as waiting for a second or third failed check. The best example I can find is a timeout due to temporarily slow performance. Sometimes apps fail to answer a ping because there's a usage peak at the exact moment of the check. Depending on the SLA, that could be perfectly fine as long as the app recovers within 2 minutes.
