Monitor Failure Report

Topics: Developer Forum, User Forum
Nov 15, 2007 at 7:13 PM
I would like to create a report that shows the times in which a monitor was in a Failed status. i.e. Monitor X Failed at 10 PM for 20 minutes.

Though nothing else was easy, Servers Alive made a report like this easy by only entering a single line at every status change with the amount of time the system existed in it's last state. i.e. Monitor X Failed at 10 PM and was last OK for 1440 minutes.

Has anyone else worked on a report like this?
Nov 15, 2007 at 8:29 PM
Looking for help on this subject I found an article (http://www.intelligententerprise.com/printarticleflat.jhtml;jsessionid=AUT2KCE0ZJH4GQSNDLRSKH0CJUNN2JVN?article=/020613/510warehouse1_1.jhtml ) where the author recommends a table referencing time spans would be best served having two columns representing the beginning and the end time.
Coordinator
Nov 16, 2007 at 1:16 PM
The MonitorEvent table contains the data you are looking for. Instead of storing two timestamps as you suggest, it stores one timestamp and the time elapsed (in seconds) from the previous event - in effect getting the same thing.
Using the MonitorEvent table is then a relatively simple query, for example:

--Set Monitor ID here
Declare @MonitorID int
set @MonitorID=82
 
 
/* PolyMon Constants */
Declare @OK tinyint
Declare @Warn tinyint
Declare @Fail tinyint
set @OK=1
set @Warn=2
set @Fail=3
/* End PolyMon Constants*/
 
 
--determine current status
declare @CurrStatusID tinyint
declare @CurrAsOf datetime
 
select @CurrStatusID=StatusID, @CurrAsOF=EventDT 
from MonitorCurrentStatus 
where MonitorID=@MonitorID
 
 
--determine last fail event
declare @LastFailEventID int
declare @LastFailDT datetime
 
select @LastFailEventID = max(EventID), @LastFailDT=max(EventDT)
from MonitorEvent 
where MonitorID=@MonitorID and StatusID=@Fail
 
 
 
--determine how long Monitor was in failed state at that last fail event
declare @PriorOKEventID int
declare @DownTimeSecs int
 
select @PriorOKEventID=max(EventID) 
from MonitorEvent 
where MonitorID=@MonitorID and StatusID=@OK and EventID < @LastFailEventID
 
select @DownTimeSecs=abs(sum(coalesce(UpDownTimeSecs,0)))
from MonitorEvent 
where MonitorID=@MonitorID and EventID>@PriorOKEventID and EventID <= @LastFailEventID
 
 
--display results
select @MonitorID as [Monitor ID], 
	@CurrStatusID as [Curr Status ID], 
	@CurrAsOf as [Curr As Of], 
	@LastFailDT as [Last Fail DT], 
	coalesce(@DowntimeSecs,0) as [DownTime (secs)]

This is something I could easily include for next release - if you would like this for next release, please request it in the Issue Tracker section.
Hopefully this helps - let me know if you need further info or help.
Thanks,
Fred.
Nov 16, 2007 at 8:06 PM
This is close, but we need this information for every failed event in a given period.


Something like the following would be perfect:

Failures between 1/1/2007 and 1/7/2007


Uptime percentage : 64.285%
Total Failure Minutes : 60

Start Time End Time Total Minutes
01/01/2007 08:00 01/01/2007 08:30 30
01/03/2007 12:00 01/03/2007 12:20 20
01/07/2007 17:10 01/06/2007 17:20 10



Or even showing both Warnings and Failures; denoted then tallied separately:

Failures between 1/1/2007 and 1/7/2007


Uptime percentage : 64.285%
Total Failure Minutes : 60
Total Warning Minutes : 30

Status Start Time End Time Total Minutes
Failed 01/01/2007 08:00 01/01/2007 08:30 30
Warning 01/02/2007 01:00 01/02/2007 01:15 15
Failed 01/03/2007 12:00 01/03/2007 12:20 20
Warning 01/04/2007 17:15 01/04/2007 17:30 15
Failed 01/07/2007 17:10 01/06/2007 17:20 10