Section 8 - Events & Alerts
Mutiny has a number of built-in methods for displaying Events and/or sending out Alerts when a property passes a set threshold or an agent detects a warning or critical condition on a node. Events can be displayed in various ways, depending on the choices made in custom views. Alerts can be sent by email or SMS, and can be forwarded to other systems using SNMP traps, V2 informs, custom email templates and custom-built APIs to upstream systems such as ServiceNow.
When a threshold is passed, Mutiny creates an "Unconfirmed" Event and places it in the Event table. Mutiny then checks whether the Event is transient. If the value passes back under the set threshold before the Event threshold time has elapsed, the Event is treated as transient and marked as closed.
If the property is still over threshold after the Event threshold time has passed, it is considered a real Event and changes to a "Confirmed" Event. Mutiny then checks for a "No Alerts Period" to see whether the Event should be ignored at this time. If not, the Event is upgraded to a state of "Open".
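The lifecycle described above can be pictured as a small state function. The following Python sketch is illustrative only and is not Mutiny's implementation: the names `EventState` and `next_state` and all parameters are hypothetical, the 2-minute default is the Event threshold default quoted in section 8.9, and keeping an Event in the "Confirmed" state during a No Alerts Period is an assumption.

```python
from enum import Enum


class EventState(Enum):
    UNCONFIRMED = "Unconfirmed"
    CONFIRMED = "Confirmed"
    OPEN = "Open"
    CLOSED = "Closed"


def next_state(state, over_threshold, minutes_elapsed,
               threshold_minutes=2, in_no_alerts_period=False):
    """Return the next state of an Event (hypothetical model)."""
    if state is EventState.CLOSED:
        return state
    if not over_threshold:
        # Value is back under threshold: a transient Event if still
        # Unconfirmed, otherwise a resolved one. Either way it closes.
        return EventState.CLOSED
    if state is EventState.UNCONFIRMED:
        # Wait for the Event threshold time before confirming.
        if minutes_elapsed >= threshold_minutes:
            return EventState.CONFIRMED
        return state
    if state is EventState.CONFIRMED:
        # Assumption: the Event stays Confirmed while it is suppressed.
        return state if in_no_alerts_period else EventState.OPEN
    return state  # an Open Event stays Open while still over threshold
```

For example, `next_state(EventState.UNCONFIRMED, True, 1)` leaves the Event Unconfirmed, while the same call with `minutes_elapsed=2` moves it to Confirmed.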
Mutiny next updates the Open Events views to include this node and its alerting property, then checks whether any contacts are listed for this Alert.
For each listed contact, Mutiny then creates an Alert queue item. If the user is "On Shift" and has a zero delay on this Alert property, the Alert message is processed immediately. Any other user listed for this property who is "Off Shift" or has a delay set for this property has an Alert queued, ready for sending when their delay times out or they become "On Shift". You can see the processed and queued items listed in the Alert log.
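As a rough illustration of this step, the sketch below splits the contacts listed for an Alert into those whose message is processed straight away and those whose message is queued. It is a simplified model, not Mutiny code; the `Contact` class, its fields and `build_alert_queue` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    name: str
    on_shift: bool
    delay_minutes: int  # delay configured for this Alert property


def build_alert_queue(contacts):
    """Split contacts into (processed, queued) lists of names."""
    processed, queued = [], []
    for contact in contacts:
        if contact.on_shift and contact.delay_minutes == 0:
            # On shift with zero delay: the Alert is processed now.
            processed.append(contact.name)
        else:
            # Off shift or delayed: the Alert waits in the queue.
            queued.append(contact.name)
    return processed, queued
```

With one contact on shift and undelayed, one off shift, and one on shift with a 15-minute delay, only the first appears in the processed list; the other two are queued.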
The alert can be tracked from the Related alerts panel at the bottom of each event page. Clicking on the ID or open time will expand the Alert tracking page.
Queued items will remain queued until one of the following becomes true:
- User is back on shift - Alert is sent
- User's delay times out - Alert is sent
- Property returns to OK - Alert is cancelled
- Event is older than 7 days - Alert is cancelled (but will be auto-regenerated if the condition still exists)
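The rules in this list can be sketched as a single decision function. This is a hedged illustration with hypothetical names (`resolve_queued_alert`, `MAX_EVENT_AGE`); it assumes that cancellation takes priority over sending, and that an Alert is sent only once the contact is on shift and any delay has elapsed. The list above gives these as separate triggers and does not state how they combine.

```python
from datetime import datetime, timedelta

MAX_EVENT_AGE = timedelta(days=7)  # the 7-day limit from the list above


def resolve_queued_alert(now, event_opened, property_ok, on_shift, delay_expires):
    """Return 'cancel', 'send' or 'wait' for one queued Alert."""
    if property_ok:
        return "cancel"  # property returned to OK
    if now - event_opened > MAX_EVENT_AGE:
        return "cancel"  # Event is older than 7 days
    # Assumption: both conditions must hold before the Alert goes out.
    if on_shift and now >= delay_expires:
        return "send"
    return "wait"
```

A queued item would be re-evaluated periodically until the function returns something other than `"wait"`.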
Once a property returns to OK, this is recorded in the Event Log.
Most properties can have thresholds set against them. When these thresholds are passed, Mutiny creates an Event. During discovery, default thresholds are set for most items. If you wish to change these, open the appropriate property and adjust the values accordingly.
8.2.1. Event Thresholds
Event thresholds are the dwell times used by Mutiny to filter out transient Events. The Event Thresholds panel is at the bottom of the node's configure panel.
Mutiny will wait for the configured number of threshold minutes before treating the Event as real. This feature helps remove false positives so that Contacts can treat all Alert messages with equal importance.
8.2.2. No Alert Periods (see also 8.9 Maintenance mode)
In addition to the Event Thresholds, Mutiny can also suppress the sending of Alerts during predefined "No Alerts" periods. These would be times such as planned maintenance or regularly scheduled restarts.
Two time periods can be set for each day, along with the property they apply to.
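To make the idea concrete, the sketch below checks whether a given moment falls inside a No Alerts period. It is an assumption-laden illustration, not Mutiny's data model: the schedule layout, the example times and the name `in_no_alert_period` are all hypothetical, and one such schedule is assumed to exist per affected property.

```python
from datetime import datetime, time

# Hypothetical schedule for one property:
# weekday (0 = Monday ... 6 = Sunday) -> up to two (start, end) periods.
NO_ALERT_PERIODS = {
    6: [(time(2, 0), time(4, 0)), (time(22, 0), time(23, 30))],
}


def in_no_alert_period(when, schedule=NO_ALERT_PERIODS):
    """True if `when` falls inside a No Alerts period for that weekday."""
    periods = schedule.get(when.weekday(), [])
    return any(start <= when.time() < end for start, end in periods)
```

With this example schedule, a confirmed Event at 03:00 on a Sunday would have its Alerts suppressed, while the same Event at 03:00 on a Monday would not.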
Contacts receive the alerts generated by the methods described above. To administer the alert Contacts, navigate to the Contact administration screen in the Admin section.
A contact can be an individual, an email distribution group or another system such as a ticketing system. Each Contact has a settings panel that contains options to receive all or certain types of alerts, editable shift patterns, and fields for an email address and/or a mobile number for SMS paging.
Alerts raised during shifts that are not enabled are queued until the contact's next on-shift time is reached.
Ticking either of the "All Alerts" buttons will send all alerts raised on the system to the contact; this overrides shifts and alert settings.
8.4 Receive All Alerts and Tracked Views
This is the simplest method of receiving alerts generated by your Mutiny system. By default, receive all alerts is set to [no] and the alerting methods described below will apply. If you wish to receive all critical alerts, click the [Critical only] checkbox and press [update]. Alternatively, you can select [Warning and Critical] to receive every alert generated for any system configured within Mutiny. This is the method you might use if, for example, you send all alerts to another ticketing system.
Track Views is a more selective way to receive alerts from groups of nodes that you have grouped into custom views. Click on the [Track Views] button to open the view selection panel.
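The three choices for this setting can be summarised in a few lines of Python. This is only a sketch of the behaviour described above; the function name `receives_all` and the string values used for the setting and severity are hypothetical.

```python
def receives_all(setting, severity):
    """Decide whether the 'receive all alerts' setting covers an alert.

    setting:  'no', 'critical_only' or 'warning_and_critical'
    severity: 'warning' or 'critical'
    """
    if setting == "warning_and_critical":
        return severity in ("warning", "critical")
    if setting == "critical_only":
        return severity == "critical"
    # 'no': fall back to the contact's other alerting methods.
    return False
```

A contact set to `'critical_only'` would therefore receive every critical alert on the system but no warnings through this route.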
8.4. Event Alert Templates
The Event Alert Settings button at the bottom of the Contacts panel contains all the event settings for all properties. This panel provides a quick method for adding a template of Alert settings for each contact.
The Alert settings can be set for each property and the list boxes at the bottom of the panel allow for the copying of the settings to nodes or views of nodes.
8.5. Alert Actions
Located at the bottom of each Event panel is the Alert Action section.
If enabled, this allows an event to be forwarded to another system via SNMP trap, V2 inform or, if available, a custom action. A custom action might be a bespoke API or script provided by Mutiny development.
Email alerts are the main method for communicating issues to contacts. They can be customised to contain various properties as well as the alert message itself. Email alert templates can be edited to contain special information that might be used to alert into a helpdesk system or an email-to-SMS gateway. The subject line can also be prefaced with a keyword in the system configuration screen. It is also possible to customise the email templates on a per-user basis if required.
8.7. Email alert templates
To edit the email templates you need to use the template editor from the bottom of the Contacts page.
This opens the template editor.
To create a new template, enter the name you wish to use and then select an existing template to base yours on. Press the update button and your new template will open in the editor.
The top half is the editor and the bottom half gives you an example of how your layout looks. You can enter free text and also embed variables from the variable pull-down at the top. Press update to save your changes. Your new template is now available for use in your contact record.
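The idea of mixing free text with embedded variables can be illustrated with Python's standard `string.Template`. This is not Mutiny's template syntax: the variable names (`node`, `property`, `status`, `value`) and the sample values are invented for the example, and the real variable list is the one offered in the editor's pull-down.

```python
from string import Template

# Hypothetical layout: free text plus four embedded variables.
short_template = Template("$node: $property is $status ($value)")

# safe_substitute leaves any unknown variable in place instead of failing.
message = short_template.safe_substitute(
    node="web01", property="Disk C:", status="Critical", value="97%"
)
print(message)  # web01: Disk C: is Critical (97%)
```

The preview in the bottom half of the editor plays the same role as the `print` call here: it shows the layout with the variables filled in.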
The default Short format template produces an alert similar to this;
8.8. SMS Alerts
In addition to email alerts, Mutiny can send SMS alert messages through a modem and PSTN line or through an email-to-SMS gateway.
The SMS templates are editable in the same way as the email templates, but are restricted in the number of characters that can be sent.
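As a small illustration of working within a character limit, the sketch below trims a message to fit. The function `fit_sms` is hypothetical, and the default of 160 characters is the usual size of a single GSM 7-bit text message, used here as an assumption; the actual limit depends on your modem or gateway.

```python
def fit_sms(text, limit=160):
    """Trim `text` to at most `limit` characters, marking any cut with '...'."""
    if len(text) <= limit:
        return text
    return text[: limit - 3] + "..."
```

Keeping SMS templates to the node, property and status is usually a more reliable approach than relying on truncation.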
Using a combination of email alerts and SMS alerts you have the flexibility of being able to build complex alerting procedures based on time and criticality.
8.9 Maintenance mode
Maintenance Mode is a feature that allows an administrator to select groups of monitored devices (nodes) and temporarily exclude them from the Event/Alert process.
To recap, an Event occurs when a monitored property (e.g. disk drive, memory, ping response etc.) changes its Status from OK (normal) to Warning or Critical. Once the Event threshold has been passed (2 minutes by default), the Event is considered to be "Open" and it will remain in this state until the property returns to an OK Status or it is no longer monitored. The Event is then considered to be "Closed". Whilst an Event is Open, Mutiny will, if configured, send Alerts (emails, SMS messages or Traps) to one or more Contacts.
There are times when it is desirable for a monitored device to be excluded from this process - one example being when it has to undergo scheduled or emergency maintenance. Although it has always been possible to do this from the Mutiny GUI, it is quite laborious, involves a large number of clicks and there is no mechanism to indicate or remind the operator that a node has been excluded. In response to customer feedback, Maintenance Mode has been developed to address this issue.
To enable (or disable) Maintenance Mode, you must be logged on as an Administrator or Super-Admin as you will need to access the Node Manager pages.
From the menu bar, select [Nodes]=>Manager and you will see the new [Maintenance] button.
Click the [Enable Maintenance] button to select the nodes for Maintenance Mode. You can block-select nodes, Views or a combination of both. Options are available to exclude the nodes from both the Event logging and Alerting processes, or simply to prevent any Alerts being sent from Events on those nodes. You can also set the length of time for which you want the node to remain in maintenance using the drop-down selector. Once the options have been chosen and the nodes selected, click [Apply].
In order to make them easier to locate, the selected nodes are automatically placed in a new smart View named "In Maintenance" and the node-Status icon is replaced by a spanner symbol.
Any already-open Events from nodes placed into maintenance mode will be automatically closed within 1 minute and removed from the Wallboards.
Whilst the node is in maintenance mode, no Alerts will be sent from any Events on the node. Additionally (depending on the option chosen), the node will not be able to create any Events from its monitored properties. Note, however, that all properties will still be monitored and the graphing will not show any gaps.
Maintenance mode will be automatically turned off for each node once the chosen timespan has elapsed. If you wish to take any nodes out of maintenance before this, return to the Node Manager. Click the [Maintenance] button and this time choose [Exit Maintenance]. A list of the nodes currently in maintenance mode will be displayed; select any that you wish to take out of maintenance immediately and click [Apply].
Once the nodes are out of maintenance, any Events will be quickly opened (or re-opened) and added to the Wallboards. Any configured Alerts will be sent as soon as the conditions for sending have been met.
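The effect of the two maintenance options and the automatic expiry can be modelled roughly as follows. This is a sketch under stated assumptions, not Mutiny's internals: the `Maintenance` class, its fields and `allowed_actions` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Maintenance:
    started: datetime
    duration: timedelta      # timespan chosen in the drop-down selector
    suppress_events: bool    # False = only Alerts are suppressed

    def active(self, now):
        """Maintenance ends automatically once the timespan has elapsed."""
        return self.started <= now < self.started + self.duration


def allowed_actions(maintenance, now):
    """Return (may_create_events, may_send_alerts) for a node."""
    if maintenance is None or not maintenance.active(now):
        return True, True
    # Alerts are never sent in maintenance; Events depend on the option.
    return (not maintenance.suppress_events), False
```

Monitoring and graphing are deliberately absent from the model, since they continue regardless of maintenance mode.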