Alarm Management

Overview

The alarm management module is based on the detection of events, which can be internal events (VNOC), SNMP thresholds, or syslogs sent by the managed devices and collected by the MSActivator. Alarm management is designed to send email notifications to customers or managers, to send SNMP traps to an external trap server, or to trigger predetermined automated processes.

The detection of events relies on rules configured at the super administrator level. Rule management is available to the super administrator (ncroot). The rules are defined globally and can be modified by the SOC team, which can also change the notification settings on a per-event and/or per-customer basis. The rules are executed periodically (the period can be configured), and an alarm is generated whenever a rule matches.

Rule Management

Rule management allows the super administrator to build, test, and configure rules.

Image

Event Details

The event row can be expanded to display all the event fields as well as some additional fields.

Event fields

Internal events

The event below is an internal event (prefixed by VNOC) generated by the MSActivator.

Image

  • _timestamp: the date when the event was actually stored in the MSActivator log indexer.
  • _ttl: the event Time To Live in the log indexer cluster. "2w" stands for 2 weeks, which means that the event is removed from the system after 2 weeks. This value is configurable system-wide and should be set according to the event and log retention period required for the MSActivator.
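
The effect of the _ttl value can be illustrated with a short Python sketch that computes when an event would be removed. This is only an illustration: the helper names are hypothetical, and the unit suffixes other than "w" are an assumption, since only "2w" appears in this document.

```python
import re
from datetime import datetime, timedelta

# Assumed unit suffixes; only "w" (weeks) is confirmed by the text above.
_UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_ttl(ttl: str) -> timedelta:
    """Convert a TTL string such as "2w" into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhdw])", ttl.strip())
    if match is None:
        raise ValueError(f"unsupported TTL value: {ttl!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})


def expiry_date(timestamp: datetime, ttl: str) -> datetime:
    """Date after which an event stored at `timestamp` would be removed."""
    return timestamp + parse_ttl(ttl)


stored = datetime(2024, 1, 1, 12, 0, 0)
print(expiry_date(stored, "2w"))  # 2024-01-15 12:00:00
```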

Syslog

Alarms can also be triggered based on conditions that match certain syslogs sent by devices.

For example, the event below is triggered by a FortiGate when login attempts fail three times in a row:

Image

Matching rules

When many rules are stored in the system, it is convenient to list the rules that match a specific event. This list is available in the "matching rule(s)" tab of the event detail.

In the example below, the first result of the search for "Header Leakage*" is also matched by the rule RESERVED.

Image

The reason is that the term "RESERVED" also appears in the first result of the search for "Header Leakage*", so the rule "RESERVED" matches it as well.

Image
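
The idea of listing every rule that matches one event can be sketched in Python as follows. This is a deliberately simplified illustration: it uses shell-style wildcards from the standard library rather than the MSActivator search engine query syntax, and the rule patterns, field names, and event content are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: name -> wildcard pattern (not the real query syntax).
RULES = {
    "HEADER_LEAKAGE": "Header Leakage*",
    "RESERVED": "*RESERVED*",
    "LOGIN_FAILED": "*login failed*",
}


def matching_rules(event: dict, rules: dict) -> list:
    """Return the names of all rules whose pattern matches any field value."""
    names = []
    for name, pattern in rules.items():
        if any(fnmatchcase(str(value), pattern) for value in event.values()):
            names.append(name)
    return names


# Hypothetical event: field names and values are illustrative only.
event = {"rawlog": "Header Leakage detected, flags RESERVED", "severity": "4"}
print(matching_rules(event, RULES))  # ['HEADER_LEAKAGE', 'RESERVED']
```

A single event can satisfy several rules at once, which is why the tab lists all of them.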

 

Rule Builder

The rule builder field is a simple textbox that allows the user to type rules based on the MSActivator search engine query syntax.

Image

The rule is directly executed over the events that are available in the system.

Severity

You can select a severity from a list and associate it with a rule.

A severity must be selected when building a rule. Selecting a severity allows the system, for example, to raise an alarm whenever an event of a certain severity is raised, whatever the event is.

Image

Scope of a Rule

By default, a rule applies to all customers in the system and therefore to every device in the system.

It is possible to create rules on a customer-by-customer basis. When creating customer-specific rules, only one customer can be selected at a time.

Image

Rule Criteria

Two types of criteria are available for configuration:

  • the alarm recipient
  • the alarm media

Image


The alarm recipient is a list of checkboxes (customer, manager, privileged manager and administrator). At least one recipient must be selected for the rule. Each recipient is associated with a role as defined by the MSActivator RBAC. This selection determines who is contacted when an alarm is triggered. An alarm can also be sent to a group of users based on the roles selected with the alarm recipient checkboxes.
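
The recipient selection can be pictured as a filter on user roles. The following Python sketch is only an illustration of that idea: the function name, the user records, and the email addresses are hypothetical and do not reflect the MSActivator data model.

```python
# Roles offered by the recipient checkboxes described above.
RECIPIENT_ROLES = {"customer", "manager", "privileged manager", "administrator"}


def resolve_recipients(selected_roles, users):
    """Return the email addresses of the users holding one of the selected roles."""
    selected = set(selected_roles)
    if not selected:
        raise ValueError("at least one recipient must be selected")
    unknown = selected - RECIPIENT_ROLES
    if unknown:
        raise ValueError(f"unknown recipient role(s): {sorted(unknown)}")
    return [user["email"] for user in users if user["role"] in selected]


# Hypothetical users, for illustration only.
users = [
    {"email": "customer@example.com", "role": "customer"},
    {"email": "manager@example.com", "role": "manager"},
    {"email": "admin@example.com", "role": "administrator"},
]
print(resolve_recipients(["manager", "administrator"], users))
# ['manager@example.com', 'admin@example.com']
```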

Automated Process Execution

In addition to notifications, the user can configure the execution of a Workflow process when an alarm occurs.

Image

To do this, select a service and a process using the list shown above.

From the advanced parameter section, select the fields that will be passed as parameters to the process.

The list of fields should match the list of variables that are defined for the process.

For example, the following process variables:

Image

would be listed as:

Image

Note: the parameters are used to run an aggregation query over the data index in the Elasticsearch cluster. It is therefore not recommended to use the date or raw log fields as process parameters: the value of these fields is different for each log, so the events would never be aggregated together.
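
The note above can be illustrated with a small Python sketch that groups events by the selected fields, in the spirit of an aggregation query. It does not call Elasticsearch; the field names and event values are hypothetical.

```python
from collections import Counter


def aggregate(events, fields):
    """Count events per distinct combination of the selected fields."""
    return Counter(tuple(event[field] for field in fields) for event in events)


# Hypothetical events: three IPDOWN events from the same device.
events = [
    {"date": "2024-01-01T10:00:01", "device_id": "ABC123", "type": "IPDOWN"},
    {"date": "2024-01-01T10:00:07", "device_id": "ABC123", "type": "IPDOWN"},
    {"date": "2024-01-01T10:00:15", "device_id": "ABC123", "type": "IPDOWN"},
]

# Stable fields: the three events fall into a single group.
print(len(aggregate(events, ["device_id", "type"])))  # 1

# Adding the date field: every event becomes its own group.
print(len(aggregate(events, ["device_id", "type", "date"])))  # 3
```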

Event Cumulation and Cumulation Time

When processes are executed based on events, the event cumulation can be controlled. This avoids cases where many events are triggered (during a security attack, for example) and the system would otherwise execute one process per event.

By default, the event cumulation is set to 10: 10 events have to be detected before the process is executed.

The default cumulation time is 15 minutes: the system waits up to 15 minutes before triggering the process.

The two parameters work together, and whichever threshold is reached first triggers the execution.

Set both parameters to 0 to get the fastest response time.
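
The interaction between the two thresholds can be sketched in Python as follows. This is a minimal model written under assumptions, not the MSActivator implementation: the class and method names are hypothetical, and the cumulation time is assumed to start at the first pending event.

```python
class EventCumulator:
    """Decide when to run a process, given a count and a time threshold."""

    def __init__(self, max_events: int = 10, max_seconds: float = 15 * 60):
        self.max_events = max_events    # event cumulation (default 10)
        self.max_seconds = max_seconds  # cumulation time (default 15 minutes)
        self.count = 0
        self.first_seen = None

    def _reset(self) -> None:
        self.count = 0
        self.first_seen = None

    def poll(self, now: float) -> bool:
        """Return True if events are pending and the cumulation time has elapsed."""
        if self.count > 0 and now - self.first_seen >= self.max_seconds:
            self._reset()
            return True
        return False

    def add_event(self, now: float) -> bool:
        """Record an event at time `now` (seconds); return True if the process should run."""
        if self.count == 0:
            self.first_seen = now
        self.count += 1
        if self.count >= self.max_events:
            self._reset()
            return True
        return self.poll(now)


cumulator = EventCumulator()
fired = [cumulator.add_event(now=float(t)) for t in range(10)]
print(fired.index(True))  # 9: the process runs on the 10th event
```

With both thresholds set to 0, add_event returns True for every event, which corresponds to the fastest response time.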

Image

Rule Creation

To be applied, the rule must be saved in the system.

For example, to get a notification whenever a login to a managed device fails due to an invalid password, the rule below can be used and saved with a name (in this case, "LOGIN_FAILED"):

Image

Rule Load and Update

To update an existing rule, the rule should be loaded first.

Image

Then make the changes and click "save rule".

Rule Deletion

To delete a rule, simply click on the red "X" and confirm the action when prompted.

Rule Application

A rule is applicable as soon as it is saved in the system.

If a rule has been created to send an email when a match is found, the system starts running the rule, and possibly sending emails, as soon as the rule is saved.

The MSActivator rule matching process runs periodically. Its period (in seconds) can be set with the configuration tool, using the following path:

SEC Engine configuration->Initial Configuration->check_alert period
  

Alarm Management

Users can view their alarms from the customer portal or from the customer menu (Monitoring->Alarm).

Image

The user can select the time range they wish to display (from the last 10 minutes to the past day). Users can also select the number of events to be displayed in the detailed view. The alarm summary view shows a list of events aggregated by severity, type and subtype. The detail view shows each event with the detail of the event that triggered the alarm.
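
The summary view described above amounts to filtering events by time range and counting them by severity, type and subtype. The Python sketch below shows that logic only; the event field names and values are hypothetical and do not reflect the actual index schema.

```python
from collections import Counter
from datetime import datetime, timedelta


def alarm_summary(events, now, time_range):
    """Count the events of the selected time range by (severity, type, subtype)."""
    start = now - time_range
    recent = [event for event in events if start <= event["date"] <= now]
    return Counter(
        (event["severity"], event["type"], event["subtype"]) for event in recent
    )


# Hypothetical events, for illustration only.
now = datetime(2024, 1, 1, 12, 0)
events = [
    {"date": datetime(2024, 1, 1, 11, 55), "severity": 2, "type": "VNOC", "subtype": "IPDOWN"},
    {"date": datetime(2024, 1, 1, 11, 58), "severity": 2, "type": "VNOC", "subtype": "IPDOWN"},
    {"date": datetime(2024, 1, 1, 9, 0), "severity": 1, "type": "VNOC", "subtype": "UPDATECONF"},
]

# Last 10 minutes: only the two IPDOWN events are counted.
print(alarm_summary(events, now, timedelta(minutes=10)))
```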

Examples

Create a Rule to Send Alarm for SNMP Thresholds

SNMP thresholds can be configured using monitoring profiles. The SNMP threshold events are characterized by the keyword SNMPTHLD. The following search should bring back some results, provided that this kind of event has been raised and exists in the event index.

Image

Create Rules to Send Alarm for Configuration Update

Configuration update events are generated by the MSActivator and look like this:

%VNOC-<severity level>-UPDATECONF: <message>
  

The following rule should match every configuration update event, whether it failed or succeeded:

Image

Or, for configurations related to objects:

Image


In case of a configuration error, the MSActivator raises a %VNOC event with severity level 1 (%VNOC-1-UPDATECONF: <message>).
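
The event format above can be parsed with a regular expression, as in the following Python sketch. It only illustrates the format: the function names and the sample messages are hypothetical, and it is not a substitute for a rule written in the rule builder.

```python
import re

# Matches events of the form %VNOC-<severity level>-<NAME>: <message>
VNOC_EVENT = re.compile(
    r"%VNOC-(?P<severity>\d+)-(?P<name>[A-Z_]+): (?P<message>.*)"
)


def parse_vnoc_event(line: str):
    """Return the severity, name and message of a VNOC event, or None."""
    match = VNOC_EVENT.search(line)
    if match is None:
        return None
    return {
        "severity": int(match.group("severity")),
        "name": match.group("name"),
        "message": match.group("message"),
    }


def is_config_failure(line: str) -> bool:
    """True for UPDATECONF events raised with severity level 1."""
    event = parse_vnoc_event(line)
    return event is not None and event["name"] == "UPDATECONF" and event["severity"] == 1


# Hypothetical sample messages.
print(is_config_failure("%VNOC-1-UPDATECONF: configuration update failed"))  # True
print(is_config_failure("%VNOC-5-UPDATECONF: configuration updated"))        # False
```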

One possible way to detect PUSHCONFIG events that have failed is to filter the events by severity and search only for severity 1:

Image

Another method to generate an alarm based on configuration failure is:

Image

It is often useful to have multiple rule types for different types of configuration related events.

For example, license related issues can be detected by the rule below:

Image

Create Rules to Send Alarm When Device Goes Down or Up

To trigger an alarm when an IPUP or IPDOWN event occurs, use the rule below:

Image

Create a Rule to Detect Security Events

Use the rule below to detect threat events reported by a UTM:

Image

This rule can be made more specific to target particular threats. As shown below, you can use it to detect a threat where the destination port is 80:

Image

Video Tutorial