Marvis Actions Operations Guide
Version 1.0
Juniper Networks
Introduction
This guide explains how to use Marvis Actions. The procedures are based on Mist Cloud as of May 2025. Actual screens and displays may differ because of later updates.
For the latest update information, please check: https://www.juniper.net/documentation/us/en/software/mist/product-updates/
Configuration details and parameters may vary depending on the environment and setup. For detailed configuration information, please refer to the following link: https://www.juniper.net/documentation/product/us/en/mist/
Additional Juniper Japanese manuals are available on the 'Solutions & Technical Information Site': https://www.juniper.net/jp/ja/local/solution-technical-information/mist.html
The content of this document is based on the information available at the time of creation and is subject to change without prior notice. The configurations and features described in this document are not provided as a condition of purchase.
Update History
Version | Update Date | Summary |
---|---|---|
Ver 1.0 | May 2025 | Initial Release |
Marvis Actions Overview
Marvis Actions are not a substitute for real-time alerts such as port up/down notifications.
Marvis Actions is a key component of Marvis Virtual Network Assistant. It uses AI and machine learning to visualize issues across wireless, wired, and WAN networks, identify root causes, and suggest actions. This supports proactive network operations before user experience is critically affected, and it shortens MTTR and MTTI.
- Visualizes network issues that may affect user experience.
- Identifies the root cause and impact scope of network problems.
- Provides administrators with recommended actions.
The interface displays various network elements and issues, including Clients, Access Points, Switches, WAN Edges, and Security, with a count of 261 total actions.
Site Display
You can switch to the site display by clicking the [Site] tab.
The site display allows users to view network issues geographically. Users can click on specific sites to drill down into details. The interface shows detected problems, with options to view more details for each issue.
Example Issues:
- VPN Path Down
- AP OFFLINE (2)
- ARP FAILURE (3)
- AUTHENTICATION FAILURE (4)
- BAD WAN UPLINK (3)
- COVERAGE HOLE (1)
- DHCP FAILURE (6)
- DNS FAILURE (3)
- HEALTH CHECK FAILED (1)
Recommended Action Example: For issues with individual APs, test the cable/port or perform a factory reset. For issues with the entire switch/site, check the configuration to reach the Mist cloud.
Example: Missing VLAN
Missing VLAN indicates that a VLAN is configured on the AP but not on the switch port.
Steps to resolve:
1. Navigate to [Marvis] and click [Marvis Actions].
2. Click [Switch], then click [Missing VLAN].
3. Click [View More] to see details and perform the necessary action (for example, adding the VLAN configuration to the switch).
The system suggests adding VLAN 30 to the switch port mge-0/0/0.
Recommended Action: The below APs don't see any incoming traffic which is expected from the specified VLANs. Please add these VLANs to the respective switch ports.
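The Missing VLAN condition above amounts to a set difference between the VLANs configured on an AP and the VLANs configured on its switch port. The following is a minimal sketch of that comparison only, not how Marvis detects the issue (Marvis infers it from AP uplink traffic statistics). The AP name, the inventory dictionaries, and the function names are hypothetical; only VLAN 30 and port mge-0/0/0 come from the example above.

```python
def missing_vlans(ap_vlans, switch_port_vlans):
    """Return VLAN IDs configured on the AP but absent from the switch port."""
    return sorted(set(ap_vlans) - set(switch_port_vlans))


def report(ap_vlans, ap_uplinks, switch_ports):
    """Map each switch port to the VLANs that its connected AP is missing."""
    findings = {}
    for ap, vlans in ap_vlans.items():
        port = ap_uplinks[ap]
        gap = missing_vlans(vlans, switch_ports.get(port, []))
        if gap:
            findings[port] = gap
    return findings


# Hypothetical inventory for illustration.
ap_vlans = {"ap-lobby": [10, 20, 30]}       # VLANs configured on the AP
ap_uplinks = {"ap-lobby": "mge-0/0/0"}      # switch port each AP connects to
switch_ports = {"mge-0/0/0": [10, 20]}      # VLANs configured on the port

print(report(ap_vlans, ap_uplinks, switch_ports))
```

With this sample data, the report lists VLAN 30 as missing on mge-0/0/0, which matches the suggestion shown in the example.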
Resolve Action
4. Change the [Status] to [Resolve], select the resolution, and click [OK]. This will reduce the count of Marvis Actions.
Resolution Options:
- Solved using the Mist suggested action (Mist provided action).
- Solved using another method (please comment below).
- A known issue and should be ignored in the future.
- Incorrectly listed as an issue (false positive).
If the issue is resolved, select the resolution that applies. A status of 'In Progress' indicates that the issue is still being addressed.
Other Actions
Clicking 'Other Actions' displays 'Persistently Failing Clients' and 'Access Port Flap'.
Persistently Failing Clients: Detects wired and wireless clients that repeatedly fail to connect due to client-specific issues. This can be caused by incorrect PSK input or faulty 802.1X settings. Issues are automatically resolved within an hour of correction and are considered low-priority.
Access Port Flap: Identifies ports that repeatedly flap (go up and down) in a short period. This indicates a problem with the port or the connected wired client, potentially due to low connection reliability, frequent device reboots, or duplex setting mismatches.
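The flap condition described above can be pictured as counting link state changes inside a sliding time window. The sketch below is illustrative only: the event format, the 10-minute window, and the threshold of 4 changes are assumptions made for this example, not values used by Marvis.

```python
from datetime import datetime, timedelta


def count_transitions(events, window_start):
    """Count link state changes that happen at or after window_start.

    events is a list of (timestamp, state) tuples, where state is "up" or "down".
    """
    changes = 0
    previous = None
    for ts, state in sorted(events):
        if previous is not None and state != previous and ts >= window_start:
            changes += 1
        previous = state
    return changes


def is_flapping(events, now, window=timedelta(minutes=10), threshold=4):
    """Return True when the port changed state at least `threshold` times in the window."""
    return count_transitions(events, now - window) >= threshold


# Hypothetical port history: the link toggles once per minute.
t0 = datetime(2025, 5, 1, 9, 0)
states = ["up", "down", "up", "down", "up", "down"]
events = [(t0 + timedelta(minutes=i), s) for i, s in enumerate(states)]

print(is_flapping(events, now=t0 + timedelta(minutes=6)))
```

A port with a single "up" event in the same window would not be reported, because it has no state changes.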
Anomaly Detection Event Card
The Anomaly Detection Event Card provides detailed root cause analysis information for specific Marvis Actions, including Authentication Failures, ARP Failures, DNS Failures, and DHCP Failures.
The card is composed of Timeline, Summary, Causes, and Details sections.
- Timeline: Shows the temporal progression of event counts.
- Summary: Provides an overview of the anomaly and its likely causes.
- Causes: Details the root causes of the anomaly.
- Details: Lists impacted clients and devices.
The size of the circles in the 'Causes' section indicates the relative impact or severity of the issue on the site.
Subscriptions and Actions Summary
This section provides a summary of available actions for each subscription type, including:
- Glossary of Terms
- Layer 1 Actions
- Connectivity Actions
- AP Actions
- Switch Actions
- WAN Edge Actions
- Other Actions
Refer to 'Subscriptions and Actions' for details on required subscriptions.
Glossary of Terms
Term | Definition |
---|---|
Model Input Feature | Input or feature used by the model to determine if the conditions for raising specific Actions are met. |
Trigger Conditions | Conditions that trigger Marvis Actions. |
Validation Time | The time it takes for Marvis to mark an unresolved Marvis Action as resolved, indicating that the user has likely resolved the issue or that the symptoms are no longer present. |
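As a rough illustration of the validation time defined above, the sketch below computes the earliest moment an action could be marked as resolved. It assumes that the clock starts when the symptoms were last observed, which this guide does not state explicitly. The durations are copied from the tables later in this guide; the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Validation times for a subset of actions, taken from the tables in this guide.
VALIDATION_TIME = {
    "Bad Cable": timedelta(days=7),
    "Authentication Failure": timedelta(days=1),
    "DHCP Failure": timedelta(hours=1),
    "Offline": timedelta(minutes=15),
    "Missing VLAN": timedelta(minutes=30),
}


def earliest_auto_resolve(action, last_symptom_seen):
    """Earliest time the action could be marked resolved.

    Assumption: the validation time is measured from the last observed symptom.
    """
    return last_symptom_seen + VALIDATION_TIME[action]


print(earliest_auto_resolve("DHCP Failure", datetime(2025, 5, 1, 9, 0)))
```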
The following sections summarize the Marvis Actions in each category.
Layer 1 Actions
Marvis Actions | Model Input Feature | Trigger Conditions | Validation Time |
---|---|---|---|
Bad Cable | AP, Switch, WAN Edge statistics, Events | Detected the following during the monitoring period: • Speed change • Port error • Switch port is active but no traffic is passing • Detection of frequent disconnections/reconnections of APs | 7 (Days) |
Connectivity Actions
Marvis Actions | Model Input Feature | Trigger Conditions | Validation Time |
---|---|---|---|
Authentication Failure | Wired Client, Wireless Client | Deviation from predicted baseline (deviation/drift). The LSTM-based model sets the baseline for overall site authentication success or failure. The model raises these Marvis Actions considering the severity of the issue. The higher the severity and deviation from the baseline, the higher the confidence in the Actions raised by the model during the observation period. | 1 (Day) |
DHCP Failure | Wired Client, Wireless Client | Deviation from predicted baseline (deviation/drift). The LSTM-based model sets the baseline for overall site DHCP success or failure. The model raises these Marvis Actions considering the severity of the issue. The higher the severity and deviation from the baseline, the higher the confidence in the Actions raised by the model during the observation period. | 1 (Hour) |
ARP Failure | Wired Client, Wireless Client | Deviation from predicted baseline (deviation/drift). The LSTM-based model sets the baseline for overall site ARP success or failure. The model raises these Marvis Actions considering the severity of the issue. The higher the severity and deviation from the baseline, the higher the confidence in the Actions raised by the model during the observation period. | 1 (Day) |
DNS Failure | Wired Client, Wireless Client | Deviation from predicted baseline (deviation/drift). The LSTM-based model sets the baseline for overall site DNS success or failure. The model raises these Marvis Actions considering the severity of the issue. The higher the severity and deviation from the baseline, the higher the confidence in the Actions raised by the model during the observation period. | 1 (Day) |
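The idea of "deviation from a predicted baseline" in the table above can be illustrated with a much simpler stand-in than the model Marvis uses. The sketch below replaces the learned baseline with the mean of recent history and scores a new observation by how many standard deviations it lies above that mean. The threshold of 3.0 and the sample failure rates are assumptions made for this example.

```python
from statistics import mean, stdev


def deviation_score(history, observed):
    """Number of standard deviations by which `observed` exceeds the baseline mean.

    `history` needs at least two samples. This is a simple statistical
    stand-in for a learned baseline, not the model used by Marvis.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return 0.0 if observed == baseline else float("inf")
    return (observed - baseline) / spread


def should_raise(history, observed, threshold=3.0):
    """Raise an action only when the deviation is large relative to normal variation."""
    return deviation_score(history, observed) >= threshold


# Hypothetical site-wide DHCP failure rates (percent) for recent intervals.
history = [2, 3, 2, 3, 2, 3]
print(should_raise(history, observed=10))  # large jump above the baseline
print(should_raise(history, observed=3))   # within normal variation
```

A larger score corresponds to the "higher severity and deviation" described in the table, which is why a stronger deviation would be flagged with more confidence.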
AP Actions
Marvis Actions | Model Input Feature | Trigger Conditions | Validation Time |
---|---|---|---|
Offline | AP Statistics | One or more APs are down locally or offline (only loss of cloud connectivity). The model performs correlation analysis to identify the cause of AP downtime (switch, site, region, or ISP issue). To receive notifications when a device is offline, set up infrastructure alerts for device up/down events and specify thresholds. | 15 (Minutes) |
Health Check Failed | AP Statistics | The AP or its radio remains in a persistently failing state even after automatic recovery. | 30 (Days) |
Non-Compliant | AP Statistics | The firmware version running on one or more APs differs from the firmware version configured in the site settings. | 30 (Minutes) |
Coverage Hole | AP Statistics, Client Statistics | Baseline deviations for SLEs are associated with one or more APs in high-impact areas. The model considers issues like APs located at outdoor entrances or building access points, and recognizes fringe patterns. The model identifies Marvis Actions that indicate coverage hole issues affecting users, considering the severity of the anomaly. If the anomaly indicator is strong, the model generates actions faster than if it is weak. The model examines multiple batches of data to identify APs with coverage hole issues. | 7 (Days) |
Insufficient Capacity | AP Statistics, Client Statistics | Baseline deviations caused by repetitive and long-term capacity constraints of APs, not related to seasonal variations. The model identifies Marvis Actions indicating capacity issues affecting users, considering the severity of the anomaly. If the anomaly indicator is strong, the model generates actions faster than if it is weak. The model inspects multiple data batches to identify APs with capacity issues. | 7 (Days) |
AP Loop Detected | AP events | Reflection events occurring on the AP due to network loops caused by misconfiguration. A reflection event occurs when the AP receives back a packet that it sent, on the same or a different VLAN. Reflection events are also generated directly under site events, where you can monitor them to track raw statistics. | 30 (Minutes) |
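The Non-Compliant condition in the AP Actions table is a comparison between the firmware version set for the site and the version each AP is running. A minimal sketch of that comparison follows; the AP names and version strings are hypothetical and do not correspond to real firmware releases.

```python
def non_compliant_aps(site_version, ap_versions):
    """Return the names of APs whose firmware differs from the site setting."""
    return sorted(
        name for name, version in ap_versions.items() if version != site_version
    )


# Hypothetical inventory for illustration.
site_version = "1.2.3"
ap_versions = {
    "ap-lobby": "1.2.3",
    "ap-floor2": "1.1.0",
    "ap-warehouse": "1.2.3",
}

print(non_compliant_aps(site_version, ap_versions))
```

Only the AP running a different version is reported; APs that match the site setting are ignored.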
Switch Actions
Marvis Actions | Model Input Feature | Trigger Conditions | Validation Time |
---|---|---|---|
Missing VLAN | AP Port Statistics | Uplink port statistics reported by APs with missing VLANs. This action correlates data from two or more APs to determine whether a VLAN is missing on an AP port used by clients. This correlation prevents the 'Missing VLAN' Action from being raised when no client on the site is using the VLAN. | 30 (Minutes) |
Negotiation Incomplete | Individual Switch Port Statistics | Switch port Auto-Negotiation failure reported. | Up to 30 (Minutes) |
MTU Mismatch | Individual Switch Port Statistics | MTU mismatch between the switch port and the connected device. The reported statistics indicate port errors. The model generates Marvis Actions based on severity and time. A larger MTU mismatch results in higher severity and faster action generation. | 1 (Day) |
Loop Detected | Switch port events | A phenomenon where a loop occurs in the topology, intentionally or accidentally, causing rapid and repeated changes in the STP topology. The model uses STP topology change events as input features and considers severity and time. The higher the frequency of STP topology changes within a period, the faster the detection. Loops that occur over a longer period with slower event occurrences can also trigger Marvis Actions. | 30 (Minutes) |
Network Port Flap | Switch port events (Trunk port) | Continuous port bounces on ports configured as trunk ports. The model considers frequency and time. Higher port flap frequency results in higher severity. For slow port flaps occurring over a longer period, the model detects them within hours or days. | 30 (Minutes) |
High CPU | Switch chassis statistics | Average CPU utilization continuously exceeding 90% during the monitoring period. The model considers the frequency and duration of the issue. When every sample in the monitoring dataset shows high average CPU utilization, the issue is serious for users and the model raises the Marvis Action promptly. | 30 (Minutes) |
Port Stuck | Switch port statistics | Sudden changes in traffic patterns of end devices on access ports. The model does not generate false positives for seasonal traffic patterns. It also considers traffic patterns between similar endpoints for inference. This Marvis Action executes automatically. When a port stuck issue is detected, the port automatically bounces, and the endpoint becomes operational again. The model generates this action only if the endpoint does not return to an operational state after the automatic port bounce, or if the port stuck issue occurs multiple times. | 30 (Minutes) |
Traffic Anomaly | Switch port statistics | Deviation (drift) from predicted traffic patterns for broadcast and multicast frame counters. The model baselines traffic patterns for each switch or switch port over several days. This action uses an LSTM-based model. The model raises these Marvis Actions based on issue severity. Significant deviations persisting over the monitoring period lead to faster action generation by the model. Minor deviations over a longer period may result in delayed action generation by the model. | 1 (Day) |
Misconfigured Port (Uplink) | Switch port statistics | Mismatch in MTU, VLAN, or duplex settings between uplink ports. The model identifies mismatches in switch interconnections at the edge. | 60 (Minutes) |
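The High CPU trigger in the table above (average CPU utilization continuously exceeding 90% during the monitoring period) can be sketched as a check that every sample in the window is above the threshold. This is a simplified illustration; the sample values and the function name are hypothetical, and the real model also weighs frequency and duration.

```python
def high_cpu(samples, threshold=90.0):
    """Return True when every CPU sample in the monitoring window exceeds the threshold.

    An empty window is treated as "no evidence" and returns False.
    """
    return bool(samples) and all(sample > threshold for sample in samples)


# Hypothetical average CPU utilization samples (percent) for one switch.
print(high_cpu([93, 95.5, 91]))   # continuously above 90%
print(high_cpu([93, 60, 95]))     # one sample dips below 90%
```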
WAN Edge Actions
Marvis Actions | Model Input Feature | Trigger Conditions | Validation Time |
---|---|---|---|
MTU Mismatch | WAN Edge Statistics | MTU mismatch between WAN Edge ports and connected devices. The model validates reported statistics for specific port errors. The model raises these Marvis Actions considering severity and time. Higher MTU mismatch results in higher severity and actions are raised within a specific timeframe. | 30 (Minutes) |
Bad WAN Uplink | WAN Edge (Uplink) | High latency, packet drops, congestion, and issues with ARP or DHCP on the uplink. Detected from WAN port statistics based on deviations from the baseline. Issues determined to be of high severity are listed before issues of lower severity. | 1 (Hour) |
VPN Path Down | VPN Tunnels, peer paths | Occurrence of VPN peer path down. VPN paths from spoke to hub, or paths terminating at the hub. If you need alerts for every port up or down event, subscribe to high-severity port monitoring alerts to obtain raw alerts. Issues determined to be of high severity are listed before issues of lower severity. | 1 (Hour) |
Non-Compliant | SRX Series Firewall | Difference in Junos OS versions between the primary and backup partitions. | 30 (Minutes) |
Other Actions
Marvis Actions | Model Input Feature | Trigger Conditions | Validation Time |
---|---|---|---|
Persistently Failing Clients | Wired Client, Wireless Client | Clients that continuously fail to authenticate and connect to the network throughout the monitoring period. Trigger timing varies depending on the site and on the number of concurrent failures relative to the client count. | 60 (Minutes) |
Access Port Flap | Switch Port (Access port) | Continuous port up/down events on ports configured as access ports. The model considers the frequency and duration of the issue. Higher port flap frequency results in higher severity. For slow port flaps occurring over a longer period, the model detects them within hours or days. | 30 (Minutes) |