Overview
The NVIDIA® Mellanox® UFM® platforms revolutionize data center networking management by combining enhanced, real-time network telemetry with AI-powered cyber intelligence and analytics to support scale-out InfiniBand data centers.
UFM platforms empower research and industrial data center operators to efficiently provision, monitor, manage and preventatively troubleshoot and maintain the modern data center fabric, to realize higher utilization of fabric resources and a competitive advantage, while reducing OPEX. From workload optimizations and configuration checks, to improving fabric performance through AI-based detection of network anomalies and predictive maintenance, UFM platforms comprise multiple solution levels and a comprehensive feature set to meet the broadest range of modern scale-out data center requirements.
Key UFM Platform Highlights

NVIDIA MELLANOX UFM TELEMETRY
REAL-TIME MONITORING

NVIDIA MELLANOX UFM ENTERPRISE
FABRIC VISIBILITY & CONTROL

NVIDIA MELLANOX UFM CYBER-AI
CYBER INTELLIGENCE & ANALYTICS
UFM TELEMETRY
REAL-TIME MONITORING

The UFM Telemetry platform provides network validation tools to monitor network performance and conditions, capturing and streaming rich real-time network telemetry information, application workload usage, system configuration and more, to an on-premise or cloud-based database for further analysis.
Key Features:
- Switch, adapter, and cable telemetry
- System validation
- Network performance tests
- Streaming of telemetry information into on-premise or cloud-based database
UFM ENTERPRISE
FABRIC VISIBILITY & CONTROL

The mid-tier UFM Enterprise platform combines all the benefits of UFM Telemetry with enhanced network monitoring & management capabilities, workload optimizations and periodic configuration checks. It also performs automated network discovery and provisioning, traffic monitoring, and congestion discovery. UFM Enterprise enables job scheduler provisioning and integration with leading job schedulers, cloud and cluster managers, including Slurm and Platform LSF. UFM also enables network provisioning and integration with OpenStack, Azure Cloud and VMware.
Key Features:
- UFM Telemetry inside
- Automated network discovery and validation
- Secure cable management
- Congestion tracking identifying traffic bottlenecks
- Problem identification and resolution
- Global software updates
- Job scheduler provisioning, integrated with Slurm and Platform LSF
- Advanced reporting and comprehensive REST APIs
- Rich web-based GUI

UFM CYBER-AI
CYBER INTELLIGENCE & ANALYTICS
The UFM Cyber-AI appliance builds on the benefits of UFM Telemetry and UFM Enterprise, adding preventive maintenance at scale to lower supercomputing OPEX.
Platform: Requires dedicated UFM Cyber-AI appliance on-premise
Key Features:
- UFM Telemetry and UFM Enterprise inside
- Detects performance degradations
- Detects usage profile changes over time
- Detects abnormal cluster behavior
- AI-powered correlation between phenomena that may seem unrelated
- Alerts when preventive maintenance is needed
- Continuous system data collection optimizes predictability
How UFM Cyber-AI Works
The unique advantages of the Cyber-AI platform are based on a process of capturing rich telemetry information over time and utilizing deep learning algorithms. Here’s how it works:

- The UFM learns the data center’s “heartbeat”, operation mode, conditions, usage, and workload network signatures, then builds an enhanced database of telemetry information and discovers correlations between events.
- The UFM translates and correlates changes in the heartbeat into indications of future performance degradations or abnormal usage of the data center’s computing resources.
- Such changes and correlations between phenomena trigger predictive analytics and initiate alerts that indicate abnormal system and application behavior, as well as potential system failures.
- System administrators can quickly detect and respond to potential security threats and address upcoming failures efficiently, saving OPEX and maintaining end-user SLAs.
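
To make the baseline-then-alert idea above concrete, here is a minimal sketch in Python. It is not how UFM Cyber-AI is implemented: the platform is described as using deep learning algorithms, whereas this example substitutes a simple mean and standard deviation baseline. The function names, the sample values, and the three-sigma threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev


def learn_baseline(samples):
    """Summarize a healthy-period counter series as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def find_anomalies(baseline, new_samples, threshold=3.0):
    """Return (index, value, z_score) for samples beyond `threshold` sigmas."""
    mu, sigma = baseline
    if sigma == 0:
        # Perfectly flat baseline: any value that differs counts as a deviation.
        return [(i, v, float("inf")) for i, v in enumerate(new_samples) if v != mu]
    alerts = []
    for i, v in enumerate(new_samples):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            alerts.append((i, v, z))
    return alerts


# Hypothetical per-minute link utilization (percent) for a single port.
history = [40, 42, 41, 39, 40, 43, 41, 40, 42, 41]
baseline = learn_baseline(history)

# New samples arriving after the baseline was learned.
alerts = find_anomalies(baseline, [41, 40, 75, 42])
for index, value, z in alerts:
    print(f"sample {index}: value {value} deviates {z:.1f} sigma from baseline")
```

In this sketch only the third new sample is flagged, because it sits far outside the range seen during the learning period. A production system would track many counters at once and correlate deviations across them, which is the part the description above attributes to AI.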
Integration with Existing Data Center Management Tools
UFM provides an open and extensible object model to describe data center infrastructure and conduct all relevant management actions. UFM’s API enables integration with leading job schedulers, cloud and cluster managers, including Slurm and Platform LSF. UFM also enables network provisioning and integration with OpenStack, Azure Cloud and VMware.
NVIDIA Mellanox Care – Monitoring & NOC Services
Regular performance analysis is essential to ensure that your Mellanox solution is aligned with your business objectives and the latest Mellanox technology. Our Monitoring and NOC Services continuously examine your solution for potential faults, giving you peace of mind by identifying and addressing issues before they become problems. The result is increased ROI and lower system maintenance costs.

- Remote NOC, network management, and monitoring services
- Dedicated service engineer
- Tier 1, 2, and 3 support
- Ongoing fault and trouble management
- Trouble reporting and management
- Fault analysis and reporting
- Performance monitoring – alarms and real-time alerts
- Scalable, cost-effective service