What Is Service Telemetry?
Service telemetry measures network latency for remote direct memory access (RDMA) services and provides online visualization. It applies to IPv4 RoCEv2 packets. This technology implements visualization of both input/output (I/O) quality and throughput. It measures the latency for network, storage, and compute nodes in a storage I/O operation by segment, and measures the transmission duration, effective throughput, and retransmission rate of RoCEv2 packets to monitor the network and demarcate problems.
Why Do We Need Service Telemetry?
As we enter the intelligence era, more and more services require massive data storage and heavy read/write access. RDMA services pose the following O&M challenges:
- The network cannot proactively detect service performance deterioration or fluctuation caused by problems such as congestion. Instead, network faults are usually reported by the service department.
- When the storage I/O latency or input/output operations per second (IOPS) deteriorates, it is difficult to locate the fault.
- The NPU throughput cannot be measured due to the minute-level collection precision of interface statistics.
- The PFC statistics reflect neither the degree of network congestion nor its impact on throughput.
- It is difficult to detect and locate NIC and silent packet loss problems.
- Troubleshooting takes a long time because scenario-based best practices are lacking.
To address these challenges, Huawei has launched the service telemetry technology. This technology breaks through the limitations of traditional network monitoring and provides RDMA-based I/O quality visualization and throughput visualization. It accurately monitors and analyzes I/O latency and throughput data, and quickly detects storage service performance deterioration and network congestion. This facilitates network problem identification and network quality optimization, which in turn spurs the development of intelligent lossless networks.
How Does Service Telemetry Work?
I/O Quality Visualization
Service process
The following figure shows the layers involved in the service process of service telemetry.
Working process of service telemetry
- Analysis presentation layer (iMaster NCE-FabricInsight): Displays I/O-based performance indicators of service traffic and delivers configurations to devices through NETCONF interfaces.
- Device measurement layer (switches):
  - Compute-side port: Service packets enter or leave a measurement device through the compute-side port. The measurement device identifies specified packets, performs I/O latency measurement and breakdown, and reports the measurement result to the analyzer.
  - Storage-side port: Service packets enter or leave a measurement device through the storage-side port. The measurement device identifies specified packets, performs I/O latency measurement and breakdown, and reports the measurement result to the analyzer.
Latency breakdown solution
Based on the I/O interaction process, service telemetry can be used to match specified packets in transmit and return directions, define I/O latency breakdown objects, and measure the I/O latency. The following figure shows the latency breakdown solution.
Packet interaction in read and write I/Os
- Data access latency (DAL): Used to locate problems on the storage side. DALs in read and write operations are measured separately.
- Data preparation latency (DPL): Used to locate problems on the compute side. The DPL is only involved in the write operation.
- I/O latency (IOL): Total latency on the compute/storage side.
- Network round-trip time (RTT): The network RTT differs between read and write operations. iMaster NCE-FabricInsight calculates it using the following formula: RTT = IOL1 – IOL2.
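The RTT formula can be illustrated with a minimal Python sketch. The text does not define IOL1 and IOL2; the sketch assumes that IOL1 is the I/O latency measured at the compute-side port, that IOL2 is the I/O latency measured at the storage-side port, and that both are in microseconds. The function name is hypothetical and is not part of any Huawei API.

```python
def network_rtt_us(iol1_us: float, iol2_us: float) -> float:
    """Return the network RTT as IOL1 - IOL2.

    Assumption (not stated in the source): iol1_us is the I/O latency seen
    at the compute-side port and iol2_us is the I/O latency seen at the
    storage-side port, both in microseconds.
    """
    rtt = iol1_us - iol2_us
    if rtt < 0:
        # A negative result suggests the two samples belong to different I/Os.
        raise ValueError("IOL2 is larger than IOL1; samples may be mismatched")
    return rtt


if __name__ == "__main__":
    # Example values are made up for illustration only.
    print(network_rtt_us(850.0, 600.0))
```

Because each IOL value is a duration rather than a timestamp, this subtraction would not need the clocks of the two measurement points to be synchronized.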
Throughput Visualization
Service process
The following figure shows the layers involved in the service process of throughput visualization.
Throughput visualization system model
- Analysis presentation layer (iMaster NCE-FabricInsight): Displays throughput performance of service traffic and delivers configurations to devices through NETCONF interfaces.
- Device service measurement layer (switches): Service packets enter or leave Server B through Server A. After throughput visualization is enabled, Device A or Device B can identify RoCEv2 packets, measure throughput visualization indicators (time required for a single RDMA transmission, effective throughput of RDMA transmission, and ratio of retransmissions initiated by RDMA), and report the measurement result to the analyzer.
Throughput monitoring solution
The following figure shows the packet exchange process of a single RDMA transmission, where the sender sends RoCEv2 packets to the receiver through Device.
Packet exchange process
Throughput visualization can be used to analyze the following indicators:
- Flow Completion Time (FCT), in microseconds: FCT = Time when the last data packet is received by Device – Time when the first data packet is received by Device.
- Flow Effective Throughput (FET), indicating the effective throughput of RDMA transmission per second: FET (bit/s) = Effective throughput (bit)/FCT (microsecond) × 10^6.
- Flow NAK Rate (FNR), indicating the ratio of retransmissions initiated by RDMA: FNR = Number of retransmitted NAK packets/Number of RDMA messages (excluding retransmitted packets).
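The three indicators can be worked through in a short Python sketch. It is an illustration of the formulas above, not the device implementation: the `RdmaTransfer` record, its field names, and the sample values are all hypothetical, and timestamps are assumed to be in microseconds as observed at the measuring device.

```python
from dataclasses import dataclass


@dataclass
class RdmaTransfer:
    """Hypothetical per-transfer counters as seen by the measuring device."""
    first_packet_us: float    # time the first data packet is received
    last_packet_us: float     # time the last data packet is received
    effective_bits: int       # effective data volume, in bits
    nak_retransmissions: int  # retransmitted NAK packets
    rdma_messages: int        # RDMA messages, excluding retransmissions


def fct_us(t: RdmaTransfer) -> float:
    """Flow Completion Time: last data packet time minus first, in microseconds."""
    return t.last_packet_us - t.first_packet_us


def fet_bps(t: RdmaTransfer) -> float:
    """Flow Effective Throughput in bit/s: bits / FCT (microseconds) x 10^6."""
    fct = fct_us(t)
    if fct <= 0:
        raise ValueError("FCT must be positive to compute FET")
    return t.effective_bits / fct * 1e6


def fnr(t: RdmaTransfer) -> float:
    """Flow NAK Rate: retransmitted NAK packets / RDMA messages."""
    if t.rdma_messages <= 0:
        raise ValueError("at least one RDMA message is required")
    return t.nak_retransmissions / t.rdma_messages


if __name__ == "__main__":
    sample = RdmaTransfer(1000.0, 1500.0, 4_000_000, 2, 100)
    print(fct_us(sample), fet_bps(sample), fnr(sample))
```

With the sample values, 4,000,000 bits moved in 500 microseconds corresponds to 8 Gbit/s, and 2 NAK retransmissions over 100 messages gives an FNR of 0.02.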
Typical Application Scenario of Service Telemetry
The following figure shows the typical application scenario of service telemetry. The service telemetry function can be enabled on switch ports. This function is deployed on the ports connecting to compute-side and storage-side servers and does not need to be deployed on the interconnection ports between switches.
Typical application scenario of service telemetry
The following table shows two modes commonly used in service application.
| Item | Routine Monitoring Mode | Maintenance or Key Assurance Mode |
|---|---|---|
| Deployment position | Single-point measurement (compute-side port) | Multi-point coordinated measurement (compute-side and storage-side ports) |
| Solution | Single-point measurement + port-based polling. The port-based polling solution is used to limit the number of packets sent to the CPU. | Multi-point measurement + interesting flow. The number of flows is reduced to limit the number of packets sent to the CPU. |
| Service indicator | | |
| Applicable scenario | Full-flow monitoring (time division multiplexing by interface group, full-flow instead of full-packet) | Full-process monitoring of interesting flows (full packets of interesting flows) |
- Author: Qian Jinchen, Yin Rongrong
- Updated on: 2025-07-07