Abstract
Network telemetry is vital to various network applications, including network anomaly detection, capacity planning, and congestion alleviation. State-of-the-art network telemetry systems are claimed to be scalable, flexible, all-purpose, and accurate. They adopt interval approaches that track network traffic in each interval and collect statistics for analysis at a specific epoch. However, interval methods are impaired by collecting inconsistency and clearing inconsistency, which pollute the statistics. Moreover, state-of-the-art centralized controllers have long latency, which aggravates the discrepancy. Accordingly, we propose the interleaved sketch, a consistent and decentralized network telemetry system across all switches. Each switch has two asymmetric sketches that work in an interleaved fashion and is self-supervised to improve consistency. The distributed control plane extracts the flow characteristics and provides network-wide telemetry with low latency. We build a P4 prototype of the proposed interleaved sketch and test it on a Barefoot Tofino switch. Experimental results demonstrate that the interleaved sketch achieves ideal accuracy at line speed, with 6% resource overhead.
I. Introduction
Network telemetry provides a network-wide perspective by monitoring massive network traffic. It is the cornerstone of many network applications, including congestion control, anomaly detection [1]–[3], and Heavy Hitter detection [4], [5]. To ensure accuracy, network telemetry systems should present a network-wide and consistent view that covers as many switches as possible and provides accurate flow-level statistics. Conventional approaches mainly focus on a single switch and a single task. This focus is not general enough for a wide range of telemetry tasks, and it lacks the completeness required for network-wide telemetry. Most importantly, they focus primarily on CPUs [6], which are unable to handle data center traffic [7]. Software-Defined Networking (SDN) [8] has revolutionized the traditional conception of networks and opened up a new era in network telemetry. SDN decouples the data plane from the control plane of switches and makes the data plane programmable, which makes it possible to process each packet at line rate while forwarding it. The ongoing SDN revolution has facilitated the innovation of telemetry algorithms, including FlowRadar [9], UnivMon [10], SketchLearn [11], and Elastic Sketch [12]. These telemetry algorithms adopt interval approaches that track network traffic in each interval and collect statistics for analysis at the end of each interval.
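As a rough illustration of the interval approach described above, the following Python sketch updates a counting structure for every packet and, at the end of each interval, collects the statistics and clears the structure. It is a minimal sketch under stated assumptions: a Count-Min sketch is used as a generic stand-in and is not claimed to be the data structure of any cited system or of the interleaved sketch, and the names `CountMinSketch` and `run_intervals`, the width and depth values, and the in-memory set of observed flow keys are all hypothetical choices made for readability.

```python
import hashlib


class CountMinSketch:
    """Generic Count-Min sketch; width and depth are illustrative values."""

    def __init__(self, width=1024, depth=3):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, flow_key):
        # One hash per row, derived by prefixing the key with the row number.
        digest = hashlib.sha256(f"{row}:{flow_key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, flow_key, count=1):
        # Per-packet update: increment one counter in every row.
        for row in range(self.depth):
            self.rows[row][self._index(row, flow_key)] += count

    def query(self, flow_key):
        # The minimum over rows never underestimates the true count.
        return min(
            self.rows[row][self._index(row, flow_key)]
            for row in range(self.depth)
        )

    def clear(self):
        # Reset every counter before the next interval starts.
        for row in self.rows:
            for i in range(self.width):
                row[i] = 0


def run_intervals(packets, interval_length):
    """Track packets interval by interval.

    packets: iterable of (timestamp, flow_key) pairs sorted by timestamp.
    Returns one {flow_key: estimated_count} report per interval.
    """
    sketch = CountMinSketch()
    seen = set()  # Assumption: flow keys are remembered only to build reports.
    reports = []
    interval_end = interval_length
    for timestamp, flow_key in packets:
        # Close every interval that ended before this packet arrived.
        while timestamp >= interval_end:
            reports.append({key: sketch.query(key) for key in seen})  # collect
            sketch.clear()                                            # clear
            seen.clear()
            interval_end += interval_length
        sketch.update(flow_key)
        seen.add(flow_key)
    # Collect the final, possibly partial, interval.
    reports.append({key: sketch.query(key) for key in seen})
    return reports
```

Calling `run_intervals` with a time-sorted list of `(timestamp, flow_key)` pairs and an interval length returns one report per interval. In this single-threaded sketch the collect and clear steps happen atomically between two packets; on a real switch they take time while packets keep arriving, which is where the collecting and clearing inconsistencies named in the abstract can arise.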