Large scientific data transfers often occur at high rates, causing increased burstiness in Internet traffic. To limit the adverse effects of these high-rate large-sized flows, which are referred to as α flows, on delay-sensitive audio/video flows, a network management system called Alpha Flow Traffic Engineering System (AFTES) is proposed for intra-domain traffic engineering. An offline approach is used in which AFTES analyzes NetFlow records collected by routers, extracts source–destination address prefixes of α flows, and uses these prefixes to configure firewall filters at ingress routers of a provider’s network to redirect future α flows to traffic-engineered paths and isolated queues. The effectiveness of this scheme was evaluated through an analysis of 7 months of NetFlow data obtained from an ESnet router. For this data set, 91% of bytes generated by α flows during high-rate intervals would have been redirected had AFTES been deployed. The negative aspect of using address prefixes in firewall filters, i.e., the redirection of general-purpose (β) flows to α-flow paths/queues, was also quantified.
Scientific computing applications in fields such as high-energy physics, climate science, and genomics generate large (terabyte- to petabyte-sized) data sets. To transfer these data sets at high speeds, scientific users often invest in high-end computing clusters with disk arrays, parallel file systems, and high-speed access links. Usage logs collected at these data-transfer servers show that some transfers occurred at a significant fraction of link capacity, e.g., 4 Gbps on 10 Gbps links. New TCP variants such as H-TCP are used to generate such high rates for single flows. The high-rate large-sized transfers, referred to as α flows, are the primary source of burstiness in IP traffic.
Core research-and-education network providers, such as the US Department of Energy (DOE)’s Energy Sciences Network (ESnet), have identified such α flows as having adverse effects on general-purpose (β) flows. As α flows cause burstiness, audio/video applications experience packet delay variance (jitter) and a corresponding degradation in performance. Such degradations in performance result in trouble tickets that add to a provider’s operational costs. To address these costs, DOE has supported research on traffic engineering systems. We propose one such system to (i) identify high-rate large-sized data-transfer flows from the packet traffic entering ingress routers of a provider’s network, (ii) control the path taken by these flows by establishing intra-domain virtual circuits (traffic engineering), and (iii) isolate packets from these flows into separate virtual queues to reduce their effects on general-purpose flows [6–8].
The first of these tasks, identifying high-rate large-sized flows in the packet traffic entering a provider’s network, is the problem addressed in this work.
Basis for solution approach
A seemingly simple solution is to modify end-user applications, such as GridFTP, to signal a provider’s network with a control-plane message before initiating any high-rate large-sized transfers. Such a message would negate the need for automatic α-flow identification systems within a provider’s network. This solution was attempted in projects such as Lambdastation, Terapaths, and CHEETAH, but practical difficulties of application upgrades and adoption by users hindered its deployment. This led us to pursue an intra-domain traffic engineering solution because deployment of such a system would be entirely within a provider’s control. Such a solution does not preclude a parallel technology adoption effort of the end-application signaled approach.
For provider networks to automatically identify α flows, we start by examining the available features in current-day IP routers. Unfortunately, routers do not have built-in mechanisms to determine the rate and size of a flow (where a “flow” is identified by the 5-tuple: source and destination IP addresses, source and destination transport-layer port numbers, and protocol type). Next, we considered a port-mirroring mechanism in which IP routers could be configured to make and transmit copies of packets to a port that is connected to an external server; the latter could then be used to execute a flow-based rate/size analysis for α-flow identification. However, such a mechanism was deemed unscalable for the high link rates (10–100 Gbps) of provider networks.
After concluding that there are no built-in mechanisms for flow rate/size computation within routers, and that port-mirroring is infeasible, we looked for other mechanisms that could be exploited. Our finding is that NetFlow, a feature supported in provider-scale IP routers, can be used to solve our problem. The NetFlow feature allows routers to collect information for a sampled set of packets, which is then exported, in the form of NetFlow records, to an external NetFlow collector. In current-day installations, NetFlow records are exported at a coarse time granularity, on the order of minutes to hours. An analysis of the NetFlow data showed that it was not possible to accurately predict the duration and size of an α flow by observing the first few NetFlow records corresponding to a live (online) flow. Any traffic-engineering/flow-isolation actions taken on the presumption that a flow is an α flow may be futile, because the flow could end even before the router-configuration actions for traffic-engineering/flow-isolation were completed. Therefore, we developed an offline mechanism in which NetFlow records from completed flows are analyzed, and information extracted from this analysis is used to configure routers to identify future α flows for traffic-engineering/flow-isolation.
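The offline classification step can be illustrated with a minimal Python sketch. It assumes that the NetFlow records of each completed flow have already been grouped by 5-tuple and that byte counts have been scaled to compensate for packet sampling. The `FlowSummary` type, the helper names, and both thresholds are hypothetical placeholders chosen for illustration; this section does not specify the actual size and rate criteria.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
MIN_BYTES = 1_000_000_000      # "large-sized": at least 1 GB
MIN_RATE_BPS = 100_000_000     # "high-rate": at least 100 Mbps


@dataclass(frozen=True)
class FlowSummary:
    """One completed flow, aggregated from its NetFlow records."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    first_seen: float   # seconds
    last_seen: float    # seconds
    total_bytes: int    # assumed already scaled by the sampling rate


def average_rate_bps(flow: FlowSummary) -> float:
    """Average rate in bits per second over the flow's lifetime."""
    duration = flow.last_seen - flow.first_seen
    if duration <= 0:
        return 0.0      # no usable duration, so no rate estimate
    return 8 * flow.total_bytes / duration


def is_alpha(flow: FlowSummary) -> bool:
    """A flow qualifies only if it is both large and fast."""
    return (flow.total_bytes >= MIN_BYTES
            and average_rate_bps(flow) >= MIN_RATE_BPS)


# 20 GB moved in 40 s averages 4 Gbps: large and fast.
bulk = FlowSummary("198.51.100.10", "203.0.113.20", 50000, 2811, 6,
                   0.0, 40.0, 20_000_000_000)
# 5 MB over 10 minutes: a small, slow general-purpose flow.
small = FlowSummary("198.51.100.30", "203.0.113.40", 40000, 5004, 17,
                    0.0, 600.0, 5_000_000)
```

Because classification happens only after a flow has completed, both its total size and its average rate are known, which sidesteps the prediction problem described above for live flows.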
Solution approach
We propose a network management system called Alpha Flow Traffic Engineering System (AFTES) that would be run on a server external to the routers. AFTES would obtain NetFlow records from the NetFlow collector, and store the source–destination address prefixes of already completed α flows. These prefixes are used to configure firewall filters at ingress routers so that future α flows between the same source/destination subnets will get redirected to traffic-engineered, QoS-controlled paths. A persistence measure is used to delete address prefix entries from the firewall filters for which no α flows are observed over an aging interval. This AFTES design would be effective if the following hypothesis is true.
Hypothesis
Most high-speed data transfer nodes have static IP addresses, and α flows are created repeatedly between the same source–destination subnets. The basis for this hypothesis is that scientists typically execute their simulations at the same supercomputing centers, and hence we expect them to transfer data between the same two clusters. If the hypothesis is true, the offline, prefix-based AFTES scheme will be effective in identifying and directing α flows to traffic-engineered, QoS-controlled paths. We carried out traffic analysis of NetFlow records collected from ESnet to test this hypothesis.
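One way to picture such a test is to replay completed high-rate large-sized flows in time order against a prefix table with aging, and to measure the fraction of their bytes that would have matched an installed filter. The sketch below is only an illustration under stated assumptions: IPv4 addresses, a fixed /24 prefix length, a 30-day aging interval, and an offline analysis that runs once per day, so prefixes learned on one day take effect the next. The names `prefix_pair`, `PrefixTable`, and `redirected_fraction` are hypothetical and do not describe the AFTES implementation.

```python
import ipaddress
from itertools import groupby

PREFIX_LEN = 24    # assumed prefix length
AGING_DAYS = 30    # assumed aging interval, in days


def prefix_pair(src_ip, dst_ip, prefix_len=PREFIX_LEN):
    """Map a source/destination address pair to its subnet pair."""
    src = ipaddress.ip_network(f"{src_ip}/{prefix_len}", strict=False)
    dst = ipaddress.ip_network(f"{dst_ip}/{prefix_len}", strict=False)
    return (str(src), str(dst))


class PrefixTable:
    """Prefix pairs currently installed in the (simulated) firewall filter."""

    def __init__(self, aging_days=AGING_DAYS):
        self.aging_days = aging_days
        self.last_seen_day = {}   # prefix pair -> day of most recent flow

    def expire(self, today):
        """Drop pairs with no qualifying flow within the aging interval."""
        stale = [pair for pair, day in self.last_seen_day.items()
                 if today - day > self.aging_days]
        for pair in stale:
            del self.last_seen_day[pair]

    def matches(self, pair):
        return pair in self.last_seen_day

    def record(self, pair, day):
        self.last_seen_day[pair] = day


def redirected_fraction(flows, aging_days=AGING_DAYS):
    """flows: iterable of (day, src_ip, dst_ip, nbytes) tuples.

    Returns the fraction of bytes that matched an installed prefix pair.
    """
    table = PrefixTable(aging_days)
    total = hit = 0
    for day, group in groupby(sorted(flows), key=lambda f: f[0]):
        table.expire(day)
        seen_today = []
        for _, src_ip, dst_ip, nbytes in group:
            pair = prefix_pair(src_ip, dst_ip)
            total += nbytes
            if table.matches(pair):
                hit += nbytes
            seen_today.append(pair)
        # Offline analysis: today's prefixes are installed after the day ends.
        for pair in seen_today:
            table.record(pair, day)
    return hit / total if total else 0.0


example = [
    (1, "198.51.100.10", "203.0.113.20", 100),   # first sighting: missed
    (1, "198.51.100.11", "203.0.113.21", 100),   # same day: not yet installed
    (2, "198.51.100.12", "203.0.113.22", 300),   # same subnets, next day: hit
    (40, "198.51.100.10", "203.0.113.20", 500),  # entry aged out: missed
]
fraction = redirected_fraction(example)   # 300 of 1000 bytes
```

A high fraction on real data would support the hypothesis that flows recur between the same subnets; a shorter aging interval keeps the filter small at the cost of missing flows that recur infrequently.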