Content delivery networks (CDNs) have become a crucial part of the modern Web infrastructure. This paper studies the performance of the leading content delivery provider – Akamai. It measures the performance of the current Akamai platform and considers a key architectural question faced by both CDN designers and their prospective customers: whether the co-location approach to CDN platforms adopted by Akamai, which tries to deploy servers in numerous Internet locations, brings inherent performance benefits over a more consolidated data center approach pursued by other influential CDNs such as Limelight. We believe the methodology we developed for this study will be useful for other researchers in the CDN arena.
After a period of industry consolidation, there is a resurgence of interest in content delivery networks. Dozens of new CDN companies have emerged and become a critical part of the Web infrastructure: Akamai alone claims to be delivering 20% of all Web traffic.
The two main “selling points” of a CDN service are (a) that it supplies on-demand capacity to content providers and (b) that it improves the performance of content access from the user's perspective by delivering the content from a nearby location. This paper focuses on the second aspect, and considers the performance of the current Akamai platform and the question of how widely dispersed a CDN platform needs to be to provide proximity benefits to the users.
Our study of platform distribution is motivated by the ongoing active discussion on the two main approaches to CDN design that have emerged over the years. One approach, exemplified by Akamai and Digital Island (currently owned by Level 3 Communications), aims at creating relatively limited-capacity points of presence at as many locations as possible. For example, the Akamai platform in 2007 spanned more than 3,000 locations in over 750 cities in over 70 countries, with each location having on average fewer than ten servers, and its footprint has grown further since then (see http://www.akamai.com/hdwp p.2). The other approach utilizes massive data centers, comprising thousands of servers, but in far fewer locations. Providers pursuing this approach include Limelight and AT&T. Limelight currently lists 20 data centers on its web site.
In practice, there may be complex reasons contributing to this design choice. On one hand, Akamai attempts to obtain free deployment of its cache servers at some ISPs in return for reducing the ISPs’ upstream traffic, thus reducing the cost of running its platform. On the other hand, consolidated platforms pursued by Limelight and AT&T can be more manageable and are often deployed in data centers owned rather than rented by the CDN. Yet a large number of locations is often cited as directly translating to improved client proximity and content delivery performance. Thus, without passing judgment on the overall merits of the two approaches, we focus on this marketing claim and address the question: how many locations are enough from the perspective of client-observed performance? Or, said differently, when does one hit the point of diminishing returns in improving client proximity by increasing the number of locations? By considering the performance implications of platform consolidation, our study addresses an important aspect of the two approaches and thus contributes to the debate regarding their overall strengths.
Note that the number of locations is orthogonal to the overall CDN capacity and hence to the CDN’s ability to provide capacity-on-demand to content providers. By provisioning enough network connectivity, power supply, and servers at a given data center, one can assemble a very large aggregate CDN capacity at a relatively small number of data centers. For example, from the statement on Limelight’s website that “Each Limelight Networks Delivery Center houses thousands of servers” one can infer that Limelight has at least 20,000 servers across its 20 data centers, which is at worst one third as many servers as Akamai has, despite Limelight having two orders of magnitude fewer locations. In terms of network capacity, Limelight had 2.5 Tbps of aggregate bandwidth in August 2009; while we could not find a similar number for Akamai, this is more than the aggregate peak traffic Akamai has delivered.
One would assume that content delivery networks did this study themselves a long time ago. This might be true, but we will never know: proprietary research is not open to the public (or to public scrutiny) and is often driven by vested interests. This paper attempts to answer the above question by examining Akamai performance. We chose Akamai because it is the dominant CDN provider, in terms of both market share and size. Our general approach is to study how the performance of Akamai-accelerated content delivery would suffer if it were done from fewer data centers. Considering the performance implications of data center consolidation within the same CDN is important because it eliminates the possibility that unrelated differences between CDNs could affect the results. An abstract of our preliminary results has appeared previously; the current paper presents the complete study.
Our work contributes insights into the following aspects of content delivery networks:
• CDN performance improvement. CDNs offer capacity on-demand, and hence overload protection, to subscribing content providers. But do CDNs improve user experience under normal load? Krishnamurthy et al. compared the performance of different CDNs, but to our knowledge ours is the first study that provides an independent direct estimate of the performance improvement of Akamai-accelerated downloads. In particular, we consider both the performance improvement Akamai offers to content providers and the quality of Akamai's server selection when it selects a cache for a download.
• The extent of platform distribution. We address the question of to what extent a large number of points of presence improves CDN performance. Although we consider only one aspect of this issue, namely performance, our results contribute to the debate on the merits of the highly distributed vs. more consolidated approaches to CDN design from the customer perspective.
In addition to the above performance insights, we hope the methodology we develop to obtain them will be useful for others conducting research in this area.
2. RELATED WORK
A number of CDNs offer acceleration services today [19, 1, 13, 8, 17]. Our study could be useful for them as they decide on their infrastructure investment priorities and for their customers in selecting a CDN provider.