XCo: Explicit Coordination to Prevent Ethernet Congestion in Cloud Computing Clusters
Large cluster-based cloud computing platforms increasingly
use commodity Ethernet technologies, such as Gigabit Ethernet,
10GigE, and Fibre Channel over Ethernet (FCoE),
for intra-cluster communication. Traffic congestion can become
a performance concern in these Ethernet fabrics because data,
storage, and control traffic are consolidated over a common
layer-2 fabric, and because multiple virtual machines (VMs)
are consolidated onto less physical hardware. Even as
networking vendors race to develop switch-level hardware
support for congestion management, we make the case that
virtualization has opened up a complementary set of opportunities
to reduce or even eliminate network congestion in
cloud computing clusters. We present the design, implementation,
and evaluation of a system called XCo that performs
explicit coordination of network transmissions over a shared
Ethernet fabric to proactively prevent network congestion.
XCo is a software-only distributed solution executing only
in the end-nodes. A central controller uses explicit permissions
to temporally separate (at millisecond granularity) the
transmissions from competing senders through congested
links. XCo is fully transparent to applications, presently
deployable, and independent of any switch-level hardware
support. We present a detailed evaluation of our XCo prototype
across a number of network congestion scenarios, and
demonstrate that XCo significantly improves network performance
during periods of congestion.
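
The idea of a central controller granting explicit, time-sliced transmission permissions can be illustrated with a small sketch. This is not XCo's actual scheduling algorithm, which is described in the papers listed below. It is a simplified round-robin scheduler written under stated assumptions: each sender's traffic crosses a known set of links, each congested link tolerates a fixed number of concurrent senders per time slot, and every name here (`schedule_slots`, `flows`, `link_capacity`, the VM and link labels) is hypothetical.

```python
from collections import deque


def schedule_slots(flows, link_capacity, num_slots):
    """Grant per-slot transmit permissions so no link is oversubscribed.

    flows:         dict mapping sender -> set of links its traffic crosses
    link_capacity: dict mapping congested link -> max concurrent senders per slot
                   (links not listed here are treated as uncongested)
    num_slots:     number of time slots to plan
    Returns a list with one entry per slot: the senders allowed to transmit.
    """
    queue = deque(sorted(flows))  # deterministic starting order
    timeline = []
    for _ in range(num_slots):
        load = {link: 0 for link in link_capacity}
        granted, deferred = [], []
        for sender in queue:
            # Only links the controller considers congested constrain a sender.
            links = [l for l in flows[sender] if l in link_capacity]
            if all(load[l] < link_capacity[l] for l in links):
                for l in links:
                    load[l] += 1
                granted.append(sender)
            else:
                deferred.append(sender)
        # Senders held back in this slot are considered first in the next one.
        queue = deque(deferred + granted)
        timeline.append(granted)
    return timeline


if __name__ == "__main__":
    # Hypothetical topology: vm1 and vm2 compete for one uplink; vm3 does not.
    flows = {"vm1": {"uplink"}, "vm2": {"uplink"}, "vm3": {"rack2"}}
    capacity = {"uplink": 1, "rack2": 1}
    for slot, senders in enumerate(schedule_slots(flows, capacity, 4)):
        print(slot, senders)
```

In this toy topology, `vm1` and `vm2` alternate on the shared uplink while `vm3`, which crosses no contended link, is permitted in every slot. A real controller would also have to distribute these permissions to the end-nodes and enforce them at millisecond granularity, which this sketch does not model.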
Publications
- Vijay Shankar Rajanna, Smit Shah, Anand Jahagirdar, Christopher Lemoine, and Kartik Gopalan,
XCo: Explicit Coordination to Prevent Network Fabric Congestion in Cloud Computing Cluster Platforms,
In Proc. of 19th ACM International Symposium on High Performance Distributed Computing (HPDC), Chicago, Illinois, USA, 2010.
- Vijay Shankar Rajanna, Smit Shah, Anand Jahagirdar and Kartik Gopalan,
XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet,
In Proc. of 6th IEEE International Workshop on Storage Network Architecture and Parallel I/Os, Incline Village, NV, USA, 2010.