TOC |
|
This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 18, 2010.
Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
This draft provides a review of the proposed load balancing methods for structured (DHT-based) P2P systems, and introduces their operating principles and main characteristics. To maximize performance, reliability and fairness of a structured P2P system, the load between the nodes should be equally or nearly equally distributed. Structured P2P systems are usually based on Distributed Hash Tables (DHT), that distribute peers to the identifier space and objects among peers by choosing random identifiers for them. In the worst case, this may result in O(log N) imbalance of object distribution between the peers, causing high degree of heterogeneity in loads of peer nodes. Other sources for load imbalance are frequent arrival and departure of nodes (i.e. churn) and the variation in both the size and the popularity of objects. Load balancing in structured P2P systems has been thoroughly studied during the past years. In this draft, we introduce and compare the existing fundamental load balancing models.
1.
Introduction
2.
Terminology
3.
Fundamental load balancing models
3.1.
Virtual servers
3.2.
Controlling the object location
3.3.
Controlling the node location
3.4.
Address space balancing
3.5.
Other mechanisms
4.
Performance Analysis
4.1.
Virtual servers
4.2.
Controlling the object location
4.3.
Controlling the node location
4.4.
Address space balancing
5.
Acknowledgements
6.
IANA Considerations
7.
Security Considerations
8.
References
8.1.
Normative References
8.2.
Informative References
§
Authors' Addresses
TOC |
In most of the dedicated client/server systems it is enough to prevent the load within a node to grow in such extent that it would fail in performing its tasks. However, in P2P systems the requirements for the fairness of load balancing are higher since P2P nodes are not usually dedicated to certain tasks. Instead, the P2P software usually runs on the background while the user is doing something else with the node. This emphasizes the importance of fair load distribution between the networked nodes.
The network management and routing in structured P2P systems are based on a collaborative effort of the participating nodes instead of centralized solutions. Structured P2P networks were developed to improve routing efficiency and success rate, and thus to decrease the communication overhead. In structured P2P networks, the overlay uses specialized algorithms for generating topologies for the nodes participating the system. The overlay has specific control for the routing between the nodes, and the overlay controls the content placement. Thus, the exact location of the specific content is always known by the overlay. The load generated by the overlay signaling, is called signaling load. Another source of load in P2P networks is data transfer load, which originates from content transfers between the nodes. In the basic P2P scenario, the content is moved directly between the nodes, and thus the content transfer generates load only to the source and target node. In addition, most of the systems allows the content to be stored outside the node which is actually sharing it. In the case of realtime content, the media cannot be split in similar manner to third party nodes. However, the realtime streams may be routed through one or more third party nodes. The reason for using third party streaming relays is usually the restrictions of the access network of source or target node, such as Network Addres Translation (NAT).
This draft provides a survey of the existing and emerging load balancing technologies in the area of signaling in structured P2P networking. The main purpose of this draft is to provide a scientific background for defining which load balancing model or models are the most suitable for P2PSIP environment.
TOC |
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.).
Load: The stress caused to a certain node in the network, concerning e.g. the node's computational, network, memory, storage, or battery capacity.
Overload: A computing entity can be interpreted as overloaded if its functionality, performance, reliability, or usability substantially suffers from the load.
Load balancing: a technique to distribute workload evenly across two or more networked nodes, in order to get optimal resource utilization, maximize throughput, minimize response time, and avoid overload
TOC |
Load balancing in P2P has special aspects and requirements that distinguish it from load balancing in distributed server systems. P2P systems usually consist of two distinguishable functions: 1) Search and discovery of nodes, content, or services (also called as signaling part), and 2) Transferring content or information between the nodes in either realtime or non-realtime manner (also called as media part). Thus, also load balancing has to consider both parts. In the first version of this draft, we consider only signaling part. However, also media transfer-related load balancing mechanisms has to be considered. This part remains in TBD status.
TOC |
During the past years, a great deal of research has been conducted to improve the load balancing in structured P2P networks. Many of the proposed load balancing mechanisms are based on the concept of virtual servers [CHORD] (Stoica, I., Morris, R., Liben-Nowell, D., Karger, D., Kaashoek, M., Dabek, F., and H. Balakrishnan, “Chord: a scalable peer-to-peer lookup protocol for Internet applications,” .), where single physical node may host several virtual servers that are all participating independently the same overlay. The concept of multiple virtual servers per physical node balances the load statistically. The model also enables resource-aware load balancing, since the number of virtual servers allocated per node can vary. In dynamic virtual server load balancing [VIRTUALSERVER_1] (Rao, A., Laksminarayanan, K., Surana, S., Karp, R., and I. Stoica, “Load Balancing in Structured P2P Systems,” .), [VIRTUALSERVER_2] (Dabek, F., Kaashoek, M., Karger, D., Morris, R., and I. Stoica, “Wide-area Cooperative Storage with CFS,” .), the load between the physical nodes is balanced by dynamically controlling the number of virtual servers per physical node (i.e. moving them between physical nodes during runtime).
TOC |
Another fundamental load balancing concept is to directly control the object location on the overlay. Power of Two Choices [POWEROF2] (Byers, J., Considine, J., and M. Mitzenmacher, “Simple Load Balancing for DHTs,” .), is based on the concept of using multiple hash functions per object. When inserting an object to an overlay, all respective hash values are computed and the load information of the corresponding peer candidates is retrieved. Finally, the object is stored on the peer with the lightest load. Searching the object can be implemented by; 1) sending separate lookups using each respective hash values, or 2) using redirection pointers between the corresponding peer candidates, and sending single lookup using some of the hash functions (if the peer receiving the lookup does not contain the object, it forwards the request to the containing peer, using the redirection pointer).
TOC |
Controlling the node location on the overlay can as well be used for load balancing. Node migration, proposed by Karger&Ruhl [SIMPLELOADBAL] (Karger, D. and M. Ruhl, “Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems,” .), is a mechanism allowing underloaded nodes to migrate to portions of the address space occupied by too many data items, to share the load of the node responsible for the overloaded address space. In this model, an underloaded node queries the load of a randomly chosen remote node and makes a pairwise load comparison between itself and the remote node. If the load of the remote node is higher, the node relocates to share the address space of the remote node. Rieche et al [HEATDISP] (Rieche, S., Petra, L., and K. Wehrle, “A Thermal-Dissipation-based Approach for Balancing Data Load in Distributed Hash Tables,” .), presented another load balancing model, which is fundamentally based on moving objects between the overlay nodes. In this model, the address space is divided to intervals, containing f nodes, where f is a fixed minimum number of nodes assigned to an interval. The nodes within the interval are collectively responsible for all of the objects stored in the interval. If the interval gets overloaded,; 1) the overloaded interval can be split to two intervals if the interval contains more than 2f nodes, 2) move nodes from other intervals to the overloaded interval to reach 2f nodes in the interval and split it (as in the first case), or 3) move the border between the overloaded interval and its successor or predecessor to balance the load between them.
TOC |
Yet another fundamental load balancing model is address space balancing between the overlay nodes. The concept was introduced in [SIMPLELOADBAL] (Karger, D. and M. Ruhl, “Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems,” .), together with the previously introduced object location controlling model. The main idea in the address space balancing is that each node has a fixed set of O(log N) possible locations, called virtual nodes, on the overlay, calculated using different hash algorithms (notice the difference in the terms between virtual node and virtual server). Each node has only one of these virtual nodes active at any time. The virtual node that spans the smallest address space between the real node owning it and the succeeding real node, is selected as the active node. The selection is made occasionally. With high probability, each node will be responsible for O(1/N) fraction of the address space.
TOC |
Several improvements have been developed for most of the presented fundamental load balancing models, offering e.g. better performance, lower overhead or locality awareness. However, as they do not significantly change the fundamental working principles, they are not presented as separate load balancing categories.
TOC |
Common goal for each of the presented fundamental load balancing mechanisms is to achieve fair load distribution between the overlay nodes without generating too much overhead to the overlay. However, there are significant differences in both the fairness of load distribution and the generated overhead. The following subsections discuss the benefits and drawbacks of each of the presented fundamental mechanisms.
TOC |
In its basic form, the virtual server model affects the load balance statistically. The situation can be compared to throwing a die once versus throwing it n times and calculating their average result. Even though simple and powerful, this method is also unable to address neither the heterogeneity in the node capabilities nor the dynamic changes in the load. The heterogeneity between the nodes can be addressed by allocating more virtual servers to higher-capacity devices and less virtual servers to lower-capacity devices. Dynamic load variations can be addressed by a mechanism that allows moving virtual nodes (and the objects they are responsible for) between the nodes during runtime.
The general problem with the virtual server model is the overhead it generates. As a physical node appears as multiple virtual nodes in the overlay, the overlay size grows with O(N) relation to the number of virtual nodes per physical device. The problem is emphasized with structured P2P networks, due to their dependence on network management signaling, which grows with the overlay size. With some optimizations, such as sending only one request message per physical node and keeping common node-specific routing tables, the signaling overhead does not grow with the same proportion, but is still a major problem. The situation gets even worse with dynamic load balancing, which requires mechanisms for runtime load monitoring and moving virtual nodes between the nodes.
TOC |
Controlling the object location, with e.g. the power of two choices method, is simple but efficient load balancing method. The method does not generate any additional management overhead, since each physical device appears as a single node in the overlay, in contrast to the virtual server model. Instead, the model generates some additional overhead to payload messaging due to the redundancy in storing and looking up an object. However, as proved by multiple research results [REFERENCES], the management signaling dominates the traffic in structured P2P networks. Thus, the generated overall overhead is significantly lower than with virtual server model in most of the P2P networks. With redirection pointers, the payload signaling overhead can be alleviated with the cost of increased maintenance signalling, as the pointers need maintenance.
The basic model of power of two choices is a passive mechanism, i.e. it does not provide active load transfers between the nodes. Thus, even though with high probability an overloaded node does not receive any additional objects during runtime, the objects (causing overload to the node) cannot be moved to another node. However, if the redirection pointers are in use, and thus the responsible node candidates for a certain object know each other, they can move objects between each other to provide dynamic load balancing. In general, the model works better in dynamic, high churn networks with short object lifetimes. The model also inherently takes into account the node heterogeneity, as the load is calculated locally at each node.
TOC |
Similarly to previous model, controlling the node location aims to balance the load between nodes by controlling the number of objects per node. However, instead of controlling object locations, the locations of nodes themselves are controlled by the load balancing mechanism. In a passive form, additional overhead is generated only during the joining phase and when the nodes are looked up. Object lookups are not affected by the model. Active node location management generates also load transfer overhead, as the objects are moved between the moved node and the node taking over the objects the moving node was storing in the original location and, similarly, between the moving node and the node that was holding the objects in the new location. Depending on the used load balancing model, the distribution of load information between the devices may also generate overhead to the overlay. Also this model takes into account the node heterogeneity, as the load is calculated locally.
TOC |
This load balancing model is somehow similar to the previously presented node location management mechanism, but the goal is different. Address space balancing aims to place the nodes to the overlay so that the address space controlled by each node is as equal as possible. This model generates management and load transfer overhead, as the node optimum node location is determined every time the address space of a node is changed due to neighbour node departures and new node arrivals, and changed when needed. A drawback of this model is that it does not take into account the actual load of the node.
TOC |
The authors would like to acknowledge...
TOC |
This draft includes no request to IANA at this time.
TOC |
No security considerations at this time.
TOC |
TOC |
[RFC2119] | Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997 (TXT, HTML, XML). |
TOC |
[CHORD] | Stoica, I., Morris, R., Liben-Nowell, D., Karger, D., Kaashoek, M., Dabek, F., and H. Balakrishnan, “Chord: a scalable peer-to-peer lookup protocol for Internet applications.” |
[HEATDISP] | Rieche, S., Petra, L., and K. Wehrle, “A Thermal-Dissipation-based Approach for Balancing Data Load in Distributed Hash Tables.” |
[POWEROF2] | Byers, J., Considine, J., and M. Mitzenmacher, “Simple Load Balancing for DHTs.” |
[SIMPLELOADBAL] | Karger, D. and M. Ruhl, “Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems.” |
[VIRTUALSERVER_1] | Rao, A., Laksminarayanan, K., Surana, S., Karp, R., and I. Stoica, “Load Balancing in Structured P2P Systems.” |
[VIRTUALSERVER_2] | Dabek, F., Kaashoek, M., Karger, D., Morris, R., and I. Stoica, “Wide-area Cooperative Storage with CFS.” |
TOC |
Erkki Harjula | |
University of Oulu | |
Erkki Koiso-Kanttilan katu 3 | |
University of Oulu, 90014 | |
Finland | |
Phone: | +358 8 553 2522 |
Email: | erkki.harjula@ee.oulu.fi |
Mika Ylianttila | |
University of Oulu | |
Erkki Koiso-Kanttilan katu 3 | |
University of Oulu, 90014 | |
Finland | |
Phone: | +358 8 553 25311 |
Email: | mika.ylianttila@ee.oulu.fi |