Network Working Group | L.D. Deng |
Internet-Draft | J.P. Peng |
Intended status: Informational | China Mobile |
Expires: January 03, 2014 | Y.Z. Zhang |
CoolPad | |
July 02, 2013 |
Efficient Chunk Availability Compression for PPSP
draft-deng-ppsp-bfbitmap-00.txt
This document proposes to employ bloom filter algorithms in compressing chunk availability information in exchange between peers and the tracker through the PPSP-TP protocol and PPSPP protocol.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 03, 2014.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
As it is pointed out by [I-D.ietf-ppsp-problem-statement], current practices often use a "bitmap" message in order to exchange chunk availability. The message is of kilobytes in size and exchanged frequently, e.g., an interval of several seconds or less.
To begin with, in a mobile environment with scarce bandwidth, the message size may need to be shortened or it may require more efficient methods for expressing and distributing chunk availability information, which is different from wire-line P2P streaming.
Even in a wire-line P2P streaming application, frequent exchange of large volume of bitmap information, is among the key factors that set a limit to the system's efficiency and scalability [P2P-limit].
Therefore, the following requirements for PPSP protocols in terms of chunk availability exchange are stated in [I-D.ietf-ppsp-problem-statement] :
To this end, we propose to employ bloom filter algorithms in compressing chunk availability information in exchange between peers and the tracker through the PPSP-TP protocol and PPSPP protocol, in view of its widely demonstrated high compacted data structure and low complexity and cost in storage, transportation and computation.
Bloom Filter (or BF for short) was first introduced in 1970s [BF-bloom], which makes use of multiple hashing functions to build a mapping from a set to a compact binary array, for the purpose of highly efficient member queries with a tolerably low error rate of wrongly reported hits. Despite their extraordinary efficiency in terms of storage reduction and query acceleration, bloom filters suffer from the fact that there is possibility that a non-member of the set be taken as a member after the query. However, research [BF-analysis] shows that the odds that a BF makes an erroneous query response can be suppressed to near zero, by tactful selection and configuration of various system parameters, including the hash functions used, the number of hash functions to be used, and the length of the bit array.
------------------------------------------------ BF(set S, integer m, hash set H) 1 filter=allocate m bits initialized to 0; 2 for each xi in S do 3 for each hash functions hi in H do 4 filter[hi(xi)]=1; 5 return filter; ------------------------------------------------- MT(element elm, BF filter, integar m, hash set H) 1 for each hash functions hi in H do 2 if (filter[hi(elm)]!=1) 3 return false; 4 return true; -------------------------------------------------
Figure 1: Basic algorithms for BF-bitmap
More specifically, as shown in Figure 1, the BF(S,m) algorithm takes a n-membered sub-set S={x1,x2,...,xn} from a universal set U, as input and outputs a m-bit binary array B as a compacted representation of S. In order to do that, it makes use of k independent random hash functions, each of which maps a member to a bit in B (i.e hj: U-> [1,m], j=1...k) by marking all the k bits mapped from each member of S. The BF algorithm is highly efficient in the following aspects:
For instance, given a 2GB movie file, the original bitmap for a sharing peer would be 1024-bit, if using 2MB-sized chunks. By simply using 4 random hash functions and a 128-bit BF-bitmap, the possibility of erroneous hits by MT algorithm would be lower than 3%.
As for a simple illustration, the 4 hash functions may be established through the MD5 message-digest algorithm [RFC1321], a widely used cryptographic hash function that produces a 128-bit (16-byte) hash value from an arbitrary binary input. MD5 has been utilized in a wide variety of security applications, and is also commonly used to check data integrity.
Specifically, the processing of 4 hash functions is as follows: use the MD5 algorithm to turn a given chunk_ID into a 128-bit binary array, further separate the 128 bits into 4 arrays (32-bit each), and finally divide each of them using 128 to yield four integers in the range of [1,m].
We borrow the general message flow (shown in Figure 2) from [I-D.ietf-ppsp-base-tracker-protocol] to exemplify the usage of BF-bitmap in PPSP protocols.
+--------+ +--------+ +--------+ +--------+ +-------+ | Player | | Peer 1 | | Portal | | Tracker| | Peer 2| +--------+ +--------+ +--------+ +--------+ +-------+ | | | | | |--Page request----------------->| | | |<--------------Page with links--| | | |--Select stream (MPD Request)-->| | | |<--------------------OK+MPD(x)--| | | |--Start/Resume->|--CONNECT(join x)------------>| | |<-----------OK--|<----------------OK+Peerlist--| | | | | | |--Get(Chunk)--->|<---------- (Peer protocol) ------------->| |<--------Chunk--|<---------------------------------Chunks--| : : : : : | |--STAT_REPORT---------------->| | | |<-------------------------Ok--| | : : : : : |--Get(Chunk)--->|<---------- (Peer protocol) ------------->| |<--------Chunk--|<---------------------------------Chunks--| : : : : :
Figure 2: A typical PPSP session for watching a streaming content.
When a peer wants to receive streaming of a selected content (Leech mode):
When a peer wants to share streaming contents (Seeder mode) with other peers:
This document proposes to employ bloom filter algorithms in compressing chunk availability information in exchange between peers and the tracker through the PPSP-TP protocols and PPSPP protocol. The key modifications to the current protocols are summarized as follows: (as shown in Figure 3)
+--------+ +--------+ +--------+ +--------+ +-------+ | Player | | Peer 1 | | Portal | | Tracker| | Peer 2| +--------+ +--------+ +--------+ +--------+ +-------+ | | | | | (a1) |--Page request----------------->| | | |<----Page with links(+BF conf)--| | | |--Select stream (MPD Request)-->| | | |<----------OK+MPD(x)(+BF conf)--| | | (a2) |--Start/Resume->|--CONNECT(join x)------------>| | |<-----------OK--|<------OK(+BF conf)+Peerlist--| | | | | | (c1) | |<------------ HAVE(BF(S2))----------------| (c2) |-Get(Chunk s1)->| | | | (c3) | |------------- REQUEST(BF(s1))------------>| |<--------Chunk--|<---------------------------------Chunks--| : : : : : (a3) | |-STAT_REPORT(BF(ContentMap))->| | | |<-------------------------Ok--| | : : : : : |--Get(Chunk)--->|<---------- (Peer protocol) ------------->| |<--------Chunk--|<---------------------------------Chunks--| : : : : :
Figure 3: A typical PPSP session with BF-bitmaps.
As stated earlier, the BF scheme is based on a mutual arrangement between the information requestor and the responder of the basic settings for the hash algorithms (both the number of them and the specific ones) in use and the coded bitmap's binary length. In other words, there SHOULD be a way of configuration setup mechanism in a local system.
To serve as the input for further discussion, we provide two initial proposals here:
TBA
None.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. |
[RFC2234] | Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997. |
[I-D.ietf-ppsp-problem-statement] | Zhang, Y. and N. Zong, "Problem Statement and Requirements of Peer-to-Peer Streaming Protocol (PPSP)", draft-ietf-ppsp-problem-statement-12 (work in progress), January 2013. |
[P2P-limit] | Feng, C., Li, B. and B. Li, "Understanding the performance gap between pull-based mesh streaming protocols and fundamental limits", in Proc. of IEEE INFOCOM , 2009. |
[I-D.ietf-ppsp-base-tracker-protocol] | Cruz, R., Nunes, M., Gu, Y., Xia, J. and J. Taveira, "PPSP Tracker Protocol-Base Protocol (PPSP-TP/1.0)", draft-ietf-ppsp-base-tracker-protocol-00 (work in progress), February 2013. |
[I-D.ietf-huang-extended-tracker-protocol] | Huang, R., Zong, N., Cruz, R., Nunes, and J. Taveira, "PPSP Tracker Protocol-Extended Protocol", draft-ietf-huang-extended-tracker-protocol-02 (work in progress), February 2013. |
[BF-bloom] | Bloom, B.H., "Space/time trade-offs in hash coding with allowable errors.", Communications of ACM Vol. 13, No. 7, pp. 422-426, 1970. |
[BF-analysis] | Broder, A. and M. Mitzenmacher, "Network applications of Bloom Filters: a survey", Internet Mathematics Vol. 1, No. 4, pp. 485–509, 2004. |
[RFC1321] | Rivest, and Newport, "RFC 1321: The MD5 message-digest algorithm", draft-ietf-p2psip-base-26 (work in progress), April 1992. |
[I-D.ietf-p2psip-base] | Jennings, C., Lowekamp, B., Rescorla, E., Baset, S. and H. Schulzrinne, "REsource LOcation And Discovery (RELOAD) Base Protocol", February 2013. |