Network Working Group Y. Cui
Internet-Draft Z. Lai
Intended status: Informational L. Sun
Expires: May 5, 2016 Tsinghua University
November 2, 2015
Internet Storage Sync: Problem Statement
draft-cui-iss-problem-03
Abstract
Internet storage services have become increasingly popular. They
attract a huge number of users and produce a significant share of
Internet traffic. Most existing Internet storage services use
proprietary sync protocols with different capabilities to achieve
data sync. However, a single Internet storage service using its own
proprietary sync protocol has intrinsic limitations in service
usability and network performance. This document outlines the
problems caused by using proprietary sync protocols and by missing
key capabilities. It also motivates the design of a standard sync
protocol to achieve better usability and sync performance.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 5, 2016.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
Cui, et al. Expires May 5, 2016 [Page 1]
Internet-Draft iss Problems November 2015
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Terminology and Concepts
3. Architecture of Internet Storage Service
4. Problems
  4.1. Complicated Support for APIs
  4.2. Unavailable Cross-service Sync
  4.3. Multiple Similar Clients
  4.4. Protocol Capability Configurations and Implementations
    4.4.1. Chunking and Deduplication
    4.4.2. Chunking and Delta-encoding
    4.4.3. Bundling
  4.5. Sync Protocols in Mobile and Wireless Environments
  4.6. Unsatisfactory Concurrent Work Ability
5. Advantages of Standard Sync Protocol
6. Understanding of Sync Protocol
7. Related Work in IETF
8. Security Considerations (TBD)
9. Acknowledgements
10. Informative References
Authors' Addresses
1. Introduction
Internet storage services provide a convenient way for users to
synchronize local files or folders with remote servers. In recent
years, Internet storage services have gained tremendous popularity
and accounted for a large amount of Internet traffic. This high
public interest also pushes various providers to enter the Internet
storage market. Services like Dropbox, Google Drive, OneDrive and
Box are becoming pervasive in people's daily routines. Dropbox,
typically considered one of the leading providers, announced in June
2015 that it had more than 400 million registered users [users], and
this number is expected to keep growing. Internet storage
services enable the users to access, operate and share their data
from anywhere, on any devices, at any time and with any connectivity.
Internet storage services also provide powerful APIs which allow
third-party applications to offload the burden of data storage and
management to the server. By aggregating users' files or application
data in the server, Internet storage services are becoming the "data
entrance" for personal users.
The sync protocol is a key design consideration of Internet storage
services. It can be equipped with several capabilities to optimize
storage usage and speed up data transmission. Existing Internet
storage services employ their own proprietary sync protocols to store
user data on, and retrieve it from, the remote servers. However,
using proprietary sync protocols with different capabilities in
different Internet storage services imposes intrinsic limitations on
service usability and network performance.
Multi-service usability: Users may use multiple Internet storage
services to benefit from their different performance and
functionality. In addition, because an Internet storage service has
full access to user data, that data is at risk when the service is
attacked or when authorities require the provider to disclose it.
Some enterprise users may want to use their own network-based storage
service. Furthermore, it is complicated for developers to use
different APIs to integrate their applications with Internet storage
services. Proprietary protocols also make it impossible for a user
of one Internet storage service to synchronize data with users of
another service. Moreover, to use multiple services a user may have
to install a series of client applications with similar
functionality, which wastes local resources and degrades the user
experience.
Missing or misused capabilities: Previous work shows that existing
Internet storage services have different capability configurations
and implementations. These capabilities are closely related to each
other and help to synchronize user data efficiently. However, most
storage services are found to lack key capabilities or to configure
them unreasonably, which may result in unexpected sync failures and
sync inefficiency. How to design and implement capabilities
reasonably in the sync protocol has become a critical problem for
providers.
To address the problems mentioned above, an open and standard sync
protocol is required. In addition, this standard sync protocol is
expected to support the useful capabilities to avoid unexpected sync
failures and improve network performance.
This document outlines the problems that arise in existing Internet
storage services with various proprietary sync protocols. Section 2
lists the terminology and related concepts of Internet storage
services. Section 3 introduces the architecture of existing Internet
storage services. Section 4 describes the main problems and issues
that need to be considered. Section 5 explains the advantages of
using an open and standard sync protocol. Section 6 gives a high-level
understanding of the sync protocol. Section 7 identifies the
differences between ISS and related work in IETF (i.e. WebDAV).
2. Terminology and Concepts
Data synchronization (sync): A primary technique for Internet storage
services. It enables the client to automatically update local file
changes to the remote servers through network communications.
Client: An application installed on the user side (e.g. on multiple
terminals). It enables users to access and use the Internet storage
service.
Control server: The entity that takes the responsibility of
authenticating users, managing metadata information and also
notifying changes to the client. It stores authentication and
metadata information of users.
Data storage server: The entity that stores the synchronized files of
users.
Control data: The control information exchanged with the control
server to carry out the data sync process. Typical control data
includes metadata (e.g. hashes for chunks) and authentication
information.
Content data: The original data of the local file, often in forms of
small chunks.
Sync protocol: A communication protocol between the client and the
remote servers to achieve data sync. It contains a control flow and
a data flow. Existing sync protocols are typically built on
HTTPS/HTTP.
o Control flow: This flow is for client and control server to
exchange control data.
o Data flow: This flow is for transmitting content data between
client and data storage servers.
Sync efficiency: A performance metric that indicates how fast the
changes can be synchronized to the Internet with the lowest traffic
overhead.
Useful capabilities to improve sync efficiency:
o Chunking: Split a large file into small chunks.
o Bundling: Transmit multiple small chunks as a single large chunk.
o Deduplication: Avoid retransmission of existing content on the
Internet.
o Delta-encoding: Only synchronize modified data.
o Compression: Compress data before transmission.
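Three of the capabilities listed above (chunking, deduplication and
compression) can be combined in a few lines. The following is only a
minimal sketch and is not taken from any existing service: the
function name prepare_upload, the 4096-byte chunk size and the use of
SHA-256 and zlib are all assumptions chosen for illustration.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed chunk size, chosen only for illustration


def prepare_upload(data: bytes, known_hashes: set) -> dict:
    """Return {chunk_hash: compressed_chunk} for chunks the server lacks.

    Chunking: the data is split into fixed-size chunks.
    Deduplication: chunks whose hash is already known are skipped.
    Compression: each remaining chunk is compressed before transmission.
    """
    payload = {}
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest in known_hashes or digest in payload:
            continue  # duplicate content, nothing to send
        payload[digest] = zlib.compress(chunk)
    return payload


# Three chunks, two of which are identical.
data = b"A" * (2 * CHUNK_SIZE) + b"B" * CHUNK_SIZE
payload = prepare_upload(data, known_hashes=set())
sent_bytes = sum(len(blob) for blob in payload.values())
```

In this example only two of the three chunks would be transmitted,
and a second call that passes the hashes already uploaded would
transmit nothing. Bundling and delta-encoding are left out to keep
the sketch short.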
3. Architecture of Internet Storage Service
The architecture of most Internet storage services is generally
composed of three major components: client, control server and data
storage server. The overall architecture is shown in Figure 1.
                        * * * * * * * *
                  * * * * * * * * * * * * * *
             *             INTERNET              *
          * +------------+        +------------+   *
      ------|  Control   |        | +------------+ *
      |   * |   server   |        | |Data storage|========
      |   * +------------+        + |  servers   | *     |
      |    *                        +------------+ *     |
      |           * * * * * * * * * * * * * *            |
Control Flow            * * * * * * * *              Data Flow
      |                                                  |
      |                                                  |
      |                    +--------+                    |
      ---------------------| Client |=====================
                           +--------+
Figure 1
With the help of the sync protocol, the three components can
communicate with each other. The control server is responsible for
storing all control data, including authentication information and
metadata. Once changes are made to synchronized files, the control
server notifies the clients. The other type of data, content data,
is stored in the form of chunks on the data storage servers, which
have no knowledge of the sources, the users or the relationships
between chunks. As a result, a complete user file is split into
small chunks, and those chunks may be stored on several different
data storage servers. The two types of servers are separate logical
entities and are usually deployed in different locations. Every time
the client synchronizes a local file to the Internet, it needs to
exchange control data and content data with different types of
servers in different flows.
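The separation of control data and content data described above can
be modelled with two small classes. This is a hedged sketch rather
than a description of any deployed service: the class and function
names, the placement rule in pick_server and the 4-byte chunk size
are hypothetical and serve only to make the two flows visible.

```python
import hashlib


class DataStorageServer:
    """Stores opaque chunks keyed by hash; knows nothing about files or users."""

    def __init__(self):
        self.chunks = {}

    def put(self, chunk_hash, chunk):
        self.chunks[chunk_hash] = chunk

    def get(self, chunk_hash):
        return self.chunks[chunk_hash]


class ControlServer:
    """Stores metadata: for each (user, path), the ordered chunk hashes."""

    def __init__(self):
        self.manifests = {}

    def commit(self, user, path, chunk_hashes):
        self.manifests[(user, path)] = list(chunk_hashes)

    def lookup(self, user, path):
        return self.manifests[(user, path)]


def pick_server(chunk_hash, servers):
    """Hypothetical placement rule: choose a server from the hash value."""
    return servers[int(chunk_hash, 16) % len(servers)]


def upload(user, path, data, control, servers, chunk_size=4):
    hashes = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        pick_server(digest, servers).put(digest, chunk)  # data flow
        hashes.append(digest)
    control.commit(user, path, hashes)                   # control flow


def download(user, path, control, servers):
    hashes = control.lookup(user, path)                  # control flow
    parts = [pick_server(h, servers).get(h) for h in hashes]  # data flow
    return b"".join(parts)


control = ControlServer()
servers = [DataStorageServer(), DataStorageServer()]
upload("alice", "/notes.txt", b"hello internet storage", control, servers)
restored = download("alice", "/notes.txt", control, servers)
```

Only the control server can map a file name to its chunks; each data
storage server sees hashes and bytes, which matches the "no knowledge
of sources, users and relationship" property stated above.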
4. Problems
Existing popular Internet storage services, including Dropbox,
OneDrive and Google Drive, use their own proprietary sync protocols
to achieve data sync. The use of different proprietary protocols is
generally considered detrimental to the development of Internet
services. This section describes current problems for Internet
storage services caused by their sync protocols. We summarize six
specific problems from three aspects: service usability, protocol
capabilities and concurrent work ability. As discussed in Section 1,
users prefer to use multiple storage services for reasons of
performance, reliability and security. Usability across multiple
services is still lacking because of the proprietary format of sync
protocols. Section 4.1, Section 4.2 and Section 4.3 describe the
usability problems. Moreover, previous work and measurements have
revealed that most sync protocols lack key capabilities or configure
them poorly, which significantly degrades network performance,
especially in mobile and wireless environments. Section 4.4 and
Section 4.5 illustrate the problems of current protocol capabilities.
In addition, the unsatisfactory concurrent work ability is described
in Section 4.6.
4.1. Complicated Support for APIs
Popular Internet storage services provide APIs that extend access to
the content management features in client software for use in
third-party applications. In practice, these APIs take care of
synchronizing data with Internet storage servers through a familiar,
system-like interface. Behind the scenes, the APIs synchronize
changes to the server and automatically notify the client when
changes are made on other devices. These APIs can also include more
advanced features, e.g. revision or restoration of files, to make the
client work better. Different providers offer different APIs to
developers, and their APIs differ in style and features in order to
support different platforms (e.g. Windows and Android).
Third-party applications prefer to combine multiple Internet storage
services to achieve better performance, reliability and security.
However, developers who want to use multiple storage services need
to learn the APIs of all service providers in order to design and
implement their own clients.
Although there have already been some successful third party clients
that support multiple services (e.g. ExpanDrive [ExpanDrive], IFTTT
[IFTTT]), it is not easy for the developers to learn and apply so
many different APIs to develop and maintain their third party
clients.
4.2. Unavailable Cross-service Sync
Synchronization is one of the most important functions provided by
Internet storage services. With this function, files on the Internet
can easily be shared and manipulated by different people and groups.
Anyone who is permitted to read and download a file is able to modify
it and upload new versions to the Internet. However, synchronization
works well only within a single service. Users of the same Internet
storage service can easily share files (e.g. download them) and
perform coordinated operations on them. Synchronization across
different Internet storage services, by contrast, is not available.
For example, if a Dropbox user currently wants to work on a shared
file with a Google Drive user, the only option is to send an open
HTTP link to the file. After clicking on that link, the Google Drive
user can read and download the file over HTTP, but cannot modify and
update it. This is because the shared file is stored on Dropbox
servers, and a Google Drive client cannot download or upload the file
through Dropbox's sync protocol, which it does not understand. The
use of different proprietary sync protocols by different services
causes this unavailability.
4.3. Multiple Similar Clients
The emergence of more and more Internet storage services gives users
a wide range of choices for storing their local files remotely. As
with other Internet applications, users are not restricted to a
single service. In fact, they tend to have accounts with several
Internet storage services and use them simultaneously. One important
reason is that users are always pursuing better functionality. For
example, Dropbox is better at file processing, OneDrive is better at
interoperability and compatibility with Microsoft Office, and Google
Drive performs better with mail attachments. A simple way to obtain
all the desired functions and features is to register with and use
all the corresponding Internet storage services. Furthermore, people
may simply need multiple Internet storage services for larger storage
space and higher reliability.
However, using different Internet storage services means that users
have to install multiple similar client applications. Since almost
all commercial Internet storage services have their own proprietary
sync protocols and corresponding client applications, installing and
running multiple similar client applications degrades the user
experience and increases the complexity of synchronizing files with
different providers' servers on the Internet. For instance, users
have to repeat the same operations to upload the same file to each of
their service accounts.
4.4. Protocol Capability Configurations and Implementations
Data sync is not a simple remote file transfer process; it can
implement several capabilities to optimize data storage usage and
speed up data transmission. There are five well-known capabilities
that Internet storage services can employ to improve sync efficiency
and reliability: chunking, bundling, deduplication, delta-encoding
and compression. All these capabilities aim to help synchronize user
data efficiently via Internet communications.
However, the investigation in [Benchmarking] shows that different
Internet storage services have different capability configurations
and implementations. Most existing Internet storage services do not
implement all five capabilities in their sync protocols, and the lack
of such capabilities does affect sync efficiency. Table 1, from
[QuickSync], shows the capability implementations of four popular
Internet storage services (i.e. Dropbox, Google Drive, OneDrive and
Seafile) on Windows OS.
+----------------+-------------+-------------+-------------+-------------+
| Capabilities | Dropbox | GoogleDrive | OneDrive | Seafile |
| | | | | |
+----------------+-------------+-------------+-------------+-------------+
| Chunking | 4MB | 8MB | Variable | Variable |
+----------------+-------------+-------------+-------------+-------------+
| Bundling | Yes | No | No | No |
+----------------+-------------+-------------+-------------+-------------+
| Deduplication | Yes | No | No | Yes |
+----------------+-------------+-------------+-------------+-------------+
| Delta-encoding | Yes | No | No | No |
+----------------+-------------+-------------+-------------+-------------+
| Compression | Yes | Yes | No | No |
+----------------+-------------+-------------+-------------+-------------+
Table 1
Measurements and analysis in [QuickSync] also reveal that these key
capabilities significantly affect sync performance. Most of them
should be implemented and well configured to achieve efficient data
sync. The remainder of this subsection lists the problems caused by
missing or unreasonably configured capabilities.
4.4.1. Chunking and Deduplication
Chunking is the most widely implemented capability. It simplifies
transmission recovery when the sync of a large file is interrupted.
Different implementations of chunking use different chunking schemes
(i.e. dynamic or static chunking) and chunk sizes. Chunking is
closely related to deduplication, since deduplication is performed at
chunk granularity. Typically, a smaller chunk size and a dynamic
chunking scheme (e.g. Content Defined Chunking) are better at
detecting and eliminating redundancy. However, the ability to detect
more redundancy does not always translate into better sync
efficiency, since it introduces more computation overhead (i.e.
finding more redundancy needs more CPU time). An aggressive dynamic
chunking scheme (e.g. Content Defined Chunking) performs better in a
high-delay (i.e. high-RTT) environment, while a fixed-size scheme
performs well in good network conditions. The trade-off between
computation time and transmission time needs to be considered to
achieve effective chunking. A better chunking strategy may be
network-aware, meaning that the sync process employs an appropriate
chunking strategy according to current network conditions.
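The difference between fixed-size and content-defined chunking can be
made concrete with a small experiment. The sketch below is an
illustration under stated assumptions, not the scheme of any
particular service: the window length, the boundary mask, the 64-byte
fixed chunk size and the use of CRC-32 as the window hash are all
arbitrary choices, and a production implementation would use a rolling
hash instead of rehashing every window.

```python
import hashlib
import random
import zlib

WINDOW = 16   # bytes examined to decide on a boundary (assumed value)
MASK = 0x3F   # cut when the low 6 bits of the window hash are zero (assumed)


def fixed_chunks(data: bytes, size: int = 64) -> list:
    """Static chunking: cut every `size` bytes regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data: bytes) -> list:
    """Dynamic chunking: cut after position i whenever the hash of the
    last WINDOW bytes matches the boundary pattern, so that boundaries
    depend only on nearby content and not on absolute offsets."""
    chunks, start = [], 0
    for i in range(WINDOW, len(data) + 1):
        if zlib.crc32(data[i - WINDOW:i]) & MASK == 0:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared_chunks(old_chunks, new_chunks) -> int:
    """Number of distinct chunks that both versions have in common."""
    old = {hashlib.sha256(c).digest() for c in old_chunks}
    new = {hashlib.sha256(c).digest() for c in new_chunks}
    return len(old & new)


rng = random.Random(1)
old = bytes(rng.getrandbits(8) for _ in range(8192))
new = b"!" + old  # one byte inserted at the head of the file

fixed_reuse = shared_chunks(fixed_chunks(old), fixed_chunks(new))
cdc_reuse = shared_chunks(content_defined_chunks(old),
                          content_defined_chunks(new))
```

With content-defined boundaries, every chunk after the first one
reappears unchanged in the modified file, so deduplication can skip
it, whereas the one-byte shift misaligns all fixed-size chunks. The
price is the extra hashing work per byte, which is the computation
overhead mentioned above.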
4.4.2. Chunking and Delta-encoding
Delta-encoding is an algorithm that finds the differing portions of
two files in order to achieve incremental sync. However, not all
Internet storage services implement delta-encoding. One possible
reason is that most delta-encoding algorithms work at file
granularity, whereas Internet storage services often split files into
chunks in order to save storage space and thus reduce cost. Naively
piecing together all chunks to reconstruct the whole file for
incremental sync would waste massive intra-cluster bandwidth.
Therefore, some Internet storage services, e.g. Dropbox, implement
delta-encoding at chunk granularity. Delta-encoding is performed
between two chunks, one from the original version and one from the
modified version, that share the same offset from the beginning of
the file. If a service uses fixed-size chunking, some types of
modification, e.g. inserting new data at the head of a file, may
leave the two chunks being compared with very little similarity. In
this case, delta-encoding is unable to reveal the delta between the
original and the modified file, so incremental sync fails. To solve
this problem, an improved delta-encoding algorithm with appropriate
chunking is needed that keeps incremental sync available in various
scenarios.
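The failure case described above can be reproduced with a toy delta
encoder. The code below is a simplified sketch and not the algorithm
used by Dropbox or any other service: make_delta is a hypothetical
block-matching encoder, the block and chunk sizes are assumed values,
and the head insertion is deliberately chosen to be exactly one chunk
long so that chunks paired by offset have nothing in common.

```python
import random

BLOCK = 32     # matching granularity inside the delta (assumed value)
CHUNK = 1024   # fixed chunk size (assumed value)


def make_delta(old: bytes, new: bytes, block: int = BLOCK) -> list:
    """Encode `new` as ('copy', offset, length) and ('data', bytes) ops."""
    index = {}
    for off in range(0, len(old) - block + 1, block):
        index.setdefault(old[off:off + block], off)
    ops, literal, i = [], bytearray(), 0
    while i < len(new):
        off = index.get(new[i:i + block])
        if off is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", off, block))
            i += block
        else:
            literal.append(new[i])  # no known block starts here
            i += 1
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_size(ops: list) -> int:
    """Bytes that must actually be transmitted."""
    return sum(len(op[1]) for op in ops if op[0] == "data")


def per_chunk_literal_size(old: bytes, new: bytes, chunk: int = CHUNK) -> int:
    """Delta-encode each new chunk against the old chunk at the same offset."""
    total = 0
    for start in range(0, len(new), chunk):
        ops = make_delta(old[start:start + chunk], new[start:start + chunk])
        total += literal_size(ops)
    return total


rng = random.Random(7)
old = bytes(rng.getrandbits(8) for _ in range(4 * CHUNK))

edited = bytearray(old)
for k in range(100, 104):       # overwrite four bytes in place
    edited[k] ^= 0xFF
edited = bytes(edited)

inserted = bytes(rng.getrandbits(8) for _ in range(CHUNK)) + old  # head insert

in_place_cost = per_chunk_literal_size(old, edited)
head_insert_cost = per_chunk_literal_size(old, inserted)
whole_file_cost = literal_size(make_delta(old, inserted))
```

In this setting the in-place edit costs a single block, while the
head insertion forces the per-chunk encoder to resend the entire file.
Running the same encoder over the whole file would only send the
inserted data, which is the gap that an improved combination of
chunking and delta-encoding would need to close.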
4.4.3. Bundling
Small files are more likely to be modified and synchronized
frequently. For example, people usually collaborate on a number of
small files (e.g. a project's source code usually consists of
multiple small files). In a high-delay environment, synchronizing a
large number of small files is not efficient. One reason is that
most existing Internet storage services employ a sequential
acknowledgement mechanism, under which the next chunk may not be
transmitted until the acknowledgement of the previous chunk has been
received. The sequential acknowledgement mechanism wastes the
limited bandwidth, since the TCP connection is idle for a long time.
Bundling small files together and employing a delayed acknowledgement
mechanism makes fuller use of the limited bandwidth, so that the
overall sync time and traffic overhead can be significantly
decreased.
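A back-of-the-envelope model shows why sequential acknowledgement
hurts in a high-delay environment. The model below is a deliberate
simplification and an assumption of this example: each transfer costs
one round trip plus size divided by bandwidth, and effects such as TCP
slow start, per-request headers and server processing time are
ignored. The function names and the sample figures are hypothetical.

```python
def sequential_sync_time(file_sizes, rtt, bandwidth):
    """Each file is sent alone and must be acknowledged before the next.

    Sizes are in bytes, rtt in seconds, bandwidth in bytes per second.
    """
    return sum(rtt + size / bandwidth for size in file_sizes)


def bundled_sync_time(file_sizes, rtt, bandwidth, bundle_size):
    """Files are packed into bundles of at most bundle_size bytes and
    only one acknowledgement round trip is paid per bundle."""
    bundles, current = 0, 0
    for size in file_sizes:
        if current and current + size > bundle_size:
            bundles += 1      # close the current bundle
            current = 0
        current += size
    if current:
        bundles += 1
    return bundles * rtt + sum(file_sizes) / bandwidth


# 100 files of 10 kB over a 200 ms RTT link at 1 MB/s (illustrative values)
files = [10_000] * 100
t_sequential = sequential_sync_time(files, rtt=0.2, bandwidth=1_000_000)
t_bundled = bundled_sync_time(files, rtt=0.2, bandwidth=1_000_000,
                              bundle_size=1_000_000)
```

Under these assumptions the sequential scheme spends about 21 seconds,
20 of which are idle waiting for acknowledgements, while a single
bundle needs about 1.2 seconds.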
4.5. Sync Protocols in Mobile and Wireless Environments
The increasing number of mobile terminals introduces the requirement
of synchronizing data on any device, via any connectivity, at any
time and anywhere. A change made to the data on the desktop needs to
be automatically transferred to the user's mobile phone or other
mobile devices. Based on the measurements in [Look_at_Mobile_Cloud],
the problem of missing capabilities is more severe for mobile
Internet storage services. The root causes are twofold:
First, mobile devices have limited storage and computation
capacity. It is hard to implement all five capabilities discussed
previously on a mobile client, since implementing them brings extra
overhead (Table 2 shows the capability implementations on Android
OS). The measurement results in [Look_at_Mobile_Cloud] show that
none of the existing mobile Internet storage services implements all
five key capabilities, and only very few capabilities are found on
any mobile Internet storage client. That explains why most Internet
storage services waste limited bandwidth, produce a large amount of
useless traffic and suffer long sync times in the mobile environment.
How to implement all the desired capabilities with lower storage and
computation requirements is a critical problem that needs to be
addressed.
+----------------+-------------+-------------+-------------+-------------+
| Capabilities | Dropbox | GoogleDrive | OneDrive | Seafile |
| | | | | |
+----------------+-------------+-------------+-------------+-------------+
| Chunking | 4MB | 260K | 1MB | No |
+----------------+-------------+-------------+-------------+-------------+
| Bundling | No | No | No | No |
+----------------+-------------+-------------+-------------+-------------+
| Deduplication | Yes | No | No | No |
+----------------+-------------+-------------+-------------+-------------+
| Delta-encoding | No | No | No | No |
+----------------+-------------+-------------+-------------+-------------+
| Compression | No | No | No | No |
+----------------+-------------+-------------+-------------+-------------+
Table 2
Second, sync protocols do not handle network disruptions caused by
unstable network connections well. For example, some services fail
to resume sync if the data transmission is interrupted, or incur too
much additional recovery overhead when an exception occurs. A
well-designed sync protocol that guarantees reliability and
efficiency in mobile and wireless networks is expected.
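One way to keep recovery overhead low is to remember which chunks
have already been acknowledged and to resume from the first
unacknowledged one. The following sketch illustrates that idea only;
FlakyServer is a hypothetical stand-in that simulates disruptions, and
the retry limit is an arbitrary assumption.

```python
class FlakyServer:
    """Hypothetical data storage server whose connection drops on chosen calls."""

    def __init__(self, fail_on_calls):
        self.fail_on_calls = set(fail_on_calls)
        self.calls = 0
        self.stored = {}

    def put(self, index, chunk):
        self.calls += 1
        if self.calls in self.fail_on_calls:
            raise ConnectionError("simulated network disruption")
        self.stored[index] = chunk


def resumable_upload(chunks, server, acked, max_attempts=5):
    """Upload chunks, skipping those that were already acknowledged.

    `acked` is a set of chunk indices that survives across attempts, so
    a disruption costs at most the chunk that was in flight.
    """
    for _ in range(max_attempts):
        try:
            for index, chunk in enumerate(chunks):
                if index in acked:
                    continue
                server.put(index, chunk)
                acked.add(index)  # recorded only after a successful put
            return True
        except ConnectionError:
            continue              # resume from the first unacknowledged chunk
    return False


chunks = [b"c0", b"c1", b"c2", b"c3"]
server = FlakyServer(fail_on_calls={2, 4})
acked = set()
completed = resumable_upload(chunks, server, acked)
```

Here two disruptions cost two extra transmissions in total (six calls
for four chunks) instead of restarting the whole file each time.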
4.6. Unsatisfactory Concurrent Work Ability
With the popularity of Internet storage services, collaborative work
is becoming an important feature of such services. This feature is
especially important and convenient for a team or an organization,
since participants can easily retrieve and edit the target file on
the Internet. Currently, this collaborative work ability is still
unsatisfactory, because some common and frequent operations may lead
to redundant file versions. More specifically, parallel updates from
different end users may result in a version conflict. If two or more
users are editing the same file concurrently, it is hard to update
the file correctly. To ensure that every participant's modification
is taken into account, the typical approach is to lock the file and
let the other participants create different versions of the same
file. To obtain a final version, participants have to negotiate with
each other about their modifications (versions) and merge the final
version manually. This hurts work efficiency, since people have to
spend considerable time and effort on managing redundant versions and
merging a final version.
A desirable concurrent work ability is as follows: when different
people are working on the same file, each client automatically
creates an exclusive version for its user locally. After the users
have finished and uploaded their versions, the server automatically
merges the different versions into a final version without any human
involvement. An even better solution is real-time editing as
provided by [GoogleDocs], where multiple people can edit the same
file and see each other's cursors and operations in real time. Such
an ability would improve collaborative work but is very challenging
to support when designing a protocol.
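Automatic server-side merging is commonly approached as a three-way
merge against the version both users started from. The sketch below
shows the idea under a strong simplifying assumption that is not part
of this document: edits replace lines but never add or remove them.
The function name and the sample data are hypothetical.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption: all three versions have the same number of
    lines. Returns (merged_lines, conflicting_line_numbers).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for n, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both agree, or neither changed the line
            merged.append(m)
        elif m == b:      # only the other user changed it
            merged.append(t)
        elif t == b:      # only this user changed it
            merged.append(m)
        else:             # both changed it differently
            merged.append(b)
            conflicts.append(n)
    return merged, conflicts


base = ["title", "intro", "body", "end"]
alice = ["Title", "intro", "body", "end"]
bob = ["title", "intro", "body v2", "end"]
carol = ["TITLE", "intro", "body", "end"]

merged, conflicts = three_way_merge(base, alice, bob)
_, clash = three_way_merge(base, alice, carol)
```

Edits to different lines merge without human involvement, whereas
Alice and Carol both changed the first line, so that line is reported
as a conflict that still needs negotiation.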
5. Advantages of Standard Sync Protocol
An open and standard sync protocol between client and server can
effectively address some problems mentioned above. The sync protocol
consists of two types of flows: control flow and data flow. Control
flow is between client and control server. It is intended for user
authentication, metadata management and also the active notification
of data changes. Data flow is between client and data storage
servers, which is only for transmitting actual file data (in the form
of numerous chunks). The combination of control flow and data flow
enables the whole data sync. According to the analysis of problems
above, the key capabilities could be supported as optional features
in the sync protocol and it would be better if the protocol is
network-aware. The rest of this section lists the advantages of
employing an open and standard sync protocol.
First, with a standard sync protocol, a third-party client that
supports multiple Internet storage services is easy to implement,
since the APIs provided by different providers would be unnecessary
or at least simplified. This would attract more people and
organizations to develop their own clients (it may even become
possible for users to implement their own clients). As a result,
users no longer need multiple clients for multiple services, and
their user experience is improved. Furthermore, competition in the
(third-party) client market would increase, which benefits users.
They can choose their clients flexibly, and frequent client updates
give them more functions and a better user experience.
Another advantage of a standard sync protocol is that sync among
different services becomes available, or at least possible to
achieve. If two different services both employ the standard sync
protocol, their users can synchronize files with each other using the
same protocol (rather than basic HTTP download). In this way, users
of different services can share files and perform coordinated
operations on them.
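From a client developer's point of view, the two advantages above
amount to programming against one interface instead of one API per
provider. The sketch below is purely illustrative: SyncEndpoint and
its two methods are hypothetical and do not represent a proposed
protocol, and InMemoryService merely stands in for a real provider.

```python
from abc import ABC, abstractmethod


class SyncEndpoint(ABC):
    """Hypothetical client-side view of a service that speaks a
    standard sync protocol."""

    @abstractmethod
    def put(self, path: str, content: bytes) -> None: ...

    @abstractmethod
    def get(self, path: str) -> bytes: ...


class InMemoryService(SyncEndpoint):
    """Stand-in for a real provider; keeps files in a dictionary."""

    def __init__(self, name: str):
        self.name = name
        self.files = {}

    def put(self, path, content):
        self.files[path] = content

    def get(self, path):
        return self.files[path]


def sync_to_all(path, content, services):
    """One client and one code path for any number of providers."""
    for service in services:
        service.put(path, content)


def copy_between(path, source, target):
    """Cross-service sync: read from one provider, write to another."""
    target.put(path, source.get(path))


service_a = InMemoryService("service-a")
service_b = InMemoryService("service-b")
sync_to_all("/doc.txt", b"v1", [service_a, service_b])

service_a.put("/doc.txt", b"v2")               # edited by a user of service A
copy_between("/doc.txt", service_a, service_b)  # propagated to service B
```

A single client built this way replaces several similar clients, and
an update made by a user of one service can reach users of another.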
Using a standard sync protocol also makes it easy to improve Internet
storage services. Compared with the existing proprietary formats, a
standard sync protocol is fully open and designed by many
contributors. People are welcome to revise and improve the standard
protocol. We believe that both users and providers will benefit
greatly from such a standard sync protocol.
6. Understanding of Sync Protocol
Client                Control Server           Data Storage Server
  |                          |                          |
  |---meta data, auth info-->|                          |
  |<-------start sync--------|                          |
  |     sync preparation     |                          |
  |                          |                          |
  |--------------------store/retrieve------------------>|
  |<--------------------ok/content----------------------|
  |                         ...                         |
  |--------------------store/retrieve------------------>|
  |<--------------------ok/content----------------------|
  |                  data transmission                  |
  |                          |                          |
  |---meta data, ver info--->|                          |
  |<-----conclude sync-------|                          |
  |       sync finish        |                          |
  |                          |                          |
Figure 2
Figure 2 shows a preliminary, high-level understanding of the sync
protocol. The whole sync process can be divided into three stages:
sync preparation, data transmission and sync finish. In the first
stage, the client exchanges its metadata and authentication
information with the control server to initiate a sync process.
During this stage, capabilities such as network-aware chunking and
deduplication should be performed. In the second stage, data
transmission, the client sends chunks to or retrieves chunks from the
data storage servers. To speed up the data sync and make it more
reliable, capabilities such as bundling and delta-encoding can be
employed. In the final stage, sync finish, the client sends its
metadata again so that the control server can check and conclude the
sync process. Version information is also exchanged for version
control. This shows that the control flow and the data flow are
closely related and cannot work without each other.
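The three stages can be traced in a short simulation of an upload.
This is a sketch of one possible reading of Figure 2, not a protocol
definition: the method names start_sync, store and conclude_sync, the
fixed token check and the 4-byte chunk size are all assumptions made
for the example.

```python
import hashlib


class ControlServer:
    def __init__(self):
        self.known_chunks = set()  # hashes of chunks already stored
        self.files = {}            # path -> (version, [chunk hashes])

    def start_sync(self, token, path, chunk_hashes):
        """Stage 1: authenticate, then report which chunks are missing."""
        if token != "valid-token":  # placeholder authentication
            raise PermissionError("authentication failed")
        unique = dict.fromkeys(chunk_hashes)  # keeps order, drops repeats
        return [h for h in unique if h not in self.known_chunks]

    def conclude_sync(self, path, chunk_hashes):
        """Stage 3: record the new metadata and return the new version."""
        version = self.files.get(path, (0, []))[0] + 1
        self.files[path] = (version, list(chunk_hashes))
        self.known_chunks.update(chunk_hashes)
        return version


class DataStorageServer:
    def __init__(self):
        self.chunks = {}

    def store(self, chunk_hash, chunk):
        self.chunks[chunk_hash] = chunk
        return "ok"


def sync_file(path, data, control, storage, token="valid-token", chunk_size=4):
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    hashes = [hashlib.sha256(c).hexdigest() for c in chunks]
    by_hash = dict(zip(hashes, chunks))

    missing = control.start_sync(token, path, hashes)   # sync preparation
    for h in missing:                                   # data transmission
        storage.store(h, by_hash[h])
    version = control.conclude_sync(path, hashes)       # sync finish
    return version, len(missing)


control = ControlServer()
storage = DataStorageServer()
first = sync_file("/a.txt", b"aaaabbbbcccc", control, storage)
second = sync_file("/a.txt", b"aaaabbbbdddd", control, storage)
```

The second sync transmits a single chunk because deduplication is
decided during sync preparation, which illustrates how the data flow
depends on the outcome of the control flow.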
7. Related Work in IETF
WebDAV ([RFC4918]) provides an alternative way to exchange local data
with remote web servers. It can be regarded as a previous IETF
effort on file collections, authoring and versioning over HTTP.
WebDAV mainly focuses on authoring and versioning for distributed web
content. It extends HTTP to enable users to collaboratively edit and
manage files on remote servers.
WebDAV focuses on distributed work (authoring and versioning), while
ISS will focus on data sync. A potentially major difference between
data sync and distributed authoring/versioning is the frequency of
data transmission. In data sync, the client automatically exchanges
data with remote servers whenever there are changes. In practice,
every time a user performs a 'save' operation on a file, the client
initiates a data sync process. Such frequent data transmission
causes a large amount of network traffic, which introduces challenges
for the design of sync protocols. A possible solution is to make use
of the well-known service capabilities and to make the protocol
network-aware to some extent. The ISS protocol suite could build on
WebDAV or on basic HTTP.
8. Security Considerations (TBD)
TBD
9. Acknowledgements
The authors would like to thank Barry Leiba, Mark Nottingham, Julian
Reschke, Marc Blanchet, Mike Bishop, Haibin Song, Philip Hallam
Baker, Michiel de Jong and Ted Lemon for their valuable comments and
contributions to this work.
10. Informative References
[Batched] Li, Z., Wilson, C., Jiang, Z., Liu, Y., Zhao, B., Jin, C.,
Zhang, Z., and Y. Dai, "Efficient Batched Synchronization
in Dropbox-Like Cloud Storage Services", Middleware, 2013.
[Benchmarking]
Drago, I., Bocchi, E., Mellia, M., Slatman, H., and A.
Pras, "Benchmarking Personal Cloud Storage", IMC, 2013.
[ExpanDrive]
"ExpanDrive", <http://www.expandrive.com/>.
[GoogleDocs]
"Google Docs",
<http://www.google.com/intl/en/docs/about/>.
[IFTTT] "IFTTT", <https://ifttt.com/>.
[Inside_Dropbox]
Drago, I., Mellia, M., Munafo, M., Sperotto, A., Sadre,
R., and A. Pras, "Inside Dropbox: Understanding Personal
Cloud Storage Services", IMC, 2012.
[Look_at_Mobile_Cloud]
Cui, Y., Lai, Z., and N. Dai, "A First Look at Mobile
Cloud Storage Services: Architecture, Experimentation and
Challenge", IEEE Network, 2015.
[QuickSync]
Cui, Y., Lai, Z., Wang, X., Dai, N., and C. Miao,
"QuickSync: Improving Synchronization Efficiency for
Mobile Cloud Storage Services", MOBICOM, 2015.
[RFC4918] Dusseault, L., Ed., "HTTP Extensions for Web Distributed
Authoring and Versioning (WebDAV)", RFC 4918,
DOI 10.17487/RFC4918, June 2007,
<http://www.rfc-editor.org/info/rfc4918>.
[rsync] "rsync", <https://rsync.samba.org/>.
[Towards] Li, Z., Jin, C., Xu, T., Wilson, C., Liu, Y., Cheng, L.,
Liu, Y., Dai, Y., and Z. Zhang, "Towards Network-level
Efficiency for Cloud Storage Services", IMC, 2014.
[users] "400 million strong",
<https://blogs.dropbox.com/dropbox/2015/06/400-million-users/>.
Authors' Addresses
Yong Cui
Tsinghua University
Beijing 100084
P.R.China
Phone: +86-10-6260-3059
Email: yong@csnet1.cs.tsinghua.edu.cn
Zeqi Lai
Tsinghua University
Beijing 100084
P.R.China
Phone: +86-10-6278-5822
Email: uestclzq@gmail.com
Linhui Sun
Tsinghua University
Beijing 100084
P.R.China
Phone: +86-10-6278-5822
Email: lh.sunlinh@gmail.com