TAPS Working Group | P. Tiesel |
Internet-Draft | T. Enghardt |
Intended status: Experimental | Berlin Institute of Technology |
Expires: December 17, 2017 | June 15, 2017 |
Socket Intents
draft-tiesel-taps-socketintents-00
This document outlines an API-independent concept that allows applications to share their knowledge about upcoming communication and express their performance preferences in a portable and abstract way: Socket Intents. Socket Intents express what an application knows, assumes, expects or wants to prioritize regarding its own network communication. The information provided by Socket Intents should be taken into account by the network stack in a best-effort way.
Socket Intents can help applications cope with the complexity of multiple provisioning domains, new transport protocols, and new transport features, and can make these available to a larger user base, by expressing the application's intents in an abstract and portable way.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 17, 2017.
Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The words “MUST”, “MUST NOT”, “SHALL”, “SHALL NOT”, “SHOULD”, and “MAY” are used in this document. It’s not shouting; when these words are capitalized, they have a special meaning as defined in [RFC2119].
The terms Flow, Association, Stream, and Object are used as defined in [I-D.tiesel-taps-communitgrany].
Despite recent advances in the transport area, the adoption of new transport protocols and transport protocol features is slow and only happens in limited domains (primarily in Web browsers and within datacenters). The same problem occurs when taking advantage of multiple available access networks or provisioning domains (PvDs). In both cases, the benefits of the new transport diversity come at the cost of increased complexity that has to be mastered by the application programmer.
Enabling features like TCP Fast Open [RFC7413] or controlling how MPTCP [RFC6824] creates subflows requires specialized APIs that are not part of the standard socket API, often require deep knowledge of the transport protocol internals, and are not portable across different implementations.
Applications that want to use multiple network interfaces usually have to use their own heuristics to select which access network to use. Choosing the right interface is difficult because interface characteristics differ, e.g., in terms of performance, and obtaining the necessary information is often not easy since it may require special privileges and differs heavily by implementation.
In all cases mentioned above, an application that wants to take advantage of the available transport diversity is faced with substantially higher complexity regarding network APIs and networking code.
Application programmers opening a communication channel typically know how this channel will be used. Besides the hard requirements already necessary for establishing the communication channel, e.g., reliable in-order stream transport, there is more information available: an application developer has an intuition about optimization preferences, e.g., optimize for bandwidth, latency, or cost, about expectations, e.g., towards data loss, and possibly also about specifics, such as how many bytes will be sent or received.
This information does not directly map to the choice of a transport protocol, to certain protocol parameters, nor to which PvD to use, but the information can imply that the application can benefit from certain transport options or help to choose between multiple PvDs as described in [RFC7556], Section 6.2, and therefore enable the OS to adjust its defaults for this communication channel accordingly.
The preferences, expectations and other information known about the upcoming communication MAY be expressible in an intuitive, abstract way independent of the network and transport protocols. Their representation SHOULD be independent of the actual API used for network communication, i.e., they SHOULD be expressible in whatever API is available, e.g., as “socketopts” for BSD sockets or as part of the address resolution configuration for Post Sockets. Finally, given the expectations and external constraints known, the OS SHOULD use the information provided via Socket Intents in a best-effort fashion, trying to choose the best transport protocol, default parameters and PvDs available, and MAY try to further optimize based on them.
With Socket Intents, applications MAY express their communication preferences in order to take advantage of the available transport diversity. Depending on the API used, Socket Intents can be used on a per Flow, Association, Stream, or Object level. Communication preference refers to desired transport characteristics, e.g., low delay or high throughput, stable transport or minimal cost, and is optional information.
The following sections contain a list of Socket Intent types and their possible values.
Socket Intents are structured as key/value pairs. The key is expressed by a short name; the value has a fixed data type (Enum, Int or Float).
The namespace for the short names is partitioned as follows:
- Experimental Socket Intent types MUST start with “x-”.
- Private or vendor-specific Socket Intent types MUST start with “y-[vendor]-”.
- The remaining Socket Intent type namespace SHOULD be managed by an IANA registry. The assignment of new types requires an RFC or expert review.
For Enum data types, a list of valid values MUST be provided by the document specifying that Intent.
An implementation faced with unknown intent types or invalid or unknown values MAY ignore that Intent but SHOULD return an error code to the application.
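The namespace and error-handling rules above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the status codes, the `classify` and `set_intent` helpers, and the short names in `KNOWN_INTENTS` are hypothetical and not defined by this document. The sketch takes the option of ignoring an unknown or invalid Intent while still returning an error code to the caller.

```python
from enum import Enum

# Hypothetical status codes; the text only says an error code SHOULD be returned.
OK, ERR_UNKNOWN_TYPE, ERR_INVALID_VALUE = 0, 1, 2


class Namespace(Enum):
    EXPERIMENTAL = "experimental"  # short names starting with "x-"
    VENDOR = "vendor"              # short names starting with "y-[vendor]-"
    REGISTERED = "registered"      # remaining namespace (IANA-managed)


def classify(name: str) -> Namespace:
    """Classify a Socket Intent short name by its prefix."""
    if name.startswith("x-"):
        return Namespace.EXPERIMENTAL
    if name.startswith("y-"):
        # Expect "y-<vendor>-<rest>" with a non-empty vendor and rest.
        parts = name.split("-", 2)
        if len(parts) == 3 and parts[1] and parts[2]:
            return Namespace.VENDOR
        raise ValueError(f"malformed vendor intent name: {name!r}")
    return Namespace.REGISTERED


# Hypothetical registry of known types. Enum types map to their set of
# valid values, Int and Float types map to the matching Python type.
KNOWN_INTENTS = {
    "x-traffic-category": frozenset({"query", "control", "stream", "bulk", "mixed"}),
    "x-size-to-be-sent": int,
    "x-bitrate-sent": float,
}


def set_intent(store: dict, name: str, value) -> int:
    """Store a valid Intent; ignore an unknown or invalid one and report it."""
    spec = KNOWN_INTENTS.get(name)
    if spec is None:
        return ERR_UNKNOWN_TYPE
    if isinstance(spec, frozenset):
        valid = isinstance(value, str) and value in spec
    else:
        valid = type(value) is spec
    if not valid:
        return ERR_INVALID_VALUE
    store[name] = value
    return OK
```

A caller would check the return value of `set_intent` but could continue regardless, since a rejected Intent leaves the stored set unchanged.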
Socket Intents are not QoS labels, but have an orthogonal meaning.
The Traffic Category describes the dominating traffic pattern of the respective communication unit expected by the application.
Level | Description |
---|---|
query | Single request / response style workload, latency bound |
control | Long lasting low bandwidth control channel, not bandwidth bound |
stream | Stream of bytes/objects with steady data rate |
bulk | Bulk transfer of large objects, presumably bandwidth bound |
mixed* | Don’t know or none of the above |
This Intent is used to communicate the expected size of a transfer.
This Intent is used to communicate the expected lifetime of the respective communication unit.
This Intent is used to communicate the bitrate of the respective communication unit.
This Intent describes the anticipated sender-side burst characteristics of the traffic for this communication unit. It expresses how the traffic sent by the application is expected to vary over time, and, consequently, how long sequences of consecutively sent packets will be. Note that the actual burst characteristics of the traffic at the receiver side will depend on the network.
This Intent can provide hints to the OS on what the resource usage pattern for this communication unit will look like, which can be useful for balancing the requirements of different applications.
Level | Description |
---|---|
no_bursts | Application sends traffic at a constant rate |
regular_bursts | Application sends bursts of traffic periodically |
random_bursts | Application sends bursts of traffic irregularly |
bulk | Application sends a bulk of traffic |
mixed* | Don’t know or none of the above |
This Intent describes the desired delay characteristics for this communication unit. It provides hints for the OS whether to optimize for low delay or for other criteria. There are no hard requirements or implied guarantees on whether these requirements can actually be satisfied.
Level | Description |
---|---|
stream | Delay and packet delay variation should be kept as low as possible |
interactive | Delay should be kept as low as possible, but some variation is tolerable |
transfer* | Delay and packet delay variation should be reasonable, but are not critical |
background | Delay and packet delay variation are of no concern |
This Intent describes how an application deals with disruption of its communication, e.g. connection loss. It communicates how well the application can recover from such disturbance and can have implications on how many resources the OS should allocate to failover techniques for this particular communication unit.
Level | Description |
---|---|
sensitive* | Disruptions result in application failure, disrupting user experience |
recoverable | Disruptions are inconvenient for the application, but can be recovered from |
resilient | Disruptions have minimal impact for the application |
This Intent describes the application's preferences regarding the costs caused by the respective communication unit. It should guide the OS on how to handle cost vs. performance and reliability tradeoffs.
Level | Description |
---|---|
no_expense | Avoid expensive transports and consider failing otherwise |
optimize_cost | Prefer inexpensive transports and accept service degradation |
balance_cost* | Use system policy to balance cost and other criteria |
ignore_cost | Ignore cost, choose transport solely based on other criteria |
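The enumerated levels in the tables above could be modeled as follows. This is only a sketch: the class names follow the Intent names used in the examples of this document, and the `DEFAULTS` mapping assumes that the level marked with an asterisk in each table is the default, which the text here does not state explicitly. The `parse_level` helper is hypothetical.

```python
from enum import Enum


class TrafficCategory(Enum):
    QUERY = "query"
    CONTROL = "control"
    STREAM = "stream"
    BULK = "bulk"
    MIXED = "mixed"


class Burstiness(Enum):
    NO_BURSTS = "no_bursts"
    REGULAR_BURSTS = "regular_bursts"
    RANDOM_BURSTS = "random_bursts"
    BULK = "bulk"
    MIXED = "mixed"


class Timeliness(Enum):
    STREAM = "stream"
    INTERACTIVE = "interactive"
    TRANSFER = "transfer"
    BACKGROUND = "background"


class ApplicationResilience(Enum):
    SENSITIVE = "sensitive"
    RECOVERABLE = "recoverable"
    RESILIENT = "resilient"


class CostPreferences(Enum):
    NO_EXPENSE = "no_expense"
    OPTIMIZE_COST = "optimize_cost"
    BALANCE_COST = "balance_cost"
    IGNORE_COST = "ignore_cost"


# Assumption: the asterisk in each table marks the default level.
DEFAULTS = {
    TrafficCategory: TrafficCategory.MIXED,
    Burstiness: Burstiness.MIXED,
    Timeliness: Timeliness.TRANSFER,
    ApplicationResilience: ApplicationResilience.SENSITIVE,
    CostPreferences: CostPreferences.BALANCE_COST,
}


def parse_level(intent_type, text=None):
    """Return the level named by text, or the assumed default if text is None.

    Raises ValueError for a level that the Intent type does not define.
    """
    if text is None:
        return DEFAULTS[intent_type]
    return intent_type(text)
```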
Consider a cellphone performing an OS upgrade. This process usually implies downloading a large file. This is a bulk transfer for which the application may already know the file size. Timing is typically noncritical and the data can be downloaded as background traffic with minimal cost and power overhead. It would not hurt if the TCP connection was closed during the transfer as the download can be continued.
For this case, the application should set the “Traffic Category” to “bulk”, “Timeliness” to “background”, and “Application Resilience” to “resilient”. In addition, “Object Size to be Received” can be provided. Finally, the application may set the “Cost Preferences” to “no_expense”.
The OS can use this information and therefore may schedule this transfer on a flaky but not traffic-billed WiFi link and may reject the connection attempt if no cheap access link is available.
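One way an OS policy might act on these Intents is sketched here. The `Link` record, its `metered` and `stable` attributes, and the ranking rules in `select_link` are assumptions made for illustration and are not specified by this document. The fallback values used when an Intent is absent are likewise assumed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Link:
    """Hypothetical description of an access link known to the OS."""
    name: str
    metered: bool  # traffic on this link is billed
    stable: bool   # link is not expected to drop connections


def select_link(links: Sequence[Link], intents: dict) -> Optional[Link]:
    """Pick a link for a new communication unit, or None to reject it."""
    cost = intents.get("Cost Preferences", "balance_cost")
    resilient = intents.get("Application Resilience", "sensitive") == "resilient"

    # "no_expense": avoid expensive transports and consider failing otherwise.
    candidates = [l for l in links if not (cost == "no_expense" and l.metered)]
    if not candidates:
        return None

    def rank(link: Link):
        # Lower tuples win. A flaky link is only penalized if the
        # application cannot easily recover from disruptions.
        instability = 0 if (link.stable or resilient) else 1
        expense = 1 if link.metered else 0
        if cost in ("no_expense", "optimize_cost"):
            return (expense, instability)
        if cost == "ignore_cost":
            return (instability, 0)
        return (instability, expense)  # balance_cost

    return min(candidates, key=rank)


# Intents of the OS upgrade example.
update_intents = {
    "Traffic Category": "bulk",
    "Timeliness": "background",
    "Application Resilience": "resilient",
    "Cost Preferences": "no_expense",
}
wifi = Link("wifi", metered=False, stable=False)
lte = Link("lte", metered=True, stable=True)
```

With these values, `select_link([wifi, lte], update_intents)` chooses the flaky but unbilled WiFi link, and `select_link([lte], update_intents)` returns `None`, corresponding to rejecting the connection attempt.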
Consider a user watching non-live video content using MPEG-DASH [DASH]. This usually means fetching a stream of video chunks. The application should know the size of each chunk and may know the bitrate and the duration of each chunk and the whole video. Disconnection of the TCP connection should be avoided because that might have an effect that is visible to the user.
For this case, the application should set the “Traffic Category” to “stream”, the “Timeliness” to “stream”, and “Application Resilience” to “sensitive”. It may also provide the “Stream Bitrate Received” and “Duration” expected. Finally, the application may set the “Cost Preferences” to “balance_cost”.
The OS can use this information and, e.g., use MPTCP [RFC6824] if available to schedule the traffic on the cheaper link (e.g., WiFi) while establishing an additional subflow over an expensive link (e.g., LTE). If the desired bandwidth cannot be matched by the cheaper link, the more expensive link can be added to satisfy the desired bandwidth.
If the application instead set the “Cost Preferences” to “optimize_cost”, the OS would not schedule traffic on the second subflow and the application would reduce the video quality to adapt to the available data rate.
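The scheduling decision described in this example can be expressed as a small calculation. The function `plan_subflows` and its parameters are hypothetical, the link capacities are assumed to be known to the OS, and treating “no_expense” like “optimize_cost” is a simplification made for this sketch. It is not an MPTCP scheduler, only an illustration of how the “Cost Preferences” Intent changes the outcome.

```python
def plan_subflows(required_bps: int, cheap_bps: int, expensive_bps: int,
                  cost_preference: str = "balance_cost") -> dict:
    """Split a desired bitrate across a cheap and an expensive link.

    Returns the rate placed on each link and the remaining shortfall,
    all in bits per second.
    """
    if min(required_bps, cheap_bps, expensive_bps) < 0:
        raise ValueError("rates must be non-negative")

    # Always fill the cheaper link first.
    on_cheap = min(required_bps, cheap_bps)

    if cost_preference in ("optimize_cost", "no_expense"):
        # Keep traffic off the expensive subflow entirely.
        on_expensive = 0
    else:
        # Add the expensive link only for what the cheap link cannot carry.
        on_expensive = min(required_bps - on_cheap, expensive_bps)

    return {
        "cheap": on_cheap,
        "expensive": on_expensive,
        "shortfall": required_bps - on_cheap - on_expensive,
    }
```

A non-zero shortfall corresponds to the case where the application has to reduce the video quality to match the available data rate.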
Consider a user managing a remote machine via SSH. This usually involves at least one long-lived console session and possibly file transfers using SCP or rsync multiplexed on the same association (e.g. TCP connection).
For the console session, the application can set the “Traffic Category” to “control”, the “Burstiness” to “random_bursts”, the “Timeliness” to “interactive”, and the “Application Resilience” to “sensitive”.
For the file transfers, SSH may set both the “Traffic Category” and the “Burstiness” to “bulk”. It may also know the size of the transfer and can therefore set “Object Size to be Sent” or “Object Size to be Received”.
Assuming there are transport options supporting multiple streams in a single association (e.g., SCTP [RFC4960]), the OS can use this information to schedule the streams over different links to meet their requirements (latency vs. bandwidth). In case the OS has to use TCP, it can still optimize by disabling the Nagle algorithm for transmissions related to the console session.
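For the TCP fallback mentioned above, the Nagle algorithm is commonly disabled through the standard `TCP_NODELAY` socket option. The following sketch shows how a network stack or library might map the “Timeliness” Intent to that option. The helper `apply_timeliness` and the choice to act only on the “interactive” level are assumptions made for illustration.

```python
import socket


def apply_timeliness(sock: socket.socket, timeliness: str) -> bool:
    """Disable the Nagle algorithm for interactive traffic.

    Returns True if TCP_NODELAY was set on the socket, False otherwise.
    """
    if timeliness == "interactive":
        # Send small writes (e.g., keystrokes) immediately instead of
        # waiting to coalesce them into larger segments.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        return True
    return False
```

For bulk file transfers, leaving the Nagle algorithm enabled is usually harmless because the sender writes large blocks anyway.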
TBD
We assume that applications specify their preferences in a selfish, but not malicious way and that it is up to the OS to find a compromise between demands.
A malicious application could confuse the OS in a way that leads to scheduling traffic with certain Intents on a more expensive interface, penalizing this traffic, or even rejecting it. The attack vector added by this is negligible: as the malicious application could also generate the traffic it claims to intend, it already has a much more powerful attack vector.
As a mitigation, the OS could monitor and compare the intents specified with the traffic actually generated and notify the user if the usage of Socket Intents is unusual or defective.
Varying the transport or IP layer parameters of packets belonging to different Streams or Objects multiplexed in the same encrypted association might enable an attacker to gain some ground truth about the shares of different kinds of traffic. As this might also be implied by packet timings, application developers need to weigh the small additional information disclosure against the possible performance gains. Using Socket Intents at the Association level can be considered safe.
The Socket Intents type namespace SHOULD be managed by an IANA registry. Details conforming to [RFC5226] are laid out in Section 4.1; the initial types for the registry are described in Section 5.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997. |
[RFC5226] | Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, DOI 10.17487/RFC5226, May 2008. |
[CoNEXT2013] | Schmidt, P., Enghardt, T., Khalili, R. and A. Feldmann, "Socket intents", Proceedings of the ninth ACM conference on Emerging networking experiments and technologies - CoNEXT '13, DOI 10.1145/2535372.2535405, 2013. |
[DASH] | International Organization for Standardization, "Dynamic adaptive streaming over HTTP (DASH) - Part 1: Media presentation description and segment formats", Standard ISO/IEC 23009-1:2014, June 2011. |
[I-D.pauly-taps-guidelines] | Pauly, T., "Software Guidelines for Protocol Evolution", Internet-Draft draft-pauly-taps-guidelines-00, February 2017. |
[I-D.tiesel-taps-communitgrany] | Tiesel, P. and T. Enghardt, "Communication Units Granularity Considerations for using Transport Diversity or Multiple Provisioning Domains", Internet-Draft draft-tiesel-taps-communitgrany-00, July 2017. |
[I-D.trammell-taps-post-sockets] | Trammell, B., Perkins, C., Pauly, T. and M. Kuehlewind, "Post Sockets, An Abstract Programming Interface for the Transport Layer", Internet-Draft draft-trammell-taps-post-sockets-00, March 2017. |
[RFC4594] | Babiarz, J., Chan, K. and F. Baker, "Configuration Guidelines for DiffServ Service Classes", RFC 4594, DOI 10.17487/RFC4594, August 2006. |
[RFC4960] | Stewart, R., "Stream Control Transmission Protocol", RFC 4960, DOI 10.17487/RFC4960, September 2007. |
[RFC6824] | Ford, A., Raiciu, C., Handley, M. and O. Bonaventure, "TCP Extensions for Multipath Operation with Multiple Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013. |
[RFC7413] | Cheng, Y., Chu, J., Radhakrishnan, S. and A. Jain, "TCP Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014. |
[RFC7556] | Anipko, D., "Multiple Provisioning Domain Architecture", RFC 7556, DOI 10.17487/RFC7556, June 2015. |