Network Working Group J. Rosenberg
Internet-Draft Five9
Intended status: Standards Track February 7, 2020
Expires: August 10, 2020
RealTime Internet Peering for Single User Endpoints
draft-rosenberg-dispatch-ript-inbound-00
Abstract
The Real-Time Internet Peering for Telephony (RIPT) protocol defines
a technique for establishing, terminating and otherwise managing
calls between entities in differing administrative domains. While it
can be used for single user devices like an IP phone, it requires the
IP phone to have TLS certificates and be publicly reachable with a
DNS record. This specification remedies this by extending RIPT to
enable clients to receive inbound calls. It also provides basic
single-user features such as forking, call push and pull, third-party
call controls, and call appearances. It describes techniques for
resiliency of calls, especially for mobile clients with spotty
network connectivity.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 10, 2020.
Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
Rosenberg Expires August 10, 2020 [Page 1]
Internet-Draft RIPT Inbound February 2020
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Differences with SIP Outbound
3. Overview of Operation
4. Example Use Cases
4.1. Inbound Call Forking
4.2. Answer and Stop Ringing Other Devices
4.3. Remote in Use
4.4. Call Pull
4.5. Call Push
4.6. Select Device
4.7. Third Party Call Control - Place Outbound
4.8. Third Party Call Control Answer or Decline Inbound
4.9. Third Party Call Control Hangup
4.10. Third Party Call Control Move Call
4.11. Resiliency Miss Incoming call
4.12. Resiliency MidCall Network Change
4.13. Resiliency MidCall Wireless Fade and Recover
4.14. Resiliency MidCall Wireless Fade and Move
4.15. Resiliency MidCall Wireless Fade and Peer Hangup
4.16. Resiliency MidCall Wireless Fade and Server Drop
5. Normative Protocol Specification
6. Syntax
7. IANA Considerations
8. Security Considerations
9. Acknowledgements
10. References
10.1. Normative References
10.2. Informative References
Author's Address
1. Introduction
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
The Real-Time Internet Peering for Telephony (RIPT) protocol defines a
technique for establishing, terminating and otherwise managing calls
between entities. It is an application on top of HTTP/3, and as such has the
notion of a client that opens connections and makes requests to a
server. In the core RIPT specification, clients can only place
outbound calls. Inbound calls are supported by requiring an entity
to also run a server.
While this requirement is appropriate for use cases like SIP
trunking, carrier to carrier peering, or other arrangements involving
a large number of calls, it is a poor match for single user devices.
A single user device is one in which an actual end user would log in
and use that device for making and receiving calls. Examples include
desktop softphones, browser-based WebRTC applications, IP hardphones,
and video conferencing endpoints. These devices are often behind a
NAT, don't have DNS names, and don't have TLS certificates, all of
which are prerequisites to run a server.
Furthermore, an end user may often be logged into multiple such
devices, possibly from multiple locations. This introduces
additional requirements. Inbound calls need to be forked to all
devices, and ring on all of them. A user must be able to answer on
one, and stop ringing on the others. SIP [RFC3261] natively
supported these capabilities. However, it lacked other ones which
are clearly needed - native support for mobile-based apps which
utilize push notifications is one significant example.
SIP's lack of call state in servers as a built-in feature of the
protocol has also meant it couldn't readily support other features
truly needed for a system where a user can be logged into multiple
devices. These include the ability for one device to see the state
of the call, and know on which other device the call is being
handled. Another important feature is the ability, from any device,
to end the call, move it to a different device, or move it to the
device the user is currently at. It also includes basic third party
call controls - the ability to initiate or answer a call from one
client, but have the media delivered to another.
To remedy these challenges, this specification provides an extension
to RIPT to facilitate single-user devices.
2. Differences with SIP Outbound
This specification covers a similar problem space as SIP Outbound
[RFC5626]; however, it works quite differently.
Firstly, delivery of an inbound call to an IP phone in a timely
fashion clearly requires the IP phone to be able to have some kind of
persistent connection over which it can receive incoming call
indications. In SIP Outbound, the specification itself provided this
capability. This specification, however, does not. Rather, it
assumes that it merely exists, and is provided through some
non-standardized means, which we refer to as a "push channel".
For mobile devices, the push channel is provided by the mobile OS.
For browser applications, it might be provided by a websocket
connection that the application is using to receive a variety of
events, including those having nothing to do with calling.
The push channel is also used to provide an indication of feature
invocations to the client when those features are invoked elsewhere
(i.e., third party call control). The specific feature names and
other UI elements are out of scope for this specification as well.
Rather, this specification only shows how, once a client knows it
needs to perform a call manipulation, it can use RIPT to do it.
The second significant difference compared to SIP Outbound is that
RIPT does not use the push channel to push actual protocol messages;
rather it uses it as a "shoulder tap" to let the client know about a
new event, and provide it a URI with which it can get more
information or take action.
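As a rough illustration of the "shoulder tap" model, the sketch below
shows a client pulling nothing but a URI out of a push payload. The
payload layout (a JSON object with "event" and "uri" members), the
function name and the example URI are assumptions made for this
example only; push channel contents are not defined by this
specification.

```python
import json


def parse_shoulder_tap(raw: str) -> str:
    """Return the URI carried by a push payload (hypothetical layout)."""
    payload = json.loads(raw)
    uri = payload.get("uri") if isinstance(payload, dict) else None
    if not isinstance(uri, str) or not uri.startswith("https://"):
        raise ValueError("push payload does not carry a usable URI")
    return uri


# Hypothetical payload: the push carries no call details, only a pointer.
example_push = '{"event": "new-call", "uri": "https://provider.example/calls/123"}'
call_uri = parse_shoulder_tap(example_push)
```

The client would then use RIPT against the returned URI to learn about
the event, rather than trusting any call details in the push itself.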
3. Overview of Operation
To signal usage of this specification, the server includes a new
element, "inbound", in its TG description. The format of this
element is identical to what it would look like to receive calls on a
TG that would have been hosted by the single-user device, had it been
able to do so. For example, the following TG describes a single user
TG which can handle both outbound and inbound calls:
{
   "outbound": {
      "origins" : "(encoded passport)",
      "destinations" : "*"
   },
   "inbound": {
      "destinations" : "+14085551002"
   }
}
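A client might detect support for this extension by looking for the
"inbound" element. The following is a minimal sketch, assuming the TG
description is delivered as JSON shaped like the example above, with
the passport shown as a placeholder string. The function name is
hypothetical.

```python
import json

# Same shape as the example TG above; the passport is a placeholder string.
TG_TEXT = """
{
  "outbound": {"origins": "(encoded passport)", "destinations": "*"},
  "inbound": {"destinations": "+14085551002"}
}
"""


def inbound_destinations(tg: dict) -> list:
    """Return the numbers this TG accepts inbound calls for, or [] if none."""
    inbound = tg.get("inbound")
    if not isinstance(inbound, dict):
        return []  # server did not signal support for this extension
    dest = inbound.get("destinations", [])
    return [dest] if isinstance(dest, str) else list(dest)


tg = json.loads(TG_TEXT)
destinations = inbound_destinations(tg)
```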
The client will follow RIPT procedures for handler registration.
This is analogous to the SIP REGISTER operation. For server to
server peering arrangements, the handler represents a particular
collection of capabilities on an SBC or IP PBX. When used by single-
user devices, it represents each individual device. Consequently, if
a user has four IP phones, there would be four handlers created on
the server. As specified in RIPT, each client needs to remember its
handler URI persistently in order to modify it or delete it later on.
If an incoming call arrives for the client, the server creates the
call, including the call URI, and the push channel is used to inform
the client of a new call, and provide it with the call URI. The
client performs a GET against this URI to obtain the information
about the call. As defined in the core RIPT specification, this will
provide the client with the calling and called party identifiers,
call direction (here, inbound), and the client directive. The client
can then alert the user, and in parallel establish the signaling and
media byways. The client can send the proceeding, alerting,
answered, or declined events to the server to adjust the state of the
call. Once answered, the call is active and processing proceeds
identically to the case where it had placed an outbound call.
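The client side of this flow can be sketched as follows. This is an
illustration under stated assumptions, not a definitive
implementation: FakeCall is an in-memory stand-in for the call
resource, its field names are hypothetical, and the establishment of
byways and the POST of the handler are omitted.

```python
class FakeCall:
    """In-memory stand-in for one call resource on the server."""

    def __init__(self, caller, called):
        self.info = {"caller": caller, "called": called,
                     "direction": "inbound", "state": "proceeding"}
        self.events = []

    def get(self):
        """Models a GET against the call URI."""
        return dict(self.info)

    def send_event(self, name):
        """Models the client sending an event that adjusts call state."""
        self.events.append(name)
        self.info["state"] = name


def handle_new_call(call, user_accepts):
    """Run the client side of an inbound call after a push arrives."""
    info = call.get()
    if info["direction"] != "inbound":
        return info["state"]
    call.send_event("proceeding")
    call.send_event("alerting")  # the user is being alerted
    call.send_event("answered" if user_accepts else "declined")
    return call.get()["state"]
```

A real client would of course wait for the user between "alerting"
and the final event; the sketch collapses that wait into a boolean.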
Multi-device handling follows from the fact that the server will
broadcast all call events to all open GET requests to /events on the
call. As such, if there are multiple IP phones, each of which
receives a push notification of the new call, all of them will
perform a GET on the call URI, establish signaling and media byways,
and then alert the user. Once the user answers on one device, the
call state changes to answered and this event is sent to the other
devices, which can cease ringing. Furthermore, the other devices can
follow the state of the call by maintaining a GET to /events, even
though they are not sending or receiving media.
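The broadcast behavior can be modeled with a small sketch. The class
below is hypothetical and keeps everything in memory; each subscriber
stands for a device holding an open GET to /events on the call.

```python
class CallEvents:
    """Toy model of a server broadcasting call events to every listener."""

    def __init__(self):
        self.state = "proceeding"
        self.inboxes = {}  # device name -> events delivered so far

    def subscribe(self, device):
        """Models a device opening a GET to /events on the call."""
        self.inboxes[device] = []

    def publish(self, event):
        """Record the new state and deliver the event to all listeners."""
        self.state = event
        for inbox in self.inboxes.values():
            inbox.append(event)


def is_ringing(inbox):
    """A device rings only while the latest event it saw is 'alerting'."""
    return bool(inbox) and inbox[-1] == "alerting"


call = CallEvents()
call.subscribe("A")
call.subscribe("B")
call.publish("alerting")
ringing_before = [is_ringing(call.inboxes[d]) for d in ("A", "B")]
call.publish("answered")  # the user answered on one device
ringing_after = [is_ringing(call.inboxes[d]) for d in ("A", "B")]
```

Here both devices ring after "alerting" and both stop once "answered"
is broadcast, while device B still holds the full event history.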
Since other devices can track the state of the call, they can render
it while the call is ongoing - providing basic 'shared call
appearance' functionality.
The movement of calls between different devices is learned through a
new event defined here, the "handler changed" event, which is sent by
the server. Its payload is the URI of the new handler.
The core RIPT specification also provides a simple way for one device
to take a call from another - by using a client-side migration. The
device which wishes to take the call would POST to the call URI,
changing the handler to itself. It would get a new, modified
directive, and then connect its media byways to begin sending and
receiving media.
These basic primitives can be used in concert with application-
specific (and non-standardized) user interface and push channel
contents to accomplish many different functions.
4. Example Use Cases
This section outlines example use cases that are enabled by this
specification. It is not normative in nature. It merely describes
how the new API features defined by this specification can be used by
clients to deal with these cases.
4.1. Inbound Call Forking
Consider two devices - A and B. A single user, Alice, logs into both
devices. These devices query the provider, and through the
techniques described in RIPT, get the TG for the service provided to
Alice and register their respective handlers. Furthermore, assume
that device A only supports G.711 and Opus, while device B supports
Opus, G.711, and G.729.
When a new call arrives for Alice, the server would create a call
URI, and use the push channel to inform both devices that a new call
has arrived. The push notification would inform the IP phones of the
call URI. Both phones perform a GET against the call URI, which
returns the caller and called numbers, call direction, and current
state - which is proceeding. Since the clients see that this call
has not yet been answered, both of them render UI and begin alerting.
Both will also open signaling byways to the call URI and PUT
"proceeding" and then "alerting" events. The server will, in turn,
echo the "alerting" events back to all clients which are receiving
events on the byway, since the state of the call has changed to
"alerting".
This achieves the basic forking operation.
4.2. Answer and Stop Ringing Other Devices
Consider now that user Alice answers on device A. This will cause
device A to send an "answered" event to the server. In parallel, it
will perform a POST to the call URI and provide its handler URI in
the body. The response includes the directive for the call. This
allows the server to know that device A does not support G.729, and
thus it directs device A to send with G.711. Furthermore, the server
would send the "answered" event to all other clients which have an
open signaling byway, which in this case is phone B. It will receive
the "answered" event and thus cease ringing.
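One way a server could arrive at the directive is sketched below. The
preference order, the handler URIs and the shape of the returned
directive are all assumptions chosen so that the outcome matches this
example; the actual directive format is defined by the core RIPT
specification.

```python
# Assumed server preference order; not taken from any specification.
SERVER_PREFERENCE = ["G.729", "G.711", "Opus"]

# Codecs registered by each handler in this example (hypothetical URIs).
HANDLERS = {
    "https://provider.example/handlers/A": ["G.711", "Opus"],
    "https://provider.example/handlers/B": ["Opus", "G.711", "G.729"],
}


def directive_for(handler_uri):
    """Pick the first server-preferred codec that the handler supports."""
    supported = HANDLERS[handler_uri]
    for codec in SERVER_PREFERENCE:
        if codec in supported:
            return {"handler": handler_uri, "send_codec": codec}
    raise LookupError("no codec in common with handler")


directive_a = directive_for("https://provider.example/handlers/A")
directive_b = directive_for("https://provider.example/handlers/B")
```

With these assumptions device A is told to send G.711, and device B
would be told to send G.729 if it later took the call.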
Note that if IP phone B had received the original push notification
late, and queried the call URI only after the call had been
answered, it would see that the state is answered and thus not ring.
Because the server maintains state, it is resilient to intermittent
client connectivity.
4.3. Remote in Use
Consider further now what happens with device B. The call is being
handled on device A. However, device B maintains its signaling
byway. As a result, it will see that the call remains live. If that
call should end, the client would receive the "ended" event from the
server, and therefore be able to show that the call is no longer
active.
Additionally, if the service provider offers advanced telephony
features such as "hold" or "transfer", those state changes could be
delivered to device B via the push channel. Similarly, the client
could query - using web APIs beyond the scope of this specification -
to learn about states like "on-hold". (OPEN ISSUE: it does seem a
bit wonky that RIPT is used for the basic call state, but a separate
web API is needed if the state is something like "on-hold".)
4.4. Call Pull
Consider now that IP phone B wishes to take over the call. This is
called "call pull".
To do that, it performs a client migration. It POSTs its own handler
to the call URI. The server sees that this new handler supports
G.729, so it returns a directive to the client telling it to send
with G.729. Device A would receive a notification on the signaling
byway that the handler has changed to device B, and thus it knows
that a migration has happened and it should close its media byway.
(NOTE: need to consider race conditions).
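The pull can be sketched as follows. The classes and method names are
hypothetical, the directive is reduced to a stand-in dictionary, and
the race conditions noted above are deliberately ignored.

```python
class Call:
    """Toy call resource that tracks its current handler."""

    def __init__(self, handler_uri):
        self.handler_uri = handler_uri
        self.subscribers = []  # devices with an open signaling byway

    def post_handler(self, device):
        """Models a device POSTing its handler to the call URI."""
        self.handler_uri = device.handler_uri
        for sub in self.subscribers:
            sub.on_handler_changed(self.handler_uri)
        return {"handler": self.handler_uri}  # stand-in for the directive


class Device:
    def __init__(self, handler_uri):
        self.handler_uri = handler_uri
        self.media_open = False

    def pull(self, call):
        """Take the call: POST our handler, then connect media byways."""
        call.post_handler(self)
        self.media_open = True

    def on_handler_changed(self, new_handler_uri):
        if new_handler_uri != self.handler_uri:
            self.media_open = False  # another device took the call


a = Device("https://provider.example/handlers/A")
b = Device("https://provider.example/handlers/B")
call = Call(a.handler_uri)
call.subscribers = [a, b]
a.media_open = True  # device A is currently handling the call
b.pull(call)
```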
4.5. Call Push
In the push case, the user on device A wishes to move the call to
device B. The user is in front of device A, and not device B. To
perform the move, the user opens the UI on device A, obtains the list
of available devices from the server, and asks the server to move the
call to device B. The means by which this happens are not standardized
here, and assume the existence of a browser function in the client
which can render the UI for features such as this.
When the server wants to move the call to device B, it sends device B
a push on the push channel telling it to take the call, along with
the call URI. Device B then performs the client migration,
identically to the pull case above.
4.6. Select Device
As part of the call push operation, the user on device A will need to
obtain the list of devices to which it can push the call. This
specification assumes that this is provided through non-standardized
means, by virtue of the phone having a browser which allows it to see
the set of devices and select one.
4.7. Third Party Call Control - Place Outbound
In a similar way, this specification allows a device to be controlled
by third party call control. A user would visit a web page, enter in
a number to call, and click the "call" button. This capability does
not require standardization. The RIPT server would create an
outbound call object, and then perform a push notification to both
devices with the call URI. Both devices would query the call URI,
and see that there is a new call happening, in the outbound
direction, with the state of proceeding. The call state would also
indicate the caller (here, user Alice herself) and the called party -
the number dialed by Alice.
Both phones could alert Alice to the outbound call in progress. When
Alice selects the device on which to proceed, this would cause that
device to perform a POST to the call URI to set itself as the
handler, and then establish media and signaling byways. This would
also trigger the server to actually place the call towards the
destination.
This technique for third party call control is superior to the one
described in [RFC3725]. Firstly, the calling and called party
numbers are properly represented and will render correctly on the
devices. This is because we're not actually placing a call towards
Alice's phone - we're informing Alice of an outbound call placed from
another location. Secondly, the technique allows the phone to render
proper UI, showing that this is not an inbound call but an outbound
call to be taken. Call progress can also be properly rendered,
including locally generated ringback.
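Because the call object carries its direction and state, a device can
choose its UI from those two values alone. The sketch below is purely
illustrative; the dictionary keys and the returned labels are
hypothetical and are not defined by RIPT.

```python
def render_mode(call_info: dict) -> str:
    """Pick a UI treatment from call direction and state."""
    direction = call_info["direction"]
    state = call_info["state"]
    if state in ("proceeding", "alerting"):
        # Same states, different UI: ring for an inbound call, offer to
        # pick up the call for an outbound one placed from elsewhere.
        return "ring" if direction == "inbound" else "offer-outbound-pickup"
    if state == "answered":
        return "in-progress"
    return "idle"


outbound_ui = render_mode({"direction": "outbound", "state": "proceeding"})
inbound_ui = render_mode({"direction": "inbound", "state": "alerting"})
```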
In this use case, the outbound call was picked up by Alice by
'forking' the outbound call notification to all of her devices. The
service provider could, alternatively, allow Alice to choose a
specific device for placing the outbound call. In that case, the
server would send an indication to just that device, over the push
channel, telling it to connect to the call URI.
4.8. Third Party Call Control Answer or Decline Inbound
Another third party call control use case is that of an inbound call
which rings user devices, and a user would like to accept the call
from a webpage or other client, distinct from the device on which the
call is to actually be answered.
This capability is not possible with the mechanism defined in
[RFC3725].
This is possible with RIPT. The webpage would render the incoming
call notification to the user (again, no standardization is needed
for this, it is all just a browser application). The user would see
information on the incoming caller, select the device on which to
answer, and then hit an answer button. The server would then send a
push notification to the selected device, with an instruction to
answer the call. The IP phone would then perform the POST operation
to the call URI, including its handler in the body, and accept the
call with the "answered" event.
4.9. Third Party Call Control Hangup
To hang up the call, once again Alice is in front of her browser, and
is able to see the call in the browser UI, and see that the call is
being handled by device A. Alice clicks the 'hangup' button on her
browser. The server changes the state of the call by sending an
"ended" event to all devices which have a signaling byway open
(which, in this case, would be both devices A and B). Device A would
cease rendering media and disconnect its signaling and media byways
for the call. Device B, which had remote-in-use, would remove the
remote-in-use indication from the UI.
TODO: should add meta-data to the ended event, indicating who ended
the call, to drive better UI and also deal with call drops.
4.10. Third Party Call Control Move Call
In this use case, Alice is once again at her PC on her browser. She
is on a call which is rendering media on device A, and wishes to move
the call to device B. Using the browser UI, she instructs the server
to do so. The server would send a push notification to device B,
asking it to take the call. Device B would then POST its handler to
the call URI, open the media byways, and take the call, identically
to the pull use case above.
4.11. Resiliency Miss Incoming call
Consider now a user, Alice, who has a mobile app with a RIPT client in
it. Alice is driving her car and enters a tunnel. At the very moment
the server sends a push notification, Alice's device loses network
coverage and the push notification is lost.
When Alice exits the tunnel a few moments later, the application gets
notified that network connectivity has been restored (note: I don't
believe this is actually provided by mobile OSes today; it would
perhaps require a change to enable it). The application can then
perform a query to the server to get its current calls, using
techniques outside of the scope of this specification. Once it
learns the call URI, it can query the call state and then render the
call as alerting.
4.12. Resiliency MidCall Network Change
Consider a case where user Alice is on her mobile device, and on a
call. While she is on the call, she moves from her cellular network
into her home, and her device switches to WiFi.
When this happens, the VoIP application on the mobile device receives
a notification from the OS that there has been a network change.
Note that - since RIPT doesn't use IP addresses at all - there is no
need to 're-REGISTER', or in fact to 're-INVITE'. The client just
continues doing what it was doing - performing GETs on /media to
receive media packets, and PUTs on /media to send them. In fact, the
client need not even explicitly listen for network change events. It
just continues sending and receiving media as before.
The change in IP will cause the signaling byways to end. The client
just re-establishes them and continues where it was. RIPT requires a
client and server to buffer a small amount of media for cases where
the media byways are temporarily disconnected. In cases where there
is no network connectivity during the transition, the buffered
packets are sent in a burst. In this way, there is no loss of media
through the transition.
4.13. Resiliency MidCall Wireless Fade and Recover
Consider a similar case, where user Alice is on her mobile device,
and on a call. While she is on the call, she moves into a tunnel,
and network connectivity is lost for a few seconds.
The PUT and GET requests against the server for the media byway will
fail, and the signaling byway will possibly time out or return an
error. The IP phone just buffers the media content being spoken by
the user. Similarly, the server will be buffering the media it
receives. When the connection is restored, the media byways will be
re-established, and the server will quickly push the buffered media
to the client and vice versa. This allows the call to continue,
with no loss of media, within the depth of the jitter buffer.
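The buffering behavior can be sketched as a bounded queue that is
flushed in a burst on reconnection. The class name, the frame
representation and the default bound of 50 frames are assumptions for
illustration; the actual amount of media to buffer is a matter for
the core RIPT specification.

```python
from collections import deque


class MediaSender:
    """Toy sender that buffers frames while the media byway is down."""

    def __init__(self, max_buffered=50):
        self.connected = True
        # Bounded buffer: once full, the oldest frames are discarded.
        self.buffer = deque(maxlen=max_buffered)
        self.sent = []  # frames delivered to the peer, in order

    def send(self, frame):
        if self.connected:
            self.sent.append(frame)
        else:
            self.buffer.append(frame)

    def reconnect(self):
        """Re-establish the byway and push buffered frames in a burst."""
        self.connected = True
        while self.buffer:
            self.sent.append(self.buffer.popleft())


sender = MediaSender()
sender.send("f1")
sender.connected = False  # the tunnel: requests start failing
sender.send("f2")
sender.send("f3")
sent_during_fade = list(sender.sent)
sender.reconnect()
```

Media survives the fade only while the outage fits within the bound;
a longer outage drops the oldest frames first in this sketch.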
4.14. Resiliency MidCall Wireless Fade and Move
In a similar use case, Alice is on her mobile phone in a call and
goes to her house. She is one of those unfortunate few who have no
cell signal in her house, nor does she have WiFi on her cell phone.
Poor Alice.
When Alice enters her home, the network connectivity on her mobile
phone is lost. However, her PC is up and running, so she logs into
her service provider's portal from the browser. This shows the call
in progress. Alice can hit the "move" button, which will cause the
browser to take the call, identically to the techniques described
above.
4.15. Resiliency MidCall Wireless Fade and Peer Hangup
In this case, Alice is once again on her mobile device and enters an
area where there is no coverage for a long distance. As such, her
device is unable to send and receive media for many seconds. The
server is able to detect this, and can inform the remote user that
Alice has lost network connectivity (open question: should this be
done via RIPT or through proprietary means?). The remote user gives
up at some point and hangs up the call. Alice's server ends the
call.
When Alice's phone finally regains network connectivity, it connects
to the call URI and gets a 404. This tells the device that the call
no longer exists, and so Alice's phone indicates to Alice that the
call has been ended. (TODO: should we keep the call state around in
the 'ended' state for an hour or so, so that Alice's device can query
it later and learn that it was ended by the remote party through an
explicit hangup event, and also learn when?)
4.16. Resiliency MidCall Wireless Fade and Server Drop
In this final use case, Alice enters an area where there is no
coverage for an extended period of time. The server quickly detects
that she is not connected (it ceases receiving media). After a
period of time, the server decides to end the call. It changes the
call state to ended, which is passed to the remote party.
When Alice's phone recovers and connects to the call, it gets a 404,
informing her that the call has ended.
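In both of the last two use cases, the reconnecting client learns the
outcome from the HTTP status of its request to the call URI. A
minimal sketch of that decision follows. Treating every status other
than 200 and 404 as a reason to retry is an assumption of this
example, as are the function name and the "state" member of the
response body.

```python
def interpret_reconnect(status_code, body=None):
    """Map the response to a reconnect attempt onto a local call state."""
    if status_code == 404:
        return "ended"  # the call no longer exists on the server
    if status_code == 200 and isinstance(body, dict):
        return body.get("state", "unknown")
    return "retry"  # assumed policy for any other outcome


after_long_fade = interpret_reconnect(404)
after_short_fade = interpret_reconnect(200, {"state": "answered"})
```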
5. Normative Protocol Specification
A server that supports inbound calls on its TG MUST include the
"inbound" element in its TG description. This MUST include the
allowed caller IDs in the "origins" element, and the allowed
destinations in the "destinations".
The server MUST allow the client to send "proceeding", "alerting",
"answered", "declined", "failed", "noanswer" and "end" events, and
take the associated actions on the call.
A client that answers a call MUST perform a POST operation to the
call URI, and in the body of the request, it MUST include its handler
URI, and no other information. The server MUST respond with a
directive. If the directive works for the client, the client MUST
generate an 'answered' event to answer the call. The client MUST NOT
POST its handler to the call URI until the user indicates that this
device should accept the call.
The client MUST initiate signaling and media byways for the call,
render incoming media and generate outgoing media for the call.
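A server might enforce two of these requirements with checks along
the following lines. This is a sketch only: the encoding of the POST
body as a JSON object with a single "handler" member is an
assumption, since this document does not define the body syntax, and
the function names are hypothetical.

```python
# Event names a client is allowed to send, as listed above.
ALLOWED_CLIENT_EVENTS = frozenset(
    {"proceeding", "alerting", "answered", "declined",
     "failed", "noanswer", "end"}
)


def check_answer_post(body: dict) -> str:
    """Return the handler URI if the body carries it and nothing else."""
    if set(body) != {"handler"} or not isinstance(body["handler"], str):
        raise ValueError("body must contain only the handler URI")
    return body["handler"]


def check_client_event(name: str) -> str:
    """Reject any event name the server is not required to accept."""
    if name not in ALLOWED_CLIENT_EVENTS:
        raise ValueError(f"unsupported client event: {name}")
    return name


handler = check_answer_post({"handler": "https://provider.example/handlers/A"})
```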
6. Syntax
This specification outlines the syntax for the new events and TG
description.
7. IANA Considerations
No values are assigned in this document, no registries are created,
and there is no action assigned to the IANA by this document.
8. Security Considerations
TODO
9. Acknowledgements
Thanks to Cullen Jennings for his input on this document.
10. References
10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
10.2. Informative References
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
A., Peterson, J., Sparks, R., Handley, M., and E.
Schooler, "SIP: Session Initiation Protocol", RFC 3261,
DOI 10.17487/RFC3261, June 2002,
<https://www.rfc-editor.org/info/rfc3261>.
[RFC3725] Rosenberg, J., Peterson, J., Schulzrinne, H., and G.
Camarillo, "Best Current Practices for Third Party Call
Control (3pcc) in the Session Initiation Protocol (SIP)",
BCP 85, RFC 3725, DOI 10.17487/RFC3725, April 2004,
<https://www.rfc-editor.org/info/rfc3725>.
[RFC5626] Jennings, C., Ed., Mahy, R., Ed., and F. Audet, Ed.,
"Managing Client-Initiated Connections in the Session
Initiation Protocol (SIP)", RFC 5626,
DOI 10.17487/RFC5626, October 2009,
<https://www.rfc-editor.org/info/rfc5626>.
Author's Address
Jonathan Rosenberg
Five9
Email: jdrosen@jdrosen.net