Registration Protocols Extensions | M. Loffredo |
Internet-Draft | M. Martinelli |
Intended status: Standards Track | IIT-CNR/Registro.it |
Expires: November 30, 2020 | S. Hollenbeck |
 | Verisign Labs |
 | May 29, 2020 |
Registration Data Access Protocol (RDAP) Query Parameters for Result Sorting and Paging
draft-ietf-regext-rdap-sorting-and-paging-13
The Registration Data Access Protocol (RDAP) does not include core functionality for clients to provide sorting and paging parameters for control of large result sets. This omission can lead to unpredictable server processing of queries and client processing of responses. This unpredictability can be greatly reduced if clients can provide servers with their preferences for managing large responses. This document describes RDAP query extensions that allow clients to specify their preferences for sorting and paging result sets.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 30, 2020.
Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The availability of functionality for result sorting and paging provides benefits to both clients and servers in the implementation of RESTful services [REST]. These benefits include:
Approaches to implementing features for result sorting and paging can be grouped into two main categories:
However, there are some drawbacks associated with the use of the HTTP header. First, the header properties cannot be set directly from a web browser. Moreover, in an HTTP session, the information on the status (i.e. the session identifier) is usually inserted in the header or in a cookie, while the information on the resource identification or the search type is included in the query string. The second approach is therefore not compliant with the HTTP standard [RFC7230]. As a result, this document describes a specification based on the use of query parameters.
Currently, the RDAP protocol [RFC7482] defines two query types:
While the lookup query does not raise issues in response size management, the search query can potentially generate a large result set that could be truncated according to server limits. In addition, it is not possible to obtain the total number of objects found that might be returned in a search query response [RFC7483]. Lastly, there is no way to specify sort criteria to return the most relevant objects at the beginning of the result set. Therefore, the client might traverse the whole result set to find the relevant objects or, due to truncation, might not find them at all.
The specification described in this document extends RDAP query capabilities to enable result sorting and paging, by adding new query parameters that can be applied to RDAP search path segments. The service is implemented using the Hypertext Transfer Protocol (HTTP) [RFC7230] and the conventions described in RFC 7480 [RFC7480].
The implementation of the new parameters is technically feasible, as operators for counting, sorting and paging rows are currently supported by the major RDBMSs.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
The new query parameters are OPTIONAL extensions of path segments defined in RFC 7482 [RFC7482]. They are as follows:
Augmented Backus-Naur Form (ABNF) [RFC5234] is used in the following sections to describe the formal syntax of these new parameters.
According to the most advanced principles in REST design, collectively known as HATEOAS (Hypermedia as the Engine of Application State) [HATEOAS], a client entering a REST application through an initial URI should use server-provided links to dynamically discover available actions and access the resources it needs. In this way, the client is not required to have prior knowledge of the service or, consequently, to hard-code the URIs of different resources. This allows the server to make URI changes as the API evolves without breaking clients. In short, a REST service should be as self-descriptive as possible.
Therefore, servers implementing the query parameters described in this specification SHOULD provide additional information in their responses about both the available sorting criteria and possible pagination. Such information is collected in two OPTIONAL response elements named, respectively, "sorting_metadata" and "paging_metadata".
The "sorting_metadata" element contains the following properties:
At least one of the "currentSort" and "availableSorts" properties MUST be present.
The "paging_metadata" element contains the following fields:
Servers returning the "paging_metadata" element in their response MUST include the string literal "paging" in the rdapConformance array. Servers returning the "sorting_metadata" element MUST include the string literal "sorting".
Currently, the RDAP protocol does not allow a client to determine the total number of the results in a query response when the result set is truncated. This is inefficient because the user cannot determine if the result set is complete.
The "count" parameter provides additional functionality (Figure 1) that allows a client to request information from the server that specifies the total number of objects matching the search pattern.
https://example.com/rdap/domains?name=*nr.com&count=true
Figure 1: Example of RDAP query reporting the "count" parameter
The ABNF syntax is the following:
A trueValue means that the server MUST provide the total number of the objects in the "totalCount" field of the "paging_metadata" element (Figure 2). A falseValue means that the server MUST NOT provide this number.
{
  "rdapConformance": [
    "rdap_level_0",
    "paging"
  ],
  ...
  "paging_metadata": {
    "totalCount": 43
  },
  "domainSearchResults": [
    ...
  ]
}
Figure 2: Example of RDAP response with "paging_metadata" element containing the "totalCount" field
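As a rough client-side illustration, the sketch below builds a search URL carrying the "count" parameter and reads "totalCount" from a response shaped like Figure 2. The helper names (build_count_query, total_count) are hypothetical, the "false" literal is an assumption about the falseValue syntax, and the response literal is a stand-in for an actual server reply.

```python
import json
from urllib.parse import urlencode

def build_count_query(base_url, search_params, count=True):
    """Append the "count" parameter to an RDAP search URL (hypothetical helper)."""
    params = dict(search_params)
    # "true" matches Figure 1; "false" is an assumed falseValue literal.
    params["count"] = "true" if count else "false"
    # safe="*" keeps the partial-match wildcard readable, as in Figure 1.
    return base_url + "?" + urlencode(params, safe="*")

def total_count(response):
    """Return "totalCount" if the server supplied it, otherwise None."""
    return response.get("paging_metadata", {}).get("totalCount")

url = build_count_query("https://example.com/rdap/domains", {"name": "*nr.com"})

# Stand-in for a server reply modelled on Figure 2.
sample = json.loads("""
{
  "rdapConformance": ["rdap_level_0", "paging"],
  "paging_metadata": {"totalCount": 43},
  "domainSearchResults": []
}
""")
print(url)
print(total_count(sample))
```

A client should be prepared for the field to be absent, since "paging_metadata" is an OPTIONAL response element.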
The RDAP protocol does not provide any capability to specify result set sort criteria. A server could implement a default sorting scheme according to the object class, but this feature is not mandatory and might not meet user requirements. Sorting can be addressed by the client, but this solution is rather inefficient. Sorting features provided by the RDAP server could help avoid truncation of relevant results.
The "sort" parameter allows the client to ask the server to sort the results according to the values of one or more properties and according to the sort direction of each property. The ABNF syntax is the following:
"a" means that an ascending sort MUST be applied, "d" means that a descending sort MUST be applied. If the sort direction is absent, an ascending sort MUST be applied (Figure 3).
https://example.com/rdap/domains?name=*nr.com&sort=name
https://example.com/rdap/domains?name=*nr.com&sort=registrationDate:d
https://example.com/rdap/domains?name=*nr.com&sort=lockedDate,name
Figure 3: Examples of RDAP query reporting the "sort" parameter
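The following sketch shows one way a server might split a "sort" value into its criteria. The syntax it accepts is an assumption inferred from Figure 3 (comma-separated properties, each optionally followed by ":a" or ":d"), and parse_sort is a hypothetical helper rather than part of this specification.

```python
def parse_sort(value):
    """Split a "sort" value into (property, direction) pairs.

    Assumed syntax, inferred from Figure 3: comma-separated items, each a
    property name optionally followed by ":a" or ":d".
    """
    criteria = []
    for item in value.split(","):
        prop, sep, direction = item.partition(":")
        if not prop:
            raise ValueError("empty sort property")
        if not sep:
            direction = "a"  # absent direction means ascending
        if direction not in ("a", "d"):
            raise ValueError(f"invalid sort direction: {direction!r}")
        criteria.append((prop, direction))
    return criteria

print(parse_sort("name"))
print(parse_sort("registrationDate:d"))
print(parse_sort("lockedDate,name"))
```

A server built along these lines could map a ValueError to an HTTP 400 (Bad Request) response.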
With the exception of sorting IP addresses, servers MUST implement sorting according to the JSON value type of the RDAP field the sorting property refers to. That is, JSON strings MUST be sorted lexicographically and JSON numbers MUST be sorted numerically. If IP addresses are represented as JSON strings, they MUST be sorted based on their numeric conversion.
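To see why IP addresses are treated separately, the sketch below compares plain string ordering with ordering by numeric conversion. It uses Python's standard ipaddress module; the sample addresses and the sort_key helper are illustrative assumptions.

```python
import ipaddress

def sort_key(value):
    """Key for IP-address strings: compare by numeric value, not by characters."""
    return int(ipaddress.ip_address(value))

addresses = ["192.0.2.10", "192.0.2.9", "198.51.100.1"]

lexicographic = sorted(addresses)            # plain string order
numeric = sorted(addresses, key=sort_key)    # order required for IP addresses

print(lexicographic)
print(numeric)
```

In string order "192.0.2.10" precedes "192.0.2.9", whereas the numeric order places "192.0.2.9" first.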
If the "sort" parameter reports an allowed sorting property, it MUST be provided in the "currentSort" field of the "sorting_metadata" element.
In the "sort" parameter ABNF syntax, property-ref represents a reference to a property of an RDAP object. Such a reference could be expressed by using a JSONPath. The JSONPath in a JSON document [RFC8259] is equivalent to the XPath [W3C.CR-xpath-31-20161213] in an XML document. For example, the JSONPath to select the value of the ASCII name inside an RDAP domain object is "$.ldhName", where $ identifies the root of the document (DOM). Another way to select a value inside a JSON document is the JSON Pointer [RFC6901]. While JSONPath and JSON Pointer are both standard ways to select any value inside JSON data, neither is particularly easy to use (e.g. "$.events[?(@.eventAction=='registration')].eventDate" is the JSONPath expression of the registration date in an RDAP domain object).
Therefore, this specification provides a definition of property-ref in terms of RDAP properties. However, not all the RDAP properties are suitable to be used in sort criteria, such as:
On the contrary, properties expressed as values of other properties (e.g. registration date) could be used in such a context. The properties an RDAP server MAY implement are divided into two groups: object common properties and object specific properties.
The correspondence between these sorting properties and the RDAP object classes is shown in Table 1:
Object class | Sorting property | RDAP property | RFC 7483 | RFC 6350 | RFC 8605 |
---|---|---|---|---|---|
Searchable objects | Common properties | eventAction values suffixed by "Date" | 4.5. | | |
Domain | name | unicodeName/ldhName | 5.3. | | |
Nameserver | name | unicodeName/ldhName | 5.2. | | |
Nameserver | ipV4 | v4 ipAddress | 5.2. | | |
Nameserver | ipV6 | v6 ipAddress | 5.2. | | |
Entity | handle | handle | 5.1. | | |
Entity | fn | vcard fn | 5.1. | 6.2.1 | |
Entity | org | vcard org | 5.1. | 6.6.4 | |
Entity | voice | vcard tel with type="voice" | 5.1. | 6.4.1 | |
Entity | email | vcard email | 5.1. | 6.4.2 | |
Entity | country | country name in vcard adr | 5.1. | 6.3.1 | |
Entity | cc | country code in vcard adr | 5.1. | | 3.1 |
Entity | city | locality in vcard adr | 5.1. | 6.3.1 | |
With regard to the definitions in Table 1, some further considerations are needed to disambiguate some cases:
The "jsonPath" field in the "sorting_metadata" element is used to clarify the RDAP field the sorting property refers to. The mapping between the sorting properties and the JSONPaths of the RDAP fields is shown in Table 2. The JSONPaths are provided according to the Goessner v.0.8.0 specification ([GOESSNER-JSON-PATH]). Further documentation about JSONPath operators used in Table 2 is included in Appendix A.
Object class | Sorting property | JSONPath |
---|---|---|
Searchable objects | registrationDate | $.domainSearchResults[*].events[?(@.eventAction=="registration")].eventDate |
Searchable objects | reregistrationDate | $.domainSearchResults[*].events[?(@.eventAction=="reregistration")].eventDate |
Searchable objects | lastChangedDate | $.domainSearchResults[*].events[?(@.eventAction=="last changed")].eventDate |
Searchable objects | expirationDate | $.domainSearchResults[*].events[?(@.eventAction=="expiration")].eventDate |
Searchable objects | deletionDate | $.domainSearchResults[*].events[?(@.eventAction=="deletion")].eventDate |
Searchable objects | reinstantiationDate | $.domainSearchResults[*].events[?(@.eventAction=="reinstantiation")].eventDate |
Searchable objects | transferDate | $.domainSearchResults[*].events[?(@.eventAction=="transfer")].eventDate |
Searchable objects | lockedDate | $.domainSearchResults[*].events[?(@.eventAction=="locked")].eventDate |
Searchable objects | unlockedDate | $.domainSearchResults[*].events[?(@.eventAction=="unlocked")].eventDate |
Domain | name | $.domainSearchResults[*].unicodeName |
Nameserver | name | $.nameserverSearchResults[*].unicodeName |
Nameserver | ipV4 | $.nameserverSearchResults[*].ipAddresses.v4[0] |
Nameserver | ipV6 | $.nameserverSearchResults[*].ipAddresses.v6[0] |
Entity | handle | $.entitySearchResults[*].handle |
Entity | fn | $.entitySearchResults[*].vcardArray[1][?(@[0]=="fn")][3] |
Entity | org | $.entitySearchResults[*].vcardArray[1][?(@[0]=="org")][3] |
Entity | voice | $.entitySearchResults[*].vcardArray[1][?(@[0]=="tel" && @[1].type=="voice")][3] |
Entity | email | $.entitySearchResults[*].vcardArray[1][?(@[0]=="email")][3] |
Entity | country | $.entitySearchResults[*].vcardArray[1][?(@[0]=="adr")][3][6] |
Entity | cc | $.entitySearchResults[*].vcardArray[1][?(@[0]=="adr")][1].cc |
Entity | city | $.entitySearchResults[*].vcardArray[1][?(@[0]=="adr")][3][3] |
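To make the JSONPath expressions in Table 2 more concrete, the sketch below spells out two of them as plain Python over a single result object, without relying on any JSONPath library. The function names and sample objects are invented for illustration. Unlike a JSONPath filter, which selects every match, these helpers return only the first matching value.

```python
def event_date(obj, action):
    """Plain-Python equivalent of events[?(@.eventAction=="<action>")].eventDate."""
    for event in obj.get("events", []):
        if event.get("eventAction") == action:
            return event.get("eventDate")
    return None

def vcard_value(entity, name):
    """Plain-Python equivalent of vcardArray[1][?(@[0]=="<name>")][3]."""
    vcard = entity.get("vcardArray", ["vcard", []])
    for prop in vcard[1]:
        if prop[0] == name:
            return prop[3]
    return None

# Minimal objects invented for illustration.
domain = {
    "ldhName": "example.com",
    "events": [
        {"eventAction": "registration", "eventDate": "2019-05-01T10:00:00Z"},
        {"eventAction": "expiration", "eventDate": "2025-05-01T10:00:00Z"},
    ],
}
entity = {
    "handle": "XYZ-1",
    "vcardArray": ["vcard", [
        ["version", {}, "text", "4.0"],
        ["fn", {}, "text", "Example Registrant"],
    ]],
}
print(event_date(domain, "registration"))
print(vcard_value(entity, "fn"))
```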
Table 2 JSONPath notes:
An RDAP server MAY use the "links" array of the "sorting_metadata" element to provide ready-made references [RFC8288] to the available sort criteria (Figure 4). Each link represents a reference to an alternate view of the results.
{
  "rdapConformance": [
    "rdap_level_0",
    "sorting"
  ],
  ...
  "sorting_metadata": {
    "currentSort": "name",
    "availableSorts": [
      {
        "property": "registrationDate",
        "jsonPath": "$.domainSearchResults[*].events[?(@.eventAction==\"registration\")].eventDate",
        "default": false,
        "links": [
          {
            "value": "https://example.com/rdap/domains?name=*nr.com&sort=name",
            "rel": "alternate",
            "href": "https://example.com/rdap/domains?name=*nr.com&sort=registrationDate",
            "title": "Result Ascending Sort Link",
            "type": "application/rdap+json"
          },
          {
            "value": "https://example.com/rdap/domains?name=*nr.com&sort=name",
            "rel": "alternate",
            "href": "https://example.com/rdap/domains?name=*nr.com&sort=registrationDate:d",
            "title": "Result Descending Sort Link",
            "type": "application/rdap+json"
          }
        ]
      },
      ...
    ]
  },
  "domainSearchResults": [
    ...
  ]
}
Figure 4: Example of a "sorting_metadata" instance to implement result sorting
The cursor parameter defined in this specification can be used to encode information about any pagination method. For example, in a simple implementation where the cursor represents offset pagination information, the cursor value "b2Zmc2V0PTEwMCxsaW1pdD01MA==" is the Base64 encoding of "offset=100,limit=50". Likewise, in a simple implementation representing keyset pagination information, the cursor value "a2V5PXRoZWxhc3Rkb21haW5vZnRoZXBhZ2UuY29t" is the Base64 encoding of "key=thelastdomainofthepage.com", where the key value identifies the last row of the current page.
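A minimal sketch of such a simple cursor scheme is shown below, using the URL-safe Base64 functions from Python's standard library. The function names are hypothetical, and the state strings are the two examples given above. A production server would likely use a cursor format that clients cannot tamper with.

```python
import base64

def encode_cursor(state):
    """Encode pagination state as an opaque, URL-safe Base64 cursor."""
    return base64.urlsafe_b64encode(state.encode("utf-8")).decode("ascii")

def decode_cursor(cursor):
    """Recover the pagination state a cursor was built from."""
    return base64.urlsafe_b64decode(cursor.encode("ascii")).decode("utf-8")

offset_cursor = encode_cursor("offset=100,limit=50")
keyset_cursor = encode_cursor("key=thelastdomainofthepage.com")

print(offset_cursor, decode_cursor(offset_cursor))
print(keyset_cursor, decode_cursor(keyset_cursor))
```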
This solution lets RDAP providers implement a pagination method according to their needs, a user's access level, and the submitted query. In addition, servers can change the method over time without announcing anything to clients. The considerations that have led to this solution are reported in more detail in Appendix B.
The ABNF syntax of the cursor parameter is the following:
https://example.com/rdap/domains?name=*nr.com&cursor=wJlCDLIl6KTWypN7T6vc6nWEmEYe99Hjf1XY1xmqV-M=
Figure 5: An example of RDAP query reporting the "cursor" parameter
An RDAP server SHOULD use the "links" array of the "paging_metadata" element to provide a ready-made reference [RFC8288] to the next page of the result set (Figure 6). Examples of additional "rel" values a server MAY implement are "first", "last", and "prev".
{
  "rdapConformance": [
    "rdap_level_0",
    "paging"
  ],
  ...
  "notices": [
    {
      "title": "Search query limits",
      "type": "result set truncated due to excessive load",
      "description": [
        "search results for domains are limited to 50"
      ]
    }
  ],
  "paging_metadata": {
    "totalCount": 73,
    "pageSize": 50,
    "pageNumber": 1,
    "links": [
      {
        "value": "https://example.com/rdap/domains?name=*nr.com",
        "rel": "next",
        "href": "https://example.com/rdap/domains?name=*nr.com&cursor=wJlCDLIl6KTWypN7T6vc6nWEmEYe99Hjf1XY1xmqV-M=",
        "title": "Result Pagination Link",
        "type": "application/rdap+json"
      }
    ]
  },
  "domainSearchResults": [
    ...
  ]
}
Figure 6: Example of a "paging_metadata" instance to implement cursor pagination
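From the client's perspective, scrolling a result set amounts to following the "next" link until none is offered. The sketch below illustrates this under stated assumptions: fetch is a caller-supplied function mapping a URL to a parsed JSON response, the PAGES dictionary is an in-memory stand-in for a server, and the cursor value "abc" is invented.

```python
def next_href(response):
    """Return the href of the "next" link in "paging_metadata", or None."""
    links = response.get("paging_metadata", {}).get("links", [])
    for link in links:
        if link.get("rel") == "next":
            return link.get("href")
    return None

def fetch_all(first_url, fetch, max_pages=100):
    """Collect domain results by following "next" links.

    `fetch` is a caller-supplied function (hypothetical) mapping a URL to a
    parsed JSON response; max_pages guards against a link loop.
    """
    results, url = [], first_url
    for _ in range(max_pages):
        if url is None:
            break
        response = fetch(url)
        results.extend(response.get("domainSearchResults", []))
        url = next_href(response)
    return results

# In-memory stand-in for a server with two pages.
PAGES = {
    "https://example.com/rdap/domains?name=*nr.com": {
        "paging_metadata": {"links": [{
            "rel": "next",
            "href": "https://example.com/rdap/domains?name=*nr.com&cursor=abc",
        }]},
        "domainSearchResults": [{"ldhName": "a-nr.com"}],
    },
    "https://example.com/rdap/domains?name=*nr.com&cursor=abc": {
        "paging_metadata": {},
        "domainSearchResults": [{"ldhName": "b-nr.com"}],
    },
}
names = [d["ldhName"] for d in fetch_all(
    "https://example.com/rdap/domains?name=*nr.com", PAGES.__getitem__)]
print(names)
```

Because the client only follows server-provided links, it never needs to interpret the cursor value itself.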
The value constraints for the parameters are defined by their ABNF syntax. Therefore, each request that includes an invalid value for a parameter SHOULD produce an HTTP 400 (Bad Request) response code. The same response SHOULD be returned in the following cases:
Optionally, the response MAY include additional information regarding the negative answer in the HTTP entity body.
Implementation of the new parameters is technically feasible, as operators for counting, sorting and paging are currently supported by the major RDBMSs. Similar operators are completely or partially supported by the best-known NoSQL databases (MongoDB, CouchDB, HBase, Cassandra, Hadoop).
NOTE: Please remove this section and the reference to RFC 7942 prior to publication as an RFC.
This section records the status of known implementations of the protocol defined by this specification at the time of posting of this Internet-Draft, and is based on a proposal described in RFC 7942 [RFC7942]. The description of implementations in this section is intended to assist the IETF in its decision processes in progressing drafts to RFCs. Please note that the listing of any individual implementation here does not imply endorsement by the IETF. Furthermore, no effort has been spent to verify the information presented here that was supplied by IETF contributors. This is not intended as, and must not be construed to be, a catalog of available implementations or their features. Readers are advised to note that other implementations may exist.
According to RFC 7942, "this will allow reviewers and working groups to assign due consideration to documents that have the benefit of running code, which may serve as evidence of valuable experimentation and feedback that have made the implemented protocols more mature. It is up to the individual working groups to use this information as they see fit".
IANA is requested to register the following values in the RDAP Extensions Registry:
Security services for the operations specified in this document are described in RFC 7481 [RFC7481].
A search query typically requires more server resources (such as memory, CPU cycles, and network bandwidth) when compared to a lookup query. This increases the risk of server resource exhaustion and subsequent denial of service due to abuse. This risk can be mitigated by either restricting search functionality or limiting the rate of search requests. Servers can also reduce their load by truncating the results in a response. However, this last security policy can result in a higher inefficiency if the RDAP server does not provide any functionality to return the truncated results.
The new parameters presented in this document provide RDAP operators with a way to implement a server that reduces inefficiency risks. The "count" parameter gives the client the ability to evaluate the completeness of a response. The "sort" parameter allows the client to obtain the most relevant information at the beginning of the result set. This can reduce the number of unnecessary search requests. Finally, the "cursor" parameter enables the user to scroll the result set by submitting a sequence of sustainable queries within server-acceptable limits.
The authors would like to acknowledge Brian Mountford, Tom Harrison, Karl Heinz Wolf and Jasdip Singh for their contribution to the development of this document.
[CURSOR] | Nimesh, R., "Paginating Real-Time Data with Keyset Pagination", July 2014. |
[CURSOR-API1] | facebook.com, "facebook for developers - Using the Graph API", July 2017. |
[CURSOR-API2] | twitter.com, "Pagination", 2017. |
[GOESSNER-JSON-PATH] | Goessner, S., "JSONPath - XPath for JSON", 2007. |
[HATEOAS] | Jedrzejewski, B., "HATEOAS - a simple explanation", 2018. |
[OData-Part1] | Pizzo, M., Handl, R. and M. Zurmuehl, "OData Version 4.0. Part 1: Protocol Plus Errata 03", June 2016. |
[REST] | Fredrich, T., "RESTful Service Best Practices, Recommendations for Creating Web Services", April 2012. |
[RFC6901] | Bryan, P., Zyp, K. and M. Nottingham, "JavaScript Object Notation (JSON) Pointer", RFC 6901, DOI 10.17487/RFC6901, April 2013. |
[SEEK] | EverSQL.com, "Faster Pagination in Mysql - Why Order By With Limit and Offset is Slow?", July 2017. |
A JSONPath expression represents a path to find an element (or a set of elements) in JSON content.
The base JSONPath specification requires that implementations support a set of "basic operators". These operators are used to access the elements of a JSON structure like objects and arrays, and their subelements, respectively, object members and array items. No operations are defined for retrieving parent or sibling elements of a given element. The root element is always referred to as $ regardless of it being an object or array.
Additionally, the specification permits implementations to support arbitrary script expressions. These can be used to index into an object or an array, or to filter elements from an array. While script expression behaviour is implementation-defined, most implementations support the basic relational and logical operators, as well as both object member and array item access, sufficiently similarly for the purposes of this document. Commonly-supported operators/functions divided into "top-level operators" and "filter operators" are documented in Table 3 and Table 4 respectively.
Operator | Description |
---|---|
$ | Root element |
.<name> | Object member access (dot-notation) |
['<name>'] | Object member access (bracket-notation) |
[<number>] | Array item access |
* | All elements within the specified scope |
[?(<expression>)] | Filter expression |
Operator | Description |
---|---|
@ | Current element being processed |
.<name> | Object member access |
[<number>] | Array item access |
== | Left is equal to right |
!= | Left is not equal to right |
< | Left is less than right |
<= | Left is less than or equal to right |
> | Left is greater than right |
>= | Left is greater than or equal to right |
&& | Logical conjunction |
|| | Logical disjunction |
An RDAP query could return a response with hundreds, even thousands, of objects, especially when partial matching is used. For that reason, the cursor parameter addressing result pagination is defined to make responses easier to handle.
Presently, the most popular methods to implement pagination in a REST API include offset pagination and keyset pagination. Neither pagination method requires the server to handle the result set in a storage area across multiple requests since a new result set is generated each time a request is submitted. Therefore, they are preferred in comparison to any other method requiring the management of a REST session.
Using limit and offset operators represents the traditionally used method to implement results pagination. Both of them can be used individually:
When limit and offset are used together, they provide the ability to identify a specific portion of the result set. For example, the pair "offset=100,limit=50" returns the first 50 objects starting from position 101 of the result set.
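The effect of the two operators can be sketched with list slicing, as below. The offset_page helper and the 200-item result set are illustrative assumptions. A real server would typically apply the equivalent operators in its database query.

```python
def offset_page(rows, offset=0, limit=None):
    """Return the slice of `rows` selected by offset and limit."""
    end = None if limit is None else offset + limit
    return rows[offset:end]

rows = list(range(1, 201))   # positions 1..200 of an imaginary result set
page = offset_page(rows, offset=100, limit=50)
print(page[0], page[-1], len(page))
```

With "offset=100,limit=50" the page covers positions 101 through 150.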
Though easy to implement, offset pagination also includes drawbacks:
Keyset pagination [SEEK] adds a query condition that enables the selection of only the data not yet returned. This method has been taken as the basis for the implementation of a "cursor" parameter [CURSOR] by some REST API providers (e.g. [CURSOR-API1], [CURSOR-API2]). The cursor is an opaque URL-safe string representing a logical pointer to the first result of the next page (Figure 5).
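The idea can be sketched over an in-memory sorted list, as below: each page is selected by a condition on the key of the last row already returned rather than by a position. The sketch assumes a unique, sortable key. The keyset_page helper and the sample names are hypothetical.

```python
from bisect import bisect_right

def keyset_page(sorted_keys, last_key=None, limit=50):
    """Return up to `limit` keys strictly after `last_key` in a sorted list."""
    start = 0 if last_key is None else bisect_right(sorted_keys, last_key)
    return sorted_keys[start:start + limit]

names = sorted(["a.com", "b.com", "c.com", "d.com", "e.com"])
first = keyset_page(names, limit=2)
second = keyset_page(names, last_key=first[-1], limit=2)
print(first, second)
```

Because the condition refers to a key value, rows inserted or deleted before that key between two requests do not shift the contents of the following page.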
Nevertheless, even keyset pagination can be troublesome:
Furthermore, in the RDAP context, some additional considerations can be made:
Ultimately, both pagination methods have benefits and drawbacks.