Network Working Group                                       A. Silverton
Internet-Draft                                                    Q. Xie
Expires: January 20, 2006                                       Motorola
                                                               M. Tuexen
                                      Muenster Univ. of Applied Sciences
                                                            T. Dreibholz
                                            University of Duisburg-Essen
                                                           July 17, 2005

             Reliable Server Pooling Sockets API Extensions

   This document describes a sockets-like API for the Reliable Server
   Pooling (RSerPool) protocol suite.  This API provides applications
   within an RSerPool enabled system with a reliable communications
   layer via a highly-available socket interface (rsp_socket).

1.  Introduction

   This document describes a sockets-like API for the Reliable Server
   Pooling (RSerPool) protocol suite.  This API provides applications
   within an RSerPool enabled system [2] with a reliable communications
   layer via a highly-available socket interface (rsp_socket).

   The RSerPool API is intended to mimic the well-known UNIX sockets
   API.  The functions in this proposed extension to the sockets API are
   prefixed with "rsp_" to signify that they are highly-available
   RSerPool API calls.  Some new utility functions are defined as well.
   The API is intended to be extensible.

2.  Send and Receive Data with HA Sockets

   The UNIX sockets API contains several functions for sending and
   receiving data, e.g., send/recv and writev/readv.  Here we provide
   RSerPool analogs of the most general of the I/O functions, sendmsg
   and recvmsg.  This pair is chosen only to illustrate the general
   approach; the exact syntax for HA sockets requires further
   consideration.

   The intention is to retain as much as possible of the basic UNIX
   programming model with which many UNIX programmers are familiar.
   However, the semantics may differ, since we see no benefit in
   strictly mapping the semantics of the UNIX sockets API.

   The following API call is used for sending a message with an HA
   socket:

     ssize_t rsp_sendmsg(int sockfd,         /* HA socket descriptor */
                         struct msghdr *msg, /* message header struct */
                         int flags);         /* Options flags */

   The following API call is used for receiving a message with an HA
   socket:

     ssize_t rsp_recvmsg(int sockfd,         /* HA socket descriptor */
                         struct msghdr *msg, /* message header struct */
                         int flags);         /* Options flags */

   The message header structure has RSerPool-specific semantics:

     struct msghdr {
       void           *msg_name;       /* RSERPOOL destination name    */
       int             msg_namelen;    /* Length of name               */
       struct iovec   *msg_iov;        /* Data blocks                  */
       size_t          msg_iovlen;     /* Number of blocks             */
       void           *msg_control;    /* RSERPOOL specific ancillary
                                          data                         */
       size_t          msg_controllen; /* Length of cmsg data          */
       unsigned        msg_flags;      /* RSERPOOL specific flags
                                          returned by recvmsg          */
     };

   For example, msg_name can be a pointer to a NUL-terminated string
   that indicates the destination pool handle, or it can be a pointer
   to a structure that indicates a PE handle in a specific pool.

   The use of the msg_control and msg_flags fields is TBD.  Some
   potential uses are to indicate whether failover is allowed for this
   message, or to return a pointer to a PE handle indicating to whom
   the message was actually sent.
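
   As an illustration of the header layout described above, the
   following is a minimal sketch that fills a standard POSIX struct
   msghdr with a pool handle and a set of data blocks.  The helper
   names fill_msghdr() and msghdr_payload_len() are hypothetical and
   are not part of this API; the sketch assumes that the pool handle is
   a NUL-terminated string and that msg_control is left unused.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper (not defined by this draft): point a msghdr at a
 * NUL-terminated pool handle and at a caller-owned array of data blocks.
 * Ancillary data and flags are left zeroed because their use is TBD. */
void fill_msghdr(struct msghdr *msg, char *pool_handle,
                 struct iovec *iov, size_t iovcnt)
{
    assert(msg != NULL && pool_handle != NULL && iov != NULL);
    memset(msg, 0, sizeof(*msg));
    msg->msg_name    = pool_handle;
    msg->msg_namelen = (socklen_t)(strlen(pool_handle) + 1);
    msg->msg_iov     = iov;
    msg->msg_iovlen  = iovcnt;
}

/* Hypothetical helper: sum the lengths of all data blocks in the header. */
size_t msghdr_payload_len(const struct msghdr *msg)
{
    size_t total = 0;
    for (size_t i = 0; i < (size_t)msg->msg_iovlen; i++)
        total += msg->msg_iov[i].iov_len;
    return total;
}
```

   A caller would build the iovec array first, call fill_msghdr(), and
   then hand the header to rsp_sendmsg().  Keeping the name and the
   data blocks caller-owned mirrors how sendmsg() is normally used.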

2.1  Send Message By Pool Handle

   Using rsp_sendmsg to send to a pool handle:

     struct msghdr myMsg;
     char destinationPool[255];
     strcpy(destinationPool, "IETF-RSERPOOL-Server-cluster");
     myMsg.msg_name = destinationPool;
     myMsg.msg_namelen = strlen(destinationPool) + 1;

     /* set up the data pointer here */

     myMsg.msg_flags = FAILOVER_ALLOWED;
     ret = rsp_sendmsg(sock, &myMsg, myFlags);


   (Should this go in the example section?)

2.2  Send Message By Pool Element Handle

   Using rsp_sendmsg to send to a specific PE handle (and the sender
   does not allow failover of this message):

     struct msghdr myMsg;
     struct sockaddr_pehandle destPE; /* fields TBD */
     /* (somehow) obtain and fill in destPE struct here */
     myMsg.msg_name = &destPE;
     myMsg.msg_namelen = sizeof(destPE);

     /* set up the data pointer here */

     myMsg.msg_flags = FAILOVER_NOT_ALLOWED;
     ret = rsp_sendmsg(sock, &myMsg, myFlags);


   (Should this go in the example section?)

3.  Socket Utility Functions

   These calls provide the basic functionality for controlling RSerPool
   HA sockets and obtaining information about active sockets.

   rsp_socket(): This call sets up necessary ASAP resources for this HA
   socket and returns a socket fd to the caller.

     ssize_t rsp_socket(TBD);

   rsp_close(): This call closes the specified HA socket (and releases
   the ASAP resources associated with this HA socket).

     ssize_t rsp_close(int sockfd);

   rsp_connect(): One possibility here is to use this call to pre-fetch
   the pool information.  The caller can put the destination pool handle
   in the parameter when making this call, and the ASAP layer will query
   the Pool Registrar for the named pool and, if successful, will store
   the query result in its local cache.  If the named pool does not
   exist in the Pool Registrar, an error is returned to the caller of
   rsp_connect().

     ssize_t rsp_connect(sockfd, TBD);
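
   The pre-fetch behavior described above could be sketched as
   follows.  This is only an illustration under stated assumptions:
   the cache layout, the names prefetch_pool() and cache_lookup(), and
   the stand-in demo_registrar_query() (which knows a single pool named
   "echo") are all hypothetical.  A real ASAP layer would query the
   Pool Registrar over the network and cache a full PE list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8
#define HANDLE_MAX  64

/* Hypothetical cache entry: a pool handle and the number of PEs found. */
struct pool_cache_entry {
    char handle[HANDLE_MAX];
    int  pe_count;
    int  in_use;
};

struct pool_cache {
    struct pool_cache_entry slots[CACHE_SLOTS];
};

/* Query callback: returns the PE count, or -1 if the pool is unknown. */
typedef int (*registrar_query_fn)(const char *handle);

/* Stand-in for a Pool Registrar query; only the pool "echo" exists. */
int demo_registrar_query(const char *handle)
{
    return strcmp(handle, "echo") == 0 ? 2 : -1;
}

/* Return the cached entry for a pool handle, or NULL if not cached. */
const struct pool_cache_entry *cache_lookup(const struct pool_cache *cache,
                                            const char *handle)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        const struct pool_cache_entry *e = &cache->slots[i];
        if (e->in_use && strcmp(e->handle, handle) == 0)
            return e;
    }
    return NULL;
}

/* Resolve a pool handle and store the result in the local cache.
 * Returns 0 on success, -1 if the pool does not exist, the handle is
 * too long, or the cache is full. */
int prefetch_pool(struct pool_cache *cache, const char *handle,
                  registrar_query_fn query)
{
    assert(cache != NULL && handle != NULL && query != NULL);
    if (strlen(handle) >= HANDLE_MAX)
        return -1;
    if (cache_lookup(cache, handle) != NULL)
        return 0;                       /* already cached */
    int pe_count = query(handle);
    if (pe_count < 0)
        return -1;                      /* pool unknown to the registrar */
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct pool_cache_entry *e = &cache->slots[i];
        if (!e->in_use) {
            strcpy(e->handle, handle);  /* length checked above */
            e->pe_count = pe_count;
            e->in_use = 1;
            return 0;
        }
    }
    return -1;                          /* cache full */
}
```

   In this sketch a failed lookup leaves the cache untouched, which
   matches the requirement that an unknown pool results in an error to
   the caller rather than a cached negative entry.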

4.  RSerPool Utility Functions

   A number of utility functions, not directly related to data I/O but
   necessary all the same, are defined for the RSerPool sockets API.

   rsp_register(): This function call is used to register a PE as a
   member of a pool.  This call should be made after rsp_socket().

     ssize_t rsp_register(sockfd, TBD);

   rsp_deregister(): This function call is used to remove, or
   deregister, a PE from a pool.

     ssize_t rsp_deregister(sockfd, TBD);

   rsp_getpoolinfo(): The caller uses this to get the details of a pool.
   The call should return a full PE list, together with the pool policy,
   PE load factor, etc., to the caller.  This can be the basis for
   basic mode operation (see the service draft), in which the user of
   RSerPool chooses to handle its own data sending and uses RSerPool
   only for resolving pool handles.

     struct TBD rsp_getpoolinfo(sockfd, TBD);
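
   To illustrate basic mode operation, the following sketch shows how
   a caller might pick a PE from a returned PE list under a round-robin
   policy.  Since the return type of rsp_getpoolinfo() is TBD, the
   structures pe_entry and pool_info and the function
   select_round_robin() are hypothetical and exist only for this
   example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical PE list entry; the real structure is TBD. */
struct pe_entry {
    unsigned int id;        /* PE identifier */
    int          reachable; /* cleared once the PE is found unreachable */
};

/* Hypothetical pool information as a caller might keep it locally. */
struct pool_info {
    struct pe_entry *pes;   /* PE list */
    size_t           count; /* number of entries in the list */
    size_t           next;  /* round-robin cursor */
};

/* Return the next reachable PE in round-robin order, or NULL if the
 * pool is empty or no PE is currently reachable. */
struct pe_entry *select_round_robin(struct pool_info *pool)
{
    assert(pool != NULL);
    for (size_t tried = 0; tried < pool->count; tried++) {
        struct pe_entry *pe = &pool->pes[pool->next];
        pool->next = (pool->next + 1) % pool->count;
        if (pe->reachable)
            return pe;
    }
    return NULL;
}
```

   A caller that detects a send failure would clear the reachable flag
   of the affected entry and call select_round_robin() again to obtain
   another PE.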

   rsp_reportpefailure(): This call is used by a PU or a PE to report
   an unreachable PE in a pool.

     ssize_t rsp_reportfailure(sockfd, TBD);

   rsp_notify(): A notify function is needed to pass various RSerPool
   events, as well as lower-layer events, back to the HA socket user.
   This is just a placeholder and requires more thought.

     ssize_t rsp_notify(TBD);

   rsp_forcefailover(): It is not clear whether this is really
   necessary as a separate API call.  However, the functionality is
   needed to allow the user of RSerPool to trigger a failover (the
   upper layer may want to take over failure detection from RSerPool).

     ssize_t rsp_forcefailover(sockfd, TBD);

5.  Example API Usage

   This section demonstrates the usage of the RSerPool API.

5.1  An ECHO Pool Element

   int main()
   {
       int fd;
       struct sockaddr_in server_addr;
       struct rsp_info info;
       struct rsp_sndrcvinfo sinfo;
       struct rsp_loadinfo linfo;
       int len;
       char buf[1<<16];

       /* initialize the RSerPool implementation */
       memset(&info, 0, sizeof(struct rsp_info));

       /* we want to provide an SCTP based echo server */
       fd  = rsp_socket(AF_INET, SOCK_DGRAM, IPPROTO_SCTP);

       /* we want to use all addresses and use the standard port number */
       memset(&server_addr, 0, sizeof(server_addr));
       server_addr.sin_family      = AF_INET;
   #ifdef HAVE_SIN_LEN
       server_addr.sin_len         = sizeof(struct sockaddr_in);
   #endif
       server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
       server_addr.sin_port        = htons(7);
       rsp_bind(fd, (const struct sockaddr *) &server_addr,
                sizeof(server_addr));

       /* we need to register ourself */
       memset(&linfo, 0, sizeof(struct rsp_loadinfo));
       linfo.policy = RSP_ROUND_ROBIN;
       rsp_register(fd, "echo", 4, &linfo);

       /* send back all received messages */
       while (1) {
           memset(&sinfo, 0, sizeof(struct rsp_sndrcvinfo));
           len = rsp_recvmsg(fd, (void *) buf, sizeof(buf), &sinfo, 0);
           /* echo only when something was actually received */
           if (len > 0)
               rsp_sendmsg(fd, (const void *) buf, len, &sinfo, 0);
       }

       /* the deregistration would also be done implicitly by rsp_close() */
       rsp_close(fd);
       return 0;
   }

5.2  An ECHO Pool User

   int main()
   {
       int fd, n, done;
       struct rsp_info info;
       fd_set rset;
       char buf[1<<16];

       /* initialize the RSerPool implementation */
       memset(&info, 0, sizeof(struct rsp_info));

       /* we want to use an SCTP based echo server */
       fd  = rsp_socket(AF_INET, SOCK_DGRAM, IPPROTO_SCTP);

       /* set the destination */
       rsp_connect(fd, "echo", 4);

       done = 0;

       while (!done) {
           /* the read set must be rebuilt before every select call */
           FD_ZERO(&rset);
           FD_SET(0,  &rset);
           FD_SET(fd, &rset);
           /* wait for a packet from the server or stdin input */
           rsp_select(fd + 1, &rset, (fd_set *) NULL, (fd_set *) NULL,
                      (struct timeval *) NULL);
           if (FD_ISSET(0, &rset)) {
               n = read(0, (void *) buf, sizeof(buf));
               if (n <= 0)
                   done = 1;   /* end of input or read error */
               else
                   rsp_sendmsg(fd, (const void *) buf, n, NULL, 0);
           }
           if (FD_ISSET(fd, &rset)) {
               n = rsp_recvmsg(fd, (void *) buf, sizeof(buf), NULL, 0);
               /* buf is not NUL-terminated, so print exactly n bytes */
               if (n > 0)
                   printf("%.*s", n, buf);
           }
       }

       rsp_close(fd);
       return 0;
   }

6.  Security Considerations

   The security threat analysis of RSerPool is found in [5].  This
   document does not introduce any new threats.

7.  IANA Considerations


8.  Acknowledgements

