Mimi                                                        J. Rosenberg
Internet-Draft                                                     Five9
Intended status: Standards Track                             C. Jennings
Expires: 27 April 2023                                             Cisco
                                                         24 October 2022


            Message format for More Messaging Interop (MIMI)
                   draft-rosenberg-mimi-msg-format-00

Abstract

   This document defines a semantic model and format for the inter-
   provider exchange of chat messages.  This format is focused on
   interoperability, while providing extensibility for additional
   content downstream.  It supports the common messaging features
   present in chat systems today, including threading, reactions,
   images, GIFs, videos, and delivery and read receipts.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 April 2023.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.











Rosenberg & Jennings      Expires 27 April 2023                 [Page 1]

Internet-Draft               MIMI Msg Format                October 2022


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Chat Resource Semantic Model  . . . . . . . . . . . . . . . .   2
   3.  MIMI Message Syntax . . . . . . . . . . . . . . . . . . . . .   4
   4.  Normative References  . . . . . . . . . . . . . . . . . . . .   6
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   7

1.  Introduction

   The More Instant Messaging Interoperability (MIMI) working group will
   specify the minimal set of mechanisms required to make modern
   Internet messaging applications interoperable.  Over time, messaging
   applications have achieved widespread use, their feature sets have
   broadened, and their adoption of end-to-end encryption (E2EE) has
   grown, but the lack of interoperability between these services
   continues to create a suboptimal user experience.  The standards
   produced by the MIMI working group will allow for E2EE messaging
   applications for both consumer and enterprise to interoperate without
   undermining the security guarantees that they provide.

   For security, messages will be end-to-end encrypted using MLS
   [I-D.ietf-mls-architecture].  MLS provides encryption of opaque
   blobs of message content, but does not specify the content format
   itself.  This specification is meant to fill that gap, providing the
   semantics of a messaging system and the syntax of messages exchanged
   between providers.

2.  Chat Resource Semantic Model

   A chat resource (often called a chat or chat room) represents
   message-based communication between two or more users.  When there
   are two users, it is referred to as a 1-1 chat.  When there are more
   than two users, it is referred to as a group chat.  Each chat
   resource is identified by a tuple consisting of a version 4 UUID and
   a DNS name.  The UUID uniquely identifies the chat resource and is
   called the chat ID.  The DNS name identifies the provider in which
   it lives.  We refer to this provider as the owner.
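
   As an illustration only, the identifier tuple could be modeled as
   follows.  The names ChatResourceId and parse_chat_resource are
   hypothetical, and the DNS label check is a simplified assumption
   rather than something this document specifies.

```python
import re
import uuid
from dataclasses import dataclass

# Simplified DNS label check (letters, digits, hyphens, 1-63 chars).
# This is an assumption for illustration, not a rule from the draft.
_DNS_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


@dataclass(frozen=True)
class ChatResourceId:
    """Hypothetical model of the (chat ID, owner DNS name) tuple."""
    chat_id: uuid.UUID  # version 4 UUID identifying the chat resource
    provider: str       # DNS name of the owning provider

    def __post_init__(self):
        if self.chat_id.version != 4:
            raise ValueError("chat ID must be a version 4 UUID")
        labels = self.provider.rstrip(".").split(".")
        if not all(_DNS_LABEL.fullmatch(label) for label in labels):
            raise ValueError("provider must be a DNS name")


def parse_chat_resource(chat_id: str, provider: str) -> ChatResourceId:
    """Build the identifier from wire strings; DNS names are lowercased."""
    return ChatResourceId(uuid.UUID(chat_id), provider.lower())


resource = parse_chat_resource(
    "72c659b7-d1f7-46ab-ae73-2339e3839036", "whatsapp.com")
```

   Because the dataclass is frozen and hashable, such an identifier
   could serve directly as a dictionary key for per-chat state.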




   In some chat systems, there can only be a single instance of a 1-1
   chat between a pair of users.  MIMI is agnostic to this choice, and
   reflects whatever policy the owner has in place.

   A chat has a set of properties.  In this version, only a single
   property is defined - the display name.

   The chat also maintains the current list of members.  Each member is
   represented by their identity, which can be mapped to the keying
   material used to decrypt messages using MLS [I-D.ietf-mls-protocol].

   The chat, of course, has a sequence of messages.  Each message has a
   type.  The set of valid types is extensible.  Messages are immutable
   once posted.  If a message is edited or deleted, this is handled by
   sending a new message which is an edit or deletion of the prior
   message.  The set of types is: content (in which the user sends
   text, image, video or audio), edit (in which the user is modifying a
   prior message), delete (in which the user is deleting a prior
   message), reaction (in which the user is reacting to a prior
   message), and create thread (in which the user is creating a thread
   about a prior message).  The content message type includes the format
   of the content as a MIME type (e.g., text/plain).  All messages
   include a reference to the prior message.  For reactions, edits,
   deletes and threads, this reference is to the specific message for
   which this is a reaction, edit, delete or start of a thread.  For a
   content message, the reference indicates the most recent message in
   the chat known to the user when they posted the message.  This
   facilitates message sequencing operations.
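
   The immutability rule can be sketched as a fold over an append-only
   log.  This is a minimal sketch, not a normative algorithm: the
   Message class, the render function and the type strings "edit" and
   "delete" are assumptions (only "content" and "reaction" appear in
   this document's example), and thread creation is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a posted message is never mutated
class Message:
    id: str
    sender: str
    type: str                 # "content", "edit", "delete" or "reaction"
    refers_to: Optional[str]  # ID of the referenced prior message
    text: str = ""            # body for content/edit, emoji for reaction


def render(log):
    """Fold the log into the current view, keyed by content message ID."""
    view = {}
    for m in log:
        if m.type == "content":
            view[m.id] = {"sender": m.sender, "text": m.text,
                          "reactions": []}
        elif m.refers_to in view:
            target = view[m.refers_to]
            if m.type == "edit":
                target["text"] = m.text      # edit supersedes the body
            elif m.type == "delete":
                del view[m.refers_to]        # delete hides the message
            elif m.type == "reaction":
                target["reactions"].append((m.sender, m.text))
    return view


log = [
    Message("m1", "alice", "content", None, "helo"),
    Message("m2", "alice", "edit", "m1", "hello"),
    Message("m3", "bob", "reaction", "m1", "U+2764"),
    Message("m4", "bob", "content", "m1", "oops"),
    Message("m5", "bob", "delete", "m4"),
]
current = render(log)
```

   The log itself keeps all five messages; only the derived view
   reflects the edit and the deletion.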

   There is also a message type for modifying the chat properties.  This
   message contains the property name and its value.  In this case, it
   would be text for the display name of the chat.

   (TBD: need to sort out messages related to group membership changes
   and whether they are part of this protocol or just use MLS in some
   way).

   All messages include the identity of the user that generated the
   message.  These identities must match those known to the MLS
   Authentication Service (AS).

   (TBD: how to convey the keyID needed to decrypt, which is needed
   outside of the payload that is encrypted?)

   All messages also include the chat resource (ID and provider DNS
   name).  This makes each MIMI message completely self-contained, and
   usable without any additional context outside of the message itself.





   When a user posts a message to the chat, the message is end-to-end
   encrypted.  This means that the server and its provider cannot
   decrypt the content.  Thus, MIMI messages are opaque to the server.
   The server stores these messages and notes the timestamp at which
   each message is received.  This timestamp is used to facilitate
   synchronization of messages between the source of truth and any
   domains which are holding replicas.  The synchronization is
   performed by having the providers of the participants issue a
   subscription using I-D.nandakumar-mimi-transport, requesting all
   messages since a specific timestamp.

   Different chat systems have different rules about whether or not a
   new user, added to the chat, has access to historic messages in the
   chat that were posted prior to joining.  This specification leaves
   that choice to the policy of the owner of the chat, and supports
   models where history is provided, and where it is not provided.  In
   cases where it is not provided, when a user is added to a chat at
   time T, they would have access to all content posted from time T
   onwards.  This would work by having their provider subscribe to all
   messages starting at time T.  In cases where history is provided, the
   provider would request messages starting from some time prior to T,
   probably as the user scrolls backwards through the chat.
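
   The timestamp-based synchronization described above can be sketched
   as an in-memory store of opaque blobs.  OpaqueMessageStore and its
   methods are hypothetical names, not part of
   I-D.nandakumar-mimi-transport, and treating "since T" as inclusive
   of T is an assumption.

```python
import bisect
import time


class OpaqueMessageStore:
    """Hypothetical owner-side store: encrypted blobs plus receive times."""

    def __init__(self):
        self._timestamps = []  # receive times, kept non-decreasing
        self._blobs = []       # encrypted payloads, never inspected

    def append(self, blob: bytes, received_at=None) -> float:
        """Store a blob and return the receive timestamp recorded for it."""
        ts = time.time() if received_at is None else received_at
        if self._timestamps and ts < self._timestamps[-1]:
            ts = self._timestamps[-1]  # clamp so ordering stays monotonic
        self._timestamps.append(ts)
        self._blobs.append(blob)
        return ts

    def since(self, t: float):
        """Return (timestamp, blob) pairs received at or after time t."""
        i = bisect.bisect_left(self._timestamps, t)
        return list(zip(self._timestamps[i:], self._blobs[i:]))


store = OpaqueMessageStore()
store.append(b"ciphertext-1", received_at=10.0)
store.append(b"ciphertext-2", received_at=20.0)
store.append(b"ciphertext-3", received_at=30.0)
after_join = store.since(20.0)  # user added at T = 20, no history
```

   A provider that offers history would call since() with an earlier
   timestamp, for example as the user scrolls back.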

   Consequently, a key property of the system is that, for any value of
   T, a provider can subscribe to messages sent since time T, pass them
   to the end client, which can decrypt them and "execute" them in
   sequence.  That sequence produces a valid rendering of the chat
   history that is not missing information.  For this to be true,
   reactions, threads, edits and deletes must include the original
   content to which they apply.  Consider the case where a message is
   posted at time T-5, and then another user posts a reaction at time
   T+3.  A new user is invited to the group chat at time T.  If they
   subscribe to receive all messages sent since time T, they will get
   the reaction at time T+3, but not the original content which is
   being reacted to.  Thus, the reaction needs to include the content
   to which it applies.
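
   The late-joiner case can be sketched as follows.  This is an
   illustration under assumptions: render_since_join is a hypothetical
   name, messages are taken to be already decrypted dictionaries shaped
   like the JSON example in Section 3, and a top-level content message
   is assumed to carry its "text" the same way an embedded reference
   does.  Only content and reaction messages are handled.

```python
def render_since_join(messages):
    """Build a view from messages received since time T, in order."""
    view = {}
    for m in messages:
        if m["type"] == "content":
            view[m["ID"]] = {"sender": m["sender"], "text": m["text"],
                             "reactions": []}
        elif m["type"] == "reaction":
            ref = m["reference"]
            # If the original predates the join, seed the view from the
            # copy embedded in the reaction; otherwise reuse the entry.
            entry = view.setdefault(
                ref["ID"],
                {"sender": ref["sender"], "text": ref["text"],
                 "reactions": []})
            entry["reactions"].append(
                (m["sender"], m["reaction"]["unicode"]))
    return view


# Reaction sent at T+3 to a message posted at T-5 (not received).
reaction = {
    "ID": "6845db7f-95b4-4f60-9a65-820f222e444a",
    "sender": "+14085551212",
    "type": "reaction",
    "reaction": {"unicode": "U+2764"},
    "reference": {
        "ID": "959489b0-40ab-4baf-b187-5795b8757c67",
        "sender": "+17329876543",
        "type": "content",
        "text": "Sure, I will join you guys *l8r*",
    },
}
view = render_since_join([reaction])
```

   The resulting view contains the original text even though the new
   user never received the content message itself.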

   TODO: need to add timestamps, think about whether these are client
   generated and thus included in the signatures or server side; does
   MLS say something about this?

3.  MIMI Message Syntax

   MIMI messages are structured as JSON, which is the current syntax du
   jour for representing extensible data on the Internet.  The old CPIM
   format [RFC3862], while originally specified as an interoperable
   format for instant messaging, is dated at this point and is missing
   many of the fields needed.



   The following is an example message in JSON format:

   {
     "ID" : "6845db7f-95b4-4f60-9a65-820f222e444a",

     "chat" : {
       "ID" : "72c659b7-d1f7-46ab-ae73-2339e3839036",
       "provider" : "whatsapp.com"
     },

     "sender" : "+14085551212",

     "type" : "reaction",

     "reaction" : {
       "unicode" : "U+2764"
     },

     "reference" : {
       "ID" : "959489b0-40ab-4baf-b187-5795b8757c67",
       "sender" : "+17329876543",
       "type" : "content",
       "format" : "text/plain",
       "text" : "Sure, I will join you guys *l8r*",
       "refersTo" : "473db0ec-7950-4c38-8de2-189ea9ac132b"
     }
   }

   The "ID" field indicates the identity of the message.  The "chat"
   structure includes the chat resource ID and its associated provider.
   The "sender" here is an E.164 number which refers to the sender of
   this message.  This example message is of type "reaction".  For each
   type, there is always a structure which has information specific to
   that type.  In the case of a reaction, this is a "reaction" structure
   that has a single field: the Unicode character that represents the
   reaction.  In this case, it is U+2764, which is a heart.
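
   As a rough sketch, a receiver could parse a decrypted payload shaped
   like the example and run basic structural checks.  The function
   names and the list of required fields are assumptions, since this
   document does not yet state which fields are mandatory.  The payload
   is trimmed to the fields the sketch inspects.

```python
import json
import uuid

# Assumed to be mandatory on every message; the draft does not list
# required fields explicitly.
REQUIRED_FIELDS = ("ID", "chat", "sender", "type")


def parse_mimi_message(raw: str) -> dict:
    """Parse a decrypted MIMI payload and run basic structural checks."""
    msg = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in msg]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    uuid.UUID(msg["ID"])          # raises ValueError if malformed
    uuid.UUID(msg["chat"]["ID"])
    # A reaction carries a "reaction" structure, as in the example.
    if msg["type"] == "reaction" and "reaction" not in msg:
        raise ValueError("reaction message lacks a 'reaction' structure")
    return msg


def reaction_char(msg: dict) -> str:
    """Turn the "U+XXXX" notation used in the example into a character."""
    code = msg["reaction"]["unicode"]
    if not code.startswith("U+"):
        raise ValueError(f"unexpected code point notation: {code!r}")
    return chr(int(code[2:], 16))


EXAMPLE = """
{
  "ID": "6845db7f-95b4-4f60-9a65-820f222e444a",
  "chat": {"ID": "72c659b7-d1f7-46ab-ae73-2339e3839036",
           "provider": "whatsapp.com"},
  "sender": "+14085551212",
  "type": "reaction",
  "reaction": {"unicode": "U+2764"}
}
"""

message = parse_mimi_message(EXAMPLE)
heart = reaction_char(message)  # the character for code point U+2764
```

   Using json.loads means a payload with a missing comma or colon is
   rejected before any field is examined.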










   Most importantly, the message contains a "reference" structure, which
   carries the message to which the reaction applies.  The reference
   always includes the ID, sender, type and content of the referenced
   message.  Here, it is a text message from a different user,
   "+17329876543".  That message, in turn, was typed at a time when
   message "473db0ec-7950-4c38-8de2-189ea9ac132b" was the most recent
   one in the UI of that user.

   In this use case, had there been reactions to this message prior to
   the user joining the group, and history was not provided, the new
   user would not see all of the reactions; they would see only those
   reactions sent after they joined the chat.  But the new user would
   at least see the message to which the reaction was applied, even
   though that message itself may have been sent prior to the user
   joining the group.

   For text content, Markdown is used to enable basic formatting.  A
   limited subset of Markdown will be supported (details TBD).

   Threads are not permitted to have subthreads.

   Link previews are problematic and require further discussion.  There
   are two options: previews generated at the sender, and previews
   generated at the receiver.  If the preview is generated at the
   receiver, it is a significant security issue, since it triggers the
   receiver to fetch a URL that they did not explicitly click on.  When
   generated at the sender, previews potentially reveal private
   information about the page which would only be shown to the sender,
   not the receiver (think: sending a link to your bank).  My view is
   that they should be sender-generated in MIMI, but fetched without
   cookies.

4.  Normative References

   [I-D.ietf-mls-architecture]
              Beurdouche, B., Rescorla, E., Omara, E., Inguva, S., Kwon,
              A., and A. Duric, "The Messaging Layer Security (MLS)
              Architecture", Work in Progress, Internet-Draft, draft-
              ietf-mls-architecture-09, 19 August 2022,
              <https://www.ietf.org/archive/id/draft-ietf-mls-
              architecture-09.txt>.










   [I-D.ietf-mls-protocol]
              Barnes, R., Beurdouche, B., Robert, R., Millican, J.,
              Omara, E., and K. Cohn-Gordon, "The Messaging Layer
              Security (MLS) Protocol", Work in Progress, Internet-
              Draft, draft-ietf-mls-protocol-16, 11 July 2022,
              <https://www.ietf.org/archive/id/draft-ietf-mls-protocol-
              16.txt>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC3862]  Klyne, G. and D. Atkins, "Common Presence and Instant
              Messaging (CPIM): Message Format", RFC 3862,
              DOI 10.17487/RFC3862, August 2004,
              <https://www.rfc-editor.org/info/rfc3862>.

   [RFC6120]  Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Core", RFC 6120, DOI 10.17487/RFC6120,
              March 2011, <https://www.rfc-editor.org/info/rfc6120>.

   [RFC6914]  Rosenberg, J., "SIMPLE Made Simple: An Overview of the
              IETF Specifications for Instant Messaging and Presence
              Using the Session Initiation Protocol (SIP)", RFC 6914,
              DOI 10.17487/RFC6914, April 2013,
              <https://www.rfc-editor.org/info/rfc6914>.

Authors' Addresses

   Jonathan Rosenberg
   Five9
   Email: jdrosen@jdrosen.net


   Cullen Jennings
   Cisco
   Email: fluffy@iii.ca












