<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<!-- generated by https://github.com/cabo/kramdown-rfc version 1.7.29 (Ruby 3.4.4) -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-mahy-mimi-av-metadata-01" category="info" consensus="true" submissionType="IETF" tocInclude="true" sortRefs="true" symRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 3.29.0 -->
  <front>
    <title abbrev="MIMI Content AV Metadata">Audio, Video, and Image Metadata extensions for the More Instant Messaging Interoperability (MIMI) Content format</title>
    <seriesInfo name="Internet-Draft" value="draft-mahy-mimi-av-metadata-01"/>
    <author fullname="Rohan Mahy">
      <organization/>
      <address>
        <email>rohan.mahy@gmail.com</email>
      </address>
    </author>
    <date year="2026" month="March" day="02"/>
    <area>Applications and Real-Time</area>
    <workgroup>More Instant Messaging Interoperability</workgroup>
    <keyword>mimi content</keyword>
    <keyword>image metadata</keyword>
    <keyword>audio metadata</keyword>
    <keyword>video metadata</keyword>
    <abstract>
      <?line 38?>

<t>The More Instant Messaging Interoperability (MIMI) content format is a container for rich content, which can reference image, video, and audio files.
This document describes metadata for these files that helps clients render them more pleasantly, for example by reserving suitably sized placeholders before the media is downloaded.</t>
    </abstract>
    <note removeInRFC="true">
      <name>About This Document</name>
      <t>
        The latest revision of this draft can be found at <eref target="https://rohanmahy.github.io/mimi-av-metadata/draft-mahy-mimi-av-metadata.html"/>.
        Status information for this document may be found at <eref target="https://datatracker.ietf.org/doc/draft-mahy-mimi-av-metadata/"/>.
      </t>
      <t>
        Discussion of this document takes place on the
        More Instant Messaging Interoperability Working Group mailing list (<eref target="mailto:mimi@ietf.org"/>),
        which is archived at <eref target="https://mailarchive.ietf.org/arch/browse/mimi/"/>.
        Subscribe at <eref target="https://www.ietf.org/mailman/listinfo/mimi/"/>.
      </t>
      <t>Source for this draft and an issue tracker can be found at
        <eref target="https://github.com/rohanmahy/mimi-av-metadata"/>.</t>
    </note>
  </front>
  <middle>
    <?line 43?>

<section anchor="introduction">
      <name>Introduction</name>
      <t>The MIMI content format <xref target="I-D.ietf-mimi-content"/> can convey a variety of media types, as either inline or referenced external content.
In messaging applications it is common to display audio, video, and static image content, collectively audio/video (AV).
The layout for messaging applications often reserves a placeholder for the AV content.
While static images are commonly displayed immediately, audio and video content is often not downloaded and rendered immediately.
Even when image data is downloaded immediately, a network or server delay can leave a period during which the aspect ratio or dimensions of the image are not yet known.
It is therefore useful to have rendering hints about the media, such as its dimensions and duration, so that a client can lay out a suitable placeholder.
This document defines extensions to the MIMI content format to provide these hints.</t>
    </section>
    <section anchor="conventions-and-definitions">
      <name>Conventions and Definitions</name>
      <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
"<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as
described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
appear in all capitals, as shown here.</t>
      <?line -18?>

<t>This document uses a variety of terms from the MIMI content format definition, especially <tt>NestedPart</tt>, <tt>SinglePart</tt>, <tt>ExternalPart</tt>, and <tt>MultiPart</tt>.</t>
    </section>
    <section anchor="av-metadata-extensions">
      <name>AV Metadata Extensions</name>
      <t>The AV Metadata MIMI content extension is an array of AV metadata entries.
Each AV metadata entry is a CBOR map of AV metadata properties, all of which are optional except <tt>part_index</tt> and <tt>type</tt>. The semantics of the individual properties are as follows:</t>
      <ul spacing="normal">
        <li>
          <t><tt>part_index</tt>: the <tt>partIndex</tt> of the part that this entry describes, identifying its position within the NestedPart structure of a MIMI content message. It can refer to a <tt>SinglePart</tt> or <tt>ExternalPart</tt>.</t>
        </li>
        <li>
          <t><tt>type</tt>: an integer enumeration representing the media type (not including the subtype): audio is 1, image is 2, and video is 3. An extension socket (<tt>$ext_media</tt>) is defined, although no need for it is anticipated.</t>
        </li>
        <li>
          <t><tt>width</tt>: the width of the image or video in pixels</t>
        </li>
        <li>
          <t><tt>height</tt>: the height of the image or video in pixels</t>
        </li>
        <li>
          <t><tt>duration</tt>: the duration of the audio or video in seconds. It can be expressed as an unsigned integer or a positive floating-point number.</t>
        </li>
        <li>
          <t><tt>preview_index</tt>: for a video part, the <tt>partIndex</tt> of another related part that represents its preview image. It can refer to a <tt>SinglePart</tt>, an <tt>ExternalPart</tt>, or a <tt>MultiPart</tt> with <tt>chooseOne</tt> <tt>partSemantics</tt> that contains only <tt>SinglePart</tt> or <tt>ExternalPart</tt> parts, each of which must be an image with a disposition value of <tt>preview</tt>.</t>
        </li>
        <li>
          <t><tt>accessibility_text</tt>: text that could be rendered in place of the audio, image, or video when relevant accessibility settings are enabled, or when network access is slow or unavailable and no cached or preview image is available.</t>
        </li>
        <li>
          <t><tt>rotation</tt>: one of four values: 0, 90, 180, or 270. This integer refers to the number of degrees of clockwise rotation (in 90 degree increments) needed to correctly view the image.</t>
        </li>
      </ul>
      <ul empty="true">
        <li>
          <t>Note that an orientation field is not necessary. Any image or video whose <tt>width</tt> is larger than its <tt>height</tt> is assumed to have a landscape orientation, while one whose <tt>height</tt> is larger than its <tt>width</tt> is assumed to have a portrait orientation.</t>
        </li>
      </ul>
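      <t>As a non-normative illustration of how a client might use the <tt>width</tt> and <tt>height</tt> hints, the following Python sketch infers the orientation as described above and scales the hinted dimensions to fit a layout column. The function names and the <tt>max_width</tt> parameter are hypothetical, and returning no orientation for square media is an assumption, since this document does not define that case. Rotation is deliberately not handled.</t>
      <sourcecode type="python"><![CDATA[
def orientation(width, height):
    """Infer orientation from the width and height hints."""
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return None  # assumption: square media has no orientation


def placeholder_size(width, height, max_width):
    """Scale (width, height) to fit max_width, keeping the ratio.

    Media narrower than max_width keeps its hinted size.
    Returns None when the hints are not usable.
    """
    if width <= 0 or height <= 0 or max_width <= 0:
        return None
    scale = min(1.0, max_width / width)
    return round(width * scale), round(height * scale)
]]></sourcecode>
      <t>Under these assumptions, a 1920 by 1080 video shown in a 480 pixel wide column would get a 480 by 270 landscape placeholder.</t>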
      <t>The following Concise Data Definition Language (CDDL) <xref target="RFC8160"/> snippet formally defines the structure of the extension.</t>
      <sourcecode type="cddl"><![CDATA[
av_metadata_array = (
    "av_metadata" : [ * metadata_entry ]
)

metadata_entry = {
    &(part_index: 1) : uint16,
    &(type: 2)       : audio / image / video / $ext_media,
    ? &(width: 3)              : uint,
    ? &(height: 4)             : uint,
    ? &(duration: 5)           : nonnegative_number,
    ? &(preview_index: 6)      : uint16,
    ? &(accessibility_text: 7) : tstr,
    ? &(rotation: 8)           : 0 / 90 / 180 / 270,
    * $$ext_av_metadata
}

nonnegative_number = uint / float .gt 0.0
uint16 = uint .size 2

audio = 1
image = 2
video = 3
]]></sourcecode>
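      <t>The constraints in the CDDL above can be approximated in application code. The following non-normative Python sketch checks a single decoded metadata entry, assuming it has already been decoded from CBOR into a dictionary keyed by the integer map keys. The names <tt>validate_entry</tt> and <tt>is_uint</tt> are hypothetical. As simplifying assumptions, extension media types are rejected, unknown keys are ignored, and the 64-bit upper bound of a CBOR unsigned integer is not enforced.</t>
      <sourcecode type="python"><![CDATA[
MEDIA_TYPES = {1, 2, 3}  # audio, image, video
ROTATIONS = {0, 90, 180, 270}
UINT16_MAX = 0xFFFF


def is_uint(value, maximum=None):
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return value >= 0 and (maximum is None or value <= maximum)


def validate_entry(entry):
    """Return a list of problems; an empty list means none found."""
    problems = []
    if not is_uint(entry.get(1), UINT16_MAX):
        problems.append("part_index (1) must be a uint16")
    if not is_uint(entry.get(2)) or entry[2] not in MEDIA_TYPES:
        problems.append("type (2) must be 1, 2, or 3")
    for key, name in ((3, "width"), (4, "height")):
        if key in entry and not is_uint(entry[key]):
            problems.append(f"{name} ({key}) must be a uint")
    if 5 in entry:
        value = entry[5]
        positive_float = isinstance(value, float) and value > 0.0
        if not (is_uint(value) or positive_float):
            problems.append("duration (5): uint or float > 0")
    if 6 in entry and not is_uint(entry[6], UINT16_MAX):
        problems.append("preview_index (6) must be a uint16")
    if 7 in entry and not isinstance(entry[7], str):
        problems.append("accessibility_text (7) must be text")
    if 8 in entry:
        value = entry[8]
        if not is_uint(value) or value not in ROTATIONS:
            problems.append("rotation (8): 0, 90, 180, or 270")
    return problems
]]></sourcecode>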
    </section>
    <section anchor="example">
      <name>Example</name>
      <t>The example below describes a video of puppies (part 2), its preview image (part 4), and an audio clip (part 7).</t>
      <sourcecode type="cbor-diag"><![CDATA[
"av_metadata" : [
  {
     /part_index/         1: 2,
     /type      /         2: 3, /video/
     /width     /         3: 1920,
     /height    /         4: 1080,
     /duration  /         5: 37, / in seconds. can be uint or float /
     /preview_index /     6: 4,
     /accessibility_text/ 7: "two golden retriever puppies playing in" +
                             " overgrown grass lit with low sunlight"
  },
  {
     /part_index/         1: 4,
     /type      /         2: 2, /image/
     /width     /         3: 1920,
     /height    /         4: 1080
  },
  {
     /part_index/         1: 7,
     /type      /         2: 1, /audio/
     /duration  /         5: 9.45, / in seconds. can be uint or float /
     /accessibility_text/ 7: "uproarious laughter"
  }
]
]]></sourcecode>
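      <t>The link between the video entry and its preview entry in this example can be followed with a short, non-normative Python sketch. It assumes the extension has already been decoded into a list of dictionaries keyed by the integer map keys, and it transcribes the example with the accessibility text omitted for brevity. The name <tt>find_preview</tt> is hypothetical, and only the simple case is handled, where the preview is a single part that has its own metadata entry.</t>
      <sourcecode type="python"><![CDATA[
PART_INDEX, TYPE, PREVIEW_INDEX = 1, 2, 6  # integer map keys
IMAGE, VIDEO = 2, 3                        # media type values

av_metadata = [
    {1: 2, 2: 3, 3: 1920, 4: 1080, 5: 37, 6: 4},  # video
    {1: 4, 2: 2, 3: 1920, 4: 1080},               # preview image
    {1: 7, 2: 1, 5: 9.45},                        # audio
]


def find_preview(entries, video_part_index):
    """Return the preview image entry for a video, or None."""
    by_part = {entry[PART_INDEX]: entry for entry in entries}
    video = by_part.get(video_part_index)
    if video is None or video.get(TYPE) != VIDEO:
        return None
    preview = by_part.get(video.get(PREVIEW_INDEX))
    if preview is None or preview.get(TYPE) != IMAGE:
        return None
    return preview
]]></sourcecode>
      <t>With this data, looking up the preview for part 2 returns the entry for part 4, while the same lookup for the audio clip (part 7) returns nothing.</t>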
    </section>
    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>TODO Security</t>
    </section>
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>TODO register the extension with IANA.</t>
    </section>
  </middle>
  <back>
    <references anchor="sec-normative-references">
      <name>Normative References</name>
      <reference anchor="I-D.ietf-mimi-content">
        <front>
          <title>More Instant Messaging Interoperability (MIMI) message content</title>
          <author fullname="Rohan Mahy" initials="R." surname="Mahy">
            <organization>Rohan Mahy Consulting Services</organization>
          </author>
          <date day="7" month="July" year="2025"/>
          <abstract>
            <t>   This document describes content semantics common in Instant Messaging
   (IM) systems and describes a profile suitable for instant messaging
   interoperability of messages end-to-end encrypted inside the MLS
   (Message Layer Security) Protocol.

            </t>
          </abstract>
        </front>
        <seriesInfo name="Internet-Draft" value="draft-ietf-mimi-content-07"/>
      </reference>
      <reference anchor="RFC2119" xml:base="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml">
        <front>
          <title>Key words for use in RFCs to Indicate Requirement Levels</title>
          <author fullname="S. Bradner" initials="S." surname="Bradner"/>
          <date month="March" year="1997"/>
          <abstract>
            <t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
          </abstract>
        </front>
        <seriesInfo name="BCP" value="14"/>
        <seriesInfo name="RFC" value="2119"/>
        <seriesInfo name="DOI" value="10.17487/RFC2119"/>
      </reference>
      <reference anchor="RFC8174" xml:base="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml">
        <front>
          <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
          <author fullname="B. Leiba" initials="B." surname="Leiba"/>
          <date month="May" year="2017"/>
          <abstract>
            <t>RFC 2119 specifies common key words that may be used in protocol specifications. This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the defined special meanings.</t>
          </abstract>
        </front>
        <seriesInfo name="BCP" value="14"/>
        <seriesInfo name="RFC" value="8174"/>
        <seriesInfo name="DOI" value="10.17487/RFC8174"/>
      </reference>
      <reference anchor="RFC8160">
        <front>
          <title>IUTF8 Terminal Mode in Secure Shell (SSH)</title>
          <author fullname="S. Tatham" initials="S." surname="Tatham"/>
          <author fullname="D. Tucker" initials="D." surname="Tucker"/>
          <date month="April" year="2017"/>
          <abstract>
            <t>This document specifies a new opcode in the Secure Shell terminal modes encoding. The new opcode describes the widely used IUTF8 terminal mode bit, which indicates that terminal I/O uses UTF-8 character encoding.</t>
          </abstract>
        </front>
        <seriesInfo name="RFC" value="8160"/>
        <seriesInfo name="DOI" value="10.17487/RFC8160"/>
      </reference>
    </references>
    <?line 165?>

<section numbered="false" anchor="acknowledgments">
      <name>Acknowledgments</name>
      <t>TODO acknowledge.</t>
    </section>
  </back>

</rfc>
