WG Working Group D. Habib Internet-Draft A. Joaquin Intended status: Standards Track True3D Technologies Inc. Expires: 10 March 2025 6 September 2024 VoxelVideo Format draft-habib-voxelvideo-00 Abstract This document proposes the VoxelVideo format, a file structure designed specifically for the efficient handling, playback, and livestreaming of 3D voxel-based videos. The format is intended for applications in gaming, virtual reality, live sports, and interactive media, providing a robust framework for managing complex 3D data with spatial precision and color fidelity. This document describes the current JSON-based version and outlines future plans to adopt a more efficent, compressed format. About This Document This note is to be removed before publishing as an RFC. The latest revision of this draft can be found at https://example.com/LATEST. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-habib-voxelvideo/. Discussion of this document takes place on the WG Working Group mailing list (mailto:WG@example.com), which is archived at https://example.com/WG. Source for this draft and an issue tracker can be found at https://github.com/voxelvideos/voxel-video-file-format. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Habib & Joaquin Expires 10 March 2025 [Page 1] Internet-Draft VoxelVideo Format September 2024 Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on 10 March 2025. Copyright Notice Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Conventions and Definitions . . . . . . . . . . . . . . . . . 3 3. VoxelVideo Format Overview . . . . . . . . . . . . . . . . . 3 4. File Structure and Key Components . . . . . . . . . . . . . . 3 4.1. Version . . . . . . . . . . . . . . . . . . . . . . . . . 4 4.2. Playback Metadata . . . . . . . . . . . . . . . . . . . . 4 4.3. Voxel Dimensions . . . . . . . . . . . . . . . . . . . . 4 4.4. Title . . . . . . . . . . . . . . . . . . . . . . . . . . 4 4.5. Blocks Array . . . . . . . . . . . . . . . . . . . . . . 4 5. Frame Types and Future Enhancements . . . . . . . . . . . . . 5 6. Interpretation of the Blocks Array . . . . . . . . . . . . . 5 7. Potential Use Cases . . . . . . . . . . . . . . . . . . . . . 5 8. Future Directions . . . . . . . . . . . . . . . . . . . . . . 6 9. Security Considerations . . . . . . . . . . . . . . . . . . . 6 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 6 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 6 11.1. Normative References . . . . . . . . . . . . . . . . . . 6 11.2. Informative References . . . . . . . . . . . . . . . . . 6 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 7 Habib & Joaquin Expires 10 March 2025 [Page 2] Internet-Draft VoxelVideo Format September 2024 1. Introduction The VoxelVideo format addresses the need for an efficient and scalable method to handle, render, and stream 3D voxel-based videos. Existing video formats are not optimized for the distinct characteristics of voxel data, such as spatial precision and color fidelity in three dimensions. The VoxelVideo format is tailored for use in applications like gaming, virtual reality, live sports, and interactive media, where real-time manipulation and playback of complex 3D content are essential. This document describes the JSON- based format and outlines future enhancements, including the adoption of a compressed binary format aimed at improving performance and reducing file sizes. 2. Conventions and Definitions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. This document uses the following terms: * *Voxel*: a three-dimensional pixel representing a point in 3D space with associated attributes such as color [AES]. * *I-frame (Intra-coded frame)*: A self-contained video frame that fully represents the voxel grid. * *P-frame (Predicted frame)*: A video frame which contains only information on the differences from previous frames. 3. VoxelVideo Format Overview The VoxelVideo format utilizes a JSON-based structure to organize 3D voxel data, facilitating efficient playback and manipulation. It is designed to be straightforward to implement, with a focus on clarity and accessibility in its initial version. Future iterations will shift towards a compressed binary format to enhance scalability and performance. 4. File Structure and Key Components The current structure of the VoxelVideo is as follows: Habib & Joaquin Expires 10 March 2025 [Page 3] Internet-Draft VoxelVideo Format September 2024 export default interface VoxelVideoData { Version: number; // Specifies the version of the file format Framerate: number; // Frames per second Framecount: number; // Total number of frames Duration: number; // Total duration of the video in seconds Dimensions: { // Dimensions of the voxel matrix x: number; // Width in voxels y: number; // Height in voxels z: number; // Depth in voxels }; Title: string; // Title of the video Blocks: (string | null)[][]; // 3D array representing the voxel grid } 4.1. Version The Version field ensures compatibility across different iterations of the format. 4.2. Playback Metadata Metadata fields like Framerate, Framecount, and Duration provide essential information for correct playback of the video content. 4.3. Voxel Dimensions Dimensions defines the size of the voxel space, specifying the width, height, and depth necessary for rendering. 4.4. Title The Title field serves to identify the content of the video, for easier cataloging and retrieval. 4.5. Blocks Array The Blocks array represents the 3D voxel grid as a nested array, where each inner array corresponds to a frame of the video. Within each frame, the arrays represent slices of the voxel space at varying depths, providing a structured representation of the 3D voxel data across the entire frame. Habib & Joaquin Expires 10 March 2025 [Page 4] Internet-Draft VoxelVideo Format September 2024 5. Frame Types and Future Enhancements Currently, the VoxelVideo format utilizes I-frames, where each frame is a complete voxel grid independent of other frames. Future versions will include P-frames, which will encode changes between frames to reduce file size and improve streaming efficiency in 3D environments. 6. Interpretation of the Blocks Array To interpret the Blocks array, read the voxel grid plane by plane, starting at the top-left corner of the first plane (height 0), proceeding row by row from left to right. After completing all rows of a plane, move up to the next plane and repeat the process until all planes are read. 7. Potential Use Cases The VoxelVideo format is suitable for scenarios requiring real-time interaction and manipulation of 3D video content. Use cases include immersive virtual reality environments where live 3D voxel video streaming can deliver spatially precise and dynamic visual data. The format is also applicable in education simulations that require high- fidelity 3D visualizations for teaching concepts in fields such as medicine, engineering, and environmental science. Interactive gaming environments can utilize the VoxelVideo format to represent 3D voxel data, allowing for the creation of fully manipulable virtual worlds that support complex interactions within the game environment. In additon, the VoxelVideo format may be used in live sports broadcasting to generate 3D replays and visualizations, enabling viewers to observe events from various perspectives and analyze the content interactively. This capability can provide additional insights beyond traditional 2D video formats. The VoxelVideo format supports livestreaming using Dynamic Adaptive Streaming over HTTP (DASH), similar to its application for 2D videos. This approach allows for efficient and scalable delivery of 3D voxel video content across various devices and network conditions by segmenting videos into smaller parts and encoding each segment at different quality levels. Habib & Joaquin Expires 10 March 2025 [Page 5] Internet-Draft VoxelVideo Format September 2024 8. Future Directions The current version of the VoxelVideo format (version 0.0), utilizes a JSON-based structure designed for simplicity and ease of use, facilitating initial development and experimentation. Future iterations will focus on enhancing performance and scalability by transitioning to a compressed binary format, aiming to reduce file sizes and improve data storage and retrieval efficiency. Additionally, future versions will explore the implementation of P-frames to encode changes between frames, further optimizing file size and playback performance. These enhancements are intended to expand the format's capabilities, providing a more robust solution for managing and delivering 3D voxel-based video content at scale. 9. Security Considerations This document does not specifically address security considerations. It is important for implementers of the VoxelVideo format to consider the potential security implications associated with processing untrusted voxel data. Implementers should account for the worst-case scenarios in terms of computational complexity, memory usage, and required processing resources when handling voxel data, especially in contexts like livestreaming where inputs may be dynamic and unverified. 10. IANA Considerations This document has no IANA actions. 11. References 11.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>. [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>. 11.2. Informative References [AES] Shchurova, C. I., "A methodology to design a 3D graphic editor for micro-modeling of fiber-reinforced composite parts", Journal Advances in Engineering Software, Volume 90, December 2015, Pages 76-82, 2015. Habib & Joaquin Expires 10 March 2025 [Page 6] Internet-Draft VoxelVideo Format September 2024 Authors' Addresses Daniel Habib True3D Technologies Inc. 15 Morris Ave Apt. 307 Long Branch, NJ, 07740 United States of America Phone: +1 908 812 8365 Email: daniel@quickvid.ai Alyssa Joaquin True3D Technologies Inc. Email: alyssa.joaquin@gmail.com Habib & Joaquin Expires 10 March 2025 [Page 7]