<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt"?>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude"
     ipr="trust200902"
     docName="draft-ailex-vap-legal-ai-provenance-00"
     category="exp"
     submissionType="independent"
     xml:lang="en"
     version="3">

  <front>
    <title abbrev="VAP-LAP">
      Verifiable AI Provenance (VAP) Framework and Legal AI Profile (LAP)
    </title>

    <seriesInfo name="Internet-Draft" value="draft-ailex-vap-legal-ai-provenance-00"/>

    <author surname="AILEX" fullname="AILEX" role="editor">
      <organization>AILEX Inc. / VeritasChain Standards Organization</organization>
      <address>
        <postal>
          <street>1-10-8 Dogenzaka, Shibuya-ku</street>
          <region>Tokyo</region>
          <code>150-0043</code>
          <country>Japan</country>
        </postal>
        <email>info@ailex.co.jp</email>
        <uri>https://ailex.co.jp</uri>
      </address>
    </author>

    <date year="2026" month="February" day="14"/>

    <area>Security</area>
    <workgroup>Internet Engineering Task Force</workgroup>

    <keyword>AI provenance</keyword>
    <keyword>verifiable AI</keyword>
    <keyword>legal AI</keyword>
    <keyword>audit trail</keyword>
    <keyword>hash chain</keyword>
    <keyword>EU AI Act</keyword>

    <abstract>
      <t>
        This document specifies the Verifiable AI Provenance (VAP) Framework,
        a cross-domain upper framework for cryptographically verifiable
        decision audit trails in high-risk AI systems, along with the Legal
        AI Profile (LAP), a domain-specific instantiation for legal AI and
        LegalTech systems.
      </t>
      <t>
        VAP defines common infrastructure including hash chain integrity,
        digital signatures, unified conformance levels (Bronze/Silver/Gold),
        external anchoring via the RFC 3161 Time-Stamp Protocol and
        compatible transparency services, a Completeness Invariant pattern
        guaranteeing no selective logging, a standardized Evidence Pack
        format for regulatory submission, and privacy-preserving
        verification protocols.
      </t>
      <t>
        LAP extends VAP for the judicial AI domain, addressing unique
        requirements including attorney oversight verification (Human
        Override Coverage), three-pipeline completeness invariants for legal
        consultation, document generation, and fact-checking, as well as
        privacy-preserving fields designed to maintain attorney-client
        privilege while enabling third-party auditability.
      </t>
    </abstract>
  </front>

  <middle>
    <!-- ============================================================ -->
    <section anchor="introduction">
      <name>Introduction</name>
      <t>
        The deployment of AI systems in high-risk domains -- including
        finance, healthcare, transportation, and the administration of
        justice -- creates a structural accountability gap. AI decisions
        that affect fundamental rights and societal infrastructure lack
        standardized, cryptographically verifiable audit trails that
        independent third parties can inspect.
      </t>
      <t>
        Current approaches rely on trust-based governance: AI providers
        assert that their systems are safe and well-logged, but no
        independent party can cryptographically verify these claims.
        The Verifiable AI Provenance (VAP) Framework addresses this gap
        by defining a "Verify, Don't Trust" architecture for AI decision
        provenance.
      </t>
      <t>
        This document defines two complementary specifications:
      </t>
      <ol>
        <li>
          VAP Framework (Part I): A cross-domain upper framework defining
          common infrastructure for verifiable AI provenance applicable to
          any high-risk AI domain.
        </li>
        <li>
          Legal AI Profile (LAP) (Part II): A domain-specific profile for
          legal AI systems, addressing requirements arising from professional
          regulation of attorneys and high-risk AI system governance.
        </li>
      </ol>

      <section anchor="scope">
        <name>Scope</name>
        <t>
          VAP targets AI systems where "system failure could cause
          significant and irreversible harm to human life, societal
          infrastructure, or democratic institutions." This intentionally
          strict scope distinguishes VAP from general-purpose logging
          frameworks.
        </t>
        <t>
          LAP specifically addresses legal AI systems that provide AI-powered
          legal consultation, document generation, and fact-checking services
          to licensed attorneys.
        </t>
      </section>

      <section anchor="design-philosophy">
        <name>Design Philosophy</name>
        <t>
          The core principle is "Verify, Don't Trust." Rather than relying on
          AI providers' claims about the safety and integrity of their
          systems, VAP enables independent, cryptographic verification of
          every AI decision's provenance, completeness, and human oversight.
        </t>
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="conventions">
      <name>Conventions and Definitions</name>
      <t>
        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY",
        and "OPTIONAL" in this document are to be interpreted as described
        in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when,
        and only when, they appear in all capitals, as shown here.
      </t>

      <section anchor="terminology">
        <name>Terminology</name>
        <dl>
          <dt>VAP</dt>
          <dd>Verifiable AI Provenance Framework - the cross-domain upper
          framework defined in this document.</dd>

          <dt>Profile</dt>
          <dd>A domain-specific instantiation of VAP (e.g., VCP for finance,
          CAP for content, LAP for legal).</dd>

          <dt>LAP</dt>
          <dd>Legal AI Profile - the judicial AI domain profile defined in
          this document.</dd>

          <dt>Provenance</dt>
          <dd>Cryptographically verifiable record of data origin, derivation,
          and history.</dd>

          <dt>Completeness Invariant</dt>
          <dd>A mathematical guarantee that every attempt event has exactly
          one corresponding outcome event.</dd>

          <dt>Evidence Pack</dt>
          <dd>A self-contained, signed package of provenance events suitable
          for regulatory submission and third-party audit.</dd>

          <dt>External Anchor</dt>
          <dd>Registration of a Merkle root hash with an external trusted
          timestamping service such as <xref target="RFC3161"/> or a
          compatible transparency log.</dd>

          <dt>Human Override</dt>
          <dd>An event recording a human professional's review, approval,
          modification, or rejection of an AI-generated output.</dd>

          <dt>Override Coverage</dt>
          <dd>The ratio of AI outputs reviewed by a human professional to
          total AI outputs, expressed as a percentage.</dd>

          <dt>Causal Link</dt>
          <dd>A reference from an outcome event to its originating attempt
          event, establishing referential integrity within a pipeline.</dd>
        </dl>
      </section>
    </section>

    <!-- ============================================================ -->
    <!-- PART I: VAP FRAMEWORK                                        -->
    <!-- ============================================================ -->
    <section anchor="vap-architecture">
      <name>VAP Framework Architecture</name>

      <section anchor="layer-model">
        <name>Layer Model</name>
        <t>
          VAP is organized into four core layers, a common infrastructure
          layer, and a domain profile layer:
        </t>
        <dl>
          <dt>Integrity Layer</dt>
          <dd>Hash chain, digital signatures, timestamps (REQUIRED for
          all levels).</dd>

          <dt>Provenance Layer</dt>
          <dd>Actor, input, context, action, and outcome recording
          (REQUIRED).</dd>

          <dt>Accountability Layer</dt>
          <dd>Operator identification, approval chain, delegation records
          (REQUIRED for operator_id; RECOMMENDED for approval chain).</dd>

          <dt>Traceability Layer</dt>
          <dd>Trace IDs, causal links, cross-profile references
          (REQUIRED for trace_id; OPTIONAL for cross-references).</dd>

          <dt>Common Infrastructure</dt>
          <dd>Conformance levels, external anchoring, completeness invariant,
          evidence packs, privacy-preserving verification, retention framework
          (availability depends on conformance level).</dd>

          <dt>Domain Profile Layer</dt>
          <dd>Domain-specific event types, data model extensions, regulatory
          mappings (defined per profile).</dd>
        </dl>
      </section>

      <section anchor="domain-profiles">
        <name>Domain Profiles</name>
        <t>
          VAP supports multiple domain profiles. Each profile MUST define:
        </t>
        <ol>
          <li>Event Types: Domain-specific event type taxonomy.</li>
          <li>Data Model Extensions: Additional fields beyond the VAP common
          event structure.</li>
          <li>Conformance Mapping: Mapping to VAP Bronze/Silver/Gold levels.</li>
          <li>Regulatory Alignment: Mapping to applicable regulations
          (informative).</li>
          <li>Completeness Invariant Application: How the completeness
          invariant applies to domain-specific event flows.</li>
        </ol>
        <t>
          Registered profiles include VCP (Finance), CAP (Content/Creative AI),
          and LAP (Legal AI, defined in Part II of this document). Additional
          profiles for automotive (DVP), medical (MAP), and public
          administration (PAP) domains are under development.
        </t>
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="cryptographic-foundation">
      <name>Cryptographic Foundation</name>

      <section anchor="algorithms">
        <name>Algorithm Requirements</name>
        <t>
          All VAP-conformant implementations MUST support the following
          cryptographic algorithms:
        </t>
        <table>
          <name>Required Cryptographic Algorithms</name>
          <thead>
            <tr><th>Category</th><th>Primary</th><th>Alternative</th><th>Post-Quantum (Future)</th></tr>
          </thead>
          <tbody>
            <tr><td>Hash</td><td>SHA-256</td><td>SHA-384, SHA-512</td><td>SHA3-256</td></tr>
            <tr><td>Signature</td><td>Ed25519 (RFC 8032)</td><td>ECDSA P-256</td><td>ML-DSA-65</td></tr>
            <tr><td>Encryption</td><td>AES-256-GCM</td><td>ChaCha20-Poly1305</td><td>ML-KEM-1024 (formerly Kyber-1024)</td></tr>
          </tbody>
        </table>
        <t>
          Implementations MUST include algorithm identifiers in all
          cryptographic fields to support crypto agility and future
          algorithm migration.
        </t>
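        <t>
          As a non-normative illustration, the Python sketch below shows
          one way a verifier might split an algorithm-prefixed value such
          as "sha256:..." or "ed25519:..." into its identifier and raw
          bytes. The helper name parse_tagged, the ALGORITHMS registry,
          and the hex encoding of hash values are assumptions made for
          this example and are not defined by this document; signatures
          are treated as Base64, as described in
          <xref target="digital-signatures"/>.
        </t>
        <sourcecode type="python"><![CDATA[
import base64
import binascii

# Hypothetical registry: prefix -> (payload decoder, length in bytes).
# Hex encoding for hash values is an assumption of this sketch.
ALGORITHMS = {
    "sha256": (bytes.fromhex, 32),
    "ed25519": (lambda s: base64.b64decode(s, validate=True), 64),
}

def parse_tagged(value):
    """Split "<algo>:<payload>" and decode the payload to raw bytes."""
    algo, sep, payload = value.partition(":")
    if not sep or algo not in ALGORITHMS:
        raise ValueError(f"unknown or missing algorithm id: {value!r}")
    decode, expected_len = ALGORITHMS[algo]
    try:
        raw = decode(payload)
    except (ValueError, binascii.Error) as exc:
        raise ValueError(f"malformed {algo} payload") from exc
    if len(raw) != expected_len:
        raise ValueError(f"{algo} payload must be {expected_len} bytes")
    return algo, raw
        ]]></sourcecode>
        <t>
          Rejecting unknown prefixes outright, rather than guessing an
          algorithm, keeps a later migration to a new algorithm explicit.
        </t>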
      </section>

      <section anchor="hash-chain">
        <name>Hash Chain Specification</name>
        <t>
          Events MUST be linked in a hash chain where each event's hash
          includes the hash of the preceding event:
        </t>
        <sourcecode type="pseudocode"><![CDATA[
EventHash[n] = SHA-256(
  Canonicalize(Event[n] without the event_hash and signature fields)
)

where Event[n].PrevHash = EventHash[n-1]
      Event[0].PrevHash = null  (genesis event)

Canonicalization MUST follow RFC 8785 (JSON Canonicalization Scheme).
        ]]></sourcecode>
        <t>
          Chain integrity verification MUST confirm:
        </t>
        <ol>
          <li>Each event's hash matches its recomputed hash.</li>
          <li>Each event's PrevHash matches the preceding event's EventHash.</li>
          <li>The genesis event has a null PrevHash.</li>
        </ol>
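        <t>
          The three checks above can be sketched in Python as follows.
          This is a non-normative illustration under several assumptions:
          events are flattened dictionaries rather than the full common
          event structure, digests are hex-encoded, both the event hash
          and the signature are left out of the hashed content (a hash
          cannot cover itself), and canonicalize() is a simplified
          stand-in for <xref target="RFC8785"/> that agrees with it only
          for simple data such as ASCII keys and string values. All
          function names are hypothetical.
        </t>
        <sourcecode type="python"><![CDATA[
import hashlib
import json

EXCLUDED = ("event_hash", "signature")

def canonicalize(obj):
    # Simplified stand-in for RFC 8785: sorted keys, no whitespace,
    # UTF-8 output. A real deployment needs a full JCS implementation.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def compute_event_hash(event):
    body = {k: v for k, v in event.items() if k not in EXCLUDED}
    return "sha256:" + hashlib.sha256(canonicalize(body)).hexdigest()

def append_event(chain, fields):
    """Link a new event to the chain tail and store its hash."""
    event = dict(fields)
    event["prev_hash"] = chain[-1]["event_hash"] if chain else None
    event["event_hash"] = compute_event_hash(event)
    chain.append(event)
    return event

def verify_chain(chain):
    """Return a list of (index, problem); an empty list means valid."""
    problems = []
    for i, event in enumerate(chain):
        # Check 1: stored hash equals the recomputed hash.
        if event.get("event_hash") != compute_event_hash(event):
            problems.append((i, "event hash mismatch"))
        # Checks 2 and 3: prev_hash links to the predecessor, and the
        # genesis event carries a null prev_hash.
        expected_prev = chain[i - 1]["event_hash"] if i > 0 else None
        if event.get("prev_hash") != expected_prev:
            problems.append((i, "prev_hash mismatch"))
    return problems
        ]]></sourcecode>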
      </section>

      <section anchor="digital-signatures">
        <name>Digital Signature Requirements</name>
        <t>
          Every event MUST be signed using Ed25519 (<xref target="RFC8032"/>).
          The signature MUST be computed over the event hash bytes:
        </t>
        <sourcecode type="pseudocode"><![CDATA[
Signature = Ed25519.Sign(PrivateKey, EventHash_bytes)
Encoded as: "ed25519:" + Base64(Signature)
        ]]></sourcecode>
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="common-event-structure">
      <name>Common Event Structure</name>
      <t>
        All VAP-conformant events MUST include the following fields:
      </t>
      <sourcecode type="json"><![CDATA[
{
  "vap_version": "1.2",
  "profile": {
    "id": "string (VCP|CAP|LAP|DVP|MAP|PAP|EIP)",
    "version": "semver string"
  },
  "header": {
    "event_id": "UUIDv7 (RFC 9562)",
    "chain_id": "UUIDv7",
    "prev_hash": "sha256:... | null (genesis)",
    "timestamp": "ISO 8601 with timezone",
    "event_type": "string (profile-specific)"
  },
  "provenance": {
    "actor": {
      "actor_id": "string",
      "actor_hash": "sha256:... (privacy-preserving)",
      "role": "string"
    },
    "input": { },
    "context": { },
    "action": { },
    "outcome": { }
  },
  "accountability": {
    "operator_id": "string",
    "last_approval_by": "string",
    "approval_timestamp": "ISO 8601"
  },
  "domain_payload": { },
  "security": {
    "event_hash": "sha256:...",
    "hash_algo": "SHA256",
    "signature": "ed25519:...",
    "sign_algo": "ED25519",
    "signer_id": "string"
  }
}
      ]]></sourcecode>
      <t>
        Event identifiers MUST use UUIDv7 (<xref target="RFC9562"/>) to
        ensure time-ordered sortability. JSON canonicalization MUST follow
        <xref target="RFC8785"/>.
      </t>
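      <t>
        For illustration only, a UUIDv7 value can be assembled from the
        field layout in <xref target="RFC9562"/> (48-bit Unix timestamp in
        milliseconds, 4-bit version, 12 random bits, 2-bit variant, 62
        random bits). The function name uuid7 and its ts_ms parameter are
        assumptions of this sketch; it does not add the optional
        monotonic counter, so ordering within a single millisecond is
        random.
      </t>
      <sourcecode type="python"><![CDATA[
import os
import time
import uuid

def uuid7(ts_ms=None):
    """Build a UUIDv7 from a millisecond timestamp and random bits."""
    if ts_ms is None:
        ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")    # 80 random bits
    rand_a = (rand >> 62) & 0xFFF                   # 12 bits
    rand_b = rand & ((1 << 62) - 1)                 # 62 bits
    value = (ts_ms & ((1 << 48) - 1)) << 80         # unix_ts_ms
    value |= 0x7 << 76                              # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                             # RFC 9562 variant
    value |= rand_b
    return uuid.UUID(int=value)
      ]]></sourcecode>
      <t>
        Because the timestamp occupies the most significant bits,
        identifiers created in different milliseconds sort in creation
        order both as integers and as canonical text strings.
      </t>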

      <section anchor="numeric-encoding">
        <name>Numeric Value Encoding</name>
        <t>
          Fields representing monetary amounts, cryptographic values, or
          high-precision measurements SHOULD be encoded as JSON strings
          rather than JSON numbers. This recommendation is motivated by:
        </t>
        <ul>
          <li>
            JSON numbers are commonly parsed as IEEE 754 double-precision
            floating-point values (RFC 8259, Section 6, notes that
            interoperability is only assured within that range and
            precision), and doubles cannot exactly represent all decimal
            values. For example, 0.1 + 0.2 != 0.3 in IEEE 754 arithmetic.
            Financial and legal contexts require exact decimal representation.
          </li>
          <li>
            JSON parsers across programming languages exhibit inconsistent
            behavior for large integers (exceeding 2^53) and high-precision
            decimals, leading to silent data corruption.
          </li>
          <li>
            Canonicalization stability: <xref target="RFC8785"/> defines
            specific rules for numeric serialization, but string encoding
            avoids parser-dependent numeric reformatting entirely, ensuring
            consistent hash computation across implementations.
          </li>
        </ul>
        <t>
          Fields where exact precision is not critical (e.g., event_count,
          token_count) MAY use JSON numbers. Implementations MUST document
          which fields use string encoding. Implementations that use JSON
          numbers for counters MUST ensure that any numeric-to-string
          conversion performed during canonicalization is deterministic
          and documented, to avoid signature verification ambiguity across
          languages and libraries.
        </t>
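        <t>
          The precision issues described above, and the string-encoding
          remedy, can be demonstrated with the short Python sketch below.
          It is non-normative; the field name "amount" and the helper
          names encode_amount and decode_amount are hypothetical.
        </t>
        <sourcecode type="python"><![CDATA[
import json
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly,
# so the float comparison fails while the decimal one holds.
float_sum_ok = (0.1 + 0.2 == 0.3)
decimal_sum_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Not every integer above 2^53 survives a round trip through a double.
big = 2**53 + 1
survives_double = (int(float(big)) == big)

def encode_amount(value):
    """Serialize a Decimal as a JSON string, never as a JSON number."""
    if not isinstance(value, Decimal):
        raise TypeError("amounts must be Decimal, not float")
    return format(value, "f")    # plain notation, no exponent

def decode_amount(text):
    return Decimal(text)

# Exact values travel as strings; a simple counter stays a number.
record = {"amount": encode_amount(Decimal("1234567.89")),
          "token_count": 42}
wire = json.dumps(record, sort_keys=True, separators=(",", ":"))
restored = decode_amount(json.loads(wire)["amount"])
        ]]></sourcecode>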
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="conformance-levels">
      <name>Conformance Levels</name>
      <t>
        VAP defines three conformance levels applicable to all domain
        profiles. Each level inherits all requirements of lower levels
        (Gold is a superset of Silver, which is a superset of Bronze).
      </t>

      <section anchor="bronze">
        <name>Bronze Level</name>
        <t>Target: SMEs, early adopters. Core capabilities:</t>
        <ul>
          <li>Event logging for all AI decision points (REQUIRED)</li>
          <li>SHA-256 hash chain linking all events (REQUIRED)</li>
          <li>Ed25519 digital signature on every event (REQUIRED)</li>
          <li>ISO 8601 timestamps with timezone (REQUIRED)</li>
          <li>UUIDv7 event identifiers (REQUIRED)</li>
          <li>Minimum 6-month retention (REQUIRED)</li>
          <li>JSON Schema validation (REQUIRED)</li>
        </ul>
      </section>

      <section anchor="silver">
        <name>Silver Level</name>
        <t>Target: Enterprise, regulated industries. Additional requirements
        beyond Bronze:</t>
        <ul>
          <li>Daily external anchoring to a trusted timestamping service
          conforming to <xref target="RFC3161"/> or an equivalent
          transparency log (REQUIRED)</li>
          <li>Completeness Invariant verification (REQUIRED)</li>
          <li>Evidence Pack generation capability (REQUIRED)</li>
          <li>Sensitive data hashing for privacy preservation (REQUIRED)</li>
          <li>Minimum 2-year retention (REQUIRED)</li>
          <li>Merkle tree construction for efficient proofs (REQUIRED)</li>
          <li>Third-party verification endpoint (REQUIRED)</li>
        </ul>
      </section>

      <section anchor="gold">
        <name>Gold Level</name>
        <t>Target: Highly regulated industries. Additional requirements
        beyond Silver:</t>
        <ul>
          <li>Hourly external anchoring (REQUIRED)</li>
          <li>Hardware security module (HSM) for signing key storage,
          validated to FIPS 140-2 or FIPS 140-3 (REQUIRED)</li>
          <li>Integration with a transparency log service such as IETF
          SCITT or equivalent (REQUIRED)</li>
          <li>Real-time audit API with sub-second latency (REQUIRED)</li>
          <li>Minimum 5-year retention (REQUIRED)</li>
          <li>24-hour incident response and evidence preservation (REQUIRED)</li>
          <li>Geographic redundancy, minimum 2 regions (REQUIRED)</li>
          <li>Annual third-party audit (REQUIRED)</li>
          <li>Crypto-shredding support (REQUIRED)</li>
        </ul>
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="external-anchoring">
      <name>External Anchoring</name>
      <t>
        External anchoring proves that events existed no later than a
        specific point in time, which makes backdating, forward-dating,
        and log forking detectable.
      </t>

      <section anchor="anchoring-services">
        <name>Anchoring Service Types</name>
        <t>
          VAP defines an abstract anchoring interface that can be realized
          by multiple service types. The baseline anchoring service is
          <xref target="RFC3161"/> Time-Stamp Authority (TSA). Additional
          service types include transparency logs and public blockchains.
        </t>
        <dl>
          <dt>RFC 3161 TSA (Baseline)</dt>
          <dd>Traditional enterprise timestamping via X.509 PKI
          (<xref target="RFC3161"/>). This is the normative baseline.
          Trust model: CA trust hierarchy.</dd>

          <dt>Transparency Log (e.g., IETF SCITT)</dt>
          <dd>Append-only transparency logs providing public verifiability.
          IETF SCITT (<xref target="IETF-SCITT"/>) is one such service;
          implementations MAY use any transparency log providing equivalent
          guarantees. Trust model: public append-only log.</dd>

          <dt>Blockchain</dt>
          <dd>Bitcoin or Ethereum anchoring for maximum decentralization.
          Trust model: PoW/PoS consensus. This option is non-normative
          and provided for environments requiring decentralized trust.</dd>
        </dl>
        <t>
          Gold Level implementations MUST use at least one transparency
          log service (such as SCITT) or equivalent, in addition to or
          instead of RFC 3161 TSA. Implementations SHOULD use multiple
          independent anchoring services for critical deployments.
        </t>
      </section>

      <section anchor="anchor-record">
        <name>Anchor Record Format</name>
        <sourcecode type="json"><![CDATA[
{
  "anchor_id": "UUIDv7",
  "anchor_type": "RFC3161 | TRANSPARENCY_LOG | BLOCKCHAIN",
  "merkle_root": "sha256:...",
  "event_count": 1000,
  "first_event_id": "UUIDv7",
  "last_event_id": "UUIDv7",
  "first_event_timestamp": "ISO 8601",
  "last_event_timestamp": "ISO 8601",
  "anchor_timestamp": "ISO 8601",
  "anchor_proof": "Base64-encoded proof",
  "service_endpoint": "https://tsa.example.com"
}
        ]]></sourcecode>
      </section>

      <section anchor="merkle-tree">
        <name>Merkle Tree Construction</name>
        <t>
          Events MUST be batched into a binary Merkle hash tree for
          efficient anchoring and selective disclosure. The tree
          construction uses SHA-256 as the hash function and follows
          a standard binary tree structure:
        </t>
        <sourcecode type="pseudocode"><![CDATA[
Leaf[i]      = SHA-256(EventHash[i])
Interior[j]  = SHA-256(Left_child || Right_child)
MerkleRoot   = Interior[root]
        ]]></sourcecode>
        <t>
          If the number of leaves is not a power of two, the last leaf
          MUST be duplicated to complete the tree. The resulting Merkle
          root is submitted to the external anchoring service.
        </t>
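        <t>
          Implementations MAY instead follow the tree construction
          specified in <xref target="RFC6962"/> (Certificate Transparency)
          or any equivalent binary Merkle tree construction that produces
          deterministic, verifiable inclusion proofs. Because the RFC 6962
          construction prefixes leaf and interior node inputs with distinct
          bytes and does not duplicate leaves, it yields different roots
          from the construction above; a verifier needs to know which
          construction was used.
        </t>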
        <t>
          Merkle inclusion proofs enable selective disclosure: a verifier
          can confirm that a specific event is included in an anchored
          batch without accessing other events in the batch.
        </t>
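        <t>
          A minimal, non-normative Python sketch of the leaf and interior
          hashing rules and of inclusion-proof verification follows. It
          assumes that event hashes are raw 32-byte digests and that
          padding repeats the last leaf until the leaf count is a power
          of two, which is one reading of the duplication rule. It does
          not implement the <xref target="RFC6962"/> construction. All
          function names are hypothetical.
        </t>
        <sourcecode type="python"><![CDATA[
import hashlib

def sha256(data):
    return hashlib.sha256(data).digest()

def build_levels(event_hashes):
    """Return every tree level, leaves first and root level last."""
    if not event_hashes:
        raise ValueError("cannot build a tree over zero events")
    leaves = [sha256(h) for h in event_hashes]   # Leaf[i]
    while len(leaves) & (len(leaves) - 1):       # not a power of two
        leaves.append(leaves[-1])                # repeat the last leaf
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sha256(prev[i] + prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return levels

def merkle_root(event_hashes):
    return build_levels(event_hashes)[-1][0]

def inclusion_proof(event_hashes, index):
    """Sibling hashes from leaf to root, each tagged with its side."""
    proof = []
    for level in build_levels(event_hashes)[:-1]:
        sibling = index ^ 1
        side = "L" if sibling < index else "R"
        proof.append((side, level[sibling]))
        index //= 2
    return proof

def verify_inclusion(event_hash, proof, root):
    """Recompute the root from one event hash and its proof."""
    node = sha256(event_hash)
    for side, sibling in proof:
        if side == "L":
            node = sha256(sibling + node)
        else:
            node = sha256(node + sibling)
    return node == root
        ]]></sourcecode>
        <t>
          The verifier in this sketch needs only the event hash, the
          proof, and the anchored root, so the other events in the batch
          stay undisclosed.
        </t>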
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="completeness-invariant">
      <name>Completeness Invariant</name>
      <t>
        The Completeness Invariant is a mathematical guarantee that every
        "attempt" event has exactly one corresponding "outcome" event.
        This prevents selective logging -- the omission of inconvenient
        records.
      </t>
      <t>General form:</t>
      <sourcecode type="pseudocode"><![CDATA[
For each pipeline P:
  Count(P_ATTEMPT) = Count(P_SUCCESS)
                   + Count(P_DENY)
                   + Count(P_ERROR)
      ]]></sourcecode>

      <t>The invariant enforces three properties:</t>
      <dl>
        <dt>Completeness</dt>
        <dd>Every ATTEMPT has an outcome. Violation indicates missing events.</dd>
        <dt>Uniqueness</dt>
        <dd>Each ATTEMPT has exactly one outcome. Violation indicates
        duplicate records.</dd>
        <dt>Referential Integrity</dt>
        <dd>Every outcome contains a causal link to its originating ATTEMPT.
        Violation indicates orphan events.</dd>
      </dl>

      <t>
        Domain profiles MUST specify which event types constitute attempts
        and outcomes for the invariant. Each outcome event MUST contain
        a causal link field referencing the originating attempt event's
        identifier. Verification SHOULD account for a configurable grace
        period for in-flight operations.
      </t>
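      <t>
        The three properties and the grace period can be checked for a
        single pipeline roughly as in the following non-normative Python
        sketch. The dictionary keys ("event_id", "kind", "timestamp",
        "attempt_id"), the numeric timestamps in seconds, and the function
        name check_completeness are assumptions made for this example;
        a profile would map its own event types onto the attempt and
        outcome kinds.
      </t>
      <sourcecode type="python"><![CDATA[
from collections import Counter

OUTCOME_KINDS = {"SUCCESS", "DENY", "ERROR"}

def check_completeness(events, now, grace_seconds=0):
    """Check one pipeline; every list empty means the invariant holds.

    Each event is a dict with hypothetical keys: "event_id", "kind"
    ("ATTEMPT" or an outcome kind), "timestamp" (seconds) and, for
    outcomes, "attempt_id" (the causal link).
    """
    attempts = {e["event_id"]: e for e in events
                if e["kind"] == "ATTEMPT"}
    outcomes = [e for e in events if e["kind"] in OUTCOME_KINDS]
    per_attempt = Counter(e.get("attempt_id") for e in outcomes)

    # Completeness: attempts without an outcome, ignoring attempts
    # that are still inside the grace period (in-flight operations).
    missing = [aid for aid, a in attempts.items()
               if per_attempt[aid] == 0
               and now - a["timestamp"] > grace_seconds]
    # Uniqueness: attempts with more than one outcome.
    duplicates = [aid for aid in attempts if per_attempt[aid] > 1]
    # Referential integrity: outcomes whose causal link is dangling.
    orphans = [e["event_id"] for e in outcomes
               if e.get("attempt_id") not in attempts]
    return {"missing": missing, "duplicates": duplicates,
            "orphans": orphans}
      ]]></sourcecode>
      <t>
        Comparing counts alone would let a duplicate outcome mask a
        missing one, which is why the sketch matches outcomes to attempts
        through the causal link instead.
      </t>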
    </section>

    <!-- ============================================================ -->
    <section anchor="evidence-pack">
      <name>Evidence Pack Specification</name>
      <t>
        An Evidence Pack is a self-contained, signed package of provenance
        events suitable for regulatory submission and third-party audit.
      </t>

      <section anchor="pack-structure">
        <name>Pack Structure</name>
        <t>An Evidence Pack MUST contain:</t>
        <ul>
          <li>manifest.json: Pack metadata and integrity information</li>
          <li>events/: Event batches (max 10,000 events per file)</li>
          <li>anchors/: External anchor records</li>
          <li>merkle/: Merkle tree structure and selective disclosure proofs</li>
          <li>keys/: Public keys for signature verification</li>
          <li>signatures/: Pack-level signature</li>
        </ul>
      </section>

      <section anchor="pack-manifest">
        <name>Pack Manifest</name>
        <t>
          The manifest MUST include the following fields, subject to the
          conformance-level conditions noted for each field:
        </t>
        <dl>
          <dt>pack_id (REQUIRED)</dt>
          <dd>UUIDv7 uniquely identifying this Evidence Pack.</dd>
          <dt>vap_version (REQUIRED)</dt>
          <dd>VAP framework version (e.g., "1.2").</dd>
          <dt>profile (REQUIRED)</dt>
          <dd>Object containing profile id and version.</dd>
          <dt>conformance_level (REQUIRED)</dt>
          <dd>"Bronze", "Silver", or "Gold".</dd>
          <dt>generated_at (REQUIRED)</dt>
          <dd>ISO 8601 timestamp of pack generation.</dd>
          <dt>time_range (REQUIRED)</dt>
          <dd>Object with start and end ISO 8601 timestamps.</dd>
          <dt>statistics (REQUIRED)</dt>
          <dd>Object containing total_events and events_by_type breakdown.</dd>
          <dt>completeness_verification (REQUIRED for Silver+)</dt>
          <dd>Object containing invariant_type, invariant_valid boolean,
          and per-pipeline results.</dd>
          <dt>integrity (REQUIRED)</dt>
          <dd>Object containing checksums (SHA-256 per file), merkle_root,
          and pack_hash.</dd>
          <dt>external_anchors (REQUIRED for Silver+)</dt>
          <dd>Array of anchor records referencing this pack's time range.</dd>
        </dl>
        <t>
          The manifest MAY include additional profile-specific fields as
          defined by the domain profile specification.
        </t>
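        <t>
          As a non-normative illustration, the level-dependent field
          requirements listed above could be checked as in the Python
          sketch below. The function name missing_manifest_fields and the
          constant names are hypothetical, and the sketch checks only for
          the presence of top-level fields, not their contents.
        </t>
        <sourcecode type="python"><![CDATA[
ALWAYS_REQUIRED = (
    "pack_id", "vap_version", "profile", "conformance_level",
    "generated_at", "time_range", "statistics", "integrity",
)
SILVER_PLUS_REQUIRED = ("completeness_verification", "external_anchors")
LEVELS = ("Bronze", "Silver", "Gold")

def missing_manifest_fields(manifest):
    """Return the names of required manifest fields that are absent."""
    level = manifest.get("conformance_level")
    if level is not None and level not in LEVELS:
        raise ValueError(f"unrecognized conformance_level: {level!r}")
    required = list(ALWAYS_REQUIRED)
    if level in ("Silver", "Gold"):
        required.extend(SILVER_PLUS_REQUIRED)
    return [name for name in required if name not in manifest]
        ]]></sourcecode>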
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="privacy-verification">
      <name>Privacy-Preserving Verification</name>
      <t>
        VAP enables verification of system integrity without disclosure
        of sensitive data. This is achieved through:
      </t>
      <ol>
        <li>Hash-based attestation: Sensitive fields are stored as
        cryptographic hashes, enabling existence verification without
        content disclosure.</li>
        <li>Selective disclosure via Merkle proofs: Individual events can
        be proven to exist within an Evidence Pack without revealing other
        events.</li>
        <li>Per-tenant salting: Hash salts are unique per tenant to prevent
        cross-tenant correlation attacks.</li>
      </ol>
      <t>
        This mechanism is particularly critical for LAP, where
        attorney-client privilege prevents disclosure of consultation
        content while still requiring verifiable audit trails.
      </t>
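      <t>
        Hash-based attestation with per-tenant salting might look like the
        following non-normative Python sketch. The construction shown
        (SHA-256 over a fixed-length 32-byte salt followed by the UTF-8
        text), the hex encoding, and the function names are assumptions
        of this example rather than requirements of this document.
      </t>
      <sourcecode type="python"><![CDATA[
import hashlib
import hmac
import secrets

SALT_BYTES = 32

def new_tenant_salt():
    """Generate a random salt to be stored once per tenant."""
    return secrets.token_bytes(SALT_BYTES)

def attest(tenant_salt, sensitive_text):
    """Return a salted digest that stands in for the sensitive text."""
    if len(tenant_salt) != SALT_BYTES:
        raise ValueError("tenant salt must be 32 bytes")
    data = tenant_salt + sensitive_text.encode("utf-8")
    return "sha256:" + hashlib.sha256(data).hexdigest()

def matches(tenant_salt, candidate_text, stored):
    """Check a disclosed text against a stored digest."""
    candidate = attest(tenant_salt, candidate_text)
    return hmac.compare_digest(candidate, stored)
      ]]></sourcecode>
      <t>
        Identical text hashed under two tenants' salts yields unrelated
        digests, which is what blocks cross-tenant correlation. Short or
        guessable inputs remain open to guessing by anyone who holds the
        salt.
      </t>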
    </section>

    <!-- ============================================================ -->
    <section anchor="retention-framework">
      <name>Retention Framework</name>
      <table>
        <name>Retention Requirements by Conformance Level</name>
        <thead>
          <tr><th>Level</th><th>Events</th><th>Anchor Records</th><th>Evidence Packs</th><th>Keys</th></tr>
        </thead>
        <tbody>
          <tr><td>Bronze</td><td>6 months</td><td>N/A</td><td>On-demand</td><td>1 year after last use</td></tr>
          <tr><td>Silver</td><td>2 years</td><td>5 years</td><td>2 years</td><td>3 years after last use</td></tr>
          <tr><td>Gold</td><td>5 years</td><td>10 years</td><td>5 years</td><td>7 years after last use</td></tr>
        </tbody>
      </table>
      <t>
        Retention periods MUST be extended upon any of the following:
        notification of a regulatory investigation, a legal hold order,
        a security or safety incident, or a third-party audit request.
      </t>
      <t>
        Domain profiles MAY specify extended retention periods beyond the
        VAP baseline where domain-specific regulations require longer
        retention (see <xref target="lap-conformance"/> for LAP extensions).
      </t>
      <t>
        For privacy regulation compliance (e.g., <xref target="GDPR"/> "right to be
        forgotten"), implementations at Silver level and above SHOULD
        support crypto-shredding: encrypting personal data with per-user
        keys and deleting those keys to render the data cryptographically
        unrecoverable while preserving hash chain integrity.
      </t>
    </section>

    <!-- ============================================================ -->
    <section anchor="third-party-verification">
      <name>Third-Party Verification Protocol</name>
      <t>Verification is available at three access levels:</t>
      <dl>
        <dt>Public</dt>
        <dd>Access to Merkle roots only. Verifies anchor existence.</dd>
        <dt>Auditor</dt>
        <dd>Access to Evidence Packs. Full chain and completeness verification.</dd>
        <dt>Regulator</dt>
        <dd>Real-time API access (Gold level). Live monitoring capability.</dd>
      </dl>
      <t>Verification steps:</t>
      <ol>
        <li>Anchor Verification: Confirm that the Merkle root is registered
        with the external timestamping service or transparency log.</li>
        <li>Chain Verification: Validate hash chain integrity from genesis
        to latest event.</li>
        <li>Signature Verification: Authenticate all events with public keys.</li>
        <li>Completeness Verification: Check invariant for the time period.</li>
        <li>Selective Query (optional): Verify specific events via Merkle proofs.</li>
      </ol>
    </section>

    <!-- ============================================================ -->
    <!-- PART II: LEGAL AI PROFILE (LAP)                              -->
    <!-- ============================================================ -->
    <section anchor="lap-overview">
      <name>Legal AI Profile (LAP) Overview</name>
      <t>
        The Legal AI Profile (LAP) is a VAP domain profile for judicial AI
        and LegalTech systems. LAP addresses unique challenges in the legal
        domain:
      </t>
      <dl>
        <dt>Unauthorized Practice of Law Risk</dt>
        <dd>Proving that AI does not independently practice law, through
        HUMAN_OVERRIDE events documenting attorney oversight.</dd>

        <dt>Hallucination</dt>
        <dd>Recording fact-check provenance through LEGAL_FACTCHECK events
        with citation chain verification.</dd>

        <dt>Selective Logging</dt>
        <dd>Preventing omission of inconvenient AI outputs through
        three-pipeline Completeness Invariant.</dd>

        <dt>Attorney-Client Privilege</dt>
        <dd>Maintaining confidentiality through privacy-preserving fields
        (prompt hashes instead of raw content).</dd>

        <dt>Accountability Ambiguity</dt>
        <dd>Recording "who, when, and on what basis" through the
        Accountability Layer.</dd>
      </dl>

      <section anchor="lap-registration">
        <name>Profile Registration</name>
        <table>
          <name>LAP Profile Registration</name>
          <thead>
            <tr><th>Field</th><th>Value</th></tr>
          </thead>
          <tbody>
            <tr><td>Profile ID</td><td>LAP</td></tr>
            <tr><td>Full Name</td><td>Legal AI Profile</td></tr>
            <tr><td>Domain</td><td>Legal AI / LegalTech</td></tr>
            <tr><td>Regulatory Scope</td><td>Attorney regulation, AI governance (informative)</td></tr>
            <tr><td>Time Precision</td><td>Second</td></tr>
            <tr><td>Profile Version</td><td>0.2.0</td></tr>
          </tbody>
        </table>
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="lap-event-taxonomy">
      <name>LAP Event Type Taxonomy</name>
      <t>
        LAP defines three functional pipelines and one cross-cutting
        control event type:
      </t>

      <section anchor="pipeline-1">
        <name>Pipeline 1: Legal Query</name>
        <t>AI-powered legal consultation:</t>
        <ul>
          <li>LEGAL_QUERY_ATTEMPT: Question submission to AI</li>
          <li>LEGAL_QUERY_RESPONSE: AI response generated successfully</li>
          <li>LEGAL_QUERY_DENY: Response refused (content filter, unauthorized role)</li>
          <li>LEGAL_QUERY_ERROR: System error (API failure, timeout)</li>
        </ul>
      </section>

      <section anchor="pipeline-2">
        <name>Pipeline 2: Document Generation</name>
        <t>AI-assisted legal document drafting:</t>
        <ul>
          <li>LEGAL_DOC_ATTEMPT: Document generation request</li>
          <li>LEGAL_DOC_RESPONSE: Document generated successfully</li>
          <li>LEGAL_DOC_DENY: Generation refused (insufficient consent, unauthorized)</li>
          <li>LEGAL_DOC_ERROR: System error (API failure, parse error)</li>
        </ul>
      </section>

      <section anchor="pipeline-3">
        <name>Pipeline 3: Fact Check</name>
        <t>AI-powered legal fact verification:</t>
        <ul>
          <li>LEGAL_FACTCHECK_ATTEMPT: Fact-check request</li>
          <li>LEGAL_FACTCHECK_RESPONSE: Fact-check completed</li>
          <li>LEGAL_FACTCHECK_DENY: Fact-check refused (OPTIONAL)</li>
          <li>LEGAL_FACTCHECK_ERROR: System error</li>
        </ul>
        <t>
          Implementations MAY define LEGAL_FACTCHECK_DENY for cases where
          a fact-check request is refused due to rate limiting, insufficient
          permissions, or consent constraints. The deny_reason field SHOULD
          distinguish these from system errors.
        </t>
        <t>
          If an implementation does not support LEGAL_FACTCHECK_DENY,
          refusal conditions MUST be recorded as LEGAL_FACTCHECK_ERROR
          with a deny_equivalent indicator set to true in the error
          detail, ensuring the Completeness Invariant is maintained.
        </t>
      </section>

      <section anchor="human-override">
        <name>Cross-Cutting: Human Override</name>
        <t>
          HUMAN_OVERRIDE events record an attorney's review of any AI output:
        </t>
        <ul>
          <li>APPROVE: Attorney confirms AI output without modification</li>
          <li>MODIFY: Attorney edits AI output (modification hash recorded)</li>
          <li>REJECT: Attorney rejects AI output entirely</li>
        </ul>
        <t>
          HUMAN_OVERRIDE events reference the target outcome event via
          target_event_id (establishing a causal link) and include the
          attorney's identity (bar number hash), override type, and
          optional modification details.
        </t>
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="lap-completeness">
      <name>LAP Completeness Invariant</name>
      <t>
        LAP applies the Completeness Invariant independently to all three
        pipelines:
      </t>
      <sourcecode type="pseudocode"><![CDATA[
For each pipeline P in {QUERY, DOC, FACTCHECK}:

  Count(LEGAL_{P}_ATTEMPT)
  = Count(LEGAL_{P}_RESPONSE)
  + Count(LEGAL_{P}_DENY)    [if supported]
  + Count(LEGAL_{P}_ERROR)

Expanded (each event type name denotes its count):

  LEGAL_QUERY_ATTEMPT = LEGAL_QUERY_RESPONSE
                      + LEGAL_QUERY_DENY
                      + LEGAL_QUERY_ERROR

  LEGAL_DOC_ATTEMPT   = LEGAL_DOC_RESPONSE
                      + LEGAL_DOC_DENY
                      + LEGAL_DOC_ERROR

  LEGAL_FACTCHECK_ATTEMPT = LEGAL_FACTCHECK_RESPONSE
                          + LEGAL_FACTCHECK_DENY  [if supported]
                          + LEGAL_FACTCHECK_ERROR
      ]]></sourcecode>
      <t>
        For implementations that do not support LEGAL_FACTCHECK_DENY,
        the invariant simplifies to ATTEMPT = RESPONSE + ERROR for
        Pipeline 3. Refusal conditions recorded as ERROR with
        deny_equivalent MUST be counted toward the invariant.
      </t>
      <t>
        Each outcome event MUST contain a causal link field referencing
        the originating attempt event's identifier, ensuring referential
        integrity can be verified independently of event ordering.
      </t>
    </section>

    <!-- ============================================================ -->
    <section anchor="lap-override-coverage">
      <name>Override Coverage Metric</name>
      <t>
        HUMAN_OVERRIDE events fall outside the Completeness Invariant.
        LAP instead tracks them through Override Coverage, a critical
        operational metric:
      </t>
      <sourcecode type="pseudocode"><![CDATA[
Override Coverage =
  Count(HUMAN_OVERRIDE) /
  (Count(LEGAL_*_RESPONSE) + Count(LEGAL_*_DENY))
      ]]></sourcecode>
      <t>
        This metric quantifies the degree to which human professionals
        review AI outputs. In jurisdictions where regulations require
        that a licensed professional personally scrutinize AI-generated
        work products, this metric provides measurable evidence of
        compliance.
      </t>
      <table>
        <name>Override Coverage Assessment</name>
        <thead>
          <tr><th>Coverage</th><th>Assessment</th><th>Operational Implication</th></tr>
        </thead>
        <tbody>
          <tr><td>100%</td><td>Ideal</td><td>Full professional oversight of all AI outputs</td></tr>
          <tr><td>70-99%</td><td>Good</td><td>Majority reviewed; low-risk outputs may be excluded</td></tr>
          <tr><td>30-69%</td><td>Warning</td><td>Insufficient review; operational improvement recommended</td></tr>
          <tr><td>&lt;30%</td><td>Critical</td><td>Professional oversight requirements likely unmet</td></tr>
        </tbody>
      </table>
      <t>
        ERROR events are excluded from the denominator because they do
        not produce an output suitable for professional approval or
        rejection. Completeness of error handling is evaluated separately
        via the per-pipeline invariant, where ERROR is a first-class
        outcome type.
      </t>
    </section>

    <!-- ============================================================ -->
    <section anchor="lap-privacy">
      <name>LAP Privacy-Preserving Fields</name>
      <t>
        Legal AI handles extremely sensitive data protected by professional
        privilege. LAP extends VAP's privacy-preserving verification with
        the following hashed fields:
      </t>
      <table>
        <name>LAP Privacy-Preserving Fields</name>
        <thead>
          <tr><th>Original Data</th><th>Hash Field</th><th>Sensitive Content</th></tr>
        </thead>
        <tbody>
          <tr><td>User query text</td><td>PromptHash</td><td>Legal consultation content (privileged)</td></tr>
          <tr><td>AI response text</td><td>ResponseHash</td><td>AI-generated legal advice</td></tr>
          <tr><td>Document output</td><td>OutputHash</td><td>Generated legal documents</td></tr>
          <tr><td>Case number</td><td>CaseNumberHash</td><td>Case identifier (high specificity)</td></tr>
          <tr><td>Bar number</td><td>BarNumberHash</td><td>Professional registration number</td></tr>
          <tr><td>Party names</td><td>PartyHash</td><td>Personal information of parties</td></tr>
          <tr><td>Modification detail</td><td>ModificationHash</td><td>Professional's corrections</td></tr>
          <tr><td>Factcheck content</td><td>TargetContentHash</td><td>Content under verification</td></tr>
        </tbody>
      </table>
      <t>
        Hash computation uses per-tenant salts to prevent cross-tenant
        correlation. Third-party verifiers can confirm event existence
        and chain integrity without accessing privileged content.
      </t>
    </section>

    <!-- ============================================================ -->
    <section anchor="lap-conformance">
      <name>LAP Conformance Level Mapping</name>
      <table>
        <name>LAP Conformance Matrix</name>
        <thead>
          <tr><th>Requirement</th><th>Bronze</th><th>Silver</th><th>Gold</th></tr>
        </thead>
        <tbody>
          <tr><td>Hash Chain</td><td>Yes</td><td>Yes</td><td>Yes</td></tr>
          <tr><td>Digital Signature</td><td>Yes</td><td>Yes</td><td>Yes</td></tr>
          <tr><td>External Anchoring</td><td>No</td><td>Daily</td><td>Hourly</td></tr>
          <tr><td>Completeness Invariant</td><td>No</td><td>3 Pipelines</td><td>3 Pipelines</td></tr>
          <tr><td>Override Coverage Tracking</td><td>No</td><td>Yes</td><td>Yes (with alerts)</td></tr>
          <tr><td>Evidence Pack</td><td>No</td><td>Yes</td><td>Yes</td></tr>
          <tr><td>Privacy Hashing</td><td>No</td><td>Yes</td><td>Yes</td></tr>
          <tr><td>HSM</td><td>No</td><td>No</td><td>Yes</td></tr>
          <tr><td>Retention</td><td>6 months</td><td>3 years</td><td>10 years</td></tr>
          <tr><td>Real-time Audit API</td><td>No</td><td>No</td><td>Yes</td></tr>
        </tbody>
      </table>
      <t>
        LAP extends the standard VAP retention periods. Silver level
        requires 3 years (vs. VAP baseline 2 years) and Gold requires 10
        years (vs. VAP baseline 5 years). This extension is driven by
        the longer statutory limitation periods typical in legal
        proceedings across multiple jurisdictions.
      </t>
    </section>

    <!-- ============================================================ -->
    <section anchor="lap-regulatory">
      <name>LAP Regulatory Alignment (Informative)</name>
      <t>
        This section is entirely informative and non-normative. It
        illustrates how LAP audit trail capabilities can support
        compliance with various regulatory frameworks. Legal compliance
        determinations are jurisdiction-specific and require independent
        legal analysis.
      </t>

      <section anchor="attorney-regulation">
        <name>Attorney Professional Regulation</name>
        <t>
          Many jurisdictions restrict the practice of law to licensed
          attorneys. Where AI systems assist attorneys in legal work,
          regulations may require that the attorney personally review
          and take responsibility for AI-generated outputs. LAP's
          HUMAN_OVERRIDE events and Override Coverage metric can support
          demonstrating such oversight.
        </t>
        <t>
          As an example, the Japanese Ministry of Justice guideline
          (<xref target="MOJ-GUIDELINE"/>) establishes that AI-based legal
          services provided to attorneys are permissible when the attorney
          personally scrutinizes and modifies the output as necessary. LAP
          audit trails can help meet these expectations through:
        </t>
        <ul>
          <li>Actor.role and BarNumberHash: support verification that the
          user is a licensed attorney.</li>
          <li>HUMAN_OVERRIDE (APPROVE/MODIFY): supports demonstrating
          attorney scrutiny.</li>
          <li>ModificationHash: supports evidence of attorney modifications.</li>
        </ul>
      </section>

      <section anchor="eu-ai-act">
        <name>High-Risk AI System Governance</name>
        <t>
          Legal AI systems may be classified as high-risk under AI governance
          frameworks such as the EU AI Act (<xref target="EU-AI-ACT"/>),
          particularly under the Annex III category "Administration of
          justice and democratic processes". LAP at Silver level and above
          provides audit trail capabilities that can help satisfy
          record-keeping requirements, including:
        </t>
        <ul>
          <li>Automatic event logging (supports Article 12 logging requirements)</li>
          <li>Hash chain continuity (supports lifetime recording)</li>
          <li>HUMAN_OVERRIDE events (supports human oversight documentation)</li>
          <li>Causal links between events (supports traceability)</li>
        </ul>
        <t>
          The degree to which these capabilities satisfy specific regulatory
          requirements should be evaluated on a per-jurisdiction basis.
        </t>
      </section>
    </section>

    <!-- ============================================================ -->
    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>
        VAP-LAP implementations face several security considerations:
      </t>
      <dl>
        <dt>Key Compromise</dt>
        <dd>Compromise of signing keys allows event forgery. Bronze
        implementations SHOULD rotate keys annually. Silver implementations
        MUST rotate keys semi-annually. Gold implementations MUST store keys
        in an HSM and rotate them quarterly.</dd>

        <dt>Hash Collision Resistance</dt>
        <dd>SHA-256 provides 128-bit collision resistance, considered
        sufficient for the foreseeable future. Implementations MUST support
        algorithm migration (crypto agility) for post-quantum transition.</dd>

        <dt>Privacy Leakage</dt>
        <dd>Per-tenant salting prevents cross-tenant hash correlation.
        Implementations MUST NOT share salts across tenants. Event metadata
        (timestamps, event types, counts) may leak statistical information
        even when content is hashed.</dd>

        <dt>Availability Attacks</dt>
        <dd>Denial-of-service attacks against the logging infrastructure
        could prevent event recording, violating completeness. Gold level
        implementations MUST have geographic redundancy.</dd>

        <dt>External Anchor Trust</dt>
        <dd>The security of external anchoring depends on the trusted
        timestamping service. Implementations SHOULD use multiple
        independent anchoring services for critical deployments.</dd>

        <dt>Completeness Invariant Circumvention</dt>
        <dd>An adversary with write access to the event store could insert
        fabricated ERROR events to satisfy the invariant while hiding actual
        outcomes. External anchoring at Silver level and above mitigates
        this by making post-hoc insertion detectable.</dd>

        <dt>Clock and Time Source Integrity</dt>
        <dd>Timestamp rollback or clock skew can cause false completeness
        verification failures and undermine event ordering guarantees.
        Implementations SHOULD use monotonic time sources and SHOULD
        cross-validate local timestamps against external anchoring
        timestamps. External anchoring at Silver level and above provides
        an independent time reference.</dd>
      </dl>
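      <t>
        As a non-normative illustration of the cross-validation suggested
        under Clock and Time Source Integrity, the sketch below flags
        local timestamps that move backwards in chain order or that fall
        later than the external anchor covering them. The function name,
        the five-second tolerance, and the anomaly labels are assumptions
        of this example.
      </t>
      <sourcecode type="python"><![CDATA[
```python
from datetime import datetime, timedelta


def find_clock_anomalies(local_times, anchor_time,
                         tolerance=timedelta(seconds=5)):
    """Return (index, label) pairs for suspicious local timestamps.

    local_times: event timestamps in chain order (datetime objects,
                 all naive or all timezone-aware).
    anchor_time: time asserted by the external anchor for this batch.
    """
    anomalies = []
    previous = None
    for index, ts in enumerate(local_times):
        # A timestamp earlier than its predecessor suggests rollback.
        if previous is not None and ts < previous:
            anomalies.append((index, "rollback"))
        # Anchored data existed before the anchor time, so a later
        # local timestamp (beyond tolerance) indicates clock skew.
        if ts > anchor_time + tolerance:
            anomalies.append((index, "after_anchor"))
        previous = ts
    return anomalies
```
      ]]></sourcecode>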
    </section>

    <!-- ============================================================ -->
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>
        This document has no IANA actions at this time.
      </t>
      <t>
        Future versions of this document might request registration of
        a media type for VAP Evidence Pack manifests
        (e.g., "application/vnd.vap.evidence-pack+json") and an IANA
        registry for VAP Domain Profile identifiers. Until then, profile
        identifiers are managed by the VeritasChain Standards Organization
        (VSO). The initial registered profiles are VCP (Finance),
        CAP (Content/Creative AI), and LAP (Legal AI).
      </t>
    </section>

  </middle>

  <back>
    <references>
      <name>References</name>

      <references>
        <name>Normative References</name>

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8032.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8785.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9562.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3161.xml"/>

      </references>

      <references>
        <name>Informative References</name>

        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6962.xml"/>

        <reference anchor="EU-AI-ACT">
          <front>
            <title>Regulation (EU) 2024/1689 - Artificial Intelligence Act</title>
            <author>
              <organization>European Parliament and Council</organization>
            </author>
            <date year="2024"/>
          </front>
        </reference>

        <reference anchor="JAPAN-ATTORNEY-ACT">
          <front>
            <title>Attorney Act (Bengoshi-ho), Act No. 205 of 1949</title>
            <author>
              <organization>Government of Japan</organization>
            </author>
            <date year="1949"/>
          </front>
        </reference>

        <reference anchor="MOJ-GUIDELINE">
          <front>
            <title>Regarding the Relationship between AI-based Contract Document Support Services and Attorney Act Article 72</title>
            <author>
              <organization>Ministry of Justice, Japan</organization>
            </author>
            <date year="2023" month="August"/>
          </front>
        </reference>

        <reference anchor="JFBA-AI-GUIDANCE">
          <front>
            <title>Precautions Regarding the Use of Generative AI in Attorney Practice</title>
            <author>
              <organization>Japan Federation of Bar Associations</organization>
            </author>
            <date year="2025" month="September"/>
          </front>
        </reference>

        <reference anchor="IETF-SCITT">
          <front>
            <title>An Architecture for Trustworthy and Transparent Digital Supply Chains</title>
            <author>
              <organization>IETF SCITT Working Group</organization>
            </author>
            <date year="2024"/>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-ietf-scitt-architecture"/>
        </reference>

        <reference anchor="GDPR">
          <front>
            <title>Regulation (EU) 2016/679 - General Data Protection Regulation</title>
            <author>
              <organization>European Parliament and Council</organization>
            </author>
            <date year="2016"/>
          </front>
        </reference>

      </references>
    </references>

    <!-- ============================================================ -->
    <section anchor="comparison">
      <name>Profile Comparison</name>
      <table>
        <name>Comparison of VAP Domain Profiles</name>
        <thead>
          <tr><th>Aspect</th><th>VCP (Finance)</th><th>CAP (Content)</th><th>LAP (Legal)</th></tr>
        </thead>
        <tbody>
          <tr><td>Time Precision</td><td>Nanosecond</td><td>Second</td><td>Second</td></tr>
          <tr><td>Key Invariant</td><td>SIG to ORD</td><td>GEN_ATTEMPT to GEN/DENY/ERROR</td><td>3 Pipeline Invariants</td></tr>
          <tr><td>Unique Feature</td><td>Signal integrity</td><td>Safe Refusal (SRP)</td><td>Human Override Coverage</td></tr>
          <tr><td>Regulatory Focus</td><td>Financial regulation</td><td>Content regulation</td><td>Attorney regulation + AI governance</td></tr>
          <tr><td>Privacy Model</td><td>Trade data</td><td>Creative content</td><td>Professional privilege</td></tr>
          <tr><td>Retention (Gold)</td><td>5 years</td><td>5 years</td><td>10 years</td></tr>
        </tbody>
      </table>
    </section>

    <!-- ============================================================ -->
    <section anchor="changes">
      <name>Change Log</name>
      <t>This section tracks changes between Internet-Draft revisions and
      will be removed before publication.</t>
      <dl>
        <dt>draft-ailex-vap-legal-ai-provenance-00</dt>
        <dd>Initial submission.</dd>
      </dl>
    </section>

    <!-- ============================================================ -->
    <section anchor="acknowledgments" numbered="false">
      <name>Acknowledgments</name>
      <t>
        The VAP Framework and the LAP were developed with input from
        the CAP v1.0 Safe Refusal Provenance (SRP) design experience,
        VCP v1.1 operational feedback, regulatory engagement with
        legal practitioners, and open-source community contributions.
      </t>
      <t>
        The LAP v0.2 design draws on the AILEX SaaS reference implementation,
        the Ministry of Justice guideline on AI-based contract document
        support services and Article 72 of the Attorney Act
        (<xref target="MOJ-GUIDELINE"/>, <xref target="JAPAN-ATTORNEY-ACT"/>;
        August 2023), and the Japan Federation of Bar Associations guidance
        on generative AI in attorney practice
        (<xref target="JFBA-AI-GUIDANCE"/>; September 2025).
      </t>
    </section>
  </back>
</rfc>
