| Internet-Draft | SCITT-Refusal-Events | January 2026 |
| Kamimura | Expires 2 August 2026 | [Page] |
This document describes a SCITT-based mechanism for creating verifiable records of AI content refusal events. It defines how refusal decisions can be encoded as SCITT Signed Statements, registered with Transparency Services, and verified by third parties using Receipts.¶
This specification provides auditability of refusal decisions that are logged, not cryptographic proof that no unlogged generation occurred. It does not define content moderation policies, classification criteria, or what AI systems should refuse; it addresses only the audit trail mechanism.¶
This revision (-01) incorporates lessons from the January 2026 Grok NCII incident, aligns with the CAP-SRP v1.0 specification, and addresses emerging regulatory requirements including EU AI Act Article 12/50 and the Korea AI Basic Act.¶
This note is to be removed before publishing as an RFC.¶
The latest version of this document, along with implementation resources and examples, can be found at [CAP-SRP].¶
Discussion of this document takes place on the SCITT Working Group mailing list (scitt@ietf.org).¶
The companion specification CAP-SRP v1.0 [CAP-SRP-SPEC] provides a complete domain profile for content/creative AI systems.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 2 August 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document is NOT a content moderation policy. It does not prescribe what AI systems should or should not refuse to generate, nor does it define criteria for classifying requests as harmful. The mechanism described herein is agnostic to the reasons for refusal decisions; it provides only an interoperable format for recording that such decisions occurred. Policy decisions regarding acceptable content remain the domain of AI providers, regulators, and applicable law.¶
AI systems capable of generating content increasingly implement safety mechanisms to refuse requests deemed harmful, illegal, or policy-violating. However, these refusal decisions typically leave no verifiable audit trail. When a system refuses to generate content, the event vanishes—there is no receipt, no log entry accessible to external parties, and no mechanism for third-party verification.¶
This creates several problems:¶
The SCITT architecture [I-D.ietf-scitt-architecture] provides primitives—Signed Statements, Transparency Services, and Receipts—that can address this gap. This document describes how these primitives can be applied to AI refusal events.¶
The January 2026 Grok incident exposed the critical need for verifiable refusal mechanisms. xAI's generative AI system produced approximately 4.4 million images in 9 days, with external analysis indicating at least 41% were sexualized images and 2% depicted minors. This triggered unprecedented multi-jurisdictional enforcement:¶
When xAI asserted that moderation systems were functioning, no external party could verify this claim. The absence of verifiable refusal records meant regulators had to rely on provider self-reports, AI Forensics external testing, and user complaints rather than cryptographic proof.¶
This incident demonstrates that the problem is not detection accuracy alone, but verification capability. Even if an AI system has effective content classifiers, the inability to prove refusals occurred creates an accountability gap that this specification addresses.¶
Multiple jurisdictions are implementing AI transparency and logging requirements that this specification can help satisfy:¶
The EU AI Act (Regulation 2024/1689) establishes comprehensive logging requirements:¶
This specification's event model directly supports Article 12 compliance by providing tamper-evident logging with external anchoring for independent verification.¶
The Korea AI Basic Act (AI기본법), effective January 22, 2026, requires:¶
The completeness invariant defined in this specification provides a mechanism for demonstrating that AI systems make consistent decisions that can be explained and audited.¶
US regulations relevant to AI content provenance include:¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document describes:¶
This document does NOT define:¶
This specification provides auditability of refusal decisions that are logged, not cryptographic proof that no unlogged generation occurred. An AI system that bypasses logging entirely cannot be detected by this mechanism alone. Detection of such bypass requires external enforcement mechanisms (e.g., trusted execution environments, attestation) which are outside the scope of this document.¶
This profile does not require Transparency Services to enforce completeness invariants; such checks are performed by verifiers using application-level logic.¶
This Internet-Draft provides the IETF-track specification for verifiable AI refusal events using SCITT primitives. The companion CAP-SRP specification [CAP-SRP-SPEC] published by the VeritasChain Standards Organization provides:¶
Implementations may conform to this Internet-Draft alone for basic SCITT interoperability, or additionally conform to CAP-SRP for comprehensive content AI audit trail capabilities.¶
This document uses terminology from [I-D.ietf-scitt-architecture]. The following terms are specific to this profile:¶
This document focuses on refusal events because successful generation is already observable through content existence and downstream provenance mechanisms (e.g., C2PA manifests, watermarks). Refusal events, by contrast, are negative events that leave no external artifact unless explicitly logged. The GENERATE and ERROR outcomes are defined for completeness invariant verification but are not the primary focus of this specification.¶
This profile maps refusal event concepts directly to SCITT primitives, minimizing new terminology:¶
| This Document | CAP-SRP Term | SCITT Primitive |
|---|---|---|
| ATTEMPT | GEN_ATTEMPT | Signed Statement |
| DENY | GEN_DENY | Signed Statement |
| GENERATE | GEN | Signed Statement |
| ERROR | GEN_ERROR | Signed Statement |
| AI System | Issuer | Issuer |
| Inclusion Proof | Receipt | Receipt |
Refusal events are registered with a standard SCITT Transparency Service; this document does not define a separate log type.¶
This document uses domain-agnostic event type names (ATTEMPT, DENY, GENERATE, ERROR) to enable application across multiple AI domains. CAP-SRP uses domain-specific prefixes (GEN_ATTEMPT, GEN_DENY, GEN, GEN_ERROR) appropriate for content generation. Implementations targeting CAP-SRP conformance SHOULD use CAP-SRP event type names in the eventType field; implementations targeting broader SCITT interoperability MAY use the names defined in this document.¶
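As a non-normative illustration, the naming correspondence described above can be held in a small lookup table. The mapping values come from the terminology table in this section; the helper name `normalize_event_type` is hypothetical and is not defined by this document or by CAP-SRP.

```python
# Mapping from the domain-agnostic names used in this document to the
# CAP-SRP names, copied from the terminology mapping table.
TO_CAP_SRP = {
    "ATTEMPT": "GEN_ATTEMPT",
    "DENY": "GEN_DENY",
    "GENERATE": "GEN",
    "ERROR": "GEN_ERROR",
}
FROM_CAP_SRP = {cap: generic for generic, cap in TO_CAP_SRP.items()}


def normalize_event_type(name: str) -> str:
    """Return the domain-agnostic name for a value of the eventType field.

    Hypothetical helper: accepts either naming scheme and rejects
    anything else.
    """
    if name in TO_CAP_SRP:
        return name
    if name in FROM_CAP_SRP:
        return FROM_CAP_SRP[name]
    raise ValueError(f"unknown eventType: {name!r}")
```

A verifier that accepts both naming schemes could normalize `eventType` once on ingestion and apply the same checks to either form.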
This specification defines three conformance levels to accommodate different organizational capabilities and regulatory requirements. These levels align with the VAP Framework v1.2 [VAP-FRAMEWORK] conformance structure.¶
Minimum requirements for basic conformance:¶
Bronze level is suitable for voluntary transparency and early adopters.¶
All Bronze requirements, plus:¶
Silver level is recommended for organizations subject to EU AI Act Article 12 or similar regulations.¶
All Silver requirements, plus:¶
Gold level is required for Very Large Online Platforms (VLOPs) under DSA Article 37 and high-risk AI systems requiring maximum assurance.¶
A regulatory authority investigating AI system compliance needs to verify that a provider's stated content policies are actually enforced. Without verifiable refusal events, the regulator must trust provider self-reports. With this mechanism, regulators can request Evidence Packs for specified time ranges, verify ATTEMPT/Outcome completeness for logged events, confirm refusal decisions are anchored in an append-only log, and compare refusal statistics against external incident reports.¶
This directly addresses the verification gap exposed by the Grok incident, where regulators had no mechanism to independently verify provider claims about safety system effectiveness.¶
When investigating whether an AI system refused a specific request, investigators need to establish provenance. A Verifiable Refusal Record (ATTEMPT + DENY + Receipts) demonstrates that a specific request was received, classified as policy-violating, and refused, and that the refusal was logged with external timestamp anchoring.¶
AI service providers may need to demonstrate to stakeholders that safety mechanisms function as claimed. Verifiable refusal events enable statistical reporting on logged refusal rates, third-party verification of safety claims, auditable proof that specific requests were refused, and comparison against industry benchmarks.¶
In legal proceedings concerning AI-generated content, parties may need evidence that a system declined a request. Verifiable Refusal Records provide such evidence, subject to the limitation that they demonstrate logged refusals, not the absence of unlogged generation. Evidence Packs package this documentation with cryptographic integrity proofs; whether it is admissible depends on the rules of the forum.¶
Users who receive refusals may need proof that their request was processed. Refusal Receipts enable users to verify their request was logged, appeal refusal decisions with evidence, and demonstrate to third parties that they made a request and were refused (useful for content creators documenting compliance efforts).¶
This section defines requirements for implementations. To maximize interoperability while allowing implementation flexibility, a small set of core requirements use MUST; other requirements use SHOULD or MAY.¶
The completeness invariant is the central requirement of this profile:¶
Formal definition:¶
∑ ATTEMPT = ∑ GENERATE + ∑ DENY + ∑ ERROR

For any time window [T₁, T₂]:

  count(ATTEMPT where T₁ ≤ timestamp ≤ T₂) =
      count(GENERATE where T₁ ≤ timestamp ≤ T₂ + grace_period) +
      count(DENY where T₁ ≤ timestamp ≤ T₂ + grace_period) +
      count(ERROR where T₁ ≤ timestamp ≤ T₂ + grace_period)
¶
Verifiers SHOULD flag any logged ATTEMPT without a corresponding Outcome as potential evidence of incomplete logging or system failure.¶
Violation detection:¶
| Condition | Meaning | Implication |
|---|---|---|
| Attempts > Outcomes | Unmatched attempts | System may be hiding results |
| Outcomes > Attempts | Orphan outcomes | System may have fabricated refusals |
| Duplicate outcomes | Multiple outcomes per attempt | Data integrity failure |
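A verifier's application-level check for the three conditions in the table above might look like the following Python sketch. It is non-normative and rests on assumptions: events are dictionaries whose signatures have already been verified, they carry `eventType` and `eventId`, and Outcomes carry `attemptId` as in the example payloads in this document. The function name and result keys are hypothetical, and time windows and grace periods are not handled here.

```python
from collections import defaultdict

OUTCOME_TYPES = {"GENERATE", "DENY", "ERROR"}


def check_completeness(events):
    """Classify logged events according to the violation table.

    Returns unmatched attempts, orphan outcomes, and attempts that have
    more than one outcome. Only logged events are considered.
    """
    attempt_ids = set()
    outcomes_by_attempt = defaultdict(list)
    for ev in events:
        if ev["eventType"] == "ATTEMPT":
            attempt_ids.add(ev["eventId"])
        elif ev["eventType"] in OUTCOME_TYPES:
            outcomes_by_attempt[ev["attemptId"]].append(ev["eventId"])

    # Attempts > Outcomes: ATTEMPT events with no Outcome at all.
    unmatched = sorted(a for a in attempt_ids if a not in outcomes_by_attempt)
    # Outcomes > Attempts: Outcomes that reference an unknown ATTEMPT.
    orphans = sorted(
        outcome_id
        for attempt_id, outcome_ids in outcomes_by_attempt.items()
        if attempt_id not in attempt_ids
        for outcome_id in outcome_ids
    )
    # Duplicate outcomes: a known ATTEMPT with more than one Outcome.
    duplicates = sorted(
        attempt_id
        for attempt_id, outcome_ids in outcomes_by_attempt.items()
        if attempt_id in attempt_ids and len(outcome_ids) > 1
    )
    return {
        "unmatched_attempts": unmatched,
        "orphan_outcomes": orphans,
        "duplicate_outcomes": duplicates,
        "invariant_valid": not (unmatched or orphans or duplicates),
    }
```

Matching by `attemptId` rather than comparing totals catches cases where an unmatched attempt and an orphan outcome would cancel out in a pure count.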
This completeness invariant is defined at the event semantics level and applies only to logged events. It cannot detect ATTEMPT events that were never logged. Cryptographic detection of invariant violations depends on the properties of the underlying Transparency Service and verifier logic.¶
This profile does not require Transparency Services to enforce completeness invariants; such checks are performed by verifiers using application-level logic.¶
To protect against tampering, implementations SHOULD:¶
PrevHash chaining is RECOMMENDED but not required because append-only guarantees are primarily provided by the Transparency Service. PrevHash provides an additional, issuer-local integrity signal that can detect tampering even before Transparency Service registration.¶
SHA-256 for hashing and Ed25519 for signatures are RECOMMENDED. Other algorithms registered with COSE MAY be used. Implementations SHOULD support algorithm agility for future post-quantum cryptography migration.¶
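The PrevHash chaining described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the normative procedure: it assumes `eventHash` is computed over every field except `eventHash` itself, that the first event uses an all-zero `prevHash` (as the ATTEMPT example in this document suggests), and that sorted-key compact JSON is an acceptable stand-in for full canonicalization. Signing is omitted, and all function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "sha256:" + "0" * 64  # assumed prevHash of the first event


def canonical_bytes(event: dict) -> bytes:
    # Simplified stand-in for JSON canonicalization: sorted keys and
    # compact separators. It is not a complete RFC 8785 implementation.
    return json.dumps(
        event, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def compute_event_hash(event: dict) -> str:
    # Assumption: the hash covers all fields except eventHash itself.
    body = {k: v for k, v in event.items() if k != "eventHash"}
    return "sha256:" + hashlib.sha256(canonical_bytes(body)).hexdigest()


def append_event(chain: list, event: dict) -> dict:
    """Link a new event to the end of the chain and return the linked copy."""
    linked = dict(event)
    linked["prevHash"] = chain[-1]["eventHash"] if chain else GENESIS_HASH
    linked["eventHash"] = compute_event_hash(linked)
    chain.append(linked)
    return linked


def verify_chain(chain: list) -> bool:
    """Check every prevHash link and recompute every eventHash."""
    prev = GENESIS_HASH
    for ev in chain:
        if ev.get("prevHash") != prev:
            return False
        if ev.get("eventHash") != compute_event_hash(ev):
            return False
        prev = ev["eventHash"]
    return True
```

Because each `eventHash` covers the previous one, modifying or deleting an intermediate event breaks verification for that event or the one after it, before any Transparency Service is consulted.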
Refusal events may be triggered by harmful or sensitive content. To avoid the audit log becoming a repository of harmful content, implementations SHOULD:¶
The hash function SHOULD be collision-resistant to prevent an adversary from claiming a benign prompt hashes to the same value as a harmful one.¶
Hashing without salting may be vulnerable to dictionary attacks if an adversary has a list of candidate prompts. Mitigations include access controls on event queries, time-limited retention policies, and monitoring for bulk query patterns. Salting may provide additional protection but introduces complexity; if used, implementations must ensure verification remains possible without requiring disclosure of the salt to third-party verifiers.¶
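The dictionary-attack concern can be made concrete with a short Python sketch. It is illustrative only: the `sha256:` prefix follows the example payloads in this document, while the keyed variant (HMAC-SHA-256 with a secret salt), its `hmac-sha256:` prefix, and all function names are assumptions, not constructions defined by this specification.

```python
import hashlib
import hmac


def prompt_hash(prompt: str) -> str:
    """Unsalted hash of the prompt text, in the style of promptHash."""
    return "sha256:" + hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def salted_prompt_hash(prompt: str, salt: bytes) -> str:
    # Assumed construction: HMAC-SHA-256 keyed with a secret salt.
    digest = hmac.new(salt, prompt.encode("utf-8"), hashlib.sha256).hexdigest()
    return "hmac-sha256:" + digest


def dictionary_attack(target_hash: str, candidates):
    """Return the candidate prompt whose unsalted hash matches, if any."""
    for candidate in candidates:
        if prompt_hash(candidate) == target_hash:
            return candidate
    return None
```

An adversary holding a candidate list recovers the prompt behind an unsalted hash with one hash per candidate, whereas the keyed value does not match without the salt. The trade-off noted above remains: anyone who needs to verify a keyed value needs the salt or an interface to a party that holds it.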
To enable external verification without access to internal systems, implementations SHOULD:¶
To maintain audit trail integrity, implementations SHOULD:¶
External anchoring frequency requirements by conformance level:¶
| Level | Minimum Frequency | Maximum Delay |
|---|---|---|
| Bronze | Optional | N/A |
| Silver | Daily | 24 hours |
| Gold | Hourly | 1 hour |
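A verifier could check the anchoring cadence in the table above with a sketch like the following. It assumes that "Maximum Delay" bounds the interval between consecutive anchors and that anchor timestamps are RFC 3339 strings ending in `Z`, as in the examples in this document; the names `MAX_ANCHOR_DELAY` and `anchoring_gaps` are hypothetical.

```python
from datetime import datetime, timedelta

# Maximum delay per conformance level, from the table above.
# None means anchoring is optional at that level.
MAX_ANCHOR_DELAY = {
    "Bronze": None,
    "Silver": timedelta(hours=24),
    "Gold": timedelta(hours=1),
}


def parse_ts(ts: str) -> datetime:
    # Accepts timestamps such as "2026-01-29T00:00:00Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def anchoring_gaps(level: str, anchor_timestamps):
    """Return (earlier, later) pairs of consecutive anchors that are
    further apart than the maximum delay for the given level."""
    limit = MAX_ANCHOR_DELAY[level]
    if limit is None:
        return []
    times = sorted(parse_ts(t) for t in anchor_timestamps)
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > limit]
```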
Some operational scenarios may require delayed outcomes:¶
Implementations SHOULD document expected latency bounds in their Registration Policy. Extended delays SHOULD trigger monitoring alerts.¶
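Monitoring for extended delays might be sketched as follows. This is a hedged illustration: it assumes the documented latency bound is expressed as a single `grace_period`, that events use the field names from the example payloads in this document, and that timestamps end in `Z`. The function name `overdue_attempts` is hypothetical.

```python
from datetime import datetime, timedelta

OUTCOME_TYPES = {"GENERATE", "DENY", "ERROR"}


def parse_ts(ts: str) -> datetime:
    # Accepts timestamps such as "2026-01-29T14:23:45.100Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def overdue_attempts(events, now: datetime, grace_period: timedelta):
    """Return eventIds of ATTEMPTs that have no Outcome and are older
    than the grace period, in log order."""
    resolved = {
        ev["attemptId"] for ev in events if ev["eventType"] in OUTCOME_TYPES
    }
    overdue = []
    for ev in events:
        if ev["eventType"] != "ATTEMPT" or ev["eventId"] in resolved:
            continue
        if now - parse_ts(ev["timestamp"]) > grace_period:
            overdue.append(ev["eventId"])
    return overdue
```

Attempts still inside the grace period are not reported, so a pending human review or a slow generation does not raise an alert until the documented bound is exceeded.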
An implementation conforms to this specification if it satisfies the following requirements:¶
All other requirements (SHOULD, RECOMMENDED, MAY) are guidance for interoperability and security best practices but are not required for conformance.¶
Implementations MAY extend the data model with additional fields provided the core conformance requirements are satisfied.¶
Implementations claiming a specific conformance level (Bronze/Silver/Gold) MUST satisfy all requirements for that level as defined in Section 3.¶
This section defines example payloads for ATTEMPT, DENY, GENERATE, and ERROR Signed Statements. These are encoded as JSON payloads. This data model is non-normative; implementations MAY extend or modify these structures provided the conformance requirements in Section 5.6 are satisfied.¶
An ATTEMPT records that a generation request was received:¶
{
  "eventType": "ATTEMPT",
  "eventId": "019467a1-0001-7000-0000-000000000001",
  "chainId": "019467a0-0000-7000-0000-000000000000",
  "timestamp": "2026-01-29T14:23:45.100Z",
  "issuer": "urn:example:ai-service:img-gen-prod",
  "promptHash": "sha256:7f83b1657ff1fc53b92dc18148a1d65d...",
  "inputType": "text+image",
  "referenceInputHashes": [
    "sha256:9f86d081884c7d659a2feaa0c55ad015..."
  ],
  "sessionId": "019467a1-0001-7000-0000-000000000000",
  "actorHash": "sha256:e3b0c44298fc1c149afbf4c8996fb924...",
  "modelId": "img-gen-v4.2.1",
  "policyId": "content-safety-v2",
  "policyVersion": "2026-01-01",
  "hashAlgo": "SHA256",
  "signAlgo": "ED25519",
  "prevHash": "sha256:0000000000000000000000000000000...",
  "eventHash": "sha256:a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4..."
}
¶
Field definitions:¶
A DENY records that a request was refused:¶
{
  "eventType": "DENY",
  "eventId": "019467a1-0001-7000-0000-000000000002",
  "chainId": "019467a0-0000-7000-0000-000000000000",
  "timestamp": "2026-01-29T14:23:45.150Z",
  "issuer": "urn:example:ai-service:img-gen-prod",
  "attemptId": "019467a1-0001-7000-0000-000000000001",
  "riskCategory": "NCII_RISK",
  "riskSubCategories": ["REAL_PERSON", "CLOTHING_REMOVAL_REQUEST"],
  "riskScore": 0.94,
  "refusalReason": "Non-consensual intimate imagery request detected",
  "modelDecision": "DENY",
  "humanOverride": false,
  "escalationId": null,
  "hashAlgo": "SHA256",
  "signAlgo": "ED25519",
  "prevHash": "sha256:a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4...",
  "eventHash": "sha256:e5f6g7h8i9j0e5f6g7h8i9j0e5f6g7h8..."
}
¶
Field definitions:¶
This specification does not standardize content moderation categories, risk taxonomies, or refusal reason formats. These are policy decisions that remain the domain of AI providers and applicable regulations.¶
A GENERATE records that content was successfully produced. This document focuses on refusal events; GENERATE is included for completeness invariant verification:¶
{
  "eventType": "GENERATE",
  "eventId": "019467a1-0001-7000-0000-000000000004",
  "chainId": "019467a0-0000-7000-0000-000000000000",
  "timestamp": "2026-01-29T14:23:46.500Z",
  "issuer": "urn:example:ai-service:img-gen-prod",
  "attemptId": "019467a1-0001-7000-0000-000000000001",
  "outputHash": "sha256:b2c3d4e5f6a7b2c3d4e5f6a7b2c3d4e5...",
  "outputType": "image/png",
  "c2paManifestId": "urn:c2pa:manifest:...",
  "hashAlgo": "SHA256",
  "signAlgo": "ED25519",
  "prevHash": "sha256:a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4...",
  "eventHash": "sha256:c3d4e5f6a7b8c3d4e5f6a7b8c3d4e5f6..."
}
¶
Field definitions specific to GENERATE:¶
GENERATE events are typically not the focus of regulatory audits since successful generation is observable through content existence and downstream provenance (e.g., C2PA manifests, SynthID watermarks). They are included here to enable completeness invariant verification.¶
An ERROR records that a request failed due to system issues:¶
{
  "eventType": "ERROR",
  "eventId": "019467a1-0001-7000-0000-000000000003",
  "chainId": "019467a0-0000-7000-0000-000000000000",
  "timestamp": "2026-01-29T14:23:45.200Z",
  "issuer": "urn:example:ai-service:img-gen-prod",
  "attemptId": "019467a1-0001-7000-0000-000000000001",
  "errorCode": "TIMEOUT",
  "errorMessage": "Model inference timeout after 30s",
  "hashAlgo": "SHA256",
  "signAlgo": "ED25519",
  "prevHash": "sha256:e5f6g7h8i9j0e5f6g7h8i9j0e5f6g7h8...",
  "eventHash": "sha256:h8i9j0k1l2m3h8i9j0k1l2m3h8i9j0k1..."
}
¶
ERROR events indicate system failures, not policy decisions. A high ERROR rate may indicate operational issues or potential abuse (e.g., adversarial inputs designed to crash the system). Implementations SHOULD monitor ERROR rates and investigate anomalies.¶
The following risk categories are provided as a non-normative reference taxonomy. Implementations MAY use different categories based on their policies and applicable regulations:¶
| Category | Description | Example |
|---|---|---|
| CSAM_RISK | Child sexual abuse material | Minor sexualization request |
| NCII_RISK | Non-consensual intimate imagery | Deepfake pornography |
| MINOR_SEXUALIZATION | Content sexualizing minors | Age-inappropriate requests |
| REAL_PERSON_DEEPFAKE | Unauthorized realistic depiction | Celebrity face swap |
| VIOLENCE_EXTREME | Graphic violence | Gore, torture |
| HATE_CONTENT | Discriminatory content | Racist imagery |
| TERRORIST_CONTENT | Terrorism-related | Propaganda, recruitment |
| SELF_HARM_PROMOTION | Self-harm encouragement | Suicide methods |
| COPYRIGHT_VIOLATION | Clear IP infringement | Trademarked characters |
| COPYRIGHT_STYLE_MIMICRY | Artist style imitation | Protected style requests |
| OTHER | Other policy violations | Custom policies |
An Evidence Pack is a self-contained, cryptographically verifiable collection of events suitable for regulatory submission or third-party audit.¶
evidence_pack/
├── manifest.json              # Pack metadata and integrity info
├── events/
│   ├── events_001.jsonl       # Event batch 1 (JSON Lines format)
│   ├── events_002.jsonl       # Event batch 2
│   └── ...
├── anchors/
│   ├── anchor_001.json        # External anchor records
│   └── ...
├── merkle/
│   ├── tree_001.json          # Merkle tree structure
│   └── proofs/                # Selective disclosure proofs
├── keys/
│   └── public_keys.json       # Public keys for verification
└── signatures/
    └── pack_signature.json    # Pack-level signature
¶
{
  "packId": "019467b2-0000-7000-0000-000000000000",
  "packVersion": "1.0",
  "generatedAt": "2026-01-29T15:00:00Z",
  "generatedBy": "urn:example:ai-service:img-gen-prod",
  "conformanceLevel": "Silver",
  "eventCount": 150000,
  "timeRange": {
    "start": "2026-01-01T00:00:00Z",
    "end": "2026-01-29T14:59:59Z"
  },
  "checksums": {
    "events/events_001.jsonl": "sha256:...",
    "events/events_002.jsonl": "sha256:...",
    "anchors/anchor_001.json": "sha256:..."
  },
  "completenessVerification": {
    "totalAttempts": 145000,
    "totalGenerate": 140000,
    "totalDeny": 4500,
    "totalError": 500,
    "invariantValid": true,
    "verificationTimestamp": "2026-01-29T15:00:00Z"
  },
  "externalAnchors": [
    {
      "anchorId": "019467b0-0000-7000-0000-000000000000",
      "anchorType": "RFC3161",
      "anchorTimestamp": "2026-01-29T00:00:00Z"
    }
  ]
}
¶
Third-party verification of an Evidence Pack involves:¶
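One part of such verification, checking the files in a pack against the manifest, can be sketched in Python. The sketch assumes the directory layout and the `checksums` and `completenessVerification` fields shown in the example manifest, and assumes checksum values take the form `sha256:` followed by a hex digest. Signature, Receipt, and anchor validation are out of scope here, and the function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def verify_pack_checksums(pack_dir) -> list:
    """Return the relative paths whose file is missing or whose SHA-256
    digest differs from the value recorded in manifest.json."""
    pack = Path(pack_dir)
    manifest = json.loads((pack / "manifest.json").read_text(encoding="utf-8"))
    failures = []
    for rel_path, expected in manifest["checksums"].items():
        file_path = pack / rel_path
        if not file_path.is_file():
            failures.append(rel_path)
            continue
        actual = "sha256:" + hashlib.sha256(file_path.read_bytes()).hexdigest()
        if actual != expected:
            failures.append(rel_path)
    return failures


def counts_consistent(manifest: dict) -> bool:
    """Check that the reported totals satisfy the completeness invariant."""
    cv = manifest["completenessVerification"]
    outcomes = cv["totalGenerate"] + cv["totalDeny"] + cv["totalError"]
    return cv["totalAttempts"] == outcomes
```

The reported totals are only the issuer's claim. A verifier would normally recompute them from the event files rather than rely on `invariantValid`, and a production implementation would also reject manifest paths that resolve outside the pack directory.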
ATTEMPT, DENY, GENERATE, and ERROR events are encoded as SCITT Signed Statements:¶
The JSON payload is canonicalized per [RFC8785] and signed as the COSE_Sign1 payload bytes. This ensures deterministic serialization for signature verification.¶
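The effect of canonicalization can be illustrated with the Python standard library. The sketch below only approximates [RFC8785]: sorted keys and compact separators give the same bytes for simple payloads like the examples in this document, but the sketch does not implement the RFC's number serialization rules or its UTF-16 code-unit key ordering. The function name is hypothetical, and COSE_Sign1 signing is omitted because it requires a third-party library.

```python
import json


def canonical_payload(event: dict) -> bytes:
    """Deterministic byte encoding of an event payload.

    Approximation of JSON canonicalization for illustration only; a
    conforming implementation would use a full RFC 8785 library.
    """
    return json.dumps(
        event,
        sort_keys=True,          # stable member order
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # emit UTF-8 rather than \u escapes
        allow_nan=False,         # NaN and Infinity are not valid JSON
    ).encode("utf-8")
```

Two issuers, or an issuer and a verifier, that build the same event with members in a different order obtain identical payload bytes, so a signature computed over one serialization verifies against the other.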
After creating a Signed Statement, the Issuer SHOULD register it with a SCITT Transparency Service:¶
The Transparency Service's Registration Policy MAY verify that required fields are present and timestamps are within acceptable bounds.¶
Registration may fail due to network issues, service unavailability, or policy rejection. Implementations SHOULD implement retry logic with exponential backoff. Persistent registration failures SHOULD be logged locally and trigger operational alerts.¶
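Retry with exponential backoff might be sketched as follows. The `register` callable stands in for whatever client submits a Signed Statement to the Transparency Service and is entirely hypothetical, as are the exception class and the parameter defaults. The sketch assumes only transient failures raise `TransientRegistrationError`; any other exception, such as a policy rejection, propagates immediately because retrying it would not help.

```python
import random
import time


class TransientRegistrationError(Exception):
    """Hypothetical error for network issues or service unavailability."""


def register_with_backoff(
    register,
    signed_statement,
    max_attempts=5,
    base_delay=1.0,
    max_delay=60.0,
    sleep=time.sleep,
):
    """Call register(signed_statement), retrying transient failures.

    The delay doubles after each failure, is capped at max_delay, and
    has random jitter added so that many clients do not retry in step.
    """
    for attempt in range(max_attempts):
        try:
            return register(signed_statement)
        except TransientRegistrationError:
            if attempt == max_attempts - 1:
                # Persistent failure: the caller should log it locally
                # and raise an operational alert.
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Passing `sleep` as a parameter keeps the function testable without real waiting.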
A Transparency Service operating as a refusal event log MAY implement a Registration Policy that validates:¶
This profile does not require Transparency Services to enforce completeness invariants. A TS accepting refusal events is not expected to verify that every ATTEMPT has an Outcome; such verification is performed by auditors and verifiers at the application level.¶
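A Registration Policy check of the kind described above (required fields present, timestamps within acceptable bounds) could be sketched as follows. The required-field set, the five-minute skew tolerance, and the function name are assumptions chosen for illustration; this document does not fix them. Consistent with the paragraph above, the sketch does not attempt any completeness check.

```python
from datetime import datetime, timedelta, timezone

# Assumed minimal field set, based on the example payloads.
REQUIRED_FIELDS = {"eventType", "eventId", "timestamp", "issuer"}
OUTCOME_TYPES = {"GENERATE", "DENY", "ERROR"}
EVENT_TYPES = OUTCOME_TYPES | {"ATTEMPT"}


def validate_for_registration(event: dict, now: datetime,
                              max_skew=timedelta(minutes=5)) -> list:
    """Return a list of policy problems; an empty list means accept."""
    errors = []
    for field in sorted(REQUIRED_FIELDS - set(event)):
        errors.append(f"missing field: {field}")
    event_type = event.get("eventType")
    if event_type is not None and event_type not in EVENT_TYPES:
        errors.append(f"unknown eventType: {event_type}")
    if event_type in OUTCOME_TYPES and "attemptId" not in event:
        errors.append("outcome event lacks attemptId")
    if "timestamp" in event:
        try:
            ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
            if abs(now - ts) > max_skew:
                errors.append("timestamp outside acceptable bounds")
        except (ValueError, TypeError):
            # Unparseable value, or a timestamp without a UTC offset.
            errors.append("timestamp is not a valid UTC date-time")
    return errors
```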
A complete Verifiable Refusal Record consists of:¶
Verifiers can confirm that a refusal was logged by validating both Receipts and checking the ATTEMPT/DENY linkage. This demonstrates that the refusal decision was recorded in the Transparency Service, but does not prove that no unlogged generation occurred.¶
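The ATTEMPT/DENY linkage check can be sketched in Python. The sketch assumes both payloads have already been extracted from Signed Statements whose signatures and Receipts were validated separately (Receipt validation needs a COSE library and is not shown), and that they use the event type names and fields of the example payloads in this document. Requiring matching `issuer` values and a DENY timestamp no earlier than the ATTEMPT are assumptions of this sketch, and the function name is hypothetical.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    # Accepts timestamps such as "2026-01-29T14:23:45.100Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def check_refusal_linkage(attempt: dict, deny: dict) -> list:
    """Return a list of linkage problems; empty means the pair links up."""
    problems = []
    if attempt.get("eventType") != "ATTEMPT":
        problems.append("first event is not an ATTEMPT")
    if deny.get("eventType") != "DENY":
        problems.append("second event is not a DENY")
    if deny.get("attemptId") != attempt.get("eventId"):
        problems.append("DENY.attemptId does not reference the ATTEMPT")
    if deny.get("issuer") != attempt.get("issuer"):
        problems.append("issuer mismatch")
    if "timestamp" in attempt and "timestamp" in deny:
        if parse_ts(deny["timestamp"]) < parse_ts(attempt["timestamp"]):
            problems.append("DENY is timestamped before its ATTEMPT")
    return problems
```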
For additional assurance, implementations MAY periodically anchor Merkle tree roots to external systems:¶
External anchoring provides defense against a compromised Transparency Service and can help satisfy regulatory requirements for independent timestamp verification.¶
Anchor record format:¶
{
  "anchorId": "019467b0-0000-7000-0000-000000000000",
  "anchorType": "RFC3161",
  "merkleRoot": "sha256:abcd1234...",
  "eventCount": 1000,
  "firstEventId": "019467a0-0000-7000-0000-000000000001",
  "lastEventId": "019467a0-0000-7000-0000-000001000000",
  "timestamp": "2026-01-29T00:00:00Z",
  "anchorProof": "MIIHkwYJKoZIhvc...",
  "serviceEndpoint": "https://timestamp.digicert.com"
}
¶
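This document does not specify how the `merkleRoot` value is computed. As one plausible construction, the following Python sketch computes a Merkle tree hash in the style of Certificate Transparency [RFC6962], with a 0x00 prefix for leaves and a 0x01 prefix for interior nodes. Treating each event's hash string as a leaf and rendering the root with a `sha256:` prefix are assumptions of this sketch, and the function names are hypothetical.

```python
import hashlib


def merkle_root(leaves: list) -> bytes:
    """Merkle tree hash over a list of byte strings (RFC 6962 style)."""
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        # Leaf hash: 0x00 prefix separates leaves from interior nodes.
        return hashlib.sha256(b"\x00" + leaves[0]).digest()
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    left = merkle_root(leaves[:k])
    right = merkle_root(leaves[k:])
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root_field(event_hashes: list) -> str:
    """Render the root over event hash strings as a merkleRoot value."""
    leaves = [h.encode("utf-8") for h in event_hashes]
    return "sha256:" + merkle_root(leaves).hex()
```

The distinct leaf and node prefixes prevent an interior node from being presented as a leaf, which is the reason RFC 6962 introduces them.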
The Coalition for Content Provenance and Authenticity (C2PA) provides standards for content provenance that complement this specification:¶
Generated content MAY include a C2PA manifest with a reference to the corresponding SCITT events:¶
{
  "c2pa:assertions": {
    "scitt:reference": {
      "eventId": "019467a1-...",
      "chainId": "019467a0-...",
      "verificationEndpoint": "https://api.example.com/verify"
    }
  }
}
¶
This cross-reference enables verifiers to trace from the content artifact back to the complete audit trail including any prior refusal attempts in the same session.¶
This document has no IANA actions at this time.¶
Future revisions may request registration of:¶
This specification assumes the following threat model:¶
This specification does NOT protect against:¶
An adversary controlling the AI system might attempt to omit refusal events to hide policy violations or, conversely, omit GENERATE events to falsely claim content was refused. The completeness invariant provides detection for logged events: auditors can identify ATTEMPT Signed Statements without corresponding Outcomes. Hash chains detect deletion of intermediate events.¶
However, if an ATTEMPT is never logged, this specification cannot detect the omission. Complete prevention of omission attacks is beyond the scope of this specification and would require external enforcement mechanisms such as trusted execution environments, RATS attestation, or real-time external monitoring.¶
The requirement that ATTEMPT be logged BEFORE safety evaluation (pre-evaluation logging) prevents selective logging where only "safe" requests are recorded.¶
A malicious Transparency Service might present different views of the log to different parties (equivocation). For example, it might show auditors a log containing DENY events while providing a different view to other verifiers. Mitigations include:¶
Detection of equivocation requires coordination between verifiers; a single verifier in isolation cannot detect it.¶
A malicious Issuer might maintain separate logs for refusals and generations, showing only the refusal log to auditors. The completeness invariant mitigates this by requiring every logged ATTEMPT to have an Outcome; if the GENERATE outcomes are hidden, auditors will observe orphaned ATTEMPTs.¶
Direct modification of log entries is prevented by cryptographic signatures on Signed Statements, hash chain linking, Merkle tree inclusion proofs in Receipts, and the append-only structure enforced by the Transparency Service.¶
An attacker might attempt to replay old refusal events to inflate refusal statistics or create false alibis. Mitigations include UUID v7 event identifiers, which provide temporal ordering (events with earlier timestamps have smaller UUIDs); verification of event timestamps against Transparency Service registration time; and prevHash chaining, which detects out-of-order insertion and duplicate events.¶
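The temporal-ordering property of UUID v7 identifiers can be checked with a short Python sketch. A UUID v7 carries a 48-bit Unix timestamp in milliseconds in its most significant bits, so the embedded time can be read directly from the identifier. The sketch checks only the version nibble and does not validate the variant bits; the function names are hypothetical.

```python
import uuid


def uuid7_unix_ms(event_id: str) -> int:
    """Return the millisecond timestamp embedded in a UUID v7 string."""
    value = uuid.UUID(event_id).int
    if (value >> 76) & 0xF != 7:
        raise ValueError(f"not a version 7 UUID: {event_id}")
    # The top 48 of the 128 bits hold the Unix time in milliseconds.
    return value >> 80


def backwards_positions(event_ids: list) -> list:
    """Return positions in the log where the embedded time is earlier
    than that of the preceding event."""
    times = [uuid7_unix_ms(e) for e in event_ids]
    return [i for i in range(1, len(times)) if times[i] < times[i - 1]]


def duplicate_ids(event_ids: list) -> list:
    """Return identifiers that appear more than once, in first-seen order."""
    seen, dupes = set(), []
    for e in event_ids:
        if e in seen and e not in dupes:
            dupes.append(e)
        seen.add(e)
    return dupes
```

A verifier could combine these checks with a comparison between the embedded time and the Transparency Service registration time to flag events whose identifiers are much older than their registration.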
If an Issuer's signing key is compromised, an attacker could create fraudulent Signed Statements. Previously signed Signed Statements remain valid. Implementations SHOULD support key rotation and revocation. Transparency Service timestamps provide evidence of when Signed Statements were registered, which can help bound the impact of a compromise.¶
Gold level conformance requires HSM storage for signing keys, significantly reducing key compromise risk.¶
Although prompts are stored as hashes, an adversary with a dictionary of known prompts could attempt to identify which prompt was used by computing hashes and comparing. Mitigations include access controls on event queries, time-limited retention policies, monitoring for bulk query patterns, and rate limiting.¶
Salted hashing may provide additional protection but introduces operational complexity. If salting is used, the salt must be managed such that verification remains possible without disclosing the salt to third parties. This specification does not mandate salting.¶
An attacker could flood the system with generation requests to create a large volume of ATTEMPT Signed Statements, potentially overwhelming the Transparency Service or obscuring legitimate events. Standard rate limiting and access controls at the AI system level can mitigate this. The Transparency Service MAY implement its own admission controls.¶
This profile requires that harmful content not be stored. Prompt text is replaced with promptHash, reference images are replaced with hashes, and refusal reasons SHOULD NOT quote or describe prompt content in detail. This prevents the audit log from becoming a repository of harmful content.¶
Actor identification creates tension between accountability and privacy. Implementations SHOULD use pseudonymous identifiers (actorHash) by default, maintain a separate access-controlled mapping from pseudonyms to identities, define clear policies for de-pseudonymization, and support erasure of the mapping while preserving audit integrity (crypto-shredding).¶
Event metadata may enable correlation attacks. Timestamps could reveal user activity patterns, sessionId values link multiple requests, and modelId values reveal which AI systems a user interacts with. Implementations SHOULD apply appropriate access controls and MAY implement differential privacy techniques for aggregate statistics.¶
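The separation between pseudonyms in the log and an erasable identity mapping can be illustrated with a small Python sketch. It is a simplified stand-in rather than a description of any deployed design: it derives the pseudonym with HMAC-SHA-256 under a per-actor secret key instead of encrypting personal data, keeps everything in memory, and uses a hypothetical class name and `hmac-sha256:` prefix. Destroying the per-actor key and mapping entry plays the role that destroying an encryption key plays in crypto-shredding.

```python
import hashlib
import hmac
import secrets


class PseudonymVault:
    """In-memory sketch of an access-controlled pseudonym mapping."""

    def __init__(self):
        self._keys = {}        # actor_id -> per-actor secret key
        self._identities = {}  # pseudonym -> actor_id

    def actor_hash(self, actor_id: str) -> str:
        """Return a stable pseudonym for the actor, creating a key if needed."""
        key = self._keys.setdefault(actor_id, secrets.token_bytes(32))
        digest = hmac.new(key, actor_id.encode("utf-8"), hashlib.sha256)
        pseudonym = "hmac-sha256:" + digest.hexdigest()
        self._identities[pseudonym] = actor_id
        return pseudonym

    def resolve(self, pseudonym: str):
        """De-pseudonymize; returns None once the actor has been shredded."""
        return self._identities.get(pseudonym)

    def shred(self, actor_id: str) -> None:
        """Destroy the key and mapping; logged pseudonyms stay unchanged."""
        self._keys.pop(actor_id, None)
        stale = [p for p, a in self._identities.items() if a == actor_id]
        for pseudonym in stale:
            del self._identities[pseudonym]
```

After `shred`, events already in the log keep their pseudonym and remain verifiable, but the vault can no longer link that pseudonym to an identity or re-derive it.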
Where personal data protection regulations apply (e.g., GDPR), implementations SHOULD support data subject access requests, erasure requests via crypto-shredding (destroying encryption keys for personal data while preserving cryptographic integrity proofs), and purpose limitation.¶
Crypto-shredding architecture:¶
This section describes potential extensions and research directions that are outside the scope of this specification but may be addressed in future work.¶
Integration with Remote ATtestation procedureS (RATS) [RFC9334] could provide stronger guarantees that the AI system is operating as expected and logging all events. Hardware-backed attestation could reduce the trust assumptions on the Issuer.¶
High-volume AI systems may generate millions of events per day. Future work could explore batching mechanisms, rolling logs, and hierarchical Merkle structures to improve scalability while maintaining verifiability.¶
More sophisticated privacy mechanisms could be explored, including:¶
These mechanisms would add complexity and are not required for the core auditability goals of this specification.¶
Stronger completeness guarantees could be achieved through external enforcement mechanisms such as:¶
These approaches involve significant architectural changes and are outside the scope of this specification.¶
This specification recommends Ed25519 signatures, which are vulnerable to attack by a cryptographically relevant quantum computer. Future revisions should address migration to post-quantum algorithms (e.g., ML-DSA/Dilithium) as NIST standards mature. The hashAlgo and signAlgo fields support algorithm agility for this transition.¶
This appendix illustrates a complete flow from request receipt to Verifiable Refusal Record verification.¶
An auditor verifying the Verifiable Refusal Record:¶
This verification confirms that a refusal was logged, but does not prove that no unlogged generation occurred.¶
A regulator verifying an Evidence Pack:¶
The authors thank the members of the SCITT Working Group for developing the foundational architecture. This work builds upon the transparency log concepts from Certificate Transparency [RFC6962].¶
The January 2026 Grok incident, while harmful, provided critical motivation for this specification by demonstrating the real-world consequences of unverifiable AI safety claims.¶
Thanks to the VeritasChain Standards Organization for developing the CAP-SRP specification that informed this Internet-Draft.¶