
Our Protocol

The Responsible AI Ledger Alliance (RAIL) uses the following open standard to provide trustworthy provenance for digital content. This standard, originally authored as Arweave Network Standard 112, defines the “Data Format” of provenance proofs. Each proof of provenance lists a hash of the content's data and can also record the prompt that was used to generate the AI image, either in full or as a hash. Other fields such as “Uploaded-For” and “Model” help to identify the content and make it searchable.

The RAIL protocol is built on top of Arweave, a decentralized data storage network that enables reliable, permanent storage of data on the internet. Unlike traditional blockchains, Arweave utilizes a structure known as the blockweave, which leverages a proof-of-access consensus mechanism to maintain the network's data integrity. Arweave's design features a 'storage endowment' and replaces the energy-intensive 'work' of blockchain networks with the useful validation of the network's dataset, ensuring that once data is stored on the network, it remains accessible indefinitely.

By recording RAIL's provenance records on the Arweave network, users can create an immutable and verifiable chain of custody for their data, supporting its credibility and trustworthiness in a wide variety of applications. Our specification focuses on the design and implementation of such a data protocol for AI-generated data, taking advantage of the characteristics of the Arweave network to provide a permissionless and immutable ledger of content provenance, without centralized controllers.

In addition, data storage on Arweave is scalable, meaning that RAIL provenance records can be adopted by large-scale creators of AI content well into the future. Through the use of transaction bundles facilitated by RAIL members like Bundlr, content generators can create tens of thousands of new provenance records every second. You can learn more about scalable bundling here.

Specification

1. Data format

1.1 Data tags

A provenance proof uses the following tags. Tags marked ✔️ in the Optional column may be omitted; all others must be included:

| Name | Value | Purpose | Optional |
| --- | --- | --- | --- |
| Data-Protocol | Provenance-Confirmation | Provides ability to identify all Creative Commons transactions | |
| Hashing-Algo | string - Hash algorithm used on the data to generate Data-Hash. Defaults to sha256 | Provides ability to use different hash algorithms within the standard | ✔️ |
| Data-Hash | string - Hash of the data using the Hashing-Algo algorithm | Provides an easy content integrity check | |
| Uploaded-For | string - Identifier of the person that the data relates to | Provides an easy attribution method for the uploader | ✔️ |
| Prompt | string - The prompt that led to the generation of the data | Allows the prompt to be recorded in the proof | ✔️ |
| Prompt-Hash | string - A hash of the prompt that led to the generation of the data | Allows for a private prompt which can act as a proof if it needs to be revealed | ✔️ |
| Model | string - Identifier of model used to generate data | Allows searchability based on the model the data relates to | ✔️ |
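As an illustration of the tag set above, the following minimal Python sketch assembles the tags for a single proof. The helper name `build_provenance_tags`, the name/value dictionary layout, the hex encoding of digests, and the UTF-8 encoding of the prompt are all assumptions made for this example; the standard itself does not prescribe them.

```python
import hashlib
from typing import Optional


def build_provenance_tags(
    data: bytes,
    prompt: Optional[str] = None,
    private_prompt: bool = False,
    uploaded_for: Optional[str] = None,
    model: Optional[str] = None,
) -> list[dict[str, str]]:
    """Assemble the tags for one provenance proof (hypothetical helper).

    Assumptions: digests are hex encoded and prompts are hashed as UTF-8.
    """
    tags = [
        {"name": "Data-Protocol", "value": "Provenance-Confirmation"},
        # sha256 is the default named in the tag table.
        {"name": "Hashing-Algo", "value": "sha256"},
        {"name": "Data-Hash", "value": hashlib.sha256(data).hexdigest()},
    ]
    if uploaded_for is not None:
        tags.append({"name": "Uploaded-For", "value": uploaded_for})
    if prompt is not None:
        if private_prompt:
            # Record only the hash so the prompt can be revealed later.
            prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
            tags.append({"name": "Prompt-Hash", "value": prompt_hash})
        else:
            tags.append({"name": "Prompt", "value": prompt})
    if model is not None:
        tags.append({"name": "Model", "value": model})
    return tags
```

Calling `build_provenance_tags(image_bytes, prompt="a red fox", private_prompt=True)` would produce a proof that carries Prompt-Hash but not Prompt, which matches the private-prompt use case described in the table.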

1.2 Hashing Algorithm

The Digital Content Provenance Standard does not hold an opinion on which hashing algorithms to support. Specifying a hashing algorithm is left to the discretion of the users and distributors of the standard.

1.3 Content Data

Storing the entire data file for a corresponding piece of digital content is optional. The Data-Hash value of the data asset is sufficient to verify provenance.
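Because only the hash needs to be on the ledger, anyone holding a copy of the file can check it against a record locally. The sketch below shows one way to do that in Python; the function name `file_matches_data_hash` is hypothetical, and it assumes the Data-Hash value is a hex-encoded digest, which the standard does not state explicitly.

```python
import hashlib


def file_matches_data_hash(path: str, data_hash: str, algo: str = "sha256") -> bool:
    """Return True if the file at `path` hashes to `data_hash`.

    Assumption: `data_hash` is hex encoded. The file is read in chunks so
    that large media files do not need to fit in memory.
    """
    digest = hashlib.new(algo)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == data_hash.lower()
```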

2. Record Validation

A provenance proof is valid if and only if:

  • Data-Protocol is Provenance-Confirmation.
  • Hashing-Algo is a valid hashing algorithm name (identified by its RFC-6234 form).
  • When Prompt-Hash and Prompt are present, then Prompt must hash to the same value as the value stored in the Prompt-Hash tag.
  • When the content's data is present it must hash to the same value as the value stored in the Data-Hash tag.
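The validation rules above can be sketched as a single Python check. This is a minimal illustration rather than a reference implementation: the function name `is_valid_proof` is hypothetical, tags are assumed to arrive as a plain name-to-value dictionary, digests are assumed to be hex encoded, prompts are assumed to be hashed as UTF-8, and the accepted algorithm names are assumed to be the RFC 6234 SHA-2 family in either the "SHA-256" or "sha256" spelling.

```python
import hashlib
from typing import Optional

# Assumed set of accepted algorithms: the SHA-2 family from RFC 6234,
# normalized to the names that hashlib understands.
SUPPORTED_ALGOS = {"sha224", "sha256", "sha384", "sha512"}


def is_valid_proof(tags: dict[str, str], data: Optional[bytes] = None) -> bool:
    """Apply the four validation rules to one proof (hypothetical helper)."""
    # Rule 1: the protocol tag must carry the expected value.
    if tags.get("Data-Protocol") != "Provenance-Confirmation":
        return False

    # Rule 2: the algorithm name must be recognized. Hashing-Algo is
    # optional and defaults to sha256 according to the tag table.
    algo = tags.get("Hashing-Algo", "sha256").lower().replace("-", "")
    if algo not in SUPPORTED_ALGOS:
        return False

    def digest(payload: bytes) -> str:
        return hashlib.new(algo, payload).hexdigest()

    # Rule 3: if both prompt tags are present, they must agree.
    if "Prompt" in tags and "Prompt-Hash" in tags:
        if digest(tags["Prompt"].encode("utf-8")) != tags["Prompt-Hash"].lower():
            return False

    # Rule 4: if the content data is available, it must match Data-Hash.
    if data is not None:
        if digest(data) != tags.get("Data-Hash", "").lower():
            return False

    return True
```

When `data` is omitted, only the first three rules are checked, which mirrors the case where the content file is not stored alongside the proof.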

© 2022 AI Provenance Alliance