
Our Protocol

The Responsible AI Ledger Alliance (RAIL) uses the following open standard to provide trustworthy provenance for digital content. This standard, originally authored as Arweave Network Standard 112, defines the “Data Format” of provenance proofs. Each proof of provenance lists a unique hash of the content's data and, optionally, the prompt that was used to generate the AI image. Other fields such as “Uploaded-For” and “Model” help to identify the content and make it searchable.

The RAIL protocol is built on top of Arweave, a decentralized data storage network that enables reliable, permanent storage of data on the internet. Unlike traditional blockchains, Arweave uses a structure known as the blockweave, which relies on a proof-of-access consensus mechanism to maintain the network's data integrity. Arweave's design features a 'storage endowment' and replaces the energy-intensive 'work' of blockchain networks with the useful validation of the network's dataset, so that once data is stored on the network, it remains accessible indefinitely.

By recording RAIL's provenance records on the Arweave network, users can create an immutable and verifiable chain of custody for their data, supporting its credibility and trustworthiness in a wide variety of applications. Our specification focuses on the design and implementation of such a data protocol for AI-generated data, taking advantage of the characteristics of the Arweave network to provide a permissionless and immutable ledger of content provenance, without centralized controllers.

In addition, data storage on Arweave is scalable, meaning that RAIL provenance records can be adopted by large-scale creators of AI content well into the future. Through the use of transaction bundles facilitated by RAIL members like Bundlr, content generators can create tens of thousands of new provenance records every second. You can learn more about scalable bundling here.

Specification

1. Data format

1.1 Data tags

A provenance proof must include the following tags:

Name	Value	Purpose	Optional
Data-Protocol	Provenance-Confirmation	Provides ability to identify all provenance proof transactions	❌
Hashing-Algo	string - Hash algorithm used on the data to generate Data-Hash. Defaults to sha256	Provides ability to use a different hash algorithm within the standard	✔️
Data-Hash	string - Hash of the data using the Hashing-Algo algorithm	Provides an easy content integrity check	❌
Uploaded-For	string - Identifier of the person that the data relates to	Provides an easy attribution method for the uploader	✔️
Prompt	string - The prompt that led to the generation of the data	Allows the prompt to be recorded publicly with the proof	✔️
Prompt-Hash	string - A hash of the prompt that led to the generation of the data	Allows for a private prompt which can act as a proof if it needs to be revealed	✔️
Model	string - Identifier of model used to generate data	Allows searchability based on the model the data relates to	✔️
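
As an illustration, the tags above could be assembled for a single piece of content as follows. This is a minimal sketch in Python, not a reference implementation: the helper name `build_provenance_tags`, the name/value dictionary layout, the hex encoding of digests and the UTF-8 encoding of the prompt are all assumptions that the standard does not fix.

```python
import hashlib
from typing import Dict, List, Optional


def build_provenance_tags(
    data: bytes,
    prompt: Optional[str] = None,
    model: Optional[str] = None,
    uploaded_for: Optional[str] = None,
    private_prompt: bool = False,
) -> List[Dict[str, str]]:
    """Assemble the tag list for one provenance proof (hypothetical helper)."""
    tags = [
        {"name": "Data-Protocol", "value": "Provenance-Confirmation"},
        {"name": "Hashing-Algo", "value": "sha256"},
        # Assumption: digests are hex-encoded.
        {"name": "Data-Hash", "value": hashlib.sha256(data).hexdigest()},
    ]
    if uploaded_for is not None:
        tags.append({"name": "Uploaded-For", "value": uploaded_for})
    if prompt is not None:
        # Assumption: the prompt is hashed over its UTF-8 bytes.
        prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if not private_prompt:
            # A public prompt is stored in the clear.
            tags.append({"name": "Prompt", "value": prompt})
        # The hash alone lets a private prompt be proven later if revealed.
        tags.append({"name": "Prompt-Hash", "value": prompt_hash})
    if model is not None:
        tags.append({"name": "Model", "value": model})
    return tags
```

Passing `private_prompt=True` omits the Prompt tag and keeps only Prompt-Hash, which matches the private-prompt use case described in the table.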

1.2 Hashing Algorithm

The Digital Content Provenance Standard does not hold an opinion on which hashing algorithms to support. Specifying a hashing algorithm is left to the discretion of the users and distributors of the standard.

1.3 Content Data

Storing the entire data file for a corresponding piece of digital content is optional. The Data-Hash value of the data asset is sufficient to verify provenance.
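
Because only the hash needs to be recorded, anyone holding a copy of the content can check it against a proof locally. The sketch below, in Python, hashes a byte stream in chunks and compares the result with a Data-Hash value. The function name `stream_matches_data_hash` and the hex encoding of the digest are assumptions made for illustration.

```python
import hashlib
import io
from typing import BinaryIO


def stream_matches_data_hash(
    stream: BinaryIO, data_hash: str, algo: str = "sha256"
) -> bool:
    """Hash a stream in 64 KiB chunks and compare it with a Data-Hash value."""
    hasher = hashlib.new(algo)
    for chunk in iter(lambda: stream.read(65536), b""):
        hasher.update(chunk)
    # Assumption: Data-Hash holds a hex digest; comparison ignores letter case.
    return hasher.hexdigest() == data_hash.lower()


# Usage: an in-memory stream stands in for a local copy of the content.
content = b"example image bytes"
recorded_hash = hashlib.sha256(content).hexdigest()
matches = stream_matches_data_hash(io.BytesIO(content), recorded_hash)
```

For a file on disk, the same function can be called with a handle opened in binary mode.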

2. Record Validation

A provenance proof is valid if and only if:

- Data-Protocol is Provenance-Confirmation.
- Hashing-Algo is a valid hashing algorithm name (identified by its RFC 6234 form).
- When Prompt-Hash and Prompt are present, then Prompt must hash to the same value as the value stored in the Prompt-Hash tag.
- When the content's data is present it must hash to the same value as the value stored in the Data-Hash tag.
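
The validation rules above can be sketched as a single check. This is a hedged Python illustration rather than a reference validator: the function name `is_valid_proof` is hypothetical, and the code assumes that tags arrive as a name-to-value dictionary, that digests are hex-encoded, that the prompt is hashed over its UTF-8 bytes with the same algorithm as the data, and that the accepted algorithm names are the SHA-224/256/384/512 family from RFC 6234, written with or without the hyphen.

```python
import hashlib
from typing import Dict, Optional

# Assumption: accepted names are the RFC 6234 family, normalized to lowercase
# without hyphens so that "SHA-256" and "sha256" are treated alike.
_SUPPORTED_ALGOS = {"sha224", "sha256", "sha384", "sha512"}


def _hex_digest(algo: str, payload: bytes) -> str:
    return hashlib.new(algo, payload).hexdigest()


def is_valid_proof(tags: Dict[str, str], data: Optional[bytes] = None) -> bool:
    """Apply the four validation rules to one proof's tags (hypothetical helper)."""
    # Rule 1: the protocol tag must match exactly.
    if tags.get("Data-Protocol") != "Provenance-Confirmation":
        return False

    # Rule 2: the algorithm must be recognized; sha256 is the stated default.
    algo = tags.get("Hashing-Algo", "sha256").lower().replace("-", "")
    if algo not in _SUPPORTED_ALGOS:
        return False

    # Rule 3: if both Prompt and Prompt-Hash are present, they must agree.
    if "Prompt" in tags and "Prompt-Hash" in tags:
        expected = _hex_digest(algo, tags["Prompt"].encode("utf-8"))
        if expected != tags["Prompt-Hash"].lower():
            return False

    # Rule 4: if the content's data is available, it must match Data-Hash.
    if data is not None:
        if _hex_digest(algo, data) != tags.get("Data-Hash", "").lower():
            return False

    return True
```

When `data` is omitted, only the tag-level rules are checked, which mirrors the case where the content itself was not stored with the proof.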