Our Approach to Resources on-ledger

Using the capabilities of the DID Core specification for standards-compliant resource lookup

This blog post has been co-written by Alex Tweeddale, Ankur Banerjee and Ross Power.

Introduction

Since the beginning of Q2 2022, cheqd has been assembling the building blocks for anchoring “resources” to the cheqd network.

The concept of resources in self-sovereign identity (SSI) ecosystems is not new, however, as we will discuss throughout this blog post, existing approaches to resources in SSI oblige adopters to make compromises between security, availability and interoperability. We first noticed this when we were looking at how we could securely reference credential schemas, something we will expand on throughout this post.

Our objective in building resources on cheqd is to improve the way resources are stored, referenced and retrieved for our partners and the broader SSI community, in line with the existing W3C DID Core standard.

Within this blog, we will answer three simple questions:

  1. What are resources?
  2. What are the problems with the way resources are stored?
  3. How have we implemented resources on cheqd?

By answering these questions, we aim to provide a conceptual understanding of why we have chosen this approach, how it improves on existing approaches, and the timelines for this being implemented on cheqd in practice.

In self-sovereign identity (SSI) ecosystems, “resources” are often required in tandem with W3C Verifiable Credentials, to provide supporting information or additional context to verifiers receiving Verifiable Presentations.

For example, common types of resources that might be required to issue and validate Verifiable Credentials are:

Schemas

Describe the fields and content types in a credential in a machine-readable format. Prominent examples of this include schema.org, Hyperledger Indy schema objects, etc. You can think of them as a template for what is included in a Verifiable Credential.

Below is an example of a schema.org residential address with full URLs:

{
"@type": "http://schema.org/Person",
"http://schema.org/address": {
"@type": "http://schema.org/PostalAddress",
"http://schema.org/streetAddress": "123 Main St.",
"http://schema.org/addressLocality": "Blacksburg",
"http://schema.org/addressRegion": "VA",
"http://schema.org/postalCode": "24060",
"http://schema.org/addressCountry": "US"
}
}

This might also take the form of evidence schemes, which describe additional information about the processes used to validate the information presented in a Verifiable Credential in common, machine-readable format.

Revocation status lists

Allow recipients of a Verifiable Credential exchange to check the revocation status of a credential for validity. Prominent examples of this include the W3C Status List 2021 specification, W3C Revocation List 2020, Hyperledger Indy revocation registries, etc.

Visual representations for Verifiable Credentials

Source: Overlays Capture Architecture Specification Version 1.0.0

Although Verifiable Credentials can be exchanged digitally, in practice most identity wallets want to present “human-friendly” representations. A resource, using something like Overlay Capture Architecture (OCA) may enable a credential representation to be shown according to the brand guidelines of the issuer, internationalisation (“i18n”) translations, etc. Such visual representations can also be used to quickly communicate information visually during identity exchanges, such as airline mobile boarding passes.

Source: British Airways MediaCentre— Multiple Boarding Passes Land on BA APP

In the example above from British Airways, the pass at the front is for a “Gold” loyalty status member, whereas the pass at the back is for a “standard” loyalty status member. This information can be represented in a Verifiable Credential, of course, but the example here uses the Apple Wallet / Google Wallet formats to overlay a richer display.

While it’s useful to have digital credentials that can be verified cryptographically, the reality is that there are often occasions when a quick “visual check” is done instead. For example, when at an airport, an airline staff member might visually check a mobile boarding pass to direct people to the correct queue they need to join. The mobile boarding pass does get scanned at points like check-in, security, boarding etc., to digitally read the information, other scenarios where this is not done are equally valid. However, most Verifiable Credential formats do not explicitly provide such “human-friendly” forms of showing the data held in a credential.

Documents

More broadly, there are other types of resources that might be relevant for companies beyond SSI vendors, that want a way to represent information about themselves in an immutable and trustworthy way.

Many companies require documentation such as Privacy Policies, Data Protection Policies or Terms of Use to be made publicly available. Moreover, Trust over IP (ToIP) recommends making Governance Frameworks available through DID URLs, which would typically be a text file, a Markdown file, PDF etc.

Logos

Companies may want to provide authorised image logos to display across different websites, exchanges or block explorers. Examples of this include key-publishing sites like Keybase.io (which is used by Cosmos SDK block explorers such as our own to show logos for validators) and “favicons” (commonly used to set the logo for websites in browser tabs).

The current uses for resources are therefore very broad across the SSI ecosystem, and in addition, for other companies that may want to use DIDs to reference relevant information on ledger. For this reason, it is essential that the SSI community strengthens the way that resources are stored, referenced and retrieved in SSI ecosystems.

What are the problems with the way resources are stored?

There are multiple approaches to decentralised identity which rely on centralised infrastructure across different technical layers. Decentralised Identifiers (DIDs): are often stored on ledgers (e.g., cheqd, Hyperledger Indy, distributed storage (e.g., IPFS in Sidetree), or non-ledger distributed systems (e.g., KERI). Yet, DIDs can be stored on traditional centralised-storage endpoints (e.g., did:web, did:git).

Predominantly, however, the issue of centralisation affects resources providing extra context and information to support Verifiable Credentials. These resources, such as schemas and revocation lists, are often stored and referenced using centralised hosting providers.

Using centralised hosting providers to store resources may have a significant difference in the longevity and authenticity of Verifiable Credentials. For example, a passport (which typically has a 5–10 year validity) issued as a Verifiable Credential anchored to a DID (regardless of whether the DID was on-ledger or not) might stop working if the credential schema, visual presentation format, or other necessary resources were stored off-ledger on traditional centralised storage.

This section will therefore explain the pain points that should be addressed to improve the way resources are stored, managed and retrieved in SSI ecosystems.

Single points of failure

Even for highly-trusted and sophisticated hosting providers who may not present a risk of infrastructure being compromised, a service outage at the hosting provider can make a resource anchored on their systems inaccessible.

The high centralisation of cloud providers and history of noteworthy outages clearly demonstrates why we should not host resources on centralised cloud storage in production environments. In Q1 of 2022, the three largest players in the cloud (AWS, Google Cloud, Microsoft Azure) dominated with 65 per cent in nearly all regions (outside of China).

Source: CoinTelegraph “The future of the internet: Inside the race for Web3’s infrastructure’

Furthermore, beyond cloud providers, there are other events that exemplify the issuers relying on larger players. The Facebook outage of 2021 (shown in the graph below) took down apps that used “Login with Facebook” functionality. This highlights the risks of “contagion impact” (e.g., a different Facebook outage took down Spotify, TikTok, Pinterest) of centralised digital systems — even ones run by extremely-capable tech providers.

Source: Why Facebook, Instagram, and WhatsApp All Went Down Today

Ed Skoudis, president of the SANS Technology Institute amusingly commented on this issue:

“In the IT field, we sometimes joke about how we spend 15 years centralizing computing, followed by 15 years decentralizing, followed by another 15 years centralizing again,” he said. “Well, we have spent the past 10 years centralizing again, this time on [the] cloud.”

Likewise, with decentralised identity, there has been excellent work to decentralise, with standards that remove the need for centralised intermediaries — notably around Verifiable Credentials and the decentralised trust provided by DID Authentication. Yet, all of this excellent work may be eroded in practice, unless every component of an SSI ecosystem is able to maintain an equivalent level of decentralised trust. Resources are currently an area that has been centralised for the sake of convenience.

Link Rot

“Link rot” happens when URLs become inaccessible over time, either because the endpoint where the content was stored is no longer active, or the URL format itself changes. The graph below from an analysis by The New York Times shows the degradation over time of URLs.

Source: What the ephemerality of the Web means for your hyperlinks

For this reason, keeping an up-to-date version of the links themselves is crucial. Furthermore, a study of link rot found at least 66.5% of links to sites in the last 9 years are dead. This can have an adverse impact on the digital longevity of Verifiable Credentials if there’s “link rot” in the resources necessary to process the credential. For this reason, projects such as The Internet Archive’s Wayback Machine exist to snapshot digital ephemera before they are lost forever.

Source: Ahrefs Study on Link Rot

This illustrates that link rot can affect a significant proportion of links in a relatively small amount of time, and once again, looking at how resources are currently stored in SSI ecosystems, if the resource locations are moved and the links are broken, the Verifiable Credentials relying on these resources become unusable. Therefore, resources, once defined, should be architected to be used and referenced indefinitely, without being changed.

Tamper-evident changes and censorship resistance

Finally, the centralised way that resources are currently stored and managed is not immutable, and as a result, it is liable to tampering. For example, if a hosting provider is compromised, or if malicious actors are working for the company, resources may be changed and previous resource versions may be purged from the central database.

As we move towards a new web infrastructure with Web 3 (and beyond…), and as more projects leverage blockchain and distributed ledgers, it’s important not to port the previous issues of the web, and instead find novel ways to better manage information, with longevity in mind. This is why at cheqd, we have decided to redesign the way resources are captured on the ledger.

How have we implemented resources on cheqd?

cheqd’s on-ledger resources can be defined as data files, stored in an organised collection and common format on a distributed ledger. As such, these resources will be highly available, resilient to attack and with immutable archives showing expired, revoked or deprecated resources.

Resources, using the same method of referencing may also be stored off-ledger.

When working towards this objective, we laid out a set of requirements we benchmarked our implementation against:

  1. Resources must be referenceable via immutable and uniquely identifiable using Decentralised Identifiers (DIDs).
  2. Resources can be stored on-ledger if they are sufficiently small enough. (If a resource is too large to be stored on-ledger, e.g., an image or video file, they should still be referenceable via their DIDs.)
  3. Resources must be versioned, with each version easily accessible in the future.
  4. Resources can be indexed, to promote reuse of resources.
  5. Existing DID resolvers should be able to either resolve resource URLs or get references to them without significant modification to how they currently function and behave.
  6. There should be an ability to mark resources as deprecated or superseded by new versions.;
  7. On-ledger resources must fit within the existing W3C standards for decentralised identity;
  8. Resources should be assigned a media type, to allow client applications to apply logic to what resources they expect and want to consume.

Is this similar to Hyperledger Indy’s approach to Schemas and CredDefs on-ledger?

We heavily considered the schema implementation used by AnonCreds on Hyperledger Indy in our design phase, since it tackles many of the problems highlighted above. However, the issue we have with schemas and Credential Definitions for AnonCreds is that they are very tightly coupled with Indy-based ledgers. Both schemas and Credential Definitions, require Indy-specific transactions, limiting the interoperability of these Credentials outside of Indy ecosystems.

Our implementation will enable resources on the ledger to be far more interoperable, beyond cheqd, since the architecture does not tie resources to a specific ledger, and builds within the parameters of the W3C DID Core Spec. This means that partners using AnonCreds and proponents of JSON or JSON-LD Verifiable Credentials can benefit from this approach. We have tried to design our architecture as flexible as possible to allow new resource types to be created, without any vendor lock-in.

We will be writing a specific blog post on how cheqd supports AnonCreds and Credential Definitions using its resource architecture.

On-ledger resources on cheqd

To explain the details of how we have structured this, we will start by breaking down the high-level overview diagram shown in figure 1.

Figure 1: High-level architecture flow for resources on-ledger

The diagram above shows multiple layers to this architecture:

Issuer DID Document

This Issuer DID Document is created as per usual. It may be updated to reference a Collection DID Document within the service section, and also may specifically link to a Resource using a service endpoint.

This allows an Issuer to explicitly cite which resources it uses when issuing Verifiable Credentials.

Collection DID Document

This Collection DID Document references an on-ledger Collection, using the unique identifier of its DID URL, which is the same as the Collection ID. It also acts as the keys and gating mechanism for controlling, updating and managing on-ledger resources within that Collection.

The same verification methods listed in this DID Document are used to authenticate with and manage the resources within the collection it refers to.

Resource Collection

Resource Collections are a way of organising resources into different versions or media types. This enables new resources to be added to a collection, and old resources to be indexed and still be retrievable through querying a collection by version time.

The Collection ID is the same as the unique identifier of the Collection DID Document’s DID URL and ‘id’.

Resources

The resource contains the actual data of the resource itself, including its name and media type. A resource is directly retrievable through the service endpoints of both the Collection DID Document, and optionally within an Issuer DID Document.

The full architecture including specific layer-by-layer details about how each component references and links to the other can be found in the Resources section of our Identity Documentation.

Referencing resources with DIDs

We decided to identify and reference each ‘resource’ with its own unique identifier, within a ‘collection’ tied to a DID. This enables us to reference a resource in the following way:

Figure 2: resource configuration via DIDs

Each Collection and Resource is identified with its own Universally Unique Identifier (UUID). However, the Resource Collection ID is also the same as the unique identifier of the DID that controls the collection.

Utilising the ‘service’ section

We decided to reference ‘resources’ by using the ‘service’ section, rather than creating a new section in a DIDDoc for multiple reasons:

  1. While the DID Core spec technically allows creating new sections, most client apps expect the specific default/minimum list, and would not know how to handle the contents within a new section.
  2. Service Types are already designed to be extended. It is a well-trodden and well-recognised part of DID Documents. For example, the DID Spec Registries currently list two service types: LinkedDomains and DIDCommMessaging

We used the LinkedDomains service type in our first DID to directly reference an image hosted on IPFS using a DID.

{
"id": "did:cheqd:mainnet:zF7rhDBfUt9d1gJPjx7s1JXfUY7oVWkY#non-fungible-image",
"type": "LinkedDomains"
"serviceEndpoint": "https://gateway.ipfs.io/ipfs/bafybeihetj2ng3d74k7t754atv2s5dk76pcqtvxls6dntef3xa6rax25xe",
}

If you look up the IPFS link above through any valid IPFS gateway, you’ll find our Data Wars poster.

Likewise, we are using the ‘service’ section of DID Documents to reference specific resources:

{
“id”: "did:cheqd:mainnet:46e2af9a-2ea0–4815–999d-730a6778227c#DegreeLaw"
"type": "CL-Schema",
"serviceEndpoint": "https://resolver.cheqd.net/1.0/identifiers/did:cheqd:mainnet:46e2af9a-2ea0-4815-999d-730a6778227c/resources/688e0e6f-74dd-43cc-9078-09e4fa05cacb"
}

Through linking to the resource this way it is highly accessible and easily consumable for client applications, DID resolvers and developers. Furthermore, if a client app doesn’t understand a service section, most of them skip and ignore it rather than throwing an error and causing the system to fail, potentially catastrophically.

Creating and retrieving a resource

In order to create a resource on the ledger, the following steps laid out in figure 2 should be followed:

Figure 3: Creating and retrieving a resource on cheqd

Writing resources to the ledger

Create Collection DID Document

Anchor Collection DID and associated Collection DID Document to the ledger through a create DID operation

Create Resource

Anchor Resource to the ledger and specify Collection ID as the same identifier as the unique identifier from the Collection DID Document. Sign createResource transaction with the same private key as the verification method listed in the DID Document

Update Collection DID Document

Update Collection DID Document and reference the Resource within the ‘service’ section

This Collection DID Document and Resource may also be referenced within an Issuer DID Document, as shown in figure 1.

Retrieving resources from cheqd ledger

Query ledger for Resource

Through referencing resources using DIDs, as explained above, it makes it far easier to query historic or deprecated resources using DID resolvers and DID URL dereferencing,

For example, the following request could be made to a resolver to fetch a resource from a specific point in time:

https://resolver.cheqd.net/1.0/identifiers/did:cheqd:mainnet:46e2af9a-2ea0-4815-999d-730a6778227c#degreeLaw?versionTime=2022-06-20T02:41:00Z

This may be incredibly powerful where a resource, such as a schema, has evolved over time, but you want to prove that it was issued using the correct schema at the point of issuance.

Return Resource

Full Resource is returned including any data files attached.

Tutorials for creating a resource on-ledger can be found here on our identity documentation site. Further technical detail about creating resources can be found in our Architecture Decision Record 008.

How this improves the way resources are stored and retrieved

Through storing resources on ledger, referencing them through resolvable DID URLs, and authenticating them using DID Documents, the resources on-ledger will be:

Highly available and easily retrievable

Resources are identified by a DID URL which allows them to be retrieved easily from a distributed ledger using existing DID Resolvers.

Using a distributed ledger like cheqd to store and index resources removes the problem identified by centralised systems creating single points of failure, such as schema.org.

Schemas, for example, would therefore become on-ledger resources, represented in the format of the following example:

Resource1
{
"header": {
"collectionId": "46e2af9a-2ea0–4815–999d-730a6778227c",
"id": "688e0e6f-74dd-43cc-9078–09e4fa05cacb",
"name": "DegreeLaw",
"resourceType": "CL-Schema",
"created": "2015–02–20T14:12:57Z",
"checksum": "a7c369ee9da8b25a2d6e93973fa8ca939b75abb6c39799d879a929ebea1adc0a",
"previousVersionId": null,
"nextVersionId": "0f964a80–5d18–4867–83e3-b47f5a756f02",
}
"data": "<CLSchema.json containing ‘{\"attrNames\":[\"last_name\",\"first_name\"\"degree_type\"\"graduation_year\"\"degree_percentage\"]}>”
}

This schema could be resolved through a DID Resolver, with an input such as the following:

did:cheqd:mainnet:46e2af9a-2ea0–4815–999d-730a6778227c#degreeLaw?versionTime=2015–09–08T02:41:00Z

You can dive into further detail on the syntax of resources and how they can be retrieved within the Resources section of our Identity Documentation.

Controllable and self-attestable

Resources can be tied to DID Documents and control over resources can be exerted via the same verification method keys as those written into an associated DID Document.

This allows persons to authenticate with a DID Document to update or prove control of a Resource, which addresses the problem of tamper-proofing identified around centralised cloud providers.

Built to be consumed by client applications

Resources must specify a name, and resource type and compile into a media type, which provides specific information to any client application about what data format and syntax the data are expected to be retrieved in.

This allows client applications to apply business rules to what types of resources are expected, and which should be accepted, making resources far easier to be consumed by third-party software. This differs from existing Hyperledger Indy resources, which require the client applications to be familiar with Indy in order to process and consume Indy resources.

Conversely, our method gracefully allows “dumb” applications, that do not understand specific DID protocols, to still fetch and access a resource over HTTP/REST APIs. In addition, “smart” applications, that do understand these protocols, can process, query, and get their own resources from DID resolution and dereferencing.

Indexable

Resources are versioned with unique identifiers (UUIDs), allowing previous versions to be archived within a collection, and retrieved through querying a unique ID or a version time.

This mitigates the problem identified of link rot when using centralised storage systems since each version is archived immutably.

For more information, please refer to the section on Versioning and Archiving Resources in our identity documentation.

Conclusion

In building resources on-ledger, we want to avoid the risks associated with relying on centralised infrastructure for storing resources, while importantly, remaining conformant with W3C standards and avoiding ledger lock-in.

We believe that we have achieved this compromise by enabling the management of resources through the use of DID Documents, identifying resources using DID URLs and retrieving resources using existing DID Resolvers.

Not only does this work solve existing problems, but it opens the door for far more innovation using DIDs, including:

  • Fully identifiable and resolvable governance documentation and schemes on ledger, tied to DIDs and DID Documents
  • Full AnonCreds support on non-Indy ledgers (to be explained further in a future blog)
  • On-ledger revocation lists, where each tails file can be uniquely versioned and retrieved efficiently (this is our next priority roadmap item)
  • Logos and company information easily accessible on-ledger, referenced within that company’s DID

We look forward to working with our partners and the broader SSI community to see how we can innovate together using this new functionality, and properly securing SSI architecture in a decentralised end-to-end flow.

As always, we’d love to hear your thoughts on our writing and how resources on-ledger could improve your product offering. Feel free to contact the product team directly — product@cheqd.io, or alternatively start a thread in either our Slack channel or Discord.


Our Approach to Resources on-ledger was originally published in cheqd on Medium, where people are continuing the conversation by highlighting and responding to this story.