Origin of Verifiable AI: The Interplay of AI and Verifiable Credentials

This is the first article in a series of five.

Building Trust in a New World

More people are voting in 2024 than in any other year in history, yet our ability to trust the news, pictures, videos, and opinions we see online is at a nadir, helped along by the advent of near photo-realistic and well-written Generative AI. What do you trust if you can't be sure whether you are speaking to a relative or a scammer, or whether a picture is genuine?

Although this problem has rushed upon us, that is not to say we lack the tools to begin combating it. AI should require its own set of digitally verifiable credentials, ensuring that every step in the information supply chain leaves an imprint which can be checked in a zero-knowledge, verifiable way. By using verifiable credentials, we can embed trust into the entire information supply chain: we understand data provenance, we know the hardware that trains a model is uncompromised and high-quality, and we know whether what we are looking at is of AI origin.

In a digital world where soon 90% of the material we consume may be AI-generated, we need verifiable AI (vAI) to be able to trust the information in front of our eyes. Trust is quickly becoming the most important currency we have. In this article, we delve into what 'verifiable AI' may look like: the interplay between AI and verifiable credentials, and how their use can bring on the advent of Trusted Data; the AI information supply chain; and how verifiable credentials add value along that chain.

What is Trust?

Trust is one of the social dynamics that underpins society. Being able to rely on a person or a business to do as we expect enables vastly more enterprise to occur, as we entrust countless tasks to others to make our lives smoother. Companies outsource many elements of their business, from cloud storage to computing power and operating systems, and must rely upon trusted suppliers to deliver the correct materials or services for their own business to function. With high levels of trust, ideally backed by an enforceable set of laws, we can rely on others to do their part in mutually beneficial interactions. Playing by the established rules of each area of life allows money and goods to move faster, letting us outsource more and specialise further.

“Trust” is, of course, earned. We learn to trust others by observing their behaviour, by understanding their reputation amongst their peers and by observing their incentives – what they stand to win or lose from an interaction. This is why reviews on Google, Trustpilot and Airbnb matter so much to business owners – we can shortcut the process of building trust by having others verify someone's trustworthiness. Trust is not only a key resource but a rare one: something gained over time, yet lost in a moment.

Yet how do we maintain trust across society, and even within companies and families, when its very basis is being undermined? How can we know that a restaurant review is genuine and not AI-generated? Verifying that you are a real person and that your data can be trusted will become of utmost importance for every layer of society. The need for trust goes deeper still: as AI becomes more important to society and to decision-making in general, the entire supply chain of information will need a greater layer of trust and verification if we are to maintain confidence in the actions, data integrity, and models of Artificial Intelligence. Trusted Data and Trusted Data Markets will be essential for a future dominated by AI.

Verifiable Credentials - Creating Trusted Data

Verifiable Credentials and Decentralised Identifiers are key technologies in the emerging Self-Sovereign Identity industry, which should bring a major change to the way we interact online in the coming years, especially as the EU's eIDAS2 legislation matures. Zero-knowledge proofs enable the transfer of packets of Trusted Data without users needing to overshare, and without enterprises needing to store that data securely themselves. You might, for example, hold a Reusable KYC credential, with a cryptographic signature on-chain that can be referenced back to your country's driver's licence authority. By creating faster methods of verification which protect user information, Trusted Data can be generated without expensive cross-referencing, massively reducing the effort needed to establish enough trust for high-value interactions while also speeding up low-value ones.
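As a sketch of what such a packet of Trusted Data looks like, here is a minimal Reusable KYC credential in the shape of the W3C Verifiable Credentials data model. The DIDs and claim names below are illustrative placeholders, not a real schema.

```typescript
// A minimal sketch of a Reusable KYC credential following the W3C
// Verifiable Credentials data model. All DIDs and claim names are
// illustrative placeholders, not a real schema.
interface VerifiableCredential {
  "@context": string[];
  type: string[];
  issuer: string;    // DID of the issuing authority
  issuanceDate: string;
  credentialSubject: Record<string, unknown>;
  proof?: Record<string, unknown>; // the issuer's cryptographic signature
}

const reusableKyc: VerifiableCredential = {
  "@context": ["https://www.w3.org/2018/credentials/v1"],
  type: ["VerifiableCredential", "ReusableKYCCredential"],
  issuer: "did:cheqd:mainnet:licence-authority-demo", // hypothetical authority DID
  issuanceDate: "2024-01-15T09:00:00Z",
  credentialSubject: {
    id: "did:example:holder-1234", // the holder's own DID
    over18: true,                  // disclose the claim, not the birth date
    documentVerified: true,
  },
};
console.log(reusableKyc.type.join(", "));
```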

The Trust Triangle - Making Verifiable Credentials Work

Verifiable Credentials work because of a concept called the 'Trust Triangle', which broadly describes any digital interaction involving Trusted Data. Its importance stems from its ability to decentralise trust: the verifier does not need to contact the issuer directly to confirm that attestations are correct. Each interaction involves three actors.

The Holder: An actor 'holding' a credential who may need to prove something. They hold their credentials inside a wallet.
The Issuer: An actor who 'issues' the credential in the first place. It is their Decentralised Identifier, or 'signature', written onto the cheqd network for cross-referencing, that shows the credential is valid.
The Verifier: An actor who 'verifies' that the credential is trustworthy by checking that the issuing party is trustworthy and valid.

  1. The Issuer registers a Decentralised Identifier (DID) on the cheqd network. DIDs act as signatures which prove that a Verifiable Credential has been signed by an Issuer. A version of this signature is written onto the cheqd network and can be cross-referenced.
  2. The Issuer now 'signs' a Verifiable Credential with their DID and sends it to the wallet of the Holder.
  3. This credential can be considered a 'packet' of Trusted Data, which any Verifier should be able to trust. The Holder has complete control over who can view their data and whom they send it to. Neither the Issuer nor the Verifier holds this information.
  4. The Holder sends their Trusted Data, in the form of a Verifiable Credential, to the Verifier when asked to prove an attestation.
  5. The Verifier can then cross-reference the DID attached to the Verifiable Credential against what is written onto the cheqd network and establish whether this is a trusted issuer (a runnable toy of these five steps is sketched below).
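To make the five steps concrete, here is a runnable toy of the flow in TypeScript: an in-memory map stands in for the cheqd network and raw Ed25519 keys stand in for DID documents. Every name is illustrative; a real implementation would use DID resolution and standard VC proof formats.

```typescript
// A runnable toy of the five steps above. An in-memory map stands in for
// the cheqd network, and raw Ed25519 keys stand in for DID documents.
import { generateKeyPairSync, sign, verify } from "node:crypto";
import type { KeyObject } from "node:crypto";

// Step 1: the Issuer registers a DID; the "ledger" records its public key.
const ledger = new Map<string, KeyObject>();
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const issuerDid = "did:cheqd:testnet:issuer-demo"; // hypothetical DID
ledger.set(issuerDid, publicKey);

// Step 2: the Issuer signs a credential and sends it to the Holder's wallet.
const credential = { issuer: issuerDid, subject: "did:example:holder", over18: true };
const payload = Buffer.from(JSON.stringify(credential));
const signature = sign(null, payload, privateKey);

// Step 3: only the Holder's wallet stores the signed credential.
const holderWallet = { credential, signature };

// Steps 4 and 5: the Holder presents it; the Verifier looks up the Issuer's
// DID on the "ledger" and checks the signature, never contacting the Issuer.
const issuerKey = ledger.get(holderWallet.credential.issuer);
const trusted =
  issuerKey !== undefined &&
  verify(null, Buffer.from(JSON.stringify(holderWallet.credential)), issuerKey, holderWallet.signature);
console.log(trusted ? "Credential signed by a known issuer" : "Untrusted credential");
```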

The New Information Supply Chain

Artificial Intelligence is the result of an information supply chain: an entire lifecycle of data and processes involved in creating, training, deploying and using AI systems. Within this are several stages, from the collection of raw data and the training of algorithms to the deployment of AI models and the generation of inference (the actual use of the AI by an end-user). Take a car as an analogy: multiple raw resources (Data) are turned via different industrial processes in different places into working components (Training), which, when put together correctly, create a product that consumers can drive and use (Inference Generation).

Companies spend millions every year on optimising their supply chain management, and, given AI's capacity not just to influence decision-making but to become involved in it, it stands to reason that tracking the information supply chain will become of the utmost importance to ensuring the integrity of any LLM.

How Verifiable Credentials Add Trust to the AI Information Supply Chain

Verifiable credentials can play a crucial role in adding value to the AI information supply chain by enhancing transparency, accountability, and security at multiple levels.

1. Data Collection and Provenance

Data quality is key to running good models: hallucinations are more likely to occur when an AI is trained on other AI-generated data, and the hard truth is that LLMs require huge, high-quality datasets to obtain any kind of utility – an AI is only as good as the quality and quantity of its data. Moreover, models trained on data which is the intellectual property of others may create huge legal issues, and as competing models continue to emerge, knowing that the model you are about to use is compliant with all laws and regulations will be crucial.

Verifiable credentials can be attached to data at the point of collection, or to specific datasets, establishing a secure and tamper-proof record of their origin and characteristics. This ensures that stakeholders can trace the provenance of the data, verifying its authenticity and quality. By having credentials associated with each data source, users can gain insight into whether the data is reliable, ethically sourced, legally compliant and suitable for training AI models.
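As an illustration, a provenance credential attached to a dataset might carry claims like the following. The field names (checksum, licence, collection method) are assumptions for this sketch, not an established schema.

```typescript
// An illustrative provenance credential attached to a dataset at collection
// time. Field names are assumptions, not an established schema.
interface DatasetProvenanceCredential {
  type: string[];
  issuer: string;    // DID of the data collector or custodian
  issuanceDate: string;
  credentialSubject: {
    datasetId: string;
    sha256: string;           // checksum makes tampering detectable
    source: string;           // where the raw data came from
    licence: string;          // legal basis for using it in training
    collectionMethod: string;
  };
}

const provenance: DatasetProvenanceCredential = {
  type: ["VerifiableCredential", "DatasetProvenanceCredential"],
  issuer: "did:cheqd:mainnet:data-custodian-demo",
  issuanceDate: "2024-02-01T00:00:00Z",
  credentialSubject: {
    datasetId: "urn:example:dataset:clinical-notes-v2",
    sha256: "9f2b4e…", // truncated placeholder checksum
    source: "consented first-party records",
    licence: "CC-BY-4.0",
    collectionMethod: "opt-in collection with documented consent",
  },
};
console.log(provenance.credentialSubject.datasetId);
```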

2. Model Training

As Artificial Intelligence evolves, different AI models will be trained on different datasets and have different characteristics in their final output. For example, most AI models should go through bias detection and mitigation training; however, different users are likely to want different levels of bias training, so models require labelling that lets users make an informed choice for their exact needs.

3. Model Deployment

If Verifiable Credentials are used in data collection, dataset selection and algorithm labelling, they can extend to the deployment phase, confirming that the AI model being used is the result of a legitimate and unbiased training process. This includes credentials for the model architecture, hyperparameters and any other relevant details. Users can verify that the deployed model aligns with ethical considerations, regulations and industry standards, fostering trust in the model's decision-making capabilities and allowing users to make a better-informed choice about the best model to use.
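A sketch of a model credential tying these two stages together might combine training labels (such as the bias-mitigation level from the previous section) with deployment details like architecture and hyperparameters. Every field name here is an assumption for illustration.

```typescript
// A sketch (not a real schema) of a model credential combining training
// labels with deployment details. Every field name here is an assumption.
const modelCredential = {
  type: ["VerifiableCredential", "AIModelCredential"],
  issuer: "did:cheqd:mainnet:model-lab-demo", // hypothetical training lab DID
  issuanceDate: "2024-03-10T00:00:00Z",
  credentialSubject: {
    modelId: "urn:example:model:diagnosis-net-1.2",
    architecture: "transformer-encoder",
    hyperparameters: { layers: 24, learningRate: 3e-4 },
    // Links back to the provenance credentials of the training data
    trainingDatasets: ["urn:example:dataset:clinical-notes-v2"],
    // Label from bias detection and mitigation training, so users can
    // choose the level that suits their needs
    biasMitigationLevel: "standard",
  },
};
console.log(JSON.stringify(modelCredential, null, 2));
```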

4. Hardware Integrity

Verifiable credentials also play a critical role in ensuring the integrity of the hardware used in AI infrastructure. By attaching credentials to hardware components such as GPUs, TPUs or Trusted Execution Environments (TEEs), stakeholders can verify that the hardware is free from compromise, operates correctly and meets the required specifications, without having to spend large amounts of time auditing it – something especially hard in a decentralised computing environment, where the network is not necessarily working with trustworthy entities. This is essential for mitigating the risks of compromised, faulty, geographically distant or sub-standard hardware that could impact the performance of AI models.
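For instance, a compute scheduler in a decentralised network might gate nodes on a hardware credential along these lines. The credential fields and the auditor allow-list are assumptions for illustration.

```typescript
// A sketch of gating compute nodes on a hardware credential in a
// decentralised network. Fields and the allow-list are assumptions.
interface HardwareCredential {
  issuer: string; // DID of the hardware auditor or vendor
  credentialSubject: {
    nodeId: string;
    deviceType: "GPU" | "TPU" | "TEE";
    attestationPassed: boolean; // outcome of a firmware/TEE attestation
    region: string;             // for screening geographically distant nodes
  };
}

const TRUSTED_AUDITORS = new Set(["did:cheqd:mainnet:hw-auditor-demo"]);

function acceptNode(cred: HardwareCredential, allowedRegions: Set<string>): boolean {
  // Accept only hardware vouched for by a known auditor, whose attestation
  // passed, and which sits in an acceptable region.
  return (
    TRUSTED_AUDITORS.has(cred.issuer) &&
    cred.credentialSubject.attestationPassed &&
    allowedRegions.has(cred.credentialSubject.region)
  );
}
```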

5. Zero-Knowledge Verification

Verifiable credentials enable zero-knowledge verification, allowing stakeholders to validate claims without disclosing sensitive information. In the context of AI, this means that the quality and accuracy of algorithms can be verified without revealing proprietary training data or methodologies. This privacy-preserving approach enhances the security of the information supply chain, assuring users that their data is handled responsibly.
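The shape of such an exchange might look as follows: the verifier asks for a predicate (say, benchmark accuracy of at least 0.9) and receives an opaque proof instead of the underlying data. The proof check below is a stand-in; a real system would verify a ZKP produced by a scheme such as BBS+ signatures or AnonCreds predicates.

```typescript
// A sketch of the shape of a zero-knowledge exchange. The check below is a
// stand-in: a real system verifies an actual ZKP (e.g. BBS+ or AnonCreds).
interface PredicateRequest {
  attribute: string; // e.g. "benchmarkAccuracy" – never revealed directly
  operator: ">=";
  threshold: number;
}

interface PredicateProof {
  request: PredicateRequest;
  proofBlob: string; // opaque cryptographic proof (placeholder here)
  issuerDid: string; // whose signed credential the proof is derived from
}

// Stand-in for a real cryptographic verification call.
function verifyPredicateProof(p: PredicateProof): boolean {
  return p.proofBlob.length > 0; // a real check validates the ZKP itself
}

const request: PredicateRequest = { attribute: "benchmarkAccuracy", operator: ">=", threshold: 0.9 };
const response: PredicateProof = {
  request,
  proofBlob: "zkp-bytes-here",
  issuerDid: "did:cheqd:mainnet:eval-lab-demo",
};
console.log(verifyPredicateProof(response) ? "Claim holds" : "Rejected");
```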

6. Content Credentials

As citizens, we deserve to know where the content we consume comes from, especially now that it is much harder to distinguish between Generative AI and reality. VCs are the perfect tool for this thanks to their selective disclosure and privacy, self-storage of data, tamper-protection and interoperability between different systems.

In fact, they have already found a use here as the technological basis behind the C2PA, an alliance of companies including Adobe, Microsoft, Arm and Intel that is building a new set of industry standards leveraging Verifiable Credentials. The aim is to improve image and video provenance with a recorded supply chain going all the way back to the camera which took the picture. This transparency allows individuals to differentiate between human-generated and AI-generated content, fostering a better understanding of the sources of information and the potential biases associated with AI-driven insights.

7. Proof of Personhood

As AI-generated picture and video quality improves, it has become increasingly hard to tell whether we are speaking to a human or a bot. A blue tick costing $8 a month is not enough to trust that you are speaking to a human. Proof-of-Personhood will become increasingly important if we are to rebuild trust in online spaces and prevent sybil attacks, such as a single person spinning up multiple wallets to receive an outsized airdrop intended for individuals. Verifiable Credentials give individuals the ability to prove their personhood without compromising their personal data or completing a complex captcha, whilst giving organisations a simple way to reduce the number of bots using their services without extensive cross-referencing or the increased bounce rates caused by captchas.
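One common pattern for sybil resistance, sketched below under assumed names, is a per-service "nullifier": a one-way hash that lets a service reject a second presentation by the same person without learning who they are or linking them across apps.

```typescript
// A sketch of sybil-resistant personhood checking. Each presentation yields
// a one-way "nullifier" scoped to this service, so duplicates are rejected
// without identifying the person. All names here are illustrative.
import { createHash } from "node:crypto";

const TRUSTED_PERSONHOOD_ISSUERS = new Set(["did:cheqd:mainnet:personhood-demo"]);
const seenNullifiers = new Set<string>();

interface PersonhoodPresentation {
  issuerDid: string;    // who attested that this is a unique human
  credentialId: string; // stable, private identifier inside the credential
}

function admit(p: PersonhoodPresentation, serviceId: string): boolean {
  if (!TRUSTED_PERSONHOOD_ISSUERS.has(p.issuerDid)) return false; // unknown issuer
  // Scoping the hash per service prevents linking the same person across apps.
  const nullifier = createHash("sha256")
    .update(`${p.credentialId}:${serviceId}`)
    .digest("hex");
  if (seenNullifiers.has(nullifier)) return false; // second wallet, same human
  seenNullifiers.add(nullifier);
  return true;
}
```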

8. Proof of Permission

As AI becomes more complex, 'AI agents' will become more common. These are artificial intelligence systems which can not only write text but complete tasks for their owners, such as conducting internet research, enacting trading plans or writing complex code. AutoGPT was an early version of this, capable of completing much more complex tasks than ChatGPT, whilst Fetch.ai and Autonolas represent web3 examples of AI agents. As these improve, we will likely want agents to carry out important, authorised tasks that require 'Proof of Permission' from us, such as doing our taxes, trading on exchanges or conducting research behind paywalls. Verifiable credentials present a perfect use-case here: by creating zero-knowledge Trusted Data, they would let the right agents through with the right permissions, smoothly, without a hugely complex verification operation or large data collection on participants.
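A minimal sketch of what such a Proof-of-Permission check could look like, with hypothetical scope names and fields:

```typescript
// A sketch of a 'Proof of Permission' credential issued by an owner to an
// AI agent, and the scope check a service might run. Scope strings and
// field names are hypothetical.
interface PermissionCredential {
  issuer: string; // the owner's DID
  credentialSubject: {
    agentDid: string;  // the agent acting on the owner's behalf
    scopes: string[];  // granted actions, e.g. "tax:file"
    expires: string;   // ISO 8601 expiry
  };
}

function agentMay(cred: PermissionCredential, agentDid: string, scope: string): boolean {
  return (
    cred.credentialSubject.agentDid === agentDid &&
    cred.credentialSubject.scopes.includes(scope) &&
    new Date(cred.credentialSubject.expires) > new Date()
  );
}

const permission: PermissionCredential = {
  issuer: "did:example:owner",
  credentialSubject: {
    agentDid: "did:example:agent-7",
    scopes: ["research:paywalled", "tax:file"],
    expires: "2030-01-01T00:00:00Z",
  },
};
console.log(agentMay(permission, "did:example:agent-7", "trade:execute")); // false – never granted
```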

In summary, verifiable credentials enhance the information supply chain within AI by establishing a secure, transparent, and accountable framework at many points. They contribute to building trust in AI systems by ensuring the integrity of data, algorithms, and hardware components throughout the entire lifecycle, ultimately promoting responsible and ethical AI practices.

Verifiable AI in Practice: Healthcare AI Model Training

The potential benefits of AI models are enormous across many industries. One example is healthcare, where AI models are able to notice patterns that may completely pass healthcare workers by. However, healthcare is a highly regulated industry, meaning the training of any AI model must take into account regulations around patient data privacy, ethics and security. Ensuring a healthcare AI model meets these criteria is essential, as no one will use a non-compliant model in this industry. Below we lay out how the Trust Triangle and Verifiable Credentials function to ensure trust and compliance are built into the model from the start.

  1. Holder (Data Provider)
  • The holder in this scenario is a healthcare organisation that possesses a vast dataset containing patient records, including medical histories, diagnoses, and treatment outcomes.
  • The healthcare organisation wishes to contribute its data to train an AI model for predicting disease outcomes while maintaining the privacy and security of patient information.
  2. Issuer (Data Custodian or Aggregator)
  • The issuer is a trusted entity responsible for aggregating and managing datasets from various healthcare organisations. It acts as a custodian of the data and plays a crucial role in ensuring data privacy and integrity.
  • Before contributing the healthcare data to the AI training process, the issuer issues verifiable credentials for the data, attesting to its quality, compliance with privacy regulations, and the ethical sourcing of information. These credentials could include information about the data's origin, anonymisation processes, and adherence to relevant standards.
  3. Verifier (AI Model Developer or Trainer)
  • The verifier is the entity or organisation responsible for developing and training the AI model. This could be a research institution, a tech company, or any entity involved in AI development.
  • When receiving the healthcare data for model training, the verifier uses the verifiable credentials provided by the issuer to assess the quality, authenticity, and ethical considerations of the data. The verifier can verify claims about the data without accessing the detailed patient records, ensuring compliance with privacy regulations and maintaining the confidentiality of sensitive information (a minimal code sketch of this check follows below).
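Here is a minimal sketch of the verifier's gate in this scenario, under assumed claim names rather than any real healthcare schema:

```typescript
// A minimal sketch of the verifier's decision. Claim names are assumptions,
// not a real healthcare schema; the trusted-issuer set is illustrative.
interface DataCredentialClaims {
  anonymised: boolean;       // anonymisation process was applied
  consentObtained: boolean;  // data collected with proper consent
  regulation: string;        // privacy regime the data claims to satisfy
  qualityScore: number;      // 0..1, as attested by the custodian
}

const TRUSTED_CUSTODIANS = new Set(["did:cheqd:mainnet:health-custodian-demo"]);

function admitDataset(issuerDid: string, claims: DataCredentialClaims): boolean {
  // Trust triangle in action: check who signed, then check the claims,
  // all without ever reading an individual patient record.
  return (
    TRUSTED_CUSTODIANS.has(issuerDid) &&
    claims.anonymised &&
    claims.consentObtained &&
    ["GDPR", "HIPAA"].includes(claims.regulation) && // assumed accepted regimes
    claims.qualityScore >= 0.8                       // assumed quality bar
  );
}
```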

Adding Trust through Verifiable Credentials

  1. Data Provenance and Quality
  • Verifiable credentials issued by the data custodian attest to the provenance and quality of the healthcare data. This enhances the trustworthiness of the data used for AI model training, as the verifier can be confident that the data originates from reliable sources and adheres to established standards.
  2. Privacy Preservation
  • Verifiable credentials enable zero-knowledge verification, allowing the verifier to confirm the credentials' claims without compromising the privacy of individual patient records. This ensures that sensitive healthcare information remains confidential while still providing assurances regarding data quality.
  3. Ethical AI Practices
  • The use of verifiable credentials promotes ethical AI practices by allowing the verifier to ensure that the data used in model training adheres to ethical standards and regulations. This includes verifying that the data has been collected responsibly, with proper consent and anonymisation measures in place.

In this example, the trust triangle of holder, issuer, and verifier, facilitated by verifiable credentials, adds significant value to the AI supply chain. It establishes a secure and transparent framework for sharing and utilising sensitive data, fostering trust among stakeholders and promoting responsible AI development practices.

Conclusion

As the artificial intelligence landscape continues to evolve, the importance of Verifiable AI will grow exponentially. The integration of Verifiable Credentials with Artificial Intelligence will create new opportunities to develop trust within the information supply chain. Verifiable Credentials serve not only to authenticate each step in this supply chain – from data provenance to hardware integrity – but also to confirm the AI origin of content, helping to embed trust at every level. Ultimately, for society to function and for trust to grow, we need technological solutions which allow the sharing and production of Trusted Data, and Verifiable Credentials should be a part of the solution.

Contact Us

Are you a team member or community member of an AI project that you think could use Verifiable Credentials and Decentralised Identifiers? We are always happy to have a chat! Contact us, or get your favourite team to contact us, at [email protected]!

Bringing On-chain Reputation Scores to Creds with DSID

cheqd partners with DSID to bring a new form of “reputation score” to its collectors

We’re excited to announce a new partnership with DSID (Digital Social ID), a user-centric identity management solution that enables people to build a measurable online reputation.

For businesses, DSID acts as a rating agency for blockchain accounts, with vast datasets on users based on their on-chain activities and off-chain integrations. DSID is active in domains such as credit scoring, social media and professional reputation, as well as gaming reputation.

How does it work?

Users connect their wallets to the DSID Hub, and DSID's scoring algorithms initially analyse all available on-chain data. Within seconds, a preliminary score is created. Users can then enrich the score by proactively integrating additional data feeds from across Web3 and Web2. The variables that compose the score, and the score itself, are stored off-chain as verifiable credentials and verified on a ledger. They can be requested by verifiers on demand to learn more about users connecting to their apps or services (a sketch of what such a credential might look like follows below).
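Purely as an assumption about the shape, and not DSID's actual schema, a reputation score stored off-chain as a verifiable credential might look like this:

```typescript
// Purely an assumption about shape, not DSID's actual schema: a reputation
// score stored off-chain as a verifiable credential might look like this.
const dsidScoreCredential = {
  type: ["VerifiableCredential", "DSIDReputationScore"],
  issuer: "did:cheqd:mainnet:dsid-demo", // hypothetical DID for DSID
  issuanceDate: "2024-04-01T00:00:00Z",
  credentialSubject: {
    id: "did:example:wallet-holder",
    score: 742,                            // the aggregate reputation score
    inputs: ["ethereum:on-chain-history"], // feeds the score was built from
    tags: ["frequent DEX trader"],         // off-chain interpretation, kept as VC claims
  },
};
console.log(dsidScoreCredential.credentialSubject.score);
```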

The DSID score will evolve over time, at first being informed by a user’s connected Ethereum accounts, with Solana, Cosmos and other protocols and chains to follow. 

What will this look like for Creds collectors?

Creds enables users to build portable reputations across platforms through verifiable credentials that prove identity and showcase achievements and participation. Unlike NFTs and SBTs, Creds are private and revocable credentials that allow users to selectively showcase their reputations across communities. Creds uses on-chain identifiers for security while keeping user data off-chain. Projects can leverage Creds to reward users with exclusive benefits based on their credentials and reputation.

We’ll soon be exposing our issuance APIs, offering a seamless process for developers to build and issue verifiable credentials. DSID is one such partner, and will issue credentials for a range of reputation scores based on on-chain activity.

With DSID reputation scores on offer, users will be able to start building their reputation from their on-chain interactions. This is the first step in our pursuit of a blended and unified on- and off-chain reputation, so users can prove who they are and what they offer in a more seamless and coordinated way, enabling them to access rewards such as airdrops, gain access to credit, and more.

How will the aggregated data be handled so that user privacy is protected?

We want users to feel confident in their online activities, and that begins with maintaining the utmost respect for privacy across all our products and those we partner with.

Creds follows a zero-compromise approach to privacy, meaning that users’ personal information and on-chain data are handled with the highest level of discretion. Our systems are fortified with encryption technologies and decentralised architecture, mitigating the risk of unauthorised access and ensuring that your on-chain interactions remain secure. Your data remains exclusively under your control.

With DSID, given that all raw on-chain data is already publicly available, the initial data is not at risk. However, as soon as DSID starts to interpret that data and creates tags such as "frequent DEX trader" – an analysis derived from on-chain behaviour – those tags are stored off-chain in a VC-based SSI framework. The same goes for all off-chain integrations, meaning any specific data or analysis about a user is protected with the same privacy standards we apply at cheqd.

Stay tuned for a future where reputations are effortlessly demonstrated and universally acknowledged.