
Harnessing Verifiable AI to Defend Against Deepfakes

This is the third article in a series of five.

Reestablishing Trust with Content Credentials

In October 2023, Slovakia went to the polls to elect a new government. Just days before the vote, an audio recording purporting to capture Michal Šimečka, leader of the liberal Progressive Slovakia party, discussing with a journalist how they planned to rig the election spread like wildfire through social media apps like Telegram. The audio was immediately denounced as fake by both Šimečka and the journalist, but it circulated during the final two days before the vote, a moratorium period in which politicians were not allowed to speak to the press, and the damage was done. With few in-app tools available to consumers to verify the accuracy of what they saw online, only those willing to leave their platform of choice and actively check against other news sources would have been able to confirm that the recording was fake. In a very close-run campaign, the election swung in favour of the rival party, and the deepfake may well have contributed to that shift.

With more than 64 countries around the world going to the polls this year, misinformation produced with Generative AI may unfortunately play a huge part in the future of democratic states. We are quickly reaching a point where we can no longer believe what we see with our eyes and hear with our ears. Deepfakes have the potential to directly impact society in a multitude of ways. Photoshop may have existed for a long time, but the ability to create misinformation at scale has only just arrived. Trust is quickly becoming the most important currency we have. In a digital world where 90% of the material we consume may soon be AI-generated, we need to be able to trust the information in front of our eyes.

Could you tell if this latest video version was real or AI-generated?

Although this problem has rushed upon us, that is not to say we lack the tools to begin combating it. It is a problem that some of the biggest companies in the world have been thinking about for some time. By using Content Credentials, a form of Verifiable Credentials, we can embed trust within the metadata of the content we see online, so we know where an image or video was taken, who took it (whilst accounting for privacy concerns), and how (and by whom) it was edited. Verifiable AI (vAI) offers a way to rebuild trust and give consumers the tools necessary to evaluate what they see and hear. A world where a 38-second deepfake that cost $522 in 2019 can now be created for free is in dire need of a solution.

Deepfake: A History of Visual Manipulation

1917 – Photographs of cardboard cutouts made to look like fairies (the Cottingley Fairies) are taken and later published in the UK’s Strand Magazine, convincing Sir Arthur Conan Doyle of the existence of magical beings

1988 – Photoshop is first developed

1992 – The word ‘Photoshop’ enters the Oxford English Dictionary as both a noun and a verb

2003 – Kate Winslet speaks out against GQ magazine after they significantly alter her image, halving the size of her legs.

2004 – An image of Jane Fonda is edited into a picture of John Kerry to make it appear he was at an Anti-War protest during the Presidential Campaign

2008 – An image of Vice-Presidential candidate Sarah Palin in a bikini is widely circulated by media and social media, before being revealed to have been photoshopped to discredit her

2017 – A user on Reddit called u/deepfakes publicly releases face-swapping algorithms on the site, allowing anyone with the skills and computing power to have a go. The term ‘deepfake’ is popularised from the user’s name

2018 – FakeApp, an app to make Deepfakes more accessible, is released

2019 – Popular application FaceApp goes live with an age filter that allows users to see older versions of themselves

2019 – The first fraud case using deepfake technology is recorded. Criminals mimic the voice of a chief executive and walk away with €220,000 ($243,000).

2020 – A deepfake Queen’s speech is broadcast by Channel 4 on Christmas Day

2021 – A lawyer in the United States goes viral with the phrase “I’m not a cat” after his video feed became stuck on a talking-cat filter during a virtual court hearing

2022 – Stable Diffusion is released, enabling non-technical users to create generative AI images using text prompts

2023 – PhotoAI.com, known for its suite of AI tools for photo editing, launches significant enhancements including advanced face-swapping and photo manipulation tools

2024 – OpenAI announces Sora, a video-generation AI platform capable of generating photo-realistic videos of up to one minute with realistic physics

The Inherited Problem with Content

Can you spot the 11 telltale signs that this image was manipulated?

Image manipulation has been a problem for some time. Even before the advent of Generative AI and deepfake generation, Photoshop had already eroded trust in what we see with our eyes. The Princess of Wales, Kate Middleton, for example, invited weeks of social media speculation a few weeks ago when she posted a Photoshopped image of herself to social media, leading the Associated Press and other press agencies to issue a ‘kill notice’, stating that “it appears that the source has manipulated the image…” and requesting that media organisations remove it from their media lists and online articles. Kill notices are issued when an image or story an agency has distributed is flagged as untrustworthy – a huge embarrassment for the Royal Family. Fuelled by the already-building speculation caused by her prolonged absence from the media, this kill notice sent the internet into a frenzy as people endlessly discussed, theorised and farmed engagement trying to work out what had happened to the Princess. This situation shows that consumers and citizens have a strong desire to know when they are being lied to, and that tools which make it more transparent where our content comes from are needed. While in this situation it was easy to spot the clear signs of Photoshopping, what happens when it is not so obvious? When transparency is built into the system, it becomes a lot harder to be dishonest and a lot easier for organisations or individuals to judge an image’s origin and accuracy.

Now people are even questioning whether this video is AI-generated! Without Content Credentials, telling the difference will soon be almost impossible.

Although Generative AI offers huge benefits to all layers of society, it also makes our problems with trust in content 10x worse. Not only does Generative AI make the production of photorealistic images possible (soon the days of Generative AI struggling with hands, feet, and shadows will be gone), but it can also do so at scale. This means that we could soon see a digital world where companies can spin up deepfake adverts with ‘real humans’ selling you a product with a script based around you and what your social media apps say about you. With Sora by OpenAI already capable of creating very realistic videos, it will not be long before these tools are in the hands of nefarious actors and fooling humans with video content becomes entirely possible. The implications for society are huge, as now almost anyone, from a terrorist or rogue state actor to a teenager in their bedroom, has access to tools capable of significantly moving markets, affecting election results, or defrauding someone.

The Solution: Verifiable Content Credentials

The problem of verifying content is not a new one, and industry giants have been working on it for some time. Even Fox News has been developing its own verification system, known as Verify, on the Polygon blockchain. Additionally, a larger group of media and technology organisations has come together to create a standards body capable of dealing with the complexities of both the technology and international coordination. Known as the Coalition for Content Provenance and Authenticity (C2PA), with members including Adobe, BBC, Google, Intel, Microsoft and Sony, the coalition aims to set interoperable standards which enable consumers of content to understand where a piece of content was made, and by whom or by what it was made.

By using a form of Verifiable Credential, it becomes possible to record and verify, at the moment a picture is taken, important metadata about its provenance, such as the location, the time, the fact that it was taken by a camera and not AI-generated, and possibly the owner of the camera in question.

Any edits then made to the picture are also added to the metadata. It is important to note that not all edits are bad: news publications regularly retouch pictures to fix lighting or redact faces. Even metadata needs editing: it may be that a photographer wishes to remain anonymous, and the press agency publishing the image instead removes the Personal Identifying Information (PII) from the metadata. Editing is a necessary part of the journey; what is important is that these edits are recorded. The use of Verifiable Credentials creates a chain of trust that preserves the privacy of individuals while allowing content to be prepared for publication. It enables the creation of established trust anchors able to digitally sign and verify information, including information that has been removed or changed.
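
To make the idea concrete, the sketch below shows what a simplified provenance record embedded in an image’s metadata might look like. It is illustrative only: the field names and structure are our assumptions for this example, not the actual C2PA manifest schema.

```python
# Illustrative only: a simplified provenance record, not the real C2PA manifest schema.
provenance = {
    "capture": {
        "device": "c2pa-enabled-camera",            # asserts a physical camera, not a generator
        "timestamp": "2024-03-02T14:31:00Z",
        "location": {"lat": 48.1486, "lon": 17.1077},
        "creator_did": "did:example:photographer",  # PII that may later be redacted
        "signature": "<signature over the capture assertion>",
    },
    "edits": [
        {"action": "blur_faces", "tool": "c2pa-enabled-editor",
         "signed_by": "did:example:photographer"},
        {"action": "redact_creator_pii",            # the redaction itself is recorded
         "signed_by": "did:example:newsagency"},
    ],
    "attestations": [
        {"claim": "accuracy_reviewed", "signed_by": "did:example:newsagency"},
    ],
}
```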

Content Credentials are a great tool for protecting against misinformation while also protecting individual privacy. They can establish a verifiable chain of custody for digital media, documenting its origins and any subsequent modifications. This information can be valuable for forensic analysis and attribution, helping to trace the source of deepfake content and identify the individuals or entities responsible for its creation. In the examples below, we illustrate how Content Credentials would work with both truth and falsehood if they become widely adopted:

  1. Content Creation: A protestor in a state governed by an authoritarian regime records acts of police brutality at a peaceful protest with their smartphone. Their smartphone has C2PA-enabled hardware, which records important metadata like the location and time, as well as the phone owner’s Personal Identifying Information (PII). This information can be removed from the picture’s metadata at a later date (though this action will itself be recorded).
  2. Editing: The photographer edits the photo using C2PA-enabled editing software to blur the faces of some of the protestors; these changes are logged in the image’s metadata and signed with their Decentralised Identifier (DID). They then send the image, including its metadata, to a journalist at a foreign publication, such as the BBC or Al Jazeera.
  3. Signing: The journalist looks at the image’s metadata and verifies that the photo they are looking at was shot on a C2PA-enabled device and therefore not AI-generated, as well as checking what edits were made by the photographer. He signs a content credential with his organisation’s DID, attesting to the accuracy of the photo (a minimal sketch of this signing step follows the list below).
  4. Redaction: He redacts the photojournalist’s PII, signing a record of these changes into the content credential, attesting that they are a trustworthy source.
  5. Publication: After final edits are made to get the picture publication-ready, the journalist posts the image on his news publication’s website and social media.
  6. Verification: Readers are then able to look at the image metadata and check the Content Credentials to see that:
    1. The image was not AI-generated.
    2. It was taken at the time and location they claim it was taken at.
    3. The faces of the protestors were blurred.
    4. The photographer’s PII was removed.
    5. No other edits were made.
    6. The news agency has vouched for the trustworthiness of the photographer.
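
Here is that minimal sketch of how the signing step might work mechanically. It uses an Ed25519 signature from the Python `cryptography` package purely for illustration; the function and field names are our own assumptions, and real C2PA claim formats, DID resolution and key management are out of scope.

```python
import hashlib
import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def attest_image(image_bytes: bytes, edit_log: list, signer_did: str,
                 signing_key: Ed25519PrivateKey) -> dict:
    """Create a simplified attestation over an image and its recorded edits."""
    payload = {
        "image_sha256": hashlib.sha256(image_bytes).hexdigest(),
        "edit_log": edit_log,
        "claim": "accuracy_reviewed",
        "signer": signer_did,
    }
    message = json.dumps(payload, sort_keys=True).encode()
    payload["signature"] = signing_key.sign(message).hex()
    return payload


# The news organisation signs over the photo and the photographer's recorded edits.
org_key = Ed25519PrivateKey.generate()  # in practice, a key linked to the organisation's DID
credential = attest_image(
    image_bytes=b"...raw image bytes...",
    edit_log=[{"action": "blur_faces", "signed_by": "did:example:photographer"}],
    signer_did="did:example:newsagency",
    signing_key=org_key,
)
print(credential["signature"][:16], "...")
```

A verifier holding the organisation’s public key could then recompute the same canonical payload (minus the signature field) and check the signature against it, which is what the reader-side checks in step 6 ultimately rely on.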

Meanwhile, a misinformation officer working for the authoritarian regime wishes to fabricate evidence that the protestors were violent in order to justify the police attack:

  1. Content Creation: The misinformation officer quickly generates images and videos of protestors holding weapons at the same rally using an image generator such as DALL·E or Midjourney. The image generator attaches Content Credentials to the metadata attesting that the image is AI-generated.
  2. Editing: They then use Photoshop to make further edits, which are logged as Content Credentials in the image’s metadata.
  3. Signing: As the misinformation officer does not have a good reputation, they must sign with a DID from an unknown account with no public reputation (or may not attest to the accuracy at all).
  4. Publishing: The misinformation actor posts their image on social media and amplifies the post through multiple bot accounts. 
  5. Verification: When readers check the still-intact metadata of the image (see the sketch of these checks after this list), they can see that:
    1. The image does not have a credential proving that it was taken by a device.
    2. The image has a credential showing that it was generated by AI.
    3. The image has a credential showing that multiple manipulative edits were made.
    4. The image has not been attested to by any trust anchor of good standing.
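
The checks a verifier application might run over whatever credentials accompany a piece of content can be sketched as below. The credential structure, field names and the hard-coded trust registry are assumptions made for this example, not a real verification API.

```python
# Hypothetical trust registry of anchors the reader's app already trusts.
TRUSTED_ANCHORS = {"did:example:bbc", "did:example:aljazeera"}


def red_flags(credentials: list) -> list:
    """Return human-readable warnings for a piece of content."""
    flags = []
    if not credentials:
        return ["No Content Credentials at all: provenance cannot be established."]
    if not any(c.get("type") == "hardware_capture" for c in credentials):
        flags.append("No credential proving the image was captured by a physical device.")
    if any(c.get("type") == "ai_generated" for c in credentials):
        flags.append("A credential records that the image was AI-generated.")
    if not any(c.get("signed_by") in TRUSTED_ANCHORS for c in credentials):
        flags.append("No attestation from a trust anchor of good standing.")
    return flags


# The fabricated protest image above would trigger all three warnings.
print(red_flags([{"type": "ai_generated", "signed_by": "did:example:unknown"}]))
```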

Demonstrably, Content Credentials in this case make it much easier for people to spot misinformation, but if the misinformation officer were smart, they would simply strip the Content Credentials from the image’s metadata. Even then, the absence of Content Credentials will itself invite scepticism about the image’s provenance, at the very least preventing it from being publicised by major news organisations. By creating tools that let us inspect content against a base standard of information (e.g. this is AI-generated, this was taken at the claimed location, etc.), we can start relearning how to trust the information we see, and apply greater scepticism to content which lacks a chain of custody or provenance showing how it was generated.

Why might an organisation or individual be incentivised to adopt Content Credentials?

Content credentials solve huge problems that we face in the wake of the proliferation of at-scale deepfakes and AI-generated misinformation, but they also offer huge opportunities.

Reducing the spread of misinformation: The greatest benefit of this kind of technology is that it helps better inform the public about the provenance of what they are seeing online. The better informed the public is, the greater their capacity to critically examine what they are consuming, and the less likely people are to be taken in by misinformation.

Brand protection: The use of a company’s brand in a fake news article or screenshot can cause a lot of damage to public trust in that organisation. Content credentials can create a useful source of truth that brands can point to for their official facts, or be used to establish whether an image was generated by an AI model with the correct licensing.

Brand Legitimacy: Being a trust anchor in an established chain of content provenance also improves the legitimacy of brands by enabling them to become established sources of truth that people trust to give them the facts.

Meeting compliance requirements: Many industries have legal requirements for content authenticity and attribution. For example, the UK Advertising Standards Authority (ASA) recently reminded brands that ads using AI-generated content will need to comply with existing advertising rules, such as those on misleading advertising, especially the rules concerning testimonials and endorsements. Content Credentials can support compliance by providing a verifiable trail of the content’s origin and claims. With codes of practice being refined and the political spotlight likely to once again fall on disinformation over the coming years, it seems likely that many companies will use Content Credentials to improve compliance with regulations and standards.

Agreeing to disagree: Organisations on different sides of the political spectrum, for example, CNN and Breitbart, may have very different perspectives and facts on an unfolding situation. However, it is still important that users have the opportunity to choose which organisation they should trust within the same framework. As long as everyone agrees on shared rules, for example, that an image should show if it is a genuine picture or something created or adjusted by AI, then we can begin having more trust in the images shared across the political spectrum.

Decentralisation of Trust: Many people today get their news from social media influencers, freelance journalists or other non-traditional news sources. By agreeing to a shared system of attestation, independent reporters, photographers and fact-checkers can establish themselves as trust anchors outside of traditional news media organisations and networks.

Financialisation of Trust: Establishing oneself as a ‘trust anchor’ may enable new commercial models for organisations whose existing business models are struggling. Being a Trust Anchor holds value, and payment systems capable of microtransactions will enable new ways to monetise one’s credibility.

Seamless integration: Content credentials can seamlessly integrate into existing authentication pipelines. By embedding content credentials into the metadata of digital content, verification is made much simpler. Deepfake detection systems can cross-reference Content Credentials with detection results to confirm or refute suspicions about the content’s authenticity.
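
As a rough illustration of that cross-referencing, the sketch below combines the result of a credential check with the output of a hypothetical deepfake detector; the detector, threshold and wording are placeholders, not part of any particular product.

```python
def assess(credential_verified: bool, detector_score: float) -> str:
    """Combine a credential check with a (hypothetical) deepfake detector score,
    where 0.0 means 'looks authentic' and 1.0 means 'looks fake'."""
    if credential_verified and detector_score < 0.5:
        return "Likely authentic: provenance verified and the detector agrees."
    if credential_verified:
        return "Conflict: provenance verified but the detector is suspicious - flag for human review."
    if detector_score >= 0.5:
        return "Likely fake or unverifiable: no provenance and the detector is suspicious."
    return "Unverifiable: the detector sees nothing wrong, but provenance cannot be established."


print(assess(credential_verified=False, detector_score=0.9))
```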

What challenges might we still face with Content Credentials?

Although Content Credentials present huge opportunities, they are not a deus ex machina which can solve all our problems related to establishing Verifiable AI (vAI). Many issues must still be dealt with and explored to ensure a working system of trust based on Content Credentials.

Metadata can be redacted: Whether information is removed from the metadata, or a piece of content carries no metadata at all, is up to the person sharing that content. If the majority of the media does not adopt these standards, Content Credentials will become a mark of establishment-approved content, rather than an overarching system used by everybody to establish a shared truth.

Who do you trust? Unless you are getting your news directly from the source, or the source is happy to be public, you still have to trust whatever organisation is attesting to a fact before believing it. Just because a picture of a UFO has been confirmed as real by InfoWars, it does not follow that you should believe it is a UFO. However, it may be that others think an image attested to by InfoWars is more trustworthy than one attested to by the BBC. Although Content Credentials do create more space for the truth, who you believe will always remain a factor.

Can the system still be gamed? A smart misinformation spreader could generate a picture with AI, then take a picture of it using a C2PA-enabled camera, thus starting the chain of provenance later than where it actually began.

Do you trust the tech? Owning the right Content Credentials may become synonymous with being accurate, but if the system can be gamed, or the technology can be hacked, it creates opportunities for ‘verified’ misinformation to spread.

Are Content Credentials fake news? Those who have an interest in spreading misinformation to large audiences are incentivised to discredit content credential systems (just as many attempt to discredit peer-reviewed scientific papers), which may lead to mistrust in the technology as an ‘establishment surveillance tool’.

How can the cheqd network help?

Here at cheqd, we have been working on Verifiable Credential technology for over three years. We have done seminal work in the creation of trust registries and helped to create the W3C standards on which Verifiable Credentials are based. Our products are fully compliant with the EU’s eIDAS 2.0 identity regulation, and we are in the process of becoming an EU-recognised Electronic Distributed Ledger. As well as being fully compliant, ours is one of the most interoperable DID methods on the market, meaning our Verifiable Credentials can interact with multiple DID networks, and we have the capability of enabling payments via microtransactions.

Our unique privacy-preserving payment rails for Verifiable Credentials unlock the possibility of new commercial models for trust anchors involved in the issuance of Content Credentials. For example, perhaps a news organisation can use its established reputation to charge for fact-checking an independent journalist’s work or improve the possibility of royalty payments for photojournalists not previously associated with a news organisation.

Coming back to our first example of a photojournalist getting a photo from a violent protest verified and published, payment rails enable the commercialization of this model at the republishing stage. Websites that wish to redistribute or republish the image could verify that it carries the stamp of approval of the publishing press agency and, once its provenance is established, pay for the rights to the image. Due to the customisable nature of cheqd’s payment rails for Verifiable Credentials, this would also allow for split payments between the multiple parties involved, enabling both the news agency and the photojournalist to be paid for their work and creating a new potential automated royalties model built around microtransactions.

Conclusion

Content credentials are poised to become a significant growth area and a major use of Verifiable AI (vAI) in the coming years. Given the mushrooming of deepfake misinformation and the breakdown in trust in our societies, a new way to track trust is desperately needed to provide some kind of faith in a shared version of the truth for society. We believe that the cheqd network offers a range of tooling as an infrastructure partner that will be of use to any organisation which sees itself as part of the content verification process.

Contact us

Are you a content creation platform, a publisher, news aggregator or media agency? Contact cheqd to see how you can use Verifiable AI to add trust and authenticity to content and your brand, and how we can help to unlock new business models. We are always up for a chat – contact us at [email protected]!
