Announcing our new watermarking method for AI-generated text and video, and how we’re bringing SynthID to core Google products
Generative AI tools, and the large language model technology behind them, have captured the public’s imagination. From assisting with work tasks to enhancing creativity, these tools are quickly becoming part of products used by millions of people in their daily lives.
These technologies can be hugely beneficial, but as they become increasingly popular, the risk grows that they could accidentally or deliberately cause harm, such as spreading misinformation or enabling phishing, if AI-generated content is not properly identified. That’s why last year we introduced SynthID, our new digital toolkit for watermarking AI-generated content.
Today, we’re expanding SynthID’s capabilities to watermarking AI-generated text in the Gemini app and web experience, and video in Veo, our most capable video generation model.
SynthID text watermarking is designed to complement the most widely available AI text generation models and to deploy at scale, while SynthID for video builds on our image and audio watermarking methods to include all frames in generated videos. This innovative method embeds an imperceptible watermark without affecting the quality, accuracy, creativity or speed of the text or video generation process.
SynthID is not a silver bullet for identifying AI-generated content, but it is an important building block for developing more reliable AI identification tools, and it can help millions of people make informed decisions about how they interact with AI-generated content. Later this summer, we plan to open source SynthID text watermarking so that developers can build with this technology and incorporate it into their models.
How text watermarking works
Large language models generate text sequences when given a prompt such as, “Explain quantum mechanics to me like I’m five years old” or “What’s your favorite fruit?”. LLMs predict which token is most likely to follow another, one token at a time.
Tokens are the building blocks a generative model uses to process information. In this case, a token can be a single character, a word, or part of a phrase. Each possible token is assigned a score, representing the percentage chance that it is the right one. Tokens with higher scores are more likely to be used. LLMs repeat these steps to build a coherent response.
SynthID is designed to embed subtle watermarks directly into the text generation process. It does this by introducing additional information into the token distribution at the point of generation, shaping the probability of token generation — all without compromising the quality, accuracy, creativity, or speed of text generation.
SynthID adjusts the probability scores of tokens generated by a large language model.
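To make the probability-shaping idea concrete, here is a minimal toy sketch of a generation-time watermark. It is similar in spirit to published LLM watermarking schemes but is not SynthID's actual algorithm; the `g_score` function, the secret key, and the `bias` parameter are all hypothetical illustrations of "additional information introduced into the token distribution":

```python
import hashlib
import random

def g_score(key: str, context: tuple, token: str) -> float:
    # Hypothetical keyed pseudorandom score in [0, 1), derived from a
    # secret key, the recent context tokens, and the candidate token.
    # A detector holding the same key can recompute it later.
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def sample_watermarked(probs: dict, key: str, context: tuple,
                       bias: float = 2.0) -> str:
    # Reweight each candidate token's model probability by its g-score,
    # nudging sampling toward tokens the detector will expect while
    # still broadly respecting the model's original distribution.
    weights = {t: p * (1.0 + bias * g_score(key, context, t))
               for t, p in probs.items()}
    total = sum(weights.values())
    tokens = list(weights)
    return random.choices(tokens, weights=[weights[t] / total for t in tokens])[0]
```

The key design constraint illustrated here is that the adjustment is a reweighting, not a hard override: high-probability tokens remain likely, which is why generation quality can be preserved.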
The final pattern of scores for both the model's word choices and the adjusted probability scores is considered the watermark. This pattern of scores is compared with the expected pattern of scores for watermarked and unwatermarked text, helping SynthID detect whether an AI tool generated the text, or if it might come from other sources.
A piece of text generated by Gemini, with the watermark highlighted in blue.
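The detection side of this comparison can be sketched the same way. Again, this is an illustrative toy, not SynthID's real detector: it recomputes a hypothetical keyed score for each token and checks whether the average deviates from what unwatermarked text would produce by chance:

```python
import hashlib

def g_score(key: str, context: tuple, token: str) -> float:
    # The same hypothetical keyed pseudorandom score assumed to have
    # been used at generation time.
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def watermark_score(tokens: list, key: str, window: int = 4) -> float:
    # Mean g-score across the sequence: unwatermarked text averages
    # close to 0.5, while text whose sampling favored high-scoring
    # tokens averages noticeably higher. A real detector would turn
    # this into a calibrated confidence level rather than a raw mean.
    total = 0.0
    for i, tok in enumerate(tokens):
        ctx = tuple(tokens[max(0, i - window):i])
        total += g_score(key, ctx, tok)
    return total / len(tokens)
```

This also shows why longer responses are easier to classify: the mean score over many tokens separates watermarked from unwatermarked text far more reliably than over a handful of tokens.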
The benefits and limitations of this technique
SynthID text watermarking works best when a language model generates longer responses, and in diverse ways, such as when it's prompted to generate an essay, a play script or variations on an email.
It even performs well under some transformations, such as cropping pieces of text, modifying a few words, and mild paraphrasing. However, its confidence scores can be greatly reduced when AI-generated text is thoroughly rewritten or translated into another language.
SynthID text watermarking is less effective on responses to factual prompts, because there are fewer opportunities to adjust the token distribution without affecting factual accuracy. This includes prompts like “What is the capital of France?” or queries where little or no variation is expected, such as “recite a poem by William Wordsworth”.
Many currently available AI detection tools use algorithms for labeling and sorting data, called classifiers. These classifiers often only perform well on particular tasks, which makes them less flexible. When the same classifier is applied across different types of platforms and content, its performance isn't always reliable or consistent. This can lead to problems with mislabeling, for example identifying text as AI-generated when it isn't.
SynthID works effectively on its own, but it can also be combined with other AI detection approaches to give better coverage across content types and platforms. While this technique isn't built to directly stop motivated adversaries, like cyberattackers or hackers, from causing harm, it can make it harder to use AI-generated content for malicious purposes.
How video watermarking works
At this year’s I/O, we announced Veo, our most capable video generation model. While video generation technologies aren't as widely available as image generation technologies, they're rapidly evolving, and it will become increasingly important to help people know whether a video is AI-generated or not.
Videos are composed of individual frames, or still images. So we developed a watermarking technique inspired by our SynthID tool for images. This technique embeds a watermark directly into the pixels of every video frame, making it imperceptible to the human eye, but detectable for identification.
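As a rough illustration of per-frame pixel watermarking, here is a classic least-significant-bit scheme on a grayscale frame. This is a textbook technique used only to make the idea tangible; SynthID's actual image and video watermark is a learned, far more robust method, and the key and function names here are hypothetical:

```python
import hashlib

def bit_pattern(key: str, h: int, w: int) -> list:
    # Hypothetical key-derived pseudorandom bit pattern, one bit per pixel.
    bits = []
    counter = 0
    while len(bits) < h * w:
        digest = hashlib.sha256(f"{key}|{counter}".encode()).digest()
        for byte in digest:
            bits.extend((byte >> i) & 1 for i in range(8))
        counter += 1
    return bits[: h * w]

def embed_frame(frame, key: str):
    # Overwrite each pixel's least significant bit with the pattern:
    # a change of at most 1/255 in brightness, invisible to the eye
    # but testable by anyone holding the key.
    h, w = len(frame), len(frame[0])
    bits = bit_pattern(key, h, w)
    return [[(frame[y][x] & ~1) | bits[y * w + x] for x in range(w)]
            for y in range(h)]

def detect_frame(frame, key: str) -> float:
    # Fraction of pixels whose LSB matches the key's pattern:
    # about 0.5 for an unmarked frame, 1.0 for a freshly marked one.
    h, w = len(frame), len(frame[0])
    bits = bit_pattern(key, h, w)
    matches = sum((frame[y][x] & 1) == bits[y * w + x]
                  for y in range(h) for x in range(w))
    return matches / (h * w)
```

Unlike this fragile LSB toy, which re-encoding or resizing would destroy, a production watermark must survive common video transformations; that robustness is the hard part the learned approach addresses.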
Empowering people with knowledge of when they're interacting with AI-generated media can play an important role in preventing the spread of misinformation. From today, all videos generated by Veo on VideoFX will be watermarked by SynthID.
SynthID for video watermarking marks every frame of a generated video
Bringing SynthID to the wider AI ecosystem
SynthID’s text watermarking technology is designed to be compatible with most AI text generation models and to scale across different content types and platforms. To help prevent widespread misuse of AI-generated content, we’re working to bring this technology to the wider AI ecosystem.
This summer, we plan to publish more about our text watermarking technology in a detailed research paper, and we'll open source SynthID text watermarking through our updated Responsible Generative AI Toolkit, which provides guidance and essential tools for creating safer AI applications, so developers will be able to build with this technology and incorporate it into their models.