In this op-ed, Hugging Face's Margaret Mitchell, chief ethics scientist; Sasha Luccioni, AI research scientist; Elizabeth Allendorf, backend engineer; Emily Witko, head of culture and DEIB; and Bruna Trevelin, legal counsel, explore how to stop deepfake porn and what to do if you see it. Hugging Face is an open science and community-driven platform for AI builders, with dedicated ethics, society, and legal teams working towards responsible AI.

The moment we heard that fake images of Taylor Swift were being passed around online, we knew what had happened. Swift, like many women and teens, was a target of “deepfake porn,” the massively harmful practice of creating nonconsensual fake sexualized images. As women working in AI, we’ve all experienced inappropriate sexualization and know first-hand how tech companies can do a better job at protecting women and teens. So let’s talk about what “deepfakes” are and what can be done to stop their proliferation.

What is a “deepfake”?

About 10 years ago, a technique known as “deep learning” began working well for labeling images – like automatically labeling dog pictures without being told that a dog has four paws and a tail. Deep learning is a type of Artificial Intelligence (AI), and specifically machine learning, in which systems learn based on example inputs (like a dog photo) and desired outputs (like the label ‘dog’). Recently, computing breakthroughs have made it possible to use AI to do everything from generating videos to writing book summaries (whether these summaries are good is another matter). When an AI-generated image, video, or audio clip is difficult to distinguish from real content, it’s called a “deepfake.”
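(For readers curious about what “learning from example inputs and desired outputs” looks like in practice, here is a minimal, illustrative sketch using the PyTorch library. The tiny model and the randomly generated stand-in “images” and labels are our own simplifications for illustration, not part of any real system.)

```python
# A toy illustration of supervised learning: the model sees example inputs
# (tiny fake "images") paired with desired outputs (labels 0 or 1) and adjusts
# its parameters to match them. Real deep learning works the same way, just
# with millions of real images and far larger models.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in data: 100 random 8x8 grayscale "images" and random 0/1 labels.
images = torch.rand(100, 1, 8, 8)
labels = torch.randint(0, 2, (100,))

# A very small neural network that maps each image to two class scores.
model = nn.Sequential(nn.Flatten(), nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Learning loop: predict, measure how wrong the predictions are, nudge the weights.
for epoch in range(20):
    optimizer.zero_grad()
    scores = model(images)
    loss = loss_fn(scores, labels)
    loss.backward()
    optimizer.step()

print("final training loss:", loss.item())
```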

Deepfakes can take many forms, spanning everything from silly memes mixing one person’s face with the body of another, to seriously harmful audio of a public figure saying something they never said. Deepfake technology can be used to create realistic, yet entirely fictional, characters or scenarios, making it a powerful tool for entertainment, such as in movies or video games. On the darker side, it can be used for malicious purposes, including intentionally spreading false information (called “disinformation”) in order to manipulate public opinion, and creating nonconsensual photos or videos.

Around 2017, deepfake porn images began to emerge. Deepfakes can be created from images people post online, and over 90% of deepfakes are nonconsensual porn, the vast majority of which depicts women. Their distribution can cause emotional distress and damage reputations, making it critical for everyone to step in and do something to stop them. There is no single solution, but we can all make it harder to create and proliferate nonconsensual deepfakes.

What can I do?

One of the most important ways to combat deepfakes is by raising awareness. Parents and educators can help teens understand why this exploitation is not OK by discussing consent, responsibility, and how nonconsensual porn can be objectifying, demeaning, and traumatic — being a victim of deepfake porn can mean being hurt for the rest of your life. Bring the topic up with your parents and teachers to let them know this technology exists — adults can help with the larger picture once they know this is happening. Teens can help their peers learn too, by speaking up. One of the most powerful ways to disrupt the normalization of harmful technology is to call it out for what it is: unacceptable. And creating deepfake porn can carry serious consequences.

Another important thing we can all do is to respect people’s privacy online. Completely innocent images, videos, and audio shared online may be abused to create deepfake porn. Consent is key: check with people before posting depictions of them online.

We can also request that deepfake content be removed by asking the person who posted it directly, as well as by using social media platforms’ reporting tools (although social media platforms have a long way to go in taking down deepfakes) and search engine removal requests. If you find a deepfake of someone you know, let them know as soon as possible, and remember that kindness and emotional support can go a long way.

If the image is of you, you can also send the hosting platform a formal DMCA takedown request for U.S.-based content, and in some states and countries you can file a complaint with a data protection authority, such as the CPPA in California. The law is limited, but moving in promising directions. We are seeing case law that discusses the responsibilities of platforms in moderating content, and legislative efforts that attempt to address privacy concerns and create civil remedies for victims of nonconsensual, sexually explicit deepfakes.

How technology can help, not hurt

Fortunately, researchers have recently begun developing ways to make it more difficult to create deepfakes. There are currently two basic approaches: a defensive approach called “immunizing” that disrupts what an AI model can do with an image, and a more experimental offensive approach called “poisoning” that disrupts what an AI model can learn from an image.

“Immunizing” an image involves altering its pixels in a way that makes it a fundamentally different image for an AI model, even though the changes are imperceptible to human eyes. To get a sense of what this means, check out the images of a pig here. Adding different values to each pixel of a “pig” picture makes the model classify it as an “airliner.” Applied to images of people, this can make it impossible for an image to be manipulated as intended by a malicious user (see an example here). Example tools for immunizing include Glaze and Photoguard.
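(For the technically curious, here is a rough sketch of the basic trick behind the pig-to-airliner example above: compute how a model’s prediction shifts with each pixel, then nudge every pixel a tiny, invisible step in the most confusing direction. The classifier and photo below are random stand-ins of our own; real tools like Photoguard and Glaze use far more sophisticated versions of this idea.)

```python
# A minimal sketch of the "immunizing" idea: move an image's pixels by a tiny,
# nearly invisible amount in the direction that most changes the model's output
# (the classic "fast gradient sign" trick). The model and image are stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in classifier
image = torch.rand(1, 3, 32, 32, requires_grad=True)             # stand-in photo
true_label = torch.tensor([3])                                    # stand-in "pig" label

# Compute how the model's loss changes with respect to each pixel.
loss = nn.functional.cross_entropy(model(image), true_label)
loss.backward()

# Move each pixel a tiny step (epsilon) in the direction that confuses the model most.
epsilon = 2.0 / 255.0  # far below what a human eye would notice
immunized = (image + epsilon * image.grad.sign()).clamp(0.0, 1.0).detach()

print("max pixel change:", (immunized - image.detach()).abs().max().item())
```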

Poisoning an image is a much newer idea, with the goal of breaking models that are trained on images without consent. It works similarly to immunizing, but with somewhat different math to make it particularly pernicious for AI model training. At a high level, it works by altering images with patterns from very different types of images to make it so that a deep AI system might “learn” that people actually look like kittens, for example. Example tools include Nightshade and Fawkes. We’ve collected some demos of these cutting-edge tools here. The tools aren’t perfect, but we’re making progress.
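(And a companion sketch of the poisoning idea: gently shift an image’s pixels until, to a model’s internal “features,” it resembles a very different decoy image, a kitten in this example, while staying almost unchanged to human eyes. The feature extractor and images below are again stand-ins of our own; Nightshade and Fawkes rely on much more careful optimizations.)

```python
# A rough sketch of "poisoning": optimize a small perturbation so the image's
# features match a decoy image's features, while keeping pixel changes small.
import torch
import torch.nn as nn

torch.manual_seed(0)

feature_extractor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # stand-in
person_image = torch.rand(1, 3, 32, 32)   # the photo being protected
kitten_image = torch.rand(1, 3, 32, 32)   # a very different "decoy" image

delta = torch.zeros_like(person_image, requires_grad=True)
optimizer = torch.optim.Adam([delta], lr=0.01)
target_features = feature_extractor(kitten_image).detach()

for step in range(200):
    optimizer.zero_grad()
    poisoned = (person_image + delta).clamp(0.0, 1.0)
    # Pull the poisoned image's features toward the decoy's features...
    loss = nn.functional.mse_loss(feature_extractor(poisoned), target_features)
    loss.backward()
    optimizer.step()
    # ...while keeping the pixel changes small enough to be hard to notice.
    with torch.no_grad():
        delta.clamp_(-8.0 / 255.0, 8.0 / 255.0)

poisoned_image = (person_image + delta).clamp(0.0, 1.0).detach()
print("max pixel change:", delta.abs().max().item())
```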

If you share a picture of someone online, consider experimenting with tools like these (as long as the images are not stored by the tool). Seeing that people are using tools to combat deepfakes motivates funders and scientists to further advance this kind of protective technology, and encourages regulators to implement policies that curb the spread of nonconsensual deepfakes.

We can prevent the proliferation of deepfakes through education, takedown requests, and experimentation with new tools. Responsibility lies with society as a whole: we must advocate for widespread legal and societal changes that protect individuals' rights and dignity, and put an end to nonconsensual deepfake porn for good.