The Challenge
- Voice cloning technology is becoming increasingly sophisticated due to improving text-to-speech AI. The technology offers promise, including medical assistance for people who may have lost their voices due to accident or illness. It also poses significant risk: families and small businesses can be targeted with fraudulent extortion scams; creative professionals, such as voice artists, can have their voices appropriated in ways that threaten their livelihoods and deceive the public.
- The FTC ran an exploratory challenge to encourage the development of multidisciplinary approaches, from products to policies to procedures, aimed at protecting consumers from AI-enabled voice cloning harms, such as fraud and the broader misuse of biometric data and creative content. The goal of the Challenge was to foster breakthrough ideas on preventing, monitoring, and evaluating malicious voice cloning.
- The Voice Cloning Challenge is one part of a larger strategy. The risks posed by voice cloning and other AI technology cannot be addressed by technology alone. It is also clear that policymakers cannot count on self-regulation alone to protect the public. At the FTC, we will be using all of our tools, including enforcement, rulemaking, and public challenges like this one, to ensure that the promise of AI is realized to the benefit of consumers rather than to their detriment.
The Results
- Top Submissions that will split $35,000 in prize money:
- “‘AI Detect’ for consumer and enterprise apps and devices” (Video | Abstract)
- Submitted by David Przygoda and Dr. Carol Espy-Wilson from the small organization OmniSpeech (located in College Park, Maryland).
- AI Detect uses AI algorithms to differentiate between genuine and synthetic voice patterns. Additionally, the submission proposes a framework for increased public and private sector responsibility.
- “DeFake: Using Adversarial Audio Perturbations to Proactively Prevent Malicious Voice Cloning” (Video | Abstract)
- Submitted by Dr. Ning Zhang, an Assistant Professor in the Department of Computer Science and Engineering at Washington University in St. Louis.
- DeFake proposes a protective mechanism that adds carefully crafted perturbations to voice samples to hinder a cybercriminal’s cloning process.
- “OriginStory: Authenticating the human origin of voice at the time of recording” (Video | Abstract)
- Submitted by Dr. Visar Berisha, Drena Kusari, Dr. Daniel W. Bliss, and Dr. Julie M. Liss of the small organization OriginStory.
- OriginStory proposes using off-the-shelf sensors already integrated in many devices to simultaneously measure speech acoustics and the co-occurring biosignals in the throat and mouth as a person is speaking. This authenticates the human origin of a voice recording at the point of creation, and the authentication is embedded as a watermark or signature in the stream.
- Recognition Award (with no monetary prize):
Challenge Judges
FTC staff would like to thank the Challenge judges, Arvind Narayanan, Beau Woods, and Britt Paris, for all their help in making this contest a success!
- Arvind Narayanan is a professor of computer science at Princeton and the director of the Center for Information Technology Policy. He co-authored a textbook on fairness and machine learning and is currently co-authoring a book on AI snake oil. His work was among the first to show how machine learning reflects cultural stereotypes, and his doctoral research showed the fundamental limits of de-identification. Narayanan is a recipient of the Presidential Early Career Award for Scientists and Engineers (PECASE).
- Beau Woods is a leader with the I Am The Cavalry grassroots initiative, Founder/CEO of Stratigos Security, and Cyber Safety Innovation Fellow with the Atlantic Council. His work bridges the gap between the security research and public policy communities, to ensure connected technology that can impact life and safety is worthy of our trust. Over the past several years in this capacity, he has consulted with the healthcare, automotive, aviation, rail, and IoT industries, as well as cyber security researchers, US and international policy makers, and the White House.
- Britt Paris, an assistant professor at Rutgers University School of Communication & Information, is a critical informatics scholar studying the political economy of information infrastructure, as it relates to evidentiary standards and political action. She has published work on Internet infrastructure projects, artificial intelligence-generated information objects, digital labor, and civic data, analyzed through the lenses of political economy, cultural studies, and feminist social epistemology.
Additional Information
- For more information about the contest, see the press release, “FTC Announces Winners of Voice Cloning Challenge,” and the Tech Blog post, “Approaches to Address AI-enabled Voice Cloning.”
- For the full Rules, including judging criteria, see the FTC Voice Cloning Challenge Rules. The Rules include updates announced on January 2, 2024, in a Corrective Notice.