The Deepfake Problem in Video Calls Is Already Here
Here's a scenario that sounds like something out of a thriller but has actually happened, more than once: you join a video call with your CFO and a few colleagues. Everyone looks normal, sounds normal. You authorize a large financial transfer because, well, why wouldn't you? Except none of those people were real. Every single person on that call — except you — was an AI-generated deepfake.
That's exactly what happened to engineering firm Arup in early 2024. An employee based in Hong Kong ended up approving a string of wire transfers that cost the company $25 million. A similar attack hit a multinational company operating in Singapore in 2025. And those are just the high-profile incidents that made headlines.
When you zoom out (pun kind of intended), the numbers are genuinely alarming. Deepfake-enabled fraud reportedly exceeded $200 million in losses in the first quarter of last year alone, and the average hit per corporate incident has crossed $500,000. So while most people will never personally experience this, businesses — especially those moving serious money over video — are sitting in the crosshairs.
Why Existing Detection Methods Are Falling Short
You might be thinking: can't software just detect when a face is fake? And yes, some tools already try to do that by analyzing video frames for signs of AI manipulation. The problem is that video generation models are getting dramatically better, really fast. The telltale glitches — the weird blinking, the slightly-off skin texture — are disappearing. Both Zoom and World have acknowledged that frame-by-frame detection is becoming less and less reliable as a standalone defense. It's essentially a cat-and-mouse game, and the mice are getting faster.
That's the gap this new partnership is trying to close. Instead of just asking "does this video look fake?", it shifts the question to "can we confirm this is the same human who registered?"
How World ID Deep Face Actually Works
World's approach — called World ID Deep Face — takes three separate data points and requires all of them to match before it'll confirm anything. Think of it like a three-way handshake for your humanity.
First, there's a signed image captured when the user originally registered through World's Orb device. Second, the system does a real-time face scan directly from the participant's device during the call. Third, it checks the live video frame that other people in the meeting can actually see on their screens. All three have to line up. When they do, the participant gets a "Verified Human" badge next to their name.
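To make the three-way match concrete, here's a minimal sketch of that logic in Python. World hasn't published its actual API, signature scheme, or matching thresholds, so every name, the embedding representation, and the similarity cutoff below are illustrative assumptions — the point is just that all three sources must agree before a badge is issued.

```python
from dataclasses import dataclass
import math

# Hypothetical cosine-similarity cutoff; World's real threshold is not public.
MATCH_THRESHOLD = 0.85

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

@dataclass
class VerificationInputs:
    orb_registration: list[float]  # embedding from the signed Orb image
    device_scan: list[float]       # embedding from the live device-side scan
    meeting_frame: list[float]     # embedding from the frame others see

def verified_human(inputs: VerificationInputs) -> bool:
    """All three sources must pairwise match; any single mismatch fails."""
    pairs = [
        (inputs.orb_registration, inputs.device_scan),
        (inputs.orb_registration, inputs.meeting_frame),
        (inputs.device_scan, inputs.meeting_frame),
    ]
    return all(cosine_similarity(a, b) >= MATCH_THRESHOLD for a, b in pairs)
```

The pairwise structure is what matters here: a deepfake that fools the on-screen frame check still has to match both the device scan and the signed registration image.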
That last detail is kind of remarkable when you sit with it for a second. We're now at a point where video conferencing platforms need to tell us whether the person we're looking at is actually a person.
What This Looks Like Inside a Zoom Meeting
On the practical side, Zoom hosts will be able to turn on a Deep Face waiting room, which essentially gates entry into the call behind identity verification. No badge, no entry. But it doesn't stop there — participants can also request mid-call that someone else verify themselves on the spot. So if something feels off halfway through a conversation, there's a mechanism to check.
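The gating flow described above — no badge, no entry, plus mid-call re-checks — can be sketched as a small state machine. The class and method names here are invented for illustration; Zoom's actual SDK and admission mechanics will differ.

```python
from dataclasses import dataclass, field

@dataclass
class Participant:
    name: str
    verified_human: bool = False  # has a current "Verified Human" badge

@dataclass
class DeepFaceWaitingRoom:
    admitted: list[Participant] = field(default_factory=list)
    waiting: list[Participant] = field(default_factory=list)

    def join(self, p: Participant) -> None:
        # Gate entry on the verification badge: unverified joiners wait.
        (self.admitted if p.verified_human else self.waiting).append(p)

    def request_reverification(self, p: Participant) -> None:
        # Mid-call check: if someone no longer verifies, move them back
        # to the waiting room until they pass again.
        if p in self.admitted and not p.verified_human:
            self.admitted.remove(p)
            self.waiting.append(p)
```

The design choice worth noting is that verification isn't a one-time door check — the mid-call request path means trust can be revoked during the conversation, which matches the "something feels off halfway through" scenario.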
Zoom has framed the integration as part of its broader open ecosystem philosophy, giving business customers more flexibility to decide how much verification they want to build into their workflows depending on what's at stake.
World's Growing Footprint Beyond Zoom
This isn't an isolated deal for World. The company has been quietly building out a network of verification partnerships across different industries. Tinder and Visa are among the consumer-facing platforms it's already working with. More recently, World rolled out technology designed to verify that actual humans — not automated AI programs — are the ones driving AI shopping agents at the point of purchase.
The thread connecting all of it is the same underlying concern: as AI gets better at impersonating people, the question of "who am I actually talking to?" becomes harder to answer without some kind of anchored identity system.

