Episode 1 cover with hosts Simon Gajdosik and André Daus and the episode title 'The $25 Million Call'

The $25 Million Call That Never Happened

How a Hong Kong engineering team transferred $25 million to a deepfake CFO on a video call. The story, the mechanics, and the verification habit that stops it.

Key Takeaways

  • Arup's Hong Kong office lost $25 million to a deepfake video call in 2024. Five banks, fifteen wire transfers, every executive on screen was synthetic.
  • Voice cloning in 2026 needs about twenty to thirty seconds of audio. A podcast clip, a conference recording, or a voicemail greeting is enough.
  • A real CFO does not know fifteen account numbers across five banks by heart. That was the signal that should have stopped the call.
  • More security training does not fix this. A callback rule on a different channel and a verification word do.
  • If everyone in the room agrees, at least one person has stopped thinking. The 10th man's job is to disagree on purpose.

What happened

A Hong Kong office of the British engineering firm Arup transferred $25 million in early 2024 to attackers, across fifteen wire transfers to five different banks. The employee who authorised every one of those transfers was sitting on a video call with what he believed was his CFO and several senior colleagues. Every face on that call was a deepfake. Every voice was cloned.

The employee was not reckless. We want to be clear about that. He almost did the right thing.

What almost stopped it

A phishing email arrived first, asking for an urgent, confidential transaction. The employee was skeptical. He nearly dismissed it as fraud. What changed his mind was the video call: seeing colleagues he knew, hearing their voices, watching them all agree the transfer needed to happen now.

The technology beat the instinct. That is the entire story of this case.

How little it takes to clone you

Voice cloning in 2026 needs about twenty to thirty seconds of usable audio. A podcast clip is enough. A conference recording is enough. The “leave a message” greeting on your voicemail is enough.

Face cloning needs only a short video clip. A public LinkedIn profile, a conference recording on YouTube, an internal town hall someone posted to a company channel: any one of these gives an attacker far more material than the tools require.

The attackers in the Arup case did not invent anything. They scraped LinkedIn, conference recordings, virtual meeting footage, and media appearances. Then they ran the source material through tools that anyone with a few hundred dollars and an afternoon can run.

The signal that should have triggered alarm

The thing André kept coming back to during the episode: a real CFO does not know fifteen account numbers across five different banks by heart. That is not the CFO’s job. A CFO knows strategic flows. They know that the Singapore entity needs to move funds to the Shanghai entity. They do not know the IBAN of an HSBC account in Hong Kong sitting third on a list of fifteen wires.

When the “CFO” in the video call started reading off precise account numbers, that should have been the moment everything stopped. It was not.

Why more training is not the answer

The default corporate response to an incident like this is to order more security awareness training. From André’s twenty-five years in financial compliance: training catches the first time, maybe the second. After that it becomes noise. People click through it. The same employee who fails the company’s quarterly phishing simulation will, on a stressful Tuesday afternoon, do exactly what the CFO on the screen asks.

What works is process. A callback rule for any wire above a threshold, on a channel different from the one the request came in on. A second person who must independently authorise the action. And a verification word.
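The process above can be sketched as a policy check. This is a minimal illustrative sketch, not Arup's actual controls: the threshold, field names, and function are all hypothetical, and the point is only that every check is independent of the channel the request arrived on.

```python
from dataclasses import dataclass

# Hypothetical threshold: wires above this need out-of-band confirmation.
CALLBACK_THRESHOLD = 10_000

@dataclass
class TransferRequest:
    amount: float
    requester: str
    approver: str                # must be a different person from the requester
    callback_confirmed: bool     # confirmed on a channel the request did NOT arrive on
    verification_word_ok: bool   # caller produced the pre-agreed word

def may_execute(req: TransferRequest) -> tuple[bool, str]:
    """Return (allowed, reason). A convincing deepfake on the original
    channel cannot satisfy any of these checks by itself."""
    if req.amount > CALLBACK_THRESHOLD and not req.callback_confirmed:
        return False, "no callback on a separate channel"
    if req.approver == req.requester:
        return False, "second approver must be independent"
    if not req.verification_word_ok:
        return False, "verification word missing"
    return True, "ok"
```

Applied to the Arup scenario, a $25 million request with no out-of-band callback fails at the first check, regardless of how convincing the video call was.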

The verification word

This is the “What’s Your Move” for this episode. The whole show, really.

Pick a short word or phrase. Share it with the people who matter: your partner, your kids, your parents, the colleagues authorised to move money. When anyone calls asking for an urgent transfer, a password reset, or “Mum I lost my phone, send money to this number,” ask for the word.

If they do not have it, the request stops there.

It costs nothing. It takes five minutes to set up. It would have stopped the Arup attack at the third sentence of the video call. And it works against attack technologies we have not seen yet, because the verification step is independent of whatever the attacker is pointing at you.

The 10th man

If everyone in the room agrees, at least one person has stopped thinking.

In the Arup video call, every face on screen agreed. The CFO agreed. The executives agreed. The pressure to comply was deliberate. There was no one in that meeting whose job was to say “hold on, this feels wrong, can we step out and verify on a different channel?”

The “10th man doctrine” is structured dissent. If nine people agree, the tenth person’s job is to find the disagreement. To prove the consensus wrong, or fail trying. Not stubbornness for its own sake. A deliberate slot for the unwelcome question.

Most companies do not have a 10th man. Most do not even have a habit of structured disagreement. The cost of that absence shows up in cases like Arup.

On AI regulation

We will get this question every episode for the next year: shouldn’t this be regulated more? André’s position, which we will keep arguing about on the show: more rules do not help, because the technology is moving faster than legislation. The right response is more thinking, not more bureaucracy. The right response is also more individual responsibility, more callback rules, more verification words. The right response is the kind of culture that lets a junior employee say “no, I am not doing this until I verify.”

Frequently asked questions

  • What was the Arup deepfake scam?
    A Hong Kong office of the British engineering firm Arup lost $25 million in early 2024 after an employee was tricked into making fifteen wire transfers across five banks. Every executive on the video call instructing the transfer, including the firm's CFO, was a deepfake generated from publicly available video and audio of the real people.
  • How was the employee convinced to transfer $25 million?
    A phishing email asked for an urgent, confidential transaction. The employee was skeptical and nearly dismissed it. What changed his mind was a video call where he saw and heard people who looked exactly like his real colleagues, all agreeing the transfer was legitimate and time-critical. The attackers had built convincing fakes of the CFO and several executives from social media, conference recordings, and virtual meeting footage.
  • How much audio does an attacker need to clone someone's voice?
    About twenty to thirty seconds for a baseline clone in 2026. A two-minute podcast clip, a recording of you speaking at a conference, or a voicemail greeting is more than enough. Variation in pitch helps the model but is not required. The bar to start is roughly the length of a voicemail.
  • Why did multifactor authentication not stop this?
    Multifactor authentication protects accounts from being logged into by attackers. It does not protect a person from being talked into authorising a transaction themselves. In the Arup case, the employee logged in with his own credentials and approved the transfers himself, on the instruction of what he believed to be his CFO. MFA was not the failure point.
  • What is a verification word and how do I set one up?
    A verification word is a short phrase that you and the people you trust have agreed on in advance. When anyone calls or messages asking for an urgent transfer, password, or favor that involves money, you ask for the word. If they cannot produce it, the request stops. Set one up by picking a phrase no one would guess from your public life, sharing it with your family and the colleagues authorised to move money, and using it the next time anyone asks for anything urgent.