
Man or machine? Toronto company finds a way to determine whether audio clips are real

Yan Fossat of Klick Labs poses for a photo in Toronto on Tuesday, May 14, 2024. Fossat and his team at Klick Labs in Toronto have found a way to determine whether audio clips are voiced by humans or artificial intelligence. THE CANADIAN PRESS/Nathan Denette

Eyes may be the windows to the soul, but at Klick Labs, it's all about the voice.

The Toronto-based research arm of life sciences technology firm Klick Health has found a way to analyze voices in a manner that’s so granular, it can tell whether it's a person or an artificial intelligence-powered machine.

The capability comes as the number of deepfakes (AI-produced video, audio clips or photos that appear real) has exploded with the recent release of several AI chatbots. Everyone from pop star Taylor Swift to U.S. President Joe Biden and the Pope has fallen victim to the phenomenon.

And it’s not expected to abate any time soon. The European Union’s law enforcement agency Europol recently predicted as much as 90 per cent of online content may be synthetically generated by 2026 and the Canadian Security Intelligence Service has called the situation "a real threat to a Canadian future."

But Yan Fossat, Klick Labs’ senior vice-president of digital health research and development, is hopeful his company can help make the world of AI a bit safer.

"Every technology that's not regulated is dangerous and this is moving a bit faster than a lot of things," he said, while standing in Klick's downtown Toronto lab.

It was in that space — cluttered with wires, pieces of household electronics and whirring 3D printers — that Fossat and a team of three started thinking about how their favourite science fiction films could help them tackle deepfakes.

“In 'Terminator,' they use dogs to smell if the people look like humans and in ‘Blade Runner,’ there's the Voight-Kampff machine and I've always wanted to make a Voight-Kampff machine,” said Fossat, referencing a fictional test used in the film to measure physiological responses, such as eye motion and reaction time, to determine whether a character was a human or replicant.

For their own project, the Klick team recruited 49 people with diverse backgrounds and accents, whose audio recordings they fed to a deepfake generator to create synthetic clips.

The clips were analyzed based on their vocal biomarkers — features embedded in voices that tell us something about the speaker’s health or physiology.

For example, if someone has just dashed up a flight of stairs, they breathe faster, which can be heard in their voice. Voice can also indicate when someone has just woken up or is feeling tired.

Klick Labs has identified 12,000 of these biomarkers, but to tell man from machine, Jaycee Kaufman, Klick's lead scientist, said it so far relies on five: the length and variation of speech, the rates of micropauses and macropauses, and the overall proportion of time spent speaking versus pausing.

Micropauses last less than half a second and macropauses last longer than that, she said. They often occur naturally when a speaker simply takes a breath or searches for a word.

“We don’t really pay attention to it, but it’s happening,” added Fossat.

“We have a brain and it needs to think and we have lungs and we need to breathe. Machines don't have that, so they don't do it.”
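The pause-based features described above can be sketched in a few lines of code. This is a minimal illustration, not Klick Labs' actual method: it assumes the audio has already been split into speech and pause segments (for example, by a voice-activity detector), and the function name, segment format and example durations are all invented for this sketch.

```python
# Hypothetical sketch of the pause-based vocal features described above.
# Input: a list of (kind, duration_in_seconds) segments, where kind is
# "speech" or "pause", as produced by some upstream voice-activity detector.

def pause_features(segments):
    speech = [d for k, d in segments if k == "speech"]
    pauses = [d for k, d in segments if k == "pause"]
    total = sum(speech) + sum(pauses)
    micro = [d for d in pauses if d < 0.5]   # micropauses: under half a second
    macro = [d for d in pauses if d >= 0.5]  # macropauses: half a second or more
    return {
        "speech_time": sum(speech),               # total time spent speaking
        "micropause_rate": len(micro) / total,    # micropauses per second of audio
        "macropause_rate": len(macro) / total,    # macropauses per second of audio
        "speaking_ratio": sum(speech) / total,    # speaking vs pausing proportion
    }

# Invented example clip: three stretches of speech separated by two pauses.
clip = [("speech", 2.1), ("pause", 0.3), ("speech", 1.4),
        ("pause", 0.9), ("speech", 3.0)]
features = pause_features(clip)
```

A detector built on this idea would compare such feature values against the ranges typical of human speech; a synthetic voice that never takes a breath would show an unusually high speaking ratio and few or no natural pauses.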

So far, Klick Labs’ method of identifying deepfakes has an 80 per cent success rate, but it might not last long.

It’s only getting harder to tell whether a clip is a deepfake or not because AI is constantly evolving and “becoming better and better at sounding like voices from humans,” Fossat said.

“For example, OpenAI, the company who makes (generative AI chatbot) ChatGPT just came out a couple of weeks ago with a new vocal deepfake voice that is so good it breathes,” he said.

“It fakes those micro breaths, which is quite amazing.”

He insists the development hasn't rendered Klick Labs' research useless, because there are thousands of other biomarkers, such as heart rate, that it can test for deepfake detection.

Sixteen other studies on vocal biomarkers and diseases that Klick Labs is conducting could also aid its research.

One of those studies has used vocal biomarkers to diagnose diabetes with 89 per cent accuracy for women and 86 per cent for men.

That research will soon be continued with a study Klick is set to run with Humber River Hospital in Toronto and Fossat said it could eventually form the basis of phone-based tools anyone can use to find out how at risk they are of having the disease.

Every advance in Klick's research means more chances to learn about biomarkers and apply that knowledge back to the detection of diseases and deepfakes, which are proving hard to keep up with.

"It moves so fast every time you do something, by the time you're finished … everything has changed and we have to do it again," Fossat said.

This report by The Canadian Press was first published May 26, 2024. 
