Researchers develop a way to hear photos using artificial intelligence

The technology uses image stabilization and artificial intelligence to extract audio from still images and muted videos.

Researchers at Northeastern University have developed a way to extract audio from both still photos and muted videos using artificial intelligence.

The research project is called Side Eye.

“Most of the cameras today have what's called image stabilization hardware,” said Kevin Fu, a professor of electrical and computer engineering at Northeastern University. “It turns out that when you speak near a camera lens that has some of these functions, a camera lens will move ever so slightly, what's called modulating your voice, onto the image and it changes the pixels.”

In essence, these small lens movements can be converted into rudimentary audio, which Side Eye's artificial intelligence can then interpret as individual words with high accuracy, according to the research team.

“You're able to get thousands of samples per second. What does this mean? It means you basically get a very rudimentary microphone,” Fu said.


Although the recovered audio sounds muffled, some information can still be extracted from it.

“Things like understanding what is the gender of the speaker, not on camera but in the room while the photograph or video is being taken, that's nearly 100% accurate,” he said.

So what can technology like this be used for?

“For instance in legal cases or in investigations of either proving or disproving somebody’s presence, it gives you evidence that can be backed up by science of whether somebody was likely in the room speaking or not,” Fu said.

“This is one more tool we can use to bring authenticity to evidence, potentially to investigations, but also trying to solve criminal applications,” he said.
