Scientists find way to extract sound from silent videos and still photos

When you take a photo on your phone, the vibrations of your voice can create tiny bends in the light that are enough to extract audio, according to Kevin Fu, a professor of engineering and computer science at Northeastern University. Credit: Matthew Modoono/Northeastern University

In this world of constant video calls and virtual meetings, we are all familiar with phrases like “mute yourself” or “you’re muted”.

However, the concept of muting may not be as secure as we thought, thanks to a revolutionary technology developed by Kevin Fu, a professor at Northeastern University.

Kevin and his team have designed a machine-learning tool called Side Eye, which can, astonishingly, extract audio from pictures and muted videos!

This means that if someone was speaking in the room when a photo was taken or a muted video was recorded, Side Eye can figure out who was speaking and even what words were spoken.

Here’s a simplified scenario: Think about a muted TikTok video with dubbed music.

Side Eye could reveal what the person was really saying during the recording, and it could even pick up off-camera conversations! Sounds like something out of a sci-fi movie, right?

In fact, the inspiration for Side Eye did come from a science fiction TV show called “Fringe”.

Side Eye operates by leveraging the image stabilization technology found in most phone cameras today.

This technology uses small springs to suspend the camera lens in liquid to avoid blurry photos due to shaky hands.

However, when someone speaks near a camera lens, the sound produces tiny vibrations in the springs and slightly alters the angle of the light.

Using Side Eye, Kevin and his team can detect these microscopic vibrations and extract sonic frequencies from them, converting them into audible sound.
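To make the idea of pulling frequencies out of vibrations more concrete, here is a minimal Python sketch. It is not Side Eye's actual method, which this article does not describe in detail. It only assumes that the lens movement has already been measured as a list of displacement values over time, and it uses a basic Fourier transform to find the strongest frequency in that list. The function name, the sample rate, and the 200 Hz test tone are all made up for illustration.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the strongest non-zero frequency (in Hz) in a list of samples.

    Uses a naive discrete Fourier transform, which is slow but needs no
    external libraries. Illustrative only, not the Side Eye pipeline.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset

    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):  # skip bin 0, stop at the Nyquist bin
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power

    return best_bin * sample_rate_hz / n


# Hypothetical stand-in for measured lens displacement:
# a pure 200 Hz vibration sampled 2000 times per second.
SAMPLE_RATE_HZ = 2000
TONE_HZ = 200
displacement = [
    0.01 * math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE_HZ)
    for i in range(400)
]

estimated_hz = dominant_frequency(displacement, SAMPLE_RATE_HZ)
print(estimated_hz)
```

With this synthetic input, the estimate should match the 200 Hz tone that was fed in. Real camera data would be far noisier and would contain many frequencies at once, which is where the machine learning described in this article comes in.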

Even if the resulting audio sounds muffled, like the adult voices in Peanuts cartoons, machine learning lets the team decipher the sound, identify the words spoken, and potentially even identify the speaker.

This technology doesn’t stop at deciphering words. It has implications for cybersecurity and the legal system. It opens up new cybersecurity threats, because conversations believed to be private thanks to muted microphones or silent recordings can now potentially be heard.

On the other hand, in legal contexts, it can serve as a new form of digital evidence to verify alibis by confirming whether a person was present in a certain location at a certain time, based on whether their voice is detected in recordings or photos from the scene.

While the ability to hear from silent videos and photos raises concerns, it also presents exciting possibilities.

It could be a groundbreaking way to gather information, and in the right hands, could be used to solve crimes, verify information, and perhaps even explore new realms in science and technology.

Kevin Fu and his team have truly pushed the boundaries of what is possible with technology, showing us that sometimes, reality can be just as surprising as fiction. Side Eye presents a blend of possibilities and challenges, making us reconsider the way we perceive sound, silence, and privacy in the digital age.

This innovation is a stark reminder of the endless possibilities that technology holds, inviting us to explore the uncharted territories of scientific advancements.
