
Imagine wearing a pair of ordinary wireless earbuds that can quietly “see” what you see and answer your questions about it.
Researchers at the University of Washington have created a new prototype system that does exactly this.
Called VueBuds, the device uses tiny built-in cameras to capture images and allows users to talk with an AI assistant about what is in front of them.
For example, if someone looks at a food package written in Korean, they can simply ask, “Hey Vue, translate this for me.”
Within about a second, the AI responds through the earbuds with an English translation. The goal is to make everyday tasks like reading labels, identifying objects, or understanding surroundings easier and more natural.
Unlike smart glasses or virtual reality headsets, which many people find uncomfortable or intrusive, earbuds are already widely used and socially accepted.
The research team wanted to build a system that fits seamlessly into daily life while also addressing privacy concerns.
Instead of recording high-quality video or sending data to the cloud, VueBuds captures low-resolution, black-and-white images and processes them directly on the user’s device.
A small light turns on when images are being taken, and users can delete them immediately.
Power use was a major challenge. Cameras usually drain batteries quickly, and a Bluetooth link has too little bandwidth to stream video. To solve this, the team used extremely small, low-power cameras, each about the size of a grain of rice, that take simple still images instead of video. This keeps the system efficient while still providing useful information.
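To see why still images suit a low-bandwidth link so much better than video, consider a back-of-the-envelope sketch in Python. The resolution, frame rate, and throughput figures below are illustrative assumptions, not numbers from the VueBuds paper.

```python
# Back-of-the-envelope comparison: one low-resolution grayscale
# still versus one second of raw color video. All numbers are
# illustrative assumptions, not specs from the VueBuds paper.

STILL_W, STILL_H = 320, 240      # assumed low-res still frame
VIDEO_W, VIDEO_H = 1280, 720     # assumed video frame (e.g., smart glasses)
FPS = 30                         # assumed video frame rate
BT_BYTES_PER_SEC = 200_000       # assumed usable Bluetooth throughput

still_bytes = STILL_W * STILL_H              # 1 byte/pixel, grayscale
video_bytes = VIDEO_W * VIDEO_H * 3 * FPS    # 3 bytes/pixel RGB, per second

print(f"grayscale still : {still_bytes / 1024:.0f} KiB, "
      f"~{still_bytes / BT_BYTES_PER_SEC:.2f} s to send")
print(f"1 s of raw video: {video_bytes / 2**20:.0f} MiB, "
      f"~{video_bytes / BT_BYTES_PER_SEC:.0f} s to send")
```

Under these assumed numbers, a single grayscale still moves across the link in a fraction of a second, while one second of raw video would take minutes, which is why stills make the design workable.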
Another question was whether cameras mounted in earbuds, at the sides of the head, could actually capture a useful view of the scene in front of the wearer. The researchers found that angling each camera slightly outward gives a wide combined field of view, covering most of what the user sees.
Although there is a small blind spot for objects very close to the face, this rarely affects normal use.
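A rough geometric sketch shows how outward-angled cameras widen coverage while leaving that small central blind spot. The field of view, tilt angle, and ear spacing below are assumed values, not measurements from the prototype.

```python
import math

# Toy geometry for two earbud cameras angled slightly outward.
# Field of view, tilt, and ear spacing are assumed values,
# not measurements from the VueBuds prototype.

FOV_DEG = 70.0        # assumed horizontal field of view per camera
TILT_DEG = 20.0       # assumed outward tilt of each camera
EAR_SPACING_M = 0.16  # assumed distance between the two earbuds

# Far away, coverage spans from one outer edge to the other:
# each outer edge sits at TILT + FOV/2 from straight ahead.
combined_deg = FOV_DEG + 2 * TILT_DEG

# The inner edge of each camera's view crosses the midline at
# this range; anything straight ahead but closer goes unseen.
inward_rad = math.radians(FOV_DEG / 2 - TILT_DEG)
blind_spot_m = (EAR_SPACING_M / 2) / math.tan(inward_rad)

print(f"combined horizontal coverage: ~{combined_deg:.0f} degrees")
print(f"central blind spot ends ~{blind_spot_m * 100:.0f} cm ahead")
```

With these assumed values, the pair covers roughly 110 degrees, and only objects closer than about 30 centimeters directly in front of the nose fall outside both views, which matches the near-face blind spot the researchers describe.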
To improve speed, the system stitches the images from both earbuds into a single picture before sending it to the AI, so the model answers one query instead of two. This cuts the response time to about one second, making the interaction feel almost instant.
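One simple way to merge the two views is to paste the frames side by side before querying the model. The researchers do not describe their exact merging method, so the sketch below, with placeholder file names, is only one plausible implementation.

```python
from PIL import Image

# Sketch of the "combine both earbud views" step, assuming a
# simple side-by-side paste; the exact merging method used by
# the researchers may differ, and the file names are placeholders.

def combine_views(left_path: str, right_path: str) -> Image.Image:
    left = Image.open(left_path).convert("L")    # "L" = 8-bit grayscale
    right = Image.open(right_path).convert("L")
    canvas = Image.new("L", (left.width + right.width,
                             max(left.height, right.height)))
    canvas.paste(left, (0, 0))
    canvas.paste(right, (left.width, 0))
    return canvas  # one stitched image means one query to the model

# Example usage (placeholder paths):
# combined = combine_views("left_earbud.png", "right_earbud.png")
# combined.save("stitched.png")
```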
In testing, participants compared VueBuds with smart glasses that use higher-quality images processed in the cloud.
Surprisingly, both systems performed similarly overall. VueBuds did especially well in translation tasks, while the glasses were slightly better at counting objects. The earbuds achieved over 80% accuracy in identifying objects and translating text, and even higher accuracy when recognizing book titles and authors.
The current system cannot detect colors because it uses grayscale images, but the team hopes to add color in the future. They also plan to develop more specialized AI features, such as helping people with low vision read books or navigate their environment.
This early research shows that powerful visual AI may soon be built into something as simple and familiar as a pair of earbuds, quietly turning everyday moments into interactive experiences.