This App Lets You Control Your Phone Using Sonar

The built-in microphone and speakers of an iPhone can be used for far more than just talking.
November 15, 2016

As I swat my hand left and right, a Space Invaders-style rocket moves across the screen of my iPhone. But my fingers are at least six inches from the handset, because I’m actually controlling the game via sonar, generated by nothing more than the phone’s stock hardware.

This is made possible by software built by Wei Wang and Alex X. Liu from the Department of Computer Science and Technology at Nanjing University in China. It uses the phone’s onboard speakers to emit sound at frequencies between 17 and 23 kilohertz—toward the upper end of human hearing, and just barely audible to younger ears. By analyzing the reflected signals detected by built-in microphones, it’s possible to measure the proximity of an object, such as your hand, to within four millimeters.
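The researchers’ code isn’t published alongside this article, but one standard way to turn an inaudible tone into a rangefinder is continuous-wave phase tracking: play a steady tone, mix the microphone signal back down to baseband, and watch the phase rotate as the echo path stretches or shrinks. The Python sketch below simulates that idea end to end; the sample rate, carrier frequency, crude low-pass filter, and all function names are illustrative assumptions, not details of Wang and Liu’s system.

    import numpy as np

    FS = 48_000       # sample rate (Hz), typical of phone audio hardware
    F_TONE = 20_000   # near-ultrasonic carrier inside the 17-23 kHz band
    C = 343.0         # speed of sound in air (m/s)

    def simulate_echo(distances_m):
        """Stand-in for a real microphone recording: the echo of a steady
        tone reflecting off a hand whose distance at each sample is given
        by distances_m."""
        n = np.arange(len(distances_m))
        # The round trip (out and back) delays the tone, shifting its phase.
        return np.cos(2 * np.pi * F_TONE * (n / FS - 2 * distances_m / C))

    def track_motion(rx):
        """Recover hand motion from the received signal: mix with in-phase
        and quadrature copies of the tone, low-pass filter, and convert the
        unwrapped baseband phase back into distance."""
        n = np.arange(len(rx))
        i = rx * np.cos(2 * np.pi * F_TONE * n / FS)
        q = rx * np.sin(2 * np.pi * F_TONE * n / FS)
        win = np.ones(240) / 240        # crude 5 ms moving-average low-pass
        i = np.convolve(i, win, mode="same")
        q = np.convolve(q, win, mode="same")
        phase = np.unwrap(np.angle(i + 1j * q))
        # One wavelength of round-trip path change equals 2*pi of phase,
        # and the round trip is twice the distance: d = phase*c/(4*pi*f).
        return phase * C / (4 * np.pi * F_TONE)

    # A hand drifting from 15 cm to 10 cm away over one second.
    true_d = np.linspace(0.15, 0.10, FS)
    est = track_motion(simulate_echo(true_d))
    print(f"estimated movement: {1000 * (est[-1] - est[0]):.1f} mm")
    # prints roughly -50 mm: the hand moved about 5 cm closer

In a real system the tone would be played and recorded simultaneously, and the faint reflection off a hand would have to be separated from the far stronger direct speaker-to-microphone path, which is a large part of the engineering challenge.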

Controlling a device without having to touch it directly isn’t a cheap gimmick. It’s useful anywhere there’s a mess, from a kitchen to an operating room. There have been past efforts to create low-cost gesture control for computers (see “Leaping into the Gesture-Control Era”). But Wang and Liu think their technique could be cheaply baked into any modern smartphone, with no additional hardware at all.

All it takes is a wave of your hand.

It’s not the first time that a smartphone has been used to perform echolocation. Shyam Gollakota from the University of Washington (an MIT Technology Review Innovator Under 35 in 2014), for instance, used a similar approach in his FingerIO project, which investigated whether high-frequency sound and on-board hardware could be used to locate human hands. But Gollakota is impressed by Wang and Liu’s work because they are able to use it for direct control of the phone. It’s “very cool,” he says, that the technique works in real time.

There is a little lag in its response to human gestures: Wang says the phone takes 15 milliseconds to detect and process movement. In practice, though, that’s barely discernible, and the control itself is impressive. At first you feel compelled to move your whole hand, but with a little practice it becomes possible to achieve the same results by moving a single finger.

The app is currently a research project and isn’t available in any app store. But the pair plans to turn it into an API that other developers can use to bake the echolocation system into apps for iPhones and Android devices. They reckon it could end up being used to scroll Web pages, say, or turn the pages of an e-book.

Those are much like the features proposed as part of Google’s Project Soli. But the Google researchers are taking a different tack: building a dedicated radar chip that could be added to a device. While that promises greater accuracy than the sonar technique, it also means integrating yet another piece of hardware into a smartphone.

Wang points out that hardware manufacturers could instead simply optimize the positioning of microphones and speakers on existing devices. They could even raise the upper frequency at which devices transmit and receive sound to achieve submillimeter resolution; the hardware would still likely be cheaper than that required for radar.
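A quick back-of-envelope calculation shows why frequency helps: the carrier’s wavelength sets the distance that one full cycle of phase spans, and that scale shrinks in proportion as the frequency rises. (The frequencies below are illustrative, not figures from the researchers.)

    # Wavelength of sound in air at a few hypothetical carrier frequencies.
    C = 343.0  # speed of sound (m/s)
    for f_hz in (20_000, 40_000, 80_000):
        print(f"{f_hz // 1000} kHz carrier -> {1000 * C / f_hz:.1f} mm wavelength")
    # roughly 17 mm at 20 kHz, 8.6 mm at 40 kHz, 4.3 mm at 80 kHz:
    # doubling the frequency halves the span of one cycle of phase.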

Meanwhile, Wang and Liu continue to improve their software in the lab. They hope to develop versions for smartwatches and VR headsets, where interacting with a screen is difficult or impossible, and to build new algorithms that can pick out individual fingers. They already claim to be able to track the motion of a hand accurately enough to identify handwritten characters with more than 90 percent accuracy.
