A group of researchers from five American universities has developed a side-channel attack called EarSpy that can be used to eavesdrop on Android devices: it can recognize the caller's gender and identity and partially recover the content of the conversation. The eavesdropping relies on motion sensors, which can pick up the reverberations produced by a mobile device's speakers.
The EarSpy attack was presented by experts from Texas A&M University, New Jersey Institute of Technology, Temple University, University of Dayton, and Rutgers University. They said that similar side-channel attacks had been studied before, but several years ago smartphone speakers were found to be too weak to generate enough vibration for eavesdropping.
Modern smartphones use more powerful stereo speakers (compared to models of previous years), which provide better sound quality and stronger vibrations. Likewise, modern devices use more sensitive motion sensors and gyroscopes, which are able to register even the smallest nuances of the speakers' operation.
The difference is visible in the spectrograms in the image below: the output of the 2016 OnePlus 3T speakers barely registers, whereas the stereo speakers of the 2019 OnePlus 7T yield significantly more data.
From left to right: OnePlus 3T, OnePlus 7T, OnePlus 7T
In their experiments, the researchers used a OnePlus 7T and a OnePlus 9, as well as various sets of pre-recorded sounds that were played through the devices’ speakers. They also used a third-party app called Physics Toolbox Sensor Suite to collect accelerometer readings during a simulated call, then fed them into MATLAB for analysis.
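To give a rough idea of what such an analysis starts from, here is a minimal sketch in Python (not the researchers' MATLAB code, whose details the article does not describe). It simulates one axis of accelerometer readings while a single tone plays through the speaker and uses a plain discrete Fourier transform to find the dominant vibration frequency. The sample rate, tone frequency, amplitudes, and function names are all assumptions chosen for illustration.

```python
import math
import random


def dft_magnitudes(samples, sample_rate_hz):
    """Return (frequency_hz, magnitude) pairs for the one-sided spectrum."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant gravity offset
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(centered))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(centered))
        spectrum.append((k * sample_rate_hz / n, math.hypot(re, im) / n))
    return spectrum


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC component."""
    spectrum = dft_magnitudes(samples, sample_rate_hz)
    return max(spectrum[1:], key=lambda pair: pair[1])[0]


# Hypothetical values, not taken from the study.
SAMPLE_RATE_HZ = 400   # assumed accelerometer sampling rate
TONE_HZ = 120          # assumed tone played through the speaker
random.seed(0)         # make the simulated noise reproducible

# One second of simulated z-axis readings: gravity + speaker vibration + sensor noise.
trace = [
    9.81
    + 0.02 * math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE_HZ)
    + random.gauss(0, 0.005)
    for i in range(SAMPLE_RATE_HZ)
]

peak_hz = dominant_frequency(trace, SAMPLE_RATE_HZ)
print(f"dominant vibration frequency: {peak_hz:.0f} Hz")
```

A sensor sampled at a given rate can only capture vibration components up to half that rate, so real speech would show up as a blurred, band-limited trace rather than a clean tone. That is why the classification step is needed at all.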
The machine learning algorithm was trained on readily available datasets for speech recognition, caller identification, and gender recognition. Results varied depending on the dataset and the device, but overall the experiments showed that such eavesdropping is possible.
For example, on the OnePlus 7T, caller gender recognition accuracy ranged from 77.7% to 98.7%, caller identification from 63.0% to 91.2%, and speech recognition from 51.8% to 56.4%.
On the OnePlus 9, gender recognition accuracy peaked at 88.7%, caller identification dropped to an average of 73.6%, and speech recognition ranged from 33.3% to 41.6%.
The researchers admit that the effectiveness of the EarSpy attack can be significantly reduced by the speaker volume that users choose on their devices. In other words, a low speaker volume may be enough to prevent the eavesdropping altogether.
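The volume effect can be illustrated with some back-of-the-envelope arithmetic. The sketch below assumes, purely for illustration, that vibration amplitude scales linearly with the volume setting while the sensor's noise floor stays fixed. The article states neither, and the numbers are hypothetical.

```python
import math


def snr_db(vibration_amplitude, noise_rms):
    """Signal-to-noise ratio in dB for a sinusoidal vibration over sensor noise."""
    signal_rms = vibration_amplitude / math.sqrt(2)  # RMS of a sine wave
    return 20 * math.log10(signal_rms / noise_rms)


# Hypothetical values, not taken from the study.
NOISE_RMS = 0.005              # assumed accelerometer noise floor
FULL_VOLUME_AMPLITUDE = 0.02   # assumed vibration amplitude at full volume

# Assumption: vibration amplitude is proportional to the volume setting.
snr_by_volume = {
    volume: snr_db(FULL_VOLUME_AMPLITUDE * volume, NOISE_RMS)
    for volume in (1.0, 0.5, 0.25)
}

for volume, snr in snr_by_volume.items():
    print(f"volume {volume:.0%}: SNR {snr:.1f} dB")
```

Under these assumptions, every halving of the volume costs about 6 dB of signal-to-noise ratio, so a quiet call can push the speaker's vibrations below the sensor's noise.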
In addition, the reverberations and the resulting output are significantly affected by the arrangement of the device's hardware components and by how tightly the device is assembled, while user movement and vibrations from the environment reduce the accuracy of the data.
OnePlus 7T Device
Let me remind you that an earlier study used a proof-of-concept app called Spearphone, which also abused access to the accelerometer and analyzed the reverberations produced during phone calls. In that case, however, the researchers used the speakerphone, which is why gender and caller identification reached 99% accuracy and speech recognition reached 80%.
The authors of EarSpy conclude that phone manufacturers should ensure a stable sound pressure level during phone calls and place the motion sensors in the case so that internal vibrations affect them as little as possible.
Source