How algorithms decipher your secrets through a microphone

Carding

Acoustics or magic?
A team of researchers from British universities has trained a deep learning model that can reconstruct keystrokes from keyboard sounds recorded by a microphone, with up to 95% accuracy.

When the sound classification algorithm was trained on keystrokes recorded over Zoom, the prediction accuracy dropped to 93%. Even so, this figure remains dangerously high and is a record for this communication medium.

Such an attack poses a serious data security risk because it can leak passwords, discussions, messages, or other sensitive information to attackers.

Unlike other side-channel attacks, which require special conditions, acoustic attacks are made easier by the proliferation of devices with high-quality microphones. Combined with the rapid development of machine learning, this makes acoustic attacks much more dangerous.

The first step of the attack is to record keystrokes on the target's keyboard, as this data is needed to train the prediction algorithm. This can be done with a nearby microphone or with the target's phone, if it has been infected with malware that can access its microphone.

Alternatively, keystrokes can be recorded during a Zoom call, where a rogue meeting participant correlates the messages typed by the target with the audio of the call.

The researchers collected training data by pressing 36 keys on a modern MacBook Pro 25 times each and recording the sound of every press, for 900 recordings in total.
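The size of that data set follows directly from the two numbers above. A minimal sketch, assuming the 36 keys are the digits 0-9 and the letters a-z (the article does not list them) and using a made-up file naming scheme purely for illustration:

```python
import string

# Assumption: the 36 keys are the digits 0-9 and the letters a-z.
KEYS = list(string.digits + string.ascii_lowercase)
PRESSES_PER_KEY = 25  # figure given in the article

# Hypothetical naming scheme: one recording per (key, press index) pair,
# stored together with the key label it belongs to.
samples = [
    (f"{key}_{i:02d}.wav", key)
    for key in KEYS
    for i in range(PRESSES_PER_KEY)
]

print(len(KEYS), len(samples))  # 36 900
```

Each entry pairs a recording with its label, which is the form a supervised classifier needs.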

Sample keystroke sound
They then produced waveforms and spectrograms from the recordings, which visualize identifiable differences between keys, and applied specific data processing steps to amplify the signal features that can be used to identify keystrokes.
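To show what turning a recording into a spectrogram involves, here is a minimal sketch using a plain short-time Fourier transform in NumPy. The article does not say which transform or parameters the researchers used, so the frame length, hop size, and the synthetic "click" signal below are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=64):
    """Magnitude spectrogram via a short-time Fourier transform.

    Returns an array of shape (frequency bins, time frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of each frame.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Synthetic stand-in for a key press: a short, decaying 3 kHz burst.
SAMPLE_RATE = 44_100
FRAME_LEN = 256
t = np.arange(int(0.1 * SAMPLE_RATE)) / SAMPLE_RATE
click = np.exp(-80 * t) * np.sin(2 * np.pi * 3000 * t)

spec = spectrogram(click, frame_len=FRAME_LEN)

# The frequency bin carrying the most energy should sit near 3 kHz.
peak_bin = int(spec.sum(axis=1).argmax())
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(spec.shape, round(peak_hz))
```

The resulting 2-D array can be rendered as an image, which is what allows an image classifier to be applied to sound.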

Produced spectrograms
The spectrogram images were used to train 'CoAtNet', an image classifier. The process required experimentation with the learning rate and data-splitting parameters before the best prediction accuracy was reached.
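Data splitting here means deciding which recordings are used for training and which are held back for validation. A minimal sketch of a per-key (stratified) split follows; the 80/20 ratio and the 0-9, a-z key set are assumptions, not the parameters the researchers settled on.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split sample indices so each class keeps the same validation share."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold back at least one sample per class for validation.
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

# 36 assumed keys, 25 presses each, as in the article.
labels = [key for key in "0123456789abcdefghijklmnopqrstuvwxyz" for _ in range(25)]
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))  # 720 180
```

Splitting per key rather than over the whole pool guarantees that no key is missing from the validation set, which matters when each key has only 25 recordings.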

Parameters chosen for CoAtNet training
The experiments used the same laptop, recorded with an iPhone 13 mini and over Zoom. The CoAtNet classifier achieved 95% accuracy on the smartphone recordings and 93% on the Zoom recordings. Recordings made over Skype gave a lower but still usable accuracy of 91.7%.
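These percentages are per-keystroke figures. As a worked illustration, the sketch below computes accuracy as the share of correctly predicted keys and then estimates the chance of getting an entire string right, under the simplifying assumption (not made in the article) that errors on different keystrokes are independent.

```python
def accuracy(predicted, actual):
    """Fraction of keystrokes whose predicted key matches the real one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

def whole_string_probability(per_key_accuracy, length):
    """Chance that every key in a string is right, assuming independent errors."""
    return per_key_accuracy ** length

# One wrong key out of eight.
print(accuracy(list("passw0rd"), list("password")))  # 0.875

# At 95% per key, an 8-character string comes out fully correct
# roughly two times out of three.
print(round(whole_string_probability(0.95, 8), 3))  # 0.663
```

So a 95% per-key figure does not mean that 95% of passwords are recovered intact, although an attacker left with only one or two uncertain characters has very little guessing to do.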

Test setup
Those concerned about acoustic attacks are advised to vary their typing style or to use randomized passwords. Other protective measures include software that plays fake keystroke sounds, white noise, or software audio filters that suppress keyboard sounds.
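As an illustration of the white-noise idea, the sketch below mixes Gaussian noise into a signal at a chosen signal-to-noise ratio. The function names and the 0 dB target are assumptions made for this example, and it shows only how the mixing works, not how much noise is needed to defeat a classifier.

```python
import numpy as np

def add_white_noise(signal, snr_db, seed=0):
    """Return the signal with Gaussian white noise added at the given SNR (dB)."""
    rng = np.random.default_rng(seed)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def measured_snr_db(clean, noisy):
    """Signal-to-noise ratio in dB, recovered from the clean and noisy signals."""
    noise = noisy - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

# One second of a 3 kHz tone as a synthetic stand-in for keyboard sound.
sample_rate = 44_100
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 3000 * t)

# 0 dB means the noise carries as much power as the signal.
masked = add_white_noise(tone, snr_db=0.0)
print(round(float(measured_snr_db(tone, masked)), 1))
```

A lower SNR buries the keystroke sound more deeply, at the cost of a noisier environment for everyone else on the call.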

Remember that the attack model has proven to be very effective even against very quiet keyboards, so adding noise dampeners to mechanical keyboards or switching to membrane keyboards is unlikely to help.

Finally, biometric authentication and password managers that fill in sensitive information automatically can also help keep users safe, since nothing has to be typed.
 