A team of UK researchers has trained a deep learning model to interpret keystrokes remotely based solely on audio.
By recording keystrokes to train the model, they were able to predict what was typed on the keyboard with up to 95% accuracy. This accuracy dropped to 93% when the training recordings were captured over Zoom.
According to the new research, this means that sensitive information like passwords and messages could be interpreted by anyone within earshot of someone typing away on their laptop, either by recording them in person or virtually through a video call.
These so-called acoustic side-channel attacks have become much simpler in recent years due to the abundance of microphone-bearing devices like smartphones that can capture high-quality audio.
Combined with the rapid advancements in machine learning, this makes these kinds of attacks feasible and a lot more dangerous than previously thought. In principle, an attacker could steal sensitive information armed with nothing more than a microphone and a machine learning algorithm.
“The ubiquity of keyboard acoustic emanations makes them not only a readily available attack vector, but also prompts victims to underestimate (and therefore not try to hide) their output,” the researchers said. “For example, when typing a password, people will regularly hide their screen but will do little to obfuscate their keyboard’s sound.”
The team conducted the test using a MacBook Pro. They pressed 36 individual keys 25 times apiece. These recordings were the basis for the machine learning model to recognise which character is associated with which keystroke sound.
The keystrokes were recorded both by a phone placed close to the laptop and over Zoom. The recordings contained enough subtle differences between the waveforms of different keys for the model to recognise each key with a startling degree of accuracy.
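To give a feel for the scale of that training set, here is a minimal Python sketch. It is not the researchers' deep learning model: it swaps in a simple nearest-centroid classifier and uses synthetic numbers in place of real audio. The key set (letters plus digits), the feature count and the noise level are all assumptions made for illustration.

```python
import random
import string

# Assumed key set: 26 letters + 10 digits = 36 keys (the article only gives the count).
KEYS = list(string.ascii_lowercase + string.digits)
PRESSES_PER_KEY = 25   # from the article
N_FEATURES = 8         # hypothetical number of audio features per keystroke
NOISE = 0.05           # hypothetical press-to-press variation


def make_dataset(seed=0):
    """Build synthetic (features, key) samples; real audio is NOT used here."""
    rng = random.Random(seed)
    # Each key gets a made-up "acoustic signature" vector.
    signatures = {k: [rng.uniform(-1, 1) for _ in range(N_FEATURES)] for k in KEYS}
    dataset = []
    for key in KEYS:
        for _ in range(PRESSES_PER_KEY):
            features = [s + rng.gauss(0, NOISE) for s in signatures[key]]
            dataset.append((features, key))
    return dataset


def train_centroids(samples):
    """Average the feature vectors of each key to get one centroid per key."""
    sums, counts = {}, {}
    for features, key in samples:
        acc = sums.setdefault(key, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[key] = counts.get(key, 0) + 1
    return {k: [v / counts[k] for v in sums[k]] for k in sums}


def predict(centroids, features):
    """Return the key whose centroid is closest to the given features."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda k: sq_dist(centroids[k]))


dataset = make_dataset()
# Samples are stored key by key, so index % 25 is the press number for that key.
train = [s for i, s in enumerate(dataset) if i % PRESSES_PER_KEY < 20]
test = [s for i, s in enumerate(dataset) if i % PRESSES_PER_KEY >= 20]

centroids = train_centroids(train)
accuracy = sum(predict(centroids, f) == k for f, k in test) / len(test)
print(len(dataset), "samples; held-out accuracy:", accuracy)
```

With 36 keys pressed 25 times each, the model has 900 labelled examples to learn from. The sketch holds back the last five presses of each key to check its guesses. Real recordings would be far noisier than this toy data.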
To stop someone eavesdropping on your keystrokes, the researchers recommend changing your typing style, using randomised passwords rather than passwords containing full words, adding randomly generated fake keystrokes to counter voice call-based attacks, and using biometric tools such as fingerprint or face scanning.
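As an illustration of the randomised-password advice, the following Python sketch generates a password with no dictionary words using the standard library's `secrets` module. The function name, the character set and the default length of 16 are assumptions for the example, not recommendations from the researchers.

```python
import secrets
import string

# Assumed character set: letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length=16):
    """Pick each character independently using a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_password())
```

In practice a password manager does the same job and saves you typing the password at all, which also keeps it away from nearby microphones.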