CMU-CS-20-134
Computer Science Department
School of Computer Science, Carnegie Mellon University




This Is Your Behavioral Keystroke Biometric on Rubbish Data

Roy Maxion, Vrishab Commuri

September 2020



Keywords: Keystroke dynamics, behavioral biometrics, data quality

Keystroke dynamics is a behavioral biometric, typically used to identify people based on their typing rhythms and to distinguish between legitimate and fraudulent users or behaviors. Two common sources of typing data are controlled laboratory environments and real-world field environments. Because keystroke researchers tend to use lab and field data interchangeably, the implicit conjecture is that any differences between lab and field data are nil or trivial, and that their effects can be ignored at no cost to an automated decision-making procedure that distinguishes legitimate from fraudulent behavior. We test this conjecture by conducting a lab-based typing experiment and replicating it under field conditions, each with 100 participants. The lab environment used a single hardware/software platform and keyboard with high-resolution keystroke timing, whereas the field environment relied on whatever hardware and keyboard each volunteer participant happened to have. An analysis of both sets of typing data revealed that USB keyboards, used in the field, injected artifacts into the data, causing the data to lack fidelity to the actual keystroke signal. These artifacts were observed to change an algorithm's decision by nearly 20 percentage points, wrongly reversing a distinction between legitimacy and fraudulence. This paper chronicles the methods by which these artifacts and their damaging effects were discovered.

31 pages

