CHI+MED

Stochastic evaluation: randomly simulating users pressing buttons

Key points

  • Many different user interfaces for entering numbers - as found on devices from office products to medical devices - could be made much safer without affecting their normal use.
  • Our technique, called stochastic evaluation, which involves computer simulation of users pressing buttons, is a very fast and effective way of seeing how usable and safe real or proposed designs are.
  • We have identified ways to make common designs much safer, quickly identifying bugs or defective design choices. Our approach also allows choices to be compared by putting a value on the size of a problem’s impact on safety.
  • The technique is automatic and works best when there is an objective measure of how significant an error is. For example, ‘drug doses that are out by a factor of ten or more’ is easy to quantify and hence easy to evaluate automatically.

Randomly simulating people
Many of the errors people make have bad consequences because they go unnoticed until it is too late. This is a problem both for those using medical devices and for designers, who may make mistakes in the design and construction of devices. A quick and simple way to find design issues is to run a computer program that simulates people randomly using a design over and over again, to see how many errors occur and how bad they are. One can then redesign to fix the problems found, as well as compare the safety of different designs. Designers can then choose the safest designs and even learn principles for safer system design more generally.

A big advantage of random simulation (more formally called stochastic evaluation) is that it can find design errors that lead to problems nobody anticipated, and put a value on how likely those problems are. It is also very easy to do. All that is needed is a medical device or a prototype to run, a way for a computer to press its buttons automatically at random, and a way of inspecting the result. The method complements empirical testing, the current way of searching for such problems, in which real users use a device and the problems that arise are noted. That approach finds common error patterns, but it is slow and costly and less likely to find obscure problems. Random simulation also complements more advanced techniques such as formal methods, which involve mathematical analysis and require sophisticated skills.
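The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the simulator used in the work: the slip model (omit, repeat or replace a key), the names add_slips, toy_device and error_rate, and the toy device itself are all hypothetical. The only measure taken from the text is the ‘out by ten’ error, and intended values are assumed to be non-zero.

```python
import random
from decimal import Decimal

KEYS = list("0123456789.")

def add_slips(keys, slip_rate, rng):
    """Copy a key sequence, randomly omitting, repeating or replacing keys."""
    out = []
    for key in keys:
        if rng.random() >= slip_rate:
            out.append(key)              # key pressed as intended
            continue
        slip = rng.choice(["omit", "repeat", "wrong"])
        if slip == "repeat":
            out += [key, key]
        elif slip == "wrong":
            out.append(rng.choice(KEYS))
        # "omit": the key is simply never pressed
    return out

def toy_device(keys):
    """Hypothetical device that silently ignores a second decimal point."""
    shown = ""
    for key in keys:
        if key == "." and "." in shown:
            continue
        shown += key
    has_digit = any(c.isdigit() for c in shown)
    return Decimal(shown) if has_digit else Decimal(0)

def out_by_ten(intended, entered):
    """True if entered is ten or more times larger or smaller than intended."""
    if entered == 0:
        return intended != 0
    ratio = entered / intended           # intended is assumed non-zero
    return ratio >= 10 or ratio <= Decimal("0.1")

def error_rate(device, targets, slip_rate, trials, seed=0):
    """Fraction of simulated entries that end up out by ten or more."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        target = rng.choice(targets)
        entered = device(add_slips(list(target), slip_rate, rng))
        bad += out_by_ten(Decimal(target), entered)
    return bad / trials
```

Calling error_rate with two different device functions and the same targets, slip rate and seed gives two numbers that can be compared directly, which is the sense in which the approach puts a value on a design choice.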

Designs that prevent mistakes entering numbers
We have applied random simulation to the designs of interfaces for entering numbers. This has led to our finding a whole series of design problems in existing interfaces that lead to a high probability of people making mistakes.

Silently ignoring key presses
Many devices that require number entry, such as calculators and infusion pumps, just ignore the user pressing the decimal point more than once. Suppose a person keys [0 • • 7 5] and then tries to correct it by deleting a decimal point. If the second decimal point was silently ignored, the deletion removes the only one registered, turning the number into 75 rather than 0•75. Does this really make a difference? We studied a variety of designs with computer simulated user key pressing to see what effect decimal point handling has on the accuracy of numbers. We found that correctly handling decimal points can at least halve the rate of ‘out by ten’ errors, where a person enters a number that is ten or more times larger or smaller than that intended. This can be achieved by registering all keys the user presses (including multiple decimal points) on the display, and by blocking and alerting the user to the entry of numbers that violate the Institute for Safe Medication Practices' international guidelines for formatting numbers. Few medical devices do this, yet it would be easy to fix them, and normal (error free) use would be unaffected.
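The decimal point example can be traced with a small Python sketch. It assumes the correction is made straight after the doubled decimal point, and the function names are hypothetical. The is_valid check is a deliberately minimal stand-in (at most one decimal point, at least one digit) and does not implement the ISMP guidelines.

```python
def press_ignoring(display, key):
    """Design A: a second decimal point is silently ignored."""
    if key == "DEL":
        return display[:-1]
    if key == "." and "." in display:
        return display                   # key press dropped without warning
    return display + key

def press_registering(display, key):
    """Design B: every key press, even a second '.', is shown on the display."""
    if key == "DEL":
        return display[:-1]
    return display + key

def run(press, keys):
    """Feed a key sequence to a design and return the final display text."""
    display = ""
    for key in keys:
        display = press(display, key)
    return display

def is_valid(display):
    """Minimal stand-in check: one decimal point at most, and some digit."""
    return display.count(".") <= 1 and any(c.isdigit() for c in display)

# The user doubles the decimal point, notices, and deletes one.
keys = ["0", ".", ".", "DEL", "7", "5"]
print(run(press_ignoring, keys))         # 075, read as 75
print(run(press_registering, keys))      # 0.75
```

Under design B, a user who does not correct the slip ends with 0..75 on the display, which is_valid rejects, so the device has something definite to block and alert on.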

There is a related problem with numeric keypads that have a key for each digit, as calculators do. Commonly, these devices only have space to display numbers with at most 8 digits, and they often ‘ignore’ digit keys pressed after the display is full. The delete key then behaves unexpectedly. Trying to correct the 9 digit number 123456786 to 123456789 with the sequence of keys [1 2 3 4 5 6 7 8 6 DEL 9] could delete the 8 if the last 6 had just been ignored. This would turn the number into 12345679.
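The full-display example can be reproduced with the following Python sketch. The silent keypad follows the behaviour described above. The alerting keypad is one hypothetical way of handling the overflow, assuming an alert that blocks every key until an ACK key (an invented name) is pressed. Neither function models a particular real device.

```python
MAX_DIGITS = 8                           # display width taken from the example

def silent_keypad(keys):
    """Digit keys pressed once the display is full are silently dropped."""
    display = ""
    for key in keys:
        if key == "DEL":
            display = display[:-1]
        elif len(display) < MAX_DIGITS:
            display += key
    return display

def alerting_keypad(keys):
    """Overflow raises an alert; all keys are blocked until 'ACK' is pressed."""
    display, alerted = "", False
    for key in keys:
        if alerted:
            if key == "ACK":
                alerted = False          # user has acknowledged the warning
            continue                     # every other key is blocked
        if key == "ACK":
            continue                     # nothing to acknowledge
        if key == "DEL":
            display = display[:-1]
        elif len(display) < MAX_DIGITS:
            display += key
        else:
            alerted = True               # display full: refuse the key and warn
    return display, alerted

keys = list("123456786") + ["DEL", "9"]
print(silent_keypad(keys))               # 12345679
print(alerting_keypad(keys))             # ('12345678', True)
```

In the alerting version the DEL and 9 presses have no effect because the warning is still pending, so the user cannot carry on without learning that the ninth digit was refused.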

Overall, the important design lesson here is that devices should always alert users when they have made a mistake, like entering too many digits or multiple decimal points, rather than silently ignoring the flawed key press. Otherwise the person may not notice their mistake, or may not notice that it has been silently corrected and then mistakenly try to correct it.

Moving a cursor over numbers
Some devices that require numbers to be entered use arrow keys to move a cursor over a number and to adjust the digits in it. How should the arrow keys work? For example, pressing UP will increase 1 to 2, 2 to 3, 3 to 4 and so on, but what happens when 9 is increased? Does it go to 0, or does it go to 0 and increase the digit to its left (does 29 “increase” to 20 or to 30)? Perhaps it should stay at 9. Similarly, what happens if you press LEFT when in the leftmost position? What happens when the device gets to the largest number it can display? We studied combinations of design choices drawn from a set of common features like these. We inserted simple, common keying errors at random, such as pressing a key twice or missing a key press. We looked at each design decision separately in terms of whether it was included or not. We found that the safest choices, meaning those least sensitive to simple keying slips and so least likely to lead to a large error, are:

  • when users attempt to increase or decrease numbers beyond the maximum or minimum values allowed on the device they are stopped from doing so,
  • the cursor starts on the leftmost digit, and
  • when moving left or right, the cursor does not jump from one end of the display to the other, but instead the cursor stays where it is.

We found that the safest combination of design choices was for the cursor to start on the left, the digits to act as independent dials, the cursor not to jump from one end of the display to the other and overflow/underflow to be blocked. Several different combinations of design choices gave a design that was far less sensitive to error than the worst cases, though, and more work is needed to study how the features should interact. In particular, the best behaviour for the UP and DOWN buttons when the digit is at 9 or 0, respectively, is very subtle: which choice is best depends on what other choices have been made. Overall, the safety of a design is dependent on the specific combinations of design decisions made, not just individual choices taken in isolation.
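To make these design choices concrete, here is a minimal Python sketch of a cursor-based number entry with three of them exposed as switches. It is an illustration under assumptions rather than the model used in the studies: the class name CursorEntry, the option names, the fixed width of four digits and the restriction to whole numbers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CursorEntry:
    width: int = 4              # number of digit positions (assumed)
    start_left: bool = True     # cursor starts on the leftmost digit
    wrap_cursor: bool = False   # LEFT/RIGHT jump from one end to the other
    carry: bool = False         # False: each digit is an independent dial

    def run(self, keys):
        """Apply LEFT/RIGHT/UP/DOWN key presses and return the final number."""
        digits = [0] * self.width
        pos = 0 if self.start_left else self.width - 1
        for key in keys:
            if key in ("LEFT", "RIGHT"):
                new = pos + (-1 if key == "LEFT" else 1)
                if self.wrap_cursor:
                    pos = new % self.width
                elif 0 <= new < self.width:
                    pos = new            # otherwise the cursor stays put
            elif key in ("UP", "DOWN"):
                step = 1 if key == "UP" else -1
                if self.carry:
                    value = int("".join(map(str, digits)))
                    value += step * 10 ** (self.width - 1 - pos)
                    if 0 <= value < 10 ** self.width:   # block over/underflow
                        digits = [int(c) for c in str(value).zfill(self.width)]
                else:
                    digits[pos] = (digits[pos] + step) % 10   # dial: 9 -> 0
        return int("".join(map(str, digits)))

# One accidental LEFT before UP, with and without cursor wrap-around.
print(CursorEntry().run(["LEFT", "UP"]))                  # 1000
print(CursorEntry(wrap_cursor=True).run(["LEFT", "UP"]))  # 1
```

With wrap-around enabled, a single stray LEFT moves the cursor to the units digit and the same UP press yields 1 instead of 1000. Running each combination of switches against many randomly perturbed key sequences is how such options can be compared for sensitivity to slips.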

We also identified a large number of quirky design bugs (i.e., very obscure behaviour of arrow keys) in widely used infusion pumps: medical devices used to give patients drugs or other fluids. The safest combinations of design choices we identified are much safer in terms of sensitivity to slips than many common medical device designs.

Improving Design
The clearest conclusion from this work is that user interfaces should always block detectable errors rather than ignore them, and should give warnings that the user must acknowledge. Design problems, such as obscure behaviour of arrow keys, need to be eliminated. These problems make it more likely that those using medical devices will make mistakes. Fixing them will reduce the chances of someone being harmed.

Key people
Abigail Cauchi, Paul Curzon, Andy Gimblett, Paolo Masci, Patrick Oladimeji, Harold Thimbleby

A. Cauchi, A. Gimblett, H. Thimbleby, Simulation to Evaluate Alternative Approaches to Blocking Use Errors, Journal of Medical Devices, 6(1):017502, 2012. doi 10.1115/1.4026680

A. Cauchi, P. Oladimeji, G. Niezen, H. Thimbleby, Triangulating Empirical and Analytic Techniques for Improving Number Entry User Interfaces, EICS2014, 6th ACM SIGCHI Symposium on Engineering Interactive Computing Systems, 243–252, 2014. doi 10.1145/2607023.2607025.

A. Cauchi, P. Curzon, A. Gimblett, P. Masci, H. Thimbleby, Safer "5-key" Number Entry User Interfaces using Differential Formal Analysis, Proceedings BCS Conference on Human-Computer Interaction — BCS-HCI, XXVI:29–38, 2012.

A. Gimblett, H. Thimbleby, User Interface Model Discovery: Towards a Generic Approach, Proceedings ACM SIGCHI Symposium on Engineering Interactive Computing Systems — EICS 2010, G. Doherty, J. Nichols, M. D. Harrison eds., 145–154, 2010. doi 10.1145/1822018.1822041.

P. Cairns, H. Thimbleby, Reducing Number Entry Errors: Solving a Widespread, Serious Problem, Journal Royal Society Interface, 7(51):1429–1439, 2010. doi 10.1098/rsif.2010.0112.

Institute for Safe Medication Practices. 2006 List of error-prone abbreviations, symbols and dose designations. See http://www.ismp.org/tools/errorproneabbreviations.pdf.