UIEtips Article: Cue: A Usability Testing Bake-Off

Jared Spool

April 17th, 2007


What’s the difference between science and art? Is usability testing more of a science that produces predictable, repeatable results? Or is it more of an art form, where the skills and talents of the people involved have tremendous influence on the outcome? These are critically important questions as we try to bring techniques like usability testing into the mainstream.

In 1999, Rolf Molich challenged the usability world with the CUE-2 study. In this landmark experiment, he had 9 separate usability teams each evaluate the same interface: Microsoft’s Hotmail.

If usability testing is a science, we would expect every team to produce essentially the same results, finding the same problems and reporting them in much the same way. At a minimum, as with any scientific process, we could expect the teams to agree on the most serious usability problems and to rate problems on similar scales.

But that isn’t what Rolf found. Instead, each team had its own methods and, surprisingly, found its own set of problems. 75% of the problems reported were found by only a single team, and many of those were very serious problems with the interface.

When everybody does something differently, it means we’re closer to art than to science. And being close to art means that the people involved have more impact than the process or methods chosen.

Is this bad? Not really. However, I do think there is something good about moving our craft to be more scientific. And it’s research like the CUE studies that will help us do just that.

In this week’s UIEtips, we’re re-printing an article I wrote back in 2005 about how Rolf’s work can help us learn about usability testing best practices and hone our own capabilities.

As always, I’m interested in your views on this. How do you learn to do your work better? Have you come up with ways to learn new techniques and tricks? I’m always interested to hear what you’re doing to improve your results. Leave your thoughts and join the discussion below.

Read today’s UIEtips article.

[If you want the opportunity to compare your work to practitioners taking part in Rolf's CUE projects, then you should seriously consider attending his full-day seminar at UI12, Advanced Methods for Usability Testing. Under Rolf's careful direction, you'll walk through all aspects of conducting a usability test, from test scenario creation to reporting, to see what some of the best practitioners have done. You can read more about Rolf's session here: Advanced Methods for Usability Testing]

9 Responses to “UIEtips Article: Cue: A Usability Testing Bake-Off”

  1. Susanne M. Furman Says:

    What you are describing appears to be a heuristic evaluation and not a usability test. Without providing more information about methodology you are leading the reader to believe something that may not be the case. You should know better than that!
    If you are trained in research methodologies – you know better than to do this. You produce repeatable results with good applied scientific methodology – you know – what you learn in graduate school!

  2. Craig Duncan Says:

    I attended a seminar on the CUE studies a few years ago and it really opened my eyes. Rolf adds scientific rigour to this emerging discipline, and I think this is an essential element. The CUE results are pretty terrifying to those of us that work in usability, but I also find them strangely encouraging. There is still so much to explore…

  3. Douglas Potts Says:

    Rolf’s work only shows that usability testing is a badly performed science…poor experimental design, coupled with poorly trained experimenters. Usability testing should only be considered a science.

  4. Ronnie Battista Says:

    I too subscribe to the ‘more art than science’ argument when it comes to crafting usability studies. Drawing on a variety of sources, from customization based on direct testing experience to applying rigorous/unwavering test methodologies to adopting the ‘best of’ from what you’ve seen the competition doing, there’s no ‘right’ way to do it.

    I think what was missing from Rolf’s landmark study was a follow-up to see what resulted from any implemented changes (i.e. could it be shown that one company’s high-priority problem proved to be a big issue, and another’s proved less impactful?). And is there any way to do this based on method alone, or does the skillset of the test team members play a more significant role than the methodology?

  5. Zephyr Says:

    I attended one of Rolf’s classes several years ago and found it very insightful. To me it really dispelled the myth that ‘anyone can do a usability test’ and produce the same results. But I also came away with practical tips on how to craft a better test, one that will produce more reliable and useful results. Which is good.

  6. Alexander Muir Says:

    There are several interesting issues in this question – Jared you’ve put your finger on something we could surely debate vigorously!

    So I’ll try to add something a little different.

    Usability is inherently multidisciplinary: the art and the science are two ends of the same stick. We need qual and quant methods. Quant methods are generally taught in a scientific setting, and ‘look’ very scientific and rigorous. But when we seek to understand a participant’s mental model, which even they are not fully conscious of, we have to use empathy, intuition, and inference.

    So, my experience – having practiced and taught usability in universities and corporations – is that we improve our skills by enjoying both the artistic and scientific side of usability. Letting both rip! Find the methods that turn us on and come easily to us, and then try to develop the ones that don’t.

    In my case, I love the loosely scripted interview, and relying heavily on my sense of empathy. I can get that to work for me quite easily. I have to work harder on putting together testing scripts that will be reusable over a whole series of tests. But both types of activity support each other.

  7. Lachlan Scott Says:

    I’m largely in agreement with Susanne M. Furman here that Jared’s editorial isn’t entirely clear or fair. I appreciate what he’s trying to get at, which I think is that since science requires repeatability, if different results are produced by different groups, surely the scientific process or its resulting hypothesis should be in question. But I suggest this would only be true if the same hypotheses were being tested. Different scientists using different experiments to test different hypotheses may well produce different results, but it would still be science.

    Likewise, individual skill and talent is not primarily what differentiates good artists and their art, I don’t believe. They must all be technical masters, and understand what they can and cannot do with their medium – that is craft. It is each artist’s unique preferences in what they want to *produce* with their craft skills that differentiates them, and elevates one above another, I believe.

    If Jared’s asking whether usability testing is art or science, I’m inclined to say that yes, it is science – but it’s a soft science, just like literary or music criticism. The problem for us as people interested in usability is to know into which canon it fits so that we can know which tools we are using to work with it. Donald Norman and Alan Kay (to take two names off the top of my head) are scientists, and their work is scientific. I say that our work is science too – the science of good design.

  8. Susanne Furman Says:

    Some additional thoughts. Part of the problem is that there are so many individuals out there calling themselves “usability specialist” or “usability engineer” etc. Some of them have academic training (e.g., human factors or applied research degrees) and some don’t. Those who don’t often don’t understand how to do research or how to apply experimental rigor.

    You can get whatever results that you want from usability testing – simply by the scenarios you write. And you can turn around and say – wow – we had these fabulous results and our interface rocks. And maybe it doesn’t.

    Let me make an analogy – I have done electrical work in my house and have managed not to electrocute myself. However, I wouldn’t go out and call myself an electrician by any means. I am not discounting those individuals who are out there trying to make a difference. But some of these issues are due to exactly this.

  9. Jay Cross Says:

    Jared, science is not necessarily reliable, i.e. the same circumstances always producing the same results. What about the interplay of complex systems? Emergence? Half-baked scientific methods?

    jay
