Text Transcript of SpoolCast: Follow-up to Conducting Usability Tests in the Wild with Jared Spool and Dana Chisnell. Original audio available here:
[music]
Brian Christiansen: Welcome to the SpoolCast. I'm Brian Christiansen. This week we have a special show, answering audience questions and follow-up from our most recent virtual seminar, "The Quick, the Cheap, and the Insightful: Conducting Usability Tests in the Wild." During the seminar we received more questions than we had time to cover. So Jared Spool sat down with Dana Chisnell to cover many of the remaining questions. If you were unable to attend this seminar, don't worry. There's plenty in here for you, too. And now, on with the show.
[music]
Jared Spool: Hello, everyone. We are here at the studios of User Interface Engineering. And I have with me Dana Chisnell, who recently did a virtual seminar with us on quick and dirty usability testing in the wild. And we are here to answer some questions that our virtual seminar attendees had that we didn't get a chance to answer during the seminar as completely as we would have liked. So we are going to do that. Hello, Dana.
Dana Chisnell: Hi, Jared.
Jared: Hello. So why don't you quickly just give a... It was a nice 90-minute virtual seminar, but why don't you, in two minutes, say everything you said. [laughter]
Dana: Really, really fast.
Jared: Really, really fast.
Dana: The gist was that classic usability testing comes with a very serious step-by-step methodology, but you can do a much more minimalist usability test if you are in the formative, exploratory stages of your design and get really great results.
Jared: OK, so we had a bunch of questions that people asked. And we're just going to sort of march through some of them here. Wendy asked, basically, what the difference was, and whether there is a middle road between a full-blown classical test and this quick and dirty method that you talked about.
Dana: There's a really wide middle road between them.
When I think of the classical usability test, I think of something that's summative or validating about a design that you already know a lot about. So you have benchmarks and you are going to do a very rigorous test, end to end, with participants who are exactly like your real users, and you're probably doing it in a pretty controlled situation, most likely a lab. On the other end of the scale are the kinds of tests that I'm thinking of in the wild. Much less formal. Usually in the context where the user is using it or in some other convenient place where users are, so that you can get very quick insights from them. And you've done really only minimal planning and designing, basically, but the team has got together and said: let's put on the test and find out what we can find out really quickly.
Jared: So the middle ground would be just adding variables back in to the quick and dirty, or removing them from the more formal laboratory-based testing.
Dana: Right. Spending more time together on the team discussing what you want to find out and planning that out sort of step by step, looking at the design of the test to construct task scenarios where you definitely will be able to collect data about particular behaviors and issues that you want to explore.
Jared: OK. So, people had questions about every stage of the usability test process. So I thought we would walk through them. And one of the early stages that one has to deal with is actually getting participants. Rene asked about the numbers of participants in the study, which of course is always a question people want to know. How many do you need to make it a reasonable study, to make sure you're not making mistakes? And she commented on the fact that during the seminar you suggested that four or five would work, but she had read studies that said that you really need to have eight or more to potentially make these things more rigorous. How many people do you think?
Dana: There's a lot of debate about how many participants you should have in a usability test. My short new answer is: the number of participants it takes for you and your team to be confident in the data that you have to make decisions on. So if you only have time and money and resources to do one session and you are pretty confident that that one person reflects your audience, then go for that. But if you have more to defend in your organization and your team, then you'd probably need to do more sessions. My thinking that you could do four or five sessions, individual sessions, comes from two things. One is that I'm expecting that you are going to do more tests. That over time you are going to do four or five or six usability tests of the same design through the course of the development of the design, with four or five people. So you end up with a lot of data that you have iterated the design with. The other part of that is that four or five people are often enough when you are in the early stages of a design to begin to see patterns and repetition in the kinds of problems people have. And that's the main point of doing these usability tests: to find the things that people are frustrated with, what they have issues with, so you can go and remedy those things. There's no point in doing 20 sessions if 18 of those people are just going to go through pure agony and you are going to have to see exactly the same things over and over again. If you feel like you've gotten through four or five or six people and you're seeing the same things, stop. Go change the design and then go do more testing as soon as you can.
Jared: So, one of the things that we've often referred to is that you should be working towards the point of least astonishment.
Dana: Exactly. Yeah.
Jared: The point where the team is no longer surprised by what they are seeing in the study.
And that could happen with just a few users or it may take a lot, depending on how broad the users are and the tasks are and the nature of the work. That makes sense to me. Glory asked about the criteria that you use when you're screening your participants.
Dana: The main criterion to use to screen participants is really about the behavior that you want them to do. If you sit down and talk with your team about who the people are who are going to be using this, and what they are doing with it, and if you can come up with a one- or two-sentence description of what that person would be like, that's your selection criteria. With that you can go out and do a wild test. That visualization and articulation of who that person is will give the team a lot of ideas about where to go to find those participants, and what sources there are. It may even be who you know in your company down the hall, or in your family or friend network, who might meet that description. After that, if you want to get more formal, then you might want to add more in the way of requirements or classifiers that have to do with frequency, or I don't know. It depends on your situation, but probably not in the way of demographics unless you have a demographically based product.
Jared: OK. So, we recruit our participants, we go out into the wild, and we find our participants?
Dana: Right. Oh, and we should talk about what wild means in this case. We talked about it a little bit.
Jared: It's not some deep, dark woods in Central Europe. [laughter]
Dana: It could be if you're testing an orienteering application on a phone.
Jared: Right, gingerbread houses. [laughter]
Dana: Yes, because those bread crumbs just don't work.
Jared: It's like I've been telling people for years. [laughter]
Dana: Right. I was thinking of Hansel and Gretel, but I guess it's the same thing.
In this case when I think of in the wild, it's things like cafes where there's a WiFi hotspot, a food court, user group meetings, trade shows, exhibition halls, and things like that which go along with the conventions. But you could also be in the wild wherever you think that your users might be working. Some of those situations might not be right. It might be the company cafeteria instead, or a local store or some other venue that's not a lab where you can just walk up and start talking to people.
Jared: OK. So, we're there and we've got our design team in tow. Do we let them be part of the observers? Andrea asked if we recommend that designers are part of the observation team.
Dana: This is not about letting them. This is about insisting that the designers be part of the observation team. All the designers, whether internal or external, everybody who thinks of themselves as having a finger in any part of the design, including management, should have some experience observing these sessions. That's another parameter on how many participants to recruit. You may end up with enough data, but you may not end up with enough sessions for all the people who are on the larger team, who need to experience this sort of intellectual property of getting all that data. Yes, the observers should include the designers. You don't want to have 20 people there observing with one participant. That can be intimidating. But if you have one or two observers with you as the moderator with one participant, it's not too bad. Usually that works out OK. And you can schedule enough sessions so that you have a pair of observers with you rotating in and out. If you have five or six people in your team, then you can do three or four sessions and have different observers in each one. If you have more sessions and you still only have five or six people in your team, then your observers get to see more sessions that way. But schedule them so that everybody has an opportunity to sign up.
You know who is going to be with you, and you can give them coaching about how to be a good observer and what you're expecting of them during the session. They should have responsibilities besides just being there.
Jared: We've got a couple of questions about dealing with the pushback that some people have to deal with. It's about going off and talking to four or five users, and watching them do things that you asked them to do. It's not very scientific, although it seems awfully scientific to me. Science is usually about all sorts of structure, and making things do things that they weren't intended to do. But people seem to insist that this is somehow not scientific. How do you talk to that?
Dana: I think we get caught in the bind that usability testing looks like a psych experiment. If you do a formal classical usability test in a lab, it has the guise or framework of appearing to be scientific. But the truth is that in any kind of user research or usability testing, it would be tough to say that this is science. You can say that you're applying rigor by being very structured in how you develop the research questions and the measures of the data that you're collecting. But basically what we're talking about is craft, a way to get enough information to make decisions, as opposed to guesses, about what you're doing in your designs. How do you defend doing a less formal usability test, or some other kind of evaluation, to people who think that this is not scientific? Well, this is the way I've done it with teams and clients that I've worked with. We've had this discussion about how they're making their design decisions now, and whether they're actually observing users using designs or not, and where their other data is coming from. There's something to be said about market research not exactly being a science either.
Jared: That's where I think a lot of this comes from. Market researchers get a lot of pushback for making their work more scientific.
They get into controls and sample sizes. There is a lot of statistical science that goes into market research.
Dana: Well, it's quantitative, that's for sure.
Jared: Right, but there's a lot of experimental theory that market researchers like to pull from to justify the validity of their results. That's primarily because they don't want to end up with another New Coke. They don't want to have done all this research to prove that something is a good idea, only to discover that the public hates it.
Dana: And yet, how they get there is doing focus groups.
Jared: Right, there is that. My favorite is the folks at Procter & Gamble, the Folgers team. They actually used not just focus groups, but focus groups that utilized past life regression.
Dana: Past life regression? [laughter]
Jared: Yes, past life regression. They would hypnotize the members of the focus group, bring them back to a previous life, and ask them questions about how much they enjoyed coffee. They swear that this research is the greatest thing, and it actually produced changes in their marketing for Folgers that created huge new revenue streams that they never had before. I'm just saying.
Dana: Because the past lives went out and bought Folgers?
Jared: Basically that's scientific.
Dana: Yeah, OK. [laughter]
Dana: The thing to focus on with usability testing is that it's about identifying design problems that frustrate users. It's not about generalizing findings to large populations like market research is, where you're looking at preference and opinion. It's really about trying to eliminate frustrations for users. If you see only one person having this issue, it's likely that other people out there in the world are going to have the same issue. So, why not spend a little bit of money with very few people, risking exposure to your design early on, rather than launching a product that hasn't been through usability testing and risking your design on thousands or millions of people at a time?
Jared: Right.
I wonder how much of it has to do with the fact that a usability test is collecting information about behavior, whereas a lot of market research is collecting opinion. Opinion is statistically unreliable. It requires a lot of things to sort of line up. You have to make sure that the question isn't leading, you have to make sure that the person who is answering the question understands the answer. You have to make sure that they've decided to tell you the truth. It's not necessarily that they're outright lying, but they may just want to please you.
Dana: People do a lot of filtering. You don't know everything that is going on.
Jared: But in a behavioral study, where people are behaving, you can see the behavior. It's not reporting the behavior. It's not, "How many times in the last month have you bought milk?" It's actually looking at how many times they've bought milk. In the case of a usability test, it's watching them actually go through the design. If they run into problems in the design, you can see that and you can ask yourself if it's a problem that you think other people are going to run into.
Dana: Exactly. Seeing that direct interaction makes all the difference. To me, that's the defensible part of doing this. In other methods or channels there's not a good way to do that. This actually makes me think of another question that came up about doing remote testing, and how testing in the wild might work with that. I consider remote testing to be in the wild, actually. That's because this is a way for you to reach people who you wouldn't normally be able to reach. By remote testing, I mean making sure that what you are testing is in the hands of the participant who is very far away from you geographically, and being able to observe their behavior and interaction with it through some mediated way, like NetMeeting, GoToMeeting, Live Meeting, or WebEx or something like that.
Jared: UserView is another tool that people like.
Dana: Right, and listening to them on the phone while they do the interaction. Now, they may or may not think aloud. But at least you can see what's going on on the screen.
Jared: And you can talk to them and ask them questions. A lot of market research can't do that. A survey asks you, "Was your trip to the bakery pleasing/somewhat pleasing/not pleasing or displeasing/somewhat displeasing." The person says, "Well, I really like the person behind the counter, but the bread was stale." OK, but is that pleasing or somewhat pleasing? You don't know what that means. You get these ratings back that don't mean a whole lot.
Dana: And here, whether you are physically present with the participant or not, you can see how they're interacting with the design in a very direct way. That is what you want to find out. You can see where they get stuck and where they make wrong turns. You can see where they make mistakes or misinterpret things. Then you can make corrections to the design based on those issues.
Jared: OK, that makes total sense to me. During the seminar, you talked about people recording their studies. You mentioned that you could use a simple webcam to make recordings in the wild without much fuss. A bunch of people asked about the ethics of this, and asking permission. How do you handle the video recording part of this process?
Dana: Well, if you tell them that they are being recorded, sometimes that's enough. In some situations people can say that they don't want you to do that, and you just don't do that. You have to figure out some other way to collect the data. In the story that I told in the virtual seminar, which I'll recap briefly now, a guy told me about not being able to do a usability test, so he set up webcams in his company's lobby where he could then observe people using a PC that was sitting in the lobby. His situation was a little bit different in that he's located in England, where there are cameras everywhere.
Jared: And they have very different rules about...
Dana: ...privacy, personal space, and all that kind of stuff. You're on camera everywhere you go in England. That's true in a lot of places in Europe, actually. There are little signs almost everywhere that say that you're on camera. In that case he didn't feel a responsibility. He didn't feel that it was an issue to tell people that they were being recorded. In the United States there are some places where you can do that. Stores, for example, all have security cameras. If you're feeling at all uncomfortable about recording somebody, then the thing to do is err on the side of telling them. That's A, and B is asking them for permission. Then if your company is tight about these kinds of things, you might want to create some little permission slip that you can get your participants to sign. That shows that you have evidence that they're OK with that. It doesn't have to be a super legal official thing. It can just say "I know I'm being recorded and that's OK."
Jared: Do you recommend, if you're going through a formal recruitment process where you're recruiting people in advance, to actually tell them at the time that you recruit them that you'd like to record this?
Dana: Yeah. Often part of the recruitment process that my company uses is to tell people that the sessions will be recorded and ask them if they have any objections to that. If they do have objections, then sometimes they won't even be picked, unless they're just the perfect participant and that's the only issue. Then we will bring them in and not record them. But yes, we give them a heads up during the recruiting process. And then we even repeat that in the confirmation emails and calls that we send people.
Jared: OK. So, now we've gone out into the wild. We've watched people and collected our notes. Now it's time to figure out what we're going to do with this. A whole bunch of folks asked us questions about debriefing.
You had mentioned during the seminar a technique that you often use involving sticky notes. How does that work?
Dana: You give pads of sticky notes, larger ones to give them a little bit of space, to your observers. You ask them to write one observation per sticky note, and to keep those themselves.
Jared: That observation might be...?
Dana: An observation might be "clicked on the gray sort button without selecting anything to sort".
Jared: OK, something that the user does with the design that was worth noting.
Dana: Right, but that is not filtered through your interpretation of what might have happened, or why that might have happened. It's only what happened. Don't make any inferences or assumptions.
Jared: You've got the designers there at the study, you've given them the Post-It notes and asked them to make these recordings. They're sitting there piling up their little notes as they go. Then what do they do with that?
Dana: Let's say it's at the end of a day of testing. It's better to do this incrementally if you're doing a bunch of sessions, but you might want to wait until the end of the test too. That's fine.
Jared: Incrementally means after each test, or maybe at the lunch break?
Dana: Right. Sometimes that makes it handleable for everybody. With a couple of clients I've had, it doesn't work to hold debriefings at the end of the day, because I can't keep people there after six or seven o'clock at night. Instead, we schedule a longer lunch break. We have pizza for everybody. They can put this on their calendar, it's just part of their day, another meeting. But we hold these debriefings in the middle of a day and rotate in the next sessions.
Jared: Is pizza the official usability testing debriefing tool?
Dana: No, M&Ms is.
Jared: M&Ms? OK, I see. [laughter]
Dana: Everybody is together and they have their little stickies. Then we go through an exercise where everybody shows what they've got.
There's a whole approach to doing KJ analysis that you've talked about in your sessions. That's a very good way to do it. I won't go into it. Maybe you want to talk about that right now.
Jared: No, but we'll have Brian put a note in the podcast announcement that links to the article describing it.
Dana: OK, that's a good idea.
Jared: That's what we'll do.
Dana: But the point is to get everybody's observations out on the table, find a way that works for you all to sort through them, and get to a point where you have priorities. First of all, it's to record the observations and cluster them in some way that makes sense to the clique. You can use that to keep track throughout the session of what the issues are. Then when you get into the end debriefing, you can use the KJ method to refine those to things for which you then want to create design direction. Everybody comes to consensus about what the issues are, makes inferences, and then you get to recommendations and actions.
Jared: OK. Lisa from Landesk asked if you include non-observers in this debriefing process, or are the only people in the room the observers.
Dana: That depends on your culture, I think. I've done it a couple of different ways with different clients. For some clients, it has worked to keep it exclusive to the team. The people who want to be included in the debriefs couldn't somehow make themselves available to come to even one session. And so, they don't get invited to the party. But they might be stakeholders who you really want to be there. Then it's a question of managing the discussion in a way that they're included at the appropriate level, even though they may not have seen any of the sessions. Usually that is self-governing. The team will be so invested and involved that the outsiders get a view into what has happened and contribute if they can. They're not the hippo in the room, the highest-paid...
Jared: Hippo: Highest-Paid Person's Opinion.
Dana: Right, overriding all the work that you've done so far. To some extent it's what works for your team and how you think you can manage those outside people while they're there to begin with.
Jared: OK. That's a bunch of the questions that we had, left over from the session. Thanks. This has been great.
Dana: Do you have any questions, Jared?
Jared: Do I have questions? I am without questions. I am completely question-free. [laughter]
Dana: OK. Well, thanks. It was fun.
Jared: Thank you. We'll talk to everybody next time. Thank you for encouraging our behavior. Take care.
[music]
Announcer: Don't forget, as always you can purchase access to this and all of our previous virtual seminars at uie.com/virtual_seminars. That's all for this week. Thanks for listening. Goodbye.