Date: Wed, 4 Mar 1998 10:29:37 -0600
From: escargo@anubis.network.com (David S. Cargo)
Message-Id: <199803041629.KAA01198@brutus.network.com>
To: dahlen@cs.umn.edu
Subject: Belated Homework 3

David S. Cargo
CSci 5110 Winter 1998
Homework 3 (Test Plan)

Because I overlooked the individual part of the test plan assignment, I
failed to turn it in on the required schedule. (I believe I was working
on the prototype at the time.) Below I have included the test plan
assignment with the following twist: since this class is about process
and learning from experience, I'm including a REVISED test plan that
takes into account the experiences I had conducting the user testing.
While I could just pretend that the testing hadn't happened yet, I
think I can benefit more from developing an improved test plan than an
uninformed one. (If I'm wrong here, please let me know.) Just as a user
test can form the basis of an improved interface, a user test can form
the basis of an improved user test. Bearing that assumption in mind....

o What background information you will provide to the user about the
tasks or computer system, if any.

Especially given the lack of understanding shown by our testers, I
think that rather than a verbal description of the system, a brochure
like the one we intend to have for the open house would be appropriate.
This brochure would provide the motivation for the application
(publishing information of interest to more than one person on the web,
for people like support staff, class instructors, etc.), describe some
theory of operation (including the simple and the complex cases), and
show some sample output (including URLs where users can see for
themselves what the output looks like). While testers might not read
the brochure in detail, providing even a glimpse of the theory of
operation would vastly improve the testers' understanding of the
application.

o What tasks you will ask the user to complete, along with anything
important about the instructions.

In our actual tests, we had five tasks of increasing complexity. We had
each tester do tasks 1 and 2; two testers did task 3; two testers did
task 4; one tester did task 5.

Task 1 asks the user to convert a default mailbox into one HTML archive
in the default location.

Task 2 asks the user to convert a user-specified mailbox into one HTML
archive in the default location.

Task 3 asks the user to convert a user-specified mailbox into one HTML
archive in a specified location.

Task 4 asks the user to convert three user-specified mailboxes into one
HTML archive in the default location.

Task 5 asks the user to convert a specified mailbox into an HTML archive
in the default location that does not contain any messages with the
subject "fun".

These tasks are graduated in difficulty; tasks 1 and 2 were always first.
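To make the tasks concrete, here is a rough sketch of the kind of
conversion they exercise. This is not the prototype's actual code; it
is a hypothetical Python script showing roughly what task 5 asks for
(convert a mailbox into an HTML archive, skipping messages with the
subject "fun"), with made-up values for the mailbox path and the
output directory.

    # Hypothetical sketch only; the names and paths are invented examples.
    import html
    import mailbox
    import os

    MAILBOX_PATH = "INBOX"          # stands in for the default mailbox
    OUTPUT_DIR = "html-archive"     # stands in for the default location
    EXCLUDED_SUBJECT = "fun"        # task 5's excluded subject

    os.makedirs(OUTPUT_DIR, exist_ok=True)
    for i, msg in enumerate(mailbox.mbox(MAILBOX_PATH)):
        subject = msg.get("Subject", "(no subject)")
        if subject.strip().lower() == EXCLUDED_SUBJECT:
            continue                # task 5: drop messages on this subject
        body = msg.get_payload(decode=True) or b""
        page = "<html><body><h1>%s</h1><pre>%s</pre></body></html>" % (
            html.escape(subject), html.escape(body.decode("latin-1")))
        with open(os.path.join(OUTPUT_DIR, "msg%04d.html" % i), "w") as f:
            f.write(page)

Tasks 1 through 4 exercise the same kind of conversion with different
mailboxes and output locations and no subject filter; task 4 would
simply combine the messages from three mailboxes into one archive.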

What we discovered as we ran our tests was that we needed to give our
testers a better description of our application. One comment we
received often (and which we will address for the open house) is that
our description of the application did not give the testers enough of a
conceptual model of what we were doing to set a context for the tasks
that they were performing. Therefore, given that we would be giving a
brochure describing our application to the testers, the tasks would be
set in the context of exercising the advertised features of the
application, and not just as freestanding tasks.

Instructions would also include assumptions about default directories,
default mailboxes, etc., so that testers would know what the defaults
are and why. Instructions would be read to testers, and then they would
get a copy.

o What role you (or your group members) will take during the user test
(e.g., running a paper prototype)

Because of my familiarity with the underlying application, I would
observe the testers to see where there are program bugs (and what they
are), navigation failures, missing feedback, and inconsistencies; that
is, I would act as a recorder. If a tester falls off the prototype, I
would step in and specify what the expected behavior would have been so
that testing could continue.

o At what points, if any, will you rescue the user if s/he is stuck?
What help will you offer?

I think that rescues would be important when the tester doesn't know
how to proceed with the task. In the case of our particular prototype,
I would point the tester to where there was information that wasn't
noticed or remembered. If that wasn't sufficient for the tester to
continue, I would thank the tester, apologize for our failure, and
terminate the test.

o What do you plan to record while observing the user.

I would plan to record every bug revealed by the testing, everywhere
the tester got confused or didn't know how to proceed with the task,
everywhere the tester had a question, everywhere the tester stated a
mistaken conclusion, everywhere the tester needed feedback that wasn't
available, and everywhere the tester made a mistake. I would also
record everywhere I had to intervene when the prototype couldn't
perform the actual functions.

I would create paper copies of our UI's screens for the observers to
make notes on. This would let us avoid having to record where on the
screen the tester did "interesting" actions; we could note it directly
on the model of the screen.

This would also allow easier correlation of the results between
different notetakers, since it would enforce a common format. This would
improve analysis.

o Anything else you consider important.

Based on our own performance, I would say that we needed to have
asked some questions of the testers to see how well they fit the
profiles of our anticipated users. Similarly, after the tests, I would
ask for immediate feedback on our application and the testing
experience. I would do this for two reasons. First, the feedback would
be valuable to us as developers. Second, having the testers verbalize
their feedback immediately would probably help them remember it better
when it was time for them to write their own evaluations of our
application. (This latter point might not be as important in real
user tests unless there were follow-up interviews.)

dsc