Introducing Exploratory Testing

If you've been taught that the software testing process must include a complete set of detailed test documentation, or that an explicitly defined set of expected results must accompany each step in a testing plan, you should read this white paper.

By: Yaron Kottler and Bernie Berger

It is a popular belief in the software engineering field that the correct method of testing software requires complete and detailed documentation.  In this approach, someone writes out a detailed step-by-step set of instructions, with a corresponding set of expected results per action, neatly and meticulously laid out in template format.  Then, when the product is ready to be tested, it is a simple matter of following the script.  Someone – sometimes the same person who wrote the test cases, and sometimes someone else – reads the test case, follows the step-by-step instructions, and reads the expected result from the document.  If what the tester observes the software doing at that point corresponds, in the tester’s opinion, to what s/he understands the written “expected result” to mean, then the test has passed.  In this paper, we call this method Scripted Testing, because the tests that are being run have been scripted in advance.

The problem is that in the real world it doesn’t work this way, or at least, it shouldn’t.  Good testing is an intellectual activity. It is something you do with your brain, and it not only involves routine comparison, but also complex thought, as we will demonstrate.

We see a number of problems with the Scripted Testing approach:

  • Scripted testing is not likely to find bugs.  If test cases are based solely on some set of requirements – the same requirements which developers fulfill when they code – and the developers are not malicious or cluelessly incompetent, then this testing approach isn’t particularly useful if your goal is to find bugs.
  • Scripted testing assumes that the product is locked down, and that the customers are not allowed to change their minds once the project begins. Many hours’ worth of test case documentation can instantly become obsolete when the customer gets new information and chooses a better functional solution, or if it turns out that the development project is going down the wrong path, for whatever reason.
  • Most people find the activity of scripted testing to be boring, which is a problem unto itself, but it also leads to high turnover in your QA department: testers with any ambition whatsoever will run for the exit as soon as a more intellectually stimulating position opens up, leaving you, in the long term, with a severely unmotivated group responsible for testing your software.
  • Lastly, the Scripted Testing approach complements the offshore outsourcing model, which is proving to be a failure in terms of long-term cost savings and product quality.

“Playing Around”

How many times have you or your testers found a bug by accident, or by “just playing around”?  In our experience, many bugs are found this way, and we expect that you have had similar experiences. If this is how we find bugs, then “playing around” should be the primary way that we test.  But how can this be true?  How unscientific!  Just playing around?  Is this how you’re going to explain your testing strategy to management? To your clients? It sounds so unmeasurable. How can you manage such an amorphous methodology?

Wouldn’t it be great if there were some sort of format – a framework or a set of guidelines – that we could use, something with a common vocabulary so that we could communicate and mean the same things, something that we can teach, something that we can measure, and still tap into that mysterious force of “just playing around”?

Introducing Exploratory Testing

There is such a framework, called Exploratory Testing. You may have already heard of such a thing, or maybe not. In any case, we find that although the phrase “Exploratory Testing” has become more popular lately, its principles are either not well known in the software engineering field, or worse, widely misunderstood. We have seen some literature calling Exploratory Testing “ad-hoc” testing, or “monkey testing”. We find such definitions inaccurate at best, and at worst, offensive (…to monkeys).

The rest of this white paper will describe what Exploratory Testing is, and how it is different from ad-hoc testing. We will describe this approach to testing and explain some rules of thumb that you or your testers can use to more formally structure your testing when “just playing around”.  Additionally, we will describe different situations where Exploratory Testing can take place, as well as tips on implementing exploratory testing.  Finally, we will give a broad overview of a specific system that is tailored to manage software testing in an exploratory way.

Historical Background

Exploratory testing is hardly new. The term was coined by the software testing guru Cem Kaner in 1983, and in his classic bestselling book, Testing Computer Software (1988), he writes, “No matter how many test cases of how many types you’ve created, you will run out of formally planned tests. You can keep testing.  Run new tests as you think of them, without spending much time preparing or explaining the tests. Trust your instincts.”

Testing by instinct was a good starting point, but a lot of work has been done on the topic since those words were published.  Instead of simply dismissing the notion of tester intuition and instinct as unscientific, immeasurable, and therefore unworthy, the exploratory testing approach has been continuously elaborated upon for the last 25+ years, and, more recently, Kaner has updated his original description:  “Exploratory testing is simultaneous learning, test design, and test execution, with an emphasis on learning.”

The key idea is that each test is an experiment that provides information about the software, and with the information that you learn from each test, you are better able to design the next experiment.  Each test is better informed than the preceding one, because you are constantly learning about the software as you are testing it.  With this approach, test design is done in the moment of testing, by the tester him/herself. The tester, not a test case document or script, is in control of the testing.

The key difference between exploratory testing and “ad-hoc” testing is the aspect of learning. Anyone can sit at a computer and bang randomly on the keyboard. But it takes a skilled tester, someone who is a competent critical thinker with advanced analytical skills and a creative imagination, to come up with a test, observe the results, decipher them, collect and consolidate data about the tests, formulate a hypothesis about the behavior of the software based on the data already collected, and think of new tests, on the spot, to check if the hypothesis is accurate. This can get quite complex, as new experiments often give unexpected and sometimes temporarily unexplainable results.  A skilled tester with good exploration skills can get to the bottom of a complex testing issue by continuously refining his experiments and disproving his own hypotheses, all the while learning about the product. Good software testing is a challenging intellectual activity, and at the same time is a very scientific and systematic experience.

If you’ve been taught to believe that tests must be planned in detail and written down in advance, you may be feeling uncomfortable as you read this. That’s expected, because the exploratory approach is the polar opposite of pure scripted testing, as we will soon see. On the other hand, you may be feeling a sense of excitement because you may be already doing exploratory testing on your own, but have been too scared to admit it because your management requires you to fill out test case templates. In either case, we urge you to read on.

Inattentional Blindness

Why is scripted testing so bad?

Well, it is not always bad – it depends on what your testing objective is.  For example, if you need to hire a consultant to certify that your product is compliant with a certain regulatory standard, and all that needs to be done to achieve product certification is provide evidence that your application has passed a set of industry-standard test cases, then your consultant should definitely be following the standardized test documentation. This is scripted testing.  On the other hand, if your job is to find important bugs quickly, then rigidly following a pre-written test plan without the flexibility to try out new ideas as they come to you is a sure way to miss your objective.

Will you really find fewer bugs by following a scripted test?  Why should this be so? This is certainly a valid question, which has an equally valid answer: The process of following a set of instructions actually trains you to focus just on what the instructions tell you to do, and not to look at other areas where interesting things, like software bugs, may be happening. The scientific name for this phenomenon is “inattentional blindness”, and it is a serious subject studied in behavioral psychology. By focusing your attention on one aspect of the situation, you are not focusing on other parts.

Many people are skeptical about this claim when they first hear it, so it pays to demonstrate it with an example. Kaner demonstrates inattentional blindness with an exercise similar to the one that we use here:

Please get a sheet of paper and a pen or pencil now, and really work on the example that follows. You will not understand the point of the next section if you just read it – you really have to play along.  Do not continue reading until you’re ready.  This exercise will take about 10 minutes.

Exercise:

Imagine that you have a huge barrel filled with thousands of coffee beans.  You also have three containers, each a different size: Large, Medium and Small. If you scooped coffee beans into each of the containers until it was full, the large container would hold 1200 coffee beans, the medium container would hold 700 coffee beans, and the small one would hold 200. From the barrel of coffee beans, using only the three containers, your job is to produce a pile of 1700 coffee beans. How would you use the containers to produce the desired number of coffee beans? Please do not continue reading until you’ve written down your answer on a sheet of paper.

 Now we repeat the scenario with different numbers.  The large container can hold 320 coffee beans, the medium holds 190, and the small holds 80.  How can you get a pile of exactly 430 coffee beans? Do not continue reading until you’ve answered the question.

 Next, the large container holds 150, the medium holds 110 and the small holds 30. Produce a pile of 380 coffee beans. Again, don’t continue until you’re done.

 Once again, the large container holds exactly 1000 coffee beans, the medium holds 50 and the small container holds only 10 coffee beans. We want 730 coffee beans. Don’t continue until you have your answer.

 One more time, the large container holds 450 coffee beans, the medium holds 230 and the small holds 30. We want 650 coffee beans when you’re done.  Like before, don’t continue until you’re ready.

 And one last time; the large container holds 380, the medium holds 80 and the small holds 40.  Please produce a pile of 420 coffee beans, and don’t continue until you’ve written down your answer.

 Thanks for playing along.  If you’ve done all the problems in this example, you’ve probably figured out that the trick is to use the smaller containers to scoop out the excess coffee beans from the larger containers.  Let’s take a close look at the last example.  Most people answer that they fill the large container with 380, fill the medium container with 80, and then use the empty small container to remove 40 from either of the two larger containers.  380+80-40=420.  This certainly gives the correct answer… but wouldn’t it be much simpler if you just filled the large and small containers?  380+40 is an easier way to get to 420 than 380+80-40.
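The arithmetic of the last problem can also be checked mechanically. The following is only a sketch under our own assumptions: the function name simplest_recipe and the idea of counting "scoops" are ours, a positive count means a full container is added to the pile, and a negative count means a full container is scooped back out. It searches small combinations and reports the one that needs the fewest scoops.

```python
from itertools import product


def simplest_recipe(capacities, target, max_uses=3):
    """Return scoop counts (one per container) that reach the target
    with the fewest total scoops, or None if no combination is found.

    A positive count adds that many full containers to the pile;
    a negative count scoops that many full containers back out.
    """
    best = None
    counts = range(-max_uses, max_uses + 1)
    for recipe in product(counts, repeat=len(capacities)):
        total = sum(n * c for n, c in zip(recipe, capacities))
        cost = sum(abs(n) for n in recipe)
        if total == target and (best is None or cost < best[0]):
            best = (cost, recipe)
    return None if best is None else best[1]


# Last problem of the exercise: large, medium, small containers.
last_problem = (380, 80, 40)

# The habitual answer (large + medium - small) does reach 420 ...
habitual = (1, 1, -1)
assert sum(n * c for n, c in zip(habitual, last_problem)) == 420

# ... but the search finds the two-scoop answer: large + small.
print(simplest_recipe(last_problem, 420))  # (1, 0, 1)
```

The search has no habits to fall back on, so it is not drawn toward the three-scoop pattern that the earlier problems rehearse.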

Why is it that most people don’t see the easy answer in the last problem?  After all, it’s right there in front of their eyes.  If you’ve missed the easy solution, why do you think you missed it? Don’t feel bad if this happened to you – it wasn’t your fault, and here’s why:

The first five problems have trained you to think in a certain pattern, and when you started the last problem you were simply continuing in that pattern. That’s the way the mind operates. Your mind has set itself up to think in a certain way with each example, one after the other. By the time you got to the last example, you were thinking within a certain model, and were not able to “see” beyond the boundaries of that model.

 

This is precisely what happens inside a tester’s mind when he is following the step-by-step actions in a scripted test. It is likely that the tester will miss interesting things happening right in front of his eyes after following a series of action steps, because the mental framework in which he is working, caused by the sequence of scripted actions, will blind him to other events that occur in the software. Just like the last example in the coffee bean exercise when you couldn’t see the easy answer, the tester will not likely see the bugs (even if they are obvious) because he’s concentrating on following the steps in the test script. That’s why this phenomenon is called inattentional blindness – you are blind to one aspect of the situation because you are not paying attention to it and because you are paying attention to a different aspect. That is the reason why following a purely scripted test is an ineffective approach to use if the testing mission is to find bugs.

An approach, not a technique

Another key distinction about Exploratory Testing is that it is not a technique; it’s an approach. What’s the difference? A technique is a specific method that is used to accomplish a specific goal.  For example, making a checklist is a technique for remembering a group of items. Load testing is a technique for checking the system when there are a lot of inputs within a given amount of time. Bug reporting is a technique used to formalize the parts of your test results that need further development work.

By contrast, an approach is a more general concept. It is the overall manner in which you act. It is your game-plan, your strategic process.

The point is, Exploratory Testing is not a specific testing method used to accomplish a specific quality goal.  Instead, it is the overall manner in which testing occurs. In other words, manual testing is not Exploratory Testing, but you can do manual testing in an exploratory way, just as you can do manual testing in a non-exploratory way. Manual testing, boundary testing, functional testing, automated testing, integration testing, system testing, load and performance testing… all of these are test techniques.  And all test techniques can be done in either an exploratory or a non-exploratory way. That’s the difference between a technique and an approach. You could say that your approach is the way in which you execute your techniques. So, all testing could be (but doesn’t have to be) exploratory.

The Continuum of Exploration

At this point, we should clear up another potential misconception. So far, you might be under the impression that testing is either purely scripted, or purely exploratory, but this is not the case. Earlier, when we said “you can do manual testing in an exploratory way, just as you can do manual testing in a non-exploratory way”, we did not mean that these approaches are mutually exclusive – that it is one or the other. No, that’s not how it works. There are degrees of exploration that exist on a continuum. Let’s elaborate on this for a moment.

Imagine a sliding scale, where one end represents pure scripted testing and the other represents pure exploration. Pure scripted testing is as we described before: the testers are human robots, following orders without thinking, simply executing the test and marking down the result.  On the other end of the scale is pure exploration. This is closer to the ad-hoc definition (though it is not ad-hoc, because of all the learning that’s happening): an unbounded space where the testers’ creative curiosity is free to roam without any consequence. Exploratory Testing (ET) exists on this sliding scale. Sometimes ET is closer to the scripted testing side, and sometimes it is closer to the pure exploration side, as we will now explain.

ET with emphasis on exploration

Sometimes when testing, you have absolutely no idea what’s going on with the software. It is as if you’ve been dropped by parachute at night into the middle of a wild jungle, and you have to find your way back home. You have no idea where you are; can’t see very well; don’t recognize the landscape and don’t know who or what you will encounter. You feel confused and maybe even a little scared.

A new software product can sometimes resemble the wild jungle. Maybe you need to test a new feature in the application, and this is the first time you’re seeing it. In this situation you are doing high-learning exploration, where your tests will likely have a broad focus at first, until you learn enough about the product – by testing – to think of and run deeper tests. To take it a step further, maybe you’ve never even heard of this application before, or even worse, this is a new class of product and you’ve never seen an application that resembles it, so you don’t have a mental model with which you can make a comparison. You might not even know what the software is supposed to do, or how the basic features are supposed to work. In this situation, the concepts are so new, and you require so much pre-knowledge, that you just can’t test anything at all because you don’t know where or how to begin. This is also a part of Exploratory Testing.  This is not the point where you give up. Instead, you decide to use your valuable time more wisely and, temporarily, learn about the software in other ways. If you don’t know what you’re doing, simply stop wasting your time unproductively pressing keys and clicking buttons, and find someone to help you. Note, this is not a defeat of exploration; it is an expected situation that is built into the systematic approach of exploration.

Getting additional information

When you need to get additional information, actually find someone – a person – to help you. Don’t let them put you off by directing you to read some system documentation. While reading system documentation might be helpful in some situations, nothing beats a face-to-face working meeting with the subject matter expert (SME). Sitting down with the developer, business analyst, or customer for 30 minutes or so is a much more valuable way to learn the basic ideas, or whatever it is that you need to find out.

Here’s a tip: when you interview the subject matter expert, bring a digital voice recorder and prepare a list of questions. Ask the SME to elaborate, question by question. Record the entire interview. The SME will probably be using language that you are not used to, and will use buzzwords or acronyms that you may not be familiar with. Don’t worry, that’s why you have the recorder. When the interview is over go back to your desk and listen to the recording and take notes. Pause the recording between ideas to let the information sink in; give yourself a few minutes of silence to think through the ideas for yourself, before you go on to the next idea.  While you’re actively listening to the SME’s recording, make a list of items that you still don’t understand – like those buzzwords, acronyms and concepts. Afterward, you can go back and follow up, asking for clarification on specific points. This time, you will be asking much more intelligent questions about the product, and should be ready to continue exploration. Soon enough, you will become an SME on the product yourself.

When the emphasis is on exploration, you can allow yourself a little more leeway to try out and learn about the more fundamental aspects of the application. The first exploratory session on a new application will more closely resemble a tour or demo of the feature set, rather than running detailed test scenarios. As you get more familiar with the application, you will think of more interesting tests to run each time.

ET with emphasis on scripting

On the other end of the spectrum is a situation where you already have knowledge of the software; maybe you’ve been working on it for a while, and you may even have some scripts available. People are usually shocked to hear that you can do exploratory testing if you already have scripts, especially after learning about the dangers of inattentional blindness. There is no contradiction! Purely scripted testing is the approach we warn about. This is the approach in which the tester does no independent thinking whatsoever, and it can lead to ineffective testing.  But as you can probably imagine by now, there are varying degrees of scriptedness.

You can follow a test script more loosely. If you use the script as a list of test ideas or system features that you want to try out, the script becomes more of a guide to your exploration rather than a list of commandments. In this case the testing decisions are up to you, the tester, and not the document. For example, consider using a checklist to jog your memory as you test. Checklists can contain different types of very helpful items for the tester such as:

  • System features which need to be tested independently
  • Combinations of features that might interact negatively
  • Data types and combinations of data types
  • Different system configurations
  • GUI windows or screens
  • A list of drill-down screens or boxes not accessible from the main window
  • A list of useful warning and error messages
  • A list of different testing types which need to be covered

The main point is that exploratory testing does not exclude documentation – as long as the documentation is valuable. Checklists are examples of very valuable documentation for testers. They can act as a reminder of items to test without being overly prescriptive about how or what to look for.

Implementation – How does it work?

Let’s talk a little about the dynamics of Exploratory Testing. What is the bottom line? How does it actually work, in practice?  We will introduce four interrelated concepts to help explain:

  • Test Mission
  • Test Charters
  • Test Sessions
  • Test Heuristics

Test Mission

The testing mission is the underlying motivation for your testing. To know what your test mission is, you need to provide a clear, articulated, and above all, bluntly honest answer to the question, “Why am I testing this?” You may need to drill down a few levels by asking “why” three or four times in order to get to the real underlying mission. Without a doubt, the most important prerequisite for successful testing is for all the testers on the team to not only know, but understand and appreciate the test mission.

Test Charters

With the exploratory testing approach, the test mission is broken down into actionable units called test charters.  A test charter is the general idea of a particular software experiment. It is the theory which you’re trying to prove by testing, and is a general guide for your exploration. Good test charters are moderately specific. For example, “Check that it works” is much too general to be a useful test charter, but “check the gizmo feature under such-and-such conditions when the thing-a-ma-bob is installed and the doo-hickey is set to 11” is probably too specific for a test charter.  All your test charters for your testing mission should be about equivalent in size.

Test charters might be feature-driven, component-driven, or test-driven. For example, a feature-driven test charter might be to check a system’s login feature under different user permissions, or to test the “checkout” function in an online shopping cart.  A component-driven charter might be to check the accuracy of the system’s computation engine, or check the GUI presentation layer for ease of use and accessibility. Other examples of component-driven charters could be “Check every error message”, or “Check what happens when dataflow between components becomes interrupted”. Sometimes, test charters can be driven by the tests themselves, such as, “Check the test coverage of the automated test suite”, or “Try to reproduce the bugs marked as irreproducible in the bug database”. The level of generality or detail in a test charter corresponds to how long the testing takes.

Test Sessions

Testing occurs in special time-boxes called test sessions.  A test session is a period of uninterrupted time where exploration occurs, usually 60-120 minutes long.  Sometimes test sessions are done individually, where the tester sits at the computer and becomes engaged with the software, exploring the test charter. Other times, exploration is done in pairs, where one tester is sitting at the keyboard and explaining the test ideas and hypotheses which emerge, while another sits alongside, taking notes and suggesting additional ideas along the way. Paired exploratory testing has proven to be quite a valuable approach.

The goal is to work on one test charter per session. What often happens, as is the case with exploration, is that as testing occurs, it becomes evident that additional test charters are necessary. This is a classic example of the exploration feedback loop with the emphasis on learning. The testers have learned about a new area of the software which needs to be tested in a certain way and hasn’t been thought of before. This is one of the biggest benefits of the exploratory approach.

Test Heuristics

Where do test ideas come from? Test ideas are experiments which testers perform to provide evidence for, or to disprove, a hypothesis about the software. Test ideas are usually driven by a set of heuristics, which have been defined as “a fallible idea or method which may help you simplify or solve a problem”. In other words, heuristics can be thought of as rules-of-thumb which can be used to drive your test ideas.

As an example, imagine that you are testing a database-driven application such as an inventory management system with a front-end GUI and a relational database on the back end. You may be familiar with the CRUD heuristic for database operations – CRUD stands for the different operations which a database application performs on its records: Create, Read, Update and Delete. This heuristic will drive your test ideas, serving as a reminder to explore what happens to this inventory management system when each of these operations is performed.
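To make the CRUD heuristic concrete, here is a minimal sketch of how it can be turned into a list of test ideas. The record types "item" and "supplier" are hypothetical stand-ins for whatever the inventory system actually stores, and the function name crud_test_ideas is our own. The output is a list of prompts for exploration, not a script to follow.

```python
from itertools import product

# The four operations named by the CRUD heuristic.
CRUD = ("Create", "Read", "Update", "Delete")


def crud_test_ideas(record_types):
    """Pair every record type with every CRUD operation.

    Each resulting string is a reminder of something to explore;
    how to explore it is left to the tester in the moment.
    """
    return [
        f"{operation}: {record} record"
        for record, operation in product(record_types, CRUD)
    ]


# Hypothetical record types for an inventory management system.
for idea in crud_test_ideas(["item", "supplier"]):
    print(idea)
```

Two record types yield eight ideas. The list works like the checklists described earlier: it jogs the memory without prescribing steps or expected results.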

There are many heuristics available to the exploratory tester, too many to list here in detail.  James Bach, a major proponent of the exploratory testing approach, famously talks about a mnemonic he uses to remember heuristics of different test aspects of any product, called “San Francisco Depot”, or SFDPOT:

  • Structure
  • Function
  • Data
  • Platform
  • Operations
  • Time

Each of these categories is an exploration path; that is, an area in which tests can be developed and executed in real time. According to Bach, each of these unique dimensions of software products should drive a set of test charters to reduce the possibility of missing important bugs.

Managing Exploratory Testing

Bach has done extensive work on developing a management system specifically for Exploratory Testing, called Session-Based Test Management (SBTM), and we will briefly describe some of its highlights here.  A key element of SBTM is the Session Report.  In a session report, basic information is gathered about the test session, such as the names of the testers, the title of the test charter, and the date, time, and duration of the session. In addition, a sentence summarizing the hypothesis and a list of the tests that were performed are included. One of the most important parts of the session report, however, is a list of open questions and issues that came up during the exploration. Remember, one of the goals is to learn about the software, and coming up with a list of questions to go back and ask the subject matter expert is a great way to learn. Some of those open questions will have simple answers and can be dismissed, but others will be authentic bugs, which will get entered into the bug tracking system.

SBTM also keeps track of certain metrics about the test session. Since testers are encouraged to explore, and exploring testers can very easily find themselves off on a tangent somewhere, there needs to be a balance between focused work on the test charter, and roaming into unknown territory. For example, it is interesting to see what percentage of time is spent on-charter vs. off-charter. Exploratory managers would like to know how much time was spent pursuing the goals of the test session, as opposed to how much effort was spent exploring new areas. Another word for off-charter testing is test opportunity, because that is the core of exploration – the “playing around” that we mentioned earlier. Experienced exploratory testers have developed a sense of how long to explore in opportunity mode before saying, “let’s take note of this opportunity because we just found a new test charter for our next test session, and let’s get back to testing our original charter”.
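As an illustration, the pieces of a session report listed above can be captured in a small record. This is only a sketch: the class name SessionReport, its field names, and the sample values are our own assumptions and do not reproduce any official SBTM format.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class SessionReport:
    """Hypothetical container for the session report items described above."""

    testers: list[str]
    charter: str
    started: datetime
    duration_minutes: int
    hypothesis: str
    tests_performed: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

    def summary(self) -> str:
        """One-line overview suitable for a debrief with the test manager."""
        return (
            f"{self.charter} ({self.duration_minutes} min, "
            f"{', '.join(self.testers)}): "
            f"{len(self.tests_performed)} tests, "
            f"{len(self.open_questions)} open questions"
        )


# Sample values are invented for illustration only.
report = SessionReport(
    testers=["Tester A", "Tester B"],
    charter="Check the login feature under different user permissions",
    started=datetime(2024, 1, 15, 10, 0),
    duration_minutes=90,
    hypothesis="Read-only users cannot change account settings",
)
report.tests_performed.append("Log in as a read-only user and open the settings page")
report.open_questions.append("Should a locked account see a different error message?")
print(report.summary())
```

Keeping the open questions as a separate list reflects their role in the report: each one is either answered by the subject matter expert and dismissed, or entered into the bug tracking system.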

Another metric that SBTM tracks is the division of time between Testing, Bug Hunting and Reporting, and Setup. Testing refers to the tasks done to check the software, both on-charter and opportunity. When done correctly, this will often lead to periods of Bug Hunting, where you notice something is wrong with the software and you run on-the-spot experiments to try to reproduce the issue. This is one of the most rewarding aspects of exploratory testing – when the testers find and learn how to reproduce a bug. Setup means anything that needs to be done in advance so that testing may continue, including system or network configuration.

Oftentimes, the setup of tests takes longer than the tests themselves, especially during the first few test sessions of the project. This is information that is very interesting to the test manager, and it indicates that the product might be in its early stages and we may need to plan for additional test sessions. If there is a lot more time spent on Bug Hunting than testing, you may want to schedule another test session with the same charter, or the test charter may have been too broad, so you’ll want to narrow it down a bit.
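The session metrics described in this section come down to simple percentages. The sketch below assumes that time is logged in minutes per category; the function name percentages, the category labels, and the sample figures are all hypothetical.

```python
def percentages(minutes_by_category):
    """Convert minutes per category into percentages of the total."""
    total = sum(minutes_by_category.values())
    if total == 0:
        raise ValueError("no time recorded for this session")
    return {
        name: round(100 * minutes / total, 1)
        for name, minutes in minutes_by_category.items()
    }


# Invented figures for one 90-minute session.
task_breakdown = percentages(
    {"testing": 45, "bug_hunting_and_reporting": 30, "setup": 15}
)

# Of the 45 testing minutes, how many stayed on the charter?
charter_split = percentages({"on_charter": 36, "opportunity": 9})

print(task_breakdown)
print(charter_split)
```

With these figures, half of the session went to testing, a third to bug hunting and reporting, and the rest to setup, while 80% of the testing time stayed on-charter. A manager reading such numbers across sessions can see when setup dominates or when a charter is too broad.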

Getting Started

As you have probably begun to see, Exploratory Testing is a vast subject, and this white paper has only scratched the surface. If you’d like to introduce exploratory testing to your organization, how can you get started?  Here are some tips to point you in the right direction:

  • Locate your current testing methodology on the Exploratory Continuum. You may already be doing more exploratory work than you think. Once you know where you are, it’s easier to plan for the future.
  • Begin small. Find an important but not critical project to use as a pilot. You can get your feet wet with this approach and use it as a learning experience as well.
  • Continually improve. Remember, there are no failures, only learning experiences.
  • Seek help. Calling in the right expert to guide you can be an effective way to get started. Look for a professional consulting team that has extensive reach and a global presence, with expertise in various styles of testing so that they can help you to customize the right plan for you.

Professional History and Credentials:

Bernie Berger has been involved in the QA and Testing field for over 10 years. He has provided testing, planning, managerial and consulting services to a host of major firms in New York’s financial community, including Zurich Scudder/Deutsche Bank, Market Data Corporation, Moody’s Investor Services, Citibank, Bank of New York and ILX Systems. Bernie is the founder of the Software Testing in Financial Services Workshop (STiFS), a series of intimately-sized, invitation-only meetings of senior software quality professionals with particular interest in improving the testing of financial software systems. He also moderates the Yahoo message group “tester-career-support”, a forum for testers and QA folks to help each other find better jobs. Mr. Berger is also active in the greater QA community, lecturing and publishing in various professional venues and periodicals.

Yaron Kottler is a Software Testing Expert (STS) in QualiTest Group. Mr. Kottler joined QualiTest in 1999 and progressed from the role of test engineer to project manager, business manager and international business development. Prior to joining QualiTest US in 2006, Mr. Kottler led the opening of QualiTest’s Turkish branch. Mr. Kottler is an expert in Software Testing, Load and Performance testing and in the implementation of automated testing tools and methodologies. He has led large-scale testing projects in ERP and CRM applications for a number of market leaders.