Sunday, July 24, 2011

The Double Sin of the Early Perfect Test Case



Since I started leading my current testing team, I’ve been struggling with the test case base.

There are a few factors that made the test case base clumsy and outdated. One of the most striking discoveries for me was that even test cases that showed a big investment in detail were often wrong: some details had changed, some had never been true, and some were simply outdated. Despite the large investment, you could often find new testers struggling to understand and execute the test baseline.

The First Sin: Detailed Gen 0 Test Cases

In my experience, when test cases are created before the test designer has seen and experienced the product, it is more than likely that they will not be accurate.

The reason for the failure is the limitation of our minds to perfectly imagine an abstract design. Sometimes even the designers don't have a 100% complete design. While you can plan many things ahead of time, you can also anticipate that you will have gaps in your planning, but not anticipate their exact location.

The Second Sin: Detailed Gen 1 Test Cases

What about tests that successfully made it from Gen 0 to Gen 1 and proved to be correct? What about tests that were designed after the product was introduced and tried? They might not suffer from the first sin, but they will suffer from the second: although these tests were accurate in their assumptions about the product itself, not all of them were the correct ones to run. Moreover, some of the tests that did a great job for Gen 1 have served their purpose. Using these tests in regression will not be efficient.

As we progress with the test execution, we learn more about the risks of the product. At the end of the first generation of testing, we can plan better regression testing for the next generations. Typically we will add a small number of test cases and get rid of a larger number of tests.

Conclusion: investing in too many detailed test cases during Gen 0 and Gen 1 is not efficient.

I’ll try to define basic guidelines to deal with this issue:

1) Lower your expectations of Gen 0 and Gen 1 test cases – understand the built-in limitations of these test cases: Gen 0 might be inaccurate, and Gen 1 will not fit your regression needs.

2) Seek alternatives when planning Gen 0 and Gen 1 test cases. For example, use checklists instead of steps (see “The Value of Checklists and the Danger of Scripts” by Cem Kaner).

3) Try to think of better uses of your time during test planning. For example, invest in automation infrastructure during preparation.

4) Realize that moving from Gen 1 to Gen 2 will require more time for test documentation; it is not just copy-paste from the Gen 0–1 test cases. At this stage, you can save time by creating less “perfect” test cases for the new features introduced in the same product at this time.

5) Consider the possibility that you will come to like your lean Gen 0 and Gen 1 test cases so much, that you won’t want to invest in more details for the regression test case base.
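Guideline 2's suggestion – checklists instead of step-by-step scripts – can be sketched with a small, entirely hypothetical login example of my own (not taken from Kaner's paper). The scripted version locks in brittle UI details that a Gen 0 designer is likely to get wrong; the checklist records intent and leaves the "how" to the tester:

```python
# Hypothetical example: the same login coverage written two ways.

# A scripted test encodes exact steps and UI details; any cosmetic
# change (a renamed button, a moved field) invalidates the document.
scripted_test = [
    "Open https://example.test/login",
    "Type 'admin' into the Username field",
    "Type 'secret1' into the Password field",
    "Click the blue 'Sign in' button",
    "Verify the text 'Welcome, admin' appears",
]

# A checklist records the ideas to cover; the tester decides how,
# so the document survives Gen 0 inaccuracies and UI churn.
login_checklist = [
    "Valid credentials log the user in",
    "Invalid credentials produce a helpful error",
    "The password field masks its input",
    "The session survives a page refresh",
]

# The checklist doubles as a lightweight execution record.
results = {item: "not run" for item in login_checklist}
results["Valid credentials log the user in"] = "pass"
passed = sum(1 for r in results.values() if r == "pass")
print(f"{passed} of {len(results)} items checked")
```

The point of the sketch is only the shape of the artifact: four lines of intent survive product changes that would break five lines of steps.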

If you claim that your experience is different and that it is possible to create perfect, reusable test plans in the early stages, I can think of the following possibilities:

1) You are a better product and test designer than the ones I work with (please mentor me).

2) You don’t have complex and innovative products like the products that I test.

3) You follow a perfect process that prevents you from falling into such traps (and I would like to hear more about it).

Sunday, May 8, 2011

Is there a Pesticide paradox in testing?

As a tester, I have heard and read the term “pesticide paradox” on many occasions. However, I do not feel comfortable with it, so I avoid using it. In the last few days, I decided to examine it more carefully – does this term make sense? I did some googling to explore the common use of the term in software testing and the definition of the paradox in the real pesticide world, and to try to answer the question.

The original paradox is explained in Wikipedia :
“The Paradox of the pesticides is a paradox that states that by applying pesticide to a pest, one may in fact increase its abundance. This happens when the pesticide upsets natural predator-prey dynamics in the ecosystem.” I'll refer to this definition as the "original".

The common use of the term in testing, as I have experienced it, is to describe the observation that when scripted tests (automated in most cases) are repeated over and over again, eventually the same set of test cases will no longer find any new defects (I took a quote from the ISTQB syllabus, which is a great source of "terms of common use"). I'll refer to this definition as the "common use".

I also found the explanation that "a static set of tests will become less effective as developers learn to avoid making mistakes that trigger the tests" (a paper by Rex Black).

I could say that the common use of the term is to describe that repeating the same checks tends to yield fewer bugs from run to run. In my experience this is usually true: bugs get fixed, and when no major changes are introduced and no very bad development occurs, products become more stable from release to release. The developer-learning explanation is also logical.

If we try to correlate the testing common use of the term with the original one, we see only a very loose connection – software bugs do not increase because you repeat some checks. Moreover, where is the paradox here? If you ask a question over and over, it is very likely that most of the time you'll get the correct answer. I don't call this a paradox.

When you analyze a term, it is good practice to read the source. I don't have the book Software Testing Techniques by Boris Beizer, in which the term originates, but thanks to http://www.softwarequotes.com/ I found it:
First law: The pesticide paradox. Every method you use to prevent or find bugs leaves a residue of subtler bugs against which those methods are ineffective.
- Boris Beizer, Software Testing Techniques, Chapter 1, Section 1.7, ISBN 0442206720. Beizer notes that farmers solve this problem by planting sacrifice crops for the bugs to eat, and laments that programmers are unable to write sacrifice functions. I'll refer to this quote as "Beizer's".

Well, that makes sense too, and it is a good foundational law to know before you learn about methods – no method is fully effective. As with the common-use term, I don't see the paradox.

I'll summarize my conclusions on the subject:

• The connection between the original term – the biological phenomenon of the "pesticide paradox" – and its common use in the testing world rests mostly on the use of the word “bug” to describe a defect, and on the fact that the original paradox deals with a kind of inefficiency when trying to eliminate pests with pesticide.

• A clear logical paradox appears in the original phenomenon – you kill bugs, yet this increases their abundance – while Beizer's usage and the common use of the term talk about reduced efficiency, not a paradox. A possible response would be to argue that doing something inefficient is itself a paradox; to my taste, this is too apologetic an argument.

• The original software testing usage, the quote from Beizer, is a warning against relying on a sole method, while the common use by others, who usually refer to Beizer as the source, is to describe the decreasing efficiency of repeating a scripted test.

It’s fine to use a cool term with a loose analogy to describe your idea, but as you can see in our example, this might cause others to "steal" your term to describe other things (and worse – reference you as the source). In addition, it will be hard to convince people with critical thinking to use your term. I will leave the pesticide paradox to its original meaning.

Tuesday, January 4, 2011

Note about terminology

Sometimes, it’s all about branding. When you want to sell a product or an approach, using terminology that will "sell" your approach to your stakeholders or to the professional community affects the chances that it will be accepted.
When Fred Hoyle coined the term Big Bang during a 1949 BBC radio broadcast, he did not anticipate that he was doing a branding service for the competing theory. According to Hoyle, who favored an alternative "steady state" cosmological model, he used the striking image to highlight the difference between the two models. He probably did it too well :-)

Markus Gärtner, in his post Active vs. passive testing, introduces refreshing terminology for what we used to call Testing vs. Checking or Exploratory vs. Scripted: the terms Active vs. Passive testing. He also talks about the role of judgment, which is part of being active, but basically the new Active/Passive pair describes the research-driven, critical, exploratory approach versus executing the planned tests, checking and following defined scripts.

This new terminology has some benefits over the terms we are used to. It doesn't reuse a term, like "Testing", that we already use to describe a wider area. And unlike the term "exploratory", it doesn't suffer from an "unstructured" public image.

Thursday, November 11, 2010

Using the definition of quality as a tool for context awareness

What is quality?
Since a tester's job is to perform quality assessments, and to ask questions and provide answers regarding quality, understanding what quality is can help identify dilemmas in our work and put them in context.
My preferred definition for quality is “value to someone” (Jerry Weinberg). Cem Kaner adds the extension “who matters”. Usually I refer to this VIP as a “User”.

Recently, I noticed that using this definition helps identify context and explain the context to others during discussion.
A few examples:

Focusing a discussion on the goal rather than the process.
Process is important, but sometimes process discussions disconnect from the goal. For example, when a bug is opened and there is a discussion about whether or not it violates a requirement, providing insight into the user value can direct the discussion to a productive place. This is also true when a tester spots an issue and is not sure whether reporting it falls under his responsibility – “is there a threat to the value to the users?” is a good litmus test to aid the decision.


Selecting a process
When defining a process, understanding how it connects to the value to the user is a good way to examine it. A process should connect our efforts to the goal rather than disconnect them. A negative example is mixing up the priority for fixing a bug with the bug's severity. The two often correlate, but it is a good idea to define a process that addresses the cases where they differ (such as keeping a separate field for each goal), so that the information about the severity of the threat to value is not overwritten by the work plan.
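One way to read that suggestion in practice is a sketch of my own (a hypothetical schema, not a prescribed one): keep severity and priority as two independent fields, so the work plan can change without erasing the record of the threat to user value.

```python
from dataclasses import dataclass
from enum import Enum

class Severity(Enum):
    """How badly the problem threatens value to the user."""
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3

class Priority(Enum):
    """Where the fix sits in the work plan."""
    NOW = 1
    NEXT_RELEASE = 2
    BACKLOG = 3

@dataclass
class Bug:
    title: str
    severity: Severity   # set by assessing user impact
    priority: Priority   # set by project planning; may change over time

# A severe bug in a feature being phased out: the work plan defers the
# fix, but the severity field still records the threat to user value.
bug = Bug("Data loss on legacy export", Severity.CRITICAL, Priority.BACKLOG)
```

With one shared field, re-planning the fix would silently destroy the severity information; with two fields, each goal keeps its own record.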

Determine classification of a problem
We face many types of problems. Some relate to the quality of the product, and some interfere with other aspects of our work, delaying our progress or blocking our testing efforts. When a testability issue is examined from the perspective of user value, it can be underrated, since the real issue is our inability to test efficiently, not the impact of the tested attribute on the end user.
Sometimes two types of issues are combined in one problem description. Distinguishing between the threat to user value and the other issue helps provide a clearer explanation.

Overcoming the tunnel effect when setting bug severity
Setting the correct severity classification helps the bug life cycle start on the right foot. When a tester tests his area of responsibility and spots a problem, it is sometimes not easy to relate it to the big picture – what will be the impact on the user? Will he be able to recover? In other words: what is the threat in terms of value to the user? Answering this question directs the bug submitter to the correct bug severity.

Thursday, September 2, 2010

Indicators of Useful or Useless Indicators

The Testing Experience magazine published my article “Indicators of Useful or Useless Indicators” in its recent issue about metrics.

I am very happy with this article's publication for the following reasons:
  • I was able to transform thoughts that came from my experience into a context-driven article.
  • The publication exposed more of my thoughts and writing to my colleagues within the design center.
Below is an embedded version of the article. I recommend downloading the magazine issue itself.



Wishing you a Happy Hebrew New Year שנה טובה ומתוקה,
Issi

Sunday, August 1, 2010

Peaks are fun, but a bit scary

Writer Meta Thoughts.
Last week was definitely a peak in my writing journey. On Sunday, an article of mine was accepted for print by "Testing Experience" magazine. On Tuesday, Pradeep Soundararajan mentioned my name among other "Good thinkers and future experts". I wrote a post on Wednesday, and on Thursday discovered that my site had a record number of visitors. A short investigation led me to the traffic source - The following Tweet by James Bach:
jamesmarcusbach: Another new sapient testing blogger (avoids pat answers, critical self-analysis). I like this guy http://bit.ly/dl042u
Thank you, James!
What a week! No wonder – this week was also my birthday week. While enjoying the compliments, a concern sneaks into my mind: how do I maintain my reputation? The concern fades when I look back down the path – remembering the journey I started a year and a half ago, daring to blog my first post and going on from there, searching for my inner voice and improving my writing skills. Continuing the journey looks less scary than taking the first step.
Here's to another year of new Insights and Peaks.

Wednesday, July 28, 2010

The numbers monster

I had just finished polishing my article about meaningful indicators; I hit the "Send" button and felt good. I was able to express my thoughts about meaningful and meaningless numbers, and my article had been accepted for print.

While writing about indicators and numbers, I remembered that James Bach described the moment he realized that the test case count he had been asked to give his manager was meaningless as a turning point in his career (uTest blog, "Testing the Limits" with James Bach).

Smiling, I lean my head toward my laptop, taking a small nap.
Through the mist, an email message pops up in my face: the magical forest keeper forwarding me a message from the numbers monster. The numbers monster is hungry and wants me to give her some numbers.

I replied to the forest keeper: I understand that the numbers are very tasty, but they are complete junk food. We can ask the monster what her purpose is in digesting the numbers, so that I will be able to guide her to some good herbs that will satisfy her, instead of fattening her with the junk food she asked for.

The forest keeper replied in a few seconds: The monster demands her numbers.

I replied back: our forest's success depends on the monsters' health. I know she is used to the junk food she gets from other creatures in the forest, but I don't think it is good for her. I have watched monsters eat the numbers junk food: they start going in the wrong direction, and even when they go in the right direction, they waste the dwarfs' energy on supplying them junk food, instead of gardening the herb plot and taking care of the forest's prosperity.

The forest keeper replied again: The monster demands her numbers. As a forest keeper, I need you to cooperate so that the dwarfs will keep feeding the monsters numbers. It’s easy for you and for them, and most importantly, it gives the monsters the feeling that they control the forest, not the dwarfs.

My mind started working out how to convince the forest keeper to give up his tradition and do a better job for our forest, but then I woke up from the dream.

What do you do when you meet the forest keeper and the numbers monsters?