Stress failures and bug advocacy - looking at stress tests from a value perspective
Stress is part of your test strategy. You use it as a tool to test your
product and find bugs. It is one of the “non-functional” test categories you
run. But have you taken the time to think about what your stress tests are
actually testing?
You may continue to test without answering this question, but when the time
comes for bug advocacy, you have to defend your test strategy and findings, and
this may force you to search for an answer.
What are we stressing for?
1) Statistical failure: stress increases the chance that a sporadic defect
appears, because it executes a flow many times.
2) Stability testing in a shorter time: stress compresses the time factor. The
failure reveals, within a short run, a defect that a system running under
normal conditions (amount of data, number of simultaneous actions, etc.) would
experience only after a longer run. A common example of such a failure is a
memory leak found using the stress setup.
3) Load (sometimes defined as a category by itself): we test how our system
scales with multiple calls, a large amount of data or both. Here, the failure
reveals the point at which the system fails to handle the load.
4) Any combination of 1, 2 or 3.
In an ideal scenario, when a stress-related
defect is reported, it follows the path of debug, root cause and fix. But in
many cases, we will need our bug advocacy skills to convince our
stakeholders of the need to fix the defect.
A typical bug discussion can start like this:
Developer: “Stress of 4½ hours and 5MB data
files is not a normal usage of our system. A typical use case takes 15 minutes
and a smaller amount of data. We should reject this bug.”
This point in the discussion can reveal whether
you did your homework or not.
To decide that the failure belongs to the 1st
classification (statistical), we need to decompose the stress into a meaningful
use case and run it over and over, bringing the system back to a clean state
between runs. Automation can be a big help here.
If we succeed in reproducing the failure
under such conditions, our report will transform from a stress failure report
into a use case failure report with a reproduction rate. Once we have a
sufficient statistical sample, the impact is clear.
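As a rough illustration of the idea above, here is a minimal Python sketch of such a loop. Everything in it is an assumption made for the example: `reproduction_rate`, `flaky_use_case` and `reset_system` are hypothetical names, and the "system" is simulated with a random number generator rather than a real product.

```python
import random
from typing import Callable


def reproduction_rate(use_case: Callable[[], bool],
                      reset: Callable[[], None],
                      runs: int) -> float:
    """Run one use case `runs` times, each from a clean state.

    `use_case` returns True when the run passes and False when it fails.
    The return value is the fraction of runs that failed.
    """
    failures = 0
    for _ in range(runs):
        reset()               # bring the system back to a clean state
        if not use_case():    # count every failed run
            failures += 1
    return failures / runs


# Simulated stand-in for a real system: fails sporadically (about 2% of runs).
rng = random.Random(0)


def flaky_use_case() -> bool:
    return rng.random() >= 0.02


def reset_system() -> None:
    pass  # hypothetical: restart the service, clear its data, and so on


rate = reproduction_rate(flaky_use_case, reset_system, runs=1000)
print(f"use case failed in {rate:.1%} of 1000 clean-state runs")
```

The number that goes into the bug report is the rate together with the number of runs behind it, since a rate measured over a handful of runs says very little.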
Pinpointing whether the failure is related to
time or to load is more complex, as we need to “play” with both factors to
find out how much time, load or both is needed to bring the system to a
failure point. Awareness of
the possible options is an important tool in bug advocacy. For example, it can
broaden your stakeholders’ perspective when you are able to say, “we’re not sure
yet, but it is possible that we will see the failure under normal conditions
after a long period of time.”
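One possible way to “play” with the two factors is to probe each of them separately before combining them. The sketch below is only an illustration under stated assumptions: `classify_stress_failure` and the toy systems are hypothetical, a real `run` would drive your actual product, and the numbers (15 and 270 minutes, loads of 10 and 100) are just example values.

```python
from typing import Callable


def classify_stress_failure(run: Callable[[int, int], bool],
                            normal_load: int, stress_load: int,
                            short_minutes: int, long_minutes: int) -> str:
    """Probe load and time separately, then together.

    `run(load, minutes)` returns True if the system failed during that run.
    """
    load_alone = run(stress_load, short_minutes)   # high load, short run
    time_alone = run(normal_load, long_minutes)    # normal load, long run
    if load_alone and time_alone:
        return "time or load"
    if load_alone:
        return "load"
    if time_alone:
        return "time"
    if run(stress_load, long_minutes):             # only both together fail
        return "time and load combined"
    return "not reproduced"


# Toy systems, purely for illustration.
def leaky_system(load: int, minutes: int) -> bool:
    return load * minutes > 2000    # fails once enough work has accumulated


def capacity_limited_system(load: int, minutes: int) -> bool:
    return load > 50                # fails as soon as load passes a limit


print(classify_stress_failure(leaky_system, 10, 100, 15, 270))
print(classify_stress_failure(capacity_limited_system, 10, 100, 15, 270))
```

A result of "time" is the case worth raising with stakeholders: the failure can show up under normal load, it just takes longer.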
Completing the research before reporting the
stress failure can consume a lot of resources and time, so I don’t suggest
delaying the report until the tester has all of the answers. Many times, we can
reach faster and better conclusions about the failure from a focused code
review or a debug log analysis.
I would like to suggest the following: learn
to classify your stress failures. When
you see and report a stress failure, treat it as the start of the classification
and investigation. While sometimes the report will be enough to call for a bug
fix, many times it will serve as a call for investigation. During the
investigation, make clear to stakeholders what you already know and what you
don’t know yet. Make sure that new findings are added to the bug, and don’t be
afraid to change the title to reflect them.
There is much more to learn than the basics I
summarized in this post. Learning more about stress in general and about your
specific system can help you classify and investigate your stress failures and,
no less important, plan your stress tests better.