Sunday, May 15, 2011

A Brief Six Sigma Tutorial

This continuous process improvement method takes its name from the statistical term "sigma," the standard deviation of a population distribution.  A six sigma process is one in which the nearest specification limit lies six standard deviations from the process mean.  Applied to the distribution of failures, and allowing for the conventional 1.5-sigma long-term drift of the mean, this means a failure rate less than or equal to 3.4 per million "opportunities." This may seem somewhat esoteric, but it gives a goal to drive toward.
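The 3.4 figure can be checked numerically. The sketch below assumes the usual Six Sigma convention that the process mean may drift 1.5 sigma, which leaves 4.5 sigma to the nearer limit; the function names `dpmo` and `upper_tail` are made up for this illustration.

```python
import math

def dpmo(defects: int, opportunities: int) -> float:
    """Defects per million opportunities."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return defects * 1_000_000 / opportunities

def upper_tail(z: float) -> float:
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Assumption: six sigma minus the conventional 1.5-sigma shift = 4.5 sigma.
six_sigma_dpmo = upper_tail(6.0 - 1.5) * 1_000_000
print(round(six_sigma_dpmo, 1))    # 3.4

# Hypothetical counts: 7 defects observed in 2,000,000 opportunities.
print(dpmo(7, 2_000_000))          # 3.5
```

With these made-up counts the process sits just above the 3.4 goal, which is the kind of comparison the goal is meant to support.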

I learned the Six Sigma process as a five step procedure: Define, Measure, Analyze, Improve, and Control (DMAIC).

We start with something that fails.  A common symptom is something that requires rework.  Rework is a quality issue and costs money that we could save if the mistake that caused the rework is not repeated.  So Define starts with a defect of some kind. But there are two parts to the definition: Defects and Opportunities for that defect to occur whether it did or not.  Classically, we relate to widgets, like light bulbs, and define opportunities as bulbs and defects as ones that don't work, but this is a very simplistic view of Defects and Opportunities.  In a future post I will work through a couple of software examples; but be forewarned, a Line-of-Code (LOC) cannot be considered an opportunity.

Measurement is the collection of data on both Defects and Opportunities.  The Measurement activity may indicate that we have chosen the wrong Defect/Opportunity (D/O) metrics.  That is no problem: fixing any procedural problem that creates measurable defects is the goal of continuous process improvement.

Analysis is the discovery of the causes of the defects.  The list can be long, including such items as Lack of Standards, Failure to Conform to Standards, Poor Requirements Definition, Poor Training, Failure to Communicate, and so on.  We might sort the cause categories by the number of failures in each (a Pareto Chart) and select only the top few to fix, which gives us the best leverage (most improvement for issues addressed).  Root Cause Analysis techniques are generally applicable here.
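The Pareto sort can be sketched in a few lines. The defect log and the 80% cutoff below are invented for illustration; only the cause names come from the list above.

```python
from collections import Counter

def pareto(causes, cutoff=0.8):
    """Return (cause, count, cumulative_share) rows, most frequent first,
    stopping once the cumulative share reaches the cutoff."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows, running = [], 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, running / total))
        if running / total >= cutoff:
            break
    return rows

# Hypothetical defect log: one entry per defect, tagged with its cause.
defect_log = (["Poor Requirements Definition"] * 12
              + ["Failure to Conform to Standards"] * 5
              + ["Poor Training"] * 2
              + ["Failure to Communicate"] * 1)

for cause, count, share in pareto(defect_log):
    print(f"{cause}: {count} ({share:.0%} cumulative)")
```

In this made-up log, two of the four causes account for 85% of the defects, so those two would be the ones to carry into the Improve phase.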

The Improve phase applies mitigation techniques to the issues that must be addressed to improve the D/O ratio.  Improvements may be such things as improving (or establishing) standards, standards training, improving requirements acquisition, and establishing communications policies such as for meetings and documentation. Establishing, and conforming to, new standards for legacy software can be problematic.

Control is a management function.  The purpose is to ensure that the mitigations put in place as a result of a Six Sigma project remain in place and the improved defect rate is retained.  Typically this involves periodically repeating the measurements of the Measure phase and publishing the results to the monitoring manager.

Saturday, May 14, 2011

The Application of Six Sigma to Software Production - Part 2

Just what is the problem?

I would refer to the success of Motorola and GE with Six Sigma. Its
activities address not only problem Definition, problem Measurement,
problem Analysis, and Improvement (implementation of the solution), but
also the management Controls (DMAIC) that keep the resulting changes in
place. Part of the issue here is the metrics, of course. But in Six Sigma,
one identifies the metrics that characterize the problem and not vice
versa.

In Six Sigma there are no best practices, only better ones than the ones we
have now. And better ones than that. And when you have reached no more than
3.4 undesirable events in 1,000,000 opportunities, your practices are "good
enough."

In the most generic sense, I suppose the problem is that mistakes cost
money, and if we can avoid the mistakes, we can save money. You can bet
that the cause of mistakes resides in the culture in which the mistakes were
made. Cultures are generally created by the founding fathers and
propagated by successive generations; in businesses, by generations of
management. Cultures have a lot of momentum; see NASA, for example. Is
saying that we can have a better culture tantamount to saying that our
current culture is not good enough? Is it wrong? Is it not satisfying?



(This first occurred in substance in the Yahoo Testing-Software forum.)

Wednesday, April 20, 2011

The Application of Six Sigma to Software Production - Part 1

I am a firm believer in continuous process improvement.  My first computer testing position was with the General Electric Process Control Division (1964, later Honeywell).  Though we had no formal process for process improvement, when something failed we usually performed root cause analysis (RCA) to discover what mistakes we had made so that we could avoid making those same mistakes in the future.

In later years (c. 1998?), I was again working for GE and had the opportunity to learn about and apply Six Sigma continuous process improvement techniques to a software development environment.  I saw it work.

I hope to publish a series of posts that will describe the Six Sigma process as I used it, the arguments against using it for improvement of the software development process, the myths concerning Six Sigma circulating in the software development community, and, perhaps, an occasional rant about the failure to apply engineering discipline to software development.

Here is my RCA for why software has such a rotten quality reputation - and an economically justified solution:

You can bet that the cause of mistakes resides in the culture in which the mistakes were made. Cultures are generally created by the founding fathers and propagated by successive generations; in businesses, by generations of management. Cultures have a lot of momentum; see NASA, for example. Is saying that we can have a better culture tantamount to saying that our current culture is not good enough? Is it wrong? Is it not satisfying?

This is enough to test our very belief systems.

In the most generic sense, I suppose the problem is that mistakes cost money, and if we can avoid the mistakes, we can save money.  This is a problem we can address.

One method of addressing this problem is the Six Sigma process.

Tuesday, April 19, 2011

Qualitative Analysis and Software Testing Instincts

I think there is value in subjective, qualitative software analysis.  I think that is a guiding principle of Exploratory Software Testing (ET): "Should I look here or over there?"  Our well-trained instincts guide us.  So what trains us and what are those instincts?  Ideas worth looking into.

I rely on the "One Roach Conjecture":  "If you see one roach . . ."

I frequently use this principle while performing ET.  For example, I look to see if there is a simple parsing problem.  If there is, then I think, "Ah HA! These programmers don't know how to parse.  I wonder where else parsing is critical to the function?"  (Examples of simple parsing test cases: 0.0.0, --1, -0.  I found that Acrobat Reader had a lot of trouble with these.  Ergo . . .)
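To show how such probes can be scripted, here is a minimal sketch that feeds the three inputs to Python's built-in float() as a stand-in parser. The helper name `probe` is made up, and nothing here reproduces the Acrobat Reader behavior; it only shows the shape of the check.

```python
import math

def probe(parse, text):
    """Return ('ok', value) or ('rejected', error name) for one input."""
    try:
        return ("ok", parse(text))
    except ValueError as exc:
        return ("rejected", type(exc).__name__)

# The three inputs from the post, fed to float() as a stand-in parser.
for text in ["0.0.0", "--1", "-0"]:
    print(repr(text), probe(float, text))

# "-0" is accepted and yields negative zero, which compares equal to 0.0
# but keeps its sign; code that formats or branches on sign can trip here.
neg_zero = float("-0")
print(neg_zero == 0.0, math.copysign(1.0, neg_zero))
```

A parser that accepts the first two inputs, or that mishandles the sign of the third, is the "one roach" that suggests looking at every other place the product parses numbers.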

The subjective led to the objective so I think exploring subjective analysis has some bearing on our software testing craft.

(A slightly different version of this post appeared Oct 27, 2010 in the "Software-Testing" Yahoo group.)

Monday, April 18, 2011

Microsoft(tm) Calculator Challenge

The Microsoft Calculator Challenge

Here is a challenge for those of you with some time on your hands (it has happened to me); it also illustrates an interesting testing technique.

Background

I have been a nemesis to the calculator people at Microsoft.  One time in the late '90s, I had some time on my hands (See?  I told you.) while I was working.  I was supposed to be testing, but the system under test was not available (probably never happened to you), and I was looking for something to test.

My desktop at work was a PC running Windows NT and there it was: Windows Calculator.

So I began trying things.  It was definitely broken.  Mostly precision issues.  For a complete list, see my Ph.D. dissertation, "Appendix A - Calculator Anomalies."

Microsoft "fixed" most of these problems by adding "infinite" precision.  You can try this with the following string (without quotes) pasted into the scientific view (ALT-2) of MS Calculator: "1x9999s".  You can click on "Continue" as many times as you want. 

So I faced that challenge myself, since what I considered a really stupid fix made me somewhat unhappy with Microsoft.  I had developed a random test case generator, and it generated a very long string that, pasted into Calculator as above, produced a catastrophic failure.  I spent some time reducing that test case and found that the following string (again, without quotes) would cause that same symptom:

"(((((((((((((((0=)))))))))))))))"

During the test string reduction, I also discovered the following string that produced an interesting result:

"(0=)(2+2=)"

The problems with these test cases persisted in subsequent versions of Windows including 2000, XP, and Vista.  If you have one of those operating systems, try those test cases.

However, if you have Windows 7, as I now have, these cases no longer "fail."  Microsoft has once again "fixed" the problem by terminating the operation once the "=" sign is entered.  Thus:

"(((((((((((((((0="

is a complete and valid calculator operation.

The Challenge

I am incensed!  So you know what I did?  I removed the "=" from my random test case generator and generated another long string, and guess what?  It still fails catastrophically in Windows 7.  (I haven't tried it with other Windows versions, but I'll bet it fails.)
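The generator itself is not shown in this post, so the following is only a guess at its general shape: a hypothetical recursive generator that builds expressions from 1, +, / and parentheses and never emits "=". The probabilities, depth, and seed are arbitrary choices.

```python
import random

def random_expr(rng, depth):
    """Build a random expression from 1, +, / and balanced parentheses."""
    if depth == 0 or rng.random() < 0.3:
        return "1"
    left = random_expr(rng, depth - 1)
    right = random_expr(rng, depth - 1)
    op = rng.choice("+/")
    expr = f"{left}{op}{right}"
    # Wrap most subexpressions so the parenthesis nesting grows with depth.
    return f"({expr})" if rng.random() < 0.8 else expr

rng = random.Random(2011)   # fixed seed so a failing case can be regenerated
case = random_expr(rng, depth=8)
print(len(case), case[:60])
```

Seeding the generator matters more than any other detail: a random case that fails is only useful if it can be produced again.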

The original test case is 17,475 characters long.  As before, I have reduced it to a case that is only 376 characters long, as follows:

(1+(1+(1/(1/((1+((1+1)+1))+(1+1/((1+(1+(1+(1+(1+(1+(1+(1+(1+((1/(1+(1/(1/((1+(1+(1/((1+(1/(1)+(1/(1+(1+(((1))/1))/1))))/1))))))/1)))+(1+(1+(1+(1/(1+(1)/1))/1))+1)/1)/1)/1))+1))+1)/1)/1+(1/1)))))/1))/1)/1+(1+(1+((1/(1+((1/(1/(1/(1+(1/(1/(1+(1+((1+1)+1)/1)+(1+(1/(1+(1/(1+(1+1+1)+1)/1))/1)+1)))+1/(1/(1)/(1/((1)/1)/1)/1)+1)))/1)/1)/1)))/1)/1)+(1/(1/(1+((1+(1)/1)/1))))/1)/1)/1)/1

So the challenge is this:  What is the minimum length string that will cause this failure?

An interesting note: If you remove that last "/1" from the test case, it will not fail.

Hint:  I save the string in MS Notepad, make a change that reduces its length, copy it, and paste it into Calculator.  If it still fails, I save it under a new name; if not, I restore the previous failing version.  The reduced case you see above is TC18.txt.  Be sure there are no line breaks in the string if you copy it from here.
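The save-or-restore loop in the hint can be automated whenever the failure check itself can be scripted. The sketch below is a simple greedy chunk-removal reducer, not a full delta-debugging algorithm, and it does not guarantee the shortest possible string. The `fails` predicate is a toy stand-in (an assumption), since pasting into Calculator is a manual step.

```python
def reduce_case(case, fails):
    """Greedily delete chunks while the reduced string still fails."""
    if not fails(case):
        raise ValueError("start from a case that fails")
    chunk = len(case) // 2
    while chunk >= 1:
        i = 0
        while i + chunk <= len(case):
            candidate = case[:i] + case[i + chunk:]
            if fails(candidate):
                case = candidate   # still fails: keep the shorter version
            else:
                i += chunk         # no longer fails: restore and move on
        chunk //= 2                # retry with smaller deletions
    return case

# Toy stand-in for "paste into Calculator and watch for the crash":
# pretend the failure needs at least three opening parentheses.
def fails(s):
    return s.count("(") >= 3

print(reduce_case("(1+(1/(1+1))/1)", fails))   # (((
```

Starting with large deletions and halving the chunk size cuts down the number of trials, which matters when every trial is a manual copy and paste.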

I am also interested in intermediate results, different results, different symptoms, and failing results with a different OS or application.

Getting Started - - - Again

I once started a BlogSpot blog before Google took it over, and I didn't take advantage of the account conversion offer, so I had to create this new account.

Though my primary thrust here is to relate to software testing, I do reserve the right to comment on anything that comes to mind.

I have a particular interest in exploratory software testing, test automation, and static code review, and I plan to publish here tools that I have developed over the years.