Many new engineers struggle with one thing that is of great importance but is rarely taught in undergraduate courses: simple methods of quality analysis and control.
The closest most undergraduates get to any discussion of this might be in a semiconductor design course where yields and process variation might be touched upon, or in my case, the class I had on active networks and active filters discussed the sensitivity of various designs to component value changes.
Yet for any manufacturing environment, this simple skill is of utmost importance in getting a handle on chaos. And believe me, it will be chaos if your critical processes are not being monitored and under control.
We have all seen it: the box full of bad boards that can’t be fixed and get set aside, the test limits on some automated measurement that have to be adjusted all the time, and my favorite, the boards that get to final inspection but can’t be shipped because they do not meet the customer specifications.
With no process control in place, these issues will all seem unconnected, and it will appear that the entire operation is in chaos. Which it is…
Learn from the best
When I got out of school, the Japanese were the quality leaders. Their products were precise, worked all the time and cost less than ours. I had heard that this was because of their use of statistical quality control. I had no idea how to do this, and neither did the engineers who were mentoring me, so I purchased a book on statistics. Unfortunately, this book, although full of Chi-Square, Poisson distribution theorems and the like, didn’t attach any practical meaning to any of it.
A few years later, I had the good fortune to attend a series of classes on the Analysis of Variance in a practical setting. These folks used the book “Understanding Variation” by the well-known author Donald J. Wheeler. His books are still available today and they are little gems. Like a well-written application note, they are short and to the point, teaching the subject in a concise, easily applied manner.
What most companies do
Everyone gives at least lip service to quality because everyone knows that it is important. But the approach taken is usually that of Dr. Deming’s Red Bead Experiment.
In the Red Bead Experiment, a bin of white beads is mixed with a few red beads. The white beads signify a good product, and the red beads signify a bad product. The class is divided into about four groups to signify work centers or work shifts. Each group is given a spatula that has a grid of holes in it. In turn, each group sticks their spatula into the bin containing the mixed beads and gets some beads in the grid of holes.
The instructor counts the number of red beads on each turn and starts to proclaim the group that gets the lowest number of red (or bad) beads as being the “best production team”.
Well, as you can imagine, the number of good or bad beads is totally out of the control of the groups—they stick the spatula in and get a random number of bad beads each time—they can’t pick or select at all.
Yet the instructor, who by the way is akin to “management”, starts to tout the success of the “best team”. Yet it is all random!
During the exercise, because of the random nature of the process, the previous “best team” will undoubtedly fail and go “backwards”. The instructor will show his “displeasure” with this previously good team’s high-quality standards, which are now clearly “backsliding”.
In the end, there will be some team that just happens to have the best score and they will be awarded a bonus for the “best quality”.
Dr. Deming was a genius because this is a perfect example of how most workplaces try to implement quality control—randomly, and worse yet, with random goals!
If you have never been in one of these kinds of classes, I can’t recommend enough that you watch one on YouTube. Just search “Deming’s red bead experiment”. You can even find some taught by Dr. Deming himself. There are many available to watch and they only last about 30 minutes. This will be the best 30 minutes that you spend this week.
It sounds familiar, doesn’t it?
Well, that was Dr. Deming’s point… We all just hope that something like the observer effect (i.e., if we pay attention to something, it will change) will work out, but in quality, it doesn’t. Paying attention is a start, but attention in and of itself doesn’t fix any problem.
Simplest starting point
As I related at the outset, the simplest start to managing chaos is to apply Wheeler’s description of the Shewhart “XmR chart”. In “XmR”, the “X” stands for the individual measurements, whose average (X-bar) forms the center line of the chart, and “mR” stands for the moving range. This is a simple way to get a look at data, and it doesn’t require a computer, just measurements and a piece of graph paper. Oh, you see I mentioned “data”, actual “data”. While it is important to have “feelings” about process-related things, the only way to get a handle on them is to have actual data in a graphical form to start analyzing the possible root causes of the problems.
What this analysis can show is that sometimes processes are really out of control, and now that you have a way to measure it, you can start to understand the process to be able to change and to monitor it. Many times, however, you will find that the process is actually in control and is producing exactly what it can produce, yet you still can’t meet customer specifications. This says that the design of the process needs to be changed or the customer limits need to be changed to match reality.
If you do the Red Bead experiment enough times and measure the results, you will find that the system has an X-bar and a moving range that is actually “in control”, and it properly operates the way that the system is designed. It will produce a statistical number of red beads every time, and there is simply nothing that the workers can do about it short of changing the way that the system is designed or changing the expectation.
Now, this information won’t immediately make management, or anyone who designed the process, happy. The “red bead = bad product = bad workers” way of looking at things is simply too deeply ingrained in people who have not had this training. But at least you will know the facts, and that is the start of “managing chaos”.
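The red bead claim can be checked with a quick simulation. The sketch below is only an illustration: the bead counts (800 red, 3,200 white), the 50-hole paddle, the 24 draws and the random seed are all assumptions chosen for the example, not values from this article. It applies the 2.66 scaling factor from the box at the end of this article to the red bead counts.

```python
import random

# Assumed setup for illustration only: 800 red and 3,200 white beads,
# sampled with a 50-hole paddle.
N_RED, N_WHITE, PADDLE_HOLES = 800, 3200, 50


def draw_red_count(rng):
    """Simulate one paddle draw and return the number of red beads in it."""
    beads = [1] * N_RED + [0] * N_WHITE  # 1 = red (bad), 0 = white (good)
    return sum(rng.sample(beads, PADDLE_HOLES))


def natural_process_limits(values):
    """Return (LCL, average, UCL) using the XmR scaling factor 2.66."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    avg_range = sum(moving_ranges) / len(moving_ranges)
    avg = sum(values) / len(values)
    return avg - 2.66 * avg_range, avg, avg + 2.66 * avg_range


rng = random.Random(1)  # fixed seed so the run is repeatable
counts = [draw_red_count(rng) for _ in range(24)]  # e.g. 4 teams x 6 turns
lcl, avg, ucl = natural_process_limits(counts)
outside = [c for c in counts if c < lcl or c > ucl]

print("Red beads per draw:", counts)
print(f"LCL = {lcl:.1f}, average = {avg:.1f}, UCL = {ucl:.1f}")
print("Draws outside the limits:", outside)
```

Whichever "team" happens to hold the lowest count in a given round, the spread between draws in this sketch comes entirely from the bin and the paddle, not from the people holding it.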
I have used XmR charts for 30 years now and have made some really interesting observations along the way, a few of which follow. There are many more of course, and many of those are detailed in Wheeler’s book. I mostly plot XmR charts to monitor a critical process continually; other times I use them to analyze where a previously in-control process went out of control, as the chart can be used to look back in time (if you have the data saved somewhere, that is).
Figure 1 shows a process that is out of control. Naturally, there will be all sorts of pushback from management and the team that designed the process to “remove the obvious outlier points”. Unless you can positively say from your own personal investigation that those points were indeed exceptional outliers with a definite root cause found, don’t do it. If you do remove the outliers at the start, you will give the impression that the system is better than it is, and when they “re-appear” later, it will look to everyone as if the system is getting worse. Naturally, refusing to remove out-of-specification units from a chart will make no one else in the company happy, but resist the peer pressure to remove them.
Conversely, many new products may start out of control; that is why it is important to measure any process early in the prototype phase so that these things can be worked out and the root causes can be determined and fixed.
Figure 1 The process is out of control from the start.
In Figure 2, the minimum gain for the complete system to function properly is 4.75. This chart of measured gain shows that the customer specification is well within the natural process limits. In other words, the natural process shown here cannot meet the customer’s specifications. The options are:
- Toss the bad units and hope the process doesn’t get any worse
- Negotiate the customer specification so that it matches reality
- Redesign the process so that the specification can be met
Hoping that the process won’t get worse is an act of desperation and the least desirable option.
Figure 2 Minimum customer specification of 4.75 cannot be met with current process.
The average common mode response of an amplifier as shown in Figure 3 keeps shifting during the day, then repeats the next day. This was traced to the temperature inside the facility changing during the day in a spell of hot weather. Without some sort of chart, this effect would have been hard to diagnose. This effect was proven by taking the same units measured at the start of the shift and then measuring them later in the day and seeing that the units themselves showed the same shift. What the process was measuring was the temperature coefficient of the device. This problem may be more properly classed as a “measurement uncertainty issue”.
Figure 3 The shifting average common mode response of an amplifier due to changes in the facility’s temperature.
The bandwidth of a filter is measured and plotted in Figure 4. The bandwidth is adjusted by hand, expanding or compressing the coils of the inductors in the design. Everything looks okay, but looking at the measured values, the “local” average drifts somewhat over time. It turned out that these filters were built in the factory in batches of 20 each, and with that knowledge, the pattern can be seen in the chart. These batches might be better analyzed in those groupings and then compared between groupings.
Figure 4 Shifting patterns in filter bandwidth.
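One way to look at batch-to-batch differences is to average each batch separately and compare the averages. The sketch below assumes the measurements are stored in build order and that each run of consecutive readings forms one batch. The function name and the bandwidth numbers are made up for illustration, and the demo uses a batch size of 4 instead of 20 to keep it short.

```python
from statistics import mean


def batch_averages(values, batch_size=20):
    """Average each consecutive batch of `batch_size` readings."""
    return [
        mean(values[i:i + batch_size])
        for i in range(0, len(values), batch_size)
    ]


# Made-up bandwidth readings in build order, three batches of four.
bandwidth = [
    10.1, 10.2, 10.0, 10.1,   # batch 1
    10.4, 10.5, 10.3, 10.4,   # batch 2
    9.9, 10.0, 9.8, 9.9,      # batch 3
]

results = batch_averages(bandwidth, batch_size=4)
for number, avg in enumerate(results, start=1):
    print(f"Batch {number}: average bandwidth = {avg:.2f}")
```

The batch averages could then be charted in their own right to see whether the differences between batches are larger than the variation within a batch.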
One week the grouping is nice and well within specifications, and the next week it takes a giant step and becomes out of specification. This is the measurement of a 7805-type regulator in Figure 5, and the shift was caused by running out of one SMT tube of parts and the next tube used was from a second source. Both manufacturers were well within the +/-4% absolute output specification, but their wafer fab processes were operating at different center points when the parts were made. Nothing is “wrong” with either manufacturer’s parts, but you can see the result of the raw material part-to-part differences in your finished product measurements.
Figure 5 Bimodal groupings.
Here you have a decision to make: continue to track the actual measurements, or set the limits at the specified data sheet part limits.
I generally chart the data as it comes to me. Later, if I find myself continually chasing the limits around and there is no root cause to fix, it may be time to set the limits from the data sheet or calculated design values.
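A step change like the one between the two tubes of regulators shows up as a long run of points on one side of the center line. The sketch below counts the longest such run. Treating a run of eight or more as a signal is a commonly used detection rule and is an assumption here, not something stated in this article. The function name and the voltage readings are made up for illustration.

```python
def longest_run_one_side(values, center):
    """Length of the longest run of successive points on one side of center."""
    longest = current = 0
    previous_side = 0
    for value in values:
        # +1 when above the center line, -1 when below, 0 when exactly on it
        side = (value > center) - (value < center)
        if side == 0:
            current = 0
        elif side == previous_side:
            current += 1
        else:
            current = 1
        previous_side = side
        longest = max(longest, current)
    return longest


# Made-up output voltages: eight parts from one tube, then eight from another.
first_tube = [5.02, 5.03, 5.01, 5.02, 5.03, 5.02, 5.01, 5.03]
second_tube = [4.97, 4.96, 4.98, 4.97, 4.96, 4.98, 4.97, 4.96]
readings = first_tube + second_tube

center = sum(readings) / len(readings)
run = longest_run_one_side(readings, center)
RUN_LENGTH_SIGNAL = 8  # assumed threshold (a commonly used run rule)

print(f"Longest run on one side of the average: {run}")
print("Possible shift in the process" if run >= RUN_LENGTH_SIGNAL else "No run signal")
```

A check like this only says that something changed. Finding out that the cause was a second-source part still takes the kind of investigation described above.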
There is always a way
What if your production is sporadic or infrequent? How can you chart that? It turns out that Wheeler also wrote another book, “Short Run SPC”, where he covers how to chart and monitor these types of processes.
There are also examples of bartenders implementing these processes to improve the accuracy of their drinks, etc. It may not always work, but looking at and thinking about actual data presented to you in an XmR chart is never a wasted effort. It beats the alternative of simply having feelings about a process.
Not the whole story
Quality control is also not the only important issue in a company; rather, it is on equal footing with other issues such as sales, profit, manufacturing capacity, ethics, etc. This is borne out by the fact that many quality award winners in the past have subsequently gone out of business, just as many industry leaders have gone out of business. Much of this will be out of your control, but in my experience, applying an XmR chart to your daily analysis of what you can control will make life much better for you because, even if no one listens to you, you will know what your process is capable of, and not just be guessing. This is a real personal chaos reducer.
I have posted my Octave (an open-source MATLAB clone) and Python scripts, which easily generate XmR charts from CSV file data, on GitHub for anyone interested in using them. See the scripts link in the references below.
Box: How to make an XmR chart
Start with some measurement of something. Here I have five measurements of a “widget”,
Measurements = 1.1, 1.0, 1.3, 0.8 and 0.9
The moving range (mR) is derived by finding the absolute value of the difference between the first and second, second and third, third and fourth measurements, etc. The mR is never negative, as it is the absolute value of the difference between successive measurements.
Moving Range = 0.1, 0.3, 0.5, 0.1
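The same subtraction can be written in a few lines of Python. This is a minimal sketch (the function name is hypothetical) using the five widget measurements above. The results are rounded to two decimals only to hide floating-point noise.

```python
def moving_ranges(measurements):
    """Absolute difference between each measurement and the one before it."""
    return [abs(later - earlier)
            for earlier, later in zip(measurements, measurements[1:])]


measurements = [1.1, 1.0, 1.3, 0.8, 0.9]
mr = [round(r, 2) for r in moving_ranges(measurements)]
print(mr)  # [0.1, 0.3, 0.5, 0.1]
```

Note that five measurements give only four moving ranges, since the first measurement has no predecessor.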
Plot the values (Wheeler’s book has some nice blank charts that you can copy and use), but anything works, even a piece of graph paper, as shown in Box Figure 1.
Box Figure 1 You don’t have to have a computer to make an XmR control chart.
To even think about calculating the limits, you need to start with at least 5 values. Start by calculating the average of the moving range (R),
R = (0.1 + 0.3 + 0.5 + 0.1) / 4 = 0.25
the upper control limit on the range (UCLr) is given by,
UCLr = 3.268 * R
for our data,
UCLr = 3.268 * 0.25 = 0.82
The UCLr value should be plotted on the mR chart. Calculate the average value of the measurements (X),
X = (1.1 + 1.0 + 1.3 + 0.8 + 0.9) / 5 = 1.02
To compute the measurement upper and lower control limits (UCL, LCL) use the following formulas,
UCL = X + (2.66 * R)
LCL = X – (2.66 * R)
for our data,
UCL = 1.02 + (2.66 * 0.25) = 1.685
LCL = 1.02 – (2.66 * 0.25) = 0.355
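The limit calculations in this box can be collected into one small function. The sketch below is an illustration only (the function and key names are hypothetical) and carries the unrounded average of 1.02 through the arithmetic. The constants 3.268 and 2.66 are the ones given in the formulas above.

```python
def xmr_limits(measurements):
    """Compute the XmR chart center lines and control limits."""
    ranges = [abs(b - a) for a, b in zip(measurements, measurements[1:])]
    r_bar = sum(ranges) / len(ranges)               # average moving range (R)
    x_bar = sum(measurements) / len(measurements)   # average measurement (X)
    return {
        "X": x_bar,
        "R": r_bar,
        "UCLr": 3.268 * r_bar,        # upper limit for the mR chart
        "UCL": x_bar + 2.66 * r_bar,  # upper limit for the X chart
        "LCL": x_bar - 2.66 * r_bar,  # lower limit for the X chart
    }


limits = xmr_limits([1.1, 1.0, 1.3, 0.8, 0.9])
for name, value in limits.items():
    print(f"{name} = {value:.3f}")
```

For the five widget measurements this gives X = 1.020, R = 0.250, UCLr = 0.817, UCL = 1.685 and LCL = 0.355.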
Now plot all the data and limits together (Box Figure 2).
Box Figure 2 The completed XmR chart plotted using my Python Script. Although you don’t need a computer to make these graphs, it sure does look prettier if you do.
The upper graph consists of:
- Blue line = Measured data connected by lines
- Green line = The calculated UCL
- Dashed orange line = The calculated X
- Red line = The calculated LCL
The bottom graph is the moving range plot. It consists of:
- Blue line = Range data connected by lines
- Green line = The calculated UCLr
- Dashed orange line = The calculated R
Box: Add customer specifications
Sometimes in a presentation setting, it is important to plot the customer specifications on an XmR chart, just to graphically show the actual situation. No one cares if the process is producing parts that are well within specification, but things get more interesting when it can be shown that the process today cannot meet the current customer specifications.
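Before drawing the specification on the chart, a simple numeric check is to compare the customer limit with the natural process limits. The sketch below is illustrative only: the function name and the gain readings are made up, and the 4.75 minimum is borrowed from the Figure 2 discussion. It uses the same 2.66 factor as the box above.

```python
def lower_spec_inside_limits(measurements, lower_spec):
    """True when a minimum customer specification lies inside the natural
    process limits, meaning the process as it stands can be expected to
    produce some units below the specification."""
    ranges = [abs(b - a) for a, b in zip(measurements, measurements[1:])]
    r_bar = sum(ranges) / len(ranges)
    x_bar = sum(measurements) / len(measurements)
    lcl = x_bar - 2.66 * r_bar
    ucl = x_bar + 2.66 * r_bar
    return lcl < lower_spec < ucl


# Made-up gain readings for illustration.
gain = [5.0, 4.8, 5.2, 4.9, 5.1, 4.7, 5.0]

if lower_spec_inside_limits(gain, lower_spec=4.75):
    print("Specification is inside the process limits: some units may fail")
else:
    print("Specification is outside the process limits")
```

With these made-up readings the lower process limit works out to roughly 4.16, so a 4.75 minimum sits inside the limits and the first message is printed.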
Note: The data for Figures 3 and 5 were re-created from past experience, as I no longer have the original data.
 Wheeler, Donald J, “Understanding Variation: The Key To Managing Chaos”, 1993, SPC Press, Knoxville, TN, ISBN: 0-945320-35-3
 Dr. W. Edwards Deming, https://en.wikipedia.org/wiki/W._Edwards_Deming
 More information on the Shewhart Control Chart, https://en.wikipedia.org/wiki/Shewhart_individuals_control_chart
 Wheeler, Donald J, “Short Run SPC”, 1991, SPC Press, Knoxville, TN, ISBN: 0-945320-12-4
 Python and Octave scripts can be found at, https://github.com/Hagtronics/statistics-scripts
—Steve Hageman has been a confirmed “Analog-Crazy” since about the fifth grade. He has had the pleasure of designing op-amps, switched-mode power supplies, gigahertz-sampling oscilloscopes, Lock In Amplifiers, Radio Receivers, RF Circuits to 50 GHz and test equipment for digital wireless products. Steve knows that all modern designs can’t be done with Rs, Ls, and Cs, so he dabbles with programming PCs and embedded systems just enough to get the job done.