Alarms: Behind the Red Screen

Aug. 31, 2009
When operators fail to follow or actively dismantle “Andon” devices, it is usually a sign of serious systemic problems. Here’s how to keep those problems from resulting in recalls, or worse.

Recent news reports have alleged that operators at the generic drug manufacturer Mylan Labs in Morgantown, West Virginia, had been regularly bypassing “Red Screen” alarms designed to indicate when tablets or capsules might fail to meet hardness, thickness, or weight requirements, and thus fail to deliver the correct dose of medication to patients.

Whether or not these specific allegations prove to be true, they point out serious manufacturing issues that can, and often do, exist in any manufacturing plant in any industry. 

Sadly, most of the discussion surrounding such scenarios fails to address root causes and effective action. Companies generally respond with management changes (which have reportedly already begun at Mylan). Management blames workers, workers blame the production equipment, and the production equipment designers insist that their product is sound. New “band-aid” policies and procedures are then typically implemented, covering over serious wounds without healing them.

This article will briefly examine why problems like this occur, and what actions should be taken to make meaningful changes while preventing future repetitions of the same mistakes.

First, let’s consider the reasons why production workers may ignore alarms:

  • The alarms do not provide adequate warning.
  • Workers are distracted.
  • Operators are subject to so many alarms that they can’t prioritize actions.
  • The required action is ambiguous or improperly understood.

Intentionally circumventing Red Screens or warnings that stop production, however, is a far more serious problem than unintentionally ignoring warnings.

Except for sabotage and revenge, workers will intentionally ignore or circumvent warnings when:

  • They know or believe that the sensors generating the warnings are faulty.
  • They believe that warnings are not important, and that the product is still fit for the intended purpose.
  • They are rewarded for ignoring the alarm.

Since each of these can and should be prevented in the design of the production equipment, some elaboration is helpful.

Faulty Sensors

Sensor accuracy and precision can degrade as the sensor ages, and sensors can fail. When most Red Screen warnings are traced to faulty sensors, workers are trained to believe that warnings are useless. Because of the critical nature of “Red Screen” warnings, it is unacceptable to allow faulty gauges to generate these warnings.

There are several ways to prevent this problem. The first is preventative maintenance, in which the life of each sensor has been characterized or its performance tracked, and the sensors are replaced long before they generate faulty readings.

Where preventative maintenance is not adequate, redundant gauges should be used with voting, as NASA does in its space missions. With only two sensors, a disagreement reveals that something is wrong but not which gauge is at fault; meaningful voting generally requires at least three gauges. Where only two sensors are used and one indicates an out-of-tolerance condition, the first warning and action should therefore be to replace one or both sensors rather than to raise a Red Screen. Ideally, equipment should be designed so that sensors can be safely and reliably changed while the equipment is operating.

A third strategy is to select and use sensors, sensor configurations, and circuitry that enable active sensor monitoring, detecting faulty conditions such as a ground fault, a broken circuit, or a change in sensor performance. Naturally, all of these strategies can be combined where needed.
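
As a concrete illustration, here is a minimal sketch of how redundant gauges, simple voting, and active sensor monitoring might be combined. The electrical limits, tolerance band, and agreement threshold are assumed values for illustration only, not settings from any particular press or control system.

```python
# Illustrative sketch only: all limits below are assumed values.
RAW_MIN, RAW_MAX = 0.5, 4.5    # plausible electrical range of the sensor signal
AGREE_TOL = 0.05               # maximum allowed disagreement between gauges
SPEC_LO, SPEC_HI = 0.95, 1.05  # product tolerance band (normalized weight)

def sensor_is_healthy(raw):
    """Active monitoring: a reading outside the electrical range points to a
    ground fault or broken circuit, not to a bad product."""
    return RAW_MIN <= raw <= RAW_MAX

def vote(readings):
    """Use the median of the healthy gauges; flag gauge problems separately
    from product problems so that faulty sensors never trigger a Red Screen."""
    healthy = sorted(r for r in readings if sensor_is_healthy(r))
    if len(healthy) < 2:
        return ("REPLACE_SENSOR", None)        # measurement cannot be trusted
    if healthy[-1] - healthy[0] > AGREE_TOL:
        return ("REPLACE_SENSOR", None)        # gauges disagree: fix the gauges first
    value = healthy[len(healthy) // 2]         # median of the agreeing gauges
    if not (SPEC_LO <= value <= SPEC_HI):
        return ("PRODUCT_ALARM", value)        # confirmed out-of-tolerance product
    return ("OK", value)

print(vote([1.00, 1.01, 0.99]))  # ('OK', 1.00)
print(vote([1.00, 1.01, 4.80]))  # ('OK', 1.01): the failed gauge is excluded
print(vote([1.00, 1.20, 0.99]))  # ('REPLACE_SENSOR', None): gauges disagree
```

In a real system, the “replace sensor” path would notify maintenance while production continues on the remaining gauges, reserving Red Screens for confirmed product problems.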

Overuse of Alarms

Invalid warnings are worse than no warning at all. Consider the case where the response time of the sensor is longer than the available measurement time. Such sensors, even if perfectly calibrated, may produce out-of-tolerance warnings when careful measurement of the product after the fact shows that all of it is within specification. The product may be perfectly fit for use despite the “Red Screen” warning.

How production performance is monitored turns out to be critical, and simple changes can provide more effective control without requiring frequent or immediate shutdowns of the process. Consider the case where one cavity in a frame or wheel is not being properly filled, so that the weight of pills formed in this cavity is only 60% of the target weight. If 100 pills are weighed at a time to check conformance, this defect may never be detected, yet it is a far more serious problem than every pill being 0.4% under the nominal target weight.
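
A quick check of the arithmetic, using an assumed nominal weight of 250 mg purely for illustration, shows why batch weighing cannot distinguish these two very different situations:

```python
# Assumed 250 mg nominal weight; the point is the batch averages, not the units.
target = 250.0
bad_cavity = [target] * 99 + [0.60 * target]  # one cavity producing 60%-weight pills
all_light  = [0.996 * target] * 100           # every pill 0.4% under target

print(sum(bad_cavity) / 100)  # 249.0 mg: batch average is 0.4% under target
print(sum(all_light) / 100)   # 249.0 mg: an identical batch reading
```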

On the other hand, if the entire process is shut down every time there is a problem with a single cavity, the line will be down regularly.

Instead of generating a “Red Screen” alarm for out-of-tolerance conditions, it is far better to detect and remove each defective item without shutting down the line, unless the defect fraction becomes excessive. For example, passing pills or capsules over an air stream may separate over- and underweight parts from acceptable product without requiring an interruption of production.
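
A minimal sketch of this per-item strategy follows. The target weight, tolerance, window size, and escalation threshold are assumed values, and the reject and alarm functions are placeholders for whatever actuator and alarm interface a line actually uses.

```python
from collections import deque

TARGET_MG = 250.0            # assumed nominal pill weight
TOLERANCE = 0.05             # assumed +/-5% acceptance band
WINDOW = 500                 # pills in the rolling defect-rate window
MAX_DEFECT_FRACTION = 0.02   # escalate only if >2% of recent pills are rejected

recent_results = deque(maxlen=WINDOW)

def divert_to_reject_bin():
    print("reject: air jet diverts the defective pill")  # placeholder actuator

def raise_red_screen(fraction):
    print(f"Red Screen: {fraction:.1%} of the last {WINDOW} pills out of spec")

def inspect(weight_mg):
    """Remove each defective pill individually; stop the line only when the
    rolling defect fraction shows a systemic problem."""
    ok = abs(weight_mg - TARGET_MG) <= TOLERANCE * TARGET_MG
    recent_results.append(ok)
    if not ok:
        divert_to_reject_bin()
    defect_fraction = 1 - sum(recent_results) / len(recent_results)
    if len(recent_results) == WINDOW and defect_fraction > MAX_DEFECT_FRACTION:
        raise_red_screen(defect_fraction)

inspect(150.0)  # 60% of target: the pill is rejected, but the line keeps running
```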

Lessons From Potato Chip Makers

Food processors, and notably potato chip makers, often weigh or sense features on each individual item and then remove each defective item individually. For example, a jet of air combined with sensors can remove every overcooked or blemished potato chip, without warnings and without stopping lines that produce hundreds of millions of chips every day.

Potentially the worst quality violation is creating an environment that rewards workers for bypassing warnings, in direct contradiction of stated policies and procedures. Workers, supervisors, and managers who are paid for meeting production goals will bypass warnings they believe are causing no real harm if they are personally penalized for stopping production to fix quality problems.

The higher the penalty for following specified actions, the stronger the incentive is to bypass the procedures. In many cases the rewards may not even be monetary, but can nevertheless be very real and powerful.

Consider the case of an experienced operator who has repeatedly and successfully fixed a problem identified by a specific Red Screen alarm. If the operator can execute the corrective action but bypasses the alarm procedures, he or she does not have to wait for the quality assurance inspector to come and review the changes, and thus avoids a shutdown. In such cases, fixing the quality problem becomes easier than following the procedures, and is a less stressful and painful experience for the operator.

Determining True Root Causes and Preventive Actions

These examples explain why workers circumvent alarms, but they don’t explain the root causes. To prevent such scenarios, we need to dig a bit deeper, as will be discussed in the following topics.

Absence of Mistake-Proofing

Regulatory bodies such as the FDA define requirements for the safety and protection of society. Unfortunately, the procedures imposed on the plant floor reflect the corporate interpretation or translation of those laws and rules, and they may add burdens to production that are not required and fail to add value. Workers may routinely bypass procedures and warnings when the procedures are burdensome and fail to offer a clear benefit in the mind of the worker.

Need for Cross-Training

There is nothing that prevents operators from being cross-trained for quality assurance. When a problem occurs, the most skilled responder can implement corrections while an experienced associate in the same area validates the corrective actions and the outcome. Because of their cross-training and familiarity with the equipment, operators are generally less likely to make an error in validating corrective actions than an individual from the quality assurance organization. The key point is that independent validation by trained individuals can be achieved in more than one way.

Sadly, even the best human inspection is very imperfect; it would take at least six successive checks by quality assurance personnel to obtain the same level of quality control that can be achieved by mistake-proofing the corrective actions! When companies demonstrate that mistake-proofing achieves better results than successive or independent checks, regulators will accept the alternative approach. In general, we have not tried this approach because we do not understand it well, reverting instead to our traditional solution paradigms. Mistake-proofing can be used to prevent most Red Screen conditions as well as to improve the response to these events.
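
The “six checks” figure presumably rests on an assumption about how effective a single human inspection is. A rough illustration, assuming purely for the sake of the arithmetic that one independent check catches about 85% of defects (published estimates of human inspection effectiveness vary widely):

```python
# Illustration only: the 85% single-check effectiveness is an assumed figure.
single_catch_rate = 0.85
for n in range(1, 7):
    escape = (1 - single_catch_rate) ** n  # fraction of defects missed by n checks
    print(f"{n} check(s): about {escape * 1e6:,.0f} escapes per million defects")
# Under these assumptions it takes six successive, independent checks before the
# escape rate falls to roughly ten per million defects.
```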

Need for Operator Input

During the late 1980s, everyone proclaimed that they were doing “concurrent engineering,” but most of the products used in this country are developed in very different ways. For example, at a time when Toyota was able to go from product concept to production in 27 months, GM took 8 to 10 years. The best examples of concurrent engineering come from the U.S. during World War II: the first Jeep reached production in a few months, and the P-51 Mustang, a far more complex product, reached production in roughly a year.

Nuclear weapons, requiring new technology of remarkable sophistication, were delivered to targets in two years. To achieve this, the buildings for the Y-12 plant at Oak Ridge, Tennessee, were being constructed before the design of the uranium separation equipment that went in the buildings was completed. It would have been impossible to successfully achieve these remarkably short development cycles without the intense participation of users and equipment operators!

Any operator of continuous flow equipment can predict that frequent shutdowns requiring the intervention of a separate organization to continue or restart production will be a production nightmare. Unfortunately, most equipment designers fail to involve operators in the design process, do not understand their needs, or even worse, disregard their input.

At Toyota, the operator is the most important person in the production planning process. Operators define the assembly sequences, assembly methods, and man-machine interfaces, and it is the responsibility of the industrial engineers and equipment designers to support these workers. This is a key concept that enables production to move forward concurrently with design, while preventing major problems in production startups. Naturally, the most skilled operators must be involved to obtain the best results.

Misunderstanding of Andon

In order to learn Lean production methods, GM entered into a joint venture with Toyota at the NUMMI facility in Fremont, California. GM engineers worked side by side with Japanese engineers at the facility, where American workers assembled the Corolla, the Tacoma, and the Chevy Prizm. After several years of observing Toyota’s production system, GM attempted to implement these methods in the Saturn startup, which, among other features, included Andon cords that enable operators to stop production when problems occur. Red Screens are nothing more than automated Andon: the equipment simply “pulls the cord” instead of an operator.

At Saturn, production stopped immediately when Andon cords were pulled, resulting in frequent disruptions that collectively had a major impact. On the other hand, the line hardly ever stops at NUMMI. It took GM five years after the Saturn startup to recognize that the production line does not stop immediately at NUMMI when an Andon cord is pulled! As soon as the cord is pulled, an alarm brings the supervisor and every available cross-trained worker who is not otherwise occupied to the problem; typically, five to 10 workers can help. They have 60 seconds to resolve the problem while the line keeps moving, and most problems are fixed before the line stops.
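
A minimal sketch of that delayed-stop behavior is shown below. The 60-second window comes from the description above; the class and function names are illustrative, not taken from any real line controller.

```python
import threading

RESPONSE_WINDOW_S = 60  # seconds the team has to fix the problem before the line stops

def summon_team(station):
    print(f"Andon: supervisor and cross-trained workers called to station {station}")

def stop_line(station):
    print(f"Line stopped: station {station} problem not resolved in time")

class AndonStation:
    def __init__(self, station_id):
        self.station_id = station_id
        self._resolved = threading.Event()

    def pull_cord(self):
        """Summon help immediately; the line keeps moving and stops only if the
        problem is not cleared within the response window."""
        self._resolved.clear()
        summon_team(self.station_id)
        threading.Timer(RESPONSE_WINDOW_S, self._stop_if_unresolved).start()

    def problem_cleared(self):
        self._resolved.set()  # fixed while the line kept moving: no stoppage

    def _stop_if_unresolved(self):
        if not self._resolved.is_set():
            stop_line(self.station_id)
```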

As described, the Red Screen deployments alleged to have been put in place at Mylan appear to miss every key concept of correct line stoppage:

  • Warnings should be given before shutdown conditions are reached (a minimal sketch of this two-tier approach follows this list).
  • When a warning is given, it must provide clear, specific indicators of where corrective action is needed (e.g., rather than simply indicating that weight is out of spec, it must show where action must be taken to correct the weight).
  • Where possible, it should indicate what corrective action is needed.
  • Operators need to be sufficiently qualified and cross-trained so that they can take the needed corrective actions immediately and independently.
  • The operators must be authorized to take these actions.
  • Every available worker must help solve the problem to avoid a shutdown.
  • Where possible, the corrective actions should be mistake-proofed so they can be completed error-free without the oversight of a separate organization.
  • Corrections can be, and are, made safely while the production line is moving.
  • Advance warning of conditions that will lead to a shutdown allows sufficient time to make the vast majority of corrections before the Red Screen shutdown state is reached.
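
A minimal sketch of the first and last points, using a hypothetical tablet-hardness check with assumed limits, shows how a warning band inside the shutdown limits gives the operator both time and a location to act:

```python
SPEC_LO, SPEC_HI = 9.0, 11.0  # assumed Red Screen (shutdown) limits, in kiloponds
WARN_MARGIN = 0.25            # assumed warning band inside the spec limits

def classify(hardness_kp, station):
    """Warn early, name the station, and reserve the Red Screen for product
    that is actually out of spec."""
    if not (SPEC_LO <= hardness_kp <= SPEC_HI):
        return f"RED SCREEN: station {station} hardness {hardness_kp} kp is out of spec"
    if hardness_kp < SPEC_LO + WARN_MARGIN:
        return f"WARNING: station {station} hardness trending low; adjust now"
    if hardness_kp > SPEC_HI - WARN_MARGIN:
        return f"WARNING: station {station} hardness trending high; adjust now"
    return "OK"

print(classify(9.1, station=7))  # early warning while the product is still in spec
print(classify(8.8, station=7))  # the shutdown limit has already been violated
```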

Allegations that Red Screen warnings may have been occurring as frequently as five times per shift at Mylan would suggest that the implementers did not understand the correct implementation of Red Screen controls any better than GM initially understood Andon at Saturn. If the controls are implemented correctly, the need for a shutdown and quality assurance intervention should be rare.

The Absence of Kaizen

Kaizen has been translated as continuous improvement, but we most frequently hear the term in the phrase “Kaizen Event.” Kaizen implies continuous activity, while an event is a discrete one, so pairing the two concepts is an oxymoron. We have taken our interpretation of Kaizen from the least effective implementers of this principle.

In the best Kaizen companies, the term refers to a small shop on the factory floor that is constantly building productivity-improvement and mistake-proofing devices. Rather than a series of staged events, this capability lets them fix and improve production problems continuously, rapidly making and deploying devices out of scrap that would normally be thrown away.

How effective is this approach? MIT’s International Motor Vehicle Study showed that the best Japanese plants reached their steady-state low defect rates one to one-and-a-half months after the startup of a new production line. In contrast, U.S. manufacturers were taking 12 months or more to reach their steady-state low defect rates [1].

Thus, top performers are fixing most of their quality and shutdown problems within roughly four weeks. This suggests that effective and permanent solutions to quality problems are being developed and deployed in a few days.

If any organization were encountering five Red Screen events per shift well into production, that alone would be proof that it has not been continuously mistake-proofing and solving its quality problems.

In organizations that solve problems efficiently, the Andon signal or Red Screen event triggers an immediate Kaizen activity every time to assure that the cause of the problem can never recur. However, in the U.S., rather than solving the problems we tend to change policies and procedures to make it appear that the problems have been resolved.

Lack of Jidoka

Jidoka has been translated as “Automation with a Human Touch” [2]. The use of sensors to detect non-conforming production conditions is an essential part of Jidoka that allows the operator to work while the machine works. However, sensors, controls, automation, and IT are not by themselves sufficient to create an environment with a “Human Touch.” Calling an implementation “user friendly” does not make it user friendly, any more than adding sensors for warnings or shutdowns creates automation with a human touch. To provide a human touch, the warnings, and the shutdowns when they are required, must help workers respond promptly and safely, just as the Andon cord summons help to the correct location.

Alarms that disrupt production, summon quality assurance personnel, and impose long delays before restart are painful. If this happens infrequently it can be tolerable, but when such shutdown conditions occur frequently it is extremely annoying and frustrating, and can be nearly inhumane.

Sadly, in U.S. manufacturing we generally have a very poor understanding of Lean product development and production, and it is the sum of many errors that results in broken production systems. Conversely, an outstanding implementation of any single capability can often overcome the absence of the others. To illustrate: with outstanding concurrent engineering, the implementation of Red Screens would be more effective; with good Kaizen, the production problems leading to Red Screens would be quickly resolved; and with a real understanding of the Andon principles, most of the factors that make responding to Red Screens difficult would be avoided.

Delays, defective sensors, frequent shutdowns, lost productivity, dependence on quality assurance response, criticism by coworkers, and a potential decrease in personal income provide strong motivations that train workers to bypass Red Screen alarms, which is the worst possible production habit with the most serious potential consequences. In contrast, mistake-proofing, concurrent engineering, Andon, Jidoka, and Kaizen help workers prevent or solve problems so that the extremely painful consequences of a shutdown are avoided on all but the rarest of occasions.

Properly implemented, these techniques train workers to respond promptly to every warning and to find permanent, effective solutions, because doing so avoids the consequences of frequent shutdowns and produces safer, superior products, whereas bypassing warnings does neither. In addition, preventing shutdowns contributes to increased job satisfaction, self-confidence, and pride in the product and in the corporation. Thus, these techniques develop the best production habits while minimizing the likelihood of adverse production consequences.

References

  1. Clark, K., Fujimoto, T., Product Development Performance, Harvard Business School Press, Boston, 1991, p. 201.
  2. Hirano, H., Ed., JIT Factory Revolution, Productivity Press, Inc., Portland, Oregon, 1987, p. 134.

About the Author
C. Martin Hinckley is President of Assured Quality, Inc. in Perry, Utah. He can be reached at [email protected], or (888) 599-2100.
