Electrostatic Discharge
 
10.1 INTRODUCTION
This chapter reviews the
problem of electrostatic discharge (ESD) as it affects wireless sensor network
nodes, and the importance of proper electrical, mechanical, and product design
in its control. Similar to electromagnetic compatibility (EMC) problems, ESD
problems often require a multi-disciplinary approach for proper resolution.
Unlike most EMC problems, however, ESD problems can be difficult to identify in
the field; often, the first indication of an ESD weakness in a design is
sporadic reports of "dead" nodes not attributable to other causes, or reports of
very specific failure types, such as corruption of certain register contents in
a microcontroller. ESD problems are difficult enough to detect in wireless
sensor networks designed for consumer and home automation applications, where an
ESD event may be instigated by a charged individual, but become even more
troublesome in networks for industrial, agricultural, and military applications,
many of which operate essentially autonomously, without direct human contact.
Lack of human contact may lower the overall failure rate, but can result in a
lack of information on the failures that do occur. It is important, therefore,
to understand the ESD problem, so that it may be "designed out" of the finished
product.
10.2 THE PROBLEM
10.2.1 Examples
Most people are familiar with static electric discharges on
cold winter days or other especially dry conditions. These discharges, although
only unpleasant to humans, may be fatal to modern electronic products; further,
weak discharges that are undetectable to humans can still damage electronic
products.[1] What follows
is a selection of ESD-related events associated with consumer electronic
products, illustrating the importance of the problem and the variety of guises
in which it may appear.
-
A new product was developed one spring, and volume shipments
began the following summer. That winter, however, field failures, in the form of
random access memory (RAM) errors, were noted in temperate regions. The problem was believed to be ESD-related,
and became serious when the RAM errors could not be duplicated in the
manufacturer's laboratories, because the problem could not be corrected until
the product was made to fail under laboratory conditions. Although many
design improvements were tried in the field, the field failures did
not stop until spring. It became apparent that the ESD test procedure as
performed by the manufacturer at that time was inadequate, and a task force was
organized to identify improvements. The recommendations of the task force,
including an environmentally controlled ESD testing room and a new testing
procedure based on internationally accepted standards, were adopted after the
new test accurately duplicated not only the field failure of this product (which
was cured by a bypass capacitor on a RAM integrated circuit [IC]), but also the
warranty failures of several other products. (ESD had not even been considered
as a possible root cause of the failures in the other products.) By the next
winter, products of modified design were in the field, and there was no seasonal
rise in ESD-related failures. The cost of the new ESD testing room and equipment
was more than repaid by the improvement in the manufacturer's reputation among
its customers.
-
During development of a product, several problems were
uncovered during ESD testing. It was noted that the switching voltage converter
IC on a microcontroller circuit board would go into a test mode when an ESD
discharge reached a circuit board runner through a gap in the housing around the
"5" key on the product's keypad. Adding a capacitor and a diode to the runner
cleared this problem. It was next noticed that the RAM locations storing
trim values for analog circuits would become corrupted when a discharge occurred
on the serial port contacts used to program the microcontroller, through another
housing hole. Placing a diode between the two contacts and placing spark gaps
(pointed circuit board runner shapes) between the runners to the contacts fixed
the RAM problem; instead, a discharge to the serial port contacts would now
cause the microcontroller to reset. Making the serial port runners thin to
increase their inductance, and modifying the ground metal beneath them, fixed
this problem.
Now, after a discharge to the serial port contacts, the
product's real-time clock (RTC) appeared to stop, but the product would
continue to work normally in every other way. It was found that the
microcontroller had two programming bits that controlled the RTC increment rate.
The RTC could be set to increment once per second, once per minute, or once
per hour, or not to increment at all. The ESD event was somehow corrupting these
bits so that the RTC would not increment and would, therefore, appear to stop.
Working with the designers of the microcontroller, the product engineer assigned
to this problem determined that the RTC was supplied power through a particular
supply pin, the Vrr pin. A 150-Ω resistor in series with the Vrr pin
(made possible by the very low current drain of the RTC) fixed the problem. To speed
testing, special product firmware was written during this part of the
investigation that constantly checked the value of the sensitive bits. If a
change were noted, a light-emitting diode (LED) on the product would turn
on.
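The monitoring firmware described above can be illustrated with a short simulation. This is only a sketch of the idea, written in Python rather than the microcontroller's own firmware language; the register layout, the bit mask, the class name RateBitMonitor, and the decision to latch the LED once it has been lit are all assumptions made for illustration and are not taken from the original product.

```python
# Simulation of a diagnostic loop that watches ESD-sensitive control bits.
# All names and the register layout below are hypothetical.

RTC_RATE_MASK = 0b00000011  # assumption: the two rate-select bits are the low bits


class RateBitMonitor:
    """Repeatedly compares the sensitive bits against their known-good value."""

    def __init__(self, read_register, expected_bits):
        # read_register: callable returning the current register value (int)
        self._read = read_register
        self._expected = expected_bits & RTC_RATE_MASK
        self.led_on = False

    def poll(self):
        """One pass of the check; returns the LED state after the check."""
        current = self._read() & RTC_RATE_MASK
        if current != self._expected:
            # Latch the LED (an assumption) so a brief upset is not missed
            # by the person applying the discharges.
            self.led_on = True
        return self.led_on


if __name__ == "__main__":
    register = {"value": 0b10100001}  # simulated control register
    monitor = RateBitMonitor(lambda: register["value"], expected_bits=0b01)
    print(monitor.poll())             # bits still match the expected value
    register["value"] ^= 0b00000010   # simulate a discharge flipping one rate bit
    print(monitor.poll())             # change detected, LED latched on
```

Masking the register before the comparison means that changes to unrelated bits do not light the LED, so the indicator responds only to the bits under investigation.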
10.2.2 Failure Modes
Product failures resulting from ESD events may take many
forms. They can, however, be placed into two general categories, long-term and
short-term failures, depending on the time scale of the failure. Short-term
failures are generally found immediately after the ESD event. Long-term failures
take time, perhaps even months or years, to become apparent.
-
Long-term failures and latent defects.
Long-term failures, by definition, do not become apparent until some time after
the ESD event.[2] The
most common long-term failure involves CMOS ICs, which may have a shorter
operating life after an ESD event, due to gate oxide degradation. The noise
figure of radio frequency (RF) transistors may also increase over time after ESD
exposure. It has also been demonstrated that ESD events during manufacturing
processes produce latent defects in Schottky diodes used in microwave
transceivers, leading to a loss of sensitivity after a few months of
operation.[3]
-
Short-term failures. Short-term
failures become apparent immediately following the ESD event. They can have a
confusing, application- or market-specific taxonomy; in particular, the
distinction between catastrophic ("hard") and noncatastrophic ("soft")
failures can be vague. Although one may debate the classification of a
particular product failure, the point to remember is that all failures are
failures, from a quality and customer satisfaction point of view (the only point
of view that matters). It is rare to find a user interested in entering into a
lengthy discussion with a manufacturer on the precise categorization of a
product failure due to ESD.
-
Catastrophic failures. A catastrophic
failure is, simply put, any failure from which the network node cannot be made
to recover by action taken over the air or at the user interface. This includes
transceiver parametric failures, such as loss of receiver sensitivity, and
failure of any internal sensors and actuators. Typically, catastrophic failures
involve physical component damage, and require the replacement of hardware
(e.g., integrated circuits) to return the network node to its proper operating
condition. These types of failures are those most commonly encountered when
reading the ESD literature.
-
Noncatastrophic failures.
Noncatastrophic failures, also called system upsets, include such transient
phenomena as inadvertent turn-off, improper operation of the transceiver
(partial or complete transmission of inappropriate messages, for example),
improper operation of any internal sensor or actuator, low battery indication,
or system reset (which can include loss of microcontroller state, loss of
volatile memory, and system turn-off). Of these events, system
turn-off is particularly troublesome. If the user (or network, if automatic
network monitoring is in place) does not notice the ESD event and its effect on
the node, the user may not turn the node back on and so miss future messages.
Because such future messages could inform the user of a fire alarm, a
breaking window, or another safety-related event, the loss of future messages may
not be "noncatastrophic." Even if the wireless sensor network is not performing
such a critical function, if the node suffering the ESD failure is the master of
a star network, or performing some other centralized function, loss of the node
can lead to a failure of the network. Noncatastrophic failures should be
taken just as seriously as catastrophic failures.
-
Data loss. Loss of data falls in the
gray area between catastrophic and noncatastrophic failure. The loss of messages
queued for transmission, unprocessed sensor data, or received data not stored in
nonvolatile memory, although unfortunate and undesirable, probably falls in the
"noncatastrophic" category. The loss of factory tuning values for the
transceiver or calibration values for a sensor, which renders the network node
inoperable, probably falls in the "catastrophic" category. The loss of security
keys needed to communicate with other network nodes via symmetric-key
cryptography may or may not be considered catastrophic, depending on whether or
not a mechanism exists by which the user may enter keys into the node. (A
failure of a memory chip itself, so that it is no longer capable of storing
information, is clearly a catastrophic
failure.)
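The working definition of a catastrophic failure given above (one from which the node cannot be made to recover by action taken over the air or at the user interface) can be written down as a small decision rule. The following is a minimal sketch under that definition only; the type and field names are hypothetical, and, as noted above, real failures often fall into a gray area that a two-valued rule cannot capture.

```python
# Sketch of the short-term failure classification rule; all names are hypothetical.
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CATASTROPHIC = "catastrophic"
    NONCATASTROPHIC = "noncatastrophic"


@dataclass(frozen=True)
class ShortTermFailure:
    description: str
    recoverable_over_air: bool           # can the network restore the node?
    recoverable_at_user_interface: bool  # can the user restore the node?


def classify(failure):
    """Catastrophic means no recovery path exists over the air or at the UI."""
    if failure.recoverable_over_air or failure.recoverable_at_user_interface:
        return Severity.NONCATASTROPHIC
    return Severity.CATASTROPHIC


if __name__ == "__main__":
    examples = [
        ShortTermFailure("factory tuning values lost", False, False),
        ShortTermFailure("security keys lost, user can re-enter keys", False, True),
        ShortTermFailure("security keys lost, no key-entry mechanism", False, False),
    ]
    for failure in examples:
        print(failure.description, "->", classify(failure).value)
```

The security-key case shows why the same physical event can land in either category: the outcome depends on whether the product provides a recovery path, not on the discharge itself.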