# Accelerated testing for cosmic soft-error rate

by J. F. Ziegler
H. P. Muhlfeld
C. J. Montrose
H. W. Curtis
T. J. O'Gorman

J. M. Ross

This paper describes the experimental techniques which have been developed at IBM to determine the sensitivity of electronic circuits to cosmic rays at sea level. It relates IBM circuit design and modeling, chip manufacture with process variations, and chip testing for SER sensitivity. This vertical integration from design to final test and with feedback to design allows a complete picture of LSI sensitivity to cosmic rays. Since advanced computers are designed with LSI chips long before the chips have been fabricated, and the system architecture is fully formed before the first chips are functional, it is essential to establish the chip reliability as early as possible. This paper establishes techniques to test chips that are only partly functional (e.g., only 1Mb of a 16Mb memory may be working) and can establish chip soft-error upset rates before final chip manufacturing begins. Simple relationships derived from measurement of more than 80 different chips manufactured over 20 years allow total cosmic soft-error rate (SER) to be estimated after only limited testing. Comparisons between these accelerated test results and similar tests determined by "field testing" (which may require a year or more of testing after manufacturing begins) show that our experimental techniques are accurate to a factor of 2.

### Introduction

Integrated circuits are complex structures which occasionally fail. These failures are usually classified either as hard fails, which are permanent, or as soft fails, in which the circuit makes one or more intermittent mistakes. A system can be designed so that both of these are undetectable to the user (hard fails can be treated as repetitious soft fails). Detection and correction of either type of error are possible with extra circuitry. System architecture designers may meet desired reliability goals by either choosing low-SER components or including extra components to detect and correct errors.

It can take several years from the initial concept of a new LSI chip to the manufacture and sale of that chip. During this period, system architecture is developed on the basis of anticipated chip performance as predicted by modeling. The modeling of electronic performance is usually quite accurate. The modeling of chip reliability, especially with respect to soft fails due to alpha-particles or cosmic rays, has been used extensively within IBM [1]. This paper describes accelerated testing procedures which can establish the sensitivity of new LSI chips to radiation as soon as partially functional chips are available, which is often a year before final manufacturing. At the end of the paper we show the accuracy of these testing procedures by comparing the failure estimates based on these accelerated tests with actual field tests of chips.

All testing described here is performed with a low electronic noise background, and any possible synergy between electronic noise and cosmic-ray-induced fails is not evaluated.

©Copyright 1996 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

Table 1 Structure of this paper.

| Section |
|---------|
| Introduction |
| Experimental overview |
| Determining circuit cosmic SER |
| Assumptions of accelerated testing |
| Circuit recovery |
| Circuit access time |
| Low-energy SER threshold |
| Neutron vs. proton SER |
| Chip testing: Experimental conditions and procedures |
| • Beam energy |
| • Beam uniformity |
| Pictures of the beam distribution |
| Error distributions on chips |
| Beam dosimetry |
| Dosimetry: Using memory chips |
| Dosimetry: Using Faraday cups |
| Ionization chambers |
| Dosimetry: Individual proton counting |
| Thermoluminescent dosimetry |
| Scattering from double scintillators |
| • Chip orientation to the beam |
| Memory pattern arrays |
| • Loading memory arrays and interrogation |
| • Effects of operating voltage and chip temperature on SER |
| Experimental results |
| • Orientation effects |
| • Effects of nucleon energy on SER cross sections |
| • Low-energy threshold of SER cross sections |
| • Process variations on circuit SER cross sections |
| • Effect of chip temperature on SER cross sections |
| • Effect of pinching voltage on SER cross sections |
| Typical SER values for DRAM chips |
| • Predicting chip SER from limited experimental data |
| Experimental results on bipolar SRAM memory chips |
| Experimental results on DRAM memory chips |
| Experimental results on CMOS SRAM memory chips |
| Accelerated testing vs. field testing |
| • Field testing procedures |
| Read/write ratio for memories |
| • Comparison of accelerated testing vs. field testing |
| Notes and sources for failure reports |
| • Comparison of accelerated testing vs. field testing (non-IBM chips) |
| Experimental sites for SER studies |
| Harvard University Cyclotron Laboratory |
| Los Alamos Meson Production Facility |
| University of California at Davis Cyclotron |
| Brookhaven National Laboratory |
| University of Indiana Cyclotron Facility |
| References and notes |

The terrestrial cosmic SER of a circuit is caused primarily by energetic hadrons (neutrons, pions, or protons) interacting with circuit materials and creating a localized burst of electronic charge. The physics of this phenomenon is discussed in another paper in this issue, "Terrestrial Cosmic Rays" [2]. Mostly, these electronic bursts are due to nuclear reactions between the cosmic ray and a silicon nucleus which releases nuclear fragments that cause dense bursts of charge in the semiconductor. This charge burst interacts with the circuit and may cause a soft fail. No detectable permanent damage to the circuit occurs.

The flux of energetic cosmic ray nucleons at sea level is about  $10^5$  nucleons/cm<sup>2</sup>-yr, and this flux increases with altitude, up to 13 times at higher terrestrial sites (3000 m). Every sea-level LSI chip has  $10^4$ – $10^5$  energetic nucleons pass through it every year.

Accelerated testing for chip SER is accomplished by placing LSI chips in beams of nucleons with fluxes of about 10<sup>6</sup> nucleons/cm<sup>2</sup>-s, which accelerates the natural fail rate by a factor of 10<sup>8</sup>. Although cosmic ray fails are due to a mixture of neutrons, pions, and protons, routine testing is done only with protons, and the results are scaled to include neutron and pion components. This exclusive use of protons is necessary because there are no available sites with neutron or pion beams for routine circuit testing (which requires a wide range of beam energies, specific beam currents, and permission to conduct proprietary experiments).
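The acceleration factor quoted above follows from the two fluxes given in the text. The short Python sketch below only redoes that arithmetic; the seconds-per-year constant and the variable names are additions made here for illustration.

```python
# Fluxes are the values quoted in the text; the seconds-per-year constant is added here.
SECONDS_PER_YEAR = 365.25 * 24 * 3600      # about 3.16e7 s

sea_level_flux_per_yr = 1.0e5   # nucleons/cm^2-yr at sea level
beam_flux_per_s = 1.0e6         # nucleons/cm^2-s in the accelerator beam

# Put both fluxes on a per-second basis and take the ratio.
sea_level_flux_per_s = sea_level_flux_per_yr / SECONDS_PER_YEAR
acceleration_factor = beam_flux_per_s / sea_level_flux_per_s

# Years of sea-level exposure delivered by one second of beam.
equivalent_years_per_beam_second = beam_flux_per_s / sea_level_flux_per_yr

print(f"acceleration factor ~ {acceleration_factor:.1e}")
print(f"1 s of beam ~ {equivalent_years_per_beam_second:.0f} yr at sea level")
```

The ratio comes out at a few times 10<sup>8</sup>, consistent with the order of magnitude stated above; put another way, each second in the beam corresponds to roughly ten years of sea-level exposure.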

The study of soft fails in satellite electronic systems is very advanced. More than a thousand papers have been published on satellite SER in the last twenty years. We find that these papers have been of limited value for determining terrestrial SER values, since the radiation which most affects satellite electronics is very different from the particle flux which is found at sea level (see Reference [2]). However, some valuable measurements have been made for proton or neutron upset rates, in part because of interest in the upsets caused by neutrons from special weapons, and to explain measurements from upset rates in the Van Allen Belt from protons. Examples of such work may be found in References [3–15] or in review papers [16–18].

Author's note: This paper does not discuss the absolute SER fail rates of any identified chips, but instead uses arbitrary units to make comparisons.

The structure of this paper is outlined in Table 1.

### Experimental overview

A typical experimental setup is illustrated in Figure 1. All experiments are conducted in air to simplify electronic connections. (Protons at 150 MeV have a range of 143 m, and lose energy at 0.5 MeV/m in air.) The chip tester in the experimental room must be constructed using chips with very low SER sensitivity (a PC typically locks up within a minute from ambient radiation in the experimental room). Typically, this means using exclusively bipolar logic instead of CMOS logic chips. The sensitivity of chips under test may be dependent on their cycle time, i.e., the time between either read or write instructions, so the chip tester must be located near the chip under test in order to minimize electronic noise and to allow chip testing at >10-MHz rates. A preliminary experimental run with a dilute beam of particles is made to establish the approximate fail rate of the chips. The beam intensity of the particles is then set at a level such that the circuits will recover from one fail before the next fail occurs. The theoretical recovery time of LSI SRAMs, for example, is typically shorter than $10^{-7}$ s, so keeping fail rates below 10/s eliminates recovery problems.
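To see why a fail rate of 10/s leaves a wide margin against a recovery time of $10^{-7}$ s, one can estimate the chance that a second fail arrives before the circuit has recovered. The sketch below assumes that fails arrive as a Poisson process; that statistical model is an assumption added here, not something stated in the paper, and the function name is hypothetical.

```python
import math

def overlap_probability(fail_rate_per_s, recovery_time_s):
    """Chance that the next fail arrives within the recovery time.

    Assumes Poisson arrivals (an illustrative assumption):
    P = 1 - exp(-rate * recovery_time).
    """
    # expm1 keeps precision when rate * time is very small.
    return -math.expm1(-fail_rate_per_s * recovery_time_s)

# Numbers from the text: 10 fails/s and a recovery time of 1e-7 s.
p = overlap_probability(10.0, 1e-7)
print(f"probability of a fail during recovery: {p:.1e}")
```

Under this assumption only about one fail in a million would land inside the recovery window of the previous one.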

The chip is loaded by the tester with fixed patterns of ones or zeros, but special tests are also run such as putting a DRAM into all "physical-low" levels, or putting an SRAM into its "preferred-state" complement pattern, which evaluates the chip in its most sensitive state. These patterns are discussed in the section Memory pattern arrays. When any commercial chip is placed in the beam of protons for accelerated testing, some fail rate is observed. A real-time map of these fails is maintained, such as that shown in Figure 2. The map is constructed so that the displayed distribution follows the physical distribution on the chip. Figure 2 shows the bit map of a chip with an organization of  $1024 \times 9$  (circa 1984) which was laid out on the chip in a manner very close to that shown. A total of 388 fails was observed. The uniformity of the fail map indicates that the beam of nucleons probably is uniform over the chip area. However, even with a uniform beam there may be some fail pattern, since chips with edge-connectors often lose significant voltage into the chip, and devices farthest from the power inputs are more sensitive to radiation. Access to chip layout designs is essential to prevent misinterpretation of results. The time used to conduct the test shown was 210 seconds.

During the exposure, the integrated dose of the nucleon beam is monitored (see the section on beam dosimetry). When the chip exposure is completed, the total beam dose is recorded in units of nucleons/cm<sup>2</sup>. From these two experimental numbers, the total chip fails and the beam dose, an SER cross section is calculated:

$$\text{SER cross section (per bit)} = \frac{\text{fails/bit}}{\text{nucleon dose}} \tag{1}$$

or

$$\text{SER cross section (per chip)} = \frac{\text{fails/chip}}{\text{nucleon dose}} \tag{2}$$

The nucleon dose is measured in nucleons/cm<sup>2</sup>, so the units of the SER cross section are cm<sup>2</sup>, and must be identified as *per bit* [using Equation (1)] or *per chip* [using Equation (2)]. The physical unit, cm<sup>2</sup>, is an area, and that is why the result of the experiment is called a "cross section." The idea of a cross section arises if one imagines firing at a target; the probability of hitting it depends on the size of the target's area. In the same way, the probability of hitting a nuclear target is proportional to the size of its cross section.
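As a minimal sketch of Equations (1) and (2), the Python below turns a fail count and a beam dose into per-chip and per-bit cross sections. The fail count and chip organization are those quoted for Figure 2; the dose is a hypothetical placeholder, since the paper does not give the dose for that run, and the function name is an illustrative choice.

```python
def ser_cross_section(fails, nucleon_dose_per_cm2, n_bits=1):
    """SER cross section in cm^2: per chip when n_bits == 1, per bit otherwise."""
    if nucleon_dose_per_cm2 <= 0:
        raise ValueError("nucleon dose must be positive")
    return fails / n_bits / nucleon_dose_per_cm2

fails = 388                 # total fails quoted for the Figure 2 run
n_bits = 1024 * 9           # chip organization quoted for Figure 2
dose = 1.0e9                # nucleons/cm^2, HYPOTHETICAL value for illustration only

per_chip = ser_cross_section(fails, dose)            # Equation (2)
per_bit = ser_cross_section(fails, dose, n_bits)     # Equation (1)

print(f"per chip: {per_chip:.2e} cm^2")
print(f"per bit:  {per_bit:.2e} cm^2")
```

The two results differ only by the number of bits on the chip, which is why a quoted cross section must always say which of the two it is.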



### Figure 1

Typical experimental setup for accelerated testing of circuits. The proton beam enters from the left. All experiments are conducted in air, so the beam exits the vacuum pipe from the accelerator through a thin foil (protons at 200 MeV lose energy at 0.5 MeV/m in air). Lucite blocks may be put into the beam path to lower the beam energy. The beam then goes through a thin lead sheet which acts as a diverging lens, spreading out the beam. Twin ionization chambers monitor the beam current (see the section on dosimetry). A beam collimator then cuts the beam down to about 2 cm in diameter. A chip is placed about 50 cm downstream which prevents any of the electron shower from the final collimator from reaching the chip. The chip may be mounted on a goniometer, which allows remote rotation of the chip in the beam to evaluate how the chip SER depends on the angle of the beam to the circuit plane. The beam is finally captured in a Faraday cup. The chip tester is controlled remotely because of the high radiation during testing.



### Figure 2

Fail map of a static RAM. During the accelerated testing of a chip, a fail map is monitored which indicates the chip soft fails. The distribution is shown similar to the actual physical layout of the chip. If a uniform distribution of fails is seen, the chip is properly centered in the nucleon beam and the beam itself has a uniform current density. Large chips with edge connections may show high SER sensitivity for their central sections because of voltage drop into the chip. The experiment is continued until 1–5% of the bits fail. If this number is too small for adequate statistics, such as for a 32×3-bit SRAM, the chip is repeatedly cleared and tested until adequate fail statistics have been achieved.





### Figure 3

Circuit SER cross section and sea-level nucleon flux. The two components needed to calculate a circuit SER are shown. The black dots are experimental points from the accelerated testing of the circuit. Through these is fitted a smooth curve which is extrapolated to higher and lower energies. The ordinate scale on the left shows that the SER cross section for the chip ranges from 8 to  $50 \times 10^{-12}$  cm². The right ordinate shows the differential cosmic ray nucleon flux. The integration of the product of these two curves gives the total sea-level fail rate. The chip shown is a 4Kb bipolar SRAM.

### Determining circuit cosmic SER

To calculate the sea-level fail rate of a circuit, the above SER cross section is multiplied by the sea-level cosmic ray nucleon flux. An example is shown in Figure 3. The dots represent accelerated measurements of the SER cross section for a 4Kb bipolar SRAM chip (circa 1984). Eight different chips were measured, and the chip-to-chip variation is indicated by the spread of the experimental dots at 70 and 150 MeV (about 2× for the chips shown; however, variation may be much larger for chips in early development). Also drawn is the cosmic ray nucleon flux at sea level. Since the SER cross section for chips changes with energy, and the sea-level flux also changes with energy, the final fail rate is the integral of these two quantities multiplied together:

$$\text{SER fail rate} = (\text{SER cross section}) \times (\text{sea-level nucleon flux}). \tag{3}$$

### Figure 4

Soft-fail rate of a circuit. The product of the two curves shown in the previous figure is shown as a function of nucleon energy. The area under this curve is the fail rate of the chip. This plot shows how the fail rate changes with nucleon energy. Note that it is the area and not the peak value which is important, and the median energy for failure is 165 MeV (see next figure). The final chip SER is 1060 ppm/khr = 0.009 fails/chip-yr.

The product of the two curves shown in Figure 3 is shown in Figure 4. The area under this curve is the estimated cosmic ray SER for the chip. Since the curve looks like a hyperbola, it is difficult to estimate which particle energies are most important to the total SER. The SER energy dependence may be seen more clearly by converting the fail rate to an integral plot such as that shown in **Figure 5**. This plot has the nucleon energy as the abscissa, but for the ordinate it has the percentage of fails from 0 to 100%. For any nucleon energy, this plot shows the percentage of the total fails which comes from nucleons with that energy or less. This plot is essential in evaluating the effectiveness of the accelerated testing. The 10–90% segment (shown with dotted lines) contains the central bulk of the fails, and for this chip these points occur between 38 MeV and 1000 MeV. In contrast, some CMOS chips have a 10–90% band which extends from 200 to 3000 MeV. These latter chips may require much higher-energy beams to evaluate their SER.
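The integration and the integral plot described above can be sketched numerically. In the Python below, both curves are hypothetical placeholders chosen only to have plausible shapes; they are not the measured cross sections or the flux of Reference [2], so the printed energies will not reproduce the 38, 165, and 1000 MeV values quoted for the real chip. The unit conversion at the end reads ppm/khr as fails per million chips per thousand hours, which matches the equivalence given in the Figure 4 caption.

```python
import math

HOURS_PER_YEAR = 8760.0

def cross_section(e_mev):
    """HYPOTHETICAL per-chip SER cross section, cm^2 (placeholder shape)."""
    return 5.0e-8 * (1.0 - math.exp(-e_mev / 100.0))

def differential_flux(e_mev):
    """HYPOTHETICAL sea-level flux, nucleons/cm^2-yr-MeV (placeholder shape)."""
    return 1.0e3 / (e_mev + 20.0) ** 1.5

def log_spaced(e_min, e_max, n):
    """n energies spaced evenly in log(E) between e_min and e_max."""
    ratio = (e_max / e_min) ** (1.0 / (n - 1))
    return [e_min * ratio ** i for i in range(n)]

def cumulative_fail_rate(energies):
    """Running trapezoidal integral of cross section x flux over energy."""
    integrand = [cross_section(e) * differential_flux(e) for e in energies]
    running = [0.0]
    for i in range(1, len(energies)):
        width = energies[i] - energies[i - 1]
        running.append(running[-1] + 0.5 * (integrand[i] + integrand[i - 1]) * width)
    return running

def energy_at_fraction(energies, running, fraction):
    """Energy below which the given fraction of all fails occurs."""
    target = fraction * running[-1]
    for i in range(1, len(running)):
        if running[i] >= target:
            t = (target - running[i - 1]) / (running[i] - running[i - 1])
            return energies[i - 1] + t * (energies[i] - energies[i - 1])
    return energies[-1]

def ppm_per_khr_to_fails_per_chip_year(ppm_per_khr):
    """Fails per million chips per 1000 h, converted to fails per chip-year."""
    return ppm_per_khr * 1.0e-6 / 1000.0 * HOURS_PER_YEAR

energies = log_spaced(10.0, 10000.0, 400)
running = cumulative_fail_rate(energies)
e10, e50, e90 = (energy_at_fraction(energies, running, f) for f in (0.10, 0.50, 0.90))

print(f"total fail rate: {running[-1]:.3e} fails/chip-yr (placeholder curves)")
print(f"10% / 50% / 90% energies: {e10:.0f} / {e50:.0f} / {e90:.0f} MeV")
print(f"1060 ppm/khr = {ppm_per_khr_to_fails_per_chip_year(1060):.4f} fails/chip-yr")
```

With measured cross sections and a tabulated flux substituted for the two placeholder functions, the same running integral gives the total fail rate and the 10–90% energy band used to decide which beam energies a test must cover.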

### • Assumptions of accelerated testing

Several assumptions are made in accelerated testing. The more important ones are discussed below.

### Circuit recovery

Circuits must be allowed to recover to a quiescent state between subsequent fails in accelerated testing, or else the testing does not simulate the actual rare soft-fail events. Theoretically, this time is conservatively estimated as a few hundred nanoseconds for bipolar SRAMs and CMOS DRAMs. This recovery time is experimentally checked by running accelerated tests on circuits and varying the fail frequency by varying the incident nucleon beam intensity. This beam usually may be varied by a very large factor, up to 10<sup>6</sup> in intensity. Any increases of circuit SER with intensity may indicate recovery problems. All testing is kept in a safe region, with the beam at most 1/100 of the intensity where cumulative effects are observed.

### Circuit access time

In real systems, electronic noise from circuit switching may weaken circuit resistance to localized charge bursts from cosmic rays. If the chip is to be used in main memory, static testing is adequate, since, except in the smallest computers, chips may be interrogated only once every several computer cycles. For cache memories, or embedded memories in logic, accelerated testing must be done in dynamic mode to evaluate the increased SER sensitivity in this normal operating condition. We find up to 8× increase in SER in some chips between static and dynamic testing.

### Low-energy SER threshold

The cosmic ray intensity of nucleons below 50 MeV depends greatly on local building materials [2]. The cosmic ray nucleon flux interacts with all surrounding materials, and the cosmic flux below 50 MeV which bombards a circuit consists mostly of collision products in the last 100 g/cm<sup>2</sup> of materials; e.g., in a building it might be the 16 in. of concrete in the floors above. Experiments have shown that over 10× changes in nucleon flux may occur depending on the types of local surrounding materials (see citations in Reference [2]). Circuits within computer component packages may experience fluxes which are 10× different from those for chips mounted in air. A weakness in the accelerated testing calculation is the assumption of a fixed sea-level cosmic ray flux for energies below 50 MeV. Very little is known about this low-energy nucleon flux and how it varies with local materials, so this portion of SER curves, such as those shown in Figures 3-5, might be considerably in error.

### Neutron vs. proton SER

### Figure 5

Circuit fails vs. nucleon energy. The fails of the circuit are normalized so that the ordinate shows from 0 to 100% of the total fail rate. This allows an analysis of what nucleon energy band has the most effect on the chip. For the circuit shown, the median nucleon energy is 165 MeV (50% of the circuit fails come from nucleons with energy greater than this), and the 10–90% SER energy band is from about 38 to 1000 MeV. In order to accelerate-test this circuit accurately, experiments must be run which extend over most of this energy band. The final chip SER is 1060 ppm/khr = 0.009 fails/chip-yr.

The cosmic nucleon flux consists mostly of neutrons. However, the accelerated testing reported here has been done mostly with protons. Above 100 MeV, proton and neutron nuclear reactions with silicon are almost identical, and are expected to produce similar SER cross sections. Below 100 MeV this changes, with neutrons usually having higher SER cross sections (see Reference [19] for a review). However, it was shown in Figure 5 that for SRAMs only 30% of the fail rate is due to low-energy nucleons (for CMOS chips, only about 10%). Thus, if there is even a 2× difference in SER between low-energy protons and neutrons, this would affect the final SER by only 30%. The only IBM neutron SER experiments on chips are discussed later in the section *Low-energy threshold of SER cross sections*.
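The 30% figure can be checked with one line of arithmetic: if a fraction of the proton-measured fail rate comes from low-energy nucleons, and those fails are scaled by a neutron-to-proton sensitivity ratio, the total changes by that fraction times (ratio minus one). The sketch below assumes the scaling applies to the whole low-energy portion; the function name is illustrative.

```python
def relative_ser_change(low_energy_fraction, neutron_to_proton_ratio):
    """Fractional change in total SER when low-energy fails are rescaled.

    Assumes the entire low-energy share of the fail rate scales by the
    given ratio (a simplifying assumption for this estimate).
    """
    scaled_total = (1.0 - low_energy_fraction) + low_energy_fraction * neutron_to_proton_ratio
    return scaled_total - 1.0

sram_change = relative_ser_change(0.30, 2.0)   # SRAM: 30% of fails at low energy
cmos_change = relative_ser_change(0.10, 2.0)   # CMOS: about 10% of fails at low energy

print(f"SRAM: {sram_change:.0%} change in total SER")
print(f"CMOS: {cmos_change:.0%} change in total SER")
```

For the CMOS case the same 2× difference would shift the total by only about 10%.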

### Chip testing: Experimental conditions and procedures

### • Beam energy

Testing should cover particle energies which cause most of the fails. The sensitivity of chips to cosmic rays is believed to be essentially zero for particles below about 5 MeV, because of nuclear thresholds. Protons from 5 to 30 MeV may not be able to penetrate the chip packaging module, and these energies produce such erratic experimental results that they are rarely measured (a 20-MeV proton stops after only 2 mm in Si or Al). The SER cross section usually increases with particle energy, to at least 800 MeV, and the energy band of 30-800 MeV is where most SER data are taken. Above 800 MeV there are no laboratories available for testing commercial proprietary parts. Above 2 GeV, the number of cosmic ray particles decreases with energy, by about $E^{-1.5}$, and these very high-energy particles are too rare to be statistically important. Our experience has shown that the median energy for SER (as shown by plots similar to that in Figure 5) is about 200 MeV for bipolar SRAMs and 400 MeV for CMOS.

### Figure 6

Photograph of beam uniformity. This Polaroid photograph was exposed using the proton beam at the Harvard Cyclotron Laboratory. It shows a beam uniformity of better than 10% over the central 2.5 cm of the beam which is used in SER experiments. The exposure was made by $2\times10^8$ protons/cm² at 148 MeV (about a 5-s exposure).

All accelerators which are used for SER testing are based on cavity-resonant acceleration of protons with very high-Q operation. We have found that any spread in proton beam energy is due less to broadening of the RF generation amplifiers than to incidental scattering of the beam as it exits from the accelerator. For example, the measured beam energy spread at the Harvard Cyclotron Laboratory is less than 1 MeV at 160 MeV. The measured beam energy spread at Los Alamos (LAMPF) is less than 2 MeV at 800 MeV. Both of these measurements were done by the respective accelerator staffs, and were not independently measured by the authors.

### • Beam uniformity

It is imperative that the experimental beam be made uniform over an area larger than the chip. Since the beam is characterized by its current density, ions/cm²-s, it is not important how big the beam is as long as it reliably covers the chip. Tests are conducted before experimental runs to ensure the beam position, size, and uniformity, and the uniformity is also monitored during experiments, as discussed below.

One reliable method for uniformly spreading the beam is to pass it through a heavy metallic foil upstream of the target chip. This is shown in Figure 1, where a "beam spreader" is placed in the beam path where it enters the experimental room. The spreader is constructed of various materials depending on the beam energy. For energies of 100-150 MeV, the spreader is a thin lead sheet, 3.1 mm thick, with a diameter of 6.3 cm (the beam is about 5 mm wide at the point shown). The beam penetrates the foil and is slightly scattered by the  $6 \times 10^{21}$  Pb atoms/cm<sup>2</sup>. Since there are very many individual collisions, statistical uniformity of scattering is achieved. About 60 cm downstream of the beam spreader is a collimator which trims the divergent beam to 1.11-cm diameter. This beam continues on to the target chip, about 0.5 m farther downstream, where the beam is about 2.5 cm in diameter, well in excess of the chip size.
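The quoted beam size at the chip can be compared with simple straight-line geometry. The sketch below treats the 5-mm spot on the spreader as an extended source and the collimator as a sharp circular aperture, and ignores scattering in air and at the collimator edges; these are simplifying assumptions made here, and the function name is hypothetical.

```python
def beam_diameters_at_chip(source_diam_cm, aperture_diam_cm,
                           source_to_aperture_cm, aperture_to_chip_cm):
    """Project an extended source through a circular aperture onto the chip plane.

    Returns (inner, point_source, outer) diameters in cm:
      inner        - region that sees the whole source through the aperture
      point_source - projection if the source were a single point
      outer        - extreme edge reached by rays crossing the aperture diagonally
    """
    lever = aperture_to_chip_cm / source_to_aperture_cm
    point_source = aperture_diam_cm * (1.0 + lever)
    outer = aperture_diam_cm + (aperture_diam_cm + source_diam_cm) * lever
    inner = aperture_diam_cm + (aperture_diam_cm - source_diam_cm) * lever
    return inner, point_source, outer

# Dimensions from the text: 5-mm spot, 1.11-cm collimator 60 cm downstream,
# chip about 0.5 m beyond the collimator.
inner, point, outer = beam_diameters_at_chip(0.5, 1.11, 60.0, 50.0)
print(f"inner {inner:.2f} cm, point-source {point:.2f} cm, outer {outer:.2f} cm")
```

Under these assumptions the outer edge of the projected beam is close to 2.45 cm, in line with the stated diameter of about 2.5 cm, while the fully illuminated core is noticeably smaller.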

For lower-energy runs, the beam energy is lowered by introducing "energy degraders" into the beam (Figure 1). These degraders are precisely milled blocks of Plexiglas® (Lucite®). They not only lower the mean energy of the beam, but they also introduce scattering and spread the beam like the Pb foil, so the foil is not necessary when a thick degrader is used for the lower energies. Plexiglas is used because a) it does not activate under bombardment and b) it contains mostly low-atomic-number atoms which minimize the lateral scatter of the beam. Too much scatter reduces the beam intensity on target and extends the time necessary to run experiments. The beam energy straggle when a Plexiglas degrader is used is moderate, about 7 MeV (FWHM) for reducing the beam from 150 to 50 MeV, for example.

The beam uniformity on target can be checked by the two methods described below.

### Pictures of the beam distribution

It is possible to take a picture of the beam; an example is shown in **Figure 6**. This photograph was made with medium-sensitivity Polaroid film at the chip position. The exposure was about  $2 \times 10^8$  protons/cm<sup>2</sup> at 148 MeV. Photographs are usually taken at three different exposure levels to look for fine structure in the beam uniformity. The accuracy of this technique is estimated at 30%. This accuracy has been determined by making detailed beam studies using a Faraday cup with a small (1-mm) aperture, which is moved into various positions to sample different parts of the beam distribution. This technique takes a long time, since the Faraday cup must be manually moved between each sampling.

### Error distributions on chips

Another way of determining the beam uniformity is to display SER fails on a map which shows the actual physical position of each cell on the chip. An example is shown in Figure 2, where the small white dots indicate the physical position of the fail on the chip. This kind of map is normally shown during SER experiments if the physical mapping of the chip is available. A problem occurs for large chips with only edge-mounted connectors. The memory cells in the center may be up to 30% more sensitive because of voltage drops into the chip, and these chips always show a slight hot-spot in the middle of their arrays. The accuracy of this method is in the eye of the beholder.

### • Beam dosimetry

Cosmic rays are so dilute that they can be considered to hit a chip one at a time. When the chip is undergoing accelerated testing, it is important to ensure that the chip recovers fully from a hit before a second hit occurs; otherwise it may be anomalously sensitive. To establish the recovery time of a chip, the beam current is varied more than  $1000\times$ , and the chip SER is measured for each current intensity. If any variation in SER is seen with increased beam current, the accelerated testing is continued at a beam current at least  $100\times$  below the lowest current level which showed some beam current effect.
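The beam-current scan described above can be written as a small selection rule. The sketch below takes the SER measured at the lowest current as the reference, finds the lowest current at which the SER departs from it, and returns a working current 100× lower. The 10% tolerance used to decide that a departure is real, the data values, and the function name are all assumptions made for illustration; the paper speaks only of "any variation".

```python
def safe_beam_current(measurements, tolerance=0.10, safety_factor=100.0):
    """Pick a working beam current from (beam_current, ser) pairs.

    tolerance is an assumed threshold for calling a change in SER
    significant; safety_factor is the 100x margin given in the text.
    Returns None when no intensity effect is seen in the scan.
    """
    ordered = sorted(measurements)
    baseline = ordered[0][1]          # SER at the lowest current
    for current, ser in ordered:
        if abs(ser - baseline) > tolerance * baseline:
            return current / safety_factor
    return None

# HYPOTHETICAL scan spanning more than 1000x in beam current (pA),
# with SER in arbitrary units.
scan = [(0.01, 26.0), (0.1, 27.0), (1.0, 26.0), (10.0, 27.0), (100.0, 35.0)]
working_current = safe_beam_current(scan)
print(f"run accelerated tests at or below {working_current} pA")
```

In this made-up scan the SER first rises at 100 pA, so testing would proceed at 1 pA or less.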

As discussed before, the only two measurements which are necessary to evaluate the cosmic SER of a circuit are the dose of nucleons used in the test and the number of circuit fails which occur for that dose. Measurement of the number of nucleons which have hit the chip is called dosimetry. Extensive efforts have been made to find various independent methods which would measure the beam intensity to ensure the accuracy of the dosimetry. All of the methods agree within ±15%. They are described in the following subsections.

### Dosimetry: Using memory chips

The most reliable method of ensuring the reproducibility of beam dosimetry is to insert a previously measured chip (called a *golden chip*) and measure its fail rate. This is the only technique for measuring the total nucleon dose, neutrons plus protons. All of the other techniques, discussed below, measure qualities specific only to a proton or a neutron, with the assumption that the beam is 100% of one particle type. This purity cannot be ensured because of the interactions of the beam with filters, collimators, and windows in the beam line, which always contaminate the beam with some quantity of extra protons and neutrons. Thus, the best monitors for nucleon beams are golden chips, and they can be calibrated to be quantitative like any other technique.

As an example, we used bipolar 4Kb SRAM chips for eight years, 1985–1992. The calibration chips were measured at the beginning and at the end of each experimental run to validate the dosimetry. Experimental values for one group of golden chips measured over these years, showing the reliability of the routine dosimetry, are reviewed in Table 2. The SER cross sections given in the table are in units of $10^{-12}$ cm<sup>2</sup> per bit. The main problem with using chips for dosimetry is that there may be unanticipated aging effects. The golden chips are routinely used at the start and end of each testing session to ensure that other methods of dosimetry remain accurate.

Table 2 Experimental SER cross sections for a golden chip set.

| Year of experiment | Starting SER | Ending SER |
|--------------------|--------------|------------|
| 1985 | 28, 29, 26, 28 | 29, 25, 24, 26 |
|      | 25, 22, 28, 28 | 23, 24, 25, 24 |
| 1986 | 25, 23, 27, 29 | 26, 27, 25, 25 |
|      | 26, 25, 28 | 28, 24, 29 |
| 1987 | 26, 28, 27, 28 | 28, 30, 27, 28 |
|      | 27, 29, 28 | 28, 28, 27 |
| 1988 | 28, 29, 30 | 29, 27, 26 |
| 1989 | 26, 27, 27 | 25, 27, 24 |
| 1990 | 24, 23, 25 | 26, 26, 24 |
| 1991 | 26, 28, 27 | 25, 29, 24 |
| 1992 | 28, 29, 28, 27 | 28, 26, 25, 26 |
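The year-to-year stability of the golden chips can be summarized directly from the Table 2 values. The Python below pools the starting and ending measurements for each year (both table rows, where a year has two), and compares each yearly mean with the overall mean. Pooling the values this way is a choice made here for illustration, not a procedure described in the paper.

```python
from statistics import mean

# Table 2 values, in units of 1e-12 cm^2 per bit: year -> (starting, ending).
GOLDEN_CHIP_SER = {
    1985: ([28, 29, 26, 28, 25, 22, 28, 28], [29, 25, 24, 26, 23, 24, 25, 24]),
    1986: ([25, 23, 27, 29, 26, 25, 28], [26, 27, 25, 25, 28, 24, 29]),
    1987: ([26, 28, 27, 28, 27, 29, 28], [28, 30, 27, 28, 28, 28, 27]),
    1988: ([28, 29, 30], [29, 27, 26]),
    1989: ([26, 27, 27], [25, 27, 24]),
    1990: ([24, 23, 25], [26, 26, 24]),
    1991: ([26, 28, 27], [25, 29, 24]),
    1992: ([28, 29, 28, 27], [28, 26, 25, 26]),
}

# Mean of all measurements taken in each year.
yearly_mean = {year: mean(start + end) for year, (start, end) in GOLDEN_CHIP_SER.items()}

# Mean over every measurement in the table.
all_values = [v for start, end in GOLDEN_CHIP_SER.values() for v in start + end]
overall_mean = mean(all_values)

# Largest fractional departure of any yearly mean from the overall mean.
worst_deviation = max(abs(m / overall_mean - 1.0) for m in yearly_mean.values())

for year, m in sorted(yearly_mean.items()):
    print(f"{year}: mean SER = {m:.1f}")
print(f"overall mean = {overall_mean:.1f}, worst yearly deviation = {worst_deviation:.1%}")
```

Pooled this way, every yearly mean stays within 10% of the eight-year mean.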

### Dosimetry: Using Faraday cups

A proton beam may be captured in a Faraday cup and the resulting charge measured. However, this technique has two problems. First, the accelerated testing is conducted in air, and the beam ionizes the air around it. The Faraday cup sits within this ionized medium, and at the very low currents used in accelerated testing, typically 10 pA, leakage currents to the ionized air become a problem. This leakage current also changes unpredictably with local humidity. The second problem is the required size of the Faraday cup. It takes only a few inches of iron to stop a 150-MeV beam, but for higher-energy proton beams, heavy atoms such as iron cannot be used to stop the beam because they become radioactive. The beam-stop of choice is high-purity carbon. But for an 800-MeV beam, it would take about a 6-ft cube of carbon to stop 98% of the beam. Such a large object would be difficult to transport and install for a temporary experiment.

Most of the IBM experiments have been conducted at the Harvard cyclotron. There is a time-tested non-vacuum Faraday cup available there, and we have also built one based on different principles which is described elsewhere in this journal [20]. The reproducibility of these and other measurements (discussed below) is illustrated in **Table 3**, which shows the measured ion dose per golden chip fail for various methods of dosimetry. Measurements over a period of eight years show a reproducibility of better than 10%, and an absolute measurement accuracy better than 30% based on comparison with other dosimetry techniques.

**Table 3** Measured proton dose per golden-chip fail for various methods of dosimetry.

| Year | Protons/cm<sup>2</sup> per fail | Measurement method                             |
|------|-------------------------------|------------------------------------------------|
| 1985 | $9.7 \times 10^{6}$           | Harvard Faraday cup                            |
| 1985 | $8.8 \times 10^{6}$           | Single-particle counting with silicon detector |
| 1986 | $9.7 \times 10^{6}$           | Harvard Faraday cup                            |
| 1986 | $9.9 \times 10^{6}$           | TLD crystals (low dose)                        |
| 1986 | $8.2 \times 10^{6}$           | TLD crystals (high dose)                       |
| 1987 | $9.7 \times 10^{6}$           | Harvard Faraday cup                            |
| 1988 | $9.1 \times 10^{6}$           | Harvard Faraday cup                            |
| 1988 | $9.4 \times 10^{6}$           | IBM Faraday cup                                |
| 1989 | $9.7 \times 10^{6}$           | Harvard Faraday cup                            |
| 1989 | $9.2 \times 10^{6}$           | IBM Faraday cup                                |
| 1990 | $9.7 \times 10^{6}$           | Harvard Faraday cup                            |
| 1990 | $1.1 \times 10^{7}$           | TLD crystals (low dose)                        |
| 1991 | $9.7 \times 10^{6}$           | Harvard Faraday cup                            |
| 1991 | $9.1 \times 10^{6}$           | TLD crystals (low dose)                        |
| 1992 | $9.2 \times 10^{6}$           | Harvard Faraday cup                            |
| 1992 | $9.3 \times 10^{6}$           | IBM Faraday cup                                |
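The dose-per-fail values can be cross-checked against a per-bit cross section, since the mean fluence needed for one fail is the reciprocal of the whole-chip cross section. The sketch below assumes that "4Kb" means 4096 bits and takes $27 \times 10^{-12}$ cm<sup>2</sup> per bit as a representative golden-chip value; both are assumptions made for illustration, and the function names are hypothetical.

```python
BITS_PER_CHIP = 4096        # assumption: "4Kb" taken as 4096 bits
SIGMA_PER_BIT_CM2 = 27e-12  # assumption: representative golden-chip cross section


def fluence_per_fail(sigma_per_bit_cm2, bits):
    """Mean fluence (protons/cm^2) needed to produce one fail in the chip."""
    return 1.0 / (sigma_per_bit_cm2 * bits)


def fluence_from_fails(fails, sigma_per_bit_cm2, bits):
    """Use the golden chip as a dosimeter: convert observed fails to fluence."""
    return fails * fluence_per_fail(sigma_per_bit_cm2, bits)


per_fail = fluence_per_fail(SIGMA_PER_BIT_CM2, BITS_PER_CHIP)
print(f"{per_fail:.2e} protons/cm^2 per fail")  # about 9.0e6
print(f"{fluence_from_fails(250, SIGMA_PER_BIT_CM2, BITS_PER_CHIP):.2e} protons/cm^2 for 250 fails")
```

Under these assumptions the result, about $9.0 \times 10^{6}$ protons/cm<sup>2</sup> per fail, falls within the range of the tabulated measurements.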

### Ionization chambers

Ionization chambers were used for all routine experiments conducted at Harvard because they are built into the beam lines. This method of dosimetry is not absolute, but the counting statistics are so good (better than 0.1% counting accuracy for a typical experiment) that accurate relative dosimetry is easy to obtain. An ionization chamber is a box with thin windows at opposite ends through which the beam enters and exits. The box contains a gas which is easily ionized, such as methane. On either side of the beam axis are two metal plates which are biased so that any ionized gas atoms are in a uniform transverse electric field. This field pulls the electrons and the ionized atoms in opposite directions. The bias on the plates is typically 2 kV, so the particles which are accelerated toward the plates ionize other gas atoms, forming a cascade of charged particles. This gives the ionization detector a built-in amplification of about 10<sup>3</sup>. When more than $10^7$ protons/s pass through the ionization chamber, the detector is effectively swamped. Individual events are not seen; rather, a dc current of about a microampere is monitored, which is proportional to the proton current.

The ionization detector has two problems. First, it is not quantitative and must be calibrated using other techniques. Second, this calibration must be repeated at every beam energy, since the probability of the ionization of the gas by a proton beam depends on the proton velocity. This calibration typically takes about two hours.

### Dosimetry: Individual proton counting

The only technique which is quantitative without requiring any calibration is to detect every single nucleon and count them individually. This can be done for a proton beam by putting a silicon surface-barrier detector (SBD) in the beam. An SBD is a wafer of high-resistivity silicon, typically 1000 $\Omega$-cm, with a thin metal layer forming a Schottky barrier on one side and a resistive contact on the opposite side. A bias of a few hundred volts is applied across the wafer, resulting in a very deep depletion depth of 0.5 mm. The high-energy protons used in accelerated testing pass completely through such wafers. As they penetrate, they interact with the electron sea of the silicon and produce a wake of electron-hole pairs. The depletion field pulls the carriers apart, and virtually all of the $10^6$ charges produced by each proton are collected in about a nanosecond. This pulse is so large compared to the background that the efficiency of proton detection is 100%. The only limitation is pile-up if the count rate becomes too high. Protons are counted at rates up to $6 \times 10^6$ per second. This technique is essentially identical to that of the ionization chambers described above, but the cascade pulse for each particle is about 10<sup>4</sup> times faster, allowing individual particle counting for currents up to about 1 pA.

Since the SBD maximum count rate is less than 1/10 of the typical beam currents used during accelerated testing, SBD dosimetry cannot be used during actual testing experiments. This technique is used to quantify other methods of dosimetry at the low range of their scales. If the two methods agree, there is confidence in the higher-current measurements.
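The link between beam current and count rate follows from the proton charge. The short sketch below assumes that each beam particle carries exactly one elementary charge; the function names are hypothetical.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per proton (exact SI value)


def protons_per_second(beam_current_a):
    """Particle rate for a beam of singly charged protons."""
    return beam_current_a / ELEMENTARY_CHARGE_C


def beam_current_a(rate_per_second):
    """Beam current in amperes carried by a given proton rate."""
    return rate_per_second * ELEMENTARY_CHARGE_C


print(f"1 pA  -> {protons_per_second(1e-12):.2e} protons/s")   # about 6.2e6
print(f"10 pA -> {protons_per_second(10e-12):.2e} protons/s")  # about 6.2e7
```

A 1-pA beam thus corresponds to roughly six million protons per second, which matches the counting limit quoted for the SBD, while the typical 10-pA test beam is ten times higher.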

### Thermoluminescent dosimetry

For high-energy experiments above 300 MeV, all of the above methods of dosimetry become difficult because of the high level of radioactivity induced in the detectors by the proton beam. We have developed a new technique which allows accurate dosimetry at energies up to 800 MeV.<sup>1</sup> This technique is similar to that using the ionization chamber above, but the ionization charge caused by the proton beam going through the detector is measured after the experiment is concluded. The detector consists of small samples of single-crystal lithium fluoride (LiF). These single crystals are fabricated to be chip size, $\sim 3 \times 3$ mm, and are then heated to 400°C to remove any residual crystal damage. They are placed in small opaque black plastic pouches so that no radiation such as sunlight can cause any crystal damage before they are used. During an experiment, one or more of the pouches are placed in the proton beam just in front of the chip. The proton beam causes ionization in the crystal during the experiment. After the experiment is concluded, the crystal is placed in a special furnace which has built-in photomultiplier tubes. As the crystal is slowly heated, the electron-hole pairs created by the proton beam in the LiF begin to recombine, with each recombination emitting a photon (hence the name "thermoluminescent dosimetry"). The crystal is heated until it no longer gives off light (for proton energies above 100 MeV, there are significant micro-amorphous volumes, and heating must be ramped slowly to 450°C to anneal the samples completely). This total emitted light is proportional to the proton dose of the experiment.

<sup>1</sup> This TLD dosimetry method was developed in collaboration with Stanley Woligora of Eberline Instruments Co., Albuquerque, NM (now retired).

The accuracy of this method of dosimetry has been evaluated in two ways. First, one can theoretically calculate that for LiF crystals a proton beam at 150 MeV will deposit 1.29 MeV/cm of crystal transited [21]. Since our crystals were 0.089 cm thick, each proton would deposit 115 keV of energy. For LiF, each 2 eV of deposited energy creates one electron-hole pair (there are seven different defect states possible, and the analysis of the excitation details is explained elsewhere). Each proton at 150 MeV creates 67650 electron-hole pairs, each of which emits a photon during the annealing of the crystal damage.

To evaluate this estimate, crystals were exposed in 150-MeV beams at Harvard using both ionization chambers and Faraday cups for beam dosimetry. Experiments were run to compare calculations with experiment, and also to find out at what proton dose the LiF crystal response might become nonlinear because of damage saturation. For proton doses from  $10^8$  to  $10^{10}$  protons/cm<sup>2</sup>, the crystals remained linear and the damage agreed with calculations within  $\pm 17\%$ . Above  $2 \times 10^{10}$  protons/cm<sup>2</sup>, the crystals began to slowly saturate with damage and became unreliable (saturation occurs when damage occurs in already damaged parts of the crystals).

A completely independent method also evaluated the crystal sensitivity to charged particles. The crystals were put into a well-calibrated electron beam at Eberline Instruments Co., Albuquerque, NM. The dosimetry of this beam is better than 1%. The damage to the crystal agreed with the theoretical predictions to better than 10%.

The crystals can be used for proton beams at various energies with the understanding that the proton energy loss in LiF changes from 1.29 MeV/cm for a 150-MeV proton beam to 0.702 MeV/cm for an 800-MeV proton beam [21]. This reduction of energy loss is accurate to about 2% from numerous experimental papers.
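The energy deposited per proton, and how it scales between the two beam energies, can be worked through directly from the quoted numbers. This is only an arithmetic sketch: the function names are hypothetical, and the energy per electron-hole pair is left as a parameter. The rounded 2 eV gives about 57 000 pairs at 150 MeV, whereas the 67650 pairs quoted earlier would correspond to roughly 1.7 eV per pair.

```python
# Values quoted in the text; only these two beam energies are given.
ENERGY_LOSS_MEV_PER_CM = {150: 1.29, 800: 0.702}
CRYSTAL_THICKNESS_CM = 0.089


def deposited_energy_kev(beam_mev, thickness_cm=CRYSTAL_THICKNESS_CM):
    """Energy left in the LiF crystal by one proton, in keV."""
    return ENERGY_LOSS_MEV_PER_CM[beam_mev] * thickness_cm * 1e3


def pairs_per_proton(beam_mev, ev_per_pair=2.0):
    """Electron-hole pairs per proton for an assumed energy per pair (eV)."""
    return deposited_energy_kev(beam_mev) * 1e3 / ev_per_pair


def relative_response(beam_mev, reference_mev=150):
    """Light output per proton relative to the reference beam energy."""
    return ENERGY_LOSS_MEV_PER_CM[beam_mev] / ENERGY_LOSS_MEV_PER_CM[reference_mev]


for energy in (150, 800):
    print(f"{energy} MeV: {deposited_energy_kev(energy):.1f} keV per proton, "
          f"relative response {relative_response(energy):.3f}")
```

At 800 MeV each proton deposits a little over half the energy it would at 150 MeV, so the same emitted light corresponds to nearly twice the proton dose.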

The LiF method of dosimetry has several problems. First, it is inconvenient because a different crystal must be used for each chip exposure, and also because the actual measured dose cannot be determined immediately. The crystal stored-damage measurement is usually done the week after the experiment is completed. A second problem is the saturation of the crystals for proton doses above $2 \times 10^{10}$ protons/cm<sup>2</sup>. For harder circuits, such as CMOS, this limits the statistics of an individual run. For a dose of $2 \times 10^{10}$ protons/cm<sup>2</sup>, a CMOS circuit may have about 40 fails for a 9216-bit chip (for protons at 148 MeV). Thus, several runs would have to be made using separate crystals in order to get a statistically accurate fail rate. There is no known scientific literature about using LiF crystals for high-energy dosimetry such as that described above.
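The statistical limit imposed by crystal saturation can be illustrated with simple counting statistics. The sketch assumes that fails are Poisson distributed, so that the relative uncertainty on $N$ fails is $1/\sqrt{N}$; the function names and the 5% target are hypothetical choices for the example.

```python
import math


def cross_section_per_bit(fails, fluence_per_cm2, bits):
    """Per-bit SER cross section (cm^2) from one exposure."""
    return fails / (fluence_per_cm2 * bits)


def relative_uncertainty(fails):
    """One-sigma relative counting uncertainty, assuming Poisson statistics."""
    return 1.0 / math.sqrt(fails)


def runs_needed(target_relative_uncertainty, fails_per_run):
    """Number of saturation-limited runs needed to reach the target uncertainty."""
    required_fails = math.ceil(1.0 / target_relative_uncertainty ** 2)
    return math.ceil(required_fails / fails_per_run)


sigma = cross_section_per_bit(fails=40, fluence_per_cm2=2e10, bits=9216)
print(f"cross section: {sigma:.2e} cm^2 per bit")          # about 2.2e-13
print(f"single-run uncertainty: {relative_uncertainty(40):.0%}")
print(f"runs for 5%: {runs_needed(0.05, 40)}")
```

With about 40 fails per crystal, a single run carries roughly a 16% counting uncertainty, and on the order of ten runs would be needed to reach 5%.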

### Scattering from double scintillators

This method is particularly appropriate for neutrons, but it can also be used for low-current proton beams. The technique involves two thin plates of scintillating material, each connected to a photomultiplier tube. One small plate is placed in the beam of nucleons, and the other is placed about one meter downstream, at an angle of about 45° from the beam direction. The scintillation plates are made from a material with a high density of hydrogen. If an incident nucleon hits a proton (which is the nucleus of a hydrogen atom) in the upstream scintillator, it can transfer much of its energy to the recoiling proton. If the proton recoils at 45°, toward the downstream scintillator, it will have exactly half of the incident nucleon's energy. The recoiling proton will cause a flash in the upstream scintillator in which it originated, and if it hits the downstream scintillator it will cause another flash. Since the time of flight of the proton between the two scintillators can be calculated from the proton energy, a search is made for a delayed coincidence between flashes from the two scintillators. From the number of delayed coincidences, one can calculate the beam current through the first plate. This dosimetry may be considered absolute, without calibration.
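The recoil energy and the expected coincidence delay can be sketched as follows. The recoil relation $E' = E\cos^2\theta$ used here is the nonrelativistic result for elastic scattering between equal masses, which gives the factor of one half at 45°; it is an approximation at the highest beam energies. The time of flight is computed relativistically from the recoil kinetic energy. The function names and the coincidence window are hypothetical.

```python
import math

PROTON_REST_ENERGY_MEV = 938.272
SPEED_OF_LIGHT_M_PER_NS = 0.299792458


def recoil_energy_mev(incident_mev, angle_deg):
    """Recoil proton energy, nonrelativistic equal-mass elastic scattering."""
    return incident_mev * math.cos(math.radians(angle_deg)) ** 2


def time_of_flight_ns(kinetic_mev, path_m):
    """Relativistic flight time of a proton with the given kinetic energy."""
    gamma = 1.0 + kinetic_mev / PROTON_REST_ENERGY_MEV
    beta = math.sqrt(1.0 - 1.0 / gamma ** 2)
    return path_m / (beta * SPEED_OF_LIGHT_M_PER_NS)


def is_delayed_coincidence(t_up_ns, t_down_ns, expected_delay_ns, window_ns=2.0):
    """True if the two flashes are separated by the expected delay (assumed window)."""
    return abs((t_down_ns - t_up_ns) - expected_delay_ns) <= window_ns


recoil = recoil_energy_mev(14.0, 45.0)     # 14-MeV neutron, proton recoiling at 45 degrees
delay = time_of_flight_ns(recoil, 1.0)     # scintillators about one meter apart
print(f"recoil {recoil:.1f} MeV, expected delay {delay:.1f} ns")
print(is_delayed_coincidence(100.0, 100.0 + delay, delay))
```

For a 14-MeV neutron this gives a 7-MeV recoil proton and a delay of a few tens of nanoseconds over one meter.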

This technique does not work if there is significant particle radiation where the scintillators are located. A large background of electrons may swamp the detectors, making the search for coincidences inaccurate.

### • Chip orientation to the beam

Since chips are asymmetrical, one might expect that the angle of the beam to the chip would affect the observed SER. Experimentally, this is tested by varying the beam angle from normal incidence to a 5° glancing angle to the circuit plane. This tilting is then repeated after the chip is rotated in azimuth, so that it is hit from different directions. In practice, the SER has been observed to change by less than 2× with beam direction. This result indicates that it is the sensitive volume of a circuit which dominates the SER, and not its planar patterns (we do not know how to define the sensitive volume of a chip). Sending the beam through the chip in the opposite direction has shown no differences in SER, for energies at which the energy loss of the beam in the chip is negligible.

**Table 4** Memory patterns used to test for pattern dependence of chip SER.

- 0 = All zeros in memory.
- 1 = All ones in memory.
- 0,1 = Checkerboard of zeros and ones.
- $\langle 0,1 \rangle$ = Complementary checkerboard.
- PS = Preferred-state pattern. The cell is powered on, and the bit *preferred* by the circuit is used. This pattern is usually less sensitive than its complement.
- (PS) = Complement of the above preferred state.
- Highs = DRAM set with physical high levels.
- Lows = DRAM set with physical low levels.

### • Memory pattern arrays

Shown in **Table 4** are the patterns used to test for any pattern dependence of the chip SER. Differences based on the stored bit patterns are small for flip-flop memory chips, although they can be quite large for other types of memories.

The array patterns have little effect on the SER of bipolar or CMOS flip-flop memory arrays. Only the preferred-state pattern showed an SER difference compared to its complementary pattern, but this difference was always less than 30%. The FET arrays had a remarkable sensitivity to physical high or low array patterns, with measured SER variations up to 30×.

For a typical SER measurement, the chip is filled with a checkerboard pattern. The chip driver scans the memory constantly, reading each bit and logging any errors, and then writing back the complementary state. Hence, each scan leaves the bits in an opposite state, and there is no buildup of errors. The real-time display of the array map shows the location of any fails, causing any inhomogeneity of the beam to show up as a shift in the error pattern away from a pure random distribution.
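The scan-and-complement procedure can be mimicked with a small software model. This is only an illustrative sketch, not the actual chip driver: the memory is a Python list, beam upsets are stood in for by random bit flips, and the driver is assumed to write the complement of the expected (not the read) value, so an upset is logged once and then cleared. All names are hypothetical.

```python
import random


def checkerboard(rows, cols, phase=0):
    """Checkerboard pattern; phase=1 gives the complementary checkerboard."""
    return [[(r + c + phase) % 2 for c in range(cols)] for r in range(rows)]


def scan(memory, phase):
    """Read every bit, log mismatches, and write back the complementary state."""
    errors = []
    for r, row in enumerate(memory):
        for c, bit in enumerate(row):
            expected = (r + c + phase) % 2
            if bit != expected:
                errors.append((r, c))
            row[c] = 1 - expected  # leaves the array in the opposite pattern
    return errors


def run(rows=8, cols=8, scans=4, upsets_per_scan=2, seed=0):
    rng = random.Random(seed)
    phase = 0
    memory = checkerboard(rows, cols, phase)
    log = []
    for _ in range(scans):
        # Stand-in for beam-induced upsets: flip a few distinct random bits.
        for index in rng.sample(range(rows * cols), upsets_per_scan):
            r, c = divmod(index, cols)
            memory[r][c] ^= 1
        log.extend(scan(memory, phase))
        phase ^= 1  # the next scan expects the complementary pattern
    return log, memory, phase


log, memory, phase = run()
print(f"{len(log)} fails logged at {sorted(set(log))}")
```

Because every scan rewrites the whole array, each injected upset is counted exactly once, and the logged coordinates are what a real-time array map would display.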

### • Loading memory arrays and interrogation

There are many ways to load memory-array patterns, and then to interrogate the chip for fails either during or after the radiation exposure. Shown in **Table 5** are the normal variations in testing procedures for loading the test array in the chip, and later interrogation.

### • Effects of operating voltage and chip temperature on SER

Chips are tested for SER sensitivity at voltages below and above their nominal values in order to evaluate nonstandard operation, and to understand problems in SER modeling. All chips became more sensitive when they were operated below nominal operating voltages, except SRAM memory chips with standby modes. Cells in standby modes may be significantly harder than in the R/W mode. Increasing the voltage above nominal values rarely increases the chip hardness. Variations between experimental SER and modeling SER may sometimes be useful for model improvement.

The temperature of bipolar chips is usually monitored by a temperature diode built into the LSI chip itself. Bipolar chips are mounted without any packaging, and are cooled by a stream of cold compressed air. Testing usually covers the temperature range 25–85°C, with the temperature maintained by controlling the cooling air flow. Some tests have been run with chip temperatures as high as 130°C. An external heat source is needed for temperature cycling of DRAM, n-MOS, or CMOS technology. Experimental results, discussed in the following sections, have repeatedly shown that LSI modeling programs for circuit performance are poor at predicting the temperature dependence of SER sensitivity.

**Table 5** Chip loading and interrogation procedures.

| Mode | Procedure |
|-------------------|-----------|
| R/W | Load array with pattern. Continuously read memory, correcting any errors as they occur. Typical cycle time is 100 ns. |
| R/W-C | Same as R/W, but constantly write the complement of the existing state. This causes alternating *read* and *write* instructions and exercises the chip at maximum internal noise levels. |
| WORM | Write once and read many. Pattern is written only at the beginning, and errors accumulate during the run. Total errors must be kept below 5% to prevent correcting previous upsets. |
| WORO | Write once and read once. Pattern is written only at the beginning, and errors accumulate during the run. Total errors must be kept below 5% to prevent correcting previous upsets. |
| ECC-off | Some modern chips have onboard error-correction circuits (ECC) to prevent any errors from propagating out of the chip. This test mode turns off such ECC to test the basic chip sensitivity. |
| $V_{\rm EE}$ tests | Change the operating voltage of the chip to see whether there are abrupt changes in SER with voltage. Some chips have very small operating ranges, and have operating voltage thresholds beyond which they are unusable. |
| Refresh tests | The SER of DRAM chips depends significantly on the refresh time. Tests determine the SER versus refresh times (typically from about 0.1 to 10 ms). |
| Temperature tests | Test the variation of chip SER with operating temperature (usually only for bipolar memories). See later discussion. |
| Standby mode | Modern low-power memories may have standby modes, which draw less power. The cell is loaded with a pattern, put into standby mode for irradiation, and then returned to normal mode for interrogation. In general, a cell in standby mode is much harder (smaller SER) than in normal operating mode. |
| Battery mode | Some memories have onboard batteries to prevent data loss when there is no power. The cell is loaded with a pattern, put into battery mode for irradiation, and then returned to normal mode for interrogation. |

### **Experimental results**

Below, we briefly review our overall conclusions after testing more than 80 different LSI circuits.

### • Orientation effects

Cosmic ray nucleons hit circuits from all directions. To simulate this effect, the chips are sometimes mounted on goniometers, which can rotate the chip so that it can be hit from any direction. Extensive testing has shown that chip orientation has, at most, a 2× effect on most circuit SERs. As an example, consider two 4096-bit bipolar chips called chip A and chip B. These two chips have very similar SER cross sections and active device areas, but they are geometrically quite different. The A cell device is almost square, with dimensions $31 \times 32~\mu\text{m}$, while the B cell device is rectangular, $22 \times 50~\mu\text{m}$. The device areas are $992~\mu\text{m}^2$ for chip A and $1100~\mu\text{m}^2$ for chip B. The total array areas of both 4096-bit chips are also almost identical: $A = 8~\text{mm}^2$ and $B = 9~\text{mm}^2$. Full-orientation SER measurements were made not only by varying the angle of the proton beam relative to the circuit plane (we call this the "polar" angle of rotation), but also by rotating the circuit to various azimuthal angles while the beam had a glancing incidence to the circuit plane (beam at 90°). This azimuthal rotation allowed us to probe the B chip with a glancing beam down the long side and with the beam entering the narrow side of the cell devices. Experiments were conducted at 70, 110, 148, and 800 MeV with six chips, each from a different chip lot. The result of these experiments was a maximum SER variation with orientation of only ±32%.

### Figure 7

Variation of bipolar SRAM failure cross sections with energy. This figure shows experimental SER cross sections per bit for recent bipolar memories versus nucleon energy. We include only chips with memory densities above 16Kb manufactured from 1988 to 1994. The most important feature of this figure is the wide variability in failure cross sections, with the sea-level chip SER varying by over $100\times$, from about 0.003 to 0.4 fails per year for 128Kb chips.

### Figure 8

Variation of DRAM failure cross sections with energy. This figure shows experimental SER cross sections per bit for recent DRAM memories versus nucleon energy. We include only chips with memory densities from 1 to 16Mb manufactured from 1988 to 1994. There was no significant difference among the various DRAM cell designs: single capacitor, stacked capacitor, or trench. The most important feature of this figure is the wide variability in failure cross sections, with the sea-level chip SER varying by over $100\times$, from about 0.002 to 0.3 fails per year for 1Mb chips.

### • Effects of nucleon energy on SER cross sections

We have measured about 80 different chip types which were manufactured from 1969 to 1994 by many different manufacturers. From this database, it is possible to establish a general pattern of chip sensitivity to nucleons as a function of the nucleon energy. SER cross sections increase with energy up to 800 MeV, above which we have made no measurements.

We begin our discussion of our experimental results on chip SER by showing the broad picture of experimental SER cross sections (per bit) for bipolar SRAMs (Figure 7), modern FET DRAMs (Figure 8), and CMOS SRAMs (Figure 9). These figures give the range of experimental SER for various LSI memory technologies, and also indicate significant differences in their variation with nucleon energy.

### Figure 9

Variation of CMOS SRAM failure cross sections with energy. This figure shows experimental SER cross sections per bit for recent CMOS SRAM memories versus nucleon energy. We include only chips with memory densities from 64Kb to 1Mb manufactured from 1988 to 1994. One feature of this figure is the narrower variability in failure cross sections; however, the band includes only six different chips. The sea-level chip SER varies from about 0.01 to 0.3 fails per year for 1Mb chips.

The bipolar SRAM chip cross sections shown in Figure 7 have great variability, with a range extending over  $100\times$  in the final SER per bit. These cross sections increase rapidly at low energies, changing slope at about 200 MeV and then continuing to increase up to the highest beam energy we have used, 800 MeV.

In contrast to bipolars, the DRAM experimental cross sections shown in Figure 8 have a slower increase in cross section with energy. We show only the results for modern DRAMs, with array sizes of 1–16 Mb, to illustrate how even these chips with similar specifications and performance have widely different SER values, ranging over  $100\times$ , from 0.002 to 0.3 fails per chip-year for 1Mb chips.

Shown in Figure 9 is the distribution of SER cross sections for CMOS SRAMs. The most striking feature of these chips is their insensitivity to nucleon energy, with less than 4× cross-section change from 30 to 800 MeV.

### • Low-energy threshold of SER cross sections

In published papers on the SER of circuits for space and military applications, many experiments have been reported which show that there is a lower-energy threshold for the SER cross sections. These thresholds usually occur at about 30–60 MeV, below which the cross sections drop to zero. The concept of a minimum required charge for upset, $Q_{\rm crit}$, is that a minimum charge injection is necessary to change binary information, i.e., cause a soft fail. The complicated subject of circuit $Q_{\rm crit}$ is discussed in detail in Reference [22]. Further, nuclear reactions between protons and circuit materials have a threshold of a few MeV, below which absolutely no nuclear fragmentation occurs. This nuclear cross-section threshold is due to the Coulomb repulsion between a proton and a nucleus, and if the proton cannot get within $10^{-12}$ cm of the nucleus, there cannot be a nuclear reaction.

To test this low-energy threshold hypothesis, experiments were run with more than a dozen bipolar and DRAM chips from 20 MeV to 800 MeV. For all cases, there was no observed threshold below which no fails occurred. Testing with proton energies below 20 MeV is difficult, since the protons lose significant energy while penetrating the typical 10 $\mu$m of surface layers. Therefore, these experiments were extended by testing with neutrons at 14 MeV.<sup>2</sup> Previous unpublished work at Boeing Co. by Eugene Normand had indicated some skepticism about low-energy thresholds from their work using 14-MeV neutrons.<sup>3</sup> We tested eight different parts, including bipolars and CMOS memories, observing fails in all cases. We show typical results in **Figure 10** for bipolar memory chips ranging from 7 Kb to 32 Kb (these were the only chips tested over the complete energy span of 14–800 MeV). The experimental 14-MeV neutron failure cross sections were not quite smooth extensions of the higher-energy proton points, but the dosimetry for the neutron beam was much more difficult, since the neutron beam was quite divergent at the point where it intercepted the chip. We estimate that the experimental accuracy for these neutron cross sections was about a factor of 3.

### Figure 10

Low-energy SER cross sections. Eight modern circuits were tested to determine their low-energy threshold for soft fails by using a 14-MeV neutron beam. All of the chips failed, at a level consistent with a low-energy extrapolation of higher-energy proton beam measurements. The illustration shows the experimental failure cross sections for the three bipolar chips which were measured from 20 to 800 MeV with proton beams, and then with the 14-MeV neutrons. All chips showed significant errors at 14 MeV, and supported the conclusion that there was no low-energy nucleon threshold down to this low energy.

**Table 6** SER process variation within the same chip family.

| Circuit type  | Number of<br>chips | SER maximum variation<br>from mean SER<br>(±%) |
|---------------|--------------------|------------------------------------------------|
| 4Kb bipolar   | 26                 | 16                                             |
| 4Kb bipolar   | 12                 | 19                                             |
| 64Kb bipolar  | 11                 | 1300                                           |
| 1Mb DRAM      | 4                  | 183                                            |
| 4Mb CMOS SRAM | 6                  | 20000                                          |

The clear conclusion is reached from these 14-MeV neutron experiments that all LSI chips fail down to nucleon energies of 14 MeV, and probably down to the point of the nuclear reaction thresholds. (Note that we have tested only commercial chips, and no military "radiation-hardened" circuits.)

These results indicate that the circuits may have a much larger sensitivity to low-energy nucleons than previously thought. This low-energy sensitivity will make SER predictions much more difficult, since low-energy nucleons in the cosmic ray flux come mostly from nearby materials such as the walls and ceilings of a building. Since these are made from many different materials, the local flux of low-energy nucleons may vary by more than an order of magnitude (this subject is discussed in greater detail in the paper "Terrestrial Cosmic Rays" in this journal [2]).

### • Process variations on circuit SER cross sections

The term *process variations* refers to the small variations between identical chips due to slight changes in manufacturing techniques. The effect of process variations on SER cross sections has been discussed in extended detail in Reference [22]. This analysis shows that within acceptable manufacturing tolerances, if everything worked in the wrong way, an SER variation of $100\times$ (10000%) might be found in chip SER. Experimentally, this variation is determined by measuring many identical chips from different manufacturing lots and comparing their SER. Table 6 lists the typical variation found for parts from several different manufacturers.

### Figure 11

Critical charge vs. temperature for various chips. Shown is the critical charge of various chips as a function of chip temperature as calculated using ASTAP modeling [23]. Only the effect on the $Q_{\rm crit}$ of the collector-substrate volume is shown, since this is the most sensitive part of these circuits. The 4Kb bipolar chip has a dramatic change in $Q_{\rm crit}$, and since uncooled chips normally run very hot, one would expect a dramatic change in SER between hot and cooled operation. Experimentally, almost no change in SER cross section vs. temperature was found for these chips.

The results for the small bipolar circuits are within the accuracy of the SER experiments, and show excellent reproducibility. The later, 64Kb bipolars show a significant increase in SER distribution due to process variation. The process variation for DRAM chips shows a larger variation than does that for bipolars, possibly because they use aggressive cutting-edge technology. The 4Mb example in Table 6 is the most variable commercial chip we have measured, and is not typical.

• Effect of chip temperature on SER cross sections
Chips which are run hot are expected to have $Q_{\rm crit}$ values different from those of chips run at normal operating temperatures (see **Figure 11**). This change in $Q_{\rm crit}$ is caused by mobility changes in doped silicon, and hence by changes in device gain. As an example, a 4Kb bipolar chip may have a normal $Q_{\rm crit} \sim 220$ fC when operated at 50°C, but if this chip is operated without cooling, it may rapidly reach temperatures of 120–130°C. At this elevated

<sup>2</sup> SER experiments using 14-MeV neutrons were run at the U.S. Naval Academy, Annapolis, MD, in December 1990. We are indebted to our host, Professor Martin Nelson of the Nuclear Engineering Department.

<sup>3</sup> W. R. Doherty and E. Normand, "Advances in Neutron SEU Analysis for Avionics in Natural and Weapons Environment," *Boeing Document No.* D180-29353-4, Boeing Corporation, 1989, unpublished.

**Table 7** Effect of operating temperature on bipolar chip SER cross sections.

| Chip<br>year | Circuit<br>size<br>(Kb) | Cold<br>temperature<br>(°C) | Cold SER cross<br>section<br>$(10^{-12} \text{ cm}^2)$ | Hot<br>temperature<br>(°C) | Hot SER cross<br>section<br>$(10^{-12} \text{ cm}^2)$ |
|--------------|--------------------------------|-----------------------------|---------------------------------------------------|----------------------------|---------------------------------------------------|
| 1981         | 9                              | 62                          | 10                                                | 133                        | 10                                                |
| 1982         | 4                              | 59                          | 29                                                | 80                         | 27                                                |
| 1982         | 1                              | 56                          | 67                                                | 95                         | 56                                                |
| 1985         | 4                              | 63                          | 26                                                | 126                        | 21                                                |
| 1989         | 64                             | 28                          | 1.8                                               | 78                         | 1.4                                               |
| 1989         | 32                             | 41                          | 2.8                                               | 72                         | 2.5                                               |
| 1990         | 7                              | 23                          | 21                                                | 63                         | 9.3                                               |

temperature, the estimated $Q_{\rm crit}$ of the device increases to more than 300 fC. Thus, the chip should become more resistant to soft fails at elevated temperatures. Experimentally, this effect was not observed; there was little change in chip SER as a function of temperature. This chip is used as an example because ASTAP modeling [23] predicts that it undergoes the largest shift in $Q_{\rm crit}$ over the temperatures which could be achieved without external chip heating. The experimental SER for the other chips modeled in Figure 11 also did not change with operating temperature.

Table 7 illustrates the lack of dependence of bipolar circuit SER cross sections on operating temperature. Each SER number is the average of several independent measurements (SER is in units of 10<sup>-12</sup> cm<sup>2</sup>). Our conclusion from these tests is that bipolar SRAM chips show little change in SER within their normal operating temperature range of 40–85°C. Since FET and CMOS chips do not normally run hot, they were not included in these special temperature-sensitivity experiments.

• Effect of pinching voltage on SER cross sections
Originally, there were believed to be two basic quantities which dominate circuit SER: the sensitive volume and the critical charge, $Q_{\rm crit}$, of the circuit. As discussed in detail in Reference [22], this is a simplistic view because, for example, changes in on-chip signal shape can also cause large changes in SER, and the concept of $Q_{\rm crit}$ is only a crude parameter in discussing the theoretical SER of a chip. The concept of $Q_{\rm crit}$ has remained viable because it allows some comparison of relative chip SER, especially if one does not have access to the details of the chip design.

Based on the discussion in the section on orientation effects on SER, the individual shapes of devices are believed to be a secondary effect compared to the device-sensitive volumes. To determine the importance of $Q_{\rm crit}$ with respect to circuit SER, various circuits were measured, with the value of $Q_{\rm crit}$ being varied by changing the circuit operating voltage. By using

calculations made with the ASTAP circuit modeling programs [23], the circuit $Q_{\rm crit}$ was established both as a function of circuit voltages and as a function of chip temperature. A typical result of this modeling is shown in Figure 12 for both the temperature and voltage dependence of an IBM bipolar 1Kb memory chip, circa 1986. As the operating voltage decreases, the $Q_{\rm crit}$ of the circuit decreases. (Only the collector-substrate $Q_{\rm crit}$ is shown because it dominated the theoretical circuit SER.) Experimental examples of chip SER variation with calculated $Q_{\rm crit}$ are shown in Figure 13. These cross sections were determined using protons at 148 MeV, and varying $Q_{\rm crit}$ by changing operating voltage. Note that pinching the 1Kb chip from 220 fC to 60 fC (with the temperature kept constant) changes the chip SER by a factor of 400%.

### **Typical SER values for DRAM chips**

This section tabulates typical computer DRAM memory chips, and their measured SER, to show the range of typical values. We do not show the results for either bipolar or CMOS SRAM chips, since these chips may be custom-made for various applications and may use widely different technologies. Their SER varies by over two orders of magnitude (per bit), and it is not scientific to compare their raw SER values without extensive notation of their individual technologies, which is beyond the scope of this review paper. However, DRAM memory chips for main memory are reasonably similar in characteristics. We show in Table 8 a review of the SER of standard memory chips from most of the main suppliers.

Of particular note is the variation in SER by more than  $100 \times$  between chips from different sources. Also, there is a clear improvement with time from the weak chips in early release to the mature chips manufactured several years later, which may be due both to circuit redesign and to improved process control.

• Predicting chip SER from limited experimental data
As shown in Figure 5, the SER of a chip is caused mostly by nucleons with energies from about 20 to 1000 MeV.





### Figure 12

Theoretical critical charge of a bipolar memory chip. Shown is the theoretical critical charge of a high-speed 1Kb bipolar memory chip, manufactured about 1986, as a function of circuit voltage and also chip temperature. As the voltage drops, the amount of charge needed to flip the circuit cells decreases, so the circuit becomes more sensitive and its SER increases (higher probability for a soft fail). As the chip temperature changes, the semiconductor mobility changes, which leads to modified device gain (see also Figure 11). For the case of the chip shown above, the critical charge should increase with temperature. At its nominal 2.2 V and at 25°C it should have the same $Q_{\rm crit}$ as the same chip at 0.8 V but operated at 115°C. (Calculations by M. Nicewicz, IBM.)

Since it takes considerable effort to test chips over this large energy range (no single accelerator covers this large energy span), we have developed methods to extrapolate data taken over a small energy range to predict the full sea-level SER. The basic premise is to establish a firm failure cross section at a middle energy (we use 150 MeV), and take enough data to establish the slope at that energy (slope of SER vs. energy). From these two numbers we can estimate the total sea-level SER to an accuracy of about 3×. (Since SER prediction using full testing over 38–1000 MeV has an accuracy of about 2×, this limited testing increases the inaccuracy by only 50%.) Similar methods for predicting fail rates for satellite electronics in orbit have been suggested by the U.S. Naval Research Laboratory [24].

We have run SER experiments on more than 80 different memory chips manufactured from 1970 to 1994. Of these, we have measured 31 chips over the full particle range of 50–800 MeV. We discuss below the scaling we have developed for predicting the sea-level SER based on only limited low-energy accelerated testing. Separate sections cover bipolar SRAM memory chips, DRAM memory chips, and CMOS SRAM memory chips.

### Figure 13

Changing SER by pinching chip voltage. The SER of circuits may be changed by lowering the operating voltage. This lowers the $Q_{\rm crit}$ of the circuit and makes the chip more sensitive. The plot shows the SER changes for various high-speed bipolar memory chips as a function of their theoretical $Q_{\rm crit}$. All cross sections were measured using 148-MeV proton beams. Similar data were taken at 800 MeV and showed similar curves, but with different slopes. This indicates that circuit $Q_{\rm crit}$ and nucleon energy are coupled together; they cannot be considered to be independent parameters of a chip's SER.

**Table 8** Typical SER data: DRAM memory chips.

| Chip<br>source | Bits per<br>chip | SER per chip*<br>(fails/yr) | Year<br>tested |
|----------------|------------------|-----------------------------|----------------|
| Mfr. 1         | 4Mb              | 0.00046                     | 1993           |
| Mfr. 2         | 4Mb              | 0.0026                      | 1993           |
| Mfr. 3         | 4Mb              | 0.024                       | 1992           |
| Mfr. 4         | 4Mb              | 0.038                       | 1990           |
| Mfr. 5         | 4Mb              | 0.09                        | 1990           |
| Mfr. 4         | 1Mb              | 0.0018                      | 1989           |
| Mfr. 5         | 1Mb              | 0.0041                      | 1989           |
| Mfr. 2         | 1Mb              | 0.0050                      | 1989           |
| Mfr. 2         | 1Mb              | 0.15                        | 1987           |
| Mfr. 3         | 1Mb              | 0.36                        | 1987           |
| Mfr. 6         | 256Kb            | 0.24                        | 1988           |
| Mfr. 7         | 256Kb            | 0.2850                      | 1989           |
| Mfr. 3         | 256Kb            | 0.38                        | 1986           |
| Mfr. 1         | 256Kb            | 0.52                        | 1987           |
| Mfr. 2         | 288Kb            | 1.14                        | 1986           |

<sup>\*</sup>Units are per chip, regardless of number of bits per chip.

Experimental results on bipolar SRAM memory chips
Our most extensive measurements have been done on bipolar SRAM memory chips. We show in Figures 14 and 15 a selection of bipolar SRAM memory chips which have been measured over a wide span of particle energies. The



### Figure 14

Slopes of measured soft-fail cross sections for bipolar SRAM chips. The figure shows the measured cross sections for a selection of bipolar memory chips manufactured from 1973 to 1989. Indicated at the end of each experimental line is the date of chip manufacture, which is associated with the lithographic linewidth of that generation of chip. All soft-fail data have been normalized so that the cross section at 150 MeV=1.0. As noted in the text, there is a general tendency for the slope of the cross section versus particle energy to become steeper as the device linewidth shrinks.



### Figure 15

Slopes of measured soft-fail cross sections for bipolar SRAM chips. This figure is identical to Figure 14, except that the chips are identified by their memory size. Note that there is no clear pattern relating the slope of the cross section for failure to memory size for bipolars.

**Table 9** Factors for scaling bipolar 150-MeV cross sections to sea-level SER.

| Soft-fail slope*<br>(50–150 MeV) | Sea-level SER<br>(fails/hr-cm <sup>2</sup> ) | Sea-level SER<br>(fails/yr-cm <sup>2</sup> ) |
|----------------------------------|----------------------------------------------|----------------------------------------------|
| 3.0                              | 18.6                                         | $16.3 \times 10^4$                           |
| 2.5                              | 15.7                                         | $13.8 \times 10^4$                           |
| 1.6                              | 13.5                                         | $11.8 \times 10^4$                           |

<sup>\*</sup>See definition of slope in text.

plots were normalized so that the failure cross section = 1.0 at 150 MeV. These data plots show that there is a marked change in SER slope which can be somewhat related to the chip manufacturing date (Figure 14), but not to the memory size of the chip (Figure 15). One may roughly associate manufacture date with device linewidth, but it is not clear that this is what causes the change in SER slope, since so many other factors with respect to device design changed simultaneously.

From the experimental data of 52 bipolar SRAM chips (typical data are shown in Figure 14), we have developed rules of thumb for scaling from experimental fail cross sections to sea-level chip SER. The scaling factor is based on the slope<sup>4</sup> of the fail cross sections from 50 to 150 MeV, since it is relatively easy to obtain experimental data for this range of beam energy. See **Table 9** for the scaling factors.

As an example of using Table 9, assume that the fail cross section for a bipolar memory chip is  $1 \times 10^{-7}$  cm<sup>2</sup>/chip for 50-MeV particles, and  $3 \times 10^{-7}$  cm<sup>2</sup>/chip for 150-MeV particles. Then the cross-section slope from 50 MeV to 150 MeV is 3. Using row 1 of Table 9 (slope of 3.0), the chip sea-level SER =  $(3 \times 10^{-7}$  cm<sup>2</sup>/chip)(18.6 fails/hr-cm<sup>2</sup>) =  $5.6 \times 10^{-6}$  fails/chip-hr = 0.049 fails/chip-yr. If 100 of these chips are used in a system, the fail rate will be about five fails per year.
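The arithmetic of this example can be sketched in a few lines of Python. This is a minimal illustration and not the authors' tooling: the function name `bipolar_sea_level_ser` is hypothetical, the choice of the nearest tabulated slope is an assumption (Table 9 lists only three slopes and the text does not say how to treat intermediate values), and 8766 hours per year is assumed for the yearly conversion.

```python
# Scaling factors from Table 9: slope (sigma_150 / sigma_50) -> fails/hr-cm^2.
TABLE_9 = {3.0: 18.6, 2.5: 15.7, 1.6: 13.5}
HOURS_PER_YEAR = 8766  # assumption: 365.25 days; 8760 gives the same rounded results


def bipolar_sea_level_ser(sigma_50, sigma_150):
    """Return (slope, fails/chip-hr, fails/chip-yr) for a bipolar chip.

    sigma_50 and sigma_150 are fail cross sections in cm^2/chip at 50 and
    150 MeV. Using the nearest tabulated slope is an assumption of this sketch.
    """
    slope = sigma_150 / sigma_50
    nearest = min(TABLE_9, key=lambda s: abs(s - slope))
    per_hour = sigma_150 * TABLE_9[nearest]
    return slope, per_hour, per_hour * HOURS_PER_YEAR


slope, per_hour, per_year = bipolar_sea_level_ser(1e-7, 3e-7)
print(f"slope = {slope:.1f}")
print(f"{per_hour:.2e} fails/chip-hr, {per_year:.3f} fails/chip-yr")
print(f"100 chips: {100 * per_year:.1f} fails/yr")
```

With the cross sections quoted above, the sketch reproduces the per-chip rate of the worked example to within rounding.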

The bipolar memory chip data may also be analyzed using the technique indicated in Figure 5. For the chips with a high slope (row 1 in Table 9), we find that 80% of the sea-level fails are caused by nucleons in the energy band of 70–3900 MeV. In contrast, the chips with only a slight change in cross section with energy (row 3 in Table 9) have their 80% energy band over the narrower range of 30–900 MeV. Thus, these latter chips may show more variability in actual field tests, since they are very sensitive to the final low-energy nucleons generated in the nearby ceilings and walls [2].

<sup>4</sup> Note: The term slope is used here to mean specifically the change in cross section from 50 MeV to 150 MeV. The cross section over this range usually has the shape $\sigma = aE^b$, where $\sigma$ is the cross section (cm²), $a$ and $b$ are constants, and $E$ is the energy (MeV). On a log-log plot (see Figure 3), this equation is a straight line. As an example, for cross sections of 1 cm² (50 MeV) and 3 cm² (150 MeV), the equation is $\sigma = 0.02E^{1.0}$; for an increase of 1 cm² (50 MeV) to 2 cm² (150 MeV), it is $\sigma = 0.0847E^{0.631}$. To obtain SER factors for slopes other than for 150/50 MeV, the equations must be used to convert to equivalent numbers.
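The constants in this footnote follow from fitting $\sigma = aE^b$ through two measured points. The sketch below assumes a pure power law between the two energies, as the footnote describes. The helper names `power_law_through` and `slope_50_to_150` are hypothetical, and the second helper is only one possible reading of "convert to equivalent numbers".

```python
import math


def power_law_through(e1, s1, e2, s2):
    """Return (a, b) such that sigma = a * E**b passes through both points."""
    b = math.log(s2 / s1) / math.log(e2 / e1)
    a = s1 / e1 ** b
    return a, b


def slope_50_to_150(e1, s1, e2, s2):
    """Equivalent 150/50 MeV cross-section ratio for data taken at other energies."""
    _, b = power_law_through(e1, s1, e2, s2)
    return (150 / 50) ** b


# The two cases quoted in the footnote (energies in MeV, cross sections in cm^2).
a3, b3 = power_law_through(50, 1.0, 150, 3.0)
a2, b2 = power_law_through(50, 1.0, 150, 2.0)
print(f"slope 3: sigma = {a3:.4f} * E^{b3:.3f}")
print(f"slope 2: sigma = {a2:.4f} * E^{b2:.3f}")
```

Data taken at, say, 60 and 120 MeV can then be converted with `slope_50_to_150` before looking up a scaling factor.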

Experimental results on DRAM memory chips
Experimental results from DRAMs might be expected to be more difficult to scale than bipolars, since there are fundamentally different structures used for their storage nodes. The original surface-capacitor design which was used for chips less than 1Mb in size now has two siblings: surface stacked capacitors (NEC, Hitachi, etc.) and trench capacitors (IBM, Siemens, etc.). However, there is less difference between the soft-fail cross sections of these different types of LSI designs than we found for bipolars; see Figure 16. Since there is little difference in the experimental slopes of cross section vs. nucleon energy, we can estimate the full SER from a cross section at one medium energy. The cross-section factor which may be used to estimate the total chip sea-level SER is shown in Table 10. For example, a 16Mb stacked capacitor DRAM chip with a cross section at 150 MeV of $0.2 \times 10^{-12}$ cm<sup>2</sup>/bit will have a predicted sea-level fail rate: SER = $(0.2 \times 10^{-12} \text{ cm}^2/\text{bit})(16777216 \text{ bits})(15.4 \text{ fails/hr-cm}^2) = 5.2 \times 10^{-5}$ fails/hr = 0.45 fails/yr. A system with 100 chips will have a fail rate of about one per week. Note that we have tested only a few of the newer DRAM design types (trench or stacked capacitor types), so our estimates are less accurate than for the bipolar memory chips.

Experimental results on CMOS SRAM memory chips
Experimental results on CMOS SRAM chips have been limited to six chips, ranging from 128Kb to 1Mb (Figure 17). There are two basic types of CMOS SRAM designs, one with a four-device structure and one with a six-device structure (called respectively A and B in Figure 17). The cross-section results for the type-A structures had little slope with particle energy, which was very similar to the cross sections found for DRAM chips (Figure 16). The cross sections for the six-device cells were more energy-dependent. The factor which may be used to estimate the total chip SER is shown in Table 11. For example, a 1Mb CMOS four-device SRAM chip with a cross section at 150 MeV of $0.2 \times 10^{-12}$ cm<sup>2</sup>/bit will have a predicted sea-level fail rate: SER = $(0.2 \times 10^{-12} \text{ cm}^2/\text{bit})(1048576 \text{ bits})(16 \text{ fails/hr-cm}^2) = 3.4 \times 10^{-6}$ fails/hr = 0.03 fails/yr. A system with 100 chips will have a fail rate of about three per year.
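For DRAM and CMOS SRAM chips the estimate needs only a per-bit cross section at 150 MeV, the bit count, and the factor for the cell type. The following is a minimal sketch of that calculation. The dictionary keys and the function name `chip_ser_per_hour` are hypothetical labels, the factors are copied from Tables 10 and 11, and 8766 hours per year is an assumption.

```python
# Scaling factors (fails/hr-cm^2) copied from Tables 10 and 11.
FACTORS = {
    "dram_planar": 16.9,
    "dram_trench": 13.8,
    "dram_stacked": 15.4,
    "cmos_sram_4_device": 16.0,
    "cmos_sram_6_device": 12.0,
}
HOURS_PER_YEAR = 8766  # assumption: 365.25 days


def chip_ser_per_hour(sigma_bit_150, bits, cell_type):
    """Sea-level fails/chip-hr from a per-bit cross section (cm^2/bit) at 150 MeV."""
    return sigma_bit_150 * bits * FACTORS[cell_type]


dram = chip_ser_per_hour(0.2e-12, 16 * 2**20, "dram_stacked")
sram = chip_ser_per_hour(0.2e-12, 2**20, "cmos_sram_4_device")
for name, rate in (("16Mb stacked DRAM", dram), ("1Mb 4-device SRAM", sram)):
    print(f"{name}: {rate:.2e} fails/hr, {rate * HOURS_PER_YEAR:.3f} fails/yr")
```

Both worked examples in the text (the 16Mb stacked-capacitor DRAM and the 1Mb four-device SRAM) are reproduced to within rounding.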

### Accelerated testing vs. field testing

### • Field testing procedures

"Field testing" of SER is the real-time determination of the soft-fail rate of a chip from all causes. It is done by observing many chips under normal operating conditions for a long time and logging their fail rate. The SER of a circuit is caused both by cosmic rays hitting the chip and by particles from radioactive trace contaminants in circuit materials and modules. In addition, there may be soft fails



### Figure 16

Slopes of measured soft-fail cross sections for DRAM chips. This figure shows the measured cross sections for a selection of DRAM chips from various manufacturers, normalized at 150 MeV to clarify the relative slopes. The most distinctive feature of these chips is the relative flatness of the cross section with energy, with little change from 30 MeV to 800 MeV, in contrast to the marked slopes shown in the SRAM data (see Figure 14). The various types of DRAM construction are not shown, since the effect of this on the total SER is quite small (see Table 10). Our best-guess interpretation of the flatness of the sensitivity with particle energy is that the sensitive volume of the devices is quite small, and any nuclear interaction within that volume causes an upset, regardless of size. One way to test this hypothesis is with 14-MeV neutrons, which we have not done.

**Table 10** Factors for scaling DRAM 150-MeV cross sections to sea-level SER.

| Bit cell structure | Sea-level SER<br>(fails/hr-cm <sup>2</sup> ) | Sea-level SER<br>(fails/yr-cm <sup>2</sup> ) |
|--------------------|----------------------------------------------|----------------------------------------------|
| Planar capacitor   | 16.9                                         | $14.7 \times 10^4$                           |
| Trench capacitor   | 13.8                                         | $12.0 \times 10^4$                           |
| Stacked capacitor  | 15.4                                         | $13.4\times10^4$                             |

**Table 11** Factors for scaling CMOS SRAM 150-MeV cross sections to sea-level SER.

| CMOS SRAM<br>design | Sea-level SER (fails/hr-cm <sup>2</sup> ) | Sea-level SER (fails/yr-cm <sup>2</sup> ) |
|---------------------|-------------------------------------------|-------------------------------------------|
| 4-device cell       | 16                                        | $14 \times 10^{4}$                        |
| 6-device cell       | 12                                        | $10.5 \times 10^4$                        |



### Figure 17

Slopes of soft-fail cross sections for CMOS SRAM chips. This figure shows the measured cross sections for a selection of SRAM chips from various manufacturers, normalized at 150 MeV to clarify the relative slopes. The curves marked A are for four-device cells, while those marked B are for six-device cells. The number associated with each curve is the number of bits per chip. The four-device cell chips show a relative flatness of their cross section with energy, with little change from 70 to 256 MeV. The six-device cells show significant change of cross section with energy, similar to that found for bipolar memory chips; see Figure 14. These six-device cell chips are measured using particle beams from 40 to 800 MeV.

caused by power-line noise, nearby electromagnetic pulses, process variations, circuit malfunctions, or even software bugs. The field test SER of a circuit attempts to measure the chip SER in a normal operating environment.

By building isolated and quiet testers, and operating them in controlled environments, all SER but that induced by cosmic rays and radioactive particles should be eliminated. The SER results from these testers combine these two types of particle SER, but by operating a tester under shielding the cosmic component may be eliminated, leaving only the SER due to radioactive contaminants. As noted in the paper "Terrestrial Cosmic Rays" [2], 96% of the cosmic portion of the particle SER may be eliminated by running tests under shielding of ten feet of concrete. When both the total particle SER and the shielded SER of a circuit have been measured, the cosmic SER can be found by subtracting the two numbers. It is this number which is predicted by the accelerated testing of this report.

A second method is to obtain fail rates at two or more altitudes. For example, Denver has about four times greater cosmic ray intensity than sea-level cities. If the observed total fail rate in Denver is 1700 and the fail rate in Boston is 500, one can solve for the radioactive SER (= 100) and the cosmic SER (= 400). This is the method used in the "statistical analysis" quoted in Table 12 and defined below.
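The two-altitude separation amounts to solving two linear equations. The sketch below assumes that the radioactive contribution is the same at both sites and that the cosmic contribution scales linearly with cosmic ray intensity. The function name `split_ser` is hypothetical.

```python
def split_ser(rate_low, rate_high, cosmic_ratio):
    """Separate total fail rates at two sites into (radioactive, cosmic) parts.

    Assumes rate = radioactive + intensity * cosmic, with intensity 1 at the
    low-altitude site and `cosmic_ratio` at the high-altitude site.
    """
    cosmic = (rate_high - rate_low) / (cosmic_ratio - 1)
    radioactive = rate_low - cosmic
    return radioactive, cosmic


# Boston (sea level) vs. Denver (about 4x the cosmic ray intensity).
radioactive, cosmic = split_ser(rate_low=500, rate_high=1700, cosmic_ratio=4)
print(f"radioactive SER = {radioactive:.0f}, cosmic SER = {cosmic:.0f}")
```

For the Denver and Boston figures quoted above, this returns the radioactive and cosmic components of 100 and 400.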

### Read/write ratio for memories

Not all single-bit memory upsets cause problems. If the upset bit is written over before it is read, the fail is erased without causing a problem. The probability that a fail will be read before it is overwritten has been called the "accessibility ratio," the "sampling correction," the "memory susceptibility," or the memory "read/write ratio." We use the last of these terms, abbreviated as R/W ratio. If a fail occurs and it is read, a parity error occurs, and some systems make a notation in a problem log. In the analysis of fail rates to obtain SERs, one must collect the problem log entries and then divide by the R/W ratio to obtain the number of statistical fails which probably occurred.

The R/W ratio depends on the system software being used. Lengthy studies have been made by many computer scientists to obtain an R/W ratio, and measured values have ranged from 0.26 to 0.45 for the cache memory of some systems, with a value of about 0.38 being commonly used for analysis of data.

• Comparison of accelerated testing vs. field testing

Table 12 lists various IBM reports which make comparisons between predicted cosmic SER (from accelerated testing) and determinations of the same value from analysis of logs or experiments. The first column identifies the reports, which are described in detail in the subsections following the table.

Notes and sources for failure reports

Note 1: The Messina Report This report analyzed the problem logs in high-altitude cities during the year 1985. The analysis surveyed 3798 identical memory chips, and showed a fail rate of 1700 (arbitrary units). A read/write ratio correction of 0.38 corrects that number to 4474 for the chip fail rate at a mean altitude of about 1900 m. Assuming a $4\times$ change in cosmic ray intensity to sea level, the nominal sea-level rate is $4474/4 = 1118$.<sup>5</sup>
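The correction chain in Note 1 can be written out as a short sketch. It follows the note in treating the whole corrected fail rate as scaling with cosmic ray intensity, which is a simplification. The function name `sea_level_rate_from_logs` and its parameter names are hypothetical.

```python
def sea_level_rate_from_logs(logged_fails, rw_ratio, altitude_factor):
    """Estimate the sea-level fail rate from problem-log entries.

    logged_fails:    fails recorded in the logs (arbitrary units)
    rw_ratio:        probability that an upset bit is read before it is overwritten
    altitude_factor: cosmic ray intensity at the site relative to sea level
    """
    actual_fails = logged_fails / rw_ratio
    return actual_fails, actual_fails / altitude_factor


actual, sea_level = sea_level_rate_from_logs(1700, rw_ratio=0.38, altitude_factor=4)
print(f"corrected rate at altitude = {actual:.0f}, sea-level rate = {sea_level:.0f}")
```

The R/W ratio is an input here because, as noted above, measured values range from 0.26 to 0.45 depending on the system software.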

Note 2: The Blue Spruce Experiment This was a field test of 1152 4Kb bipolar chips in Leadville, CO (cosmic intensity 13× sea level), then in Boulder, CO (cosmic intensity 4× sea level), and finally deep underground (zero cosmic intensity). The fail rate observed scaled closely with the predicted cosmic intensity. Zero fails were observed in the ten months of the underground tests. This demonstrates that the SER of these chips was more than 98% due to cosmic rays.<sup>6</sup>

<sup>5</sup> J. G. Pantalone and N. N. Tendolkar, IBM Poughkeepsie, 1984.

**Table 12** Soft-error predictions vs. experimental or log data.

| Note no. | Chip type <sup>a</sup> | Predicted SER <sup>b</sup> | Observed SER <sup>b</sup> | Analysis method <sup>c</sup> | Typical application |
|----------|------------------------|----------------------------|---------------------------|------------------------------|---------------------|
| 1        | 4Kb BP 1               | 1590                       | 1118                      | Statistical analysis         | Cache memory        |
| 2        | 4Kb BP 2               | 1720                       | 1770                      | Field test                   | Fast cache memory   |
| 3        | 4Kb BP 2               | 1720                       | 1300                      | Statistical analysis         | Fast cache memory   |
| 4        | 4Kb BP 3               | 1670                       | 618                       | Statistical analysis         | Cache memory        |
| 5        | 4Kb BP 3               | 1670                       | 1340                      | Field test                   | Cache memory        |
| 6        | 288Kb DRAM             | 130000                     | 126000                    | Field test                   | Main memory         |
| 7        | 1Mb DRAM               | 2700                       | 3000                      | Field test                   | Main memory         |
| 8        | 9Kb Bipolar            | 1600                       | 998                       | Statistical analysis         | In/out channels     |
| 9        | 144Kb CMOS             | 250                        | 210                       | Field test                   | Secondary cache     |

<sup>a</sup> BP = Bipolar memory chip.

Note 3: The Graff Report Analysis of more than 1000 problem logs with special 4Kb bipolar cache chips. The logs were divided into those located in cities with cosmic ray intensities within 2× those of New York City, and those greater than 2×. The logs from higher altitudes showed a fail rate 3.7 times that of sea-level logs (the definition of fail rate is too complex to be included here). The cosmic portion of the SER was found to account for >80% of the fails, based on scaling with cosmic ray intensity. The fail rates reviewed in this study were not analyzed for their local electronic noise ambient, and this may have been a contributing factor.

Note 4: The Farley Report This report analyzed 670 problem logs. No analysis was done as a function of altitude or cosmic ray intensity, so the fail rate is for a sampling of all logs. The analysis presumed that only 47% of the fails were due to cosmic rays, and put 53% into "other causes." We have included a 38% read/write correction in the value reported in the table.<sup>8</sup>

Note 5: The Blue Spruce Experiment This was a field test of 576 bipolar 4Kb memory chips in Leadville, CO (cosmic intensity 13× sea level).<sup>9</sup>

Note 6: The NITETRAIN Experiment A total of 252 288Kb FET DRAM chips were tested at sea level, underground, and at various altitudes, and the results published in an IEEE journal [25].

Note 7: The Antelope Experiment A total of 1000 1Mb FET memory chips were field-tested in the IBM Burlington, VT, laboratory. These chips were replaced almost monthly with a new set of 1000 chips as a standard manufacturing quality test. There were no significant variations in SER in the two years of testing. No altitude experiments have been done to separate cosmic SER from that of residual radioactive contamination.<sup>10</sup>

Note 8: A Control-Store Analysis Analysis of the problem logs of 323 channel cards containing one specific SRAM chip, a 1Kb $\times$ 9 ECL bipolar array. The SERs of cards sited below 2000 ft altitude were compared to those from higher elevations. The fail rate at high altitudes was 3.4 times that for low altitudes.<sup>11</sup>

Note 9: The Dallas Experiment A total of 3100 144Kb six-device CMOS SRAM chips were field-tested in Leadville, CO, and the results corrected for equivalent sea-level values.<sup>10</sup>

<sup>b</sup> Arbitrary SER units.

<sup>c</sup> Analysis method. Predicted: based on accelerated testing using particle beams. Field test: chip tester result using natural cosmic radiation. Statistical analysis: analysis of problem logs; it includes read/write correction, but no correction for building shielding of cosmic rays.

## Comparison of accelerated testing vs. field testing (non-IBM chips)

Results of field testing and accelerated testing have also been compared for DRAM chips from non-IBM commercial sources. (The field-testing results are from G. Fitzgibbon, J. Orro, and G. Unger, IBM Procurement, Poughkeepsie, NY.) Since the design details of these circuits are unknown, the only value of these results for this report is to compare their accelerated-testing SER with their field-testing SER to look for experimental consistency. These DRAM SER values, in units of % fails/khr, are shown in Table 13.

<sup>6</sup> H. Muhlfeld and C. Montrose, IBM ME Division, E. Fishkill, NY.

<sup>7</sup> W. S. Graff, V. W. Morabito, E. L. Swarthout, and V. R. Tolat, IBM Poughkeepsie, internal IBM report, August 1988.

<sup>8</sup> R. T. Farley, private report, IBM E. Fishkill, 1988.

<sup>9</sup> Experiment conducted by H. Muhlfeld and C. Montrose, IBM E. Fishkill,

<sup>10</sup> T. J. O'Gorman and T. Sullivan, IBM Burlington internal report.

<sup>11</sup> S. K. Springer, IBM Burlington Internal Report TR-19.0854, October

**Table 13** Experimental values of chip SER from non-IBM sources.

| Technology type* | RAM size | Field test SER per chip* | Accelerated test SER per chip* |
|------------------|----------|--------------------------|--------------------------------|
| DRAM             | 64Kb     | 0.021                    | 0.012                          |
| DRAM             | 64Kb     | 0.31                     | 0.12                           |
| DRAM             | 64Kb     | 0.053                    | 0.024                          |
| SRAM             | 64Kb     | 0.096                    | 0.011                          |
| DRAM             | 256Kb    | 0.13                     | 0.066                          |
| DRAM             | 256Kb    | 3.4                      | 0.59                           |
| SRAM             | 256Kb    | 0.20                     | 0.22                           |
| DRAM             | 1024Kb   | 0.042                    | 0.407                          |
| DRAM             | 1024Kb   | 0.023                    | 0.246                          |
| DRAM             | 4096Kb   | 0.595                    | 0.733                          |

\*SER units are in %/khr. Note that this unit differs from the SER units of fails/chip-hr by a factor of 100 000. These are the standard units used by many commercial chip manufacturers.
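The unit conversion in the footnote, and the field-versus-accelerated comparison that Table 13 is meant to support, can be sketched in a few lines. This is only an illustration: the function names are ours, the values are copied from Table 13, and the RAM size of the last row is assumed to be in Kb.

```python
# Table 13 rows: (technology, RAM size in Kb, field-test SER, accelerated-test SER).
# SER values are in %/khr, copied from the table. The size of the last row is
# assumed here to be in Kb.
TABLE_13 = [
    ("DRAM", 64, 0.021, 0.012),
    ("DRAM", 64, 0.31, 0.12),
    ("DRAM", 64, 0.053, 0.024),
    ("SRAM", 64, 0.096, 0.011),
    ("DRAM", 256, 0.13, 0.066),
    ("DRAM", 256, 3.4, 0.59),
    ("SRAM", 256, 0.20, 0.22),
    ("DRAM", 1024, 0.042, 0.407),
    ("DRAM", 1024, 0.023, 0.246),
    ("DRAM", 4096, 0.595, 0.733),
]


def percent_per_khr_to_fails_per_chip_hr(ser_percent_per_khr):
    """Convert %/khr to fails/chip-hr: 1 %/khr = 0.01 fails per 1000 h."""
    return ser_percent_per_khr / 100.0 / 1000.0


def field_to_accelerated_ratio(field_ser, accelerated_ser):
    """Ratio of field-test SER to accelerated-test SER (same units for both)."""
    return field_ser / accelerated_ser


for tech, size_kb, field, accel in TABLE_13:
    ratio = field_to_accelerated_ratio(field, accel)
    fails_per_chip_hr = percent_per_khr_to_fails_per_chip_hr(field)
    print(f"{tech} {size_kb:>4}Kb  field = {fails_per_chip_hr:.2e} fails/chip-hr  "
          f"field/accelerated = {ratio:.2f}")
```

A ratio near 1 indicates that the two methods agree for that chip; the further the ratio is from 1, the larger the disagreement.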

### **Experimental sites for SER studies**

We briefly review below various sites which have been used for experimental studies of chip SER. Experimental accuracy and reproducibility are greatly dependent on the site of the experiment, since beam quality is largely beyond the control of the SER tester.

### Harvard University Cyclotron Laboratory

The Harvard University Cyclotron Laboratory contains a 160-MeV proton accelerator which was originally built in 1948 by Lawrence and others of the Harvard faculty. This machine preceded the one at the University of California at Davis (described below) and has about 1/1000 the beam current but more than twice the energy. Protons at 160 MeV have a range of 160 m in air and 9 cm in silicon (3.4 in.). Since the accelerator is a cyclotron, the beam is accelerated in bursts of ions. Each burst has a duration of 200 $\mu$s with a pulse repetition rate of 200 Hz. Each burst contains about $1.5 \times 10^8$ protons in a narrow beam less than one centimeter in diameter.
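The burst figures above fix the time-averaged beam. The following sketch works through that arithmetic; the function name and the returned field names are ours, and the elementary charge is the only number not taken from the text.

```python
ELEMENTARY_CHARGE_C = 1.602e-19  # charge of one proton, in coulombs


def burst_beam_summary(protons_per_burst, burst_duration_s, repetition_rate_hz):
    """Time-averaged and in-burst beam figures for a pulsed proton beam."""
    average_rate = protons_per_burst * repetition_rate_hz   # protons per second
    duty_cycle = burst_duration_s * repetition_rate_hz      # fraction of time beam is on
    in_burst_rate = protons_per_burst / burst_duration_s    # protons per second during a burst
    average_current_a = average_rate * ELEMENTARY_CHARGE_C  # amperes
    return {
        "average_rate": average_rate,
        "duty_cycle": duty_cycle,
        "in_burst_rate": in_burst_rate,
        "average_current_a": average_current_a,
    }


# Values quoted in the text: 1.5e8 protons per burst, 200-microsecond bursts, 200 Hz.
harvard = burst_beam_summary(1.5e8, 200e-6, 200.0)
print(harvard)
```

With the quoted values the average rate is about $3 \times 10^{10}$ p<sup>+</sup>/s, the beam is on about 4% of the time, and the average current is roughly 5 nA.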

The beam exits the accelerator and is steered by magnets into one of three target rooms. The beam exits the vacuum system of the accelerator at the wall of each room through 0.030-in. KAPTON® foils. It penetrates the air of the room at about the diameter of a pencil, with little divergence. About 80% of the experimental beam time of the laboratory is used in medical treatments, especially the treating of carcinomas. The remaining beam time is used for the analysis of the soft fails of electronic devices. The experimental facility for the study of soft fails was described previously (Figure 1). The beam first encounters a "beam spreader" which defocuses the beam. The spreader consists of 3 mm of Pb foil, which spreads the beam into a cone about 4° in diameter. In passing through the spreader, the beam loses 12 MeV of energy, reaching the target area at 148 MeV. If a lower-energy proton beam is desired, "degraders" are placed in the beam immediately after the beam spreader. The degraders are blocks of lucite (Plexiglas) which absorb known amounts of beam energy. Blocks are available for energies from 20 MeV up to 130 MeV. When the beam energy is lowered below 60 MeV, a significant beam energy straggle occurs. At 50 MeV the beam has a half-width of about 5 MeV. At 30 MeV, the straggle is much larger, almost 30 MeV. That is, at 30 MeV there are still some protons with energies above 45 MeV and some below 15 MeV.

### Los Alamos Meson Production Facility

The Los Alamos Meson Production Facility (LAMPF) originally was used for scientific research on muons and pions, which have creation threshold energies of 150 and 350 MeV. It is now also used as a source of synchrotron radiation and for military research projects. The 800-MeV beam occurs in 750-$\mu$s pulses, with a pulse repetition rate of 12 Hz. Each of these pulses is broken down into 2000 "micro-pulses" which can be individually controlled and sent down different beam lines. Each micro-pulse is 750 ns wide, with 5 ns between pulses, with a flux of about $10^6$ p<sup>+</sup>/cm<sup>2</sup>. Thus, the primary beam has a flux of $2.4 \times 10^{10}$ p<sup>+</sup>/s at 800 MeV, or about 20 W.

A significant problem occurred in lowering the beam current to levels which could be used for soft-fail testing. Since the use of the normal micro-pulse would cause multiple fails in most chips, such testing would violate the principle that the chip should have time to recover between successive upsets. IBM built a special beam-scattering system which allowed the further reduction of each micro-pulse by a factor of 100. This system had no effect on other simultaneous users, and allowed reliable testing of the SER of chips. Another major problem in using LAMPF concerns dosimetry, since Faraday cups are not available. In most materials, 800-MeV protons have a range of about a meter, with the further problem of rapidly activating all metals beyond human tolerance. This problem was solved by developing the use of thermoluminescent crystals for dosimetry (described elsewhere in this paper as TLD dosimetry). These LiF crystals were placed over the chips, and after exposure were sent to a special analytical laboratory for dosimetry.
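The beam figures quoted for LAMPF follow from simple pulse counting. The sketch below reproduces the primary proton rate and applies the factor-of-100 reduction of the beam-scattering system. The names are ours, and the code assumes that the quoted $10^6$ figure can be treated as the number of protons delivered per micro-pulse, as the arithmetic in the text implies.

```python
MACRO_PULSE_RATE_HZ = 12             # pulse repetition rate quoted in the text
MICRO_PULSES_PER_MACRO_PULSE = 2000  # micro-pulses per pulse quoted in the text
PROTONS_PER_MICRO_PULSE = 10**6      # assumption: quoted micro-pulse figure, per pulse
BEAM_REDUCTION_FACTOR = 100          # reduction from the beam-scattering system


def primary_proton_rate(macro_rate_hz, micro_per_macro, protons_per_micro):
    """Protons per second in the primary beam, from pulse counting."""
    return macro_rate_hz * micro_per_macro * protons_per_micro


def reduced_protons_per_micro_pulse(protons_per_micro, reduction_factor):
    """Protons per micro-pulse after the beam-scattering system."""
    return protons_per_micro / reduction_factor


rate = primary_proton_rate(
    MACRO_PULSE_RATE_HZ, MICRO_PULSES_PER_MACRO_PULSE, PROTONS_PER_MICRO_PULSE
)
reduced = reduced_protons_per_micro_pulse(PROTONS_PER_MICRO_PULSE, BEAM_REDUCTION_FACTOR)
print(f"primary rate: {rate:.1e} protons/s")
print(f"reduced micro-pulse: {reduced:.0e} protons")
```

The first figure reproduces the $2.4 \times 10^{10}$ p<sup>+</sup>/s quoted above; the second is the per-micro-pulse level at which the chips were actually tested, under the stated assumption.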

### University of California at Davis Cyclotron

The Crocker Nuclear Lab at U.C. Davis contains the second large cyclotron built in the United States (the first being the Harvard cyclotron). This machine sacrificed some energy (maximum energy is 67 MeV instead of 160 MeV as at Harvard) for a greatly increased beam current. The Davis cyclotron produces protons at currents up to $10^{14}$ p<sup>+</sup>/s, in contrast to $10^{10}$ p<sup>+</sup>/s at Harvard. With this extra current, it is feasible to convert the proton beam to a flood of neutrons, and to conduct neutron SER measurements. This process is inefficient, taking about $10^9$ protons to produce a neutron going in the same direction, but neutron fluxes of $10^5$ n/cm<sup>2</sup>-s have been achieved for large-area targets. The principal advantage of the Davis laboratory is that it is unique in the United States in producing usable neutron beams in the energy range of 20-60 MeV.

### Brookhaven National Laboratory

There are five useful facilities at Brookhaven. Two reactors can provide 14-MeV neutrons with good energy resolution and modest dosimetry. One accelerator laboratory can produce 17.5-MeV neutrons with good dosimetry, accurate to about 50%. There is a tandem accelerator facility which can produce proton beams up to 36 MeV with excellent dosimetry, and finally there is a DOD facility dedicated to SER work which has proton beams up to 200 MeV.

The High Flux Beam Reactor (HFBR) is the largest nuclear reactor in the U.S. dedicated to neutron research. The maximum neutron energy is 14 MeV, with a flux near the containment vessel of 10<sup>11</sup> n/cm<sup>2</sup>-s. To obtain energy resolution, a pair of rotating disks is located inside the beam line. Each disk has a radial slit window in it. Only neutrons of specific velocity can go through the slit in one disk and then make it through the second slit located a few meters downstream. The width of the slits determines the neutron velocity resolution, and the disk speed determines the neutron velocity selected for transmission. After velocity filtering, the beam leaves a beam pipe and travels in air into the experimental area. Dosimetry with neutrons is always difficult, but since the neutrons have precise energies, various nuclear cross sections may be used for dosimetry. For high doses, the activation of Mg<sup>24</sup> or Al<sup>27</sup> pellets is used. This technique has been described in detail [26]. It is useful for neutron doses above 10<sup>10</sup> n/cm<sup>2</sup>. For lower doses, the usual technique for dosimetry is to dump the beam into a deep chamber of paraffin with a BF<sub>3</sub> counter in the center. This technique relies on an original high-dose metal activation to calibrate it, since it is only a relative measurement.
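The two-disk velocity filter can be illustrated with a simple time-of-flight relation: a neutron that passes the first slit reaches the second disk after a time L/v, and it is transmitted only if the second slit has rotated into line by then. The sketch below assumes an idealized geometry that is not given in the text (a fixed angular offset between the two slits, both disks turning at the same rate) and uses the nonrelativistic kinetic energy. All parameter values are hypothetical and chosen only to show the relationships.

```python
import math

NEUTRON_MASS_KG = 1.675e-27
JOULES_PER_EV = 1.602e-19


def selected_velocity(disk_separation_m, rotation_hz, slit_offset_rad):
    """Velocity (m/s) for which the flight time between disks equals the time
    the second slit needs to rotate through the offset angle."""
    angular_speed = 2.0 * math.pi * rotation_hz  # rad/s
    return angular_speed * disk_separation_m / slit_offset_rad


def relative_velocity_spread(slit_width_rad, slit_offset_rad):
    """Approximate fractional velocity spread set by the slit angular width."""
    return slit_width_rad / slit_offset_rad


def kinetic_energy_ev(velocity_m_s):
    """Nonrelativistic neutron kinetic energy in eV."""
    return 0.5 * NEUTRON_MASS_KG * velocity_m_s**2 / JOULES_PER_EV


# Hypothetical selector: disks 3 m apart, 100 rev/s, slits offset by 0.5 rad,
# each slit 0.01 rad wide.
v = selected_velocity(3.0, 100.0, 0.5)
spread = relative_velocity_spread(0.01, 0.5)
print(f"selected velocity: {v:.1f} m/s")
print(f"relative velocity spread: {spread:.3f}")
print(f"kinetic energy: {kinetic_energy_ev(v):.4f} eV")
```

In this picture, doubling the disk speed doubles the selected velocity, while narrowing the slits tightens the velocity resolution, which matches the roles the text assigns to disk speed and slit width.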

### University of Indiana Cyclotron Facility

The proton beam available is at 210 MeV, degradable down to 50 MeV. The flux density ranges from $10^7$ to $10^{10}$ p<sup>+</sup>/cm<sup>2</sup>-s.

Plexiglas is a registered trademark of Rohm & Haas Company.

Lucite is a registered trademark of ICI Acrylics Inc.

KAPTON is a registered trademark of E. I. du Pont de Nemours and Company.

### References and notes

- 1. P. C. Murley and G. R. Srinivasan, "Soft-Error Monte Carlo Modeling Program, SEMM," *IBM J. Res. Develop.* **40**, 109 (1996, this issue). See also G. R. Srinivasan, P. C. Murley, and H. K. Tang, *IEEE J. Rad. Eff.* (1994); G. R. Srinivasan, "Modeling the Cosmic-Ray-Induced Soft-Error Rate in Integrated Circuits: An Overview," *IBM J. Res. Develop.* **40**, 77 (1996, this issue) and references therein.
- 2. J. F. Ziegler, "Terrestrial Cosmic Rays," *IBM J. Res. Develop.* **40**, 19 (1996, this issue).
- 3. C. S. Guenzer, E. A. Wolicki, and R. G. Allas, "Single Event Upset of Dynamic RAMs by Neutrons and Protons," *IEEE Trans. Nucl. Sci.* **NS-26**, 5048 (1979). This is the first experimental paper on determining the SER of electronic components from protons.
- 4. J. H. Adams, Jr. and A. Gelman, "The Effects of Solar Flares on Single Event Upset Rates," *IEEE Trans. Nucl. Sci.* **NS-31** (1984).
- 5. D. K. Nichols, W. E. Price, and J. L. Andrews, "The Dependence of Single Event Upset on Proton Energy (15-590 MeV)," *IEEE Trans. Nucl. Sci.* **NS-29** (1982).
- 6. W. L. Bendel and E. L. Petersen, "Proton Upsets in Orbit," *IEEE Trans. Nucl. Sci.* **NS-30**, 4481 (1983).
- 7. G. E. Farrell and P. J. McNulty, "Proton-Induced Nuclear Reactions in Silicon," *IEEE Trans. Nucl. Sci.* **NS-28**, 4007 (1981).
- 8. G. E. Farrell and P. J. McNulty, "Microdosimetric Aspects of Proton-Induced Nuclear Reactions in Thin Layers of Silicon," *IEEE Trans. Nucl. Sci.* **NS-29**, 2012 (1982).
- 9. Y. Patin, C. Humeau, and G. Vidiella, "Etude Experimentale de la Collection de Charges dans des Diodes PN Irradiées par des Ions Lourds," *Ann. de Phys.* **14**, 225 (1989).
- 10. E. L. Petersen, "The Relationship of Proton and Heavy Ion Upset Thresholds," *IEEE Trans. Nucl. Sci.* **39**, 1600 (1992).
- 11. J. G. Rollins, "Estimation of Proton Upset Rates from Heavy Ion Test Data," *IEEE Trans. Nucl. Sci.* **37**, 1961 (1990).
- 12. R. Koga, W. A. Kolasinski, J. V. Osborn, J. H. Elder, and R. Chitty, "SEU Test Techniques for 256Kb Static RAMs and Comparisons of Upset Induced by Heavy Ions and Protons," *IEEE Trans. Nucl. Sci.* **35**, 1638 (1988).
- 13. T. Bion and J. Bourrieau, "A Model for Proton-Induced SEU," *IEEE Trans. Nucl. Sci.* **36**, 2281 (1989).
- 14. J. C. Pickel, B. Lawton, A. L. Friedman, and P. J. McNulty, "Proton-Induced SEU in CMOS/SOS," *IEEE Trans. Nucl. Sci.* **7**, 67 (1989).
- 15. A. B. Campbell, W. J. Stapor, R. Koga, and W. A. Kolasinski, "Correlated Proton and Heavy Ion Upset Measurements on IDT Static RAMs," *IEEE Trans. Nucl. Sci.* **NS-32**, 4150 (1985).
- 16. B. Doucin, Y. Patin, J. P. Lochard, J. Beaucour, T. Carriere, D. Isabelle, J. Buisson, T. Corbiere, and T. Bion, "Characterization of Proton Interactions in Electronic Components," *IEEE Trans. Nucl. Sci.* **41**, 593 (1994).
- 17. E. L. Petersen, J. C. Pickel, J. H. Adams, Jr., and E. C. Smith, "Rate Prediction for Single Event Effects—A Critique," *IEEE Trans. Nucl. Sci.* **39**, 1577 (1992).
- 18. W. L. Bendel and E. L. Petersen, "Predicting Single Event Upsets in the Earth's Proton Belts," *IEEE Trans. Nucl. Sci.* **NS-31**, 1201 (1984).
- 19. H. H. K. Tang, G. R. Srinivasan, and N. Azziz, "Cascade Statistical Model for Nucleon-Induced Reactions on Light Nuclei in the Energy Range 50 MeV-1 GeV," *Phys. Rev. C* **42**, 1598 (1990).
- 20. J. F. Ziegler, P. A. Saunders, and T. H. Zabel, "Portable Faraday Cup for Nonvacuum Proton Beams," *IBM J. Res. Develop.* **40**, 73 (1996, this issue).
- 21. J. F. Ziegler, J. P. Biersack, and U. Littmark, *The Stopping and Range of Ions in Solids*, Pergamon Press, New York, 1985. All stopping powers and ranges of ions in matter quoted in this paper are from this reference.
- 22. L. B. Freeman, "Critical Charge Calculations for a Bipolar SRAM Array," *IBM J. Res. Develop.* **40**, 119 (1996, this issue).
- 23. W. T. Weeks, A. J. Jimenez, G. W. Mahoney, D. Mehta, H. Qassemzadeh, and T. R. Scott, "Algorithms for ASTAP—A Network-Analysis Program," *IEEE Trans. Circuit Theory* **CT-20**, 628-634 (1973).
- 24. W. L. Bendel and E. L. Petersen, "Proton Upsets in Orbit," *IEEE Trans. Nucl. Sci.* **NS-30**, 4481 (1983);
   W. L. Bendel, "Proton-Induced Single Event Upsets in 71 Earth-Satellite Environments," Report No. 5364, U.S. Naval Research Laboratory, Washington, DC, 1984;
   W. L. Bendel and E. L. Petersen, "Predicting Single Event Upsets in the Earth's Proton Belts," *IEEE Trans. Nucl. Sci.* **NS-31**, 1201 (1984);
   W. J. Stapor, J. P. Meyers, J. B. Langworthy, and E. L. Petersen, "Two Parameter Bendel Model Calculations for Predicting Proton Induced Upset," *IEEE Trans. Nucl. Sci.* **37**, 1966 (1990).
- 25. T. J. O'Gorman, "The Effect of Cosmic Rays on the Soft Error Rate of a DRAM at Ground Level," *IEEE Trans. Electron Devices* **41**, 553-557 (1994).
- 26. *Applications of Nuclear Radiation*, Vol. 13, American Society for Testing and Materials, 1976, Ch. 4.

Received September 30, 1994; accepted for publication March 6, 1995

James F. Ziegler IBM Research Division, Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, New York 10598 (ZIEGLER at YKTVMV, ziegler@watson.ibm.com). After receiving B.S., M.S., and Ph.D. degrees from Yale, Dr. Ziegler joined IBM in 1967 at the Thomas J. Watson Research Center, where he now manages the Material Analysis and Radiation Effects group. Most of his research concerns the interaction of radiation with matter. Dr. Ziegler is the author of more than 130 publications and 14 books; he holds 11 U.S. patents. He received IBM Corporate Awards in 1981 and 1990. Dr. Ziegler is a Fellow of the American Physical Society and of the IEEE. He has been awarded the von Humboldt Senior Scientist Prize by the German government.

Hans P. Muhlfeld IBM Microelectronics Division, East Fishkill facility, Route 52, Hopewell Junction, New York 12533 (MUHLFELD at FSHVMFK1). Mr. Muhlfeld is an advisory engineer in Reliability Services at the IBM East Fishkill facility. He joined the Military Products Division of IBM in 1957 at Kingston, New York. After two years at a SAGE installation at McCord AFB, Tacoma, Washington, he joined the memory development area of SMD in Poughkeepsie, New York, where he was involved in memory testing and memory tester design. In 1986 he joined Reliability Services in East Fishkill, designing test equipment and testing for soft fails in memory chips.

Charles J. Montrose IBM Microelectronics Division, East Fishkill facility, Route 52, Hopewell Junction, New York 12533 (MONTROSE at FSHVMFK1). Mr. Montrose is an advisory engineer in the Reliability Services Department at the East Fishkill facility. His responsibilities include test system design, system control software, and data acquisition. He joined IBM in 1982, after receiving a B.S. degree in electrical engineering from the New Jersey Institute of Technology. He was initially involved in the design of a custom high-speed driver/receiver chip for a high-performance test system.

Huntington W. Curtis IBM Research Division, Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, New York 10598 (CURTIS at YKTVMV, curtis@watson.ibm.com). Dr. Curtis received a B.S. in chemistry and physics from the College of William and Mary in 1942, an M.S. in physics and electrical engineering from the University of New Hampshire in 1948, and a Ph.D. in electrical engineering from the State University of Iowa in 1950. Prior to joining IBM, he was a professor of electrical engineering at Dartmouth College. Dr. Curtis joined IBM in 1959, becoming a senior engineer in 1960. After serving as manager of technical requirements at FSD headquarters, he was promoted to technical advisor to the IBM Vice President for Research and Engineering, followed by assignments on the IBM Corporate engineering staff as director of government technical liaison and as director of scientific and technical information. He held subsequent positions as engineering consultant for IBM Biomedical Systems and engineering consultant for Manufacturing Research. Dr. Curtis retired from IBM in 1993 and is now an emeritus scientist at the IBM Thomas J. Watson Research Laboratory. He is a member of Phi Beta Kappa, Tau Beta Pi, and Sigma Xi, a senior member of the Institute of Electrical and Electronics Engineers, and a trustee of the Mount Washington Observatory.

Timothy J. O'Gorman IBM Microelectronics Division, Burlington facility, Essex Junction, Vermont 05452 (OGORMAN at BTVLABVM, ogorman@vnet.ibm.com). Mr. O'Gorman received the B.S. degree in physics from Manhattan College, Bronx, New York, in 1976, and the M.S. degree in physics from Pennsylvania State University, State College, Pennsylvania, in 1978. He joined the IBM General Technology Division in Burlington, Vermont, in 1981. Since then he has worked in semiconductor reliability engineering. Mr. O'Gorman's main interests have been in radiation-induced soft errors in memory chips. He is currently working on reliability modeling of CMOS circuits.

John M. Ross IBM Microelectronics Division, East Fishkill facility, Route 52, Hopewell Junction, New York 12533 (JMROSS at FSHVMFK1, jmross@vnet.ibm.com). Mr. Ross is an electrical engineer in the Storage Subsystems and Interface Products Department of the IBM Microelectronics Division. He received a B.S. degree in electrical engineering from Virginia Polytechnic Institute and State University in 1989, and an M.S. degree in electrical engineering from Columbia University in 1993. Mr. Ross joined IBM in 1989 at the East Fishkill facility, where he became involved in the study of soft errors in computer memories. He is currently working on the design of high-performance multibyte interface circuits.