# Large-area fault clusters and fault tolerance in VLSI circuits: A review

by C. H. Stapper

Fault-tolerance techniques and redundant circuits have been used extensively to increase the manufacturing yield and productivity of integrated-circuit chips. Presented here is a review of relevant statistical models which have been used to account for the effects on manufacturing yield of the large-area defect and fault clusters commonly encountered during chip fabrication. A statistical criterion is described for determining whether such large-area clusters are present.

### Introduction

The designation fault-tolerant is often used in connection with integrated circuits that have some degree of tolerance to flaws caused by their manufacture. Although such circuits are capable of functioning correctly if they contain certain types of manufacturing faults, their fault tolerance does not pertain to all types of such faults. As a result, their fabrication yield is usually not 100%.

<sup>©</sup>Copyright 1989 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the *Journal* reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to *republish* any other portion of this paper must be obtained from the Editor.

The prediction of the yields of such circuits is generally difficult, as is illustrated in at least three doctoral dissertations dealing with this subject, namely those of Mangir [1], Hedlund [2], and Harden [3]. Mangir subsequently improved her formulations in a later paper [4]. It was shown by Harden [5] that Mangir's approach can lead to yet different results when it incorporates a "lumped-sum approximation" that was originally described in Hedlund's thesis. In his doctoral dissertation [3], however, Harden used an extension of a method for yield modeling of integrated-circuit memory chips with redundant word and bit lines described by Stapper et al. [6]. The latter method has generally been accepted in the literature and forms the groundwork of this paper.

The difficulty in modeling the yield of fault-tolerant integrated-circuit chips is caused by the clustering of manufacturing defects during chip fabrication. The clusters can be categorized into three classes. The first pertains to clusters that are much larger than the chip size. Yield models which take such clustering into account have been adopted by most authors in this field [7–15]. These models also apply to wafer-to-wafer variations of defect densities that, according to [16], can be expected to dominate over other forms of defect and fault clustering.

Another class of clusters deals with fault clusters that are smaller than the chip area. It is sometimes believed that the faults in such small clusters should distribute themselves according to a Poisson distribution. This statistical distribution, however, is too constrained, because it has a variance that is equal to the mean. By their very nature, clusters contain a large number of defects. They therefore tend to increase the variability in the number of defects and faults per chip. As a result, clustering leads to distributions with variances that are larger than the mean. Some of the statistics applicable to this type of clustering are described elsewhere by this author [17]. It has been shown previously that, under the proper assumptions, a negative binomial fault distribution is applicable when clusters smaller than the chip area are encountered [18].

The third class of fault clusters deals with fault clusters that vary in dimension. This area has been investigated by Warner [19, 20], Hu [21], Stapper [22], and in an approximate point-defect model for wafer-scale integration by Ketchen [23]. A simulation technique for its modeling has, furthermore, been described by Foard Flack [24]. These efforts, however, have not been definitive.

Negative binomial distributions in general have provided good yield models for integrated-circuit defects and faults [13–16, 25–28]. They are used here for modeling fault distributions resulting from fault clusters that are larger than the chip area. However, a number of other statistical distributions, some of which have been described previously [16], can be used for this type of modeling if supported by relevant data.

Although cluster statistics have been used successfully for estimating the yields of chips with redundancy, they have not always been used in the literature. A number of analyses of the yield of wafer-scale integrated circuits with redundancy have been carried out with the random-defect yield formula [29–31]

$$Y = e^{-\lambda},\tag{1}$$

where Y is the chip yield and  $\lambda$  is the average number of faults expected per chip or circuit. The average number of faults per chip is often expressed as  $\lambda = AD$ , the chip area A times a defect density D. In some cases it is expressed in terms of a critical area, susceptible area, or defect-sensitive area, and a relevant defect density. Any of these designations is, however, a simplification. The relationship between the average number of faults per chip and the chip area is more complicated; it depends on the circuit complexity, the density of photolithographic patterns, the number of photolithographic masks used in the process, etc.

The above formula does not take clustering into account and usually leads to predicted chip yields that are too low when extrapolated from the yield of smaller chips or single circuits. Equation (1) results from the use of a Poisson distribution for modeling the distribution of the number of faults per chip. When such a distribution is used to estimate the yield of fault-tolerant chips, the results tend to be too optimistic. The gain in manufacturing yield for such chips on actual manufactured wafers is usually less. These yields are lower than predicted because of defect clusters. This can best be demonstrated with a simple example.

Suppose that there is sufficient redundancy on a chip for it to be fault-tolerant to a maximum of four faults. Consider a wafer with such chips, each chip containing an average of five faults. Assume that the frequency distribution of the number of faults per chip is given by a Poisson distribution. It can then be determined from tables of cumulative Poisson distributions that 44% of those chips could be expected to have four or fewer faults [32]. Because of such fault tolerance, all these chips should be usable. As a result, the expected chip yield is 44%. In estimating this yield, we have assumed applicability of a Poisson distribution to the entire surface of the wafer.

Next let us examine a wafer for which one half is completely defect-free, so that the chip yield on that half is 100%. Let the other wafer half be very defective, with an average of ten faults per chip. The average number of faults per chip for the wafer is therefore equal to five, the same as in the prior example. If the chips on the defective half are adjacent to one another, the faults in those chips can be considered to form a contiguous defect cluster. Let the faults within this cluster also be randomly distributed in agreement with Poisson's distribution. This distribution therefore has a parameter  $\lambda = 10$ . The yield for this half is therefore 0%, and the combined yield for the two halves is equal to 50%. This is more than the fault-tolerant yield in the prior example.

Next, let us determine what fault tolerance does for this wafer. According to the tables of cumulative Poisson distributions, only 2.9% of chips with an average of ten faults per chip can be expected to have four or fewer faults. The predicted fault-tolerant yield for the chips in this half is therefore 2.9%. Combining this result with the 100% yield for the fault-free half produces an estimated combined chip yield of 51.5% for this wafer. This is only slightly more than the 50% yield which would have resulted if no fault-tolerant circuits had been used. The fault tolerance is therefore only of limited benefit. Contiguous defect clusters of this type could therefore severely impact yield of chips with fault-tolerance schemes, and benefit the ones without such schemes.
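The arithmetic of this two-wafer example can be checked with a few lines of Python. The sketch below is only an illustration: the helper name `poisson_cdf` and the variable names are assumptions introduced here, and the cumulative Poisson probabilities are summed directly rather than read from tables.

```python
import math

def poisson_cdf(k_max, lam):
    """Probability of k_max or fewer faults when the Poisson mean is lam."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k)
               for k in range(k_max + 1))

TOLERATED_FAULTS = 4  # a chip remains usable with up to four faults

# First wafer: faults spread evenly, five per chip on average.
uniform_yield = poisson_cdf(TOLERATED_FAULTS, 5.0)

# Second wafer: one half fault-free, the other half with ten faults per chip.
bad_half_plain = poisson_cdf(0, 10.0)                    # no fault tolerance
bad_half_tolerant = poisson_cdf(TOLERATED_FAULTS, 10.0)  # with fault tolerance
clustered_plain = 0.5 * 1.0 + 0.5 * bad_half_plain
clustered_tolerant = 0.5 * 1.0 + 0.5 * bad_half_tolerant

print(f"uniform wafer, fault-tolerant chips:   {uniform_yield:.1%}")
print(f"clustered wafer, no fault tolerance:   {clustered_plain:.1%}")
print(f"clustered wafer, fault-tolerant chips: {clustered_tolerant:.1%}")
```

Under these assumptions the script gives about 44% for the uniform wafer, 2.9% for the defective half, and 51.5% for the clustered wafer with fault tolerance, in agreement with the figures quoted above.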

It has been known since the beginning of integrated-circuit manufacture that Equation (1) had to be modified to account for defect and fault clustering. Such modifications have been the subject of many papers in this field. The most commonly used method of modification is described in the next section. A method for determining the parameters of the resulting model is discussed in subsequent sections of this paper. The model is then used to calculate the yield of chips that are partially good. A comparison of actual and calculated results is given. The resulting formulas are also extended to the calculation of the yield of chips containing redundant circuits.

### The effect of large-area fault clustering on yield

Determination of the size of integrated-circuit fault clusters is a subject that has found only cursory treatment in the yield-modeling literature, as for example in [33]. In most papers on integrated-circuit chip yield, this subject is simply ignored, even though in many cases it is unknowingly assumed that the clusters are larger than the chip size. The success of many yield models can be attributed to the fact that this is not a bad assumption. According to [16], most of the clustering is expected to be caused by wafer-to-wafer variations of defect densities. In that case, the cluster area is equal to the wafer size, which is indeed larger than the area of individual chips. Another source of clustering is the radial variation in the average number of faults per chip. This effect was originally described by Yanagawa [34, 35], confirmed by others [27, 36] and studied more recently by Ferris-Prabhu et al. [37], Walker [38, 39], and Gandemer [40]. It leads to a lower chip yield along the periphery of integrated-circuit wafers. This peripheral region can therefore in effect be considered a large fault cluster.

The radial variation of chip yield has led to the use of concentric wafer zones for yield analysis [16, 27, 36, 41]. In such analyses, it is usually assumed that the faults per chip within each zone are distributed according to a Poisson distribution. Each zone has its own average number of faults per chip  $\lambda$ . The yield inside a zone can therefore be estimated by using Equation (1). The yield of chips in all zones from many wafers can be combined, resulting in a compound or mixed Poisson yield model.

It is not necessary to constrain the fault clusters to zones. In a more general approach to fault clustering, use of a Poisson distribution is assumed to be valid for characterizing the frequency of occurrence of faults per chip within each cluster. Such clusters can be located anywhere. For an infinite number of them, according to [16, 22, 41], the yield formula becomes\*

$$Y = \int_0^\infty e^{-\lambda} dF(\lambda),\tag{2}$$

where  $F(\lambda)$  is a cumulative distribution function of the average number of faults per chip in each cluster. A more detailed description of this procedure can be found in the aforementioned references. In this paper the fault clusters for which this procedure is valid are referred to as *large-area* fault clusters. A test for this type of clustering is described in a subsequent section.

Associated with the cumulative distribution function  $F(\lambda)$  is a probability distribution function given by

$$P(\lambda) = \frac{dF(\lambda)}{d\lambda}. \tag{3}$$

This represents a distribution of averages, where each value of  $\lambda$  pertains to the average number of faults per chip in a cluster. Combining Equations (2) and (3) results in the yield expression first used by Murphy [7]:

$$Y = \int_0^\infty e^{-\lambda} P(\lambda) \ d\lambda. \tag{4}$$

The function  $P(\lambda)$  in this expression is known as a compounder or mixing function. This function can often be approximated by a gamma distribution [12, 13, 15, 16, 25–28]. This therefore makes it possible to evaluate the integral in Equation (4) and results in a well-known integrated-circuit yield formula,

$$Y = (1 + \overline{\lambda}/\alpha)^{-\alpha},\tag{5}$$

where  $\alpha$  is a cluster parameter and  $\overline{\lambda}$  is the average number of faults per chip. It can be shown that  $\overline{\lambda}$  is in effect the average of the probability distribution function  $P(\lambda)$ . This average is therefore the grand average (average of averages) of the number of faults per chip. More sophisticated methods for deriving Equation (5) are described in [41].
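The step from Equation (4) to Equation (5) can be written out as follows. This is a sketch under one stated assumption: the gamma compounder is parameterized by a shape $\alpha$ and a scale $\beta$, a choice of notation introduced here for illustration and not taken from the references.

$$P(\lambda) = \frac{\lambda^{\alpha-1} e^{-\lambda/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}, \qquad \overline{\lambda} = \int_0^\infty \lambda\, P(\lambda)\, d\lambda = \alpha\beta.$$

Substituting this compounder into Equation (4) gives

$$Y = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}} \int_0^\infty \lambda^{\alpha-1} e^{-\lambda(1 + 1/\beta)}\, d\lambda = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}} \cdot \frac{\Gamma(\alpha)}{(1 + 1/\beta)^{\alpha}} = (1 + \beta)^{-\alpha} = (1 + \overline{\lambda}/\alpha)^{-\alpha},$$

where the last step uses $\beta = \overline{\lambda}/\alpha$.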

The cluster parameter  $\alpha$  also has physical significance. In the limit when  $\alpha \to \infty$ , the yield in Equation (5) becomes equal to that of Equation (1). This represents the case of random defects and complete absence of clustering. Smaller values of  $\alpha$  usually indicate increased clustering. When  $\alpha=0$ , the defects are clustered in infinitely small regions and none are found elsewhere. This is maximum or perfect clustering. Actual values for  $\alpha$  typically range between 0.3 and 5. Methods for determining this parameter are described in the next section.

### An example of the effects of large-area clustering

The effects of large-area defect clustering are well known [7–16]. They can be illustrated by examining chips containing varying numbers of identical circuits. Let us start with a single circuit that has a hypothetical yield of 0.999 and an average of 0.001 faults per circuit. If we use Equation (1), the yield of a chip with 600 of these circuits is equal to  $e^{-600\times0.001}$ , which is approximately equal to 55%. For a chip with 40 000 logic circuits, we expect a yield of  $e^{-40\,000\times0.001} = 4.248 \times 10^{-18}$ , or, for all practical purposes, 0%.

If large-area clustering is taken into account, the yield formula for a chip with n identical circuits is given by

$$Y_{nC} = (1 + n \times \overline{\lambda}_{1C}/\alpha)^{-\alpha}, \tag{6}$$

where the average number of faults in a single circuit is denoted by  $\overline{\lambda}_{1C}$ . Assuming again that this number is equal to 0.001, it is possible to estimate the yield for chips with any number of circuits. Calculated yields for chips with single circuits, chips with 600 circuits, and chips with 40 000 circuits are tabulated in **Table 1** for values of  $\alpha = 0.5$ , 1, 2, and  $\infty$ . These results show that even if  $\overline{\lambda}_{1C}$  is high, the presence of a high degree of clustering leads to surprisingly high yields. This effect has been observed in many manufacturing lines.

\*The integral in Equation (2) is sometimes referred to as a Stieltjes-Lebesgue integral. In the example here, it is the result of a limiting process in which the number of clusters approaches infinity.

Usually a gross yield factor  $Y_0$  must be included in the yield model. Gross yield losses are usually the result of systematic processing problems that affect whole wafers or parts of wafers. Such losses may, for example, be caused by misalignment, over- or under-etching, or out-of-spec semiconductor parameters such as beta, transconductance, or threshold voltage. Paz and Lawson have shown that defect clusters with very high fault densities can also be modeled by  $Y_0$  [27].

Introduction of the gross yield into the yield formula leads to

$$Y = Y_0 (1 + \overline{\lambda}/\alpha)^{-\alpha}. \tag{7}$$

This three-parameter model has been used successfully for yield modeling since 1975. Its parameters have physical significance and can be determined by a straightforward technique described in the next section. It must be pointed out, however, that the simplicity of this model can be deceptive. Some of the hidden complexities are discussed in subsequent sections.
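Equations (6) and (7) are simple to evaluate numerically. The following sketch recomputes the entries of Table 1 with $Y_0 = 1$. The function name `clustered_yield` and its argument names are assumptions made for this illustration, and the case $\alpha = \infty$ is handled separately as the Poisson limit of Equation (1).

```python
import math

def clustered_yield(mean_faults, alpha, gross_yield=1.0):
    """Equation (7); alpha = math.inf falls back to the Poisson form of Equation (1)."""
    if math.isinf(alpha):
        return gross_yield * math.exp(-mean_faults)
    return gross_yield * (1.0 + mean_faults / alpha) ** (-alpha)

FAULTS_PER_CIRCUIT = 0.001          # average faults in a single circuit
CIRCUIT_COUNTS = (1, 600, 40_000)   # circuits per chip, as in Table 1

for alpha in (0.5, 1.0, 2.0, math.inf):
    # Equation (6): the average number of faults scales with the circuit count n.
    row = [clustered_yield(n * FAULTS_PER_CIRCUIT, alpha) for n in CIRCUIT_COUNTS]
    print(f"alpha = {alpha}: " + ", ".join(f"{100 * y:.1f}%" for y in row))
```

The computed values agree with Table 1 to within rounding in the last digit.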

### **Determination of parameters**

The values of  $Y_0$ ,  $\overline{\lambda}$ , and  $\alpha$  in Equation (7) can be determined by the "window" method. This method was first described by Seeds [8, 9] and subsequently by Okabe et al. [11], Warner [19, 20], Paz and Lawson [27], and Hemmert [15]. The objective is to determine the yield as a function of chip multiples. This is done with wafer maps that show the location of functioning and failing chips at final test. The maps are analyzed using overlays with grids, or windows. These windows contain blocks of chips. Each block usually contains two, four, six, or nine chips. For each chip multiple, the number of windows containing only fault-free chips can be counted. Dividing this number by the total number of windows in the sample gives us the yield for that multiple.
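The counting step of the window method can be sketched in code. The example below is a simplified illustration and not the procedure used to produce the data in this paper: it assumes a rectangular grid, tiles it with non-overlapping rectangular windows, and skips any window that touches a position without a chip. The names `window_yield` and `demo_map`, and the small map itself, are hypothetical.

```python
def window_yield(wafer_map, win_rows, win_cols):
    """Return (perfect windows, total windows) for non-overlapping windows.

    wafer_map is a list of rows. Each cell is True (good chip),
    False (failing chip), or None (no chip: off-wafer or test site).
    Windows that touch a None cell are skipped.
    """
    total = perfect = 0
    n_rows, n_cols = len(wafer_map), len(wafer_map[0])
    for r in range(0, n_rows - win_rows + 1, win_rows):
        for c in range(0, n_cols - win_cols + 1, win_cols):
            cells = [wafer_map[r + i][c + j]
                     for i in range(win_rows) for j in range(win_cols)]
            if any(cell is None for cell in cells):
                continue
            total += 1
            perfect += all(cells)  # counts 1 only if every chip is good
    return perfect, total

# Hypothetical map for illustration only; not data from this paper.
G, F, X = True, False, None
demo_map = [
    [X, X, G, G, X, X],
    [G, G, G, F, G, G],
    [G, F, G, G, G, G],
    [X, X, G, G, X, X],
]

for shape in ((1, 1), (1, 2), (2, 2)):
    good, count = window_yield(demo_map, *shape)
    print(f"window {shape}: {good} perfect out of {count}")
```

A fixed tiling of this kind discards chips near the wafer edge. Odd-shaped edge windows would need separate handling that this sketch does not attempt.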

The results of the window analysis must next be matched to a yield formula. For the negative binomial model this has the form

$$Y_N = Y_0 (1 + N\overline{\lambda}/\alpha)^{-\alpha}, \tag{8}$$

where N is the chip multiple. Values for the parameters  $Y_0$ ,  $\overline{\lambda}$ , and  $\alpha$  are usually determined by means of a nonlinear regression analysis.

Note that high values of  $\alpha$  obtained by this method do not necessarily mean that there is less clustering. This phenomenon only implies that there is less large-area clustering. Small-area clusters can still exist, but this method is impervious to them. The smaller clusters are essentially counted as single faults. These observations were described in [42], but continue to be misunderstood, suggesting that future elaboration in the literature is warranted.

**Figure 1** Wafer map showing the locations of fault-free (light) and defective (dark) chips. Test site locations are marked with crosses.

**Table 1** Yield (in percent) as a function of the number of circuits per chip, n, and the cluster parameter  $\alpha$ , assuming that  $\overline{\lambda}_{1C} = 0.001$ .

| Cluster parameter α | Yield (%) for n = 1 | Yield (%) for n = 600 | Yield (%) for n = 40 000 |
|---------------------|---------------------|-----------------------|--------------------------|
| 0.5                 | 99.9                | 67.4                  | 11.1                     |
| 1                   | 99.9                | 62.5                  | 2.4                      |
| 2                   | 99.9                | 59.2                  | 0.2                      |
| ∞                   | 99.9                | 55.0                  | 0                        |

**Table 2** Illustrative use of the window method to determine model parameters. For this example  $Y_0 = 1$ ,  $\overline{\lambda} = 1.2934$ , and  $\alpha = 3.8274$ .

| Chip multiple | Sample size | Number perfect | Data yield (%) | Model yield (%) |
|---------------|-------------|----------------|----------------|-----------------|
| 1             | 2136        | 701            | 32.82          |                 |
| 2             | 1008        | 140            | 13.89          | 13.86           |
| 4             | 480         | 18             | 3.75           | 3.79            |

It is not difficult to use the window method. An example of a window-method analysis is tabulated in **Table 2**. The data in that table came from 24 wafers, each one containing 89 memory chips. For each wafer a map was obtained to show the location of fault-free and faulty chips. One of these maps is shown in **Figure 1**. Also shown on the map are locations taken up by test sites used to measure processing parameters.

The first step in evaluating the wafer-map data was the determination of the chip yield. In this case 701 out of a total of 2136 chips were fault-free. The yield was therefore 32.8%. Next, a transparent overlay was made with a grid containing pairs of chips. It was found that only 42 pairs could be placed on each wafer map. This resulted in a sample of 1008 pairs. Only 140 of these were found to be free of faulty chips. The yield for these windows with blocks of two chips was therefore 13.9%.

The third step consisted of making an overlay grid that contained four chips in a  $2 \times 2$  arrangement. Seventeen such windows could be fitted unambiguously on a wafer. To increase the sample size, and to include as much of the circumferential area as possible, three additional odd-shaped windows containing four chips were formed along the wafer edge. The total sample therefore contained 480 windows. For 18 of these windows it was found that all four chips were free of faults, thus resulting in a yield of 3.75%.

It is possible to obtain an additional data point by analyzing blocks of three chips. Such blocks, however, have odd-shaped windows, which makes them awkward to use. The three data points in Table 2 supply sufficient data for determining the parameters of the yield model. The values for  $\overline{\lambda}$ ,  $\alpha$ , and  $Y_0$  were obtained by fitting Equation (8) to these data points with a computer program that minimized the sum of the squares of the differences between model and data. With three data points and three parameters in Equation (8), this was equivalent to solving three nonlinear equations with three unknowns. For these data, furthermore, it was possible to set  $Y_0 = 1$ . This led to the values  $\overline{\lambda} = 1.2934$  and  $\alpha = 3.8274$  for the other two parameters. Putting these values into Equation (8) led to the numbers shown in the column labeled Model yield in Table 2. The experimental yields are also tabulated and are in good agreement. Because of the nonlinearity, even with three data points, such agreement is not always guaranteed for this three-parameter model. The author has seen single-wafer data for which this was indeed the case. Results obtained from single-wafer analysis, as in [13, 19–22, 43], must therefore be regarded as fortuitous. The use of larger samples, as is done here and, originally, by Hemmert [15], is thus more appropriate.
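The agreement between model and data can be checked by evaluating Equation (8) at the reported parameter values. The sketch below does only that. It does not perform the fit itself, and the names `model_yield` and `sum_squared_error` are assumptions introduced for this illustration.

```python
def model_yield(n_chips, mean_faults, alpha, gross_yield=1.0):
    """Equation (8): yield of a window containing n_chips chips."""
    return gross_yield * (1.0 + n_chips * mean_faults / alpha) ** (-alpha)

# Window-method data from Table 2: chip multiple -> (sample size, number perfect).
TABLE_2 = {1: (2136, 701), 2: (1008, 140), 4: (480, 18)}
MEAN_FAULTS, ALPHA = 1.2934, 3.8274  # reported values, with Y0 = 1

def sum_squared_error(mean_faults, alpha):
    """Quantity a least-squares fit would minimize over the three data points."""
    return sum(
        (perfect / sample - model_yield(n, mean_faults, alpha)) ** 2
        for n, (sample, perfect) in TABLE_2.items()
    )

for n, (sample, perfect) in TABLE_2.items():
    print(f"multiple {n}: data {100 * perfect / sample:.2f}%, "
          f"model {100 * model_yield(n, MEAN_FAULTS, ALPHA):.2f}%")
print("sum of squared errors:", sum_squared_error(MEAN_FAULTS, ALPHA))
```

A nonlinear least-squares routine from a numerical library could be used to minimize `sum_squared_error` over both parameters. Which optimizer was used for the values quoted here is not stated, so none is assumed in the sketch.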

The window-method analysis is used regularly in the industry. A variation of such an analysis was described by R. S. Hemmert [15]. His data were obtained from wafer maps of logic chips and read-only memories (ROMs). He used a least-square fitting technique to determine  $\overline{\lambda}$  and  $\alpha$  in Equation (8) while keeping  $Y_0$  at 100% yield. His results on seven manufacturing lots of wafers had an average cluster parameter of 2.2 with a standard deviation of 0.22. The values of  $\alpha$  were therefore tightly grouped, indicating that they were stable during the fabrication of those lots.

By combining all the data from his lots, Hemmert obtained a value of 2.1 for  $\alpha$  and surmised that an integer number of 2 was acceptable for a yield model in his factory. He also showed that, if data were grouped by ranges of yield, the value of  $\alpha$  was observed to vary from 1.6 to 3.65. The lowest value, and therefore maximum clustering, was observed in the group with the highest yield.

An alternative use of Equation (8) has been described in [28] and [44]. The yield of different read-only memory chips was analyzed as a function of the number of bits in those chips. This number was represented by N in Equation (8). The values of  $Y_0$ ,  $\overline{\lambda}$ , and  $\alpha$  in that case were also determined with a nonlinear least-square minimization technique. This analysis was performed on data from three different manufacturing lines and resulted in values for  $\alpha$  of 1.27, 0.86, and 0.75. As in Hemmert's results, the lowest value, and therefore the highest degree of clustering, occurred on wafers fabricated in the manufacturing line with the highest chip yields. The highest value of  $\alpha$ , suggesting less clustering, resulted from the wafers fabricated in the line with the lowest chip yields.

The yield analysis of these read-only memory chips also showed that the gross yield  $Y_0$  varied between 70.8 and 90.4%. Although these numbers include the yield of the support circuits on these chips, this range of gross yields is typical for most integrated circuits. The lowest value of  $Y_0$  occurred in the low-yield line and the highest value of  $Y_0$  in the high-yield line.

It must be noted here that the values of  $\overline{\lambda}$  obtained by this method tend to be lower than the actual average number of faults observed on chips. This difference can be caused by the effect of clusters that are smaller than the chip. As mentioned before, such clusters are counted as single faults by this technique. This counting also affects the cluster parameter  $\alpha$ , which tends to be higher than the actual fault distributions might suggest. Nevertheless, the window method produces usable results, as is shown in the next sections.

### Partially good chips

In many integrated-circuit chips, identical blocks of circuits are often replicated. This is especially the case in chips used for digital computers. Sometimes these basic circuit blocks are referred to as *processing elements*, or *PEs*. In other digital computer applications they are referred to as *macros*. In memory chips, blocks of memory cells are known as *subarrays*. The terminology depends not only on the type of circuitry that is used, but also on the individual using it. The designation *circuit blocks* is used in this paper. It is meant to be general and to include all these designations.

Chips containing a number of identical circuit blocks can often be used even if some of the blocks do not function correctly. Consider, for example, chips consisting of four identical circuit blocks. These chips are known as *perfect* if all four blocks are fault-free. The fraction of chips falling in this category represents the perfect chip yield. The chips with three operating circuit blocks and one defective one are referred to as being *three-quarter-good*. The yield of these chips is known as the three-quarter-good yield. Similar designations apply to the yields of chips that are *half-good* and *quarter-good*. In general fractionally usable chips are known as *partially good* chips [6].

Next, let us examine the case of a chip having N identical circuit blocks. Suppose that M of these blocks function properly and that (N-M) circuit blocks are defective. Furthermore, let the probability of finding a fault-free circuit block be denoted by the yield  $Y_{\rm CB}$ . In the case of random defects that do not cluster, the probability of finding M faultless circuit blocks on a chip can then be expressed as  $Y_{\rm CB}^M$ . It also follows that the probability of finding a faulty circuit block is given by  $1-Y_{\rm CB}$ . The probability of finding (N-M) flawed circuit blocks is therefore equal to  $(1-Y_{\rm CB})^{N-M}$ .

The number of different ways in which (N - M) faulty circuits can occur on a chip with N circuits is given by the binomial coefficient

$$C(N, M) = \frac{N!}{M!(N-M)!}. \tag{9}$$

The probability  $Y_{MN}$  of finding precisely M flawless circuit blocks on a chip with a total of N circuit blocks is therefore given by the binomial distribution

$$Y_{MN} = \frac{N!}{M!(N-M)!} Y_{CB}^{M} (1 - Y_{CB})^{N-M}. \tag{10}$$

As it stands, this formula can be used only when the faults do not cluster, or when the fault clusters are smaller than the individual circuit blocks. For fault clusters larger than the chip, Equation (10) must be modified. A means for doing so was originally mentioned in [6] and subsequently in [41]. A detailed discussion of this modification follows.

The key to the modification of the binomial distribution in Equation (10) is the quantity  $(1 - Y_{CB})^{N-M}$ . This can be expanded in a binomial series of the form

$$(1 - Y_{\rm CB})^{N-M} = \sum_{j=0}^{N-M} (-1)^j \frac{(N-M)!}{j!(N-M-j)!} Y_{\rm CB}^j. \tag{11}$$

It is possible to define another running index n in such a way that n = j + M. Introducing this into Equation (11) and substituting the result in Equation (10) results in

$$Y_{MN} = \frac{N!}{M!(N-M)!} \sum_{n=M}^{N} (-1)^{n-M} \cdot \frac{(N-M)!}{(n-M)!(N-n)!} Y_{CB}^{n}. \tag{12}$$

The yield of partially good chips therefore depends completely on a sum of powers of  $Y_{CB}$ .
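The equivalence of the binomial form and the power-sum form can be confirmed numerically. The sketch below compares Equation (10) with the expanded sum, taking the sign factor as $(-1)^{n-M}$, which follows from $j = n - M$ in Equation (11) and matches Equations (14) and (15). The function names are assumptions made for this illustration.

```python
from math import comb

def yield_binomial(m, n_blocks, y_cb):
    """Equation (10): exactly m good blocks out of n_blocks, no clustering."""
    return comb(n_blocks, m) * y_cb**m * (1.0 - y_cb) ** (n_blocks - m)

def yield_power_sum(m, n_blocks, y_cb):
    """Equation (12): the same probability as a sum of powers of y_cb."""
    return comb(n_blocks, m) * sum(
        (-1) ** (n - m) * comb(n_blocks - m, n - m) * y_cb**n
        for n in range(m, n_blocks + 1)
    )

# Compare both forms for a chip with four blocks and a block yield of 0.328.
for m in range(5):
    print(m, yield_binomial(m, 4, 0.328), yield_power_sum(m, 4, 0.328))
```

The two columns agree to within floating-point rounding for every value of M.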

As early as 1975, Dreckmann and Stapper replaced the yields  $Y_{CB}^n$  in Equation (12) *ipso facto* by yields calculated with the negative binomial yield model. McLaren subsequently showed mathematically that their heuristic approach was indeed a correct procedure. His approach has been described in [6]. It appears, however, to be poorly understood. A more precise description of this method is therefore presented here.

When large-area fault clustering is present, Equation (12) is valid only within a cluster. The partially good chip yield  $Y_{MN}$  therefore varies from cluster to cluster. Let the frequency distribution of the number of faults per circuit block within each cluster be characterized by a Poisson distribution. It then follows that in such an area

$$Y_{\rm CB}^n = e^{-n\lambda_{\rm CB}}. \tag{13}$$

What remains to be done is to average these yields for the different clusters. This can be done by applying the Poisson compounding procedure directly to Equation (12) for the yield of the partially good chips. This compounding is independent of the summation in that expression. The integral can therefore be brought inside the summation sign, thus leading to the expression

$$Y_{MN} = \frac{N!}{M!(N-M)!} \sum_{n=M}^{N} (-1)^{n-M} \cdot \frac{(N-M)!}{(n-M)!(N-n)!} \int_0^\infty e^{-n\lambda_{\rm CB}} P(\lambda_{\rm CB})\, d\lambda_{\rm CB}. \tag{14}$$

Denoting the integral in this expression by  $Y_{nCB}$  makes it possible to write the partially good yield as

$$Y_{MN} = \frac{N!}{M!(N-M)!} \sum_{n=M}^{N} (-1)^{n-M} \cdot \frac{(N-M)!}{(n-M)!(N-n)!} Y_{nCB}. \tag{15}$$

This is the most important formula in this paper. It is crucial to the development of yield models for partially good chips, as well as for chips with redundancy. This equation depends completely on the yields  $Y_{nCB}$  associated with having n fault-free circuit blocks. If the compounder  $P(\lambda)$  in Equation (14) is equal to a gamma distribution, we again obtain a negative binomial yield formula

$$Y_{nCB} = (1 + n\overline{\lambda}_{CB}/\alpha)^{-\alpha}, \tag{16}$$

where  $\overline{\lambda}_{CB}$  is the average number of faults per circuit block and  $\alpha$  the cluster parameter. A number of other yield models, which can also be used for this application, are described in [16] and [41].
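Equations (15) and (16) can be combined in a short routine. The sketch below is an illustration only: the function names and the parameter values are assumptions and are not data from this paper. As a consistency check, the yields for M = 0 through N are summed. They should add up to one, since every chip has some number of good blocks.

```python
from math import comb

def block_group_yield(n, mean_faults_cb, alpha):
    """Equation (16): probability that n given circuit blocks are all fault-free."""
    return (1.0 + n * mean_faults_cb / alpha) ** (-alpha)

def partially_good_yield(m, n_blocks, mean_faults_cb, alpha):
    """Equation (15): probability of exactly m good blocks out of n_blocks."""
    return comb(n_blocks, m) * sum(
        (-1) ** (n - m) * comb(n_blocks - m, n - m)
        * block_group_yield(n, mean_faults_cb, alpha)
        for n in range(m, n_blocks + 1)
    )

# Hypothetical values chosen for illustration only.
N_BLOCKS, MEAN_FAULTS_CB, ALPHA = 4, 0.5, 2.0
distribution = [partially_good_yield(m, N_BLOCKS, MEAN_FAULTS_CB, ALPHA)
                for m in range(N_BLOCKS + 1)]

for m, value in enumerate(distribution):
    print(f"{m} of {N_BLOCKS} blocks good: {value:.4f}")
print("total:", sum(distribution))
```

Replacing `block_group_yield` with another expression for $Y_{nCB}$ leaves `partially_good_yield` unchanged, which reflects the generality of Equation (15).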

In many practical yield calculations it is often necessary to put more detail into the formula for  $Y_{nCB}$ . This can be done without any loss of generality in the approach described here. Examples of this are discussed in a subsequent section of this paper.

**Table 3** Illustrative calculations of the yields of partially good chips.

| Partials | Sample<br>size | Number<br>of chips | Data<br>yield<br>(%) | Model<br>yield<br>(%) |
|----------|----------------|--------------------|----------------------|-----------------------|
| All-good | 480            | 18                 | 3.8                  | 3.8                   |
| 3/4      | 480            | 63                 | 13.1                 | 12.3                  |
| 2/4      | 480            | 120                | 25.0                 | 23.6                  |
| 1/4      | 480            | 140                | 29.2                 | 32.1                  |

### **Experimental verification**

It is not difficult to verify the results from the preceding section. This can be done with the same window method that was described earlier. To do so, we perform a more detailed analysis of the overlay grid that contained the window arrangements for four chips. In the earlier example, twenty such blocks or windows were fitted on a wafer, resulting in a total sample of 480 windows. It was found that in 18 of these windows all four chips were functioning correctly; this therefore produced a yield of 3.75%. It was also possible to count the windows containing three functioning chips and one faulty one. There were 63 of these, or 13.1% of the sample. Furthermore, 120 windows contained two good and two failing chips, which accounted for 25% of the windows. Another 140 windows, or 29.2%, contained only one functioning chip. In the remaining windows all four chips were defective.

These are all the data necessary to check the applicability of the theory described in the preceding sections. This is done by treating the individual chips as circuit blocks. Thus, windows with four good chips are considered to be perfect, those with three good chips as being three-quarter-good, those with two good chips as half-good, and those with only a single nonfailing chip as quarter-good. Their yields are tabulated in **Table 3**.

It is also possible to calculate these yields theoretically. Use of Equation (15) results in

$$Y_{\rm P} = Y_{4CB}, \tag{17a}$$

$$Y_{34} = 4(Y_{3CB} - Y_{4CB}), \tag{17b}$$

$$Y_{24} = 6(Y_{2CB} - 2Y_{3CB} + Y_{4CB}), \tag{17c}$$

$$Y_{14} = 4(Y_{1CB} - 3Y_{2CB} + 3Y_{3CB} - Y_{4CB}), \tag{17d}$$

where the perfect yield is denoted by  $Y_{\rm P}$  rather than  $Y_{\rm 44}$ . Furthermore,  $Y_{\rm 1CB}$  is equal to the yield of the single chips,  $Y_{\rm 2CB}$  to the yield of blocks with two chips,  $Y_{\rm 3CB}$  to the yield of blocks with three chips, and  $Y_{\rm 4CB}$  to the yield of blocks with four chips. These yields can be calculated with either Equation (8) or Equation (16), using the values of  $\overline{\lambda}$ ,  $\alpha$ , and  $Y_{\rm 0}$  that were previously determined with the window method. The results of the calculations are given in Table 3 along with the observed yields. The agreement between the two sets is completely acceptable for practical purposes.

It is useful at this point to demonstrate the inadequacy of yield calculations made without taking clustering into account. This is done by using Equation (10) without the modifications for clustering. With the yield $Y_{\rm 1CB}$ of a single circuit block equal to 32.8%, the yields obtained with this formula are $Y_{\rm P}=1.2\%$, $Y_{34}=9.5\%$, $Y_{24}=29.2\%$, and $Y_{14}=39.8\%$. These yields differ significantly from the data shown in the fourth column of Table 3.
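The unclustered figures quoted above can be checked numerically. The sketch below assumes that, without clustering, the four chips in a window fail independently, so that the number of good chips follows a binomial distribution with the single-block yield of 32.8%. Equation (10) is not reproduced in this section, and the function name is illustrative only. Under that assumption the results agree with the quoted values to within about 0.1 percentage point, the residual being attributable to rounding of the 32.8% input.

```python
from math import comb

def binomial_partial_yield(m, n, y1):
    """Probability that exactly m of n independent blocks are good."""
    return comb(n, m) * y1**m * (1.0 - y1) ** (n - m)

Y1CB = 0.328  # single-block yield quoted in the text

# Yields (in percent) of windows with 4, 3, 2, and 1 good chips out of 4.
unclustered = {m: 100.0 * binomial_partial_yield(m, 4, Y1CB) for m in (4, 3, 2, 1)}
for m, y in unclustered.items():
    print(f"{m}/4 good: {y:.2f}%")
```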

The use of partially good chips can be very efficient. By using perfect, three-quarter-good, half-good, and quarter-good chips, all the functional circuits on a wafer are utilized. This can be demonstrated by determining the so-called equivalent yield. This is done by weighting the yield for each type of partially good chip by the fraction of good circuit blocks. These modified yields are then added to give the equivalent yield. For the preceding example, this results in

$$Y_{\rm EQ} = Y_{\rm P} + \tfrac{3}{4}Y_{34} + \tfrac{1}{2}Y_{24} + \tfrac{1}{4}Y_{14}. \tag{18}$$

When the yield formulas (17a-d) are substituted into this expression, it reduces to $Y_{\rm EQ} = Y_{\rm 1CB}$. The equivalent yield is therefore equal to the yield of the individual circuit blocks. This implies that the use of partially good chips results in utilization of all the fault-free circuit blocks; none have been wasted.
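The reduction can be verified term by term. Weighting Equations (17a-d) by the fractions of good circuit blocks and collecting the coefficients of each $Y_{nCB}$ gives

$$\begin{aligned}
Y_{\rm EQ} &= Y_{4CB} + 3(Y_{3CB} - Y_{4CB}) + 3(Y_{2CB} - 2Y_{3CB} + Y_{4CB}) + (Y_{1CB} - 3Y_{2CB} + 3Y_{3CB} - Y_{4CB}) \\
&= Y_{1CB} + (3 - 3)\,Y_{2CB} + (3 - 6 + 3)\,Y_{3CB} + (1 - 3 + 3 - 1)\,Y_{4CB} \\
&= Y_{1CB}.
\end{aligned}$$

The cancellation does not depend on the particular yield model used for $Y_{nCB}$.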

Equation (18) can be evaluated by using the yield of four-chip multiples in Table 2, and the actual yields from the data yield column in Table 3. This produces an equivalent yield of 33.4%, which is higher than the original single-chip yield of 32.8% in Table 2. This difference is caused by the difference in sample size. There were 2136 chips used in determining the yield of the single chips in Table 2. For windows with four chips, however, only 1920 chips were used. Some of the single chips simply did not fit into exact blocks of four chips, and therefore could not be used.
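The 33.4% figure can be reproduced directly from the window counts in Table 3. The short sketch below is illustrative only; the variable names are not from the original. It evaluates the weighted sum of the equivalent-yield expression and, as a cross-check, counts the good chips in all windows.

```python
# Window counts from Table 3: good chips per window -> number of windows.
windows_by_good_chips = {4: 18, 3: 63, 2: 120, 1: 140}
total_windows = 480
chips_per_window = 4

# Data yields of the perfect and partially good categories.
data_yield = {m: count / total_windows for m, count in windows_by_good_chips.items()}

# Equivalent yield: weight each category by its fraction of good blocks.
y_eq = sum((m / chips_per_window) * y for m, y in data_yield.items())

# Cross-check: good chips counted directly over all chips in the windows.
good_chips = sum(m * count for m, count in windows_by_good_chips.items())
total_chips = total_windows * chips_per_window

print(f"equivalent yield = {100 * y_eq:.1f}%")
print(f"good chips = {good_chips} of {total_chips}")
```

Both routes give the same number, which illustrates why the equivalent yield is simply the fraction of functioning chips in the sampled windows.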

### Some practical modifications

The example in the preceding section is an idealization, because actual chips rarely consist entirely of identical circuit blocks. In all chips there are support circuits in addition to such blocks. These support circuits are shared by the replicated circuit blocks. The chips, however, become unusable if such support circuits are damaged beyond use. In principle, this effect can be included in Equation (16) by multiplication with the yield of the support circuits. Doing so, however, would assume that the clustering of the support-circuit faults is completely independent of the clustering of the circuit block faults  $\overline{\lambda}_{CB}$ . In most practical cases there is a correlation between the average number of faults in different circuits. This effect can be taken into account by including in Equation (16) the average number of faults that cause these support circuits to be defective. This results in

$$Y_{nCB} = \left[1 + (\overline{\lambda}_{CK} + n\overline{\lambda}_{CB})/\alpha\right]^{-\alpha},\tag{19}$$

where  $\overline{\lambda}_{CK}$  is the average number of "fatal" or "chip-kill" faults in the support circuits. Chips with these faults cannot be used as partially good chips. Use of (19) in Equation (15) makes it possible to take these types of faults into account when calculating the yields of partially good chips with support circuitry.

Another effect that must be included in yield estimates is the gross yield. Unless the chips are very large, this yield is independent of chip area. It is used as a yield multiplier, denoted by  $Y_0$  in the preceding sections. Introducing it into the yield formula (19) results in

$$Y_{nCB} = Y_0 [1 + (\overline{\lambda}_{CK} + n\overline{\lambda}_{CB})/\alpha]^{-\alpha}. \tag{20}$$

Introduction of this expression into Equation (15) results in a formula that can be used to estimate yields of partially good chips with support circuits and gross yield losses.

This author has had the fortune to work in an integrated-circuit-chip manufacturing plant where a great deal of information about fault-producing defects is available. Such defects include missing and extra pattern defects for all the photolithographic masking steps, pinhole voids that cause short circuits in the interlevel insulators, and crystalline defects that affect the semiconductor device operation. It is possible to use an individual yield model for each of these defect types. The method for doing so has been described in [14, 41, 45] and is reviewed here.

It is possible to apply the Poisson compounding technique to the faults caused by each type of defect. To do so, let each of m different types be indicated by an integer value  $i=1,2,3,\cdots,m$ . The average number of faults per chip associated with each type within a fault cluster can then be designated by  $\lambda_i$ . Assume that the defects are randomly distributed within a cluster and that the number of faults per chip can be characterized by a Poisson distribution. The yield associated with each type within a cluster is then given by

$$Y_i = e^{-\lambda_i}. \tag{21}$$

Assume further that the average number of faults per chip  $\lambda_i$  varies from cluster to cluster. It is then possible to apply the compounding technique to each defect type individually. Data have suggested that the compounders in this case can often also be approximated by gamma distributions, albeit with a different distribution for each defect type. As a result, the yield formulas take on the form

$$Y_i = (1 + \overline{\lambda}_i/\alpha_i)^{-\alpha_i}, \tag{22}$$

where the average number of faults  $\overline{\lambda}_i$  and the cluster parameter  $\alpha_i$  are different for each type of defect.
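For reference, the compounding step behind this result can be written out explicitly. The sketch below assumes a gamma compounder with shape parameter $\alpha_i$ and mean $\overline{\lambda}_i$, a parameterization consistent with the way the cluster parameter is used in this paper:

$$Y_i = \int_0^\infty e^{-\lambda}\, \frac{(\alpha_i/\overline{\lambda}_i)^{\alpha_i}}{\Gamma(\alpha_i)}\, \lambda^{\alpha_i - 1} e^{-\alpha_i \lambda/\overline{\lambda}_i}\, d\lambda = \left(\frac{\alpha_i/\overline{\lambda}_i}{1 + \alpha_i/\overline{\lambda}_i}\right)^{\alpha_i} = (1 + \overline{\lambda}_i/\alpha_i)^{-\alpha_i}.$$

In the limit $\alpha_i \to \infty$ the expression tends to $e^{-\overline{\lambda}_i}$, the Poisson yield for unclustered defects.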

Yields associated with different types of defects are known as *limited yields*. They can be combined by multiplication, so that the random-defect yield for a chip is given by

$$Y = \prod_{i=1}^{m} \left(1 + \overline{\lambda}_i/\alpha_i\right)^{-\alpha_i}. \tag{23}$$

This formula, according to [41], is valid even if there is interdependence or correlation between different types of defects.

Although Equation (23) is more complex than Equations (5), (7), and (8), it is possible to use it in yield calculations for partially good chips. When the gross yield and the chip-kill faults are included, we obtain the formula

$$Y_{nCB} = Y_0 \prod_{i=1}^{m} \left[ 1 + (\overline{\lambda}_{CK,i} + n\overline{\lambda}_{CB,i})/\alpha_i \right]^{-\alpha_i} \tag{24}$$

for the yield of n circuit blocks. Here  $\overline{\lambda}_{CK,i}$  represents the average number of chip-kill faults per chip resulting from the different defect types. Similarly,  $\overline{\lambda}_{CB,i}$  denotes the average number of faults per circuit block caused by defects of type i. These different types of defects are designated by the values of the running index i. Partially good chip yields, in this case, can also be calculated by introducing Equation (24) into Equation (15).
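A minimal sketch of how Equation (24) might be combined with the partially good chip formula is given below. Equation (15) is not reproduced in this section, so the code assumes the inclusion-exclusion form implied by Equations (17a-d). All function names and parameter values are hypothetical and serve only as an illustration.

```python
from math import comb

def y_ncb(n, y0, defect_types):
    """Equation (24): yield of n circuit blocks together with the support circuits.

    defect_types holds one (lam_ck, lam_cb, alpha) tuple per defect type i.
    """
    y = y0
    for lam_ck, lam_cb, alpha in defect_types:
        y *= (1.0 + (lam_ck + n * lam_cb) / alpha) ** (-alpha)
    return y

def partial_yield(m, n_total, y0, defect_types):
    """Yield of chips with exactly m good blocks out of n_total.

    Assumed form of Equation (15), inferred from Equations (17a-d).
    """
    r = n_total - m
    alternating = sum(
        (-1) ** k * comb(r, k) * y_ncb(m + k, y0, defect_types)
        for k in range(r + 1)
    )
    return comb(n_total, m) * alternating

# Hypothetical parameters, for illustration only (not taken from the paper).
Y0 = 0.9
DEFECT_TYPES = [
    (0.05, 0.20, 0.5),  # e.g. photolithographic pattern defects
    (0.02, 0.10, 2.0),  # e.g. pinholes in an interlevel insulator
]
N_BLOCKS = 4

yields = {m: partial_yield(m, N_BLOCKS, Y0, DEFECT_TYPES) for m in range(N_BLOCKS + 1)}
for m in range(N_BLOCKS, 0, -1):
    print(f"{m}/{N_BLOCKS} good: {100 * yields[m]:.1f}%")
```

Two consistency checks follow from this form. The yields summed over all values of m, including m = 0, equal the yield of the support circuits alone, $Y_{0CB}$, and the equivalent yield obtained by weighting each term with m/N equals $Y_{1CB}$.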

Values for the parameters in Equations (22) and (23) are usually determined by applying the window method to wafers with test sites. Such sites contain defect monitors that are sensitive to the different types of defects. Yield predictions have been routinely made in this way at the IBM facilities in Essex Junction, Vermont, and Manassas, Virginia.

At the IBM facility in Hopewell Junction, New York, a somewhat more comprehensive approach is often followed. Using a technique originally described by Paz and Lawson [27], a gross yield factor  $Y_{0i}$  is introduced into Equation (22). However, data analysis has shown that small variations in the values of  $Y_{0i}$  can result in large fluctuations of the values of  $\alpha_{i}$ . It is possible to use an alternative technique for determining the parameters of this model, as was done by this author [36]. Nevertheless, use of Equation (22) as it stands, without a gross yield factor  $Y_{0i}$ , has proven to be adequate in a number of integrated-circuit manufacturing lines.

The expression resulting from the use of Equation (24) in Equation (15) was described earlier, in [45]. This formula and variants of it have been used successfully at the IBM facility in Essex Junction since 1981 for estimating and planning the yields of partially good memory chips. Most of the chips contained word- and bit-line redundancy in addition to the schemes for partially good chips. In that case, the added redundancy increased the yields  $Y_{nCB}$  of the individual circuit blocks. This increase in yield was estimated with a yield model for memory chips with redundant word and bit lines. A version of this model has been described in [6]. The yields obtained in this way were used directly in Equation (15) to calculate the yield of the partially good chips.

### Redundancy

**Table 4** Yield enhancement with different amounts of redundancy R for varying degrees of large-area fault clustering (yields in %).

| Cluster parameter α | R = 0 | R = 1 | R = 2 | R = 3 | R = 4 | R = 5 |
|---------------------|-------|-------|-------|-------|-------|-------|
| 0.5                 | 10    | 14.6  | 17.8  | 20.3  | 22.4  | 24.1  |
| 1                   | 10    | 18.3  | 25.2  | 31.1  | 36.1  | 40.6  |
| 2                   | 10    | 22.4  | 34.3  | 44.8  | 53.7  | 61.1  |
| ∞                   | 10    | 30.6  | 53.8  | 73.0  | 85.8  | 93.1  |

The object of redundancy is the replacement of defective circuit blocks with good ones. For instance, consider chips on which M identical circuit blocks have to function properly if the chips are to be usable. Let these chips be manufactured with N of those circuit blocks, where N > M. The number of redundant circuits R is then given by

$$R = N - M. \tag{26}$$

The likelihood of finding a number of good circuit blocks on such chips equal to M, M+1, M+2, etc. is a probabilistic event. The events associated with these numbers are mutually exclusive, because only a single number of good circuit blocks can exist on any given chip. The probabilities associated with the occurrences of M, M+1, M+2, etc. correctly functioning circuit blocks on a chip must therefore be added to one another to obtain the probability of finding M or more good circuit blocks on a chip. This results in

$$Y_{\text{RED}} = Y_{M,N} + Y_{M+1,N} + Y_{M+2,N} + \dots + Y_{M+R,N}, \tag{27}$$

where  $Y_{M+i,N}$  denotes the probability of finding M+i good blocks on a chip having N circuits. Equation (27) can therefore be expressed as

$$Y_{\text{RED}} = \sum_{i=0}^{R} Y_{M+i,N}, \tag{28}$$

or, because of (26), as

$$Y_{\text{RED}} = \sum_{i=0}^{R} Y_{N-i,N}. \tag{29}$$

Equations (28) and (29) are general expressions for calculating the expected yield of chips containing R redundant circuit blocks.

The probabilities  $Y_{M+i,N}$  and  $Y_{N-i,N}$  in Equations (28) and (29) are the same as those for the partially good chip yields that were discussed in the preceding sections of this paper. They can therefore be calculated with Equation (15), using the appropriate yield expressions for  $Y_{nCB}$ . This results in a complex mathematical expression that contains two series summations, a multiple product, and two sets of binomial coefficients. Fortunately, there is no need to formulate this explicitly, because all of the formulas can be treated simply as nested subroutines in computer programs used to make such yield estimates.

Fault clustering has a pronounced effect on redundancy yield. This can be illustrated with a contrived example that deals with a chip on which ten identical circuit blocks must be functioning correctly if the chips are to be used. Let the yield of the ten circuit blocks be equal to 10%. We can then investigate how the chip yield is affected if it contains one to five redundant circuits. This is done in **Table 4**, where yields that correspond to different values of the cluster parameter $\alpha$ are shown.

The pure random-defect model corresponds to $\alpha=\infty$. In this case, according to Table 4, the use of five redundant circuits increases the yield from 10% to 93.1%. If, however, the cluster parameter $\alpha=0.5$, the yield is expected to improve only from 10% to 24.1%. The yield prediction for purely random defects is therefore nearly four times the prediction for clustered defects. Miscalculations by such a factor in the productivity of semiconductor manufacturing plants can be very costly. The inclusion of clustering in redundancy yield calculations is therefore of considerable importance.
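The "nested subroutines" mentioned earlier can be sketched in a few lines. The code below is a hedged illustration, not the author's program. It assumes that the block yield has the negative binomial form $(1 + n\overline{\lambda}_{CB}/\alpha)^{-\alpha}$, with the Poisson limit for $\alpha=\infty$, and that Equation (15) has the inclusion-exclusion form implied by Equations (17a-d). The fault level is chosen so that ten blocks yield 10%. For the entries spot-checked (for example 18.3% for $\alpha=1$, R = 1 and 93.1% for $\alpha=\infty$, R = 5), the output agrees with Table 4 to within rounding.

```python
import math

def block_yield(n, lam, alpha):
    """Yield of n circuit blocks; alpha = inf gives the Poisson (random-defect) limit."""
    if math.isinf(alpha):
        return math.exp(-n * lam)
    return (1.0 + n * lam / alpha) ** (-alpha)

def partial_yield(m, n_total, lam, alpha):
    """Probability of exactly m good blocks out of n_total (assumed form of Eq. (15))."""
    r = n_total - m
    return math.comb(n_total, m) * sum(
        (-1) ** k * math.comb(r, k) * block_yield(m + k, lam, alpha)
        for k in range(r + 1)
    )

def redundancy_yield(m_required, r, lam, alpha):
    """Equation (29): yield when r redundant blocks are added to m_required blocks."""
    n_total = m_required + r
    return sum(partial_yield(n_total - i, n_total, lam, alpha) for i in range(r + 1))

def lam_for_target(m_required, target, alpha):
    """Average faults per block that give the target yield for m_required blocks."""
    if math.isinf(alpha):
        return -math.log(target) / m_required
    return alpha * (target ** (-1.0 / alpha) - 1.0) / m_required

M_REQUIRED, TARGET = 10, 0.10  # ten blocks with a combined yield of 10%
table = {}
for alpha in (0.5, 1.0, 2.0, math.inf):
    lam = lam_for_target(M_REQUIRED, TARGET, alpha)
    table[alpha] = [100.0 * redundancy_yield(M_REQUIRED, r, lam, alpha) for r in range(6)]
    print(alpha, " ".join(f"{y:6.2f}" for y in table[alpha]))
```

The alternating sums are well conditioned for block counts of this size; for much larger N a numerically more careful evaluation would be advisable.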

### A criterion for large-area clustering

Use of the negative binomial yield model has found wide acceptance. It has been used for modeling fault-tolerant VLSI multiprocessors by Koren et al. [46–48], for memory chips containing redundancy by Stewart [49], and for wafer-scale cellular tree architectures by Harden [3]. The negative binomial distribution has also led to formulations of yield variations by Foard Flack [43], and interval estimates of yield by Winter and Cook [50]. It is therefore also used here in developing a criterion for ascertaining whether large-area clustering is present. The approach that follows, however, applies equally well to the Neyman Type A distribution discussed in [16].

The negative binomial distribution that deals with large-area clustering results from compounding a Poisson distribution with a gamma distribution. This process has been described in detail in a large number of papers and need not be elaborated on here (see, for example, [11, 16, 41]). The compounding procedure produces the discrete probability distribution function represented by

$$P(X=k) = \frac{\Gamma(\alpha+k)}{k!\,\Gamma(\alpha)} \frac{(\overline{\lambda}/\alpha)^k}{(1+\overline{\lambda}/\alpha)^{\alpha+k}}, \tag{30}$$

where X represents a random variable denoting the number of faults per chip, and k is an integer equal to 0, 1, 2, etc. As previously,  $\overline{\lambda}$  denotes the average number of faults per chip. It is also equal to the mean of the compounding gamma distribution. Similarly,  $\alpha$  denotes the cluster parameter. It is equal to  $(\overline{\lambda}/\sigma_{\lambda})^2$ , where  $\sigma_{\lambda}$  is the standard deviation of the gamma distribution [13].
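Two consequences of Equation (30) connect it to the yield formulas used earlier. Setting k = 0 gives the probability of a fault-free chip, and the first two moments are the standard results for a negative binomial distribution:

$$P(X=0) = (1 + \overline{\lambda}/\alpha)^{-\alpha}, \qquad E[X] = \overline{\lambda}, \qquad \mathrm{Var}[X] = \overline{\lambda} + \overline{\lambda}^2/\alpha.$$

The first expression has the same form as the yield formula (22). The variance exceeds the Poisson value $\overline{\lambda}$ by $\overline{\lambda}^2/\alpha = \sigma_{\lambda}^2$, so that small values of $\alpha$ correspond to strong clustering.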

We next investigate how Equation (30) is affected by the window method. Consider a general arrangement of windows, where each window contains a multiple of n chips. Let the clusters be larger than these windows, and let the number of faults per window within a cluster be a random variable that satisfies a Poisson distribution with an average number of faults per window equal to $n\overline{\lambda}$. When this distribution is compounded with the gamma distribution, it also results in a negative binomial distribution. That result differs from Equation (30) by having $\overline{\lambda}$ replaced by $n\overline{\lambda}$. The quantity $\alpha$ remains the same because the compounder has not changed. It is this property that provides a convenient test for large-area clusters.

It is possible, albeit sometimes with great difficulty, to determine the actual frequency distributions of the number of faults occurring on chips. Similarly, it should be possible to obtain the frequency distributions of the number of faults occurring in windows containing different chip multiples. When these distributions are in agreement with negative binomial distributions, the results can be used to test for the validity of the large-area clustering assumption. This assumption is valid when all the values of  $\alpha$  are the same. This is the criterion for large-area clustering.
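The invariance of $\alpha$ under the window method can be illustrated numerically. The sketch below evaluates Equation (30) for windows of 1, 2, and 4 chips and recovers $\alpha$ from the first two moments, using the relation $\alpha = \text{mean}^2/(\text{variance} - \text{mean})$. The method of moments is used here only as a simple stand-in for fitting real frequency distributions; it is applied to exact distributions rather than to data, and the parameter values are hypothetical.

```python
import math

def neg_binomial_pmf(k, lam_bar, alpha):
    """Equation (30), evaluated in log space to avoid overflow for large k."""
    ratio = lam_bar / alpha
    log_p = (
        math.lgamma(alpha + k) - math.lgamma(k + 1) - math.lgamma(alpha)
        + k * math.log(ratio) - (alpha + k) * math.log1p(ratio)
    )
    return math.exp(log_p)

def alpha_from_moments(pmf_values):
    """Moment estimate of the cluster parameter: mean^2 / (variance - mean)."""
    mean = sum(k * p for k, p in enumerate(pmf_values))
    second = sum(k * k * p for k, p in enumerate(pmf_values))
    variance = second - mean * mean
    return mean * mean / (variance - mean)

LAM_BAR, ALPHA = 1.1, 0.8  # hypothetical per-chip parameters
K_MAX = 2000               # truncation point; the neglected tail is negligible here

recovered = {}
for n in (1, 2, 4):        # chips per window
    pmf = [neg_binomial_pmf(k, n * LAM_BAR, ALPHA) for k in range(K_MAX + 1)]
    recovered[n] = alpha_from_moments(pmf)
    print(f"n = {n}: faults per window = {n * LAM_BAR:.1f}, alpha = {recovered[n]:.4f}")
```

With measured distributions the recovered values scatter around a common value when the large-area assumption holds, and drift with window size when it does not.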

Obtaining actual frequency distributions for the number of faults per chip is difficult. To solve this problem, particle distributions on actual wafers have occasionally been used to study the effect of increased area. This was done, for example, in [42], where wafer surfaces were subdivided into squares called *quadrats*. Negative binomial distributions were found to be in good agreement with the frequency distributions of the number of particles in each quadrat for a wide range of quadrat sizes. The values of the cluster parameter  $\alpha$ , however, differed for quadrats with different areas.

The data obtained with quadrat analysis described in [42] can be analyzed by using a maximum-likelihood estimation technique described by Foard Flack [43]. This approach makes it possible to determine the variability in the estimated values of  $\alpha$ . The results of such an analysis are shown in Figure 2. The bars around the data points indicate the range of  $\pm \sigma_{\alpha}$ , where  $\sigma_{\alpha}$  is the standard deviation of each estimate. Note that the horizontal scale is logarithmic and represents a range of two orders of magnitude in area.

Of interest in Figure 2 are the results for the three smallest quadrat areas. The ranges of standard deviations overlap, thus suggesting that these points represent a nearly constant value of  $\alpha$ , and hence the condition for large-area clustering. The increase in values of  $\alpha$  for the other points on the curve indicates that the associated quadrat areas exceed the range for which the large-area clustering approach is valid.

### Cluster parameter dependencies

It was reported in [6, 42, 51] that negative binomial distributions provided good models for the frequency distributions of the number of faults per chip observed in a number of integrated-circuit manufacturing facilities. Studies by this author have indicated that the values for $\alpha$ in such distributions varied between 0.30 and 2.38. These results were obtained during different years of manufacture. The higher value of $\alpha$ resulted from low-yield chips made during earlier years of fabrication. The lower value of $\alpha$ was observed later, when the yields were higher.

**Figure 2** Dependence of the cluster parameter $\alpha$ on relative quadrat area. These data resulted from an analysis of wafer maps showing the location of particles. Note the logarithmic scale on the horizontal axis.

The observed decrease in the clustering parameter  $\alpha$  with increasing yield was first reported in [6, 51]. The fault clusters occurred in both high- and low-yield processes. In the low-yield process, however, the clustering effect appeared to be masked by the high average fault levels present in low-yield chips. During a period of high-yield manufacturing, some of these same clusters remained, leading to an increased variability of the number of faults per chip. This effect could have been negated if the sources of the clusters had been found and subsequently eliminated.

These effects and the dependence of the cluster parameter on area can be incorporated in an approximate yield model of the form

$$Y_{nCB} = \left[1 + n\lambda_{CB}(D_i)/\alpha(n, D_i)\right]^{-\alpha(n, D_i)}. \tag{31}$$

Both $\lambda_{CB}$ and $\alpha$ depend on a set of defect densities represented by $D_i$. In addition, $\alpha$ is also a function of the number of circuit blocks. Such dependencies have been used successfully by this author since 1981 for estimating the yields of chips with redundancy and partially good chips. However, further refinements of this model are needed to more accurately take into account the effects of varying cluster areas. A preliminary account of an effort to do so may be found in [17], which, however, is not broadly available; a subsequent effort, in which consideration is given to the effects of clusters which are smaller than circuit areas, is described elsewhere in this issue [52].

### **Concluding remarks**

In this paper the methods used for estimating and predicting yields of integrated-circuit chips that have some degree of fault tolerance have been reviewed. Some of the yield models described have been used for more than a decade in productivity optimization of dynamic random access memory (DRAM) chips containing redundancy. They have also been used to project learning plans for manufacturing yields of such chips as the IBM 64K, 256K, 288K, and 1Mb DRAMs. Because of their usefulness, the models have found acceptance elsewhere, e.g., by Stewart [49], Koren et al. [46–48], Harden [3], and Wey [53].

With the continuing trend toward placing more transistors on chips, two effects can be expected. First, because of the quantities involved, the number of faults occurring on a chip can be expected to increase. This will require the use of more effective fault-tolerance schemes. Second, the variability of the number of faults per chip should also increase, thus causing the effects of fault clustering to become increasingly important. Further refinements in the models will be needed to take this effectively into account.

### References

1. T. E. Mangir, "Use of On-Chip Redundancy for Fault-Tolerant Very Large Scale Integrated Circuit Design," Ph.D. Dissertation, University of California, Los Angeles, 1981, pp. 50–64; available through University Microfilms International, Ann Arbor, MI.
2. K. S. Hedlund, "Wafer Scale Integration of Configurable, Highly Parallel Processors," Ph.D. Dissertation, Purdue University, Lafayette, IN, 1982, pp. 235–237; available through University Microfilms International, Ann Arbor, MI.
3. J. C. Harden, Jr., "A Wafer Scale Cellular Tree Architecture," Ph.D. Dissertation, Texas A & M University, College Station, TX, 1985, pp. 44–170.
4. T. E. Mangir, "Sources of Failures and Yield Improvement for VLSI and Restructurable Interconnects for RVLSI and WSI: Part I—Sources of Failures and Yield Improvement for VLSI," *Proc. IEEE* 72, 690–708 (June 1984).
5. J. C. Harden, "Comments on 'Sources of Failures and Yield Improvement for VLSI and Restructurable Interconnects for RVLSI and WSI: Part I—Sources of Failures and Yield Improvement for VLSI,'" *Proc. IEEE* 74, 515–516 (March 1986).
6. C. H. Stapper, A. N. McLaren, and M. Dreckmann, "Yield Model for Productivity Optimization of VLSI Memory Chips with Redundancy and Partially Good Product," *IBM J. Res. Develop.* 24, 398–409 (1980).
7. B. T. Murphy, "Cost-Size Optima of Monolithic Integrated Circuits," *Proc. IEEE* 52, 1537–1545 (December 1964).
8. R. B. Seeds, "Yield, Economic, and Logistic Models for Complex Digital Arrays," *1967 IEEE International Convention Record*, Pt. 6, pp. 61–66.
9. R. B. Seeds, "Yield and Cost Analysis of Bipolar LSI," presented at the 1967 International Electron Device Meeting Keynote Session, October 1967 (Abstract p. 12 of the meeting record).
10. A. J. Dingwall, "High Yield-Processed Bipolar LSI Arrays," *1968 International Electron Device Meeting Technical Digest*, p. 82.
11. T. Okabe, M. Nagata, and S. Shimada, "Analysis on Yield of Integrated Circuits and a New Expression for the Yield," *Elec. Eng. Jpn.* 92, 135–141 (December 1972).
12. J. Sredni, "Use of Power Transformations to Model the Yield of ICs as a Function of Active Circuit Area," *1975 IEEE International Electron Device Meeting*, Paper No. 6.4, pp. 123–125.
13. C. H. Stapper, "On a Composite Model of the IC Yield Problem," *IEEE J. Solid-State Circuits* SC-10, 537–539 (December 1975).
14. H. Murrmann and D. Kranzer, "Yield Modeling of Integrated Circuits," *Siemens Forschungs & Entwicklungs Berichte* 9, 38–40 (February 1980).
15. R. S. Hemmert, "Poisson Process and Integrated Circuit Yield Prediction," *Solid-State Electron.* 24, 511–515 (June 1981).
16. C. H. Stapper, "The Effects of Wafer to Wafer Density Variations on Integrated Circuit Defect and Fault Distributions," *IBM J. Res. Develop.* 29, 87–97 (January 1985).
17. C. H. Stapper, "Block Alignment: A Method for Increasing the Yield of Memory Chips That Are Partially Good," *International Workshop on Defect and Fault Tolerance in VLSI Systems*, Springfield, MA, October 1988, pp. 6.3-1–6.3-11.
18. C. H. Stapper, "Yield Model for Fault Clusters Within Integrated Circuits," *IBM J. Res. Develop.* 28, 636–640 (September 1984).
19. R. M. Warner, Jr., "Applying a Composite Model to the IC Yield Problem," *IEEE J. Solid-State Circuits* SC-9, 86–95 (June 1974).
20. R. M. Warner, "A Note on IC Yield Statistics," *Solid-State Electron.* 24, 1045–1047 (December 1981).
21. S. M. Hu, "Some Considerations in the Formulation of IC Yield Statistics," *Solid-State Electron.* 22, 205–211 (February 1979).
22. C. H. Stapper, "Comments on 'Some Considerations in the Formulation of IC Yield Statistics,'" *Solid-State Electron.* 24, 127–132 (February 1981).
23. M. B. Ketchen, "Point Defect Yield Model for Wafer Scale Integration," *IEEE Circuits & Devices Mag.* 1, No. 4, 24–34 (July 1985).
24. V. Foard Flack, "Introducing Dependency into IC Yield Models," *Solid-State Electron.* 28, No. 6, 555–559 (June 1985).
25. C. H. Stapper, "Defect Density Distribution for LSI Yield Calculations," *IEEE Trans. Electron Devices* ED-20, 655–657 (July 1973).
26. A. P. Turley and D. S. Herman, "LSI Yield Projections Based Upon Test Patterns Results: An Application to Multilevel Metal Structures," *IEEE Trans. Parts, Hybrids, Packag.* PHP-10, 230–234 (December 1974).
27. O. Paz and T. R. Lawson, Jr., "Modification of Poisson Statistics: Modeling Defects Induced by Diffusion," *IEEE J. Solid-State Circuits* SC-12, 540–546 (October 1977).
28. C. H. Stapper and R. J. Rosner, "A Simple Method for Modeling VLSI Yields," *Solid-State Electron.* 25, 487–489 (June 1982).
29. D. L. Peltzer, "Wafer-Scale Integration: The Limits of VLSI?" *VLSI Design* 4, No. 5, 43–47 (September 1983).
30. G. E. Dixon, J. F. Dickson, J. Fox, A. K. J. Steward, and G. Sumerling, "Test Methodology for a Parametric Cell Design Approach to Fault-Tolerant VLSI and Wafer Scale Integration," *Wafer Scale Integration*, C. R. Jesshope and W. R. Moore, Eds., Adam Hilger, Bristol, U.K., 1986, pp. 237–245.
31. J. S. T. Huang and J. M. Daughton, "Yield Optimization in Wafer Scale Circuits with Hierarchical Redundancies," *Integration* 4, 43–51 (March 1986).
32. W. H. Beyer, *Handbook of Tables for Probability and Statistics*, Chemical Rubber Co., Cleveland, 1968, pp. 213–218.
33. C. H. Stapper, "Correlation Analysis of Particle Clusters on Integrated Circuit Wafers," *IBM J. Res. Develop.* 31, 641–650 (November 1987).
34. T. Yanagawa, "Influence of Epitaxial Mounds on the Yield of Integrated Circuits," *Proc. IEEE* 57, 1621–1696 (September 1969).
35. T. Yanagawa, "Yield Degradation of Integrated Circuits Due to Spot Defects," *IEEE Trans. Electron Devices* ED-19, 190–197 (February 1972).
36. C. H. Stapper, "LSI Yield Modeling and Process Monitoring," *IBM J. Res. Develop.* 20, 228–234 (May 1976).
37. A. V. Ferris-Prabhu, L. D. Smith, H. A. Bonges, and J. K. Paulsen, "Radial Yield Variations in Semiconductor Wafers," *IEEE Circuits & Devices Mag.* 3, 42–47 (March 1987).
38. D. M. H. Walker, "Yield Simulation for Integrated Circuits," Ph.D. Dissertation, Carnegie Mellon University, Pittsburgh, PA, 1986; Ch. 4, pp. 44–47, Ch. 8, pp. 138–140.
39. D. M. H. Walker, *Yield Simulation for Integrated Circuits*, Kluwer Academic Publishers, Boston, 1987, Ch. 4, pp. 45–49, Ch. 8, pp. 158–160.
40. S. Gandemer, "Modélisation de l'Impact des Défauts de Fabrication sur le Rendement des Microcircuits Intégrés Fabriqués en Technologie Silicium," Doctoral Dissertation, École Nationale Supérieure des Télécommunications, Sup. Télécom. Report ENST-87E018, 1987, Ch. 2, pp. 1–24.
41. C. H. Stapper, F. M. Armstrong, and K. Saji, "Integrated Circuit Yield Statistics," *Proc. IEEE* 71, 453–470 (April 1983).
42. C. H. Stapper, "On Yield, Fault Distributions, and Clustering of Particles," *IBM J. Res. Develop.* 30, 326–338 (May 1986).
43. V. Foard Flack, "Estimating Variations in IC Yield Estimates," *IEEE J. Solid-State Circuits* SC-21, 362–365 (April 1986).
44. C. H. Stapper, "The Defect-Sensitivity Effect of Memory Chips," *IEEE J. Solid-State Circuits* SC-21, No. 1, 193–198 (February 1986).
45. C. H. Stapper, "Yield Model for 256K RAMs and Beyond," *IEEE International Solid-State Circuits Conference Digest of Technical Papers*, 1982, pp. 12–13.
46. I. Koren and M. A. Breuer, "On Area and Yield Considerations for Fault-Tolerant VLSI Processor Arrays," *IEEE Trans. Computers* C-33, 21–27 (January 1986).
47. I. Koren and D. J. Pradhan, "Introducing Redundancy into VLSI Designs for Yield and Performance Enhancement," *Proceedings of the Fifteenth Annual International Symposium on Fault-Tolerant Computing*, FTCS-15, 1985, pp. 330–334.
48. I. Koren and D. J. Pradhan, "Yield and Performance Enhancement Through Redundancy in VLSI and WSI Multiprocessor Systems," *Proc. IEEE* 74, 699–711 (May 1986).
49. D. M. Stewart, "Laser Fix Dynamic RAMs," *Electron. Week* 58, 45–49 (February 4, 1985).
50. C. L. Winter and W. L. Cook, "Interval Estimates for Yield Models," *IEEE J. Solid-State Circuits* SC-21, 590–591 (August 1986).
51. C. H. Stapper, "Modeling Redundancy in 64K to 16Mb DRAMs," *IEEE International Solid-State Circuits Conference Digest of Technical Papers*, 1983, pp. 86–87.
52. C. H. Stapper, "Small-Area Fault Clusters and Fault Tolerance in VLSI Circuits," *IBM J. Res. Develop.* 33, 174–177 (1989, this issue).
53. C. L. Wey, "On Yield Considerations for the Design of Redundant Programmable Logic Arrays," *IEEE Trans. Computer-Aided Design* 7, 528–535 (April 1988).

Received February 8, 1988; accepted for publication October 27, 1988

Charles H. Stapper IBM General Technology Division, Burlington facility, Essex Junction, Vermont 05452. Dr. Stapper received his B.S. and M.S. in electrical engineering from the Massachusetts Institute of Technology in 1959 and 1960. He subsequently joined IBM at the Poughkeepsie development laboratory, where he worked on magnetic recording and the application of tunnel diodes, magnetic thin films, electron beams, and lasers to digital memories. From 1965 to 1967, he studied at the University of Minnesota on an IBM fellowship. Upon receiving his Ph.D. in 1967, he joined the IBM development laboratory in Essex Junction. His initial work there was in the areas of magnetic thin-film array development, testing and theory of magnetic bubble devices, and bipolar and field-effect transistor theory. During the early 1970s he developed a yield model for the analysis of defect monitor data. This model has been used since for line control and yield management. It has also been used extensively for productivity optimization of SRAMs and DRAMs with redundancy, as well as for planning the production of gate arrays, logic chips, and microprocessor chips. Dr. Stapper is a member of the Institute of Electrical and Electronics Engineers and Sigma Xi.