# The development of ultra-high-frequency VLSI device test systems

by C. W. Rodriguez and D. E. Hoffman

The development of test systems for high-performance semiconductor logic and memory devices is discussed. The capabilities of shared-resource and tester-per-pin system architectures are reviewed. Test-system hardware design to provide high-speed pin electronics and generation of LSSD, weighted random, and algorithmic patterns is described. The reasons for the selection of the tester-per-pin system architecture are given in terms of the way in which overall system accuracy and test-system user flexibility are maximized for differing test methodologies.

## Introduction

Performance, reliability, availability, and price are among the most important parameters that determine the value of a product to its users. The semiconductor device test systems supporting very large scale integration (VLSI) logic and memory circuit devices magnify the careful consideration required in the design of each of these parameters.

<sup>©</sup>Copyright 1990 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

The availability of these VLSI device testers is critical throughout the development-to-manufacturing cycle of semiconductor products. The precision of the measurements is critical in the establishment of operating specifications. Quality Assurance organizations demand similar capabilities, but with an increased volume of parts for statistical studies on device performance and reliability. Manufacturing organizations also require test system flexibility; however, they place more emphasis on tester reliability, availability, and system cost because of the need to replicate the tester for device volume production.

In the past, Automated Test Equipment (ATE) producers have been able to meet the requirements of advanced logic and memory device technology development. However, testing the function and performance capability of such devices is becoming more difficult. Rapid advances and increased complexity in semiconductor technology have made the task of producing ATE systems correspondingly complex and expensive.

Test systems such as the IBM memory products test system of the early 1970s were designed with discrete transistors, small and medium-scale integrated logic and memory devices, hybrid module devices, and low-density packaging. The devices to be tested had few input/output (I/O) ports, low circuit densities, low operating frequencies, and moderately complex device circuitry. For its time, this test system's [1] maximum test rate of 100 MHz was well above the capabilities of any comparable tester.

Advances in semiconductor product technology quickly grew in scope. New devices were introduced at an accelerated pace, with stringent test requirements. The advent of VLSI devices such as microprocessors, high-density logic chips, and rapidly increasing data storage cell counts in static (SRAM) and dynamic (DRAM) random access memory devices, as well as the mixture of technology bases in test areas, such as transistor-transistor logic (TTL), emitter-coupled logic (ECL), complementary metal oxide semiconductor (CMOS), gallium-arsenide (GaAs), and devices combining bipolar and CMOS circuits on the same die (BiCMOS), placed new demands on the test equipment used to verify their performance.

Similarly, testing requirements for IBM's semiconductor products made obvious the need for a new generation of device characterization, qualification, and final test systems. A new project named Advanced Test Systems (ATS) was established in the IBM East Fishkill test equipment organization. Its charter was, and remains, to determine requirements for semiconductor product testing in the foreseeable future, and to produce equipment necessary for device characterization and final test. As in any good product design, a balance of function, performance, and test system cost was the primary factor to be considered.

#### **Product trends**

The explosive growth of technical demands in memory and logic semiconductor device testing started in the late 1970s and has not yet abated. Memory device storage cell counts are growing beyond 16 Kb in bipolar SRAMs and will exceed 4 Mb for dynamic RAMs in the near future. Logic device gate counts have climbed above 10 000 in bipolar technology and above 100 000 for the CMOS devices.

Device I/O count and power have also risen as part of the technical evolution, with cycle times and gate delays now approaching gigabit-per-second circuit speeds. Design for test has increasingly received more attention from the memory and logic developers because of the inherent complexity of the devices under test. These factors have a direct effect on the capabilities, replication cost, and unit throughput of the test system.

## Advanced test system

The initial goal of the ATS system developers was to design a VLSI ATE system that would advance the state of the art in equipment design by developing a system architecture that could handle the total test needs of future semiconductor device technology. The ATS project charter included several key tester specifications, including the following:

- Development of a test system with a maximum tester operating rate of 250-500 MHz.
- Achieving a system overall accuracy in the 200–300-ps range.
- Implementation of a system circuit technology and packaging design that would be cost-competitive for tester pin counts (system I/O ports) of 256, expandable to beyond 1000 for future applications.
- Testing of both memory and logic devices through complex addressing and data patterns, utilizing algorithmic generation techniques at high data rates.
- System capability would be extendible to support deterministic logic test and level-sensitive scan design (LSSD) [2] scan ring support, weighted random-pattern generation for logic with data signature analysis, logic with embedded memory cells, and device boundary-scan test algorithms.

Advanced Test System I (ATS I) [3], the first system produced by the ATS organization to utilize the tester-per-pin system architecture, was completed in 1986. Designed for the Product Development Laboratory, ATS I realized many of the project's goals. Concepts learned were extended in the second-generation ATS II test system [4], destined for memory final test in manufacturing. Succeeding generations of ATS test systems, which are in development, will incorporate the remaining project goals and provide the basis for future development of advanced test systems.

## ATS test-system architecture

Two primary families of test-system architectures are available to the system designer—shared-resource and tester-per-pin. A shared-resource system utilizes a limited set of hardware, e.g., for pattern generation or timing, which is then distributed over the entire tester through a network of electronic multiplexers to its input/output ports. The assignment of the tester's resource to a specific test pin depends on options for its interconnection to the device-under-test (DUT) I/Os through cable patch panels and to the device electrical socket part-number (P/N) board.

Distributed-resource system architecture (Figure 1) has been the mainstay of the ATE industry for many years because of its technically satisfactory capability at a competitive system hardware cost. However, a penalty is paid in a production facility where different device types exist, with multiple part-number sets to be tested. Extensive physical reconfiguration of the tester is required for each change, resulting in a degradation of product throughput and system reliability. P/N programming of tests can also be complex in some cases, resulting in increased programming costs, because of limitations on tester resources available to the DUT.



#### Figure 1

Shared-resource test-system architecture.

The intricate path a tester signal must follow through the test system's multiplexer to reach the DUT did not degrade its fidelity sufficiently to affect the level of performance required for past devices. However, increased test-performance demands imposed by the technology emphasized the need for a new system architecture. An extensive review of future test needs was made, and the tester-per-pin architecture was selected for the ATS system development.

## Tester-per-pin test-system architecture

In a tester-per-pin system architecture, each of the tester resources is duplicated at each test pin. The advantages of such a test system include the following:

- Test-system throughput is enhanced, because the changeover of unique part-number interface boards is eliminated as the device type to be tested changes. This is especially advantageous when multiple part-numbers exist on the same semiconductor wafer. Multiple test passes on a wafer are eliminated, maximizing the productivity of the test area.
- The requirements of each of the pins of the DUT can be met with unique per-pin timing, voltage, and pattern generation, instead of maintaining identical parameter values for a set of tester pins. Each test pin can be individually programmed.
- The test-system I/O count can easily be reduced or expanded to meet the DUT I/O needs because of the test system's modularly expandable architecture.
- System cost can be optimized to cover the required number of DUT I/Os for each application. This is also possible with a shared-resource system, but the possibilities are limited by the preassignment of tester functions to specific test pins.
- P/N programming is simplified by the increased amount of available system resources, reducing DUT program development time.
- Per-pin DUT parametric measuring units (PMUs) can be readily implemented, instead of multiplexing a limited set to all pins. This improves dc measuring accuracy and significantly improves device throughput in the test system.
- Various device-test methodologies such as algorithmic pattern generation (APG), LSSD, scan testing, and weighted random pattern can be incorporated easily into the base tester architecture, which can be used to optimize test-system flexibility.

The principal disadvantage is the high hardware cost of the system relative to that of a shared-resource system, due to the replication of all resources at each tester pin.

#### ATS test-system architecture

A block diagram for the ATS II system is shown in Figure 2. The test programs resident at the central or host computer are transmitted to the tester controller. This unit establishes the configuration of the system in several ways, in response to the needs of the DUT. Typically, a selected pin-pattern generator (PPG) buffer receives the DUT digital data pattern and algorithmic pattern generator parameters from the tester controller. In its operating mode, "drive" data to the DUT or "compare" data to be matched from those received from the DUT are established. This is repeated with the system's high-speed communication path for each of the PPGs assigned to the DUT test.




#### Figure 2

Tester-per-pin system architecture.

Resident in the path from the PPG to the DUT is the high-speed pin electronics (PE) subsystem. Its function for each test pin is to format the digital data [return-to-zero (RZ), return-to-one (R1), etc.], set the signal timing edges, and establish the voltage levels for the DUT, all under P/N program control. The output responses of the DUT are transmitted back to the comparators, whose voltage and timing windows have been set. Data that fail in comparison with an expected standard are logged in the pin-pattern generator fail buffers for later analysis.
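The data formatting step can be illustrated with a small sketch. It assumes the usual meaning of the two formats named above: between the programmed leading and trailing edges the pin carries the data bit, and outside that window it returns to 0 (RZ) or to 1 (R1). The function name `format_cycle` and the use of integer time steps are illustrative assumptions, not part of the ATS design.

```python
def format_cycle(bit: int, fmt: str, lead: int, trail: int, period: int) -> list:
    """Return the drive level (0 or 1) at each time step of one tester cycle.

    lead and trail are the programmed timing edges, in time steps from the
    start of the cycle. Outside the [lead, trail) window the pin idles at
    0 for return-to-zero (RZ) or at 1 for return-to-one (R1).
    """
    if not 0 <= lead < trail <= period:
        raise ValueError("edges must satisfy 0 <= lead < trail <= period")
    idle = {"RZ": 0, "R1": 1}[fmt]
    return [bit if lead <= t < trail else idle for t in range(period)]


if __name__ == "__main__":
    for fmt in ("RZ", "R1"):
        for bit in (0, 1):
            print(fmt, bit, format_cycle(bit, fmt, lead=1, trail=3, period=4))
```

With these definitions, an RZ pin produces a pulse only for a data "one", and an R1 pin produces a pulse only for a data "zero".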

The closely spaced DUT I/O electrical interconnection pads must be connected to the relatively large spacing of the test-system I/Os. The space transformer makes this physical transformation from the tester connections to the small spacing of the DUT pads, while maintaining a controlled 50-$\Omega$ impedance for signal fidelity.

The core of the test-system architecture is its pattern-generation and pin-electronics facilities. Other significant tester services are provided, such as DUT operating power supplies, DUT automated/manual handling for semiconductor wafer or module testing, software programs for tester operation, and system diagnostics [5]. However, these topics are beyond the scope of this paper.

## **Test-pattern generation**

The first ATS system concerned itself with memory-device testing. Its challenge was the implementation of high-speed pattern generation and high-speed pin electronics. DUT memory devices typically exhibit failure mechanisms such as adjacent memory cell charge disturbs, failure of address decoder and charge sense amplifier circuitry, and performance deficiencies in address access time. The test system must produce digital patterns for the device cell address, its control signals, data to be written to the DUT, and data to be compared to DUT output.

Figure 3 shows some of the possible memory-addressing patterns.

The data patterns in a memory device can be simple, such as writing solid "ones," "zeros," or "checkerboards" while addressing the device from the first cell location to the last, then reading back the written data in the same sequence. "Walking" or "galloping ones" and "disturb" tests are much more complex, requiring per-pin data strings that are very long. The collection of data at all test pins in a single test cycle is said to be a vector. A series of vectors required for a complex and long test pattern is classified as $N^{3/2}$ or $N^2$ data, where $N$ is the number of memory locations within the DUT.

#### Figure 3

Examples of memory device addressing patterns.

Figure 4 is a simple example of a 4-bit memory undergoing a galloping ones test. Notice that this test requires a vector string proportional to $N^2$. If this example were applied to larger memories, such as a 64-Kb memory (65 536 cell locations), test vectors of at least 8 billion bits would be required.
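The $N^2$ growth can be checked with a short sketch. It assumes the sequence suggested by Figure 4 (write a "one" to the reference cell, then alternate a read of each "away" cell with a rewrite of the reference cell) and further assumes that a full test lets every cell take a turn as the reference cell. The function names are illustrative only.

```python
from typing import Iterator, Tuple

Op = Tuple[str, int, int]  # (operation, cell address, data bit)


def gallop_one_reference(n_cells: int, ref: int) -> Iterator[Op]:
    """Yield the operations for one reference ("home") cell.

    The reference cell is written with a 1, then every other ("away") cell
    is read expecting 0, with the reference cell rewritten after each read.
    """
    yield ("WRITE", ref, 1)
    for away in range(n_cells):
        if away == ref:
            continue
        yield ("READ", away, 0)
        yield ("WRITE", ref, 1)


def gallop_vector_count(n_cells: int) -> int:
    """Vectors needed when every cell takes a turn as the reference cell."""
    per_reference = 1 + 2 * (n_cells - 1)
    return n_cells * per_reference


if __name__ == "__main__":
    ops = list(gallop_one_reference(16, ref=10))
    print(len(ops), ops[:4])
    print(gallop_vector_count(64 * 1024))
```

Under these assumptions the count is $N(2N - 1)$, which for $N = 65\,536$ is roughly 8.6 billion vectors, in line with the figure quoted above.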

One possible method of handling such a large volume of PPG buffer data would be to store an image of all of the test vectors in a large high-speed buffer. This would be prohibitively expensive. The PPG buffer size could be reduced by stopping the test to reload the PPG buffers from a less expensive bulk memory buffer. However, a severe product throughput degradation would result from this solution.

The most practical solution, though it is difficult to implement, is to recognize the nature of the vector strings with their repetitions and algorithmically generate the test patterns. Looping on a limited amount of data in the PPG buffers, analogous to nested DO loops in FORTRAN programming, will produce the desired data. No restrictions have been determined to date on the test-pattern needs of the DUT. This algorithmic pattern generator (APG) technique is realizable at a competitive test-system hardware cost. Two methods are available today for algorithmic pattern generation. They are implemented either in a "shared-resource" or a "tester-per-pin" pattern generator architecture.

| A \ B | 0 | 1 | 2 | 3 |
|-------|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 1 (ref. cell) | 0 |
| 3 | 0 | 0 | 0 | 0 |

Reference memory cell at 2,2. Write to the reference cell each time, after reading each of the memory address locations. Check for disturbances at "away" locations.

| Time | Location (A-B) | Operation |
|------|----------------|-----------|
| 1 | 2-2 | WRITE "1" |
| 2 | 1-1 | READ "0" |
| 3 | 2-2 | WRITE "1" |
| 4 | 1-2 | READ "0" |
| 5 | 2-2 | WRITE "1" |
| 6 | 1-3 | READ "0" |
| 7 | 2-2 | WRITE "1" |
| 8 | 2-1 | READ "0" |
| 9 | 2-2 | WRITE "1" |
| 10 | 2-3 | READ "0" |
| 11 | 2-2 | WRITE "1" |
| 12 | 3-1 | READ "0" |
| 13 | 2-2 | WRITE "1" |
| 14 | 3-2 | READ "0" |
| 15 | 2-2 | WRITE "1" |
| 16 | 3-3 | READ "0" |
| 17 | 2-2 | WRITE "1" |
| $N$ | | and so on |

#### Figure 4

Example of a "galloping ones" test.

### Shared-resource algorithmic pattern generators

A typical shared-resource test system for memory-device test utilizes an architecture where the chip addresses to be produced are described in terms of X and Y test vectors to denote the memory-cell location. Quite often a Z vector is used to describe the third dimension of a three-dimensional chip address. The availability of these X, Y, and Z addresses is fixed to specific tester pins, to maintain the optimum test-frequency characteristics of the system. However, a memory to be tested will extend beyond the addressing capability of a given tester if the number of memory-cell locations increases beyond the dimensions of the X, Y, and Z vector lengths.

Some memory products have the capability of multiport addressing, in which different means would be required for the memory read and write portions to be performed in parallel. A shared-resource test system limits the number of ports the product can have tested simultaneously because of its limited extent and its rigid assignment of X, Y, and Z vectors to a given set of test pins. It also becomes impossible to multiplex the necessary APG outputs to all tester pins as the product's pin count increases.

Tester-per-pin architecture allows maximum flexibility of system operation because, due to the universal makeup of the pins, there is no predefined pin function in the hardware. Each pin can assume the required configuration for the DUT under P/N program control (i.e., cell addressing, data-in/data-out compare, controls).

The system hardware implementation of a per-pin APG represents a substantial departure from that used in the shared-resource system architecture. The latter typically utilizes a set of digital counters to produce the required test vectors, as in an X/Y-chip-addressing scheme. For an  $N^2$  test, one counter keeps track of the "home" address, while the other is incremented for the "away" address to produce the chip addressing as previously described. In the case of a checkerboard or column/row bar cell data pattern, one counter is used for the chip cell row addresses, while the other is used for the cell column addresses. Device address lines are connected to the appropriate counter outputs through a multiplexer network. Limitations exist because of the availability of only two or three digital counters in a shared-resource system architecture.
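The two-counter scheme can be sketched in a few lines. The example below is only an illustration of the idea, not the hardware: `checkerboard_writes` steps a row counter and a column counter, taking the checkerboard data bit as the XOR of the counters' low-order bits (an assumption about how the data would be derived), and `home_away_addresses` holds a "home" counter while an "away" counter is incremented.

```python
def checkerboard_writes(rows: int, cols: int):
    """Step a row counter and a column counter through every cell.

    Yields (row, col, data), where data is the checkerboard bit for the
    cell, taken here as the XOR of the two counters' low-order bits.
    """
    for row in range(rows):          # row-address counter
        for col in range(cols):      # column-address counter
            yield row, col, (row ^ col) & 1


def home_away_addresses(n_cells: int):
    """Two counters for an N^2 test: 'home' is held while 'away' steps."""
    for home in range(n_cells):
        for away in range(n_cells):
            if away != home:
                yield home, away


if __name__ == "__main__":
    print(list(checkerboard_writes(2, 2)))
    print(len(list(home_away_addresses(4))))
```

Any pattern that needs a third independent index would need a third counter, which is the limitation the paragraph above describes.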

### Tester-per-pin algorithmic pattern generation

The ATS project recognized early the need to develop an efficient and cost-effective method of producing digital patterns at high speed. Months of analysis and simulation were performed on every known vector string that was utilized in the test of memory devices. The design team's conclusion was that it was possible to implement such a pin-pattern generator architecture [6] by using a high-speed buffer of realizable dimensions for pattern store and by repeating the output sequence from the buffer under the control of its associated logic circuitry. A nested DO-loop structure, implemented in hardware for data control and reconfigurable buffer memory, would meet the needs.

For example, Figure 5 demonstrates the nested loop structure that has been implemented in the test-system hardware. Consider the FORTRAN program, noting the complexity of the statements required to produce the desired "OUT" data address.

#### Figure 5

Algorithmic pattern loop equations.

Figure 6 shows the block diagram of the APG loop control data structure. The engineering study suggested a buffer of depth K, with each word of the buffer consisting of N bits of either data out, or data compare mask, or the data to be expected and compared from the DUT. The N bits of the buffer word can also be of the form N/X chip control words, where X is the length of a control word. Pattern buffer data are not used directly as drive or expected data in the latter case. Instead, X bits of data per tester cycle are sent to the pin electronics channel to address a memory containing $2^X$ preloaded control words at the pin electronics channel.

A section containing M control bits is used for the logic circuitry that determines the nested data looping operation. The sum of the N bits plus the M bits of the buffer word completes the buffer word width.

"Pipelining" and multiplexing of data are used to overcome limitations in APG control logic and buffer memory performance. However, once these techniques have been fully utilized, the architectural limitations of the APG structure are reached.

For example, if N is equal to 1, different words of the buffer are addressed at a maximum rate of one word every Y ns, where Y is the test-cycle period. As Y decreases and the number of buffer words K to be used for complex patterns increases, the gate delays of the logic circuitry and associated pin-buffer memory devices cannot respond fast enough to produce the APG output data string. Random branching to any data word prevents the use of interleaved memories for the pattern buffer.

Multiple levels of pattern-buffer data-word looping are required. However, as the number of nested loops increases, demands on loop control logic performance increase at a given test operating rate. Therefore, the performance level of the control logic constitutes the upper limit of the APG's operating frequency.

Whereas multiple nested loops provide maximum operating flexibility, implementation of the loop control logic is complex. **Figure 7** is a simplified diagram of the loop control logic structure.

Programmable multi-bit counters must be provided to track the number of passes each loop has completed. Control logic for incrementing each loop counter is also required. The outer loop counter is incremented only if the present buffer word is the last word of the outer loop and all inner loops have reached their maximum counts.

The logic circuit performance is not the only limitation of the loop count increment function. A possible worst-case condition of all loops completing at the same buffer word must be considered. In this case, each loop's "loop-complete" condition must be propagated to the next loop in the nest. That loop in turn increments and propagates its loop-complete condition to the next level of looping. This worst-case loop count update time must fit within the update time permitted for the operation. Note that as the number of data bits per pattern buffer word decreases, the rate at which loop counts must be updated increases. Therefore, selection of *N* affects the maximum number of nested loops possible within the APG pattern buffer.

Additionally, each loop requires storage of its starting address. Control logic selects a loop's starting address as the next APG pattern buffer address when the loop is to be repeated.

#### Figure 7

Algorithmic pattern generator loop control block diagram. Sx = Flag for start of loop x; Ex = Flag for end of loop x; xLC = Loop x count. Pattern buffer words beginning at Sx and ending at Ex are repeated xLC times.
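The loop control behavior described above (start and end flags, per-loop pass counters, stored starting addresses, and outward propagation of "loop-complete") can be modeled in software. The following is a minimal sketch under stated assumptions, not the ATS logic: the names `BufferWord` and `run_apg` are hypothetical, a start flag and its loop count are stored together, and a single character stands in for the N pattern bits of a word.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BufferWord:
    data: str                                            # stand-in for the N pattern bits
    start: Dict[str, int] = field(default_factory=dict)  # loop name -> count (S flag plus xLC)
    end: List[str] = field(default_factory=list)         # loops ending here, innermost first


def run_apg(buffer: List[BufferWord]) -> List[str]:
    """Expand a pattern buffer with nested loops into the output data string."""
    out: List[str] = []
    start_addr: Dict[str, int] = {}   # stored starting address of each loop
    passes: Dict[str, int] = {}       # completed passes of each active loop
    limit: Dict[str, int] = {}        # programmed count of each active loop
    addr = 0
    while addr < len(buffer):
        word = buffer[addr]
        for name, count in word.start.items():
            if name not in passes:            # entering the loop, not repeating it
                start_addr[name] = addr
                passes[name] = 0
                limit[name] = count
        out.append(word.data)
        next_addr = addr + 1
        for name in word.end:                 # innermost loop is examined first
            passes[name] += 1
            if passes[name] < limit[name]:
                next_addr = start_addr[name]  # repeat this loop
                break
            del passes[name]                  # loop complete: propagate outward
        addr = next_addr
    return out


if __name__ == "__main__":
    buf = [
        BufferWord("a", start={"B": 2}),
        BufferWord("b", start={"A": 3}, end=["A"]),
        BufferWord("c", end=["B"]),
    ]
    print("".join(run_apg(buf)))
```

In the usage example, inner loop A repeats the word "b" three times inside outer loop B, which runs twice. The `for name in word.end` loop is the software counterpart of the worst-case propagation path: when several loops end on the same word, each completed loop hands its "loop-complete" condition to the next one out before the next address is known.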

Memory and logic device selection for the ATS I APG was dictated by speed and logic device density requirements. Emitter-coupled-logic 100K family devices were used for control logic, and bipolar SRAMs were used for the pattern buffer; 32 bits of data per pattern buffer word were chosen to allow the control logic to support test system output beyond 250 MHz. The same constraints limited the number of nested loops to three levels, with each loop having a maximum count of $2^{16}$.

| Buffer word | Loop A | Loop B | Loop C | ... | Loop Z |
|--------------|-------------|-------------|-------------|-----|-------------|
| Word 1 | SA, EA, ALC | SB, EB, BLC | SC, EC, CLC | ... | SZ, EZ, ZLC |
| Word 2 | | | | | |
| ... | | | | | |
| Word $(K-1)$ | | | | | |
| Word $K$ | | | | | |

#### Figure 6

Loop control data structure. Sx = Flag for start of loop x; Ex = Flag for end of loop x; xLC = Loop x count. Pattern buffer words beginning at Sx and ending at Ex are repeated xLC times.
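As a rough check on the ATS I figures, and assuming (as the $N = 1$ case discussed earlier implies) that one pattern buffer word supplies a pin with $N$ consecutive test cycles of data, the time available to fetch the next word and update the loop counts at a 250-MHz test rate, where the cycle period $Y$ is 4 ns, would be about:

$$
T_{\text{word}} = N \cdot Y = 32 \times 4\ \text{ns} = 128\ \text{ns}
$$

Under the same reading, three nested loops with a maximum count of $2^{16}$ each can repeat an innermost word sequence up to $(2^{16})^3 = 2^{48} \approx 2.8 \times 10^{14}$ times.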

Analysis of complex  $N^2$  patterns indicated that more than four levels of nested loops are required. However, this need is typical of associated DUT test pins with common APG functional requirements, e.g., the group of test pins producing the memory's cell address.

This fact suggested that a shared-resource function could be tied to test pins with a common test function. Four identical programmable controllers were installed, one for each of the four types of memory-device test functions (i.e., address, control, data in to DUT, and compare data to DUT output). These devices would be dedicated under P/N program control to their assigned APGs to extend the level of looping beyond three, as required by the test pattern. During a test, each programmable controller provides this extra looping capability by presenting branch addresses to locations in the pattern buffers for each test pin's APG. Four loops are available on each of these controllers. Each PPG is supported by its own buffer, control logic, and arithmetic logic unit (ALU).

Each of the APG buffer memories can be reconfigured under program control for further extension of its capabilities in more difficult data patterns. In one mode, 32 bits of loop control data are traded for 32 mask bits. This procedure is utilized by those test pins designated as receivers of data from DUT pins, where each of the 32 data bits is used as expected data and the 32 mask bits have a one-to-one association with the data bits.

In another mode, the same 32-bit loop control data can be substituted for 32 additional data bits. This procedure is used primarily when the test data are truly random in nature, where much less repetition occurs. In this mode, K(32 + 32) pattern data bits are available.
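The buffer modes can be illustrated with a short sketch. It assumes that a mask bit of 1 means "compare this bit" and 0 means "ignore it" (the polarity is not stated in the text), and the function names are hypothetical.

```python
def masked_compare(observed: int, expected: int, mask: int) -> int:
    """Return a 32-bit word with a 1 wherever an unmasked bit miscompares.

    A mask bit of 1 is assumed to mean "compare this bit"; 0 means "ignore".
    """
    return (observed ^ expected) & mask & 0xFFFFFFFF


def pattern_bits(k_words: int, mode: str) -> int:
    """Pattern data bits held in a K-word buffer for each buffer mode."""
    bits_per_word = {"loop": 32, "mask": 32, "data": 32 + 32}[mode]
    return k_words * bits_per_word


if __name__ == "__main__":
    print(bin(masked_compare(0b1010, 0b1000, 0b1111)))
    print(pattern_bits(1024, "data"))
```

In the "data" mode the loop control field is given up entirely, so a buffer of K words carries K(32 + 32) pattern bits but can no longer loop on its own.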

A 64-MB bulk buffer is available to the test system's APG per-pin buffer after all other possibilities for producing the data required for test have been exhausted. A high-speed reload of large volumes of P/N program data can be performed by stopping the DUT test process and loading the per-pin buffers with new test patterns, repeated as required.

## Advances in pin-pattern generator architecture

A natural extension of the memory-device pattern generator architecture is to expand its role to include logic testing. This has been accomplished by combining the required APG logic circuitry with that necessary for generating logic test patterns to produce a VLSI gate array of higher density.

Current implementations of the tester's pin-pattern generator have been optimized by eliminating the four programmable controllers. This was made possible by advances in the density and performance of VLSI gate arrays. All PPG logic for algorithmic pattern generation and logic test methodology support is now included in one 70 000-gate CMOS gate array device, with the exception of the weighted random-pattern function, enabling a single-chip implementation of expanded PPG function.

Two factors led to this modification of the APG architecture. The first was the need for large amounts of "random" data to drive an LSSD string input to the DUT. The second was that only pins associated with expected data in a memory device test use the programmable controllers.

If large DRAM buffers were used at each pin, the need for expected data looping was decreased. Therefore, the buffer memories were increased in size, the looping capability per pin was increased from 3 to 4, and the innermost loop was given the capability to increment or decrement the loop count based on the status of other loops, eliminating the need for programmable controllers. This allowed the storage of data in the pin buffer in a compressed format to be used either for algorithmic pattern generation in array testing or for "flushing out" each PPG buffer for logic device testing. Data stored in a compressed format within the buffer are expanded before being transmitted from the tester to the DUT for logic device testing.

For the test of memory devices designed with LSSD rules and with embedded logic, the required algorithmic patterns are provided to the memory serially through the LSSD chain. Since the device to be tested has no restrictions on its pad location or number of LSSD chain inputs, a per-pin LSSD APG was added for this purpose.

Each of these APGs resembles a shared-resource X-Y pattern generator with the ability to serialize the output. Furthermore, the bit sequence can be programmed to reflect the way in which the I/Os embedded in the DUT array are connected into the LSSD chain.

The requirement for a large per-pin buffer encouraged the use of DRAM per-pin buffers as opposed to expensive SRAMs. The logic necessary to refresh the DRAM, as well as error correction for one-bit errors, is built into the same CMOS gate array used for the various types of pattern-generation methods described. To allow uninterrupted pattern flow to the DUT during refresh, a first-in/first-out (FIFO) file, realized by a multi-port cache memory, is used to buffer the data from the DRAM. In this manner, the pattern generator uses data from the FIFO, and the DRAM control logic ensures that the FIFO is never emptied. During program initialization, the FIFO is filled before pattern generation is started.
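The role of the FIFO during refresh can be seen in a toy simulation. Every number here (FIFO depth, refill rate, refresh period and length) is a hypothetical value chosen for illustration, and a Python `deque` stands in for the multi-port cache memory. The only behavior taken from the text is that the FIFO is filled before pattern generation starts and must never run empty while the DRAM is unavailable.

```python
from collections import deque


def simulate_fifo(cycles: int, depth: int = 16, refill_per_cycle: int = 2,
                  refresh_period: int = 64, refresh_length: int = 8) -> int:
    """Return the lowest FIFO occupancy seen; raise if the FIFO underflows."""
    fifo = deque()
    next_word = 0
    # Program initialization: fill the FIFO before pattern generation starts.
    while len(fifo) < depth:
        fifo.append(next_word)
        next_word += 1
    lowest = len(fifo)
    expected = 0
    for cycle in range(cycles):
        refreshing = (cycle % refresh_period) < refresh_length
        if not refreshing:
            # The DRAM can deliver data only when it is not being refreshed.
            for _ in range(refill_per_cycle):
                if len(fifo) < depth:
                    fifo.append(next_word)
                    next_word += 1
        if not fifo:
            raise RuntimeError(f"FIFO underflow at cycle {cycle}")
        word = fifo.popleft()      # pattern generator consumes one word per cycle
        assert word == expected    # data arrive in order, with no gaps
        expected += 1
        lowest = min(lowest, len(fifo))
    return lowest


if __name__ == "__main__":
    print(simulate_fifo(1000))
```

With these illustrative numbers the FIFO must be deeper than the refresh interval is long, and the DRAM must refill faster than the pattern generator drains, for the pattern flow to stay uninterrupted.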

For each tester cycle, the information sent to each pin of the pin electronics controls the data, pulse format, and timing, as well as input/output control. This permits complete and independent control of each tester pin [7].

The pin electronics returns the error data to the pattern generator from the two comparators on each pin for every cycle. These data are then compressed at full operating speed in a high-speed buffer residing in the CMOS gate array. Whenever this local buffer is filled, the compressed error log results are transferred to the DRAM storage.

In summary, this single custom logic device contains the circuitry necessary for algorithmic pattern generation, compressed deterministic flush patterns, and LSSD test of the DUT. Testing of logic devices with embedded memories, as well as support for boundary-scan test techniques, is provided by the LSSD APG circuitry. The associated circuits for DRAM timing and refresh and for the control of the high-speed cache buffer used during data collection are also provided by this single logic chip.

Many memory devices under test today contain a considerable amount of logic circuitry. An enhanced pattern generator architecture (Figure 8) was designed to support the testing of logic devices as well as combinations of memory and logic devices.

The last mode of pattern generation supports a test methodology developed by IBM called weighted random test [8, 9]. The advantage of this mode of testing is that smaller volumes of both test data and failure storage are required during test of a particular P/N program. The key element provided on a per-pin basis is a linear feedback shift register (LFSR).

By providing a particular starting point and configuring the LFSR to produce a digital data stream that may be described as an irreducible polynomial, the circuit generates a unique pattern before repeating itself after $2^N - 1$ cycles (where N is the maximum length of the shift register). These predictable and repeatable test data streams are sent to the device being tested. The particular LFSRs chosen for this design comprise 32 bits and have programmable tap selection. They can be configured as random-pattern generators for stimulating the product or as signature analyzers for compressing the response of the product output.
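The $2^N - 1$ repeat length can be checked with a small software model of an LFSR with programmable taps. This is a sketch under stated assumptions: the Fibonacci (external-XOR) structure, the tap numbering, and the function names are illustrative and not taken from the ATS hardware, and 4-bit and 16-bit registers are used instead of 32 bits so that the full period can be counted quickly. Note that the full $2^N - 1$ length is reached only when the feedback polynomial is primitive, which is a stricter condition than irreducible.

```python
from typing import Tuple


def lfsr_step(state: int, taps: Tuple[int, ...], nbits: int) -> int:
    """Advance a Fibonacci LFSR by one shift (taps are 1-based positions)."""
    fb = 0
    for t in taps:
        fb ^= (state >> (nbits - t)) & 1
    return (state >> 1) | (fb << (nbits - 1))


def lfsr_period(seed: int, taps: Tuple[int, ...], nbits: int) -> int:
    """Count shifts until the register returns to its seed state."""
    if seed == 0 or seed >> nbits:
        raise ValueError("seed must be a nonzero nbits-wide value")
    if nbits not in taps:
        raise ValueError("taps must include position nbits")
    state, steps = lfsr_step(seed, taps, nbits), 1
    while state != seed:
        state = lfsr_step(state, taps, nbits)
        steps += 1
    return steps


# x^4 + x^3 + 1 and x^16 + x^14 + x^13 + x^11 + 1 are maximal-length choices.
print(lfsr_period(0b0001, (4, 3), 4))
print(lfsr_period(0xACE1, (16, 14, 13, 11), 16))
```

Because the sequence depends only on the seed and the tap selection, the same stream can be regenerated at any time, which is what makes the test data predictable and repeatable.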

Because random-pattern generation generally takes much longer to fully exercise a product, a method called weighted random-pattern testing (WRPT) is used. Here, the LFSR output is applied to a combinatorial logic circuit that can change the pseudorandom output from a 50/50 probability of getting a logical 1 or 0 to a biased one. The weighted LFSR output provides the necessary test coverage on the DUT with a smaller pattern set than the unweighted methods. The data sequences are analyzed to see if they test the device fully, with the WRPT sequence re-analyzed and optimized by adjusting seed values and weights, for the highest test coverage at a minimum DUT test time.
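One simple way to bias a pseudorandom stream with combinational logic is to AND several register stages together: with k stages the probability of a logical 1 falls to about $2^{-k}$, and inverting the result gives about $1 - 2^{-k}$. The sketch below illustrates that idea only; the weighting logic actually used in the tester is not described here, and the 16-bit register, seed, and function names are assumptions.

```python
def lfsr16_states(seed=0xACE1):
    """Yield successive states of a 16-bit maximal-length Fibonacci LFSR."""
    state = seed
    while True:
        yield state
        fb = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (fb << 15)


def weighted_bits(k, n, invert=False):
    """AND k register stages so that P(1) is about 2**-k (or 1 - 2**-k)."""
    mask = (1 << k) - 1
    gen = lfsr16_states()
    out = []
    for _ in range(n):
        bit = int((next(gen) & mask) == mask)
        out.append(bit ^ 1 if invert else bit)
    return out


# Measure the fraction of ones over one full period of the register.
for k in (1, 2, 3):
    ones = sum(weighted_bits(k, 65535))
    print(k, round(ones / 65535, 4))
```

Selecting k per pin (and per shift, for LSSD inputs) is what the stored weight values would control in such a scheme.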

Circuits with LSSD chains require more stringent test system criteria, since they may need the weights delivered to the LFSR pseudorandom output to be changed on every data shift out of the LFSR. The test system, at the full data rate, must be able to update the LFSR weighting circuit. Large buffers are needed behind any LSSD input pin to store the weight values for a particular test sequence, while the looping capabilities of the APG are used to compress these data.

Figure 9

Pin electronics architecture.

## Test-system pin electronics

The pin electronics circuitry provides the high-speed analog signal interface between the tester's digital pattern generator and the device under test. The required patterns initiated by the pattern generator are received by the pin electronics, where proper timing edge placement and pulse format and voltage conditions are applied as DUT stimulus. The DUT responses are received by the system's pin electronics, where the signal obtained is strobed at a particular voltage reference level and at a specific point in time. This information is converted to binary data and is compared to the expected data as produced by the pattern generator. Failures in obtaining a match are considered device errors and are returned to the pattern generator for storage in buffers for later analysis.

The major components of overall system accuracy are contained in this section of the test system (Figure 9). Included are the timing generation network, data formatting, and driver/comparator circuitry. The DUT's interface to the test system, which consists of an electromechanical structure to map the tester pins to those of the DUT on a much smaller physical spacing of contacts, and the device probing network for semiconductor wafer testing, is part of the signal path.

### Test-system timing generation

Memory-device data access is achieving subnanosecond performance times, while the logic device's gate delays are decreasing below 100 ps. This places unique requirements on the tester to produce accurate pulses of high fidelity, repeatable at all of the DUT's I/O ports. Any deviation from the timing sequences required at the DUT, known as "skew" associated with the timing system, generates an unknown in the tester measurement. The result is tester "guard-banding" to avoid shipment of devices for which test results do not match a statistical quality level, also known as shipped product quality level (SPQL). However, excessive guard-banding could also result in the tester erroneously identifying as failing devices that actually meet specifications. As the measurement tolerances become smaller, the burden on the tester's timing generation unit and autocalibration network becomes more critical [10].
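The trade-off in guard-banding can be made concrete with a small sketch. It assumes a simple model in which the tester's measurement error is bounded by a symmetric accuracy figure and the test limit is tightened by that amount; the numbers and function names are hypothetical.

```python
def guard_banded_limit(spec_max_ps: int, accuracy_ps: int) -> int:
    """Tighten a maximum-delay limit by the tester's worst-case error."""
    return spec_max_ps - accuracy_ps


def outcome(true_ps: int, spec_max_ps: int, accuracy_ps: int) -> str:
    """Classify a device by its true delay, given a bounded tester error."""
    limit = guard_banded_limit(spec_max_ps, accuracy_ps)
    if true_ps + accuracy_ps <= limit:
        return "always passes"
    if true_ps - accuracy_ps > limit:
        return "always fails"
    return "depends on tester error"


# Assumed example: 5000 ps specification, tester accurate to +/-50 ps.
for true_ps in (4900, 4950, 5000, 5001):
    print(true_ps, outcome(true_ps, 5000, 50))
```

In this model no device slower than the specification can pass, but good devices within twice the tester accuracy of the limit may be rejected, so halving the tester error halves the band of devices at risk.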

Several unique features were added in the ATS test system to enhance the overall signal timing distribution [11]. First, a master clock operating at a lower frequency is distributed throughout the system to minimize the problems associated with high-frequency clock distribution over a long distance. Phase-locked-loop (PLL) modules multiply this master clock input to the desired operating frequency.

The second feature of the timing system is that the timing generators on the electronic cards are centimeters away from the tester's DUT driver and comparator circuit modules. This departs from the concept used in most test systems, in which timing edges are transmitted to a test head from a much greater distance. The close proximity of the timing generators also minimizes the "skin" effect of high-speed clock distribution through a coaxial cable over long distances, resulting in an accurate placement of the timing pulses.

The timing circuitry must also be designed to permit very fine steps of clock edge adjustment. This is necessary to define the timing edge placement of a particular test sequence with a very fine resolution in relation to other edges in the system and to allow for fine movements in calibrating all of the tester pins within a certain tolerance. In earlier systems, an analog technique was used to generate the fine delay. In the ATS system, the PLL not only multiplies the master clock, but also serves as the tester fine-delay unit.

Figure 10

Integrated digital timing delay block.

Control of the fine-delay unit is obtained by injecting a voltage into the summing node of the PLL's low-pass filter. The error voltage changes the average dc value of the integrated phase detector output. When applied to the voltage-controlled-oscillator (VCO) section, the loop will momentarily unlock and cause a phase shift in the PLL output equivalent to an edge movement until the circuit restabilizes.

This implementation of the fine-delay unit provides a delay-controlled output relative to the test system's master clock, over the operating frequency range of the PLL. A high-gain amplifier in the unit's loop makes the phase-delay error very small. Phase jitter was a difficult problem to overcome, but circuit techniques were developed to reduce jitter to less than ±20 ps.

The advantages of this form of electronic delay are derived from its ability to operate at frequencies higher than those found in today's typical test systems. It will perform over 100% of the input duty cycle without the linearity or drift problems associated with timing delays implemented with the usual analog-circuit, ramp-type delay units. Timing delays can be programmed with resolutions as fine as 4 ps. The maximum frequency of operation is 500 MHz, with a timing edge placement accuracy better than ±50 ps.
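As a quick check on these figures, the arithmetic below relates the quoted resolution and accuracy to the clock period at the maximum operating frequency. It uses only the numbers given above; the symbols $T$, $\Delta t$, and $\Delta\theta$ are introduced here for illustration.

$$
T = \frac{1}{f_{\max}} = \frac{1}{500\ \text{MHz}} = 2\ \text{ns}, \qquad
\frac{T}{\Delta t} = \frac{2000\ \text{ps}}{4\ \text{ps}} = 500\ \text{steps per period}
$$

$$
\Delta\theta = 360^\circ \cdot \frac{\Delta t}{T} = 360^\circ \cdot \frac{4\ \text{ps}}{2000\ \text{ps}} = 0.72^\circ, \qquad
\frac{50\ \text{ps}}{2000\ \text{ps}} = 2.5\%\ \text{of a period}
$$

A 4-ps step therefore corresponds to a phase shift of less than one degree at 500 MHz.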

The PLL fine-delay approach has limitations because it is not conducive to altering the timing of pulses provided to the DUT without stopping its test. This desired test-system feature is commonly known as timing-on-the-fly (TOF) changes. Also, because of the large size of the PLL module, this technique is not compatible with systems using large pin counts. Therefore, a digital technique is being developed to alleviate both of these problems [12].

The digital fine-delay unit uses semiconductor logic gate delays and circuit-loading effects to generate the required small increments of delay.

The circuit shown in Figure 10 is optimized for linearity in delay characteristics over the required range of settings, with a calibration algorithm to compensate for device fabrication process variations and external environmental effects such as changes in ambient temperature and applied voltage.

To minimize the range that the fine delay must cover and to maintain a higher degree of accuracy over a smaller linear range, a high-speed digital counter is used for larger delay movements outside the fine-delay range. The counter can provide a delay that is a multiple of the input clock. Since this technique is digital, the combination of the fine- and coarse-delay circuits can be placed on one integrated circuit device, enabling placement of the pin electronics high-accuracy components in close proximity to one another. The newest logic devices contain a sufficient number of gates to permit the placement of several timing generators on one device. This is crucial for the performance and cost-competitiveness of test systems with large pin counts.
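The division of labor between the counter and the fine-delay unit can be sketched as follows. The 2000-ps clock period and 4-ps fine step are assumptions borrowed from the figures quoted earlier for the PLL-based unit; the step size of the digital unit and the function name are not given in the paper.

```python
def split_delay(delay_ps: int, period_ps: int = 2000, fine_step_ps: int = 4):
    """Split a requested delay into coarse counts and fine steps.

    Returns (coarse_counts, fine_steps, error_ps), where error_ps is the
    achieved delay minus the requested delay.
    """
    if delay_ps < 0:
        raise ValueError("delay must be non-negative")
    coarse, remainder = divmod(delay_ps, period_ps)
    fine = (remainder + fine_step_ps // 2) // fine_step_ps  # round to nearest
    if fine * fine_step_ps >= period_ps:
        # Rounding reached a full period: hand it to the coarse counter.
        coarse, fine = coarse + 1, 0
    achieved = coarse * period_ps + fine * fine_step_ps
    return coarse, fine, achieved - delay_ps


print(split_delay(5306))
```

With this split the fine-delay element only ever has to span one clock period, which is the reason given in the text for keeping its linear range small.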

### Pin electronics data path structures

The data path section of the pin electronics acts as the bridge between the system's pattern generator and the driver/comparator unit for the DUT. The data path device controls the three modes of operation for the pin electronics: driving data to the DUT, receiving data from the DUT, and serving a DUT port that can operate as either an input or an output. The last case, usually referred to as common I/O operation, places additional burdens on the test system.

The communication path from the pattern generator to the pin electronics consists of four data bits. The first implementation of the ATS test system's pattern generator provided the data at a rate of 1/4 of the actual tester cycle time. This was necessary because of packaging constraints on sending high-speed signals over a relatively long distance. The function of the data path circuit was to provide a high-speed multiplexer for capture of the data and to increase the data rate by a factor of four, as the output of the pin electronics to the DUT. Therefore, only one bit of data was provided per test-system cycle.

Figure 11

Pin electronics driver waveforms subset.
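The 4:1 multiplexing step amounts to serializing each four-bit word into four consecutive tester cycles. A minimal sketch follows; the bit ordering within a word and the function name are assumptions.

```python
from typing import Iterable, Iterator, Sequence


def serialize(words: Iterable[Sequence[int]], width: int = 4) -> Iterator[int]:
    """Emit one bit per tester cycle from width-bit words."""
    for word in words:
        if len(word) != width:
            raise ValueError(f"expected {width} bits per word")
        yield from word  # first bit in the word is driven first (assumed)


# Two words arriving at one quarter of the tester rate become eight cycles.
stream = list(serialize([(1, 0, 1, 1), (0, 0, 1, 0)]))
print(stream)
```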

System packaging and circuit improvements are being made in the succeeding generations of the ATS system. The parallel data bits are now transmitted at the full system cycle rate. These data bits provide an address to a high-speed cache within the data path circuitry which contains a set of test cases for a particular test vector. In the drive mode, the test case controls the particular data format, timing edge placement, and driver data. In the receive mode, comparator strobe edge placement and data expect/mask information from the DUT are provided. The common I/O mode utilizes a combination of the drive and receive controls.
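The lookup scheme described above can be modeled as a small table indexed by the parallel data bits. The sketch below is illustrative only: the field names, the format label "NRZ", and the edge values are hypothetical, and the real cache contents are not specified in this paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VectorCase:
    """One hypothetical data-path test case (field names are illustrative)."""
    mode: str                      # "drive", "receive", or "common_io"
    edge_ps: int                   # drive edge or comparator strobe placement
    fmt: Optional[str] = None      # pulse format, drive modes only
    data: Optional[int] = None     # driver data, drive modes only
    expect: Optional[int] = None   # expected DUT response, receive modes only
    mask: bool = False             # True means ignore the comparison


class DataPathCache:
    """Lookup table addressed by the parallel bits from the pattern generator."""

    def __init__(self, address_bits: int = 4) -> None:
        self.table: List[Optional[VectorCase]] = [None] * (1 << address_bits)

    def _check(self, address: int) -> None:
        if not 0 <= address < len(self.table):
            raise IndexError(f"address {address} out of range")

    def load(self, address: int, case: VectorCase) -> None:
        self._check(address)
        self.table[address] = case

    def select(self, address: int) -> VectorCase:
        self._check(address)
        case = self.table[address]
        if case is None:
            raise KeyError(f"no test case loaded at address {address}")
        return case


cache = DataPathCache()
cache.load(0b0000, VectorCase("drive", edge_ps=200, fmt="NRZ", data=0))
cache.load(0b0001, VectorCase("drive", edge_ps=200, fmt="NRZ", data=1))
cache.load(0b0010, VectorCase("receive", edge_ps=1500, expect=1))
cache.load(0b0011, VectorCase("receive", edge_ps=1500, mask=True))

# Each tester cycle supplies a new address, so timing and format can change
# from one cycle to the next without reloading anything.
for address in (0b0001, 0b0010, 0b0011):
    print(address, cache.select(address))
```

Four address bits give sixteen selectable test cases per pin in this model; a change of waveform then costs only a different address on the next cycle.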

As a result of this enhanced mode of communication, the adjustment of both data timing and formatting can now be made "on the fly" without stopping the test system. The result is a wide menu of product waveform sets that can be selected to test or characterize the product under varying conditions to ensure its proper operation. Figure 11 illustrates a small subset of typical driver waveforms that can be generated from the lookup table of the data path module.

### Test-system-to-device-under-test interface

The driver and receiver/comparator circuitry form the ac interface of the test system to the DUT; dc parametric testing of the product is achieved through a separate per-pin parametric measuring unit (PMU), electrically connected to the product by a relay network. The network also provides the test system's interconnection to its autocalibration units.

The various product technologies and unique system-to-DUT interface conditions place several constraints on the driver and comparator amplifier designs. The driver must be able to generate large voltage swings, with short transition times and minimal signal overshoots or undershoots [13]. Fast edge speeds typically create waveform overshoots, and this problem is further compounded by the need to present to the DUT variable amplitudes under P/N programming control.

The overshoot is caused by the excess current demand necessary to charge the parasitic driver device capacitance during the short transition time. The solution used is to design a circuit in which the driver input amplitude is varied with respect to the output amplitude. This forces the driver to operate just outside the linear range of operation of the differential switching pair, finding the right balance between edge speeds and overshoot control.

Figure 12(a) shows computer-simulated results of a driver's output waveform; Figures 12(b) and 12(c) show the output waveforms from the actual circuit. This circuit is the key element that enables implementation of a high-speed, high-voltage single-test-head design for the various DUT devices, as compared to other test systems with multiple test heads for each application.

The receiver circuit is a three-stage amplifier designed for high gain and high transition speeds, to detect both small and large signal swings. Tester comparator designs differ from those of typical amplifiers, because they must be able to detect small overdrive levels (when a voltage reference is very close to the up or down level of the product's output) and large signals. Typical large-signal applications have the reference level set in the center of the voltage range, or at a level in which a minimum voltage condition has been established to ensure triggering of the comparator circuit. The circuit's propagation delay as compared to the value of signal overdrive is critical to the design, because the placement of a voltage reference will affect the ac timing measurement when the comparator is strobed.

This problem was minimized with a first-stage amplifier input designed as a cascode stage with Schottky diode clamps and the subsequent two stages providing further voltage gain. Figure 13 shows measured results for various overdrive conditions.

#### Figure 12

Driver amplifier results: (a) simulated output waveform; typical measured driver output waveforms: (b) 250 MHz, H = 2 ns/div, V = 1 V/div, scope attenuation = 20:1 into a 1000-$\Omega$ load; (c) 500 MHz, H = 2 ns/div, V = 200 mV/div, scope attenuation = 10:1 into a 50-$\Omega$ load. © 1988 IEEE. Algirdas J. Gruodis and Dale E. Hoffman, "250 MHz Advanced Test Systems," *IEEE Design & Test of Computers* 5, 31 (April 1988).

The ATS system's pin electronics contains two comparators. This is useful in defining both timing and voltage "windowing" for enhanced device characterization. It also facilitates performing measurements on device product outputs with common I/O driver circuits.
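Voltage windowing with two comparators can be sketched as a three-way classification of the strobed level. The reference values and function names below are assumptions for illustration, with voltages expressed in millivolts.

```python
def window_state(v_mv: int, low_ref_mv: int, high_ref_mv: int) -> str:
    """Classify a strobed level using two comparator thresholds."""
    if low_ref_mv >= high_ref_mv:
        raise ValueError("low reference must be below high reference")
    above_low = v_mv > low_ref_mv      # first comparator
    above_high = v_mv > high_ref_mv    # second comparator
    if above_high:
        return "high"
    if not above_low:
        return "low"
    return "between"


def matches(v_mv: int, expect: str, low_ref_mv: int = 800,
            high_ref_mv: int = 2000) -> bool:
    """Return True on a match; a level between the references always fails."""
    return window_state(v_mv, low_ref_mv, high_ref_mv) == expect


for v in (300, 1400, 2400):
    print(v, window_state(v, 800, 2000))
```

A single comparator can only report which side of one reference the signal is on; the second reference is what lets the tester flag an output that is neither a valid up level nor a valid down level, such as a common I/O port in a high-impedance state.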

### Space transformers and device probes

Minimizing electrical discontinuities in the path from the pin electronics to the DUT's connection pads is critical to overall test-system accuracy. The entire path must be designed so that a controlled electrical impedance of 50 $\Omega$ (in the ATS system) is maintained from the tester to the product. Since the DUT's impedance may or may not match the transmission-path impedance, the tester end of the interconnection is back-terminated at the proper transmission-line impedance to prevent "ringing" of the signal.

#### Figure 13

Pin electronics comparator measured results. © 1988 IEEE. Algirdas J. Gruodis and Dale E. Hoffman, "250 MHz Advanced Test Systems," *IEEE Design & Test of Computers* 5, 31 (April 1988).
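The benefit of back termination follows from the standard transmission-line reflection coefficients. The relations below are a textbook sketch rather than material from the ATS design; $Z_0$, $Z_S$, $Z_L$, and $V_S$ denote the line impedance, the tester source impedance, the DUT input impedance, and the driver's open-circuit voltage.

$$
\Gamma_L = \frac{Z_L - Z_0}{Z_L + Z_0}, \qquad
\Gamma_S = \frac{Z_S - Z_0}{Z_S + Z_0}
$$

With $Z_S = Z_0 = 50\ \Omega$, $\Gamma_S = 0$, so a wave reflected from a mismatched DUT is absorbed at the tester end instead of being re-reflected down the line. For a high-impedance DUT input ($Z_L \gg Z_0$, $\Gamma_L \approx 1$):

$$
V_{\text{launch}} = V_S \frac{Z_0}{Z_S + Z_0} = \frac{V_S}{2}, \qquad
V_{\text{DUT}} = V_{\text{launch}} (1 + \Gamma_L) \approx V_S
$$

The DUT therefore sees the full programmed level after a single transit, with no further reflections to cause ringing.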

Although it is important to maintain a controlled impedance, it is also critical to maintain an accurately placed high-speed signal edge, and one that is not affected by the proximity of other signals in transition. To maintain the signal quality, the distance to the product must be minimized, and the choice of signal conductors must not add high-frequency loss caused by skin-effect degradation.

## Closing remarks

High-speed test systems utilizing a shared-resource architecture have served IBM's semiconductor device test needs well. However, a new generation of logic and memory devices made evident the need to continue advancing the state of the art in test systems.

The ATS project in IBM was established to address the limitations of both internal and ATE vendor test systems. A system architecture was chosen to facilitate a modular expansion of tester capabilities and to easily integrate newer system implementation technologies into the design as they became available.

The ATS project has been structured to execute in three test-system development phases, in a segmented approach to solving the complex test issues:

- The ATS I test system was designed to address the need for characterization testing of memory products in the Product Development Laboratory.
- The ATS II test system is a repackaging of ATS I into reproducible form for cost-effective system replication. Applications to logic device test were also explored during this phase of the ATS project.
- ATS III is further developing logic and memory device testing within one test system. Its architecture is readily adaptable to the various device test methodologies for each type, as well as mixed testing on one device.

The advantages of a tester-per-pin architecture for high-performance test systems have been discussed. The modularity of the system allows the tester to be configured as a low-pin-count test system consistent with boundary-scan test methodologies, or a high-pin-count system capable of exercising each of the product's pins at the maximum test rate. Preserved in the design are the tester-per-pin support of algorithmic pattern generation, LSSD testing support for both logic devices and logic embedded memories, and pattern generation for weighted random-pattern testing.

The demands on advanced tester development continue in step with rapidly expanding device capabilities and complexities. The need for test systems operating at data rates in the GHz range and overall system accuracies below 100 ps is on the horizon. GaAs logic and memory devices, as well as fiber-optic links for data transmission within the tester, could be used to solve some of the future tester design challenges. Elaborate test-pattern algorithms will continue to evolve to meet the needs of advanced products. Test methodologies which decrease product test time must be developed for products whose logic circuit densities are climbing over 100 000. Furthermore, the fact that the growth of the number of tester pins is nearly proportional to that of the number of pins in the product will require novel system designs to control system replication costs in manufacturing applications.

The logic and memory products of the 1990s are coming into existence, with high-performance test-system development striving to stay ahead.

## Acknowledgments

There are many individuals to thank for their contributions to the achievements of the Advanced Test Systems project. Many have patiently communicated to us their test needs and have provided suggestions and critiques regarding our system design, including logic and memory device designers, developers of test methodologies, users of the test system for their lab characterization of devices, final test system users in manufacturing, and other ATE developers in IBM East Fishkill, New York and IBM Burlington, Vermont. We wish especially to thank the technical leader of the ATS project, Dr. Algirdas Gruodis. Finally, much appreciation is due Erik Kusko of the ATS project for contributing technical material used in this paper, and Dan Skooglund and John Dickol for their critical reading of the paper prior to submission.

## References

1. Y. E. Chang, H. P. Muhlfeld, and R. M. Morton, "High Speed Memory Testing," *Proceedings of the IEEE Semiconductor Test Symposium*, Cherry Hill, NJ, October 1977, pp. 152–157.
2. E. B. Eichelberger and T. W. Williams, "A Logic Design Structure for LSI Testability," *Proceedings of the 14th Design Automation Conference*, New Orleans, 1977, pp. 462–468.
3. Y. E. Chang, D. E. Hoffman, A. J. Gruodis, and J. E. Dickol, "A 250 MHz Advanced Test System," *Proceedings of the 1987 IEEE International Test Conference*, September 1987, pp. 68–75.
4. A. J. Gruodis and D. E. Hoffman, "250 MHz Advanced Test Systems," *IEEE Design & Test of Computers* 5, 24–35 (April 1988).
5. J. M. McArdle, "A 250 MHz Advanced Test System Software," *Proceedings of the 1987 IEEE International Test Conference*, September 1987, pp. 85–93.
6. Y. E. Chang, A. J. Gruodis, H. P. Muhlfeld, Jr., C. W. Rodriguez, and M. L. Schulman, "Distributed Pattern Generator," U.S. Patent 4,639,919, January 27, 1987.
7. M. L. Combs, A. J. Gruodis, D. E. Hoffman, and C. A. Puntar, "A Method for Per Pin Test System Signal Specification," U.S. Patent pending, application filed January 1990.
8. J. A. Waicukauski, E. Lindbloom, E. B. Eichelberger, and O. P. Forlenza, "A Method for Generating Weighted Random Test Patterns," *IBM J. Res. Develop.* 33, 149–161 (March 1989).
9. F. Motika and J. Waicukauski, "Weighted Random Pattern Testing Apparatus and Method," U.S. Patent 4,688,233, August 18, 1987.
10. L. Grasso, C. E. Morgan, M. A. Peloquin, and F. Rajan, "A 250 MHz Test System Timing and Auto Calibration," *Proceedings of the 1987 IEEE International Test Conference*, September 1987, pp. 76–84.
11. Y. E. Chang, L. Grasso, A. J. Gruodis, and C. E. Morgan, "High Speed Programmable Timing Generator," U.S. Patent 4,608,706, August 1986.
12. J. Fischer, D. Hoffman, D. Skooglund, and D. Young, "Implementation and Calibration of an Integrated Digital Fine Delay," U.S. Patent pending, application filed April 20, 1989.
13. A. J. Gruodis, D. E. Hoffman, C. A. Puntar, and D. E. Skooglund, "A Method for Reducing and Maintaining Constant Overshoot in a High Speed Driver," U.S. Patent 4,779,270, October 1988.

Received May 19, 1989; accepted for publication October 15, 1989

Charles W. Rodriguez IBM General Technology Division, East Fishkill facility, Route 52, Hopewell Junction, New York 12533. Mr. Rodriguez is a Senior Engineering Manager. He received his B.S. in electrical engineering from New York University and his M.S. in electrical engineering from the Polytechnic Institute of New York. He is the Advanced Test Systems project manager in Test Equipment Engineering at the IBM East Fishkill facility, responsible for the development of high-performance memory and logic device test systems. Mr. Rodriguez joined IBM in 1977; he has since held assignments in the semiconductor product development laboratory and currently in test equipment development. He has been a member of the Advanced Test System project since its inception, serving on the team which defined the basic ATS system architecture. Mr. Rodriguez is the recipient of an issued U.S. patent.

Dale E. Hoffman IBM General Technology Division, East Fishkill facility, Route 52, Hopewell Junction, New York 12533. Mr. Hoffman is a Developmental Engineer. He received his B.S. in electrical engineering from Pennsylvania State University in 1981, joining IBM that same year; he received an M.S. in electrical engineering from Syracuse University in 1984. He is the manager of Test Systems Engineering in Test Equipment Engineering at the IBM East Fishkill facility, and has worked on the Advanced Test System since its inception. Mr. Hoffman's responsibilities have included the design of high-speed pin electronics circuits and subsystems. He now manages the development of advanced VLSI memory and logic test system architectures, as well as advanced pattern generator design. Mr. Hoffman has reached his first IBM Invention Plateau, and holds three U.S. patents, with others pending.