# Design for testability and diagnosis in a VLSI CMOS System/370 processor

by Cordt W. Starke

This paper describes the design for testability and diagnosis in an IBM System/370 processor based on VLSI CMOS technology. The design incorporates built-in pseudorandom-pattern self-test and the boundary-scan technique. This technique permits the migration of tests generated for the component level to higher-level packages such as printed circuit boards and the system. Consequently, the expense of testing higher-level packages can be reduced, and the test equipment for the processor can be simplified. The design also offers economical diagnostic capability.

## 1. Introduction

This paper focuses on design for testability and diagnosis in a System/370 processor based on VLSI CMOS technology. In the processor, CMOS ASICs (application-specific integrated circuits) populated with up to one million transistors and multi-chip modules with up to 2.7 million transistors are used [1]. With the growing complexity of VLSI chips and the high logic-to-pin ratios, the cost and effort of testing increase dramatically. This situation does not improve in the testing of higher-level packages, e.g., printed circuit boards (PCBs) with many interacting VLSI chips. Traditional test methodologies, such as in-circuit testing in conjunction with functional testing for boards, are no longer economical because of unacceptable tester costs and manual effort. At the system level, the effort spent for extensive diagnostic packages based on exercising the normal functions of the system in an attempt to isolate hardware failures occurring, e.g., in the field, must be diminished.

© Copyright 1990 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the *Journal* reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to *republish* any other portion of this paper must be obtained from the Editor.

A promising solution to this problem is built-in self-test in conjunction with the boundary-scan technique, as implemented in our System/370 system. Detailed chip and system design rules guarantee the application of self-test at module, card, and system levels. Our self-test scheme utilizes on-chip test-pattern generation and test-answer evaluation (signature analysis). A logic design with a pseudorandom pattern generator and a signature register unique to each chip provides economical diagnostic capability down to the chip level, because the failing chip can be identified directly by the signature. Therefore, the diagnostic effort to identify components to be replaced is minimal. In addition, the benefits of self-test in terms of reduced test data volume, reduced test time, and reduced costs for tester hardware are demonstrated. Self-test may run without any tester and is therefore applicable in a system environment.

Figure 1 LSSD double-latch design.

The boundary-scan (BSC) technique allows the reuse of most existing test data generated for the component level at the board or the system level. A test applied to a PCB can be constructed from the tests generated for single components and an additional test for the PCB wiring; a system test (or at least a part of it) can be constructed from the tests of different PCBs.

The body of this paper is concerned with the design for test and diagnosis in a System/370 CMOS computer system. Test-relevant design features are introduced which render a testing and diagnostic strategy consistent from chip to system level. Although testing and diagnosis at chip level is a difficult task, the test problem in higher-level packages is even more difficult. Therefore, in this paper much emphasis is placed on testing and diagnosis of packaged chips such as multi-chip modules and printed circuit boards. The first part of the paper describes the design for testability at the chip level; the second part focuses on testing and diagnosis of higher-level packages such as PCBs. Test-pattern generation and diagnostic application aspects are then considered. Finally, some results with respect to the test and diagnostic quality and costs for the methodology incorporated in the System/370 processor are discussed.

## 2. Chip-level design for testing

Each chip of the System/370 processor chip set incorporates built-in self-test capability. This is provided by the implementation of a test-pattern generator and a test-answer evaluator on each chip in addition to the system logic. A linear feedback shift register (LFSR) configured as a pseudorandom-pattern generator (PRPG) stimulates the chip internal system logic with flat random patterns. A second LFSR configured as a multiple-input signature register (MISR) performs the on-chip test-answer evaluation. In this MISR, the test answers are compressed into a signature. At test completion time, the signature is compared to a "known-good" one derived from simulation. The comparison may be done by unloading the signature register for comparison outside the CUT (chip under test) or by comparing the signature to one hard-wired within the CUT itself.

All of the processor chips are designed strictly according to LSSD (level-sensitive scan design) rules [2]. Since this methodology is well known, its principles are discussed only briefly here. Further details can be found in [2].

Figure 1 depicts a typical LSSD double-latch design, in which all latches are designed as shift-register latches (SRLs). Besides the normal system data-in path, each latch L1 has an additional shift-in data path. All SRLs are configured to a shift-register scan path, which can be loaded from primary inputs or unloaded to primary outputs. To apply a test pattern, the scan path is loaded by applying a definite number of shift clocks ACL and BCL. This step is followed by pulsing the system clock SCL. In this time frame, the circuit behaves in the normal mode of system operation. After the system clock has been applied, the test answers which reside in the SRLs are shifted out for test-answer evaluation. While the test answers are being shifted out, a new test pattern is loaded. The LSSD technique simplifies testing of general sequential logic so that testing can be performed in two independent processes: testing of the combinational logic and testing of the shift-register latches.
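The load, capture, and unload sequence described above can be illustrated with a small behavioral model. This is a minimal sketch, not the processor's implementation: the scan path is modeled as a Python list, the function names `scan_shift`, `scan_cycle`, and `run_scan_test` are hypothetical, and the combinational logic is passed in as an arbitrary function from the SRL contents to the values captured on the system clock.

```python
from typing import Callable, List

Bits = List[int]


def scan_shift(chain: Bits, scan_in: Bits) -> Bits:
    """Shift len(scan_in) bits into the chain and return the bits shifted out.

    Each loop iteration models one pair of shift clocks (ACL, BCL).
    """
    shifted_out = []
    for bit in scan_in:
        shifted_out.append(chain[-1])   # the last SRL drives scan-out
        chain[:] = [bit] + chain[:-1]   # every SRL takes its predecessor's value
    return shifted_out


def scan_cycle(chain: Bits, next_pattern: Bits) -> Bits:
    """Unload the chain while loading next_pattern; return old contents in chain order."""
    out = scan_shift(chain, list(reversed(next_pattern)))
    return list(reversed(out))


def run_scan_test(logic: Callable[[Bits], Bits],
                  patterns: List[Bits],
                  chain_length: int) -> List[Bits]:
    """Apply each pattern, pulse the system clock once, and collect the responses."""
    chain = [0] * chain_length
    scan_cycle(chain, patterns[0])                  # initial load
    responses = []
    for i in range(len(patterns)):
        chain[:] = logic(chain)                     # system clock SCL: SRLs capture
        following = patterns[i + 1] if i + 1 < len(patterns) else [0] * chain_length
        responses.append(scan_cycle(chain, following))  # unload overlaps next load
    return responses
```

Note that the unload of one response and the load of the next pattern share the same shift clocks, which is the overlap the paragraph above describes.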

During self-test, pseudorandom patterns generated by the PRPG within the CUT are shifted down the scan path. Once the scan path is loaded, a system clock is applied to each SRL, and the scan path is unloaded. Instead of being shifted to primary outputs, the test answers are compressed into a signature by the on-chip MISR. Simultaneously with the unloading of the scan path, fresh random data are loaded into the SRLs.
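The self-test flow in the preceding paragraph can be sketched as a behavioral model in which a PRPG fills parallel scan channels while a MISR compresses the bits leaving them. This is only an illustration under stated assumptions: the feedback taps are illustrative choices (the paper does not list its polynomials), each PRPG stage feeds a channel directly rather than through an XOR network, and `self_test` and its parameters are hypothetical names.

```python
from typing import Callable, List, Tuple


def lfsr_step(state: int, taps: Tuple[int, ...], width: int) -> int:
    """One shift of a left-shifting LFSR; feedback is the XOR of the tapped bits."""
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & ((1 << width) - 1)


# Register widths follow the paper (31-bit PRPG, 25-bit MISR).
# The tap positions are ASSUMED for illustration only.
PRPG_WIDTH, PRPG_TAPS = 31, (30, 27)
MISR_WIDTH, MISR_TAPS = 25, (24, 21)


def self_test(logic: Callable[[List[int]], List[int]],
              n_channels: int, channel_len: int, n_tests: int,
              seed: int = 1) -> int:
    """Run n_tests pseudorandom tests and return the MISR signature."""
    prpg, misr = seed, 0
    channels = [[0] * channel_len for _ in range(n_channels)]

    def shift_pass() -> None:
        nonlocal prpg, misr
        for _ in range(channel_len):
            response_word = 0
            for i, ch in enumerate(channels):
                response_word |= ch[-1] << i           # bit leaving channel i
                ch[:] = [(prpg >> i) & 1] + ch[:-1]    # fresh pseudorandom bit enters
            # MISR: shift with feedback, then XOR in the parallel response word
            misr = lfsr_step(misr, MISR_TAPS, MISR_WIDTH) ^ response_word
            prpg = lfsr_step(prpg, PRPG_TAPS, PRPG_WIDTH)

    for _ in range(n_tests):
        shift_pass()                                   # load pattern, unload previous answers
        flat = [b for ch in channels for b in ch]
        captured = logic(flat)                         # system clock applied to each SRL
        for i, ch in enumerate(channels):
            ch[:] = captured[i * channel_len:(i + 1) * channel_len]
    shift_pass()                                       # unload the final test answers
    return misr
```

Running the model twice on the same logic function yields the same signature, which is what allows a single "known-good" value from simulation to serve as the reference.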

The basic structure of a VLSI chip acting in the self-test mode is shown in Figure 2. The CUT (the figure shows only its scan path consisting of different SRL chains marked by CH or BSC) is stimulated by its internal PRPG and monitored by its internal MISR. Neither the PRPG nor the MISR is part of the system logic. This basic structure is known as the STUMPS approach [3]. In contrast to the original scheme, self-test is implemented at the chip level in our design instead of at the multi-chip module level. This allows a more precise and easy diagnosis of failing chips, because each of these chips is directly (without any additional calculations) identifiable by its individual failing signature.

Attention must be drawn to the testing of random-access memories (RAMs) embedded in combinational logic because, for reasons of economy, these memories provide no shift-register capability. To facilitate testing of such RAMs, a one-to-one correspondence between array inputs and chip primary inputs and/or SRLs and between array outputs and chip primary outputs and/or SRLs can be established. With this one-to-one correspondence, test patterns can be applied to the arrays by placing them on the corresponding inputs, and test results of the array can be monitored at the corresponding outputs [4]. Faults which may exist in the system logic driving the array or in the logic driven by the array, and which are not detectable during the array test via the one-to-one correspondence, can be covered during another part of the test, when the correspondence is not activated. The effectiveness of pseudorandom test patterns for testing of RAMs is shown in [5].

In Figure 2 the test mode is selected by the signal TM (Test Mode) = 1. TIR represents a test-instruction register supplying instructions for controlling the CUT during testing. BSC refers to the boundary scan implemented at chip level to provide testing of higher-level packages. Design details are discussed in the following sections.

### Macro approach

The PRPG, MISR, and TIR are designed as generic macros, available for use in each chip. The interface between the system logic and the test logic must be well defined and common to all chips. To keep the design effort for these test aids at a minimum, logical macros for general use are absolutely required. Another advantage is that macros can be controlled and released by one department, thus guaranteeing that all logic designers apply the same implementation.

Figure 5 I/O macro with a boundary-scan latch.

Table 1 Boundary-scan control signals and test modes.

| BSC | SELCIO | Test mode               |
|-----|--------|-------------------------|
| 1   | 0      | Chip internal test mode |
| 1   | 1      | Chip external test mode |
| 0   | 1      | Sample mode             |

Figures 3 and 4 show linear feedback shift registers (LFSRs) configured as a PRPG and a MISR with 31 bits and 25 bits, respectively. For both LFSRs one of two primitive polynomials (specification of the feedback function) may be selected. In addition, the feedback signal is invertible. Considerable literature exists on the properties of LFSRs [5, 6]. The use of primitive polynomials, and the choice between two of them, for the PRPG is motivated by the quality of the pseudorandom patterns. For the same reason, the PRPG incorporates an XOR network at its output to the scan chains $P_1 \cdots P_w$ (called scan channels in self-test). The network outputs provide shifted versions of the pattern sequence generated by the LFSR. The use of this network avoids a possible test degradation caused by the structural dependency upon correlated bits in the array of SRLs formed by the scan channels, which can occur when the outputs of the PRPG stages feed the channels directly [5]. In signature testing, primitive polynomials also show better results than nonprimitive ones with respect to aliasing (that is, the signature is correct although the CUT is faulty) [7]. The selected mode is controlled by the signals IPG, SPG for the PRPG and IPM, SPM for the MISR provided by the TIR. The signal ST-CONFIG, which is also generated by the TIR, forces both circuits to act in the self-test mode. According to their usage, 1 to 15 parallel scan channels can be connected. While the CUT is not in self-test mode, all shift-register latches of the LFSRs are part of the chip scan path.
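To make the role of a primitive feedback polynomial concrete, the sketch below measures the cycle length of a small LFSR with selectable taps and optional inversion of the feedback signal. It is an illustration only: a 4-bit register is used so the period can be counted exhaustively, the tap sets are textbook examples rather than the polynomials used in the processor, and `lfsr_step` and `lfsr_period` are hypothetical names.

```python
from typing import Tuple


def lfsr_step(state: int, taps: Tuple[int, ...], width: int,
              invert: bool = False) -> int:
    """One shift of a left-shifting LFSR with optionally inverted feedback."""
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    if invert:
        fb ^= 1
    return ((state << 1) | fb) & ((1 << width) - 1)


def lfsr_period(width: int, taps: Tuple[int, ...], seed: int = 1,
                invert: bool = False) -> int:
    """Count the shifts needed for the register to return to its seed."""
    state, steps = seed, 0
    while True:
        state = lfsr_step(state, taps, width, invert)
        steps += 1
        if state == seed:
            return steps
        if steps > (1 << width):
            raise ValueError("seed does not lie on a cycle")


if __name__ == "__main__":
    # Taps (3, 2) correspond to x^4 + x^3 + 1, which is primitive.
    print(lfsr_period(4, (3, 2)))
    # Taps (3, 1) correspond to x^4 + x^2 + 1, which is not primitive.
    print(lfsr_period(4, (3, 1)))
    # With inverted feedback, the all-zero state is a valid seed.
    print(lfsr_period(4, (3, 2), seed=0, invert=True))
```

A primitive polynomial of degree n gives the maximal period of 2^n − 1 states, so a 31-bit PRPG with a primitive polynomial cycles through 2^31 − 1 states before repeating. Inverting the feedback moves the single lock-up state from all zeros to all ones without shortening the cycle.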

The third macro represents the test instruction register (TIR) needed for test control.

### Boundary-scan design

Boundary-scan latches are implemented to simplify testing and diagnosis of VLSI chips in higher-level packages. They allow the partitioning of complex logic structures such as printed circuit boards into smaller testable islands. Each chip includes boundary-scan latches logically adjacent to all signal I/Os, except for clocks and test-control signals. The latched I/Os are also designed as macros. Figure 5 depicts an I/O macro with a tri-state driver and receiver capability (CIO) and a boundary-scan SRL. A driver-only or receiver-only application can easily be derived from this one. Two signals, called BSC (boundary scan) and SELCIO (select CIO), control the logic of the BSC macro. The DI (driver-inhibit) signal forces the driver to the high-impedance state.

To keep the impact on system performance at a minimum, the boundary-scan SRL has been implemented outside the system path, and is switched into the system path only during testing. The multiplexors themselves are path-gate circuits which add less than 0.5 ns to the signal propagation time.

**Table 1** shows the test modes supported for different values of the two boundary-scan control signals.

- In the chip internal test mode, the BSC-SRLs are used to stimulate the chip internal system logic with test data and to capture test responses.
- In the chip external test mode, the BSC-SRL supports testing of circuitry external to the chip under test. Typically, the chip interconnections will be tested in higher-level packages (multi-chip modules, boards, system).
- In the sample mode, the boundary-scan design allows sampling of signals received by or sent from the CUT during the normal mode of system operations. This mode represents an excellent feature for an improved system-level maintenance and diagnostic system.

In self-test, the chip internal test mode is selected. All boundary SRLs are part of the scan channels. MUX 3 is switched to stimulate the system logic under test by BSC-SRLs. MUX 1 connects the data path from the system logic to BSC-SRLs. The driver is in the high-impedance state. With this I/O control, the fault-free signature derived from simulation is independent of logical values at the CIO pads. Therefore, any signature calculated for the chip level is also valid for higher-level packages. This allows the migration of tests to system level.
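The mapping from the two boundary-scan control signals to a test mode (Table 1) can be written as a small lookup. This is a minimal sketch with hypothetical names (`BoundaryScanMode`, `decode_mode`). The combination BSC = 0, SELCIO = 0 does not appear in Table 1, so the sketch rejects it rather than guessing its meaning.

```python
from enum import Enum


class BoundaryScanMode(Enum):
    CHIP_INTERNAL_TEST = "chip internal test mode"
    CHIP_EXTERNAL_TEST = "chip external test mode"
    SAMPLE = "sample mode"


# Keys are (BSC, SELCIO) pairs exactly as listed in Table 1.
_TABLE_1 = {
    (1, 0): BoundaryScanMode.CHIP_INTERNAL_TEST,
    (1, 1): BoundaryScanMode.CHIP_EXTERNAL_TEST,
    (0, 1): BoundaryScanMode.SAMPLE,
}


def decode_mode(bsc: int, selcio: int) -> BoundaryScanMode:
    """Return the test mode for a (BSC, SELCIO) pair listed in Table 1."""
    try:
        return _TABLE_1[(bsc, selcio)]
    except KeyError:
        raise ValueError(
            f"BSC={bsc}, SELCIO={selcio} is not listed in Table 1"
        ) from None


if __name__ == "__main__":
    # Self-test uses the chip internal test mode.
    print(decode_mode(1, 0).value)
```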

### Test-mode control

Various test modes such as chip internal testing and chip external testing are selected by the test-instruction register (TIR). The TIR is given access to the chip control by the signal TM = 1. In this mode the content of the register is frozen. The TIR

- Forces all SRL scan chains to connect to the PRPG and the MISR.
- Controls the selection of different PRPG and MISR polynomials.
- Switches all boundary-scan SRLs into the logic data path.
- Controls off-chip driver nets.

In the normal mode of system operation, the TIR instructions are inactive. The TIR is loaded serially before self-test begins. This is accomplished while the chip is acting in the normal mode of operation, as described in the following section.

Because VLSI chips are pin-limited, test access must be permitted through a very low number of signal pins. In our design, the following signals must be controlled for the application of all chip internal and external chip-to-chip interconnect tests: scan-in, scan-out, test-mode signal TM, and the LSSD clocks.

### Scan-chain configuration during normal mode of operation

During the normal mode of system operation (TM = 0), neither the self-test circuitry nor the TIR instructions affect system operations. All SRLs, including the PRPG, MISR, and TIR SRLs, are configured as one LSSD shift-register chain, as shown in Figure 6. The PRPG, MISR, and TIR are initialized by loading this SRL chain with predefined logic values. Embedded RAMs are initialized by applying several random patterns generated by the PRPG while the chip is acting in the self-test mode. The initialization can be verified by logic simulation. After the initialization of all memory cells is completed, the MISR (and with it all other SRLs) must be reinitialized because of unpredictable values which have been shifted into the MISR during the initialization phase. After the test, the signature may be read out for comparison in this mode.

## 3. Design for testing and diagnosis in higher-level packages

### Test-access bus

Access to VLSI components in higher-level packages for testing is provided by a test-access control bus to which each chip is connected. **Figure 7** illustrates the test-access scheme. All chips are connected in a star configuration.









LSSD clock signals and one scan-out signal are unique to each chip. Individual scan-clock signals for each chip allow the initialization of the TIR, the PRPG, and the MISR with different seeds as well as the signature checking separately for each chip. On the other hand, the clocks of different chips may also be activated simultaneously as required to run self-test for all chips in parallel.

A test-access control scheme has recently been defined by the Joint Test Access Group (JTAG) with participation by several companies [8]. Currently, activity is underway to establish this as an IEEE standard. Since the present design was completed before the JTAG proposal was developed, the test-access method described here differs in some details. In contrast to JTAG's test access port (TAP), which also allows scan access to one chip at a time (in a star configuration), test access is controlled via one test-control signal TM, n scan-out and n clock signals, instead of n test-control signals (TMS), one scan-out (TDO), and one clock signal. The present design, with individual clock lines for each chip, shows benefits with respect to a small clock skew [9]. Of course, additional features such as accessing of only one portion of the SRLs as supported by TAP are not available in this approach. However, only small design changes would be needed to follow the JTAG proposal for future designs.

Figure 8 The clock chip as central self-test controller.

### Central self-test controller

Self-test at system level, or where no test equipment is available or can be attached (e.g., field testing), may be driven with no external test control. In this case, one chip of the system controls the self-test (see Figure 8). The clock chip as a central chip is generally well suited for this mission. It generates all clocks and may provide all control signals during self-test. The test runs at system clock speed.

Two binary counters are needed to control the clock generation during self-test. The clock sources and sequences of clocks are assumed to be the same for each chip. The first counter CLC controls the scan cycles to load the longest scan channel. A second counter TCC counts the number of test cycles to be applied during self-test. Before self-test starts, each counter is initialized with the highest number of cycles to be applied. Whenever counter CLC reaches the zero state, counter TCC decrements by one. Then a system clock cycle is applied and the CLC counter is reloaded with its initial value. If "all zero" is reached in counter TCC, self-test is completed.
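The interplay of the two counters can be sketched as a generator that emits the sequence of clock events. This follows the description above literally and is only a behavioral illustration: the function name and the event labels `"SHIFT"` and `"SYSTEM"` are hypothetical, and any additional scan pass needed to unload the last test answers into the MISR is not modeled because the text does not describe it.

```python
from typing import Iterator


def self_test_clock_sequence(scan_cycles: int, test_cycles: int) -> Iterator[str]:
    """Yield the clock events produced under control of counters CLC and TCC.

    scan_cycles: initial value of CLC (shift cycles for the longest scan channel)
    test_cycles: initial value of TCC (number of test cycles to apply)
    """
    clc = scan_cycles
    tcc = test_cycles
    while tcc > 0:
        while clc > 0:
            yield "SHIFT"       # one scan (shift) clock cycle
            clc -= 1
        tcc -= 1                # CLC reached zero: TCC decrements by one
        yield "SYSTEM"          # then one system clock cycle is applied
        clc = scan_cycles       # and CLC is reloaded with its initial value
    # TCC reached "all zero": self-test is completed


if __name__ == "__main__":
    events = list(self_test_clock_sequence(scan_cycles=3, test_cycles=2))
    print(events)
    print(events.count("SHIFT"), events.count("SYSTEM"))
```

Under this reading, a run with CLC initialized to s and TCC initialized to t issues s × t shift cycles and t system clock cycles.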

All initialization data for self-test are generated by the clock chip itself or are loaded via a test and maintenance interface (TMI) by means of a service processor (SP). The clock chip as a central self-test controller is set up with all the clock-sequence and cycle-number information necessary for self-test. When all storage elements in the system are at a known value, the test signal TM will be set to "1"; self-test for all chips will then be executed. At test completion time, the signature is compared to the "known-good" one also received via the TMI or hardwired within the CUT itself. Test data for the chip-interconnect test are transferred between the chips under test and the SP, also via the TMI. In case of an incorrect response for any test, the SP switches into a fault-recovery mode; otherwise the main system control program receives control of the computer system.

The clock chip itself also has self-test capability, with the limitation that it must be tested functionally if no test equipment is attached (only in the system environment).

## 4. Testing and diagnosis

### Migration of chip-level tests

The design for test and diagnosis of the System/370 processor described in the previous part of the paper allows testing of the chip internal logic separately from testing of the chip interconnect logic. The chip internal logic is tested by self-test running for each individual chip of the processor. Since the chip internal logic is isolated from other chips during testing, any tests generated at the chip level can be applied in higher-level packages such as multi-chip modules, printed circuit boards, or even the system itself. The signature, which indicates whether the CUT passed or failed self-test, need be calculated only once, and can be migrated from the chip level to higher-level packages.

The chip internal self-test is followed by a chip external interconnect test. This test must be generated separately for each package. It can be generated relatively easily with deterministic test patterns, since the chip interconnect network is not very complex. Random-pattern testing is also feasible, but some special problems with intermediate states (not allowed in self-test) of bidirectional drivers/receivers and buses must be considered more carefully. In any case, to avoid orthogonal states on external tri-state buses, only one chip at a time should be allowed to drive (not tri-stated); all others should be in the receiving mode. After each chip has acted as a driver for some tests, all inter-chip wiring is tested. In the case of deterministic testing, the test patterns are applied using the boundary-scan chain, which can be accessed via the test bus. Test answers are also transferred via the boundary-scan chain.
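The rule that only one chip drives at a time can be sketched as a simple test loop over the inter-chip nets. This is a simplified illustration, not the paper's pattern set: each net is assumed to have a single driving chip (bidirectional nets are not modeled), the patterns are just the values 0 and 1, and `observe` is a hypothetical stand-in for applying a value through the driver's boundary-scan chain and reading the receiver's boundary-scan latch.

```python
from typing import Callable, Dict, List, Tuple

# net name -> (driving chip, receiving chips); a simplifying assumption
Nets = Dict[str, Tuple[str, List[str]]]
Failure = Tuple[str, str, int]  # (net, receiving chip, value driven)


def interconnect_test(nets: Nets,
                      observe: Callable[[str, str, int], int]) -> List[Failure]:
    """Drive every net to 0 and to 1, one driving chip at a time."""
    failures: List[Failure] = []
    drivers = sorted({driver for driver, _ in nets.values()})
    for active in drivers:                  # only this chip is allowed to drive
        for value in (0, 1):
            for name, (driver, receivers) in nets.items():
                if driver != active:
                    continue                # all other chips stay in receiving mode
                for rx in receivers:
                    if observe(name, rx, value) != value:
                        failures.append((name, rx, value))
    return failures


if __name__ == "__main__":
    nets = {"n1": ("A", ["B"]), "n2": ("B", ["A", "C"])}

    def stuck_at_0_on_n2(net: str, rx: str, value: int) -> int:
        # Hypothetical fault model: net n2 always reads 0 at every receiver.
        return 0 if net == "n2" else value

    print(interconnect_test(nets, stuck_at_0_on_n2))
```

Each reported failure names the net and the receiving chip, which is the information needed to locate a wiring fault in the package.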

### Diagnosis

Self-test as implemented in the System/370 processor system points directly to the failing chip via the chip individual signature. This provides excellent diagnostic capability down to the replaceable unit in the processor system. As stated before, the signature needs only to be compared to a "known-good" signature obtained from simulation and stored within the CUT or on a storage medium. The diagnostic technique can even be extended for remote failure analysis in the field. Extensive diagnostic packages based on exercising the normal function of the system, which attempt to isolate hardware failures occurring, e.g., in the field, are no longer required.
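Because each chip carries its own signature register, isolating the failing chip reduces to a per-chip comparison against the stored "known-good" values. A minimal sketch follows; the chip names and signature values are invented purely for illustration, and `failing_chips` is a hypothetical name.

```python
from typing import Dict, List


def failing_chips(observed: Dict[str, int], known_good: Dict[str, int]) -> List[str]:
    """Return the chips whose signature differs from the known-good one.

    A chip with no observed signature is also reported as failing.
    """
    return sorted(
        chip for chip, good in known_good.items()
        if observed.get(chip) != good
    )


if __name__ == "__main__":
    # Hypothetical chip names and 25-bit signature values.
    known_good = {"chip_a": 0x1A2B3C4, "chip_b": 0x0FF00FF, "chip_c": 0x1234567}
    observed = {"chip_a": 0x1A2B3C4, "chip_b": 0x0FF10FF, "chip_c": 0x1234567}
    print(failing_chips(observed, known_good))
```

No functional diagnostic program is involved: the list of mismatching chips is directly the list of candidate units to replace.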

Diagnosis at the net level, which is sometimes required in wafer testing during the chip bring-up or early manufacturing phase, requires intermediate signatures to be recorded in order to limit simulation time for diagnostics. A method for this is given in [5, 10]. In addition, testing and fault diagnosis at the wafer level based on deterministic test patterns are also manageable and available. This kind of diagnosis is also applied for chip-interconnect fault testing with deterministic test patterns.

## 5. Results and discussion

The test methodology based on built-in self-test and the boundary-scan test shows the following results for testing and diagnosis of the System/370 CMOS VLSI processor system:

1. Component tests can be migrated to the board and system levels. Consequently, the total test generation and simulation time for the System/370 computer system is reduced. The fault coverage for all components can be guaranteed by simulation for all packaging levels.
2. With built-in self-test, the volume of test data to be applied to each individual component is very low. Therefore, the test equipment can be less expensive due to the reduction of test memory needed compared to that required by conventional testing with test patterns supplied by a tester.
3. The quantity of test equipment for testing and diagnosis at different packaging levels is reduced. For example, testing of assembled printed circuit boards in manufacturing can be accomplished in a system environment without any tester support.
4. On-chip built-in self-test provides excellent diagnostic capability down to the replaceable unit. This feature saves a lot of the manpower and computer power which would be needed to establish a separate diagnosis tool. As another benefit, the system run time for fault isolation is reduced drastically. With self-test, the accuracy of the failure isolation down to the chip level is many times better than that obtainable by an isolation tool based on functional system tests.
5. This System/370 design permits all tests to run at system clock speed, so that logic-delay faults are covered as well as dc stuck-at faults. The total test time is small (e.g., less than one minute for a complex printed circuit board) because self-test is running for all chips in parallel.
6. The hardware overhead (in circuits) needed to obtain a self-testable chip design is less than 1.5% in addition to that for normal LSSD design. The overhead for the boundary scan is about 1%. This overhead is acceptable because most of the chips are I/O and not area-limited. The performance loss in the processor due to the additional test features is negligibly small, as most of the test logic is not part of the functional system path.
7. Although most of the additional circuitry for testing is provided as macros to the logic designers, an extra design effort of two to three weeks per chip must be spent to implement all design for testability and diagnosis features.
8. In some chips we found random-pattern-resistant stuck-at faults (faults which cannot be detected within a reasonable set of test patterns). Most of these testability problems have already been solved by improving the controllability or the observability for that untested logic. As the chip designs are not yet completed, this work is still proceeding. In cases where a logic change demanded by testability will affect system performance in an unacceptable manner, the pseudorandom pattern test will be extended by a few deterministic test patterns.

The test methodology based on self-test and boundary scan reduces the expense of testing and diagnosis of complex digital systems with steadily increasing complexity, such as the System/370 VLSI CMOS processor system.

## 6. Concluding remarks

The design for testability and diagnosis in a System/370 processor computer system based on VLSI CMOS technology has been described. The design incorporates built-in self-test and the boundary-scan technique. As major benefits, the design allows the migration of tests generated for the component level to the board and system level, and provides an excellent diagnostic capability down to the replaceable unit. The test equipment for the processor can be simplified.

Altogether, this results in a reduction of the total test and diagnosis effort for that system. This test and diagnosis strategy opens a new dimension in the testing of highly integrated VLSI circuits.

## References

1. H. Schettler, J. Hajdu, K. Getzlaff, W. D. Loehlein, and C. W. Starke, "A Mainframe Processor in CMOS Technology with 0.5 μm Channel Length," presented at the International Solid State Circuits Conference, San Francisco, February 14–16, 1990.
2. E. B. Eichelberger and T. W. Williams, "A Logic Design Structure for LSI," *Proceedings of the 14th Design Automation Conference*, June 1977, pp. 462–468.
3. P. H. Bardell and W. H. McAnney, "Self-Testing of Multi-Chip Modules," *Proceedings of the 1982 IEEE International Test Conference*, 1982, pp. 200–204.
4. E. B. Eichelberger, T. W. Williams, E. J. Muehldorf, and R. G. Walther, "A Logic Structure for Testing Internal Arrays," *Proceedings of the USA-Japan Computer Conference*, October 1978, pp. 266–272.
5. P. H. Bardell, W. H. McAnney, and J. Savir, *Built-In Test for VLSI: Pseudorandom Techniques*, John Wiley & Sons, Inc., New York, 1987.
6. S. W. Golomb, *Shift Register Sequences*, Holden Day Publishing Co., San Francisco, 1967.
7. T. W. Williams, W. Daehn, M. Gruetzner, and C. W. Starke, "Bounds and Analysis of Aliasing in Linear Feedback Shift Registers," *IEEE Trans. Computer-Aided Design* 7, 83 (January 1988).
8. "IEEE Standard Test Access Port and Boundary Scan Architecture," IEEE Standard 1149.1-1989/D4, IEEE Standards Board, 345 E. 47th St., New York, NY 10017, May 5, 1989.
9. K. D. Wagner, "Clock System Design," *IEEE Design & Test of Computers* 5, 9–27 (October 1988).
10. E. B. Eichelberger and E. Lindbloom, "Random-Pattern Coverage Enhancement and Diagnosis for LSSD Logic Self-Test," *IBM J. Res. Develop.* 27, 265–272 (May 1983).

Received May 29, 1989; accepted for publication August 14, 1989

Cordt W. Starke IBM Data Systems Division, Schoenaicher Strasse 220, D-7030 Boeblingen, Federal Republic of Germany. Dr. Starke received the Dipl.-Ing. and the Ph.D. degrees in electrical engineering from the University of Hannover, FRG, in 1980 and 1985, respectively. Since 1985 he has been with the IBM Data Systems Division Laboratory in Boeblingen, working on new test methodologies and test-pattern generation. Dr. Starke is involved in the design of VLSI CMOS chips with emphasis on design for testability, self-test, random-pattern testing, test generation, fault simulation, and fault diagnosis.