# DTMROC-S: Deep submicron version of the readout chip for the TRT detector in ATLAS

F. Anghinolfi, Ph. Farthouat, P. Lichard

CERN, Geneva 23, Switzerland

V. Ryjov

JINR, Moscow, Russia and University of Lund, Lund, Sweden

R. Szczygiel

CERN, Geneva 23, Switzerland and INP, Cracow, Poland

N. Dressnandt, P.T. Keener, F.M. Newcomer, R. Van Berg, H.H. Williams

University of Pennsylvania, Philadelphia, USA

T. Akesson, P. Eerola

University of Lund, Lund, Sweden

### Abstract

A new version of the circuit for the readout of the ATLAS straw tube detector, TRT [1], has been developed in a deep-submicron process. The DTMROC-S is fabricated in a commercial 0.25 $\mu$m CMOS IBM technology, with a library hardened by layout techniques [2]. Compared to the previous version of the chip [3], implemented in a 0.8 $\mu$m radiation-hard CMOS process, and despite the features added to improve the robustness and testability of the circuit, the deep-submicron technology results in a much smaller chip size, which increases the production yield and lowers the power consumption.

# I. INTRODUCTION

One DTMROC-S and two amplifier-shaper-discriminator front-end chips [4] (ASDBLR) service 16 straw tubes of the 425,000 total in the ATLAS TRT. The DTMROC-S processes ternary-encoded signals from the ASDBLR; provides threshold voltages to the ASDBLR discriminators, which identify the transition radiation and tracking signals; and communicates with the back-end electronics via 40 MHz LVDS-compatible serial interfaces.

The IEEE JTAG [5] concept was implemented throughout the chip to ensure full testability of the circuit. Some "hardening by design architecture" techniques were included to detect Single Event Upsets and reduce their consequences in the highly radioactive environment of the ATLAS TRT.

# II. CHIP ARCHITECTURE

The DTMROC-S chip (shown in Figure 1) receives 16 channels of low-level differential ternary (3-level) current-encoded signals from two 8-channel ASDBLR chips. The ternary signals are a composite of two binary discriminator output pulses: one unit of current indicating the presence of a straw-track (LOW), and a two-unit pulse indicating a transition radiation (HIGH) signal.

The transition pulses are sampled at 25ns intervals, while the tracking pulses are sampled at 3.125ns intervals. Timing is derived from the ATLAS 40 MHz system clock and an onboard Delay-Locked Loop (DLL) that generates eight additional clock phases spaced by 3.125ns.
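The sampling grid described above follows directly from the clock arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and the pulse is idealised as a rectangular window inside a single time slice.

```python
# Timing sketch; names are hypothetical, values are taken from the text.
SYSTEM_CLOCK_MHZ = 40.0   # ATLAS system clock
N_PHASES = 8              # DLL clock phases per clock period


def clock_period_ns(freq_mhz=SYSTEM_CLOCK_MHZ):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def phase_offsets_ns(freq_mhz=SYSTEM_CLOCK_MHZ, n_phases=N_PHASES):
    """Sampling instants of the DLL phases within one clock period."""
    step = clock_period_ns(freq_mhz) / n_phases
    return [i * step for i in range(n_phases)]


def sample_tracking_pulse(start_ns, width_ns):
    """Return one bit per phase: 1 if the pulse is high at that instant.

    Assumes an ideal pulse occupying [start_ns, start_ns + width_ns)
    within a single 25 ns time slice.
    """
    end_ns = start_ns + width_ns
    return [1 if start_ns <= t < end_ns else 0 for t in phase_offsets_ns()]


if __name__ == "__main__":
    print(phase_offsets_ns())
    print(sample_tracking_pulse(5.0, 4.0))
```

With a 40 MHz clock the period is 25 ns, so eight phases give the 3.125 ns spacing quoted in the text.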

For each 25 ns interval (time slice), 9 bits of information per straw are latched forming a 144-bit word for the 16 channels on the chip. This data is stored in a Pipeline for up to 256 clock cycles (the latency is programmable). If a Level 1 Accept (L1A) trigger command is received by the DTMROC-S, three adjacent time slices at the desired latency are copied into a Derandomizer and then read out of the chip as a serial 40 MHz data stream.
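As a rough illustration of how 16 channels of 9 bits form a 144-bit word, the following sketch packs and unpacks one time slice. The bit ordering (transition-radiation bit first, channel 0 most significant) and all names are assumptions made for the example; the source does not specify the on-chip layout.

```python
N_CHANNELS = 16
BITS_PER_CHANNEL = 9      # 8 tracking samples + 1 transition-radiation bit
CHANNEL_MASK = (1 << BITS_PER_CHANNEL) - 1


def pack_channel(high_bit, low_bits):
    """Pack one straw into 9 bits: TR bit in the MSB (assumed ordering)."""
    if len(low_bits) != 8:
        raise ValueError("expected 8 tracking samples")
    value = high_bit & 1
    for bit in low_bits:              # earliest sample ends up most significant
        value = (value << 1) | (bit & 1)
    return value


def pack_time_slice(channel_values):
    """Concatenate 16 nine-bit channel values into one 144-bit integer."""
    if len(channel_values) != N_CHANNELS:
        raise ValueError("expected 16 channel values")
    word = 0
    for value in channel_values:      # channel 0 is most significant (assumed)
        word = (word << BITS_PER_CHANNEL) | (value & CHANNEL_MASK)
    return word


def unpack_time_slice(word):
    """Split a 144-bit word back into 16 nine-bit channel values."""
    return [
        (word >> (BITS_PER_CHANNEL * (N_CHANNELS - 1 - ch))) & CHANNEL_MASK
        for ch in range(N_CHANNELS)
    ]
```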

In addition to the sampled timing data, some event header information is generated and added in at various data path stages.

The DTMROC-S also provides two Test Pulses with programmable delay and amplitude for the ASDBLR chip, plus a "wire or" of the input channels to allow the possibility of a fast local trigger.

There are two 6-bit and six 8-bit linear DACs on the chip. The 6-bit DACs control two Test Pulse amplitudes. The 8-bit DACs are used for the remote programming of the ASDBLR LOW/HIGH threshold signals and some internal voltage and temperature sensing.

## A. Tri-level Input translator

The Ternary Receiver circuit is capable of detecting 4 ns wide, tri-level differential current pulses with $200\,\mu A$ steps. This permits high-density communication between the ASDBLR and the DTMROC-S chips without driving up pin counts and without causing self-oscillations via stray capacitive couplings back to amplifier inputs.
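A minimal behavioural sketch of the ternary decoding follows. It assumes decision thresholds placed midway between the $200\,\mu A$ levels and assumes that the two-unit level implies that the LOW discriminator has also fired; neither detail is stated in the text, and the function names are hypothetical.

```python
STEP_UA = 200.0  # current step between ternary levels (from the text)


def ternary_level(current_ua, step_ua=STEP_UA):
    """Quantise a differential current to level 0, 1 or 2.

    Thresholds at 0.5 and 1.5 steps are an assumption for illustration.
    """
    if current_ua < 0.5 * step_ua:
        return 0
    if current_ua < 1.5 * step_ua:
        return 1
    return 2


def decode_ternary(current_ua):
    """Return (low, high) discriminator bits for one input current.

    Level 1 is the tracking (LOW) signal, level 2 the transition
    radiation (HIGH) signal; HIGH is assumed to imply LOW.
    """
    level = ternary_level(current_ua)
    return (level >= 1, level == 2)
```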

## B. DLL and Drift Time digitizer

The DLL implementation uses a classical structure. It employs a 32-element delay chain, a phase detector and a charge pump. Eight equally spaced clock outputs (with jitter less than 500 ps) are used to sample the Low Threshold signals. Additionally, a 50% duty cycle clock is generated and can be selected to run the chip core logic, since some parts of the design use both clock edges and are particularly sensitive to the supplied clock quality. A "watchdog" circuit supervises the internally generated system clock and switches to the external clock source in case of DLL failure.

Figure 1: DTMROC-S block diagram

Two status bits are provided by the DLL block: a DLL lock flag and a "dynamic" flag that monitors the duty cycle of the external clock.

## C. Pipeline and Derandomizer

Several different configurations of the main data flow architecture were considered before the final design was chosen. The decision was based largely on the availability of a radiation tolerant RAM macro-cell [6], the required chip area and the simplicity of the overall controller mechanism.

The basic memory block used to make the Pipeline and the Derandomizer is shown in Figure 2. A configurable, synchronous dual-port 128×9-bit SRAM macro-cell was used for this implementation. The whole block is built of 17 columns (9 bits each) and 128 rows, and has a total storage capacity of 2.35 kB.



Figure 2: Basic memory block

The Pipeline is made of two parallel basic blocks with a total storage capacity of 256×153-bit words. During data acquisition, this memory is operated as a continuously running, simple circular buffer.

When an L1A command is recognized, the relevant Pipeline read address is generated and the data are copied to the Derandomizer.

The effective storage time is therefore $6.3\,\mu s$. Each data set contains 149 bits (a 1-bit DLL "dynamic" error flag, 4 bits of BC ID and 16 channels $\times$ 9 bits per channel).

The Derandomizer is an additional buffer acting as a FIFO. It is built from the same synchronous dual-port static RAM as the Pipeline but has half the number of banks, which gives a storage capacity of $128 \times 153$-bit words, enough to store 42 events. In the case of a memory overflow, the control logic raises a "full" flag and skips complete events (which avoids synchronisation problems) until the memory frees up 3 locations to store a subsequent event.
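The buffer figures quoted above can be cross-checked with a few lines of arithmetic, and the overflow rule can be modelled as a simple occupancy counter. This is a behavioural sketch under stated assumptions, not the chip's control logic: the class and method names are hypothetical, and readout is modelled as freeing one complete event at a time.

```python
PIPELINE_DEPTH = 256        # Pipeline locations (from the text)
DERANDOMIZER_DEPTH = 128    # Derandomizer locations (from the text)
CLOCK_PERIOD_NS = 25
SLICES_PER_EVENT = 3        # adjacent time slices copied per L1A

WORD_WIDTH_BITS = 17 * 9                 # 17 columns of 9 bits
DATA_SET_BITS = 1 + 4 + 16 * 9           # error flag + BC ID + channel data
MAX_LATENCY_US = PIPELINE_DEPTH * CLOCK_PERIOD_NS / 1000.0
MAX_EVENTS = DERANDOMIZER_DEPTH // SLICES_PER_EVENT


class DerandomizerModel:
    """Occupancy-only model of the event FIFO with the 'skip when full' rule."""

    def __init__(self, depth=DERANDOMIZER_DEPTH, per_event=SLICES_PER_EVENT):
        self.depth = depth
        self.per_event = per_event
        self.used = 0
        self.skipped = 0

    @property
    def full(self):
        # "Full" means fewer than 3 free locations remain.
        return self.depth - self.used < self.per_event

    def on_l1a(self):
        """Store one event if it fits; otherwise skip the complete event."""
        if self.full:
            self.skipped += 1
            return False
        self.used += self.per_event
        return True

    def on_event_read_out(self):
        """Free the locations of one event after it has been read out."""
        if self.used >= self.per_event:
            self.used -= self.per_event
```

The word width evaluates to 153 bits, the data set to 149 bits and the event capacity to 42, in agreement with the text. The 256-location depth gives an upper bound of 6.4 µs at 25 ns per location, close to the 6.3 µs effective figure quoted above.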

A parallel write access to all memory banks might cause a large power consumption fluctuation with possible serious consequences for the stability of the analog part of the DTMROC-S. To avoid this problem, every odd RAM bank has been connected to a *True* address bus and every even bank is driven by an *Inverted* address bus.

Because the trigger rate is relatively low (100 kHz), a clock gating technique is implemented to meet the power-efficiency requirements.

Both memory components are equipped with a Built-In Self-Test (BIST) controlled via the Configuration and/or JTAG register.

## D. Command Decoder

The Command Decoder is the main control block that decodes the command stream and issues all the necessary timing signals, internal registers read/write accesses and data processing.

The communication protocol lacks advanced data protection, but was chosen so that a single transmission error will not cause an erroneous command to be accepted.

The implemented decoding algorithm is very simple. It is built using a command shift-in register, a look-up table with valid command codes and coincidence logic. This architecture, in our case, is favourable over the resource consuming Finite State Machine (FSM) solution, and improves the SEU robustness because of the minimized number of vulnerable register cells.
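The decoding scheme described above (shift-in register, look-up table of valid codes, coincidence logic) can be sketched as follows. The 4-bit command width, the command codes and their names are invented for illustration and are not the DTMROC-S command set; clearing the register after a match is likewise an assumption.

```python
from itertools import combinations

# Hypothetical command table: 4-bit codes chosen so that any two valid
# codes differ in at least two bit positions.
COMMAND_WIDTH = 4
VALID_COMMANDS = {
    0b1100: "L1A",
    0b1010: "RESET",
    0b1111: "TEST_PULSE",
}


def min_hamming_distance(codes):
    """Smallest number of differing bits between any two codes."""
    return min(bin(a ^ b).count("1") for a, b in combinations(codes, 2))


def decode_stream(bits):
    """Shift bits in one at a time and report every valid command seen."""
    mask = (1 << COMMAND_WIDTH) - 1
    shift_register = 0
    decoded = []
    for bit in bits:
        shift_register = ((shift_register << 1) | (bit & 1)) & mask
        # Coincidence logic: compare the register with the look-up table.
        name = VALID_COMMANDS.get(shift_register)
        if name is not None:
            decoded.append(name)
            shift_register = 0   # assumed: register is cleared after a match
    return decoded


if __name__ == "__main__":
    print(decode_stream([0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0]))
```

Because the illustrative codes are at least two bit flips apart, a single corrupted bit cannot turn one valid command into another, which mirrors the protection property stated in the text. The only state held is the shift register itself, in line with the argument for a small number of SEU-vulnerable cells.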

## E. DAC

Each of the eight DACs (Figure 3) consists of 256 identical PMOS slave current mirrors. The reference for the slave mirrors is provided by a current mirror master consisting of 128 PMOS unit devices (L = 8 µm, W = 5 µm). The current mirror master is sandwiched between two DACs. Dummy devices are located around the periphery of each DAC, and the topology uses a common-centroid structure to minimize the effects of process gradients on the matching of current units.



Figure 3: Threshold DAC block diagram. "R" and "2R" are matched resistors. "n" is between 0 and 255

Each threshold DAC creates an 8-bit reference voltage with a source impedance of R = 5 kΩ. An 8-bit switch array steers ratioed currents from the PMOS slave current mirror devices to provide an 8-bit current output into the 5 kΩ resistor. The master mirror device current feeds the output branch of an opamp driver, such that the output voltage of the opamp across an internal resistor (10 kΩ) matches the internal bandgap's 1.26 V reference. The output voltage is therefore a ratioed fraction of the bandgap reference voltage, independent of the Io current value and of process variation.
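The ratiometric behaviour can be illustrated with a small model. It assumes that each switched slave device carries the same current as one of the 128 master unit devices, so that code n delivers n/128 of the master current into the output resistor; this scaling is inferred from the device counts above rather than stated in the text, and the function name is hypothetical.

```python
V_BANDGAP = 1.26       # bandgap reference, volts (from the text)
R_REFERENCE = 10e3     # internal "2R" resistor, ohms (from the text)
R_OUTPUT = 5e3         # output "R" resistor, ohms (from the text)
MASTER_UNITS = 128     # unit devices in the master mirror (from the text)


def dac_output_voltage(code, r_reference=R_REFERENCE, r_output=R_OUTPUT):
    """Ideal threshold DAC output for an 8-bit code (assumed unit scaling)."""
    if not 0 <= code <= 255:
        raise ValueError("code must be in 0..255")
    i_master = V_BANDGAP / r_reference      # Io set by the opamp loop
    i_unit = i_master / MASTER_UNITS        # assumed current per slave device
    return code * i_unit * r_output


if __name__ == "__main__":
    for code in (0, 1, 128, 255):
        print(code, dac_output_voltage(code))
```

Under these assumptions the output reduces to `code / 256 * V_BANDGAP`. Only the resistor ratio enters the result, so scaling both resistors by the same factor leaves the output unchanged, which is the process independence claimed above.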

## F. Read-out

The DTMROC-S chip has two LVDS-compatible drivers that send data out over differential 40 Mbit/s copper links. The Data Out driver sends event data to the readout system over a dedicated line. The Command Out driver connects to a bussed line shared by as many as 15 DTMROC-S chips and has two modes of operation. In data mode it operates as a tristated data driver for reading back the contents of internal registers when chip-specific commands are directed to a particular DTMROC-S chip.

A special wire "OR" mode was added to the Command Output LVDS driver to provide a prompt trigger, fast-out option, useful for initial checkout of the detector mounted electronics. In this mode, selected ternary inputs are put in logical "OR" and contribute to a prompt chip-level trigger. To enable multiple DTMROC-S chips to contribute to the prompt trigger without baseline uncertainty, the LVDS drive is modified to generate differential output current only in the presence of a valid ternary input. In quiescent mode, no output current is generated. To support this feature, the backend electronics must be able to accommodate both output modes of this driver.

# III. ROBUSTNESS AND TESTABILITY

The DTMROC-S is intended for use in a highly radioactive environment and is therefore exposed to destructive radiation effects. To improve the circuit's robustness against SEU, vital registers and sections of the DTMROC-S are implemented using one of the following special schemes. All internal registers are equipped with SEU-detecting parity check logic. The most critical parts (Fast Command Decoder, Configuration and Threshold registers, event length counters, etc.) are built from SEU-resistant, self-recovering elements based on triple logic with majority vote. The schematic implementation of one register unit is shown in Figure 4.



Figure 4: SEU redundant, self-recovering register unit
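The two protection schemes described in this section, parity checking and triple logic with majority vote, can be sketched behaviourally. The class below is a software analogy with hypothetical names, not a description of the circuit in Figure 4; in particular, the explicit `refresh` step stands in for whatever self-recovery mechanism the hardware uses.

```python
def majority(a, b, c):
    """Bitwise majority vote of three words."""
    return (a & b) | (a & c) | (b & c)


def parity(value):
    """Even/odd parity bit of a word: 1 if the number of set bits is odd."""
    return bin(value).count("1") & 1


class TripleRegister:
    """Register stored three times; reads are majority-voted."""

    def __init__(self, value=0, width=8):
        self.mask = (1 << width) - 1
        self.copies = [value & self.mask] * 3
        self.corrected = 0          # count of detected and corrected upsets

    def write(self, value):
        self.copies = [value & self.mask] * 3

    def read(self):
        return majority(*self.copies)

    def upset(self, copy_index, bit):
        """Inject a single bit flip into one copy (test aid)."""
        self.copies[copy_index] ^= (1 << bit) & self.mask

    def refresh(self):
        """Rewrite all copies with the voted value; report if a fix was needed."""
        voted = self.read()
        had_error = any(copy != voted for copy in self.copies)
        self.copies = [voted] * 3
        if had_error:
            self.corrected += 1
        return had_error
```

A single flipped bit in one copy is outvoted on every read and removed on the next refresh, whereas a parity bit only detects that an odd number of bits has changed.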

"Surveillance" counters were implemented to guarantee full Register Transfer Level (RTL) state coverage and to release any access or command execution lasting longer than the required time period.

A general-purpose 32-bit Status Register indicating the DTMROC-S operating conditions, DLL status, SEU error flags and statistics has been incorporated.

An error bit, representing the logical OR of the DTMROC-S error flags, is provided in the header of output data stream.

The JTAG concept was implemented to allow exhaustive production tests at the chip and board level. It includes all mandatory and some optional functionalities, such as memory self-tests and an internal register scan path. Because SEUs could drastically upset the JTAG registers, the JTAG circuitry is completely frozen during normal chip operation.

# IV. DESIGN TOOLS ISSUES

The entire design was modelled in Verilog including most of the analog components. In addition to functional verification, this assured correct connectivity for all subblocks – both digital and analog.

The RTL stage modelling was constrained by the chip operating conditions, the technological parameters and the radiation-induced effects. This produced a preference for long combinatorial paths rather than fast sequential steps.

The constraint-driven logic synthesis was performed using the Synopsys tools [7]. High-performance implementations of the Synopsys DesignWare Library components significantly reduced the effort required to create and verify the design. This allowed transparent, high-level optimisation of performance during the synthesis process.

The physical design flow covered abstract generation of each block (autoAbgen [8]), hierarchical placement (Silicon Ensemble), clock tree generation (ctgen [8]), design flattening (Design Planner), routing (Silicon Ensemble [8]) and parasitic extraction (hyperExtract [8]). The flow was completely scripted using SKILL, Perl, Design Planner and Silicon Ensemble scripts, so last-minute changes were easily introduced. The timing analysis was done using Pearl.



Figure 5: Design iteration cycles to close timing

Some effort was made to predict post-route timing with appropriate margins during RTL synthesis (Figure 5). A number of synthesis-layout cycles were performed to generate design-specific custom wire load models based on extracted parasitics. The worst-case results achieved for the longest paths are presented below.

| Path Endpoint            | Synopsys Path Delay, ns | Layout Path Delay, ns | Variance, ns |
|--------------------------|-------------------------|-----------------------|--------------|
| ShiftRegister_reg_1_/D   | 11.42                   | 12.55                 | -1.13        |
| ShiftRegister_reg_7_/D   | 10.68                   | 11.26                 | -0.58        |
| ShiftRegister_reg_3_/D   | 10.08                   | 10.68                 | -0.60        |
| ShiftRegister_reg_5_/D   | 10.01                   | 10.76                 | -0.75        |
| ShiftRegister_reg_4_/D   | 9.88                    | 10.58                 | -0.70        |
| ShiftRegister_reg_6_/D   | 9.77                    | 10.42                 | -0.65        |
| DeraReadAddress_reg_6_/D | 9.73                    | 9.92                  | -0.19        |
| ShiftRegister_reg_8_/D   | 10.04                   | 10.61                 | -0.57        |
| ShiftRegister_reg_2_/D   | 9.37                    | 10.00                 | -0.63        |
| ShiftRegister_reg_0_/D   | 9.22                    | 10.01                 | -0.79        |
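The variance column is the synthesis estimate minus the post-layout delay, which can be checked directly from the tabulated values. The short script below reproduces that column and picks out the slowest post-layout path; the variable and function names are hypothetical.

```python
# (endpoint, Synopsys delay ns, layout delay ns, listed variance ns)
PATHS = [
    ("ShiftRegister_reg_1_/D", 11.42, 12.55, -1.13),
    ("ShiftRegister_reg_7_/D", 10.68, 11.26, -0.58),
    ("ShiftRegister_reg_3_/D", 10.08, 10.68, -0.60),
    ("ShiftRegister_reg_5_/D", 10.01, 10.76, -0.75),
    ("ShiftRegister_reg_4_/D", 9.88, 10.58, -0.70),
    ("ShiftRegister_reg_6_/D", 9.77, 10.42, -0.65),
    ("DeraReadAddress_reg_6_/D", 9.73, 9.92, -0.19),
    ("ShiftRegister_reg_8_/D", 10.04, 10.61, -0.57),
    ("ShiftRegister_reg_2_/D", 9.37, 10.00, -0.63),
    ("ShiftRegister_reg_0_/D", 9.22, 10.01, -0.79),
]


def variance_ns(synopsys_ns, layout_ns):
    """Synthesis estimate minus post-layout delay, rounded to 10 ps."""
    return round(synopsys_ns - layout_ns, 2)


def slowest_layout_path(paths=PATHS):
    """Endpoint and delay of the longest post-layout path."""
    name, _, layout_ns, _ = max(paths, key=lambda row: row[2])
    return name, layout_ns


if __name__ == "__main__":
    for name, syn, lay, listed in PATHS:
        print(name, variance_ns(syn, lay), listed)
    print(slowest_layout_path())
```

In every row the synthesis estimate is shorter than the extracted layout delay, so the custom wire load models were mildly optimistic for these paths.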

# V. STATUS

The DTMROC-S layout is shown in Figure 6. The die size is  $5.2 \times 5$ mm<sup>2</sup>. The chip has been fabricated in 0.25µm IBM CMOS technology.



Figure 6: DTMROC-S layout

The DTMROC-S was successfully tested on the mixed-signal IMS Tester at CERN. The planned testability features simplified many of the test procedures and gave very good fault coverage. All implemented functionalities and expected performance were confirmed.

The manufacturer of the DTMROC-S chip has fabricated a special wafer with five process corners. The power consumption for all process corners easily met the design estimates: 130 mA at 40 MHz and 2.5 V VDD.

The digital core logic was fully functional with the system clock above 100 MHz. This should allow for the expected degradation caused by the high level of ionising radiation. The SRAM performance has shown the predicted dependence on the process parameters (Figure 7).



Figure 7: DTMROC-S current consumption and SRAM performance

Figure 8 depicts the DTMROC-S time-measuring performance with the nominal 2.5 V power supply and with a 2.0 V power supply. A 4.0 ns wide tracking pulse was injected at 100 ps intervals across three full clock periods, 75 ns in total. The figure shows the leading and falling edges, fit deviations and non-linearity sampled by one of the chip channels.



Figure 8: The linear fit deviation of the time measurements. The red and blue bars represent the leading and falling edges, respectively
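To make the notion of linear fit deviation concrete, the sketch below repeats the scan described above (100 ps steps over 75 ns) for an ideal digitizer with 3.125 ns bins and fits a straight line to the result. It models quantisation only and uses no measured data; all names are hypothetical, and a real channel would add DLL non-linearity and jitter on top of this.

```python
BIN_PS = 3125        # tracking sample spacing, picoseconds
STEP_PS = 100        # injection step used in the scan
SPAN_PS = 75000      # three full 25 ns clock periods


def measured_edge_ps(true_edge_ps):
    """Ideal digitizer: report the start of the bin containing the edge."""
    return (true_edge_ps // BIN_PS) * BIN_PS


def linear_fit(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_deviations():
    """Scan the edge position and return (slope, intercept, deviations in ps)."""
    true_edges = list(range(0, SPAN_PS, STEP_PS))
    measured = [measured_edge_ps(t) for t in true_edges]
    slope, intercept = linear_fit(true_edges, measured)
    deviations = [m - (slope * t + intercept) for t, m in zip(true_edges, measured)]
    return slope, intercept, deviations


if __name__ == "__main__":
    slope, intercept, deviations = fit_deviations()
    print(slope, intercept, max(abs(d) for d in deviations))
```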

The measured test results will be used to help select the final production process corner and the operating conditions. Because of the strong desire to reduce on-detector power, if the IMS measurements are confirmed by Test Beam results, the DTMROC-S nominal operating voltage may be reduced to 2.1-2.2V.

In total, 850 chips were packaged and tested with a demonstrated yield of 79%.

# VI. RADIATION TOLERANCE AND TEST BEAM ANALYSES

The total ionising dose tolerance was studied at the CEA Saclay Pagure facility in July 2002. The tests were performed up to a total dose of 7 Mrad using a Co-60 source, which provides 1.33 MeV gamma radiation. The results showed a $\sim 10\%$ increase in the DAC output voltage after irradiation, without linearity degradation. No variations in the power consumption or the chip performance were detected.

The chip SEU sensitivity has been evaluated at the CERN-PS irradiation facility. The DTMROC-S was exposed to a 24 GeV proton beam with an integrated fluence of about $1.8 \times 10^{14}\ \text{p/cm}^2$.

The test was done using the full TRT back-end readout electronics. The procedure consisted of repeatedly downloading pseudo-random patterns to all internal registers; continuously generating triggers; reading and monitoring the data stream; and cross-checking the actual contents of the configuration registers against the on-chip Status Register SEU indications. The cross-section for a single D flip-flop was calculated from the total number of SEUs detected in the different internal registers. The values obtained range from $0.8 \times 10^{-14}$ to $1.2 \times 10^{-14}$ cm<sup>2</sup>, which is slightly superior to, but still consistent with, the results presented in [9]. The DTMROC-S incorporates a monitor that counts the number of detected and corrected SEUs in the self-recovering elements of the chip. The impact of SEUs in these vital parts of the design was reduced by this design strategy. In total, two incidents with a flip in one of the redundant register units were detected. These events could have been caused by the irregularity of the beam profile and may be irrelevant to the LHC radiation environment.
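The per-flip-flop cross-section follows from the usual relation $\sigma = N_{\text{SEU}} / (\Phi \cdot N_{\text{FF}})$, where $\Phi$ is the integrated fluence. The sketch below applies it with the fluence quoted above; the upset and flip-flop counts are made-up illustrative numbers, since the text does not give them, and the function names are hypothetical.

```python
FLUENCE_P_PER_CM2 = 1.8e14   # integrated proton fluence (from the text)


def cross_section_per_ff(n_upsets, n_flip_flops, fluence=FLUENCE_P_PER_CM2):
    """SEU cross-section per flip-flop in cm^2."""
    if n_flip_flops <= 0 or fluence <= 0:
        raise ValueError("flip-flop count and fluence must be positive")
    return n_upsets / (fluence * n_flip_flops)


def expected_upsets(sigma_cm2, n_flip_flops, fluence=FLUENCE_P_PER_CM2):
    """Inverse relation: expected number of upsets for a given cross-section."""
    return sigma_cm2 * fluence * n_flip_flops


if __name__ == "__main__":
    # Hypothetical example: 360 upsets observed in a 200-bit register set.
    sigma = cross_section_per_ff(n_upsets=360, n_flip_flops=200)
    print(f"{sigma:.2e} cm^2 per flip-flop")
```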

Measurements of track position resolution and hit efficiency using the ASDBLR/DTMROC-S chip set were made at the CERN H8 test beam in August-September 2002. The straw track coordinate accuracy for a typical straw is presented in Figure 9 and is consistent with ATLAS requirements.



Figure 9: Drift time accuracy for a typical straw (left). The track radial position R versus drift-time dependence "V" curve is also plotted (right).

# VII. CONCLUSIONS

A new version of the DTMROC designed in a deep-submicron process has been fabricated and demonstrated to function. Extensive lab tests show a high overall yield. The impact of the process corner on the chip performance was examined. The effectiveness of the radiation-tolerant layout and design architecture techniques is confirmed.

Exhaustive internal test features were beneficial in simplifying and ensuring comprehensive design verification, high fault coverage and throughput.

Further beam test analyses are being pursued for the final verification of chip functionality and performance.

# VIII. REFERENCES

- [1] ATLAS Inner Detector TRD, CERN/LHCC/97-10.
- [2] F. Faccio et al., "Total dose and Single Event Effects (SEE) in a 0.25µm CMOS technology", LEB98, INFN Rome, September 1998, pp.105-113.
- [3] C. Alexander et al., "Progress in the development of the DTMROC time measurement chip", IEEE Trans. Nucl. Sci., vol.48 (2001), pp.514-519.
- [4] N. Dressnandt et al., "Implementation of the ASDBLR straw tube readout ASIC in DMILL technology", IEEE Trans. Nucl. Sci., vol.48 (2001), pp.1239-1243.
- [5] IEEE Standard Test Access Port and Boundary-Scan Architecture, IEEE Std 1149.1-1990.
- [6] K. Kloukinas et al., "A Configurable Radiation Tolerant Dual-Ported Static RAM macro, designed in a 0.25μm CMOS technology for applications in the LHC environment", LEB02, Colmar, France, September 2002.
- [7] Synopsys Inc., 700 East Middlefield Rd., Mountain View, CA 94043.
- [8] Cadence Design Systems Inc., 555 N. Mathilda Ave., Sunnyvale, CA 94086.
- [9] F. Faccio et al., "SEU effects in registers and in a Dual-Ported Static RAM designed in a 0.25μm CMOS technology for applications in the LHC", LEB99, Snowmass, USA, September 1999, pp.571-575.