VMEbus Interrupter Module Specification

Epicure Design Note 123.2

1 February 1995

Paul A. Kasley

I. Overview

The VMEbus Interrupter Module provides hardware interrupt support to the DAE front-end configurations that use a VAX4000/90 or DEC3000/400 AXP with the DEC DWTVX TURBOchannel to VMEbus bridge. In the QVIP front-ends, Qbus and VMEbus interrupters are provided on their respective plug-in boards. In contrast, the TURBOchannel bridge has no means whereby the VAX may generate an interrupt to the VME crate and, while the bridge handles the VMEbus IRQ lines, the Force 386 processors are incapable of generating an interrupt on the VMEbus.

The VMEbus Interrupter Module (VBIM) provides hardware that implements the QVIP interrupt functionality in a VME-resident module.

II. TVI Test Stand Results

After the initial VBIM design was completed, sections of the design were "wire-wrapped", debugged in a VME crate with a single board computer (MVME130) and then moved into the TURBOchannel front-end for integration with the Epicure TVI services (Epicure Design Note 126).

a. Interrupts

With DAS running on the VAX4000/90, the interrupt system locked up randomly. A complete power cycle and reboot was needed to clear the lock-up. With assistance from DEC, the lock-up was traced to the use of edge-triggered interrupts in the Model 90. The solution requires that the TVI services manipulate a VAX-internal register to re-enable pending interrupts. This fix is documented in Appendix A. The specification for the 4000/90 TURBOchannel adapter (the "CDAL" board) is incorporated into this document as Appendix B.

b. DMA

DMA performance of the 4000/90 DMA facilities was evaluated by using a Force 386 to copy data from the VME common memory directly to the VAX. Using the 386 MOVSD string copy instruction, a read-common-memory/write-VAX-DMA cycle completes in 1240ns to 1440ns. Of this time, the VAX DMA write takes 660ns, the 386 CM read takes 280-440ns, and 360ns is consumed by dead time between bus cycles.

The timing breaks down as follows:

VAX WRITE
  VAX DMA write delay (DS to DTACK)          560ns
  386 AS to DS delay                          20ns
  386 DTACK to AS inactive delay              80ns
  386 AS inactive duration                   160ns
CM READ
  RAM access (DS to DTACK), cache hit        180ns
  RAM access (DS to DTACK), cache miss       340ns
  Average access                             260ns
  386 AS to DS delay                          20ns
  386 DTACK to AS inactive delay              80ns
  386 AS inactive duration                   160ns
AVERAGE LONGWORD DMA TRANSFER TIME          1340ns

Cache hits on the common memory board alternate with cache misses for the string move to yield an average RAM access time of 260 ns.
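The average figure can be cross-checked by summing the components of each bus cycle. The following is a minimal sketch of that arithmetic only; the variable and function names are illustrative and do not come from the design.

```python
# Per-cycle 386 bus overheads in ns, taken from the breakdown above.
AS_TO_DS = 20
DTACK_TO_AS_INACTIVE = 80
AS_INACTIVE = 160


def cycle_ns(access_ns):
    """Time for one 386 bus cycle given its DS-to-DTACK access time."""
    return AS_TO_DS + access_ns + DTACK_TO_AS_INACTIVE + AS_INACTIVE


VAX_WRITE_ACCESS = 560                # VAX DMA write delay (DS to DTACK)
CM_HIT, CM_MISS = 180, 340            # common memory RAM access times
cm_avg_access = (CM_HIT + CM_MISS) // 2   # hits alternate with misses: 260

vax_write_ns = cycle_ns(VAX_WRITE_ACCESS)   # 820
cm_read_avg_ns = cycle_ns(cm_avg_access)    # 520
total_ns = vax_write_ns + cm_read_avg_ns    # 1340

print(f"average longword DMA transfer: {total_ns} ns")
```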

The question arises as to what gain to expect with a separate DMA controller versus having the Force card write the return packet directly to the VAX.

A direct 386-to-VAX DMA operates by:

1. 386 assembles a return packet directly in common memory

2. 386 copies the return packet from common memory to the VAX using the string move instruction

3. 386 interrupts VAX to signal packet on return Q

4. VAX does copy of packet from its free page pool to local memory and a de-Q to sink the CM allocation

The VME activity is primarily the string move consisting of a CM read followed by a DMA write with some fixed amount of dead time.

A DMAC-mediated DMA operates by:

1. 386 assembles a return packet directly in common memory

2. 386 sets up the VBIM DMAC and issues a "go" to the controller

3. The DMAC reads the packet from CM into its private buffer

4. The DMAC writes from its buffer to the VAX

5. The DMAC interrupts the VAX

6. VAX de-Qs the packet and does its internal copy

The VME activity is primarily a block read from CM followed by a block write to the VAX with some amount of dead time. The Clearpoint specified RAM access for block moves is 265ns. Thus, there is no gain in the average CM read access time with the DMAC.

The VAX write time is limited by the TURBOchannel interface and is the same for either case.

There are two timing parameters that a separate DMAC will impact: the AS inactive duration of 160ns between cycles and the 80ns DTACK to AS inactive delay. For direct DMA these delays amount to 480ns per longword transfer. The VME spec requires 80ns of AS-high time between cycles. The maximum possible reduction is 400ns per longword transferred. The theoretical minimum time per transfer on the 4000/90 is 940ns, a 29% improvement over that currently seen.
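The dead-time argument above reduces to a few lines of arithmetic. This is a sketch using only the figures quoted in the text; the names are illustrative.

```python
AS_INACTIVE = 160             # ns, AS inactive duration between cycles
DTACK_TO_AS_INACTIVE = 80     # ns
CYCLES_PER_LONGWORD = 2       # one CM read plus one VAX write
VME_MIN_AS_HIGH = 80          # ns, minimum AS-high time quoted in the text
DIRECT_AVG = 1340             # ns, average direct-DMA longword transfer

# Dead time incurred by direct DMA per longword transferred.
dead_ns = CYCLES_PER_LONGWORD * (AS_INACTIVE + DTACK_TO_AS_INACTIVE)  # 480

# Largest reduction a DMAC could achieve, as computed in the text.
max_reduction_ns = dead_ns - VME_MIN_AS_HIGH                          # 400

floor_ns = DIRECT_AVG - max_reduction_ns                              # 940
improvement = max_reduction_ns / DIRECT_AVG   # just under 0.30; text quotes 29%

print(f"theoretical minimum: {floor_ns} ns")
```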

The only way to realize a gain from the DMAC is by decreasing the 400ns "dead-time" incurred with direct DMA. The percent improvement resulting from a fixed amount of dead-time reduction increases as the CM and VAX access times decrease. Note also that the DMAC has extra overhead that must be amortized over the actual transfers. This overhead includes set-up of the DMAC by the 386, coordination of events to ensure synchronization, DMAC manipulation of control blocks and interrupts, "nice" time in which the DMAC suspends its activity and re-arbitrates the bus periodically to allow other masters onto the bus, and "set-up" and "tear-down" of bus transfers.

As presently designed, the VBIM DMAC is clocked at 25 MHz. From the initial DMAC sequencer program:

VAX WRITE
  VAX DMA write delay (DS to DTACK)          560ns
  DMAC DTACK to DS inactive delay             40ns
  DMAC DS inactive duration                   80ns
CM READ
  RAM access                                 280ns
  DMAC DTACK to DS inactive delay             40ns
  DMAC DS inactive duration                   80ns
LONGWORD DMA TRANSFER TIME                  1080ns

The DMAC may provide a 260ns, or 19%, raw transfer time improvement over direct DMA. For a 256 longword packet, the copy through the DMAC is 66.5 usec faster than direct DMA. However, the savings may be quickly consumed by the setup and management burden. For the VAX4000/90 front end, a separate DMAC offers little (or possibly no) benefit. Thus, the decision was made to drop the DMAC from the design and to proceed with an interrupter-only board to support the TURBOchannel interface.
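The comparison can be sketched numerically as follows. The timing constants are those quoted above; the per-packet overhead parameter is hypothetical, since no figure for the DMAC setup and management burden is given here.

```python
DIRECT_NS = 1340        # average direct-DMA longword transfer time


def dmac_cycle_ns(access_ns):
    """One DMAC bus cycle: access + DTACK-to-DS-inactive + DS inactive."""
    return access_ns + 40 + 80


dmac_total_ns = dmac_cycle_ns(560) + dmac_cycle_ns(280)   # 1080
saving_ns = DIRECT_NS - dmac_total_ns                     # 260
saving_fraction = saving_ns / DIRECT_NS                   # about 0.19


def packet_saving_us(longwords):
    """Raw transfer time saved by the DMAC for one packet, in usec."""
    return longwords * saving_ns / 1000


def net_saving_us(longwords, overhead_us):
    """Saving after subtracting an ASSUMED per-packet DMAC overhead."""
    return packet_saving_us(longwords) - overhead_us


# 256 longwords: 66.56 usec raw saving (the text quotes 66.5 usec).
print(packet_saving_us(256))
```

Under this model the DMAC only pays off when its per-packet overhead stays below the raw saving, which for a 256 longword packet is under 67 usec.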

c. VAX/VME transaction VMEbus cycle times

Programmed I/O (PIO) VME read by the TCI     4.8 usec
PIO write to VME by the TCI                  1.6 usec
DMA read of TCI by VME                       4.8 usec
DMA write of TCI by VME                      0.7 usec

Programmed I/O operations were also tested on a VAX4000/60. The performance of the model 60 was comparable to that of the model 90.

For production front-ends using PIO only, the VAX4000/60 provides essentially the same performance as the model 90 at a significant cost savings.

d. Other oddities

Other effects were noted during TVI testing:

1. When using the VAX BBSSI instruction to perform a VME read-modify-write, it is possible for the TVI process to be preempted between the read and the write. This can leave VME masters locked out for up to several hundred usec while the VAX deals with its interrupt.

2. Back-to-back single DMA writes and DMA block transfer writes will completely tie up the internal buses of the 4000/90. This causes all processes, including the VMS clock and scheduler, to hang until a break in DMA activity occurs.

III. Interrupter

The interrupter consists of four separate FIFO interrupt queues. While all queues are general purpose, it is anticipated that one will be used by the TURBOchannel to signal to Timer, one will be used by Timer to signal to CAMAC, and a third will be used by CAMAC to signal the TURBOchannel, leaving one spare. Each queue is 2048 entries deep by 32 bits wide (23 bits used). Entries are loaded into a queue by a VMEbus write operation.

Each entry consists of three parts: a 3 bit interrupt level specifier, an 8 bit interrupt vector, and 12 trigger bits. The interrupt level specifier determines which VME IRQ line will be driven. The vector is applied to the data bus during the interrupt acknowledge cycle that occurs in response to the interrupt. When loaded as a "1", each trigger bit pulses a corresponding output signal low for a predetermined duration simultaneously with the interrupt being driven onto the IRQ line. The trigger outputs are validated by a separate low-going strobe signal. When the interrupt specifier is loaded as '000', no interrupt is generated and a strobe and triggers are presented for the entry. The strobe is 250ns minimum duration. Triggers are set up 100ns before and held 100ns past the strobe.
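As an illustration of how a 23-bit queue entry might be assembled before it is written to a queue, consider the sketch below. The bit positions are an assumption made for the example only (triggers in bits 0-11, vector in bits 12-19, level in bits 20-22); this note does not state the actual field placement, and the function names are hypothetical.

```python
# ASSUMED field placement; the real bit positions are not given in this note.
TRIGGER_SHIFT, TRIGGER_BITS = 0, 12
VECTOR_SHIFT, VECTOR_BITS = 12, 8
LEVEL_SHIFT, LEVEL_BITS = 20, 3


def pack_entry(level, vector, triggers):
    """Pack a level specifier, vector and trigger mask into one longword."""
    if not 0 <= level < (1 << LEVEL_BITS):
        raise ValueError("level must fit in 3 bits")
    if not 0 <= vector < (1 << VECTOR_BITS):
        raise ValueError("vector must fit in 8 bits")
    if not 0 <= triggers < (1 << TRIGGER_BITS):
        raise ValueError("triggers must fit in 12 bits")
    return (level << LEVEL_SHIFT) | (vector << VECTOR_SHIFT) | (triggers << TRIGGER_SHIFT)


def generates_interrupt(level):
    """A level specifier of '000' yields strobe and triggers but no interrupt."""
    return level != 0


# Example: IRQ3, vector $40, trigger outputs 0 and 5 pulsed.
entry = pack_entry(3, 0x40, (1 << 0) | (1 << 5))
```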

The status of the interrupt FIFOs is monitored and an interrupt is set when the 1025th entry is written into a FIFO. The vector and interrupt level for the four 'FULL' interrupts are loaded by VMEbus write operations to local registers. The queue full interrupt bypasses the queuing logic and is posted immediately by the affected interrupter. No strobe or triggers are associated with these interrupts. A single write-only CSR is provided for each queue to flush the FIFO and reset the logic. The full interrupt may be disabled by setting the interrupt level in the QFULL_IVR register to zero.
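The queue full behavior described above can be captured in a small software model, which may be useful when writing test code for drivers. This is a behavioral sketch only, not a description of the hardware implementation; the class and method names are hypothetical, and raising an exception on overflow is a modeling choice.

```python
class QueueModel:
    """Behavioral model of one interrupt queue's 'FULL' logic."""

    DEPTH = 2048            # entries per queue
    FULL_THRESHOLD = 1025   # the entry whose write sets the full interrupt

    def __init__(self, qfull_level=0):
        self.entries = []
        self.qfull_level = qfull_level   # level 0 disables the full interrupt

    def write(self, entry):
        """Load one entry; return True if this write posts the full interrupt."""
        if len(self.entries) >= self.DEPTH:
            # Modeling choice: flag the overflow rather than drop silently.
            raise OverflowError("queue overflow")
        self.entries.append(entry)
        return len(self.entries) == self.FULL_THRESHOLD and self.qfull_level != 0

    def flush(self):
        """Models a write to the queue's flush CSR."""
        self.entries.clear()


q = QueueModel(qfull_level=3)
posted = [q.write(0) for _ in range(1025)]   # only the final write posts
```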

Up to eight interrupts may be waiting simultaneously for servicing. The VME IACKIN*/IACKOUT* daisy chain is used to establish a hardwired priority ordering within the interrupter. The ordering is Full Queue 0,.., Full Queue 3, Queue 0,..,Queue 3.
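The hardwired ordering can be expressed as a simple lookup. The sketch below assumes, for simplicity, that all pending sources are requesting on the IRQ level being acknowledged; the names are illustrative only.

```python
# Priority order from highest to lowest, as stated above.
PRIORITY_ORDER = [("full", q) for q in range(4)] + [("queue", q) for q in range(4)]


def next_to_service(pending):
    """Return the highest-priority pending source, or None if none pending.

    `pending` is a set of (kind, queue_number) tuples, where kind is
    "full" for a queue full interrupt or "queue" for a queued entry.
    """
    for source in PRIORITY_ORDER:
        if source in pending:
            return source
    return None


# A full interrupt on queue 3 is serviced before a queued entry on queue 0.
winner = next_to_service({("queue", 0), ("full", 3)})
```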

The interrupt queues are completely flexible. Any entry can cause an interrupt on any IRQ and return an arbitrary byte-wide vector to the interrupt handler. Any VME master (TURBOchannel, Timer, CAMAC, Listener) can write to any queue. Interrupts occur in the sequence in which they are loaded into the queue. Interrupts will not be lost as long as the queues do not overflow and as long as a positive response occurs for every interrupt. Queue entries that activate trigger pins only do not elicit an interrupt acknowledge cycle from the VME master and therefore the possibility of losing the interrupt exists.

IV. Other Resources

a. Diagnostic register

A 32 bit diagnostic register provides 24 undedicated LED outputs and two error indications. Bit 31 of the diagnostic register latches occurrences of bus error. Bit 30 indicates that one of the queues has slipped out of sync. This occurs when hardware detects that the three FIFO chips that comprise a queue are not in the same state.
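A host-side decode of the diagnostic register might look like the sketch below. Bits 31 and 30 are as described above; placing the 24 LED outputs in bits 0-23 is an assumption made for the example, and the names are hypothetical.

```python
BUS_ERROR = 1 << 31          # latched bus error indication
QUEUE_SYNC_ERROR = 1 << 30   # a queue's FIFO chips are out of sync
LED_MASK = 0x00FFFFFF        # ASSUMED position of the 24 LED outputs


def decode_diag(value):
    """Split a 32-bit diagnostic register value into its fields."""
    return {
        "bus_error": bool(value & BUS_ERROR),
        "queue_sync_error": bool(value & QUEUE_SYNC_ERROR),
        "leds": value & LED_MASK,
    }


status = decode_diag(0x80000005)
```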

b. Board ID ROM

A 32 byte fuse programmable bipolar ROM is provided to hold a VME-readable board identification. The programming process programs ones in a field of zeros. The signature PROM organization given below allows obsolete revision data to be overwritten and new information added without having to scrap the device:

Address    Description                       Value (LS byte)
BA+0080    ASCII "Q"                         $51
  +0084    ASCII "I"                         $49
  +0088    ASCII "Q"                         $51
  +008C    Board SN                          $01-$FE
  +0090    overwritten byte                  $FF
  . . .
  . . .
  . .                                        $FF
  +n       PCB Revision Letter               ASCII "A" through "Z"
  +(n+4)   Assembly Major Revision Number    $01-$FE
  +(n+8)   Assembly Minor Revision Number    $01-$FE
  +(n+12)  unprogrammed byte                 $00
  . . .
  . . .
  +00FC    .                                 $00
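Software reading the ID ROM has to skip any overwritten ($FF) bytes to find the current revision data. The following sketch shows one way to do that, assuming the caller has already read the 32 longwords from BA+$0080 through BA+$00FC and kept the least significant byte of each. The function name and return structure are hypothetical.

```python
def parse_id_rom(rom):
    """Parse the 32 LS bytes of the ID ROM into a dictionary."""
    if len(rom) != 32:
        raise ValueError("expected 32 bytes")
    if bytes(rom[:3]) != b"QIQ":
        raise ValueError("bad signature")
    serial = rom[3]

    # Skip obsolete revision data that has been overwritten with $FF.
    i = 4
    while i < len(rom) and rom[i] == 0xFF:
        i += 1
    if i + 3 > len(rom):
        raise ValueError("no revision data found")

    letter = chr(rom[i])
    if not "A" <= letter <= "Z":
        raise ValueError("revision letter not programmed")
    return {
        "serial": serial,
        "pcb_revision": letter,
        "major": rom[i + 1],
        "minor": rom[i + 2],
    }


# Example image: serial 7, three obsolete bytes, then revision B 2.1.
example = [0x51, 0x49, 0x51, 0x07] + [0xFF] * 3 + [ord("B"), 2, 1] + [0x00] * 22
info = parse_id_rom(example)
```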

c. Scratch RAM

An undedicated longword scratch SRAM area is provided.

V. General Characteristics

a. Endians

The VBIM is structured to conform to the big endian VME bus. This is done to avoid adding another level of confusion to the already confusing situation regarding what is swapped when and where. All VME accesses are aligned longword ONLY. (Note that this is not in strict compliance with the VME standard, which recommends that longword devices also support D16 and D08(EO).) The restriction eliminates confusion about byte enables and addresses on write operations and conserves board area. Non-longword or unaligned VME accesses will cause a bus error.

b. Electrical

VME DTB Slave:

Address modifiers 09 extd usr data, 0A extd usr pgm, 0D extd supv data, 0E extd supv pgm

A32/D32 aligned only. Bus error generated on non-longword and unaligned access

VME Interrupter:

D08(O)

Drives IRQ1-IRQ7

Release on acknowledge (ROAK)
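The DTB slave restrictions listed above can be summarized as a small classification routine, useful for example in a bus simulator or driver test. The names are illustrative, and treating an access with an unlisted address modifier as simply not answered by the board is an assumption; only the bus error on non-longword and unaligned accesses is stated here.

```python
# Address modifier codes the slave responds to (extended, A32).
ACCEPTED_AM = {0x09, 0x0A, 0x0D, 0x0E}


def classify_access(am, offset, size_bytes):
    """Classify an access to the board as 'ok', 'bus_error' or 'ignored'."""
    if am not in ACCEPTED_AM:
        return "ignored"        # ASSUMED: board does not respond at all
    if size_bytes != 4 or offset % 4 != 0:
        return "bus_error"      # non-longword or unaligned access
    return "ok"


result = classify_access(0x09, 0x40, 4)   # aligned longword, user data
```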

c. Mechanical

Strobe and trigger outputs: Pinned out to P2 connector, rows A and C.

Front panel LED indicators:

(24) Diagnostic

(4) Queue Full

(4) Queue empty

(4) Queue read

(4) Queue write

(1) Board reset

(4) Device selects

(1) IACKOUT*

VI. Memory Map

The VBIM occupies thirty-two 4K pages in A32 space:

(BA+$00000 - BA+$0003F) Interrupt queue registers

(BA+$00040) Diagnostic register

(BA+$00080 - BA+$000FF) ID ROM

(BA+$00400 - BA+$1FFFF) Scratch RAM (32 bits wide)

Note: '$' prefix denotes hexadecimal value

Note: To conserve PCB real estate, the VME base address is hard-coded into the board address decoder.
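The memory map can be encoded as a lookup table for use in diagnostics. In this sketch the diagnostic register is taken to occupy a single longword ($00040 - $00043), which is inferred from its description rather than stated in the map; offsets not listed in the map are reported as unmapped, and their hardware behavior is not covered here. The names are hypothetical.

```python
BOARD_SIZE = 32 * 0x1000    # thirty-two 4K pages = $20000 bytes

# (first offset, last offset, region name), relative to the base address BA.
REGIONS = [
    (0x00000, 0x0003F, "interrupt queue registers"),
    (0x00040, 0x00043, "diagnostic register"),   # single longword, inferred
    (0x00080, 0x000FF, "ID ROM"),
    (0x00400, 0x1FFFF, "scratch RAM"),
]


def region_for(offset):
    """Return the region name for a byte offset from BA, or None if unmapped."""
    if not 0 <= offset < BOARD_SIZE:
        raise ValueError("offset outside the board's address window")
    for first, last, name in REGIONS:
        if first <= offset <= last:
            return name
    return None


where = region_for(0x0008C)   # board serial number location in the ID ROM
```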

VII. Revision History

123.0 Original release

123.1 Revised specification. Removed DMA controller, Arcnet port, VSB port, and syscon. Changed number of interrupt queues from two to four. Modified diagnostic port.

123.2 Added ID ROM data table, section IV.b. Revised Diagnostic Register bit definitions, section IV.a. and figure.

 
