A High Performance Data Acquisition System for Structural Biology

Mary Westbrook, Tom Coleman, Bob Daly, Joe Haumann, Steve Naday, John Weizeorick
Argonne National Laboratory, ECT-EE

1. Introduction

Research in structural biology conducted at synchrotron sources using large area electronic detectors represents a significant challenge for beamline control and data acquisition systems. A very large quantity of data and rapid data rates must be accommodated in order to achieve data acquisition in near-realtime. These control and data acquisition systems are also interfaced with data analysis applications, which further complicates the task.

An EPICS-based (Experimental Physics and Industrial Control System) data acquisition system developed for the Structural Biology Center Collaborative Access Team (CAT), Advanced Photon Source (APS) Sector 19, will be presented. The SBC-CAT data acquisition system has been designed for speed, using both specialized hardware and the EPICS distributed architecture to conduct data acquisition tasks in parallel. Each 18 MB image can be acquired and written to disk in less than 3 seconds. This data acquisition system has also been interfaced to d*TREK, a single-crystal macromolecular x-ray diffraction analysis package written by Jim Pflugrath, Molecular Structure Corporation. This represents an integrated approach to data acquisition and analysis. Currently, data analysis is the rate-limiting step for the SBC-CAT.

2. APS 1 Detector Data Acquisition System

2.1 Scientific Need for High Performance Data Acquisition

Our group was contracted to design a data acquisition system using the APS 1 Detector (pictured in Figure 1) at the Advanced Photon Source. The APS 1 Detector has a large front face area of 21 x 21 cm. It is composed of a 3x3 matrix of 1024x1024 pixel CCD chips bonded to reducing fiber-optic tapers. Each full-resolution APS 1 Detector image contains a square 3072x3072 pixel array of 16-bit integers and is therefore 18 MB in size (see Figure 2). The APS 1 Detector readout is very fast, producing 18 MB in 1.8 s. Control system developers strive to meet this rate without significant additional overhead.
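The image size and the sustained rate that the control system must match follow directly from these figures. The short check below uses only numbers quoted above; the variable names are ours and are purely illustrative.

```python
# Geometry quoted in the text: 3x3 CCDs of 1024x1024 pixels, 16 bits per pixel.
CCDS_PER_SIDE = 3
PIXELS_PER_CCD_SIDE = 1024
BYTES_PER_PIXEL = 2
READOUT_SECONDS = 1.8

side = CCDS_PER_SIDE * PIXELS_PER_CCD_SIDE      # 3072 pixels per side
image_bytes = side * side * BYTES_PER_PIXEL     # 18,874,368 bytes
image_mib = image_bytes / 2**20                 # 18.0 binary megabytes
image_mb = image_bytes / 1e6                    # about 18.87 decimal megabytes
rate_mb_per_s = image_mb / READOUT_SECONDS      # about 10.5 MB/s sustained

print(f"{side} x {side} pixels, {image_bytes} bytes")
print(f"{image_mib:.2f} MiB = {image_mb:.2f} MB, readout rate {rate_mb_per_s:.1f} MB/s")
```

Any downstream stage (transfer, header formatting, disk write) that cannot keep up with roughly 10 MB/s adds overhead on top of the 1.8 s readout.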

Another factor influencing the need for a high performance data acquisition system is the use of the APS 1 Detector at the APS. Synchrotron experimentation generally results in short exposure times. For example, at the APS beamline 19-ID, the total flux onto a sample at 100 mA and 12 keV when the beam is fully focused (38 microns vertical and 83 microns horizontal) is 10 Xph/s.

Finally, the cost of operating the APS and each sector is high. DOE and other funding agencies have made large investments, so experimentation should be as productive as possible.

Figure 1 APS 1 Detector

Figure 2 APS 1 Detector Image

2.2 Actual Performance

The design presented in this paper has met the technical requirements cited below. A summary of the APS 1 Detector data acquisition system performance is shown below:

2.3 Key Features of Data Acquisition System Design

Each hardware and software component integrated into the APS 1 Detector data acquisition system is consistent with high performance:

Other issues that will be discussed are interfacing of the APS 1 Detector data acquisition system to d*TREK and the use of data compression.

2.3.1 APS 1 Detector VME Interface Architecture

The APS 1 Detector is currently interfaced to 2 VME chassis as shown in Figure 3. One IOC is responsible for the detector setup, monitoring and exposure control, while the other IOC receives the data from the detector during data acquisition.

ECT developers have also implemented custom EPICS device support for the APS 1 detector, see the EPICS MEDM (Motif Editor and Display Manager) Screens shown in Figure 4. Software modules have been developed to set up and monitor individual CCD gains, offsets, and temperatures via an RS-422 Interface. Software modules have also been provided to set up and monitor detector enclosure temperature and humidity and individual CCD Thermo-electric Cooler Controller currents and detector power supply voltage analog signals via a XYCOM Analog-to-Digital VME Module. Software modules used to monitor and control approximately 200 APS 1 detector parameters (such as readout mode and detector state via an RS-422 Interface) have been implemented. This software serves to aid detector engineers with routine detector setup, monitoring, and control tasks. This software has been implemented via EPICS device support and EPICS databases, which are downloaded to the APS 1 Detector Setup & Monitor IOC.

Figure 3 APS 1 Detector VME Interface

When the APS 1 detector is read out, the data are passed serially in a fixed order from 18 pixels at a time (each of the nine CCDs has two readout amplifiers) and transferred to a VME Multiplexer module in the Data Acquisition IOC. The 18 pixels are scattered across the detector face. The Multiplexer module rearranges the data into 16-bit parallel words, which represent the signal measured by each pixel where the signal is proportional to the number of x-rays detected, and passes the 16-bit words to a VME Descrambler module. The VME Descrambler reorders the pixels assembling a sequential image, sequential in x- and y-positions, in VME memory.
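The descrambling step can be pictured as a table lookup from stream position to image address. The sketch below is a toy model only: the true readout order of the APS 1 Detector is not given here, so the amplifier layout (each CCD split into a left and a right half), the 4x4 toy CCD size, and all function names are assumptions made for illustration.

```python
# Toy geometry (assumed for illustration; the real CCDs are 1024x1024 pixels).
CCD_GRID = 3                                    # 3x3 CCDs, as in the detector
CCD_SIDE = 4                                    # toy CCD size in pixels
AMPS_PER_CCD = 2                                # two readout amplifiers per CCD
CHANNELS = CCD_GRID * CCD_GRID * AMPS_PER_CCD   # 18 pixels arrive per step
IMAGE_SIDE = CCD_GRID * CCD_SIDE
HALF = CCD_SIDE // 2


def build_lookup():
    """Map each position in the scrambled stream to an (x, y) image position.

    Assumed order: at step k every amplifier delivers its k-th pixel; amplifier
    0 of a CCD reads the left half row by row, amplifier 1 the right half.
    """
    lookup = []
    steps = CCD_SIDE * HALF                     # pixels read by one amplifier
    for k in range(steps):
        row, col = divmod(k, HALF)
        for ch in range(CHANNELS):
            ccd, amp = divmod(ch, AMPS_PER_CCD)
            ccd_row, ccd_col = divmod(ccd, CCD_GRID)
            x = ccd_col * CCD_SIDE + amp * HALF + col
            y = ccd_row * CCD_SIDE + row
            lookup.append((x, y))
    return lookup


def descramble(stream, lookup):
    """Place each 16-bit word from the stream at its sequential image address."""
    image = [0] * (IMAGE_SIDE * IMAGE_SIDE)
    for word, (x, y) in zip(stream, lookup):
        image[y * IMAGE_SIDE + x] = word
    return image


def scramble(image, lookup):
    """Inverse operation, used here only to build a test stream."""
    return [image[y * IMAGE_SIDE + x] for (x, y) in lookup]


lookup = build_lookup()
original = list(range(IMAGE_SIDE * IMAGE_SIDE))   # pixel value = its address
stream = scramble(original, lookup)
restored = descramble(stream, lookup)
print(restored == original)
```

In this model the first 18 words of the stream come from 18 different places on the detector face, and the lookup table restores them to sequential x- and y-order. A hardware descrambler can perform the same address translation as the words arrive.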

While the data are transferred from VME memory to SGI Challenge memory and written to disk, the next image is taken and transferred to a second descrambler and memory module. By "ping-ponging" between memory modules, the data may be transferred to disk without additional overhead. We examine the parallel nature of the data acquisition system in Section 2.3.5.
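The "ping-pong" arrangement is a form of double buffering. The sketch below models it in software with two threads and two small buffers; it is an illustration of the idea, not the VME implementation, and the buffer size, image count, and names are hypothetical.

```python
import queue
import threading

NUM_IMAGES = 6
IMAGE_WORDS = 8        # toy image size; the real images hold 3072x3072 words

buffers = [[0] * IMAGE_WORDS, [0] * IMAGE_WORDS]   # memory modules A and B
free_slots = queue.Queue()     # buffers the detector may fill
full_slots = queue.Queue()     # buffers ready to be written out
for slot in (0, 1):
    free_slots.put(slot)

written = []                   # stands in for the disk


def acquire():
    """Producer: fill whichever buffer is free, then hand it to the writer."""
    for n in range(NUM_IMAGES):
        slot = free_slots.get()            # blocks only if both buffers are busy
        buffers[slot][:] = [n] * IMAGE_WORDS
        full_slots.put((n, slot))
    full_slots.put(None)                   # end-of-run marker


def write_out():
    """Consumer: copy a full buffer out, then return it to the free pool."""
    while True:
        item = full_slots.get()
        if item is None:
            break
        n, slot = item
        written.append((n, slot, list(buffers[slot])))
        free_slots.put(slot)


producer = threading.Thread(target=acquire)
consumer = threading.Thread(target=write_out)
producer.start()
consumer.start()
producer.join()
consumer.join()

print([(n, "AB"[slot]) for n, slot, _ in written])
```

Successive images alternate between buffers A and B, and a buffer is reused only after its contents have been copied out, so acquisition of one image overlaps the write of the previous one.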

Figure 4 APS 1 Detector MEDM Screens

2.3.2 SGI Challenge High-bandwidth System Architecture

The APS 1 Detector data acquisition system was designed with high-performance I/O, which allows it to support concurrent beamline data manipulation tasks:

The data acquisition system design centers around a beamline file and compute server, the SGI Challenge. SGI Challenge has the following high-bandwidth features (see Figure 5):

Disk I/O is handled by striping the data across 3 RAID arrays, connected by SCSI-2 FWD (20 MB/s per controller) and seen as 1 large logical volume using the XLV Logical Volume Manager. With this I/O configuration, an 18 MB image can be written to disk in < 1.0 s. The RAID arrays also have hot-swappable disks and power supplies.
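A rough bandwidth estimate shows why striping matters. The calculation below uses the quoted bus rate as a best case; real throughput is lower because of file system and protocol overhead, so the results are lower bounds on the write time rather than predictions.

```python
CONTROLLERS = 3
MB_PER_S_PER_CONTROLLER = 20.0     # SCSI-2 FWD rate quoted in the text
IMAGE_MB = 3072 * 3072 * 2 / 1e6   # about 18.87 MB per image

aggregate_mb_per_s = CONTROLLERS * MB_PER_S_PER_CONTROLLER   # 60 MB/s peak
striped_write_s = IMAGE_MB / aggregate_mb_per_s              # about 0.31 s
single_write_s = IMAGE_MB / MB_PER_S_PER_CONTROLLER          # about 0.94 s

print(f"peak aggregate bandwidth: {aggregate_mb_per_s:.0f} MB/s")
print(f"best-case write, striped over 3 controllers: {striped_write_s:.2f} s")
print(f"best-case write, single controller: {single_write_s:.2f} s")
```

On a single controller even the best case would leave almost no margin below 1.0 s, whereas striping over three controllers leaves roughly a factor of three of headroom.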

The IRIX operating system comes with a set of realtime programming features, which are also an integral part of the APS 1 Detector data acquisition system. The data acquisition system makes use of the following capabilities of the IRIX REACT Extensions, a realtime programming library:

Figure 5 SGI Challenge High-bandwidth System Architecture

2.3.3 HIPPI Data Transfer

High Performance Parallel Interface, HIPPI, is an ANSI standard high speed interconnect which originated from a need to interconnect LANs of supercomputers. The HIPPI standard defines a simple and efficient memory-to-memory data transfer. As such, HIPPI is an excellent point-to-point, large file transfer mechanism. HIPPI uses 2 copper 50-pair twisted-pair cables over up to 25 meter distances. HIPPI is currently the interface of choice and de-facto high speed interconnect for high-end workstations and other devices requiring high speed data transfers.

In this data acquisition system, a HIPPI network is used as a dedicated data acquisition pathway. Image data is transferred from VME memory to SGI Challenge memory via the HIPPI protocol. On the Challenge, the HIPPI module and driver are provided by SGI. On the data acquisition IOC (input-output controller), a commercial HIPPI-VME module is used. A custom HIPPI vxWorks driver was implemented by ECT control system developers to support this interface. A raw-character-mode protocol, rather than TCP/IP, was used to eliminate the software overhead of the TCP/IP protocol. A HIPPI Server process running on the SGI Challenge accepts image data and header information and writes the image with header to disk asynchronously (see Section 2.3.5).

2.3.4 PMAC/VME Intelligence and Hardware Synchronization

From a controls point of view, the following data acquisition tasks are critical in the single-crystal x-ray diffraction experiment:

To achieve this, a Delta Tau Systems PMAC-VME (Programmable Multi-Axis Controller) motor controller is used to control the position and speed of the crystal orientation axis, also called the goniometer Omega axis. The PMAC-VME programmable feature is also used to calculate and maintain a constant speed of rotation during image acquisition.

ECT developers use the PMAC-VME programmable feature to define the start and end positions for an image using real-time angular readback data from a shaft encoder attached to the Omega axis motor and to output control pulses at each position. These pulses are used to precisely control an x-ray shutter which defines the sample x-ray exposure interval time, the gating of the APS 1 Detector, and any detector used to measure the actual dose of x-rays to the sample (i.e., a beam intensity monitor).

To assure that the gating of all devices is done accurately and with minimal latencies, which rules out software control, a hardware solution was devised. A Data Acquisition Sync Module was designed with an on-board general-purpose VME interface. The DAQ Sync Module has the ability to delay the gating of any device by a predefined time; in this way the delay in opening of the x-ray shutter can be accommodated.

Testing of the fast shutter electronics and software subsystem was conducted in a laboratory setup using the shutter filter module, timing scaler, goniometer sync module, and a sample shutter attenuator unit. Additional tests also involved a motor controller and a goniostat rotary table. A laser diode was directed through the shutter's opening at a photodiode, and a scope was used to monitor the timing signals driving the shutter blades and the output of the photodiode. The tests showed a mechanical shutter blade delay of 40 milliseconds on opening and closing and a mechanical shutter jitter of approximately 1 millisecond.
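The delay compensation described for the DAQ Sync Module can be illustrated with a small calculation. This is a sketch under assumptions: the shutter delay and jitter are the measured values above, but the zero latencies of the other devices and the rotation speed used in the example are hypothetical, and the real module does this in hardware.

```python
SHUTTER_DELAY_S = 0.040     # measured blade delay from the text
SHUTTER_JITTER_S = 0.001    # measured jitter from the text

# Response latencies of the gated devices; the zero values are assumed.
latencies_s = {
    "x-ray shutter": SHUTTER_DELAY_S,
    "detector gate": 0.0,
    "beam intensity monitor": 0.0,
}


def gate_delays(latencies):
    """Delay each gate so that every device takes effect with the slowest one."""
    slowest = max(latencies.values())
    return {name: slowest - latency for name, latency in latencies.items()}


def angular_uncertainty_deg(speed_deg_per_s, jitter_s=SHUTTER_JITTER_S):
    """Rotation angle swept during the shutter jitter at a given Omega speed."""
    return speed_deg_per_s * jitter_s


delays = gate_delays(latencies_s)
for name, delay in delays.items():
    print(f"{name}: delay gate by {delay * 1000:.0f} ms")
print(f"angle swept during jitter at 2 deg/s: {angular_uncertainty_deg(2.0):.3f} deg")
```

The shutter itself is gated without delay, while the faster devices are held back by the shutter's 40 ms opening time, so all of them see the same exposure interval.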

The sequence of events during data acquisition for a single image is as follows:

2.3.5 Distribution of Data Acquisition Tasks

To achieve the 3+s image to disk performance we have made use of the distributed nature of EPICS, using a great deal of parallelism. Figure 6 is a flowchart that can be used to track the operations which are occurring in parallel. While it is somewhat difficult to explain the chart in detail, we will highlight two important occurrences of overlapping parallel paths.

First, some background: data acquisition involves a variety of programming techniques (C++ and C programs, EPICS sequence programs, PMAC Motion Control Programs) running on multiple CPUs. The CPUs involved are:

We have discussed the SGI Challenge in Section 2.3.2, the HIPPI Server in Section 2.3.3 and the PMAC-VME in Section 2.3.4.

The Experiment IOC is the data acquisition "master", controlling the sequencing of data acquisition using a sequence program. EPICS provides a state notation compiler, a preprocessor that converts state notation language source code into C code. ECT developers wrote the program in state notation language as a sequence of related states (typically hardware states) within one or more EPICS databases. State notation language also provides easy access to EPICS database information.

Interprocess communication occurs as both hardware and software signals:

Figure 6 Image Acquisition Detailed Flow Diagram

Overlapping Parallel Paths #1 and #2

In the first example of data acquisition tasks performed in parallel, a disk write has been initiated for data from VME memory A, and the detector readout system has filled VME memory B (see Figure 6). Parallel Path #1 comprises the HIPPI transfer for VME memory B, header formatting for the VME memory B image, and verification of the VME memory B image. These tasks overlap with Parallel Path #2, the completion of the disk write for VME memory A (see Table 1).

Table 1 Parallel Path #1 and #2 Timings

Parallel Path   Task            Measured Time (s)
#1              Header Format   1.2
#1              Image Verify    0.02
#2              Disk Write      <1.0

In this overlapping of data acquisition tasks, header formatting is the rate-limiting step. Header formatting currently uses C++ classes from d*TREK that rely heavily on dynamic string manipulation. We believe this time can be reduced significantly, bringing the two path times closer together.

Overlapping Parallel Paths #3, #4, and #5

In the next example, three tasks overlap. The APS 1 Detector is read out into VME memory after an exposure (Parallel Path #3). The Experiment IOC gathers the header data collected during exposure for the current image, then gathers the header data collected prior to exposure for the next image (Parallel Path #4). The PMAC-VME motion controller decelerates the Omega motor, rewinds, corrects for backlash, and repositions for optimal acceleration for the next image (Parallel Path #5).

Table 2 Parallel Path #3, #4, and #5 Timings

Parallel Path   Task                         Measured Time (s)
#3              Detector Readout             1.8
#4              Header Information Current   0.3
#4              Header Information Next      1.2
#5              Motor Manipulations          >2.5

The motor speed is currently the rate-limiting factor, requiring more than 2.5 s to perform these motions. A higher-speed torque motor is planned, which should allow the motions to be completed in under 1.8 s.
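When paths run concurrently, the elapsed time is set by the slowest path, with the tasks inside one path adding up. The sketch below applies that rule to the Table 2 timings; the ">2.5" entry is treated as exactly 2.5 s, and the 1.7 s value for the planned motor is a hypothetical figure chosen only to satisfy "under 1.8 s".

```python
# Table 2 timings in seconds; tasks within one path run one after another.
paths = {
    "#3 detector readout": [1.8],
    "#4 header information (current, next)": [0.3, 1.2],
    "#5 motor manipulations": [2.5],     # quoted as ">2.5", so a lower bound
}


def slowest_path(paths):
    """Return the rate-limiting path and the total time of every path."""
    totals = {name: sum(steps) for name, steps in paths.items()}
    limiting = max(totals, key=totals.get)
    return limiting, totals


limiting, totals = slowest_path(paths)
print(f"rate limit now: {limiting} at {totals[limiting]:.1f} s")

# Hypothetical faster motor that completes its motions in under 1.8 s.
upgraded = dict(paths)
upgraded["#5 motor manipulations"] = [1.7]
limiting_after, totals_after = slowest_path(upgraded)
print(f"rate limit after upgrade: {limiting_after} at {totals_after[limiting_after]:.1f} s")
```

With the faster motor the detector readout becomes the limiting path, which is the stated goal of the design.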

For short-exposure "still" images, where the motor is not the rate limit, the APS 1 Detector data acquisition system presently collects an image and writes it to disk in 3+ seconds. Tuning the header manipulations would directly improve this figure. For rotation images with short exposure times, the current goniometer Omega motor cannot perform within the 3+ second time: rewind, backlash correction, and ramping the Omega motor to speed for the next image involve 3 accelerations and 3 decelerations. Because of this hardware limitation, the actual rotation image-to-disk time is 5+ seconds.

Data Acquisition Diagnostics

Data acquisition diagnostic capabilities have been built into the control system. Image acquisition performance can be displayed using a diagnostic tool, the data acquisition monitoring MEDM screen shown in Figure 7. This diagnostic tool can be used to measure system performance and to help troubleshoot the data acquisition system.

Figure 7 Data Acquisition Diagnostic MEDM Screen

3. d*TREK dtcollect

3.1 d*TREK dtcollect GUI

Users interact with the d*TREK dtcollect GUI (shown in Figure 8) to collect data. They enter the exposure time and mode, rotation angle, number of images, image naming convention, desired wavelength, starting sample position, and detector position (distance and angle). They use the "Expose" button to proceed.

Figure 8 d*TREK dtcollect GUI

While data acquisition proceeds, users may monitor data acquisition status and a variety of sensors, in this case the PIN diode detector (uA) and the APS ring current (mA). The dtcollect GUI allows the user to collect a single data scan, that is, a series of images collected at sequential angular positions.

Multiple scans may be requested by the user with the dtcollect Scan screen (see Figure 9), brought up from the dtcollect Collect menu. Here the user has the same input with the addition of specifying a series of scans. The scans may also be read in from an image header file, reducing the user input required and thereby automating the task. The user presses the "Scan" button to proceed and may "Abort" or "Pause"/"Resume" the scan at any time.

3.2 Image Header Information from 3 Sources

Figure 10 shows the flow of control during the "Expose"/"Scan" operation. Notice that we are merging header information from 3 sources: site-specific information stored as a disk file, user-input to the dtcollect GUI, and dynamic information gathered from the IOCs.

3.3 Significance of APS 1 Detector Image Headers

APS 1 Detector images are formatted in the MIFF format (machine-independent file format), first developed by Jon Cristy of Dupont and later modified by Jim Pflugrath and Marty Stanton for area detector images. The APS 1 Detector image file is stored as an ASCII header followed by binary data (a 3072x3072 pixel array of 16-bit unsigned integers).

The advantage of the MIFF format is that the data format can be easily determined and therefore, an image read routine used to process these images with other data analysis programs may be easily implemented. Also, the UNIX "more" command may be used to view image headers and avoid viewing binary data.

Once the crystallography community adopts a standard image format, like the Crystallographic Binary File (CBF) format for image data, APS 1 Detector images will be formatted in that standard format.

The image header consists of an ASCII string whose length is a multiple of 512 characters. The string consists of human-readable text of the form KEYWORD=VALUE; where KEYWORD is a case-sensitive string of up to 32 characters and VALUE is the value of the KEYWORD, which may be a number (integer or decimal), a string, or an array of numbers.
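A header of this form is simple to write and to read back. The sketch below is a minimal illustration and not the d*TREK implementation: padding with spaces, one entry per line, space-separated arrays, and the keyword names in the example are all assumptions, and values containing a semicolon are not handled.

```python
BLOCK = 512          # header length must be a multiple of this (from the text)
MAX_KEYWORD = 32     # maximum keyword length (from the text)


def format_header(fields):
    """Render KEYWORD=VALUE; entries and pad with spaces to a 512-byte multiple."""
    parts = []
    for key, value in fields.items():
        if not 0 < len(key) <= MAX_KEYWORD:
            raise ValueError(f"keyword length out of range: {key!r}")
        if isinstance(value, (list, tuple)):
            value = " ".join(str(v) for v in value)   # arrays: space-separated (assumed)
        parts.append(f"{key}={value};\n")
    text = "".join(parts)
    padded_len = -(-len(text) // BLOCK) * BLOCK       # round up to a multiple of BLOCK
    return text.ljust(padded_len).encode("ascii")


def parse_header(raw):
    """Recover a keyword -> value-string mapping from a padded header."""
    fields = {}
    for entry in raw.decode("ascii").split(";"):
        entry = entry.strip()
        if entry:
            key, _, value = entry.partition("=")
            fields[key] = value
    return fields


header = format_header({
    "SIZE1": 3072,                 # illustrative keyword names
    "SIZE2": 3072,
    "EXPOSURE_TIME": 1.5,
    "ROTATION": [0.0, 0.5],
})
print(len(header), parse_header(header)["ROTATION"])
```

Because the header is plain ASCII with a known block size, a reader can skip it in whole 512-byte blocks to reach the binary pixel data.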

APS 1 Detector image header information serves the following functions:

Figure 9 d*TREK dtcollect Scan GUI

Figure 10 APS 1 Detector Images to Disk

3.4 d*TREK-APS 1 Detector Data Acquisition System Interface

ECT control system developers and Jim Pflugrath collaborated in specifying an application programming interface (API) which has been implemented for data acquisition with the APS 1 Detector and d*TREK dtcollect.

This API consists of 6 C++ classes (coded by ECT control system developers) which interface to the EPICS-based control system.

Together these device classes form the d*TREK DTDEV Device Library. The DTDEV Device Library contains the site-specific, hardware-dependent code that interfaces d*TREK dtcollect to the beamline and experiment, allowing d*TREK to remain device-independent.

Figure 11 d*TREK Application Programming Interface

Methods were provided to control and monitor beamline and end-station hardware used in an experiment, such as APS 1 detector, goniometer, ion chamber, PIN diode detector, double crystal monochromator, and fast shutter. These methods use the EPICS Common DEVice (cdev) C++ class library. Cdev provides a standard interface between the class methods and the EPICS-based Control System.

SBC-CAT acquisition of processible x-ray diffraction images demonstrates the successful integration of d*TREK and the APS 1 Detector data acquisition system. This API has also been successfully applied to a Windows NT version of the software by MSC for use with a commercial CCD detector system.

4. Use of Data Compression

4.1 Commercial Hardware Compression

The use of commercial hardware compression chips and their software equivalents was explored during the design phase of the project. A typical image to be compressed is shown in Figure 2. The profiles in Figure 2 show the image to be very detailed; each pixel can contain a value from 0 to 65535. Table 3 shows the compression ratios achieved with the Advanced Hardware Associates dclz algorithm on APS 1 Detector images. We found that the resulting compression ratios were too low to yield a significant reduction in required disk space.

Table 3 APS 1 Detector Image Compression (proprietary hardware algorithm)

Disk I/O included

Image Data                         lys023s1a (raw)   lys011s3a (raw)
dclz Algorithm Compression Ratio   1.38              1.32

4.2 Standard Unix Software Compression

Standard Unix software compression algorithms were also explored (see Table 4). The compression ratios are only slightly better, while the time to execute these algorithms was high relative to the data acquisition rate.

Table 4 APS 1 Detector Image Compression (standard UNIX)

Tests performed on SGI R4400 Challenge - Disk I/O Included

Image Data                          lys023s1a (raw)   lys011s3a (raw)
Image size (MB)                     18.88             18.88
Unix "compress" Time (s)            22                22
Unix "compress" Compression Ratio   1.51              1.42
GNU "gzip" Time (s)                 83                72
GNU "gzip" Compression Ratio        1.63              1.58
Unix "pack" Time (s)                14                14
Unix "pack" Compression Ratio       1.51              1.43

The factors contributing to the decision not to use data compression in the initial implementation are:

4.3 Byte-Offset Software Compression

The byte-offset algorithm has recently come to our attention; it executes faster and at the same time yields higher compression ratios. It is also of interest to us because it is one of the proposed compression schemes for the Crystallographic Binary File (CBF) format. We tested this algorithm with APS 1 Detector images, and our preliminary results are shown in Table 5.

Table 5 APS 1 Detector Image Compression (Byte-Offset)

Tests performed on SGI R4400 Challenge - Disk I/O Included

Image Data                      lys011s3a (raw)
Image size (MB)                 18.88
Byte-Offset Compress Time (s)   2.5
Byte-Offset Compression Ratio   1.94
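The idea behind byte-offset compression can be sketched briefly: each pixel is stored as the difference from its predecessor, in one byte when the difference is small, with escape codes for larger differences. The code below follows the scheme as it is commonly described for CBF; the exact escape values and byte order are assumptions on our part and may differ from the implementation tested for Table 5.

```python
import struct


def byte_offset_compress(pixels):
    """Encode pixel-to-pixel differences in 1, 2, or 4 bytes with escapes."""
    out = bytearray()
    prev = 0
    for p in pixels:
        delta = p - prev
        if -127 <= delta <= 127:
            out += struct.pack("<b", delta)                   # 1-byte difference
        elif -32767 <= delta <= 32767:
            out += b"\x80" + struct.pack("<h", delta)         # escape + 2 bytes
        else:
            out += b"\x80\x00\x80" + struct.pack("<i", delta) # two escapes + 4 bytes
        prev = p
    return bytes(out)


def byte_offset_decompress(data, count):
    """Invert byte_offset_compress for a known number of pixels."""
    pixels = []
    prev = 0
    i = 0
    while len(pixels) < count:
        (delta,) = struct.unpack_from("<b", data, i)
        i += 1
        if delta == -128:                       # 0x80: a 2-byte difference follows
            (delta,) = struct.unpack_from("<h", data, i)
            i += 2
            if delta == -32768:                 # 0x8000: a 4-byte difference follows
                (delta,) = struct.unpack_from("<i", data, i)
                i += 4
        prev += delta
        pixels.append(prev)
    return pixels


sample = [10, 12, 11, 500, 498]                 # hypothetical 16-bit pixel values
compressed = byte_offset_compress(sample)
print(len(compressed), "bytes instead of", 2 * len(sample))
assert byte_offset_decompress(compressed, len(sample)) == sample
```

Smooth background regions compress to about one byte per pixel, while sharp diffraction peaks cost three bytes or more, which is consistent with a ratio somewhat below 2 for 16-bit diffraction images.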

For our system, the byte-offset compression algorithm done in software and conducted in parallel with other data acquisition tasks could be used without degrading image acquisition. We could conduct byte-offset software compression of the current image in parallel with the writing of the previous image and reading of the next image (see Table 6).

Table 6 Byte-Offset Compress, Write Image and Read Image Timings

Parallel Task                        Step               Measured Time (s)
Readout Next Image                   Header Format      1.2
Readout Next Image                   Image Verify       0.02
Byte-Offset Compress Current Image   Data Compression   <2.0 (without disk I/O)
Disk Write Previous Image            Disk Write         <1.0

Furthermore, the data acquisition system design does not prohibit installation and use of data compression hardware. If the byte-offset compression algorithm could be implemented in hardware on a VME module, it could be used as an integral part of the VME data handling pipeline (see Figure 3).

5. Conclusions

The APS 1 Detector data acquisition system has been designed for speed, using both specialized hardware and the EPICS distributed architecture to conduct data acquisition tasks in parallel. Each integral hardware and software component is consistent with high performance. No rate-limit exists that prevents the data acquisition system from being further optimized to achieve 1.8 s image acquisition.

During 20 months of validation, the first crystal structure was solved and published within the past year. ECT was involved in the "Human fhit" experiment, which used the technique of MAD phasing. fhit is important to our understanding of cancer in biological systems and in the development of biotechnology.
