> > From: firstname.lastname@example.org (S. Joshua Stein)
> > We've had some interesting problems with the NI-1014 GPIB board talking to
> > some Hewlett Packard instruments - specifically some of their 'scopes. I spent
> > a good week with a GPIB logic analyzer watching transactions between the
> > NI-1014 board and an HP54615B oscilloscope - it seemed that occasionally the
> > GPIB board would not honor an assertion of one of the handshaking lines
> > from the 'scope (I believe it was NRFD) causing all sorts of bad things
> > (garbled messages, locking up the scope, etc).
> > The solution (after talking many times to many engineers at HP and NI) was
> > to use a different controller. I switched to a greenspring IP running on
> > a 162 (with Hideos - see how I tied that in?) and everything has worked great
> > since.
> We've had problems with the NI-1014 also - particularly with older instruments
> like ion-gage controllers.
> We've got a simple hardware hack which seems to do much for their reliability.
> Essentially, it consists of dividing down the clock to the GPIB controller
> chip. We cut one trace, and add one IC, dead-bug style. Takes about 20
> minutes for a good tech to do.
Ouch! These problems are sooOOOoo time consuming to debug. It was this
exact problem (that took Greg Nawrocki a month to determine with GPIB and
waveform analysers) that drove Greg, Jim and me to brainstorm what Jim would
ultimately (vastly) improve upon and release as HiDEOS. By the way, Greg and
I were having trouble with the Digitel-500's GPIB interface board. It used a
68488, driven by a 680x processor, to operate its GPIB bus. It turned out
that the 68488 was not responding with NRFD fast enough to stop the data
flow at the right time. I wonder what type of GPIB chips are used in these
other devices?
Here is the long version of that paragraph...
Once upon a time there was a guy named Greg that had to get pressure
readings from Digitel-500 ion pump controllers...
It was the early 1990s... 1991 I think... Greg's office was part of a
hallway that connected three lab areas. One of the labs was being used as a
conference room that had a table in it that was just a little too big. No one
could enter or exit the room without forcing everyone else to do the same. It
was as if every meeting was modeled after that early meeting scene in "Attack of
the Killer Tomatoes."
Greg also had a bench in one of the labs... I suppose that is why he had the
crummy office. I mean, he was one of the few (and lowest in the pecking order)
that had both an office AND a lab. Under his bench sat 1/2 dozen Digitel-500
units piled up in the corner. These things had a brown-colored face plate
about 6-inches high, a large digit LED panel and two back-lit buttons. A
yellow one and a red one. These buttons said something to the effect of "High
Voltage Enabled" and "High Voltage On." One of these DIG-500's (as we called
them) had some funky wires connected to a cylinder-shaped thing on the
floor. I later figured out that that cylinder thing was an ion-pump... later
still I found out that an ion pump tries to attract floating particles against
the sides of what would otherwise be a perfectly empty vacuum chamber.
Somewhere, somehow, someone had decided to order a couple-100 of these DIG-500
units before asking anyone how they wanted to connect them to the control
system. The units were shipped configured for RS-232 and given to Greg to
figure out how to deal with the connecting. Greg knew that he was going to
have to connect all of these units and that the manufacturer advertised
support for both RS-232 and GPIB. He also knew that by using RS-232, one
had to have one connection from the host processor to each and every
DIG-500, and that by using GPIB he could connect several units to one
host processor connection.
John had just started at the lab at this time and was asked to have a look
at some software that was used to operate the GPIB controller they were using.
It was the top-of-the-line National Instruments 1014. It boasted connectivity
thru BOTH the front and rear panels and had a sticker on the front of it that
read "approved by VMELabs".
John looked at the driver software and said "wow is this thing sophisticated!
At long last... something that I can really sink my teeth into!"
One day soon after, on a self un-guided tour of the lab area to see all the
tubes and wires that "represented real science" to John, he walked into Greg's
lab and saw this impressive looking stuff with the cylinders and wires and
asked of Greg "what doeth this scientific-looking stuff?"
Greg told John of the RS-232 and GPIB (John knew of RS-232 from his own
past, and was excited to be able to start using his new-found knowledge
of GPIB.) John was impressed with Greg's need to address the situation and
wanted to help.
For the next week or so, John read all the information available on the
DIG-500 and noticed that the RS-232 connection was designed to be plugged
into a terminal and a printer that someone could later inspect to observe
the trend of one or two specific readings on the unit. Something called
Torr and something else that had to do with these things called setpoints.
There were other things that the DIG-500 could do, but you had to type
in specific commands to get that other data each time you wanted to see it.
John asked of Greg "what happens if you are typing when the DIG-500 decides
that it is time to spew out a Torr and setpoint reading?" They looked at
each other blankly for a few seconds and started typing in commands while
waiting for a pressure reading.
The pressure reading is printed whenever it is ready, right in the middle
of whatever the user is typing or requesting a reading of.
"What a lovely feature", commented John using a certain jargon that is often
printed in comic books like this: "What a #@!%%&." John then added a side
note (like he always does) that "Brown is a good color for these things. I
dub thee 'brown pile'. From now on, you shall be known as the brown pile."
The ONLY redemption the company had in John's eyes was that the feature was
disableable. However, the default was for it to be 'on', and in order to
disable it, you had to enter a command, and since that command could itself
be interrupted by a reading of the current Torr value, there is no simple way
that one can 'initialize' the mode without a 10-fold increase in code size
to add the timeout and retry logic needed to be sure it completes properly.
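
For the curious, here is a rough sketch of what that timeout-and-retry dance looks like. This is NOT the real driver: the command string, the acknowledgement, the reading format and the `FlakyUnit` stand-in are all made up for illustration, and a real port's `readline` would actually block for up to `timeout` seconds.

```python
import re

DISABLE_CMD = b"AUTO OFF\r"   # hypothetical command, not from the DIG-500 manual
ACK = b"OK"                   # hypothetical acknowledgement
READING = re.compile(rb"\d\.\dE-\d+ ?TORR")  # hypothetical reading format


class FlakyUnit:
    """Stand-in for the instrument: early commands get clobbered by a reading."""

    def __init__(self, clobbered_commands=1):
        self.clobbered = clobbered_commands
        self.auto = True
        self.pending = []

    def write(self, data):
        if self.clobbered > 0:
            # An unsolicited reading lands in the middle of the command,
            # so the unit never sees a complete command.
            self.clobbered -= 1
            self.pending.append(b"1.2E-9 TORR")
        elif data == DISABLE_CMD:
            self.auto = False
            self.pending.append(ACK)

    def readline(self, timeout):
        # None models a timeout: nothing arrived within `timeout` seconds.
        return self.pending.pop(0) if self.pending else None


def disable_auto_report(port, attempts=5, timeout=1.0):
    """Resend the disable command until it is acknowledged, or give up."""
    for attempt in range(1, attempts + 1):
        port.write(DISABLE_CMD)
        reply = port.readline(timeout)
        while reply is not None and READING.search(reply):
            reply = port.readline(timeout)  # throw away unsolicited readings
        if reply == ACK:
            return attempt
    raise TimeoutError("unit never acknowledged the disable command")


unit = FlakyUnit(clobbered_commands=1)
tries = disable_auto_report(unit)
```

Even in toy form, the one-line "send a command" has grown a loop, a filter and a failure path, which is the code-size complaint in a nutshell.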
Greg and John then decided to see how the GPIB interface works since GPIB
implies a very different 'style' of communication. It is rarely used for
anything other than machine-machine communication. So they were certain
the designers would take that into account when designing it.
John and Greg hooked up a GPIB interface on one of the brown piles and plugged
it into a 1014 and fired it all up.
"Real science goin on here!" thought John as he watched the 90 LEDs on the
VME analyser flash on and off as it accessed the 1014, brown pile with the
funky wires and cylinder thingy off to the side. And then, all of a sudden,
all but one of the LEDs went out and we sat there looking at it like it was
going to wake back up and finish... any second now... aaaaany second now...
A day or two later finds Greg, John, an NI GPIB analyser, and the flashing
VME analyser along with all of their fans roaring away in the lab still asking
"what the %$#%$# is wrong with this thing?" (The fans eventually gave John a
bit of a cold in the middle of that summer.)
Calling the manufacturer we found out little more than that "it works on our
test stand", and "the guy that worked on the GPIB interface no longer works
here." Somehow John and Greg did not feel that that was an acceptable response
to a customer that just purchased a couple-100 of these things at a cost in
excess of $100K.
Days and weeks went on as the two convinced Perkin Elmer (the company that
no longer employs the GPIB designer, and who John refers to as Pukin' Elmer)
to give them the source code to the software for the GPIB interface. After
finding it completely undocumented, incomplete and unreadable, John and Greg
then realized WHY that guy no longer works for Pukin' Elmer.
John and Greg decided that if it works for them, it will work for us. There
must be something wrong with John's code, or the 1014. John decided (like
all good programmers) that his code was fine and that there was no need to
check it (although he did secretly when no one was looking and found that while
there were no problems that would cause what they were observing, there were
indeed enough other things wrong with it that he worked on it for over a year
after that day.) So the two hooked up a logic analyser and watched the
voltages go up and down on the GPIB bus to see if there was some kind of
What they saw was that the NI-1014 data transfer to the DIG-500 was
shoving out the first two bytes before the device could respond with an NRFD
that it apparently intended to assert after the FIRST one... which, by then,
was long gone. (Bill Brown's solution, that had a few years to go before
birth, slows down the NI-1014 clock and thus its top end data rate which
works fine. However, if you need several 1014's, it is sort-of annoying to
have to rebuild them after shelling out $1500 each for them.)
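
A toy model of that race, for anyone who wants to see why slowing the clock helps. The timing numbers are invented, and the model ignores the rest of the GPIB handshake (DAV and NDAC); it only assumes the talker puts out one byte per period until it can see the listener's hold-off.

```python
def bytes_before_holdoff(byte_period_ns, nrfd_latency_ns, max_bytes=16):
    """Count the bytes a talker sends before it can see the listener's NRFD.

    Byte k goes out at k * byte_period_ns. The listener asserts NRFD
    nrfd_latency_ns after byte 0, so any byte sent before then escapes.
    """
    sent = 1  # byte 0 always goes out
    for k in range(1, max_bytes):
        if k * byte_period_ns >= nrfd_latency_ns:
            break  # the talker sees NRFD and holds off
        sent += 1
    return sent


# Invented numbers: a listener that needs 700 ns to assert NRFD.
fast = bytes_before_holdoff(byte_period_ns=500, nrfd_latency_ns=700)
slow = bytes_before_holdoff(byte_period_ns=1000, nrfd_latency_ns=700)
print(fast, slow)  # 2 1
```

With the faster talker a second byte escapes before the hold-off lands; halve the talker's rate and only the first byte gets out, which is the whole point of dividing down the clock.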
The 1014 (and its glorious driver) were just too fast for the little
Pukin' Elmer brown pile.
"Hah!" said John as he also calculated that the 1014 was not violating any
timing requirements in what it was doing.
John was younger then and took the mission, stated by those that paid him,
"COTS based hardware only" very seriously. He figures it was sort-a
foolish not to discount that a little when he noticed that 1/2 the floor space
was filled with guys soldering chips into PC boards and that 1/2 of the
remaining space was filled with guys designing those PC boards.
At this same time, NI came out with a new version of the 1014 (the 1014D) that
had two ports on it. Another guy involved with the project, Nedo, was
excited about it because he had a number of brain-damaged GPIB devices that
worked, but were so slow that they had to be put on their own private GPIB
busses if he wanted to ever finish transferring his data by the year 2000.
[Ironic that a bus that can support 15 devices is rendered slower than a 300-baud
RS-232 link by some engineer somewhere that figured that his was to be the
only device of interest by his own user community. And God forbid he try to
make it easier on his sales force to sell anybody more than one, or get any
repeat business! One wonders if said company hired this bright lad away from
the now rich Pukin' Elmer in hopes that the same sequence of events would
happen to them.]
By now, months had passed, John was coding away on the latest and greatest
GPIB drivers, Greg was calculating the cost of 300 RS-232 ports, and Ned was
ordering a 1014D to give it a test drive. Should be no big deal since NI
claims that it is software-compatible with the 1014... it is merely "two
1014's on the same board", as their marketing material claims.
It turns out that the 1014 uses a 7210 to operate its GPIB bus and is wired
in such a way that the addressing commands MUST be sent out in polled mode
and the data transfers can be sent out in DMA or polled mode. The NI-1014D
is a two-port version of the NI-1014 and is wired to use the same DMA
controller for both ports. The National Instruments vxWorks driver software
code for the 1014D (when John last looked at it) was designed in such a way
that one had to reboot it between port accesses. John called their support
line and got the usual card-reading droid that eventually transferred him to
an engineer that said:
    the board is intended to be used in an environment where two
    different experiments can be 'wired' at the same time but
    only one will run at a time.
"What a lovely feature", replied John as he hung up the phone... and has since
NEVER talked to ANYONE at NI for any reason again.
So after a couple hours of studying the schematic for the board John decided
that the board WAS software compatible as long as you only use the first port.
But that would render it functionally compatible and at 1.4 times the price!
John eventually figured out how to program the thing so that both ports could
run at the same time. This required making both the hardware and the control
system happy by making some serious alterations to the way the board is
initialized and reset after data transfers.
When that was all done, John looked at the 5000-line driver, the pile of brown
piles, and said "there has got to be a better way." So he looked at the price
of the NI-1014, and the NI-1014D and decided that, "for that price we should
be able to find a better piece of hardware. One whose schematic looks better
than the master's project of a C-student. One whose driver is smaller than
the operating system. One that can be supported by someone that has no work
experience with either Pukin' Elmer or National Instruments... some kind of
Meanwhile, back in the lab...
Greg ended up building a custom GPIB controller that ran in a bitbus BUG. He
had two versions, one based on the 7210 and another based on the TI9914. The
7210 worked because the bitbus BUG had a slow-ish CPU in it and it did not
overrun the lowly brown pile. But the day it was finished, his source on the
7210 dried up :-( BUT it WAS cost-compatible with the NI solution and since
we were working in a high-noise environment, he decided that it might be a
good idea to use way-short GPIB cables and fiber the bugs to the VME-crate
(which proved to be a very good idea. Looking back says that cabling them
back to the 1014 would have never worked due to EMI.)
To make his idea fly he decided to use the (apparently) software compatible
9914. And after that failed miserably to function, he realized that the
technical reference docs label the physical package pins using D0-D7 to
represent MSB-LSB order a'la IBM mainframe green-card standards, and the
software documentation (in the same document!) uses D0-D7 to represent
LSB-MSB like the rest of the world. He was then able to get the thing
off the ground enough to then realize that the two chips are not really
software compatible after all.
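
To see what that D0-D7 mix-up does in practice, here is a small sketch. It assumes nothing about the 9914's actual registers; it only shows what happens to a byte when one side numbers the bits MSB-first and the other LSB-first, so the value arrives mirrored.

```python
def reverse_bits8(value):
    """Mirror an 8-bit value: bit 0 swaps with bit 7, bit 1 with bit 6, etc."""
    result = 0
    for i in range(8):
        if value & (1 << i):
            result |= 1 << (7 - i)
    return result


intended = 0x01                      # software means "set the lowest bit"
seen_by_chip = reverse_bits8(intended)
print(hex(seen_by_chip))             # 0x80
```

Every register write comes out mirrored, so setting what you think is the lowest bit actually flips the highest one. Applying the same mirror a second time gets you back the original value, which is why one swap in the wiring or in the driver is enough to fix it.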
By this time, John and Ned had been working on the driver software for
Bitbus and Greg asked that John help him fix the software for the 9914.
John was glad to do anything that would prevent Greg from being forced to
purchase 1014 boards and dove right in. While working on it John noticed
that if events were JUUUUUST right, the bitbus messages could appear to
be out of order... some kind of off-by-one error in queueing someplace that
would eventually go away again after the system would sit idle for a second
or so. "Hmmm... must be a bug in the driver somewhere", thought John assuming
that he would find it before it became a problem.
Once this new solution to end all problems was finished, Greg found that
when sending several commands to the brown pile in succession, the brown pile
would completely reset itself and turn the high voltage off to the cylinder
thingy that would in turn release all the particles that it had trapped, which
would in turn completely defeat the entire system!
John and Greg grew increasingly upset about all this as communications with
support at Pukin' Elmer worked about as well as their GPIB interface. Finally,
one day, Greg proclaimed that the brown piles would be operated using RS-232,
and that we would use the over-sized driver design and that the RS-232 ports
would be controlled by yet another version of a BUG that would operate
the RS-232 ports. (By this time it was obvious that he had to use fiber
between the VME host and the cabinets where the brown piles were. And that
RS-232 over fiber was more expensive than the RS-232 BUG solution.)
Greg's problem was at long last acceptably solved (once the plethora of
RS-232 problems with the UART and brown piles were coded around.) And to
this day the brown piles at the APS are conversed with via RS-232 connected
to the VME hosts via bitbus.
John noticed that this queueing error appeared to have gone away somewhere
between driver versions. He assumed it was cured by the addition of a great
deal of error checking and timeout logic that had been added since.
A week later, a suit from Arrow Electronics came and tossed a promo for a new
thing called an MVME162 on John's desk. John asked of its IP port things.
John asked of the availability of boards that could do RS-232 and GPIB. The
response was that they did exist and that the MV162 with two 9914-based
GPIB ports was less expensive than a 1014D!!!!
John and Greg talked excitedly of the idea of dedicating the entire MV162 to
the operation of the GPIB ports... after all it was no more expensive than
any other VME-based GPIB solution. (The BUG solution could have also been used
but for large data transfers it was too slow to be acceptable.)
John dreamed of the day that he would see NO 1014s in use. His new mission
was to replace them by writing a driver that sat on the MV162 and could be
completely operated by a small driver that would reside in the host processor.
...to be continued
Tune in next time when: we hear Claude say "this bitbus looks like a good
idea for the power supply controllers", we hear Jim say "you should call that
mess 'hideous' since it is hideos", we hear Greg say "how come the bitbus
stuff is responding off-by-one when I connect heterogeneous BUGs on the same
link?", we hear John tell Claude "it appears to be deadlocked, but that
can't happen... see the call to FASTLOCK right here? How could two different
processes get past that point at the same time?", and we hear Jim say "I'd like
to consider the reimplementation of DCT as an X windows application."