EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: VME Bus Error handling on MVME3100 and MVME6100 boards
From: Till Straumann <[email protected]>
To: Andrew Johnson <[email protected]>
Cc: Kate Feng <[email protected]>, [email protected]
Date: Fri, 08 Sep 2006 09:34:55 -0700
Sorry if I repeat myself, but IMO flaming the Tempe chip and
recommending against boards that use it is too simple and
therefore unfair.
E.g., neither the MVME5500 (Universe, no Tempe) nor the
MVME6100 (Tempe) lets you detect bus errors (neither
PCI nor VME) by means of a machine-check exception,
as you could on, e.g., the MVME2300.

Here are (again) the details:

Not getting an exception as a result of a VME bus error is
not simply a problem of the Tempe chip. It involves the VME
bridge, the PCI hostbridge, the interrupt controller and, most
importantly, the board design.

The Marvell hostbridge and interrupt controller present on both the MVME5500
and the MVME6100 does not support an MCP (nonmaskable, AKA machine-check
exception) wire, and MCP is simply not hooked up on *either* of these boards.
You won't get a machine-check for PCI bus errors either
(but you can use ordinary interrupts to detect them).


In some ways, the Tempe chip is even better than the Universe, because
a careful board designer could route one of the 4 Tempe interrupts directly to
the CPU's MCP input. Unlike the Universe, the Tempe does
generate interrupts for coupled PCI<->VME transactions, so such
a board design could let VME bus errors raise a machine-check.
With the Universe this would not be possible.


Finally, IMHO it is not *that* big of an issue. If we use an interrupt
to signal bus errors then, as you point out, it might be delivered asynchronously
[as it would anyway as soon as posted transactions are involved]
and therefore be harder to debug, but you still get a notification that something
is wrong. It is even possible [for coupled transactions] to suspend the
faulty thread, or to halt the system if the error happens in an ISR.
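
To make the last point concrete, here is a rough sketch in C of the decision a
bus-error interrupt handler could take. All names (berr_action,
bus_error_policy) are made up for illustration and do not come from any BSP;
the assumption is that the handler can tell whether the failed cycle was
coupled and whether it interrupted an ISR.

```c
#include <stdbool.h>

/* Hypothetical outcome of a bus-error interrupt; names are illustrative only. */
typedef enum {
    BERR_LOG_ONLY,      /* posted cycle: culprit unknown, just report it */
    BERR_SUSPEND_TASK,  /* coupled cycle in task context: stop that task */
    BERR_HALT_SYSTEM    /* coupled cycle inside an ISR: nothing to suspend */
} berr_action;

/*
 * Decide what to do when the VME bridge raises its bus-error interrupt.
 *
 * coupled: the failed cycle was a coupled transaction, so the CPU was
 *          stalled on it and the interrupted context is the one at fault.
 * in_isr:  the interrupted context was itself an interrupt handler.
 */
berr_action bus_error_policy(bool coupled, bool in_isr)
{
    if (!coupled) {
        /* Posted writes complete long after the CPU moved on, so the
         * interrupted context is not necessarily the guilty one. */
        return BERR_LOG_ONLY;
    }
    /* An ISR cannot be suspended, so halting is the only safe reaction. */
    return in_isr ? BERR_HALT_SYSTEM : BERR_SUSPEND_TASK;
}
```

A real handler would map these outcomes onto whatever the RTOS offers for
suspending a task or stopping the system; that part is left out here on purpose.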



-- T.



Andrew Johnson wrote:
Kate Feng wrote:
Andrew Johnson wrote about mvme6100:

The only sure-fire way around this problem is to check
the Tempe chip's VMEbus Exception Attributes Register after every
write operation and every read that returns an all-1s bit pattern
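
For illustration, the check described above could look roughly like the C
sketch below. It runs against a simulated bridge and card so that it can be
compiled anywhere. The bit position of VEAT_VES, the accessors veat_read() and
veat_clear(), and the sim_* helpers are assumptions made up for this example
and must be checked against the Tempe manual and the BSP before use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: bit position of the "exception latched" flag; verify in the manual. */
#define VEAT_VES (1u << 31)

/* Simulated exception attributes register and a tiny simulated VME card. */
static uint32_t sim_veat;
#define SIM_WORDS 4u
static uint32_t sim_card[SIM_WORDS];

/* Hypothetical accessors; a real driver would touch the bridge's CSR space. */
static uint32_t veat_read(void) { return sim_veat; }
static void veat_clear(void)    { sim_veat = 0; }

/* Simulated bus cycles: an access beyond the card latches a bus error. */
static uint32_t sim_vme_read32(unsigned idx)
{
    if (idx >= SIM_WORDS) { sim_veat |= VEAT_VES; return 0xFFFFFFFFu; }
    return sim_card[idx];
}

static void sim_vme_write32(unsigned idx, uint32_t val)
{
    if (idx >= SIM_WORDS) { sim_veat |= VEAT_VES; return; }
    sim_card[idx] = val;
}

/* Returns true (and clears the latch) if a bus error was recorded. */
static bool bus_error_latched(void)
{
    if (veat_read() & VEAT_VES) { veat_clear(); return true; }
    return false;
}

/* Write, then always check. On real hardware a posted write would have to
 * be flushed before the check means anything. Returns 0 on success, -1 on
 * bus error. */
int vme_checked_write32(unsigned idx, uint32_t val)
{
    sim_vme_write32(idx, val);
    return bus_error_latched() ? -1 : 0;
}

/* Read, and check only when the data is all 1s, since that is what a failed
 * read returns. Returns 0 on success, -1 on bus error. */
int vme_checked_read32(unsigned idx, uint32_t *out)
{
    assert(out != NULL);
    uint32_t val = sim_vme_read32(idx);
    if (val == 0xFFFFFFFFu && bus_error_latched())
        return -1;
    *out = val;
    return 0;
}
```

Note that a register which legitimately holds all 1s still reads back
successfully; it just costs one extra access to the exception register.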

Just a clarification: Shouldn't most applications be terminated upon bus error?

Yes; this discussion is about how to actually do that in a way that catches the right task and provides a pointer to the code that failed so that engineers and technicians can track down problems quickly.


For those applications, it seems that the overhead for the
VME read/write needs to be considered only inside
the related ISRs or in the related non-ISR routines where
interrupts have to be disabled, which is rare.

Actually I don't think it's that rare. I have counted something like 30 calls to the vxWorks intLock() routine in our R3.13.10 support module area, most of which are protecting code that manipulates at least one card register over the VMEbus while the lock is held. In addition I counted 79 calls to intConnect(), and most of those ISRs will be manipulating VME card registers.


Every one of those drivers must be examined and may have to be modified if we decide to use a Tempe-based CPU board here. That's a lot of work!

The PCIbus Retry and Disconnect cycle terminations that you discussed do not actually stop the data transfer cycle completely; they only permit it to be run again or to take longer to complete than a regular PCIbus single I/O cycle.

- Andrew


