Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: R3.13.10 ca_event problem [sls]
From: Andrew Johnson <anj@aps.anl.gov>
To: Al Honey <ahoney@keck.hawaii.edu>
Cc: tech-talk@aps.anl.gov
Date: Wed, 25 Jan 2006 14:52:44 -0600
Al Honey wrote:

Seems to me that the overflow affects taskWd in such a way that the memory corruption occurs at a low level and not from within the application code (two different memory maps on two different IOCs with two wildly different EPICS versions generating the same invalid instruction address). If taskWd simply attempts to suspend the offending task then does that mean the corruption occurs when the ring buffer overflow message is generated (perhaps something as simple as a malformed log message)?

taskWd is a watchdog task that exists to monitor a set of tasks and report if/when they die. It was once used to restart some such tasks automatically, but that functionality is no longer used. Thus it isn't doing any task suspends at all, contrary to your assumption, although it does periodically look at the TCBs (Task Control Blocks) of all its tasks. I suspect the ring buffer overflow is related to the CA_event task dying; it's just a coincidence that taskWd is informing you of that occurrence at about the same time - that's what it's there for.
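To illustrate the "monitor and report only" behaviour described above, here is a toy sketch in Python. It is not the EPICS taskWd code or its API: the class `TaskWatchdog`, its `insert` and `poll` methods, and the use of Python threads in place of vxWorks tasks are all assumptions made for illustration. The point is only that the watchdog observes and reports, and never suspends or restarts anything.

```python
import threading


class TaskWatchdog:
    """Toy watchdog (hypothetical, not EPICS taskWd): reports dead tasks once."""

    def __init__(self, report):
        self._report = report   # callable invoked with the task name
        self._tasks = {}        # task name -> threading.Thread

    def insert(self, name, thread):
        """Register a task to be monitored."""
        self._tasks[name] = thread

    def poll(self):
        """Look at every monitored task; report and forget any that died.

        The watchdog only observes: it never suspends or restarts a task.
        """
        for name, thread in list(self._tasks.items()):
            if not thread.is_alive():
                self._report(name)
                del self._tasks[name]
```

A real watchdog would call something like `poll()` periodically from its own task; here the caller drives it by hand so the behaviour is easy to follow.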


It's possible that something is corrupting the CA_event task's TCB. If this happens again, do a 'd' of the memory area around the tid to see if there's any ASCII text nearby, and have a look for where it might have started from if it's a long ASCII stream - if you're lucky that might give you some hints of the source.
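If the memory around the tid can be captured off the IOC (for example by saving the raw bytes to a file), the search for ASCII text could be done offline. The following is a minimal sketch under that assumption; the function name `ascii_runs`, its parameters, and the idea of working from a saved `bytes` object are hypothetical and not part of vxWorks or EPICS.

```python
def ascii_runs(dump: bytes, base_addr: int = 0, min_len: int = 8):
    """Yield (address, text) for each run of printable ASCII in a dump.

    dump      -- raw bytes captured from the suspect memory area (assumed)
    base_addr -- address of dump[0] on the target, used to label each run
    min_len   -- ignore runs shorter than this many bytes
    """
    start = None
    for i, byte in enumerate(dump):
        if 32 <= byte < 127:            # printable ASCII, space through '~'
            if start is None:
                start = i               # a new run begins here
        else:
            if start is not None and i - start >= min_len:
                yield base_addr + start, dump[start:i].decode("ascii")
            start = None
    # Handle a run that extends to the very end of the dump.
    if start is not None and len(dump) - start >= min_len:
        yield base_addr + start, dump[start:].decode("ascii")
```

The address of the earliest long run is the interesting part: it suggests where a stray string write may have begun before it ran over the TCB.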

- Andrew
--
* * Matt Santos / / For a Brighter America * *

