Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: epicsEvent::invalidSemaphore exception in timerQueue
From: "Jeff Hill" <johill@lanl.gov>
To: "'Dirk Zimoch'" <dirk.zimoch@psi.ch>, "'EPICS'" <tech-talk@aps.anl.gov>
Cc: "'ebner'" <simon.ebner@psi.ch>
Date: Thu, 6 May 2010 09:48:27 -0600
Dirk,

One typically attaches the tornado debugger and requests, for a particular
thread, that it suspend execution at the point where the exception gets
thrown instead of where it gets caught (currently in the epicsThread's last
chance exception handler). This will provide a more useful stack trace.

Are you aware of a precipitating set of events that leads to the demise of
this timer queue? Is there any facility that is being shut down at about the
same time (just before this occurrence)? New device support? Does it happen
shortly after the EPICS system starts?

Typically erroneous use of an invalid semaphore id results from ...
A) memory corruption
B) shutdown order issues where an object is used after it was destroyed
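
To illustrate case B, here is a minimal C++ sketch. It assumes a hypothetical
owner that holds both an event and a worker that uses it. EventSketch,
WorkerSketch, SafeOwner and destroySafeOwner are invented names for this
example, not EPICS classes. C++ destroys data members in reverse order of
declaration, so declaring the event before the worker keeps the event alive
until the worker is gone.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which the sketch objects are destroyed.
std::vector<std::string>& destructionLog() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for an event semaphore wrapper.
struct EventSketch {
    ~EventSketch() { destructionLog().push_back("event"); }
};

// Hypothetical worker that refers to the event for its whole lifetime.
// A real worker would stop and join its thread in the destructor.
struct WorkerSketch {
    explicit WorkerSketch(EventSketch& e) : event(e) {}
    ~WorkerSketch() { destructionLog().push_back("worker"); }
    EventSketch& event;
};

// Members are destroyed in reverse declaration order:
// worker first, then event, so the worker never sees a dead event.
struct SafeOwner {
    SafeOwner() : worker(event) {}
    EventSketch event;
    WorkerSketch worker;
};

// Builds and destroys one owner, returning the observed destruction order.
std::vector<std::string> destroySafeOwner() {
    destructionLog().clear();
    {
        SafeOwner owner;
    }
    return destructionLog();
}
```

Swapping the two member declarations would destroy the event first, which is
the kind of ordering problem that can leave a still-running thread holding an
invalid semaphore id.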

It _would_ be useful to determine what component of the IOC the timer queue
belongs to. We should be able to make that determination if you send the
output from the vxWorks "i" command (hopefully also identifying the task id
of the culprit). The associated component probably can be inferred from the
relative execution priorities of the timer queue to the spawning component.

I am the author of the timer queue (a new, heavily used, feature in R3.14).
The timer queue thread typically spends most of its time waiting, with
timeout, on an event semaphore. The event semaphore gets posted only when
the timer queue needs to reschedule. That particular event semaphore would
only become invalid if the timer queue was being shut down, the timer queue
data structure was corrupted, or if the vxWorks kernel data structures were
corrupted.
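
The wait loop described above can be sketched roughly as follows. This is a
simplified, non-blocking model written for illustration, not the R3.14 source:
SemaphoreSketch, sketchTake, InvalidSemaphoreSketch, waitWithTimeout and
timerThreadStep are all hypothetical names, and the "take" returns immediately
instead of blocking until the timeout expires.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical status codes for a low-level semaphore take;
// these are not the vxWorks or EPICS definitions.
enum class TakeStatus { Ok, Timeout, InvalidId };

// Hypothetical stand-in for an event semaphore.
struct SemaphoreSketch {
    bool valid = true;    // false models a deleted or corrupted semaphore id
    bool posted = false;  // true models a pending post
};

// Models a take with timeout. A real take would block; this one reports
// Timeout immediately when nothing has been posted.
TakeStatus sketchTake(SemaphoreSketch& sem) {
    if (!sem.valid) return TakeStatus::InvalidId;
    if (sem.posted) {
        sem.posted = false;
        return TakeStatus::Ok;
    }
    return TakeStatus::Timeout;
}

// Hypothetical exception type, analogous in role to an invalid-semaphore error.
struct InvalidSemaphoreSketch : std::runtime_error {
    InvalidSemaphoreSketch() : std::runtime_error("invalid semaphore") {}
};

// Returns true if posted and false on timeout; any other failure throws.
bool waitWithTimeout(SemaphoreSketch& sem) {
    switch (sketchTake(sem)) {
    case TakeStatus::Ok:      return true;
    case TakeStatus::Timeout: return false;
    default:                  throw InvalidSemaphoreSketch();
    }
}

enum class WakeReason { Reschedule, ExpireTimers };

// One pass of a timer thread: a post means the schedule changed, a timeout
// means the earliest timer is due. An invalid semaphore propagates as an
// exception, which is fatal to the thread if nothing catches it.
WakeReason timerThreadStep(SemaphoreSketch& sem) {
    return waitWithTimeout(sem) ? WakeReason::Reschedule
                                : WakeReason::ExpireTimers;
}
```

In this model the only path to the exception is an invalid semaphore, which
matches the three causes listed above: shutdown, corruption of the queue's own
data, or corruption of the kernel's data.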

The timer queue class has a show diagnostic member function which is
typically called by the diagnostic member function of its owner. So if you
can find out who the timer queue belongs to then you can invoke its show
function, at increased interest level, to find out if there is generalized
corruption in its data structures. The tornado debugger can also help with
quick surveys for corruption.
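
The owner-calls-member show pattern can be sketched as below. This is only an
illustration of the idea: QueueSketch, OwnerSketch and showToString are
hypothetical names, the output goes to a std::ostream for convenience, and the
rule that the owner passes a reduced level to its members is an assumption of
this sketch.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical queue with a level-controlled diagnostic dump.
class QueueSketch {
public:
    void addTimer(double expireTime) { expireTimes_.push_back(expireTime); }

    // Level 0 prints a summary; higher levels also list each timer.
    void show(std::ostream& out, unsigned level) const {
        out << "queue with " << expireTimes_.size() << " timers\n";
        if (level > 0u) {
            for (double t : expireTimes_) {
                out << "  timer expiring at " << t << "\n";
            }
        }
    }

private:
    std::vector<double> expireTimes_;
};

// Hypothetical owner whose show function forwards to the queue it owns.
class OwnerSketch {
public:
    QueueSketch queue;

    // Level 0 prints only the owner; higher levels descend into the queue
    // with the level reduced by one.
    void show(std::ostream& out, unsigned level) const {
        out << "owner\n";
        if (level > 0u) {
            queue.show(out, level - 1u);
        }
    }
};

// Convenience helper that captures the dump as a string.
std::string showToString(const OwnerSketch& owner, unsigned level) {
    std::ostringstream out;
    owner.show(out, level);
    return out.str();
}
```

Raising the level passed to the owner is what exposes the per-timer detail, so
obviously nonsensical entries in that listing would hint at corruption.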

Also, see Mantis entries 336, 332, 320 which may be unrelated, but
nevertheless do involve fixes to the timer queue facility after your
version.

Jeff
______________________________________________________
Jeffrey O. Hill           Email        johill@lanl.gov
LANL MS H820              Voice        505 665 1831
Los Alamos NM 87545 USA   FAX          505 665 5107

Message content: TSPA


> -----Original Message-----
> From: tech-talk-bounces@aps.anl.gov [mailto:tech-talk-bounces@aps.anl.gov] On Behalf Of Dirk Zimoch
> Sent: Thursday, May 06, 2010 7:51 AM
> To: EPICS
> Cc: ebner
> Subject: epicsEvent::invalidSemaphore exception in timerQueue
> 
> Hi all,
> 
> We are having a strange problem where a TimerQueue task gets suspended
> because of an invalidSemaphore exception and we can't find out where.
> 
> We are using EPICS 3.14.8 on vxWorks.
> 
> Here is the error message:
> 
> > 0x1fb87960 (timerQueue): Unhandled C++ exception resulted in call to terminate
> > epicsThread: Unexpected C++ exception "epicsEvent::invalidSemaphore()" with type "Q210epicsEvent16invalidSemaphore" in thread "timerQueue" at THU MAY 06 2010 10:19:21.310783060
> 
> The dead task:
> > timerQueue a94c08       1fb87960 148 SUSPEND      23d410 1fb87470 3d0002     0
> 
> And the stack trace:
> >> tt 0x1fb87960
> > 243c24 vxTaskEntry    +68 : a94c08 (&epicsThreadCallEntryPoint, 1f9d24fc)
> > a94c84 epicsThreadOnceOsd+174: a7b230 ()
> > a7b230 epicsThreadCallEntryPoint+5c8: __cp_exception_info ()
> > 15e2fc __cp_exception_info+0  : __default_unexpected(void) ()
> > 15e2b8 set_terminate(void (*)(void))+0  : terminate(void) ()
> > 15e2a8 __default_unexpected(void)+0  : cplusTerminate(void) ()
> > 15d068 cplusTerminate(void)+50 : taskSuspend ()
> 
> I found out that epicsEvent::invalidSemaphore is thrown by
> epicsEvent::wait() and its variants when semTake fails for other
> reasons than timeout (which can only be an invalid SEM_ID).
> Unfortunately the stack trace is not very helpful to find out where the
> actual error happened (thanks to the C++ exception mechanism).
> 
> The error happens when (shortly after) the attached SNL program finishes
> the entry block of state "active". We are using seq version 2.0.10.
> 
> Any idea how to find out where the problem really is? Who might own the
> timerQueue? Which corrupted epicsEvent might the timerQueue be waiting for?
> How might the epicsEvent semaphore have become corrupted?
> 
> Dirk


