EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: IOC segmentation fault related to CA security
From: "Kasemir, Kay" <[email protected]>
To: Michael Davidsaver <[email protected]>, "[email protected]" <[email protected]>
Date: Mon, 18 Apr 2016 12:34:47 +0000

Hi Michael:

There's no CA gateway between these IOCs.
The two IOCs are in fact on the same Linux host, and each depends on PVs from the other one.

The IOC that provides a PV used as an input to a channel access security rule shuts down.
The other IOC, which uses that rule, detects the disconnect. It recomputes the access security state, then tries to notify all CA clients about the new read/write permissions.
One of those CA clients was actually the IOC that just shut down. That client is no longer valid, hence the crash.

Thanks,
Kay
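
The sequence described above can be sketched in a few lines of C. This is only a model of the failure mode, not EPICS code and not the fix adopted for the bug: the types node, list and client and the functions list_append, list_delete, client_attach, client_detach and notify_all are all hypothetical names made up for this illustration. It assumes the crash comes from unlinking a client node that is no longer on its list, which is consistent with frame #0 of the backtrace but is not confirmed by the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT the EPICS
 * ELLLIST/ELLNODE types or the CA server's client structures. */
typedef struct node {
    struct node *next;
    struct node *previous;
} node;

typedef struct list {
    node *first;
    node *last;
    int   count;
} list;

typedef struct client {
    node link;       /* first member, so a node* can be cast back to client* */
    int  on_list;    /* 1 while the client is linked into a list */
    int  can_write;  /* last write permission pushed to this client */
    int  notified;   /* number of access-rights notifications received */
} client;

static void list_append(list *l, node *n)
{
    n->next = NULL;
    n->previous = l->last;
    if (l->last) l->last->next = n; else l->first = n;
    l->last = n;
    l->count++;
}

/* Unlink n from l. This dereferences the neighbours of n, so it is
 * only safe when n really is on the list. */
static void list_delete(list *l, node *n)
{
    if (n->previous) n->previous->next = n->next; else l->first = n->next;
    if (n->next) n->next->previous = n->previous; else l->last = n->previous;
    n->next = n->previous = NULL;
    l->count--;
}

static void client_attach(list *l, client *c)
{
    c->on_list = 1;
    c->can_write = 0;
    c->notified = 0;
    list_append(l, &c->link);
}

/* Idempotent detach: a second call for the same client is a no-op
 * instead of a second unlink of a node that is already gone.
 * Returns 1 if the client was unlinked, 0 if it was not on the list. */
static int client_detach(list *l, client *c)
{
    if (!c->on_list) return 0;
    list_delete(l, &c->link);
    c->on_list = 0;
    return 1;
}

/* Recompute step: push the new write permission to every client still
 * on the list and return how many were notified. */
static int notify_all(list *l, int can_write)
{
    int n = 0;
    for (node *p = l->first; p != NULL; p = p->next) {
        client *c = (client *)p;
        c->can_write = can_write;
        c->notified++;
        n++;
    }
    return n;
}
```

In this model, a client that has been detached once is skipped by notify_all, and a second client_detach on it returns 0 rather than touching stale neighbour pointers. A real server would also need locking between the disconnect path and the notification path, which this single-threaded sketch leaves out.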

________________________________________
From: [email protected] <[email protected]> on behalf of Michael Davidsaver <[email protected]>
Sent: Saturday, April 16, 2016 1:43 PM
To: [email protected]
Subject: Re: IOC segmentation fault related to CA security

Hi Matt,

I've created https://bugs.launchpad.net/epics-base/+bug/1571224 for this
bug.

I can recall seeing "dbCa:exceptionCallback" associated with ACF
disconnects before, but don't recall any crashes.  This might just be
chance.  The 'channel "unknown"' certainly stands out.

Would it be easy for you to take a packet capture of the traffic between
these two IOCs when this crash occurs?  This might give some clues about
the specific sequence of events which triggers the crash.

Also, is the communication between these two IOCs direct?  You don't
mention any ca gateway.

Michael


On 04/15/2016 05:27 PM, Pearson, Matthew R. wrote:
> Hi,
>
> I’ve had a few instances of one of my soft IOCs crashing with a segmentation fault when I shut down another IOC that hosts a PV used in the CA security logic of the IOC that crashes.
>
> There is sometimes (but not always) this message printed out before the crash:
>
> dbCa:exceptionCallback stat "Virtual circuit disconnect" channel "unknown" context "cg1d-dassrv1.ornl.gov:5064"
> nativeType DBR_invalid requestType DBR_invalid nativeCount 0 requestCount 0 noReadAccess noWriteAccess
>
> Printing the stack trace:
>
> Core was generated by `../../bin/linux-x86_64/cg1d-parker1 ./st.cmd'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007f5321218599 in ellDelete (pList=0x7f52bc000920, pNode=0x7f52ac008ec0) at ../../../src/libCom/ellLib/ellLib.c:87
> 87              pNode->previous->next = pNode->next;
> Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.149.el6_6.9.x86_64 libgcc-4.4.7-11.el6.x86_64 libstdc++-4.4.7-11.el6.x86_64 ncurses-libs-5.7-3.20090208.el6.x86_64 readline-6.0-4.el6.x86_64
> (gdb) bt
> #0  0x00007f5321218599 in ellDelete (pList=0x7f52bc000920, pNode=0x7f52ac008ec0) at ../../../src/libCom/ellLib/ellLib.c:87
> #1  0x00007f532217c289 in casAccessRightsCB (ascpvt=0x7f52b8000db8, type=asClientCOAR) at ../camessage.c:1111
> #2  0x00007f5321d64122 in asComputePvt (asClientPvt=0x7f52b8000db8) at ../asLibRoutines.c:1014
> #3  0x00007f5321d63ea0 in asComputeAsgPvt (pasg=0x1f91ee0) at ../asLibRoutines.c:940
> #4  0x00007f5321d62419 in asComputeAsg (pasg=0x1f91ee0) at ../asLibRoutines.c:455
> #5  0x00007f5321d60482 in connectCallback (arg=...) at ../asCa.c:99
> #6  0x00007f53214c1301 in oldChannelNotify::disconnectNotify (this=0x7f52d0000d10, guard=...) at ../oldChannelNotify.cpp:112
> #7  0x00007f53214abe30 in nciu::unresponsiveCircuitNotify (this=0x7f5323714010, cbGuard=..., guard=...) at ../nciu.cpp:171
> #8  0x00007f53214b7c38 in tcpiiu::disconnectAllChannels (this=0x7f52d40008c0, cbGuard=..., guard=..., discIIU=...) at ../tcpiiu.cpp:1834
> #9  0x00007f532149a042 in cac::destroyIIU (this=0x7f52d000ed20, iiu=...) at ../cac.cpp:1227
> #10 0x00007f53214b27e3 in tcpSendThread::run (this=0x7f52d4000a00) at ../tcpiiu.cpp:229
> #11 0x00007f532122bc51 in epicsThreadCallEntryPoint (pPvt=0x7f52d4000a08) at ../../../src/libCom/osi/epicsThread.cpp:85
> #12 0x00007f53212333ce in start_routine (arg=0x7f52d400a250) at ../../../src/libCom/osi/os/posix/osdThread.c:385
> #13 0x00007f53204a29d1 in start_thread () from /lib64/libpthread.so.0
> #14 0x00007f53207a08fd in clone () from /lib64/libc.so.6
>
>
> The crashed IOC has a CA access security rule that looks like:
>
> ASG(DEFAULT)
> {
>     INPA("$(P):Scan:Active")
>     RULE(1, READ)
>     RULE(1, WRITE)
>     {
>         CALC("A=0")
>     }
>     RULE(1, WRITE)
>     {
>         UAG(epics, beamline, detector)
>         HAG(beamline)
>         CALC("A=1")
>     }
> }
>
> where $(P):Scan:Active is hosted by the IOC that I’m shutting down.
>
> In addition, there is a CP channel access link between the two IOCs.
>
> I can’t reliably reproduce it, but it’s happened a few times today as I was testing it (stopping and starting the IOC hosting $(P):Scan:Active perhaps 20 times).
>
> Anybody have ideas about this?
>
> Our base version is 3.14.12.4 running on RHEL6.
>
> Cheers,
> Matt
>
>
> Data Acquisition and Control Engineer
> Spallation Neutron Source
> Oak Ridge National Lab
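
For readers unfamiliar with access security files, the ASG(DEFAULT) rule quoted above can be read as a small decision function. The sketch below is a hand-written model in C, not the asLib API: the names access_level, as_request and evaluate_default_asg are hypothetical. It assumes that a CALC whose input is disconnected or invalid evaluates as false, and it ignores field ASL levels on the assumption that a level 1 rule covers the fields of interest.

```c
#include <assert.h>

/* Hypothetical names for illustration; not the asLib API. */
typedef enum {
    ACCESS_NONE  = 0,
    ACCESS_READ  = 1,
    ACCESS_WRITE = 2
} access_level;

typedef struct {
    int    a_valid;      /* 1 if INPA ($(P):Scan:Active) is connected and valid */
    double a;            /* current value of INPA, ignored when a_valid is 0 */
    int    user_in_uag;  /* user is in one of: epics, beamline, detector */
    int    host_in_hag;  /* client host is in the beamline host group */
} as_request;

/* Model of the quoted ASG(DEFAULT): the result is the highest access
 * granted by any rule that matches. */
static access_level evaluate_default_asg(const as_request *r)
{
    /* RULE(1, READ) has no conditions, so read access is always granted. */
    access_level granted = ACCESS_READ;

    /* Assumption: a CALC that uses a bad input evaluates as false. */
    int calc_a_is_0 = r->a_valid && r->a == 0.0;   /* CALC("A=0") */
    int calc_a_is_1 = r->a_valid && r->a == 1.0;   /* CALC("A=1") */

    /* RULE(1, WRITE) { CALC("A=0") }: anyone may write while A is 0. */
    if (calc_a_is_0)
        granted = ACCESS_WRITE;

    /* RULE(1, WRITE) { UAG(...) HAG(...) CALC("A=1") }: while A is 1,
     * only listed users on listed hosts may write. */
    if (calc_a_is_1 && r->user_in_uag && r->host_in_hag)
        granted = ACCESS_WRITE;

    return granted;
}
```

Under these assumptions, losing the connection to $(P):Scan:Active makes both WRITE rules false, so every client drops to read-only. That change in access rights is what has to be announced to all connected CA clients when the hosting IOC goes away.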



ANJ, 15 Jul 2016