Experimental Physics and Industrial Control System
|
Hi,
ITER is experiencing issues with JCA and CAJ, leading to a CSS
application freezing/crashing or running happily, depending on a
bugfix level change in EPICS base on the server side.
Nadine Utzel and I were running a few tests, showing the
questionable behaviour.
Setup 1:
Server: Gateway 2.0.4.0 using Base
3.14.12.2
Client: CSS using JCA 2.3.6 with JNI to
Base 3.14.12.3
When opening panels, or switching tabs in BOY tabbed containers,
there are situations where the gateway prints:
CAS:
Sep 03 10:17:35 !!! Errlog message received (message is
above)
CAS Request: utzeln on 4501WS-CC-0006.codac.iter.org:
cmd=2 cid=135 typ=34 cnt=1 psz=0 avail=1b7
bad resource id in
"../../../../src/cas/generic/casStrmClient.cc" at line 2203
Sep 03 10:17:35 !!! Errlog message received (message is
above)
filename="../../../../src/cas/generic/st/casStreamOS.cc"
line number=479
Bad resource identifier - unexpected problem with
client's input - forcing disconnect
Sep 03 10:17:35 !!! Errlog message received (message is
above)
while the CSS console shows:
2013-09-03 08:17:35.185 WARNING [Thread 40]
org.csstudio.utility.pv.epics.ContextErrorHandler
(contextException) - Channel Access Exception from
gov.aps.jca.jni.ThreadSafeContext@179bf1b3: Status: Bad event
subscription (monitor) identifier
Info: host=ca-gateway-util.codac.iter.org:5064 ctx=Bad
Resource ID=439 detected at
../../../../src/cas/generic/casStrmClient.cc.2203
file: null at line 0
2013-09-03 08:17:35.186 WARNING [Thread 58]
org.csstudio.utility.pv.epics.ContextErrorHandler
(contextException) - Channel Access Exception from
gov.aps.jca.jni.ThreadSafeContext@179bf1b3: Status: Virtual
circuit disconnect
Info: ca-gateway-util.codac.iter.org:5064
file: ../cac.cpp at line 1214
2013-09-03 08:17:35.442 WARNING [Thread 39]
org.csstudio.utility.pv.epics.JCACommandThread (run) -
JCACommandThread exception
java.lang.IllegalStateException: Invalid channel
at
gov.aps.jca.jni.JNIChannel.assertState(JNIChannel.java:71)
at
gov.aps.jca.jni.JNIChannel.getConnectionState(JNIChannel.java:221)
at
org.csstudio.utility.pv.epics.EPICS_V3_PV$3.run(Unknown Source)
at
org.csstudio.utility.pv.epics.JCACommandThread.run(Unknown
Source)
[...]
The last exception gets repeated many times, probably once per PV.
Many channels reconnect, some stay disconnected and will never
connect.
Sometimes that last repeated exception on the client side does not
occur.
The bad id followed by server disconnect also causes exceptions to
be printed to STDERR of CSS:
2013-09-03
08:24:22.335 WARNING [Thread 102]
org.csstudio.utility.pv.epics.ContextErrorHandler
(contextException) - Channel Access Exception from
gov.aps.jca.jni.ThreadSafeContext@179bf1b3:
Status: Bad event subscription (monitor) identifier
Info: host=ca-gateway-util.codac.iter.org:5064 ctx=Bad
Resource ID=6334 detected at
../../../../src/cas/generic/casStrmClient.cc.2203
file: null at line 0
2013-09-03 08:24:22.336 WARNING [Thread 111]
org.csstudio.utility.pv.epics.ContextErrorHandler
(contextException) - Channel Access Exception from
gov.aps.jca.jni.ThreadSafeContext@179bf1b3:
Status: Virtual circuit disconnect
Info: ca-gateway-util.codac.iter.org:5064
file: ../cac.cpp at line 1214
Setup 2:
Server: Gateway 2.0.4.0 using Base
3.14.12.2
Client: CSS using JCA 2.3.6 with CAJ
1.1.10
The gateway shows similar error messages, but always in pairs of
two:
CAS:
Sep 03 10:44:50 !!! Errlog message received (message is
above)
bad resource id in
"../../../../src/cas/generic/casStrmClient.cc" at line 2203
Sep 03 10:44:50 !!! Errlog message received (message is
above)
CAS Request: utzeln on 4501WS-CC-0006.codac.iter.org:
cmd=2 cid=135 typ=34 cnt=1 psz=0 avail=1ca
filename="../../../../src/cas/generic/st/casStreamOS.cc"
line number=479
Bad resource identifier - unexpected problem with
client's input - forcing disconnect
Sep 03 10:44:50 !!! Errlog message received (message is
above)
CAS:
Sep 03 10:44:51 !!! Errlog message received (message is
above)
CAS Request: utzeln on 4501WS-CC-0006.codac.iter.org:
cmd=2 cid=204 typ=17 cnt=1 psz=0 avail=31d
bad resource id in
"../../../../src/cas/generic/casStrmClient.cc" at line 2203
Sep 03 10:44:51 !!! Errlog message received (message is
above)
filename="../../../../src/cas/generic/st/casStreamOS.cc"
line number=479
Bad resource identifier - unexpected problem with
client's input - forcing disconnect
Sep 03 10:44:51 !!! Errlog message received (message is
above)
CSS freezes immediately and has to be killed manually (no access to
CSS console). STDERR shows many times (once per PV?):
2013-09-03 08:47:01.439 SEVERE [Thread 41]
com.cosylab.epics.caj.impl.CATransport (processRead) -
java.lang.UnsupportedOperationException
at
java.nio.ByteBuffer.array(ByteBuffer.java:959)
at
com.cosylab.epics.caj.impl.handlers.ExceptionResponse.internalHandleResponse(ExceptionResponse.java:130)
at
com.cosylab.epics.caj.impl.handlers.AbstractCAResponseHandler.handleResponse(AbstractCAResponseHandler.java:110)
at
com.cosylab.epics.caj.impl.CAResponseHandler.handleResponse(CAResponseHandler.java:139)
at
com.cosylab.epics.caj.impl.CATransport.processRead(CATransport.java:530)
at
com.cosylab.epics.caj.impl.CATransport.processRead(CATransport.java:412)
at
com.cosylab.epics.caj.impl.CATransport.handleEvent(CATransport.java:350)
at
com.cosylab.epics.caj.impl.reactor.lf.LeaderFollowersHandler.handleEvent(LeaderFollowersHandler.java:77)
at
com.cosylab.epics.caj.impl.reactor.Reactor.processInternal(Reactor.java:400)
at
com.cosylab.epics.caj.impl.reactor.Reactor.process(Reactor.java:284)
at
com.cosylab.epics.caj.impl.reactor.lf.LeaderFollowersHandler.run(LeaderFollowersHandler.java:91)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
[...]
followed by the really bad guy (when CSS freezes):
2013-09-03 08:49:15.736 WARNING [Thread
41] org.csstudio.utility.pv.epics.ContextErrorHandler
(contextException) - Channel Access Exception from
com.cosylab.epics.caj.CAJContext@15546ea6: Virtual circuit
disconnect
2013-09-03 08:49:15.757 SEVERE [Thread 1]
org.csstudio.utility.pv.epics.EPICS_V3_PV (handleConnected) -
UTIL-S15-AG91:MUT3-JT1 connection handling error
java.lang.IllegalStateException: transport closed
at
com.cosylab.epics.caj.impl.CATransport.submit(CATransport.java:827)
at
com.cosylab.epics.caj.impl.requests.AbstractCARequest.submit(AbstractCARequest.java:88)
at
com.cosylab.epics.caj.impl.requests.ReadNotifyRequest.submit(ReadNotifyRequest.java:171)
at
com.cosylab.epics.caj.CAJChannel.get(CAJChannel.java:952)
at
org.csstudio.utility.pv.epics.EPICS_V3_PV.handleConnected(Unknown
Source)
at
org.csstudio.utility.pv.epics.EPICS_V3_PV.connect(Unknown
Source)
at
org.csstudio.utility.pv.epics.EPICS_V3_PV.start(Unknown Source)
at
org.csstudio.opibuilder.editparts.PVWidgetEditpartDelegate.startPVs(Unknown
Source)
at
org.csstudio.opibuilder.editparts.AbstractPVWidgetEditPart.activate(Unknown
Source)
at
org.csstudio.opibuilder.widgets.editparts.TextUpdateEditPart.activate(Unknown
Source)
at
org.eclipse.gef.editparts.AbstractEditPart.activate(AbstractEditPart.java:160)
at
org.eclipse.gef.editparts.AbstractGraphicalEditPart.activate(AbstractGraphicalEditPart.java:195)
at
org.csstudio.opibuilder.editparts.AbstractBaseEditPart.activate(Unknown
Source)
at
org.csstudio.opibuilder.widgets.editparts.GroupingContainerEditPart.activate(Unknown
Source)
at
org.eclipse.gef.editparts.AbstractEditPart.activate(AbstractEditPart.java:160)
at
org.eclipse.gef.editparts.AbstractGraphicalEditPart.activate(AbstractGraphicalEditPart.java:195)
at
org.csstudio.opibuilder.editparts.AbstractBaseEditPart.activate(Unknown
Source)
at
org.csstudio.opibuilder.widgets.editparts.GroupingContainerEditPart.activate(Unknown
Source)
at
org.eclipse.gef.editparts.AbstractEditPart.activate(AbstractEditPart.java:160)
at
org.eclipse.gef.editparts.AbstractGraphicalEditPart.activate(AbstractGraphicalEditPart.java:195)
at
org.csstudio.opibuilder.editparts.AbstractBaseEditPart.activate(Unknown
Source)
at
org.csstudio.opibuilder.widgets.editparts.TabEditPart.activate(Unknown
Source)
at
org.eclipse.gef.editparts.AbstractEditPart.activate(AbstractEditPart.java:160)
at
org.eclipse.gef.editparts.AbstractGraphicalEditPart.activate(AbstractGraphicalEditPart.java:195)
at
org.csstudio.opibuilder.editparts.AbstractBaseEditPart.activate(Unknown
Source)
at
org.csstudio.opibuilder.editparts.DisplayEditpart.activate(Unknown
Source)
at
org.eclipse.gef.editparts.AbstractEditPart.addChild(AbstractEditPart.java:215)
at
org.eclipse.gef.editparts.SimpleRootEditPart.setContents(SimpleRootEditPart.java:105)
at
org.eclipse.gef.ui.parts.AbstractEditPartViewer.setContents(AbstractEditPartViewer.java:617)
at
org.eclipse.gef.ui.parts.AbstractEditPartViewer.setContents(AbstractEditPartViewer.java:626)
at
org.csstudio.opibuilder.runmode.OPIRuntimeDelegate.init(Unknown
Source)
at
org.csstudio.opibuilder.runmode.OPIRunner.init(Unknown Source)
at
org.csstudio.opibuilder.runmode.OPIRunner.setOPIInput(Unknown
Source)
at
org.csstudio.opibuilder.runmode.RunModeService.replaceOPIRuntimeContent(Unknown
Source)
at
org.csstudio.opibuilder.widgetActions.OpenDisplayAction.openOPI(Unknown
Source)
at
org.csstudio.opibuilder.widgetActions.AbstractOpenOPIAction.run(Unknown
Source)
at
org.csstudio.opibuilder.widgets.editparts.Draw2DButtonEditPartDelegate$1.actionPerformed(Unknown
Source)
at
org.csstudio.swt.widgets.figures.ActionButtonFigure.fireActionPerformed(Unknown
Source)
at
org.csstudio.swt.widgets.figures.ActionButtonFigure$ButtonEventHandler.mouseReleased(Unknown
Source)
at
org.eclipse.draw2d.Figure.handleMouseReleased(Figure.java:944)
at
org.eclipse.draw2d.SWTEventDispatcher.dispatchMouseReleased(SWTEventDispatcher.java:267)
at
org.eclipse.gef.ui.parts.DomainEventDispatcher.dispatchMouseReleased(DomainEventDispatcher.java:374)
at
org.eclipse.draw2d.LightweightSystem$EventHandler.mouseUp(LightweightSystem.java:548)
at
org.eclipse.swt.widgets.TypedListener.handleEvent(TypedListener.java:219)
at
org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:84)
at
org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1258)
at
org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:3588)
at
org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3209)
at
org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:2701)
at
org.eclipse.ui.internal.Workbench.runUI(Workbench.java:2665)
at
org.eclipse.ui.internal.Workbench.access$4(Workbench.java:2499)
at
org.eclipse.ui.internal.Workbench$7.run(Workbench.java:679)
at
org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
at
org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:668)
at
org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149)
at
org.csstudio.utility.product.Workbench.runWorkbench(Unknown
Source)
at
org.csstudio.startup.application.Application.startApplication(Unknown
Source)
at
org.csstudio.startup.application.Application.start(Unknown
Source)
at
org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
[...]
Now for the fun part:
Setup 3:
Server: Gateway 2.0.4.0 using Base
3.14.12.3
Client: CSS using JCA 2.3.6 with CAJ
1.1.10
No errors whatsoever.
Bottom line:
Switching the CAS CA server from base 3.14.12.2 to 3.14.12.3
determines if the pure Java CA client will die a horrible death or
just work fine.
Considering that Channel Access is the main separation layer that
enables clients and servers of control systems to be updated
independently, and that C/C++ Channel Access works reliably across
virtually any combination of Base between 3.13 and 3.15, I would say
this is very bad behaviour and should be considered a serious bug.
I know that the client is not using the latest version of CAJ,
though. Has this issue been addressed?
Has anyone else seen this? I assume this is connected to LP issue
730720 [1]?
The only relevant change in CAS was adding support for the
DBE_PROPERTY flag.
Thanks a lot,
~Ralph
[1]
https://bugs.launchpad.net/epics-base/+bug/730720
|
- Replies:
- RE: Serious issue with JCA / CAJ Mark Rivers
- Re: Serious issue with JCA / CAJ Michael Davidsaver
- Navigate by Date:
- Prev:
Vacancy at Diamond Light Source for Accelerator Control Systems Leader mark.heron
- Next:
RE: Serious issue with JCA / CAJ Mark Rivers
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
<2013>
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
- Navigate by Thread:
- Prev:
Vacancy at Diamond Light Source for Accelerator Control Systems Leader mark.heron
- Next:
RE: Serious issue with JCA / CAJ Mark Rivers
- Index:
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
<2013>
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
|
ANJ, 20 Apr 2015 |
·
Home
·
News
·
About
·
Base
·
Modules
·
Extensions
·
Distributions
·
Download
·
·
Search
·
EPICS V4
·
IRMIS
·
Talk
·
Bugs
·
Documents
·
Links
·
Licensing
·
|