Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Stalled CA connection (IOC to CS-Studio archiver)
From: Ralph Lange <ralph.lange@gmx.de>
To: EPICS Core Talk <core-talk@aps.anl.gov>
Date: Thu, 15 Jun 2017 11:37:05 +0200

Hi all,

We have an ongoing issue in a test setup that includes a Linux "Fast Controller" (IP...37) running IOCs (40k records each) on one end and a CS-Studio BEAUTY archiver on a VM (IP...41) on the other end. IOCs are running Base 3.15.5, BEAUTY uses a current JCA/CAJ client.

The CA TCP connection is up, but blocked in both directions:

On the fast controller (...37), netstat shows (second and third columns are Recv-Q and Send-Q)

tcp        0      0 IP...37:5064   0.0.0.0:*      LISTEN      29499/MAG-CYSI
tcp    86888 178656 IP...37:5064   IP...41:40147  ESTABLISHED 29499/MAG-CYSI

On the archiver VM (...41), we see

tcp   495144  70184 IP...41:40147  IP...37:5064   ESTABLISHED 9164/java
tcp        0      0 IP...41:40691  IP...49:5064   ESTABLISHED 9164/java

tcpdump shows no traffic on that connection.

The archive engine logs things like:

2017-06-12 22:17:53.047 WARNING [Thread 30] com.cosylab.epics.caj.impl.CATransport (noSyncSend) - Failed to send message to /IP...37:5064 - buffer full, will retry.

and has not written data to the archive from this IOC for a long time. It is happily archiving data from other connections (e.g. the one shown in line 2 of the netstat output above).

Obviously the TCP connection is blocked and backed up to the other host in both directions.
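
A minimal sketch of how this condition could be spotted automatically, assuming the usual `netstat -tnp` column layout (Proto, Recv-Q, Send-Q, local address, foreign address, state, PID/program). The names `parse_netstat` and `backed_up_both_ways` are hypothetical helpers, not part of any EPICS or CS-Studio tool. A single snapshot with non-zero queues only hints at a stall; the check would have to be repeated over time to confirm that the queues are not draining.

```python
from typing import List, NamedTuple


class TcpConn(NamedTuple):
    recv_q: int   # bytes received by the kernel but not yet read by the application
    send_q: int   # bytes written by the application but not yet acknowledged by the peer
    local: str
    remote: str
    state: str


def parse_netstat(text: str) -> List[TcpConn]:
    """Parse TCP lines of netstat output (assumed layout, see lead-in)."""
    conns = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers, blank lines, and non-TCP entries
        try:
            recv_q, send_q = int(fields[1]), int(fields[2])
        except ValueError:
            continue  # not a connection line after all
        conns.append(TcpConn(recv_q, send_q, fields[3], fields[4], fields[5]))
    return conns


def backed_up_both_ways(conns: List[TcpConn]) -> List[TcpConn]:
    """Return established connections with data pending in both queues."""
    return [
        c for c in conns
        if c.state == "ESTABLISHED" and c.recv_q > 0 and c.send_q > 0
    ]


# The two lines reported on the fast controller, copied from above.
SAMPLE = """\
tcp        0      0 IP...37:5064   0.0.0.0:*      LISTEN      29499/MAG-CYSI
tcp    86888 178656 IP...37:5064   IP...41:40147  ESTABLISHED 29499/MAG-CYSI
"""

if __name__ == "__main__":
    for c in backed_up_both_ways(parse_netstat(SAMPLE)):
        print(f"{c.local} <-> {c.remote}: Recv-Q={c.recv_q} Send-Q={c.send_q}")
```

Running the same check on both hosts at intervals, and comparing the queue sizes between runs, would show whether either side ever reads from its socket.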

The IOC is alive and casr shows all channels as connected.

Why are both sides not taking data out of their receive-Qs?

This is not the first time we have seen this in the test setup. Has anyone seen such situations before? Any ideas on how to proceed in finding out what is happening?

Thanks a lot
~Ralph


Replies:
Re: Stalled CA connection (IOC to CS-Studio archiver) Kasemir, Kay
