Subject: | Stalled CA connection (IOC to CS-Studio archiver) |
From: | Ralph Lange <[email protected]> |
To: | EPICS Core Talk <[email protected]> |
Date: | Thu, 15 Jun 2017 11:37:05 +0200 |
tcp      0      0 IP...37:5064 0.0.0.0:*     LISTEN      29499/MAG-CYSI
tcp  86888 178656 IP...37:5064 IP...41:40147 ESTABLISHED 29499/MAG-CYSI
On the archiver VM (...41), we see
tcp 495144 70184 IP...41:40147 IP...37:5064 ESTABLISHED 9164/java
tcp 0 0 IP...41:40691 IP...49:5064 ESTABLISHED 9164/java
tcpdump shows no traffic on that connection.
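To watch for this state over time, the netstat output can be filtered for established connections whose receive and send queues are both non-empty. The following is only a minimal sketch: it assumes the Linux `netstat -tnp` column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name), and the names `parse_netstat` and `backed_up_both_ways` are made up for illustration. The sample input is the archiver-side output quoted above.

```python
from typing import List, NamedTuple


class Sock(NamedTuple):
    recv_q: int   # bytes received by the kernel but not yet read by the process
    send_q: int   # bytes sent by the process but not yet acknowledged by the peer
    local: str
    remote: str
    state: str
    owner: str    # "PID/Program name" column, or "-" if absent


def parse_netstat(text: str) -> List[Sock]:
    """Parse TCP lines of `netstat -tnp`-style output (assumed column order)."""
    socks = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        owner = fields[6] if len(fields) > 6 else "-"
        socks.append(Sock(int(fields[1]), int(fields[2]),
                          fields[3], fields[4], fields[5], owner))
    return socks


def backed_up_both_ways(socks: List[Sock]) -> List[Sock]:
    """Return established connections with data queued in both directions."""
    return [s for s in socks
            if s.state == "ESTABLISHED" and s.recv_q > 0 and s.send_q > 0]


# Archiver-side lines as quoted in this message (addresses abbreviated as in the original).
SAMPLE = """\
tcp 495144 70184 IP...41:40147 IP...37:5064 ESTABLISHED 9164/java
tcp 0 0 IP...41:40691 IP...49:5064 ESTABLISHED 9164/java
"""

if __name__ == "__main__":
    for s in backed_up_both_ways(parse_netstat(SAMPLE)):
        print(f"{s.local} <-> {s.remote}: Recv-Q={s.recv_q} Send-Q={s.send_q} ({s.owner})")
```

Running this periodically on both hosts and comparing successive queue sizes would show whether the queues are completely frozen or still draining slowly.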
The archive engine logs things like:
2017-06-12 22:17:53.047 WARNING [Thread 30] com.cosylab.epics.caj.impl.CATransport (noSyncSend) - Failed to send message to /IP...37:5064 - buffer full, will retry.
and has not written data to the archive from this IOC for a long time. It is happily archiving data from other connections (e.g. the one shown in line 2 of the netstat output above).
Obviously, the TCP connection is blocked and data is backed up in both directions: the Recv-Q and Send-Q are non-empty on both hosts.
The IOC is alive and casr shows all channels as connected.
Why are both sides not taking data out of their receive-Qs?
This is not the first time this has happened to us in this test setup. Has anyone seen such situations before? Any ideas on how to proceed in finding out what is happening?
Thanks a lot
~Ralph