Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: CAC problem between RTEMS and vxWorks
From: Dirk Zimoch <dirk.zimoch@psi.ch>
To: "Hill, Jeff" <johill@lanl.gov>
Cc: EPICS tech-talk <tech-talk@aps.anl.gov>
Date: Fri, 21 Sep 2012 12:01:07 +0200
Hi Wesley

Hill, Jeff wrote:
Hi Wesley,

TCP 129.57.214.101:1024(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1025(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1026(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1027(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1028(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1029(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1030(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1031(): User="rtems", V4.11, 4 Channels, Priority=0

When the RTEMS system is rebooted abruptly, the TCP shutdown interactions may not (probably don’t) occur, and when the new instance of the RTEMS system starts up it begins a new CA circuit over TCP. Which ephemeral port is assigned to the new circuit (i.e. 1024 through 1031 above) is an implementation detail of the IP kernel. On some implementations the same port gets reused, and typically TCP detects an attempt to start a new circuit on the same port as a preexisting circuit, and the server side hangs up. Otherwise, the server side waits the duration of a long timeout, in case the client is just temporarily losing connectivity, before it hangs up. The order of the ephemeral port assignment typically depends on what sockets have been created since rebooting, so you may get different port assignment orderings if many circuits are being created shortly after the IOC reboots.

CAC: Unable to connect because "Connection timed out"

Make certain that the full/half-duplex configurations match between the switches and the IOCs that are involved. If the switch and the IOC don’t match, communication can proceed, but it can be very slow and unreliable. Sometimes you can see this by watching the beacons with casw. On vxWorks you can sometimes see the Ethernet auto-negotiation parameters by typing ifShow. In my experience, some of the vxWorks Ethernet drivers neglect to turn on the continuous auto-negotiation option in the PHY, so if the vxWorks system gets powered up before the switch, it falls back to default auto-negotiation parameters and never tries to auto-negotiate again. This can be a problem because switches are typically set to continuously auto-negotiate.

Another possibility is too many collisions, which can sometimes be seen on vxWorks systems with ifShow, but this is an infrequently experienced problem today on modern switched Ethernet networks.

Jeff

Keep in mind that the number of file descriptors in vxWorks is limited and each "dead" socket connection uses one. The default number is 50. Once they are used up, no new network connection can be made (and no file can be opened). To raise the limit, modify NUM_FILES in the vxWorks BSP configuration.

But even worse, vxWorks may run out of network buffers. If the RTEMS system has monitors on its 4 channels, then EPICS may try to send the monitor events filling the TCP send buffers and using up all available network buffers.

So you may check the following:
* Run inetstatShow and check the Send-Q entries of your "dead" sockets
* Run mbufShow and see if you are short of free buffers
* Run iosFdShow and see how many file descriptors are in use.
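
As a complement to these checks, the casr lines quoted in this thread can be tallied per client to see how many circuits one host is holding on the server. The following is only a minimal sketch in Python, not part of EPICS: the regular expression is derived solely from the casr lines shown in this thread, the names tally_circuits and suspicious_clients are hypothetical, and the threshold of one circuit per client is an assumption (a client can legitimately hold more than one circuit, for example at different priorities).

```python
import re

# Pattern based only on the casr lines quoted in this thread, e.g.
# TCP 129.57.214.101:1024(): User="rtems", V4.11, 4 Channels, Priority=0
# Other EPICS versions may print a different layout (assumption).
CASR_LINE = re.compile(
    r'^TCP\s+(?P<host>[\d.]+):(?P<port>\d+)\([^)]*\):\s+User="(?P<user>[^"]*)"'
)

SAMPLE = """\
TCP 129.57.214.101:1024(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1025(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1026(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1027(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1028(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1029(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1030(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1031(): User="rtems", V4.11, 4 Channels, Priority=0
"""


def tally_circuits(casr_text):
    """Map (client host, user) to the list of client ports seen in casr output."""
    circuits = {}
    for line in casr_text.splitlines():
        m = CASR_LINE.match(line.strip())
        if m:
            key = (m.group("host"), m.group("user"))
            circuits.setdefault(key, []).append(int(m.group("port")))
    return circuits


def suspicious_clients(circuits, max_expected=1):
    """Return clients holding more circuits than expected (threshold is a guess)."""
    return {k: v for k, v in circuits.items() if len(v) > max_expected}


if __name__ == "__main__":
    for (host, user), ports in suspicious_clients(tally_circuits(SAMPLE)).items():
        print(f"{host} ({user}): {len(ports)} circuits on ports {ports}")
```

Run against the casr output from this thread, the sketch reports a single client with eight circuits, which is consistent with old circuits from earlier boots not yet having been closed by the server.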

Jeff, maybe CA should "clean up" the stale sockets instead of waiting for TCP to do that (if possible).

Dirk


-----Original Message-----
From: tech-talk-bounces@aps.anl.gov [mailto:tech-talk-bounces@aps.anl.gov]
On Behalf Of Wesley Moore
Sent: Wednesday, September 19, 2012 10:19 AM
To: EPICS tech-talk
Subject: CAC problem between RTEMS and vxWorks

All,

I'm having issues with an RTEMS client (3.14.11) accessing PVs from a
vxWorks IOC (3.14.8.2).  When the RTEMS IOC is rebooted, sometimes it
doesn't reconnect to the other IOC.  Even after connecting, I'm often
getting timeouts on RTEMS and can't seem to maintain a solid connection
between the two.

# timeouts on RTEMS client IOC
CAC: Unable to connect because "Connection timed out"
CA.Client.Exception...............................................
    Warning: "Virtual circuit disconnect"
    Context: "iocfel8.acc.jlab.org:5064"
    Source File: ../cac.cpp line 1145
    Current Time: Wed Sep 19 2012 11:52:27.183784942
..................................................................

After reboots, it stacks up new socket connections, which isn't helping
matters.

# casr on vxWorks IOC
TCP 129.57.214.101:1024(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1025(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1026(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1027(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1028(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1029(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1030(): User="rtems", V4.11, 4 Channels, Priority=0
TCP 129.57.214.101:1031(): User="rtems", V4.11, 4 Channels, Priority=0


Any help is greatly appreciated.

Wesley



