Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Channel Access monitoring tools
From: "Hill, Jeff" <johill@lanl.gov>
To: "lavender@agni.phys.iit.edu" <lavender@agni.phys.iit.edu>, "tech-talk@aps.anl.gov" <tech-talk@aps.anl.gov>
Date: Mon, 14 May 2012 15:29:41 +0000
Hi,

> 2.  Call ca_pend_io() to make sure the request is sent on its way.

ca_pend_io() does also perform an implicit flush, but it is not typically
used for that purpose alone if a flush is all that is needed. It also turns out
that the UDP-based search messages are sent periodically, and CA internally
takes care of flushing them out (when you periodically call ca_poll()).
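
As a rough illustration of the "poll until connected or give up" pattern discussed here, the following is a minimal sketch, not the code from mx_epics.c. It assumes the connection handler sets an int flag, and it takes the polling step as a callback so it can be exercised without a network. In a real client that callback would presumably be a thin wrapper around ca_poll(), with one ca_flush_io() issued before the loop. The names wait_for_flag, fake_ca and fake_poll are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Monotonic clock in seconds, so wall-clock adjustments cannot distort the timeout. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Call poll_fn(arg) repeatedly until *flag becomes nonzero or timeout_s
 * seconds have elapsed, sleeping interval_s seconds between calls.
 * Returns 1 if the flag was set, 0 on timeout.
 * Assumption: in a CA client, poll_fn wraps ca_poll() and the flag is set
 * by the connection state change handler.
 */
int wait_for_flag(const volatile int *flag, void (*poll_fn)(void *),
                  void *arg, double timeout_s, double interval_s)
{
    assert(flag != NULL && poll_fn != NULL && interval_s >= 0.0);

    const double deadline = now_seconds() + timeout_s;
    struct timespec pause;
    pause.tv_sec = (time_t)interval_s;
    pause.tv_nsec = (long)((interval_s - (double)pause.tv_sec) * 1e9);

    for (;;) {
        poll_fn(arg);                 /* let the library process pending I/O */
        if (*flag)
            return 1;                 /* handler reported the connection */
        if (now_seconds() >= deadline)
            return 0;                 /* caller decides whether to retry */
        nanosleep(&pause, NULL);
    }
}

/* Hypothetical stand-in for the CA side: "connects" after a fixed number of polls. */
struct fake_ca {
    int calls_left;
    int *flag;
};

void fake_poll(void *arg)
{
    struct fake_ca *f = arg;
    if (f->calls_left > 0 && --f->calls_left == 0)
        *f->flag = 1;
}
```

Keeping the timeout decision in one place like this makes it easier to log how long each attempt actually waited, which may help when comparing the Debian 5.0 and 6.0 clients.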

> At present, the only client platform that I have that has these timeouts
> is Debian 6.0 Linux (Squeeze).  The same hardware running Debian 5.0
> did not have these problems.  The clients are using EPICS Base 3.14.10
> and the IOC is using 3.14.12.1.

Are both installations the same word size (i.e. both 32-bit or both 64-bit)? Is gcc
at a significantly different version on the two OSes? There might also be
something changing in the NIC driver between Debian versions.

> I am assuming that what I need to do here is to monitor the network
> traffic between the Debian 6.0 machine and the IOC and compare it
> to the traffic between a Debian 5.0 machine and the same IOC.  

Do you see these same issues if you change EPICS_CA_AUTO_ADDR_LIST to
NO and then manually configure the server's address(es) in
EPICS_CA_ADDR_LIST? (This would fault-isolate an issue with broadcasting.)
It might be interesting to monitor ICMP (IP error diagnostic) traffic
on your network with Wireshark or some other Ethernet sniffer. You
could look at all ICMP messages and also at only the ICMP messages that have
the CA client's host as a destination address. I have seen problems here
getting a client to connect reliably when there were too many hosts
on the LAN with misconfigured subnet masks. There were so many ICMP
error responses to each search broadcast that the finite-length UDP
input queue was saturating.
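
For the address-list test, the two variables are normally just exported in the shell before starting the client. If it is more convenient to do it from inside the program, something like the sketch below might work. It assumes the variables are read from the process environment when the CA context is created, so it would have to run before ca_context_create(). The function name use_explicit_ca_addr_list is hypothetical, and 192.0.2.10 in the usage note is a documentation-only placeholder for the IOC's address.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/*
 * Turn off CA's automatically generated broadcast address list and search
 * only the addresses in addr_list (space-separated, as in the shell).
 * Returns 0 on success, -1 if addr_list is empty or the environment
 * could not be updated.
 * Assumption: call this before the CA client context is created.
 */
int use_explicit_ca_addr_list(const char *addr_list)
{
    if (addr_list == NULL || addr_list[0] == '\0')
        return -1;
    if (setenv("EPICS_CA_AUTO_ADDR_LIST", "NO", 1) != 0)
        return -1;
    if (setenv("EPICS_CA_ADDR_LIST", addr_list, 1) != 0)
        return -1;
    return 0;
}
```

A call such as use_explicit_ca_addr_list("192.0.2.10") would then replace broadcast searches with unicast searches to that one host. If the timeouts disappear, broadcasting becomes the prime suspect.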

I assume that other types of network traffic work (i.e. ping, ssh, telnet,
http) without issues on this network?

Jeff

> -----Original Message-----
> From: tech-talk-bounces@aps.anl.gov [mailto:tech-talk-bounces@aps.anl.gov]
> On Behalf Of Bill Lavender
> Sent: Friday, May 11, 2012 2:21 PM
> To: tech-talk@aps.anl.gov
> Subject: Channel Access monitoring tools
> 
> One of the installations I am responsible for is having trouble with
> unexplained PV connection failures.  What I have is a program that needs
> to do things in a serialized fashion.  In other words, for each of
> several PVs, it needs to start a connection to a PV, wait for the
> connection to complete, and then immediately use that PV.
> 
> At present, I am doing something like this:
> 
> 1.  Invoke ca_create_channel() with a connection state change handler.
>     The connection state change handler sets a flag to indicate that
>     the connection has completed.
> 
> 2.  Call ca_pend_io() to make sure the request is sent on its way.
> 
> 3.  I then wait inside a loop periodically calling ca_poll() until
>     my connection flag has been set.
> 
> 4.  If the loop has been looping for too long, I declare a timeout
>     and call ca_clear_channel() to get rid of the existing unconnected
>     PV and then call ca_pend_io() to send that request on the way.
> 
>     My code did not originally have this ca_clear_channel() call,
>     but I added it to see if it would help and in the name of
>     preventing memory leaks.  It didn't help.
> 
> 5.  I then go back to step one to create a new channel.
> 
> 6.  If I execute the outer loop from step 1 to step 5 too many times
>     then I give up and tell the user that I have timed out.
> 
> Increasing the timeouts has not helped.  For debugging, I have tried
> timeouts as long as 10 seconds and have not seen a change in the frequency
> of connection timeouts.
> 
> At present, the only client platform that I have that has these timeouts
> is Debian 6.0 Linux (Squeeze).  The same hardware running Debian 5.0
> did not have these problems.  The clients are using EPICS Base 3.14.10
> and the IOC is using 3.14.12.1.
> 
> I am assuming that what I need to do here is to monitor the network
> traffic between the Debian 6.0 machine and the IOC and compare it
> to the traffic between a Debian 5.0 machine and the same IOC.  I see
> that there is a Channel Access plugin for Wireshark that I hope will
> be helpful.  Are there other things that I should be trying?
> 
> The Channel Access code is wrapped in some code of my own, so it will
> not look quite the same as raw Channel Access code, but if you want
> to look at it anyway, look at the function mx_epics_pv_connect()
> in this file
> 
>   http://svn.csrri.iit.edu/mx/trunk/modules/epics/mx_epics.c
> 
> Thanks.
> 
> Bill Lavender
> lavender@agni.phys.iit.edu



Replies:
Re: Channel Access monitoring tools Bill Lavender
References:
Channel Access monitoring tools Bill Lavender

ANJ, 18 Nov 2013