EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: CA disconnect mechanism
From: "Jeff Hill" <[email protected]>
To: "'Kay-Uwe Kasemir'" <[email protected]>
Cc: "EPICS-tech-talk" <[email protected]>
Date: Fri, 6 Oct 2006 16:33:58 -0600
Hi Kay,

> Can you confirm and clarify the following about the CA
> connection handling in case of network errors?

All of the following describes R3.14 behavior. Behavior differs subtly in
R3.13 and in earlier R3.14 releases.

Servers only close connections when:
A) There is a protocol violation.
B) The TCP keep-alive socket option kicks in. For a TCP keep-alive
disconnect the circuit must first be detected to be idle and subsequently
be detected to be unresponsive. TCP keep-alive timers are typically
configured globally in the OS for all TCP circuits.
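
For illustration only, here is roughly what switching on the keep-alive
socket option looks like. This is a minimal Python sketch, not code from the
CA server; the helper name make_keepalive_socket is hypothetical. It only
enables the option, and the idle time and probe settings are left at
whatever the OS provides.

```python
import socket

def make_keepalive_socket():
    """Create a TCP socket with the keep-alive option switched on.

    Hypothetical helper for illustration; not part of EPICS.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE only enables probing of idle circuits. How long a
    # circuit must be idle, and how many probes may go unanswered, is
    # normally taken from OS-wide settings.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock

sock = make_keepalive_socket()
enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```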

The client library detects circuit disconnects via the socket library. The
cause might be a peer disconnect, a TCP keep-alive timer disconnect, or one
of many other reasons. When a circuit disconnects, any channels attached to
it go to the "server needs to be located" state (see below).

The client library detects circuit unresponsiveness using these criteria:
O First, there must be no beacon from the server for EPICS_CA_CONN_TMO
seconds.
O Second, a response to an are-you-there query does not arrive within 5
seconds.
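
The two criteria can be written as a small predicate. This is a sketch in
Python under stated assumptions, not the client library's code: the function
name, its arguments, and the constant ECHO_TIMEOUT are hypothetical, and an
are-you-there reply that has not arrived is represented by None.

```python
ECHO_TIMEOUT = 5.0  # seconds allowed for the are-you-there reply

def circuit_unresponsive(secs_since_beacon, conn_tmo, echo_reply_delay):
    """Return True if a circuit would be judged unresponsive.

    secs_since_beacon: seconds since the last beacon from the server
    conn_tmo:          value of EPICS_CA_CONN_TMO, in seconds
    echo_reply_delay:  seconds the are-you-there reply took, or None
                       if no reply has arrived (hypothetical encoding)
    """
    # First criterion: no beacon for EPICS_CA_CONN_TMO seconds.
    if secs_since_beacon < conn_tmo:
        return False
    # Second criterion: the reply did not arrive within 5 seconds.
    return echo_reply_delay is None or echo_reply_delay > ECHO_TIMEOUT
```

With a 30 second timeout, for example, a circuit whose last beacon was 31
seconds ago is still considered responsive if the are-you-there reply came
back after 2 seconds.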

The client library will keep an unresponsive (as judged by the above
criteria) circuit in a disconnected state until a response to the
"are-you-there" query comes back within 5 seconds. If the response to the
"are-you-there" query is late the library will immediately reissue a fresh
"are-you-there" request. 

The application's per-channel disconnect handler is called in either of the
following situations (in response to whichever of them occurs first):
1) the channel's circuit is deemed to be unresponsive
2) the channel's circuit disconnects

The application's per-channel connect handler is called in either of the
following situations (in response to whichever of them occurs first):
1) the channel's circuit is deemed to be responsive
2) the channel's circuit connects
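
The "whichever of them is first" rule means the application sees one
callback per transition, not one per underlying event. The Python sketch
below illustrates that idea only; the Channel class, the event names, and
the handler signatures are hypothetical and are not the CA client API.

```python
class Channel:
    """Toy model of when the per-channel handlers fire (hypothetical)."""

    def __init__(self, on_connect, on_disconnect):
        self.on_connect = on_connect
        self.on_disconnect = on_disconnect
        self.up = False  # what the application has last been told

    def circuit_event(self, event):
        going_down = event in ("unresponsive", "disconnected")
        going_up = event in ("responsive", "connected")
        if going_down and self.up:
            # First of the two "down" situations wins; a later one
            # does not call the disconnect handler again.
            self.up = False
            self.on_disconnect()
        elif going_up and not self.up:
            # Likewise for the two "up" situations.
            self.up = True
            self.on_connect()

log = []
chan = Channel(lambda: log.append("connect"),
               lambda: log.append("disconnect"))
for ev in ["connected", "unresponsive", "disconnected", "connected"]:
    chan.circuit_event(ev)
# log is now ["connect", "disconnect", "connect"]
```

Here the circuit is first judged unresponsive and then actually disconnects,
but the disconnect handler runs only once.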

Attempting to locate a server
-----------------------------
O In the client library there are N search buckets, each with an independent
search request period. A bucket's search period is a constant multiplied by
two raised to the power of the bucket's index. Bucket indexes start at zero
and are contiguous. When a search response arrives the channel is removed
from its search bucket and attached to a circuit for the specified server.
If there is no response to the search request after the bucket's period
expires, the channel is removed from its search bucket and moved to the
search bucket at index plus one. When a channel's search request times out
in the bucket with the slowest period it is not removed from this bucket, so
it continues to be searched for at a slow rate. The timeout of the slowest
period bucket is configurable.
O New channels are immediately moved to the search bucket with the shortest
timeout (at index zero).
O Disconnecting channels enter a cooling off state for a brief interval
prior to being moved to the search bucket with the shortest timeout (at
index zero).
O When a bucket sends a search request datagram it packs together as many
search requests for individual channels as can be made to fit in a UDP
frame. 
O The number of search datagram frames sent when a search bucket's timer
expires is roughly based on the TCP slow start algorithm, and of course on
how many channels are waiting in the bucket.
O If a beacon anomaly is detected all channels with a search bucket period
greater than a medium period are moved to the search bucket with a medium
period search interval.
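
The bucket scheme described above can be sketched as follows. This is a
simplified Python model, not the client library's implementation: the class
and method names are hypothetical, and the values chosen for the base
period, the number of buckets, and the "medium" bucket index are assumptions
picked only to make the example concrete. The cooling-off interval, datagram
packing, and the slow-start-like frame count are left out.

```python
BASE_PERIOD = 0.03  # assumed constant, in seconds (hypothetical value)
N_BUCKETS = 8       # assumed number of buckets (hypothetical value)
MEDIUM_BUCKET = 4   # assumed "medium period" index (hypothetical value)

def bucket_period(index, base=BASE_PERIOD):
    """Search period of a bucket: the constant times 2**index."""
    return base * 2 ** index

class SearchBuckets:
    """Tracks which search bucket each unresolved channel is in."""

    def __init__(self, n=N_BUCKETS):
        self.n = n
        self.bucket_of = {}  # channel name -> bucket index

    def add_channel(self, name):
        # New channels start in the bucket with the shortest period.
        self.bucket_of[name] = 0

    def search_timed_out(self, name):
        # Move to index plus one, but never past the slowest bucket,
        # so the channel keeps being searched for at a slow rate.
        self.bucket_of[name] = min(self.bucket_of[name] + 1, self.n - 1)

    def search_response(self, name):
        # The channel leaves its bucket; it is now attached to a circuit.
        del self.bucket_of[name]

    def beacon_anomaly(self, medium=MEDIUM_BUCKET):
        # Channels searching slower than the medium period are pulled
        # back to the medium-period bucket.
        for name, index in self.bucket_of.items():
            if index > medium:
                self.bucket_of[name] = medium
```

With these assumed numbers a channel that is never found steps through
periods of 0.03 s, 0.06 s, 0.12 s and so on, then stays in the last bucket
until either a search response arrives or a beacon anomaly moves it back to
the medium bucket.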

Jeff


> -----Original Message-----
> From: Kay-Uwe Kasemir [mailto:[email protected]]
> Sent: Friday, October 06, 2006 1:59 PM
> To: Jeff Hill
> Subject: CA disconnect mechanism
> 
> Hi Jeff:
> 
> We'll have EPICS training here at the SNS in two weeks,
> and I'm brushing some slides up.
> 
> Can you confirm and clarify the following about the CA
> connection handling in case of network errors?
> 
> Thanks,
> -Kay
> 
> ----
> 
> a) TCP connection closed by server?
> - Notify client code about problem
> - EDM screens turn "white".
> - Client sends new search requests,
>    initially fast, then with exponential back-off,
>    for about 8 minutes.
>    Then nothing, unless a beacon anomaly wakes
>    the client up to send new search requests.
> 
> 
> b) No response from server for 30 sec. (configurable)?
> - Client sends "Are you there?" query.
> - If no response for 5 sec, also notifies the client code,
>    so EDM screens turn white,
>    but TCP connection is kept open to avoid network storms.
> 
> Now what?
> b.1) Server again sends data
> -> we're fully reconnected, EDM displays new data.
> b.2) Server never sends new data
> -> we'll stay in this state until the TCP connection dies?
>     Or do we wake in response to beacon anomalies?
> 
> 


