I re-ran my tests. You are right: all disconnect delays are in the
expected range of 28 - 35 seconds.
Note also that when the suspended IOC (or interrupted network) comes
back, testcaput takes longer (30 sec.) than testcaget (immediately)
to reconnect the channels. Both stay inside the 35 second limit.
Ralph
Jeff Hill wrote:
From Ralph:
The not-so-good news: I still see that testcaget and testcaput
behave *very* differently. The testcaput takes a *lot* longer
to react to CA server availability.
Hopefully you observed that the disconnect timeout was the same as always -
EPICS_CA_CONN_TMO + 5.0 seconds for the circuit verify to complete.
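The arithmetic behind that limit can be sketched as follows. This is only an illustration, not code from the CA client library: the helper name expected_disconnect_delay is hypothetical, and the 30 second fallback is an assumption about the value used when EPICS_CA_CONN_TMO is not set.

```python
import os

# Allowance for the circuit verify to complete, as stated in the message.
CIRCUIT_VERIFY_SECONDS = 5.0
# Assumption: value used when EPICS_CA_CONN_TMO is not set in the environment.
ASSUMED_DEFAULT_CONN_TMO = 30.0


def expected_disconnect_delay(env=None):
    """Upper bound (seconds) before a dead circuit is reported as disconnected."""
    env = os.environ if env is None else env
    conn_tmo = float(env.get("EPICS_CA_CONN_TMO", ASSUMED_DEFAULT_CONN_TMO))
    return conn_tmo + CIRCUIT_VERIFY_SECONDS
```

With the assumed 30 second timeout this gives 35 seconds, which matches the upper end of the 28 - 35 second range reported above.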
Starting with R3.14.6 or so, circuit restart after disconnecting a cable for
a short interval depends on TCP/IP's internal timing. If TCP can't get
through it will keep trying, but with a lengthening retry period that follows
an exponential back-off. CA will not mark the circuit responsive again until
it gets an are-you-there response within 5 seconds, and that could be delayed
a bit depending on where we are in TCP's exponential back-off based retry
interval.
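To see why the recovery delay depends on where TCP is in its back-off, here is a small sketch. The initial retry interval, the cap, and both function names are made up for illustration; real TCP stacks choose their own values.

```python
from bisect import bisect_left


def backoff_schedule(initial_rto, max_rto, attempts):
    """Times (seconds after the first failed send) at which retries fire."""
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(attempts):
        elapsed += rto
        times.append(elapsed)
        rto = min(rto * 2.0, max_rto)  # double the interval, up to the cap
    return times


def next_retry_after(restored_at, schedule):
    """First retry at or after the moment the cable is plugged back in."""
    i = bisect_left(schedule, restored_at)
    return schedule[i] if i < len(schedule) else None


schedule = backoff_schedule(initial_rto=1.0, max_rto=64.0, attempts=6)
# schedule is [1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
# A cable restored at t = 20 s is not noticed until the retry at t = 31 s.
```

In this made-up schedule a cable restored 20 seconds in waits another 11 seconds for the next retry, and only then can the are-you-there exchange complete.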
If we leave the cable disconnected long enough, TCP's internal keep-alive
timer will fire and CA will see a TCP disconnect indication. In that case you
should also see that reconnect times are a bit slower in recent versions.
This is by design. The client library now (starting with R3.14.6) uses a
different (slower) search interval boost for beacon anomalies compared to
when a new channel is created. Starting with R3.14.6 the maximum search
period is also different. In prior releases CA eventually stopped searching
(after 100 tries) and only resumed when it saw a beacon anomaly. In
R3.14.6 CA plateaus the search interval at a configurable maximum - and
never stops searching. In R3.14.6 there are multiple search timers - each a
power of two times the round trip estimate. Therefore, creation of a new
channel has no impact whatsoever on the search intervals for disconnected
channels created in the past.
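A rough picture of the timer ladder described above, as a sketch only. The function name, the choice to start at the round trip estimate itself (2^0), and the example numbers are assumptions, not the client library's actual implementation.

```python
def search_timer_periods(rtt_estimate, max_period):
    """Search periods: rtt * 2**k for k = 0, 1, ..., plateauing at max_period."""
    periods = []
    period = rtt_estimate
    while period < max_period:
        periods.append(period)
        period *= 2.0
    periods.append(max_period)  # plateau: searching continues at this period
    return periods


# Hypothetical values: 1 s round trip estimate, 10 s configured maximum.
periods = search_timer_periods(rtt_estimate=1.0, max_period=10.0)
# periods is [1.0, 2.0, 4.0, 8.0, 10.0]
```

Because each period has its own timer, a newly created channel would start on the shortest period while an old disconnected channel stays on the plateau, so one does not speed up the other.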
I expect that sites with many IOCs will be pleased with the behavior
changes. The primary benefits are less broadcast traffic and less
tendency for the system to flail when operated close to load saturation, or
should there be a centralized communication failure.
Jeff