EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: caRepeater question
From: "J. Lewis Muir via Core-talk" <core-talk at aps.anl.gov>
To: Zimoch Dirk <dirk.zimoch at psi.ch>
Cc: "core-talk at aps.anl.gov" <core-talk at aps.anl.gov>
Date: Thu, 1 Feb 2024 14:12:12 -0600

On 02/01, Zimoch Dirk wrote:
> Normally I am running it as a service. I gave that simpler scenario because it
> shows the critical points and is simpler to reproduce.
> Our actual problem was that [snip].

Ah, OK, thanks for that explanation; makes sense.

> I tested with casw:
> 0. caRepeater.service is running
> 1. start casw
> 2. start an ioc. casw shows the beacon anomaly
> 3. sudo systemctl restart caRepeater.service
> 4. start an ioc. casw does not show any beacon anomalies any more
> 5. restart casw. It works again.
> 
> Unfortunately, casw (or any ca client) cannot find out that the caRepeater it
> had registered to has died. Thus it never tries to reconnect.
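
To make sure I follow the steps quoted above, here is a toy in-memory
model of them. It is only a sketch: ToyRepeater and ToyClient are
hypothetical names, nothing here uses the real CA protocol or sockets,
and it assumes (as described above) that registrations live only inside
the running caRepeater process and that a client registers once at
startup.

```python
# Toy in-memory model; ToyRepeater and ToyClient are hypothetical names,
# not part of EPICS.

class ToyRepeater:
    """Forwards beacons to whichever clients have registered with it."""

    def __init__(self):
        self.clients = []              # registrations live only in this process

    def register(self, client):
        self.clients.append(client)

    def beacon(self, ioc_name):
        for client in self.clients:
            client.seen.append(ioc_name)


class ToyClient:
    """Registers once at startup and never checks on the repeater again."""

    def __init__(self, repeater):
        self.seen = []
        repeater.register(self)


repeater = ToyRepeater()               # step 0: repeater is running
casw = ToyClient(repeater)             # step 1: start casw
repeater.beacon("ioc1")                # step 2: the beacon reaches casw

repeater = ToyRepeater()               # step 3: restart loses all registrations
repeater.beacon("ioc2")                # step 4: nobody is registered any more

casw_restarted = ToyClient(repeater)   # step 5: a fresh client registers again
repeater.beacon("ioc3")

print(casw.seen)                       # ['ioc1']
print(casw_restarted.seen)             # ['ioc3']
```

The old client never hears about "ioc2" or "ioc3" because nothing ever
tells it that its registration is gone.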

Ouch.  That seems like a major problem to me.  It seems like that means
that to upgrade caRepeater, you have to restart all CA clients as well,
which would include IOCs that are CA clients.  If you don't do that, the
CA clients (including IOCs that are CA clients, for example, via a CA
link) will stop working correctly.  Is that right?  If so, that's rough.

I know hardly anything about the CA protocol, so what I'm about
to say may not be possible or may not even make sense, but I wonder
if caRepeater could be changed to send some kind of CA message to all
registered clients when it's about to exit?  That wouldn't work for
the case of caRepeater being sent a SIGKILL or SIGSTOP signal (or the
equivalent on Windows), nor the case of caRepeater crashing, but it
would work for the case of signals that can be caught.  Still, such a
solution doesn't seem particularly robust since it wouldn't work if the
CA message didn't get delivered to all clients for whatever reason.
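
As a rough sketch of that idea and its weakness (in Python rather than
caRepeater's own code, and with everything invented for illustration:
notify_clients, install_exit_notice, lossy_send, and the
"repeater-exiting" payload are hypothetical and not real CA messages or
APIs):

```python
import signal


def notify_clients(clients, send):
    """Best-effort goodbye to every registered client."""
    for client in clients:
        send(client, b"repeater-exiting")   # hypothetical message


def install_exit_notice(clients, send):
    """Send the goodbye when SIGTERM arrives (call from the main thread)."""
    def handler(signum, frame):
        notify_clients(clients, send)
        raise SystemExit(0)
    # SIGTERM can be caught; SIGKILL and SIGSTOP cannot, so they are not covered.
    signal.signal(signal.SIGTERM, handler)


received = []


def lossy_send(client, payload):
    # Simulated datagram transport: the message to client-b is silently lost.
    if client != "client-b":
        received.append(client)


clients = ["client-a", "client-b", "client-c"]
notify_clients(clients, lossy_send)
unaware = [c for c in clients if c not in received]
print(unaware)   # ['client-b']
```

The sender gets no error for the dropped message, so client-b would stay
registered to a repeater that no longer exists.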

I wonder if the CA protocol could be extended to support some kind of
mechanism to allow clients to detect when the caRepeater has died,
stopped working, or restarted?  For example, maybe CA clients could
periodically poll for a unique caRepeater ID that would change when a
new caRepeater process is started?
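
Something like the following toy model is what I have in mind. It is a
sketch under my own assumptions, not an existing CA feature: the
instance_id field, the poll method, and the class names are all
hypothetical.

```python
import uuid


class ToyRepeater:
    """Stand-in for a caRepeater process with a per-start unique ID."""

    def __init__(self):
        self.instance_id = uuid.uuid4().hex   # new value for every process start
        self.clients = set()

    def register(self, name):
        self.clients.add(name)


class PollingClient:
    """Client that remembers the last repeater ID it registered with."""

    def __init__(self, name, repeater):
        self.name = name
        self.known_id = None
        self.poll(repeater)                   # first poll performs registration

    def poll(self, repeater):
        """Re-register if the repeater's ID differs from the last one seen."""
        if repeater.instance_id != self.known_id:
            repeater.register(self.name)
            self.known_id = repeater.instance_id
            return True
        return False


old = ToyRepeater()
client = PollingClient("casw", old)
same = client.poll(old)          # ID unchanged, nothing to do
new = ToyRepeater()              # simulated restart of the repeater
changed = client.poll(new)       # ID changed, client registers again
print(same, changed, "casw" in new.clients)   # False True True
```

A poll that gets no answer at all would presumably cover the "died" and
"stopped working" cases, but that part is not modeled here.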

> Having used TCP instead of UDP to connect to the caRepeater would not have this
> problem, I think.

Interesting.

Lewis
