EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: GW status
From: Dirk Zimoch <[email protected]>
To: Jeff Hill <[email protected]>
Cc: "'Amedeo Perazzo'" <[email protected]>, "'Core-Talk'" <[email protected]>, "'Ernest L. Williams Jr.'" <[email protected]>
Date: Wed, 05 Aug 2009 10:07:30 +0200

Hi all,

The -no_cache option was added by Cosylab (Gasper Jansa, whom I have cc'd on this mail) at my request.

At SLS, people complained about the default behavior of the gateway. Without -no_cache, the gateway forwards ca_get to the IOC as long as no monitor subscription exists for this channel. As soon as any client creates a monitor, the behavior changes and the gateway sends back the last value it received through the monitor instead of asking the IOC. The unexpected change of behavior (depending on some other client) was especially confusing.
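
To make the switch in behavior concrete, here is a minimal toy model in Python. It is only a sketch of the get path as described above, not the gateway's actual code; the names ToyIoc, ToyGateway, add_monitor and on_monitor_event are invented for illustration.

```python
class ToyIoc:
    """Stand-in for an IOC holding a single PV value (hypothetical)."""

    def __init__(self, value):
        self.value = value
        self.reads = 0  # number of get requests that actually reached the IOC

    def get(self):
        self.reads += 1
        return self.value


class ToyGateway:
    """Toy model of the gateway get path; not the real implementation."""

    def __init__(self, ioc, no_cache=False):
        self.ioc = ioc
        self.no_cache = no_cache
        self.monitored = False
        self.cached = None

    def add_monitor(self):
        # Some client subscribes: the gateway now holds a monitor-fed cache.
        self.monitored = True
        self.cached = self.ioc.value

    def on_monitor_event(self, value):
        # Called only when the IOC actually posts a monitor update.
        self.cached = value

    def get(self):
        # Default: serve from the cache once any monitor exists.
        if self.monitored and not self.no_cache:
            return self.cached
        # No monitor yet, or -no_cache: ask the IOC.
        return self.ioc.get()
```

In this model a client's get result changes from "current IOC value" to "last monitored value" the moment any other client calls add_monitor, which is the surprise described above. With no_cache=True every get reaches the IOC regardless of subscriptions.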

It also gave us problems when fields that are used in DBR_CTRL* and friends are modified at run-time (e.g. EGU or HOPR). If any client had a monitor on such a channel, only restarting the gateway could update the cached attributes; restarting only the client did not help. Also, with a high MDEL (or ADEL in the case of the archiver), polling the record through the gateway becomes futile, because the cached value only changes when a monitor event is posted.
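
The deadband effect can be illustrated with a small Python sketch. It assumes a simplified rule (an update is posted only when the value has moved by more than MDEL since the last posted value); DeadbandMonitor and poll_through_cache are hypothetical names, and the real record support has more cases than this.

```python
class DeadbandMonitor:
    """Posts an update only when the value moved more than `mdel`
    since the last posted value (simplified assumption)."""

    def __init__(self, initial, mdel):
        self.mdel = mdel
        self.last_posted = initial

    def process(self, value):
        """Return True if a monitor event would be posted for this value."""
        if abs(value - self.last_posted) > self.mdel:
            self.last_posted = value
            return True
        return False


def poll_through_cache(values, mdel):
    """Return what a client polling a monitor-fed cache reads
    after each new record value in `values`."""
    mon = DeadbandMonitor(values[0], mdel)
    seen = []
    for v in values:
        mon.process(v)
        seen.append(mon.last_posted)  # the cache holds the last posted value
    return seen
```

For example, poll_through_cache([10.0, 10.25, 10.5, 11.5], mdel=1.0) gives [10.0, 10.0, 10.0, 11.5]: the client polls three times and sees no change, although the record value moved each time.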

Dirk


Jeff Hill wrote:
   1. PCAS was unable (before the patch) to handle subscriptions during
the time that ca_put_callback was being processed.

More specifically, if the service (in your situation the GW) returned status
from an IO request indicating "postpone this request because too many
asynchronous IO operations are already in progress" then PCAS would not
update subscriptions until at least one asynchronous IO operation completed.
Since the GW service returns this "postpone this request because too many
asynchronous IO operations are already in progress" status if a
ca_put_callback request to the IOC is pending then this explains some of the
behavior you saw there (frozen EDM subscription updates when the same client
initiates the motor move).

   2. The gateway may be translating a ca_put on its server side to a
ca_put_callback on its client side even when that's not requested by the
client.

In EPICS ca_put is very different from ca_put_callback.
First, ca_put means no response message from the IOC unless something goes
wrong. Second, in situations where there are many values being written one
after another by the same client to the same PV (an EDM slider is a good
example of this) then if the consumer (the PV in the database) is slower
than the producer (the EDM slider) then some of the intermediate values will
be discarded, but the system guarantees that the last value sent is always
written to the database, and if the field is process passive the record is
also processed with this last value sent.
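
The "intermediate values may be discarded, but the last one always arrives" guarantee can be pictured as a one-slot buffer. The following Python sketch is a toy model under that assumption only; it says nothing about how CA or the database implement it, and LatestValueSlot and simulate are invented names.

```python
class LatestValueSlot:
    """One-slot buffer: a new write replaces any value not yet consumed."""

    def __init__(self):
        self._pending = None
        self._has_pending = False
        self.dropped = 0  # count of overwritten intermediate values

    def put(self, value):
        if self._has_pending:
            self.dropped += 1
        self._pending = value
        self._has_pending = True

    def take(self):
        """Consumer side: return the pending value, or None if none waits."""
        if not self._has_pending:
            return None
        self._has_pending = False
        value, self._pending = self._pending, None
        return value


def simulate(writes, consume_every):
    """The producer issues `writes` in order; the slower consumer runs only
    after every `consume_every`-th write, and once more at the end."""
    slot = LatestValueSlot()
    applied = []
    for i, v in enumerate(writes, start=1):
        slot.put(v)
        if i % consume_every == 0:
            applied.append(slot.take())
    last = slot.take()
    if last is not None:
        applied.append(last)
    return applied, slot.dropped
```

With simulate([1, 2, 3, 4, 5, 6, 7], 3) the consumer applies [3, 6, 7] and four intermediate values are dropped; the final value 7 is always among those applied, however slow the consumer is.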

In contrast, ca_put_callback means that the callback is not called until after
the record finishes processing, and any records or asynchronous devices that
are linked directly or indirectly to this record are done with their
processing.

So, yes, there was a weakness in the PCAS service interface which prevented
the service from knowing if the request was initiated by a put or a put
callback. Good designs are minimal, but I went too far on this one. I have
committed changes to PCAS so that ca_put_callback requests invoke
casChannel::writeNotify and ca_put requests invoke casChannel::write. If the
service does not implement casChannel::writeNotify then
casChannel::writeNotify invokes casChannel::write thereby preserving
backward compatibility. I have also committed changes to the GW so that
casChannel::writeNotify invokes ca_put_callback, and casChannel::write
invokes ca_put. I am still testing these changes. They appear to work
correctly but I see another issue which may be unrelated to my recent
changes, and I am currently pursuing that.
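
The backward-compatible dispatch described above can be sketched as follows. This is a Python toy model, not the C++ PCAS interface: write and write_notify stand in for casChannel::write and casChannel::writeNotify, and ToyChannel, LegacyChannel, GatewayChannel and dispatch are hypothetical names.

```python
class ToyChannel:
    """Toy analogue of a server-side channel (not the real C++ class)."""

    def write(self, value):
        raise NotImplementedError

    def write_notify(self, value):
        # Default: fall back to write(), so a service written before the
        # change keeps working without modification.
        return self.write(value)


class LegacyChannel(ToyChannel):
    """A service that only knows about write()."""

    def __init__(self):
        self.log = []

    def write(self, value):
        self.log.append(("write", value))


class GatewayChannel(ToyChannel):
    """A service that forwards each request type to its client-side twin."""

    def __init__(self):
        self.log = []

    def write(self, value):
        self.log.append(("ca_put", value))

    def write_notify(self, value):
        self.log.append(("ca_put_callback", value))


def dispatch(channel, request, value):
    """Route a client request: put -> write, put_callback -> write_notify."""
    if request == "put_callback":
        channel.write_notify(value)
    elif request == "put":
        channel.write(value)
    else:
        raise ValueError(request)
```

A LegacyChannel receiving a put_callback request still ends up in write, while a GatewayChannel records ca_put for a put and ca_put_callback for a put_callback, mirroring the mapping described in the paragraph above.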

The bottom line is that I hope these changes will make the GW more
transparent, but this also opens up the possibility that certain programs
that issue a ca_put request followed by ca_get request will discover that
the value written was not returned in response to the ca_get request, and
that would be a behavior change for the GW, but this is probably what Jim
originally intended based on some discussions that I recall. Presumably
programs that really care about such things will be written to issue
ca_put_callback followed by ca_get. Also, turning on -no_cache would fix
this issue for the client but that would be a global change impacting all
clients, and would also weaken the GW's role as an offloading agent for the IOC.

   3. The caput issue may be something different from the previous two
points and may be related to the way we configured caching on the gateway.

Is the above correct? Who is the best person to help with the gateway side
now that the PCAS patch is available? Where do I find some documentation
(if any) about caching on the gateway? The only thing I found on the
manual is the -no_cache option whose description doesn't sound like the
caput problem (I may be wrong).

The manual says this:

-------------snip-snip-----------------
-no_cache  	
Do not use cached (monitored) values when a client does ca_get. This results
in higher network traffic to the IOC but returns always the current value,
even if no monitor event had been send (e.g. because of a MDEL). This also
solves problems with record fields like HOPR or EGU if they are modified
during run-time.
-------------snip-snip-----------------

Servicing get requests out of a subscription updated cache appears to be the
default although perhaps the manual doesn’t say this (I read very quickly so
I could be wrong on this). When -no_cache is enabled, clients that
do a put followed by a get to the same channel are happy because their get
request is translated not into a cache read, but into a retransmitted get
request to the IOC. This comes at the expense of increased load on the IOC.

During testing, without the -no_cache option set I appear to see that gets
are postponed until a pending ca_put_callback completes. Recall however that
the new version (as of today) of the gateway issues ca_put in response to
EDM's ca_put, and the gateway issues ca_put_callback in response to a
high-level application's ca_put_callback. I am starting to suspect after looking
at the source code and reading the doc that the -no_cache option has zero
impact whatsoever on that behavior although I haven’t run any experiments
with -no_cache set.

- Is the zipped base you sent me last week the same as the current CVS
head?

Yesterday I committed changes to the PCAS library in base.

Today I committed the above mentioned patches to the GW, but during testing
(a few minutes ago) I found another possibly unrelated issue that needs to
be understood. I could arrange for a copy if you would like to test in
parallel.

Jeff
______________________________________________________
Jeffrey O. Hill           Email        [email protected]
LANL MS H820              Voice        505 665 1831
Los Alamos NM 87545 USA   FAX          505 665 5107

Message content: TSPA


-----Original Message-----
From: Amedeo Perazzo [mailto:[email protected]]
Sent: Tuesday, August 04, 2009 2:49 PM
To: Jeff Hill
Cc: 'Ernest L. Williams Jr.'; 'Andrew Johnson'
Subject: Re: GW status

Jeff,

I have some questions which can help me understand the current situation
and what we can do next:

- I understand that we may have three different issues:

   1. PCAS was unable (before the patch) to handle subscriptions during
the time that ca_put_callback was being processed.

   2. The gateway may be translating a ca_put on its server side to a
ca_put_callback on its client side even when that's not requested by the
client.

   3. The caput issue may be something different from the previous two
points and may be related to the way we configured caching on the gateway.

Is the above correct? Who is the best person to help with the gateway side
now that the PCAS patch is available? Where do I find some documentation
(if any) about caching on the gateway? The only thing I found on the
manual is the -no_cache option whose description doesn't sound like the
caput problem (I may be wrong).

- Is the zipped base you sent me last week the same as the current CVS
head?

Thanks!
Amedeo


On Mon, 3 Aug 2009, Jeff Hill wrote:

Hi,



Today I committed changes to EPICS base so that the PCAS service can
distinguish between ca_put and ca_put_callback. The changes are backwards
compatible for the service. In summary, there is a new "writeNotify"
interface that calls the "write" interface if it isn't implemented by the
service.



The next step will be to modify the gateway so that it uses ca_put instead
of ca_put_callback if that is what the client, in SLAC's case EDM, has
chosen to use. EDM chooses to use ca_put, and not ca_put_callback, precisely
because it chooses _not_ to block for write requests to complete in the IOC,
or in the future hopefully also the GW. So we anticipate that after fixing
the GW we will have a scenario where EDM will call ca_put, PCAS will call
write, and the GW will call ca_put - thereby avoiding use of
ca_put_callback where it is inappropriate.



Jeff
______________________________________________________
Jeffrey O. Hill           Email        [email protected]
LANL MS H820              Voice        505 665 1831
Los Alamos NM 87545 USA   FAX          505 665 5107



Message content: TSPA






--
Dr. Dirk Zimoch
Paul Scherrer Institut, WBGB/006
5232 Villigen PSI, Switzerland
Phone +41 56 310 5182

References:
RE: GW status Jeff Hill
