EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: reading large data arrays over slow networks
From: "Jeff Hill" <[email protected]>
To: "'Matt Newville'" <[email protected]>, "'EPICS Tech Talk'" <[email protected]>
Date: Thu, 3 Nov 2011 15:08:14 -0600

Hi Matt,

In a single threaded (non-preemptive callback) client no CA activity can
occur (i.e. processing network packets and/or calling callbacks) unless your
single thread is executing in the CA library. The ca_pend_event call is
purely a mechanism for spending some time in the library, the amount of time
being specified by its sole argument. The ca_pend_event call is not used in
any way for synchronizing ca_array_get (or channel connectivity).

In contrast, ca_pend_io will synchronize completion of all outstanding
ca_array_get requests issued since the beginning of the ca context, or the
previous call to ca_pend_io, whichever is later. In practice, what this
implies is that ca_pend_io is a one-time synchronization. If it returns
ECA_TIMEOUT this implies that any incomplete CA get requests are effectively
canceled. The library will _not_ allow your variable, specified to
ca_array_get, to be modified unexpectedly far in the future after ca_pend_io
returns if the network is slow. That could result in a disaster, for example,
if you are using a stack automatic variable.
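
One consequence of the one-time synchronization is that a retry loop has to
reissue the get after each timeout; calling ca_pend_io again on its own has
nothing left to wait for. Below is a minimal sketch of that control flow,
not a definitive implementation. So that it compiles without the EPICS
headers, the CA calls are passed in as function pointers: issue_fn would
wrap ca_array_get and pend_fn would wrap ca_pend_io. GET_NORMAL and
GET_TIMEOUT are hypothetical stand-ins for ECA_NORMAL and ECA_TIMEOUT, and
get_with_retry and the sim_* names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ECA_NORMAL / ECA_TIMEOUT from the CA headers. */
enum { GET_NORMAL = 0, GET_TIMEOUT = 1 };

typedef int (*issue_fn)(void *ctx);          /* would wrap ca_array_get(...)   */
typedef int (*pend_fn)(double timeout_sec);  /* would wrap ca_pend_io(timeout) */

/* Reissue the get after every timeout, doubling the wait each time.
 * Returns the attempt number that succeeded, or -1 if none did. */
int get_with_retry(issue_fn issue, pend_fn pend, void *ctx,
                   double first_timeout, int max_attempts)
{
    double timeout = first_timeout;
    int attempt;

    assert(issue != NULL && pend != NULL && max_attempts > 0);
    for (attempt = 1; attempt <= max_attempts; ++attempt) {
        if (issue(ctx) != GET_NORMAL)
            return -1;          /* the request could not be queued */
        if (pend(timeout) == GET_NORMAL)
            return attempt;     /* only now is the buffer safe to read */
        timeout *= 2.0;         /* timed out: the get was canceled, so retry */
    }
    return -1;
}

/* Demo-only simulated link: a reply needs sim_latency seconds to arrive. */
double sim_latency = 0.0;
int sim_issued = 0;

int sim_issue(void *ctx)
{
    (void)ctx;
    ++sim_issued;
    return GET_NORMAL;
}

int sim_pend(double timeout_sec)
{
    return timeout_sec >= sim_latency ? GET_NORMAL : GET_TIMEOUT;
}
```

With sim_latency set to 3.0 and a first timeout of 1.0, the waits are 1, 2
and 4 seconds, so get_with_retry(sim_issue, sim_pend, NULL, 1.0, 5) succeeds
on the third attempt after issuing three gets. Doubling the timeout is only
one possible policy; the point is that each attempt starts with a fresh
request.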

> That is, how do I know how long to wait in ca_pend_event() and
> ca_pend_io()?    

It's actually quite difficult to write a program that works across a wide
range of networking environments if any short timeouts are hardcoded. In
practice, network-generic programs with complex state transitions typically
use CA callbacks.

> leaves the data at pvalue incorrect (all zeros).  Trying to access
> this data from python, I can get python to segfault for large enough
> data (say, 4.2M ints).  I'm not sure I fully understand why python is
> crashing -- the segfault happens well after ca_array_get() returns,
> but can happen in ca_pend_event() a short time later.  If I save but
> don't try to access the data in pvalue, or wait "long enough", it
> seems I never have a crash.

I suspect that python, or the ca python interface library, isn't happy if
its data is being modified by an outside thread (i.e. a ca client library
auxiliary thread) at precisely the same time that it's being read. It's
probably a good idea not to touch the data until ca_pend_io has
synchronized, or alternatively to use a callback.

> Is there a way to tell if pend_event() or pend_io() have timed out

They return ECA_TIMEOUT.

> or if there are events that are pending?

The ca_pend_io call returns ECA_TIMEOUT if it timed out before all of the
get requests completed, and otherwise ECA_NORMAL if they did complete. There
is also a mechanism for determining at any given instant if the ca_client
library has outstanding labor (network activity) which needs to be dealt
with. I can provide more details on that if you are interested.

If you need to know precisely which get requests failed out of many then
it's probably best to use a callback mechanism.
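
The per-request bookkeeping this implies can be sketched as follows, as an
illustration under stated assumptions rather than a definitive
implementation. Each outstanding get has its own record, the completion
handler marks that record, and the client spends time in the library in
short slices until every record is done or an overall deadline passes. To
keep the sketch compilable without the EPICS headers, on_get_done is a
simplified stand-in for a real CA event handler (which receives a struct
event_handler_args), pump_fn would wrap ca_pend_event, and pending_get,
wait_for_gets, sim_link and sim_pump are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One record per outstanding get; a pointer to it would travel as the
 * user argument of the callback request. */
struct pending_get {
    int done;   /* set by the completion handler */
    int ok;     /* nonzero if the request succeeded */
};

/* Simplified stand-in for a CA completion handler. */
void on_get_done(struct pending_get *req, int ok)
{
    req->ok = ok;
    req->done = 1;
}

typedef void (*pump_fn)(double slice_sec, void *ctx); /* would wrap ca_pend_event */

/* Pump the library in slices until all requests finish or the nominal
 * deadline (sum of slices, not wall-clock time) is reached.
 * Returns the number of requests still unfinished. */
size_t wait_for_gets(const struct pending_get *reqs, size_t n,
                     pump_fn pump, void *ctx, double slice, double deadline)
{
    double waited = 0.0;

    assert(pump != NULL && slice > 0.0);
    for (;;) {
        size_t unfinished = 0, i;
        for (i = 0; i < n; ++i)
            if (!reqs[i].done)
                ++unfinished;
        if (unfinished == 0 || waited >= deadline)
            return unfinished;
        pump(slice, ctx);
        waited += slice;
    }
}

/* Demo-only simulated link: each pump completes the next request in
 * order, except the one at index `stuck`, which never completes. */
struct sim_link {
    struct pending_get *reqs;
    size_t n, next, stuck;
};

void sim_pump(double slice_sec, void *ctx)
{
    struct sim_link *lnk = ctx;

    (void)slice_sec;
    if (lnk->next < lnk->n) {
        if (lnk->next != lnk->stuck)
            on_get_done(&lnk->reqs[lnk->next], 1);
        ++lnk->next;
    }
}
```

After wait_for_gets returns, the done and ok fields say exactly which
requests completed, which is the information a single ca_pend_io return
value cannot provide. Only buffers whose record is marked done should be
read.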

Jeff
______________________________________________________
Jeffrey O. Hill           Email        [email protected]
LANL MS H820              Voice        505 665 1831
Los Alamos NM 87545 USA   FAX          505 665 5107

Message content: TSPA

With sufficient thrust, pigs fly just fine. However, this is
not necessarily a good idea. It is hard to be sure where they
are going to land, and it could be dangerous sitting under them
as they fly overhead. -- RFC 1925

> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Matt Newville
> Sent: Thursday, November 03, 2011 2:09 PM
> To: EPICS Tech Talk
> Subject: reading large data arrays over slow networks
> 
> Hi,
> 
> I'm seeing an issue with Channel Access getting "large data arrays"
> and hope someone can provide some insight.  First, this is not an
> issue with EPICS_CA_MAX_ARRAY_BYTES, which is set large enough.
> Rather the issue I'm seeing seems to raise the naive question:
> 
>     Once I do a ca_array_get(type, count, chid, pvalue), when can I
> successfully read the data in pvalue?
> 
> That is, how do I know how long to wait in ca_pend_event() and
> ca_pend_io()?    In most cases of a fast network and scalar values or
> small arrays, values on the order of
>    ca_pend_event(1.e-3);
>    ca_pend_io(1.0);
> 
> seem to work well.   But, if the array is "large" or the network
> "slow", I have a hard time predicting these values, and accessing the
> data may give incorrect values unless I've waited "long enough".
> Using either a slow network or pend_event/io times that are clearly
> "too short", say,
>    ca_pend_event(1.e-5);
>    ca_pend_io(0.01);
> 
> leaves the data at pvalue incorrect (all zeros).  Trying to access
> this data from python, I can get python to segfault for large enough
> data (say, 4.2M ints).  I'm not sure I fully understand why python is
> crashing -- the segfault happens well after ca_array_get() returns,
> but can happen in ca_pend_event() a short time later.  If I save but
> don't try to access the data in pvalue, or wait "long enough", it
> seems I never have a crash.
> 
> Related questions are:  What is supposed to happen if ca_pend_event()
> and/or ca_pend_io() do time out?  and: Is there a way to tell if
> pend_event() or pend_io() have timed out or if there are events that
> are pending?
> 
> I was hoping to be able to do something like
>    ret_ev = ca_pend_event(1.e-3);
>    ret_io = ca_pend_io(1.);
> 
>    while (ret_io == ECA_TIMEOUT) {
>       ret_ev = ca_pend_event(1.e-3);
>       ret_io = ca_pend_io(1.);
>    }
> 
> Unfortunately, it seems that checking the return values is not so
> helpful, as pend_event() always(?) returns ECA_TIMEOUT and pend_io()
> seems to return ECA_TIMEOUT no more than one time before returning
> ECA_NORMAL.
> 
> For what it's worth, I'm using base 3.14-12.1.   Thanks in advance for
> any insight.
> 
> --Matt Newville <newville at cars.uchicago.edu> 630-252-0431



Replies:
Re: reading large data arrays over slow networks Matt Newville
References:
reading large data arrays over slow networks Matt Newville
