Experimental Physics and
Industrial Control System

Michael Abbott <[email protected]> · Mon, 5 Oct 2009 09:12:06 +0100 (BST)

On Sun, 4 Oct 2009, Matt Newville wrote:
> I completely agree that one wants to get CA array data into numpy arrays 
> quickly and efficiently.
> 
> Perhaps I was unclear earlier: I think a Python interface to CA should 
> fetch the native CA type (or perhaps the CTRL or TIME variant on special 
> request) and then convert that into Python objects in the interface 
> layer.
> 
> For an array of doubles (DBR_FLOAT,DBR_DOUBLE) or ints 
> (DBR_SHORT,DBR_LONG), automatically converting that to a numpy array of 
> the appropriate "dtype"  makes perfect sense.

This can in fact be done for all DBR types, even DBR_STRING, though a 
little care is needed in this last case -- these all fit cleanly into 
numpy types and memory layout.  So cothread.catools does exactly this for 
all caget and camonitor types.
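
As a rough illustration of why DBR_STRING needs a little care: each 
DBR_STRING element is a fixed 40 byte, NUL padded field, so it corresponds 
to a numpy "S40" dtype rather than to a variable length Python string.  The 
sketch below uses only the standard library; the table DBR_TO_DTYPE and the 
helper decode_dbr_strings() are names made up for this example and are not 
the cothread.catools implementation.

```python
MAX_STRING_SIZE = 40  # size in bytes of one DBR_STRING element in CA

# Illustrative table (hypothetical name): basic DBR type -> (numpy dtype
# name, element size in bytes).  dbr_char_t and dbr_enum_t are unsigned.
DBR_TO_DTYPE = {
    "DBR_STRING": ("S40", 40),
    "DBR_SHORT": ("int16", 2),
    "DBR_FLOAT": ("float32", 4),
    "DBR_ENUM": ("uint16", 2),
    "DBR_CHAR": ("uint8", 1),
    "DBR_LONG": ("int32", 4),
    "DBR_DOUBLE": ("float64", 8),
}


def decode_dbr_strings(raw):
    """Split a raw buffer of DBR_STRING elements into Python strings.

    Each element occupies exactly MAX_STRING_SIZE bytes; the text ends at
    the first NUL byte and anything after it is padding.
    """
    count, remainder = divmod(len(raw), MAX_STRING_SIZE)
    if remainder:
        raise ValueError("buffer is not a whole number of DBR_STRING elements")
    strings = []
    for i in range(count):
        chunk = raw[i * MAX_STRING_SIZE:(i + 1) * MAX_STRING_SIZE]
        text = chunk.split(b"\0", 1)[0]
        strings.append(text.decode("ascii", "replace"))
    return strings
```

With numpy the same buffer could instead be viewed in place as an array of 
dtype "S40", which is what makes the numeric and string cases look alike.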

> If one does a ca_put() with a numpy array on a PV that is natively 
> DBR_FLOAT, it would be nice if the interface layer made the conversion.

I'm in two minds about this one.  At the moment cothread.catools makes no 
conversion: it simply puts the data as it is, selecting DBR_CHAR / 
DBR_SHORT / DBR_LONG to match the Python data, but I'm tempted to add a 
conversion parameter.
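
To make the idea concrete, here is a minimal sketch of what an optional 
conversion on put might look like.  It is purely hypothetical: 
prepare_put() and its datatype argument are invented for this example, the 
standard library array module stands in for numpy, and DBR types are 
represented by their names rather than by the real CA constants.

```python
import array

# Hypothetical mapping from array typecodes to DBR type names.  'i' is
# assumed to be 32 bits wide, which holds on common platforms.
TYPECODE_TO_DBR = {
    "B": "DBR_CHAR",
    "h": "DBR_SHORT",
    "i": "DBR_LONG",
    "f": "DBR_FLOAT",
    "d": "DBR_DOUBLE",
}
DBR_TO_TYPECODE = {dbr: code for code, dbr in TYPECODE_TO_DBR.items()}


def prepare_put(values, datatype=None):
    """Return (dbr_type_name, data) for a put.

    With datatype=None the DBR type simply follows the data, as described
    above.  With an explicit datatype the data is converted first.
    """
    native = TYPECODE_TO_DBR[values.typecode]
    if datatype is None or datatype == native:
        return native, values
    typecode = DBR_TO_TYPECODE[datatype]
    cast = float if typecode in "fd" else int
    return datatype, array.array(typecode, [cast(v) for v in values])
```

For example, prepare_put(array.array("h", [1, 2, 3])) selects DBR_SHORT 
unchanged, while passing datatype="DBR_FLOAT" converts the data before the 
put.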

> Similarly, for waveform records of characters, turning that into a 
> string makes the most sense to me. I can see that other conversions of 
> character waveforms might be desirable, but not supporting any 
> conversions for the sake of "being complete and general" seems like a bad 
> choice to me.

This funny business of whether a string is a DBR_STRING or a DBR_CHAR 
array is an interesting bit of overloading; I think any API needs special 
direction for this.

> The point I was trying to make was that a "complete C API" is a bit
> more general than needed.  

I would completely agree with this.  There's no point in exposing CA 
functionality that's not actually used.  Personally I favour an API which 
somewhat abstracts the quite low level functionality provided by libca 
(but I think that libca provides an excellent building block).

> For example, request_type != native_type is not needed (again, CTRL and 
> TIME variants are needed).  I'm not opposed to allowing a Python 
> programmer to say "I know this PV is a DBR_FLOAT, but I want to get it 
> as a DBR_SHORT", I just see it as pointless.  DBR_SHORT is a detail that 
> is important for C (and a Python interface has to deal with this), but 
> it is not important for Python.  Again, you *might* want a numpy array, 
> and this should be of the correct type: the interface should know that 
> an Epics array of DBR_SHORT corresponds to a numpy array with 
> dtype=int16.  If a "complete API" means that the Python programmer 
> *has* to deal with a DBR_*** type, then it's bad Python.

What I've done in cothread.catools is allow the following options:

1. caget(pv_name) returns the native type converted to Python
2. caget(pv_name, datatype=str) makes a request for DBR_STRING, and 
returns it as a str.
3. caget(pv_name, datatype=DBR_STRING) also makes a request for DBR_STRING
4. caget(pv_name, datatype=str, format=FORMAT_TIME) makes a request for 
DBR_TIME_STRING and augments the returned string with extra fields 
.status, .severity, .raw_stamp and .timestamp.

And so on; this is a more abstract API over the concrete CA API.
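
A rough sketch of how such options can be resolved to a concrete CA 
request type follows.  The DBR numbering and the offsets to the TIME and 
CTRL variants are those of the CA headers (db_access.h); the function 
request_type() and the numeric values given here to FORMAT_RAW, 
FORMAT_TIME and FORMAT_CTRL are assumptions made for this example, not the 
catools source.

```python
# Basic DBR type codes as defined by Channel Access.
DBR_STRING, DBR_SHORT, DBR_FLOAT, DBR_ENUM, DBR_CHAR, DBR_LONG, DBR_DOUBLE = \
    range(7)

# Format selectors; the numeric values are assumed for this sketch.
FORMAT_RAW, FORMAT_TIME, FORMAT_CTRL = 0, 1, 2

# In CA, DBR_TIME_xxx = DBR_xxx + 14 and DBR_CTRL_xxx = DBR_xxx + 28.
_FORMAT_OFFSET = {FORMAT_RAW: 0, FORMAT_TIME: 14, FORMAT_CTRL: 28}


def request_type(native_type, datatype=None, format=FORMAT_RAW):
    """Work out which DBR code to request (hypothetical helper).

    datatype=None means "use the channel's native type", datatype=str
    means DBR_STRING, and any other value is taken to be a DBR code.
    """
    if datatype is None:
        base = native_type
    elif datatype is str:
        base = DBR_STRING
    else:
        base = datatype
    return base + _FORMAT_OFFSET[format]
```

So for a PV whose native type is DBR_DOUBLE, the plain request stays at 
DBR_DOUBLE, while datatype=str with format=FORMAT_TIME yields 14, which is 
DBR_TIME_STRING.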

> >> In addition, mixing threads well between C and Python is well known 
> >> to be hard and error-prone.    A "complete" API would probably need 
> >> to allow "ca_create_context(ca_enable_preemptive_context)". I'm not 
> >> sure that even makes sense with a language with its own VM.  How is 
> >> this *supposed* to work in such a case?
> > so, I wrote and maintain my own extension (missing from your survey :) 
> > that actually does that. Here are the use cases that motivated me:
> I guess it's just so easy to roll one's own interface that there is very 
> little incentive to use someone else's code even if it is available and 
> documented.  Coming to a common solution would be nice.

Aye.  I don't really know how to proceed except by discussing specifics.

I think the ctypes and dbr handling of cothread.catools, specifically the 
files dbr.py and cadef.py, can be used elsewhere with little argument.  
The question of whether cothreads are valuable is quite a separate 
conversation!

> > - python used as a shell
> > I found ca_enable_preemptive_context + python GIL work well.
> > early implementation (R3.12? when ca was not thread safe) used hooks to
> > libreadline and handle the ca background polling.
> 
> As I understand it, "preemptive callback" was introduced in 3.14 and 
> means that one does not need to poll; there is a C thread effectively 
> polling for CA events for you in the background. The issue (for Python) 
> is whether Python callback functions can be sensibly run from the 
> background CA polling thread without coordinating with the Python main 
> thread.  Python supports threads, but only allows one thread at a time 
> to have access to Python objects in the global context of the process.  If 
> a background CA polling thread cannot acquire and release the Python 
> GIL, I don't understand how it could do anything useful (to Python).  
> Perhaps I am misunderstanding.

This isn't really a problem, as the Global Interpreter Lock is 
periodically freed by the interpreter (every so many Python opcodes, can't 
remember how many), so all that's really needed is that a function calling 
into the Python interpreter takes the GIL.

In other words, a background CA polling thread *can* acquire and release 
the GIL, that's what it's there for!
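
The effect can be seen with a small pure Python sketch, in which an 
ordinary threading.Thread stands in for the background CA thread (an 
assumption made so that the example runs without libca).  Python threads 
take the GIL implicitly; a C thread would do the equivalent with 
PyGILState_Ensure() and PyGILState_Release(), which ctypes does 
automatically for callbacks it creates.

```python
import threading
import time

updates = []


def on_update(value):
    # Runs on the background thread, but touches Python objects safely
    # because that thread holds the GIL while executing Python code.
    updates.append(value)


def background_poller(callback, count):
    # Stand-in for the CA library thread delivering events.
    for i in range(count):
        callback(i)
        time.sleep(0.001)


worker = threading.Thread(target=background_poller, args=(on_update, 5))
worker.start()

# The main thread stays busy in pure Python; the interpreter still hands
# the GIL over periodically so the background callbacks get to run.
spins = 0
while worker.is_alive():
    spins += 1
worker.join()
```

After the join, updates holds all five values even though the main thread 
never explicitly yielded.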

> For interactive shell work, I rarely use callbacks at all.  Do you use 
> callbacks here? I do like your idea of adding a polling hook into the 
> readline library so that one did not need to explicitly poll from the 
> shell, but that's the same as enabling preemptive callbacks.  Again, 
> maybe I am misunderstanding something here. Please correct me if I am 
> wrong.

We need to distinguish with care between preemptive and non-preemptive 
callbacks.  In particular I would not describe callbacks from the readline 
hook as "preemptive", as they only occur at a well defined and predictable 
point (when the interpreter is waiting for keyboard input from the user).

By the way, if you're looking at the cothread readline hook, the current 
implementation is hugely overcomplicated.  It can be replaced by the 
following:

	def _readline_hook():
	    coselect.poll([(0, coselect.POLLIN)])

However, ctrl-C handling is not nice, and will probably need some C 
support to get right.

> > - python UI with your python-toolkit
> > mixing X and threads seems extremely difficult, the simplicity one 
> > might be seeking with python script is lost. Here instead, I have a 
> > (gtk) wrapper that starts ca_disable_preemptive_context and handles ca 
> > polling with a glib.timeout_add(...)
> 
> My experience (and I believe the experience of everyone else using
> Python and CA) is that one needs to be very careful of using
> native CA threads and GUI-level threads, as mixing CA and GUI threads
> through a Python main thread can easily crash.
> 
> Even with preemptive_callback disabled, and polling to run Python
> callback functions (for PVs which have one defined), one has to be
> careful as the Python callback is run entirely within that
> ca_pend_event().  That means no other CA calls can be made, and that
> calls into other threads (say, to update a widget) are disaster-prone.
> One can access Python objects with confidence, but ca_pend_event()
> often needs to be run quickly (so don't fetch data from a URL or
> process an image!) The strategy I take with wxPython is to have a
> python callback for each PV that simply notes that the PV has changed
> (and caches the value), and then set up a Timer loop() for the GUI
> that effectively does
>     ca_pend_event()
>     react_to_changes()

This is not so dissimilar to what cothread.catools does.  If you set up a 
monitor with
	camonitor(pv_name, callback)
then callback is *not* processed in the context of the CA callback.  The 
CA callback simply places each update on a queue, and a dedicated cothread 
then processes that queue automatically.
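
The pattern, reduced to its bones, looks something like the following 
sketch.  The class CallbackQueue and its method names are invented for 
this example; in cothread the dispatching is done by a cothread rather 
than by an explicit dispatch() call.

```python
import collections


class CallbackQueue:
    """Decouple event delivery from user callback execution."""

    def __init__(self):
        self._pending = collections.deque()

    def ca_event(self, callback, value):
        # Called in the context of the CA callback: do as little as
        # possible, just record what needs to be delivered.
        self._pending.append((callback, value))

    def dispatch(self):
        # Called later from the dispatcher's own context, where the user
        # callback is free to make further CA calls or take its time.
        delivered = 0
        while self._pending:
            callback, value = self._pending.popleft()
            callback(value)
            delivered += 1
        return delivered
```

The user callback therefore never runs inside the CA library's own 
callback, which is the same separation as the "note the change, react 
later" Timer loop described above.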

I am wondering whether to let these callbacks happen on a separate thread: 
I guess I'll need to do some timing tests, because that'd be the only 
compelling reason to use asynchronous callbacks instead of the current 
polling mechanism.

> What I *really* do is sub-class widgets (PVStaticText, PVTextCtrl,
> etc) so that each includes the simple callback, knows which GUI field to
> update, and sets up (or reuses) a Timer loop as above.  That way, PV
> values in widgets get updated automatically and only on real changes.
> I believe that others using CA and widget toolkits do approximately
> the same thing, though I'd be happy to hear of a better way to do it.
> 
> >> For what it's worth, My own extension and cothreads do not enable
> >> preemptive callbacks, and neither has "native" threads -- my own
> >> simply does not have them at all.
> >
> > Does cothreads let you handle those 3 use cases somewhat transparently? One
> > big argument for scripting is that it makes things more straightforward:
> > import the module, do the work.
> 
> I believe that all Python interfaces will work with these three use
> cases. I won't speak for cothreads (I'm still looking at this code: I
> very much like its use of ctypes) only for my own interface: EpicsCA.
> 
> For interactive shell: Yes, this works.  One needs to occasionally
> poll().  I like your idea of hiding a poll() in a readline hook.  From
> an interactive shell, I rarely use callbacks and mostly use the
> higher-level caget() and caput() functions, which include
> polling anyway.

Actually, the poll() is hidden deeper still, in the bowels of the cothread 
scheduler.  It's just that while I was trying to wire things together ages 
ago I ended up pulling too much mechanism back out into the readline hook.  
Try the almost trivial _readline_hook() I've suggested above instead, and 
everything should still work as before!

> For GUI code: Yes, this works.

With GUI code and cothreads I have a bit of a problem: the GUI has to 
provide some cooperation mechanism.  Perhaps I'll try and prepare an essay 
on this.

> For scripts: Yes, this works.  I run many long running (months)
> scripts that run a  "poll-and react to events" loop.

Same here.  I do have occasional hints though that I'm missing the 
occasional update.  For example, I might have something like the following 
code:

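	import numpy
	from cothread.catools import camonitor

	state = numpy.zeros(168, dtype=bool)
	def update_state(value, index):
	    # For a list of PVs, index identifies which PV has updated.
	    state[index] = value
	camonitor(list_of_pvs, update_state)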

and then state[i] for some arbitrary i might once fail to update.  I 
haven't got to the bottom of this, and I don't see how to reproduce it, as 
it's exceptionally rare.

One final thought.  I have wondered whether cothread.catools could use a 
plug-in scheduler which could be replaced by classic threads ... but I 
make essential use of cothreads in ways which would be painfully 
inefficient with threads, in particular, here is how caget() looks:

	def caget_array(pvs, **kargs):
	    return WaitForAll([                                  # (1)
	        Spawn(caget_one, pv, **kargs) for pv in pvs])    # (2)
	def caget_one(pv, **kargs):
	    channel = channel_cache[pv]
	    channel.Wait()                                       # (3)
	    done = Event()
	    ca_array_get_callback(..., _caget_event_handler, done)
	    return done.Wait()                                   # (4)
	def _caget_event_handler(data, done):
	    done.Signal(data)                                    # (5)

I think the particularly expensive bit would be (2) where if caget() is 
passed a long list of PVs then it spawns a separate parallel task for each 
PV.  With cothreads this is very cheap.  The other numbered lines are just 
synchronisation points: WaitForAll() is no more than

	def WaitForAll(events):
	    return [event.Wait() for event in events]

The point of doing all this, of course, is that caget(a_long_list_of_pvs) 
is painfully slow if the requests are done sequentially.

(Of course, all the above is lies, as we are beset with special cases, 
error handling and parameters which all the above pretends does not 
exist, but the code tells the truth.)
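
For readers without cothread to hand, the same fan-out shape can be 
sketched with the standard library.  This is an analogy only: asyncio 
tasks stand in for cothreads, asyncio.gather() plays the role of Spawn() 
plus WaitForAll(), and the sleep in caget_one() is a placeholder for the 
network round trip.  None of this is how cothread.catools is written.

```python
import asyncio


async def caget_one(pv, delay=0.01):
    # Placeholder for "wait for the channel, issue the get, wait for the
    # reply"; a real implementation would talk to CA here.
    await asyncio.sleep(delay)
    return "value of " + pv


async def caget_array(pvs):
    # Start one lightweight task per PV and wait for all of them.  The
    # results come back in the same order as the PV names.
    return await asyncio.gather(*(caget_one(pv) for pv in pvs))


def caget_many(pvs):
    # Synchronous entry point for the sketch (hypothetical name).
    return asyncio.run(caget_array(pvs))
```

Because all the waits overlap, the total time is governed by the slowest 
single request rather than by the sum of all of them, which is the whole 
point of spawning a task per PV.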

Hum.  Think this has got a trifle overlong!
