EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Strategies for working with fast-changing large arrays
From: Matt Newville <[email protected]>
To: [email protected], [email protected]
Date: Mon, 16 Jan 2012 16:58:43 -0600
Hi Michael,

Thanks for the suggestions!

On Mon, Jan 16, 2012 at 2:49 AM,  <[email protected]> wrote:
> From: [email protected] [mailto:[email protected]] On
>> Thanks for the report.   FWIW, pyepics uses preemptive callbacks by
>> default, but I still see dropped frames, independent of the context
>> model.   After some further testing, I believe this is due to the
>> conversion of data from C to Python, which is not too surprising.  The
>> culprit is, as earlier in this conversation, with the '_unpack'
>> function which does the C-Python data conversion.
>>
>> For CHAR waveforms, such as images from a Prosilica camera, the
>> current code for converting the data is (ca.py,
>> line 815)
>>         # waveform data:
>>         if ntype == dbr.CHAR:
>>             if use_numpy:
>>                 data = numpy.array(data)
>>             return copy(data[:])
>>
>> where 'data' is the still-ctypes data (an unsigned byte Array:
>> c_ubyte_Array).  Ignoring the 'use_numpy' option, the 'return
>> copy(data[:])' does two things:
>>    a) makes a copy to avoid memory overwrites -- I've seen this on some
>> systems.
>>    b) takes a slice of the whole array 'data[:]' which converts the
>> ctypes ubyte Array to a list.  This appears to be the slow part.
>>
>> Replacing that with
>>         # waveform data:
>>         if ntype == dbr.CHAR:
>>             if use_numpy:
>>                 data = numpy.array(data)
>>             return copy(data)
>>
>> allows color images from a 1.4 megapixel camera (1360x1024x3 =
>> 4177920 bytes per frame) to be received at full speed.  The downside
>> is that the data is still a
>> ctypes array and has to be converted *somewhere* to something useful
>> to python, though this is simply 'data=data[:]'.   Doing this inside
>> the user-level callback still causes frames to be dropped, as that is
>> still run "inside" the event handler.   It might be possible that
>> running the user callback in a separate thread or expecting the user
>> to store the data in the callback and convert in a separate
>> thread/process would work, but I haven't tried that yet.
>
> This data conversion path seems a bit painful to me.  The trick is to ensure that the
> underlying data is copied exactly once (this is necessary of course as the data buffer
> handed over by the subscription callback is temporary) and that all metadata handling
> is done before Python actually gets its hands on the underlying bytes.  As soon as you
> let Python convert your data to a Python list you're paying a huge cost!

The original version was just stupidly converting CHAR waveforms to
lists, which is incredibly inefficient.

> The corresponding code in cothread (cothread.dbr.dbr_to_value) is (somewhat trimmed):
>
>        # Array size is count, corresponding numpy datatype is dtype,
>        # raw_dbr is the corresponding event_handler args field of type c_void_p,
>        # ca_array is a convenience subclass of numpy.ndarray
>        result = ca_array(shape = (count,), dtype = dtype)
>        ctypes.memmove(result.ctypes.data, raw_dbr.raw_value, result.nbytes)
>
> Note here that ndarray(shape = ..., dtype = ...) allocates the necessary storage without
> initialising it, but creates all the necessary metadata, and then we can just memcpy()
> the dbr data directly into place.  Of course this only works when count and dtype are a
> precise match for the dbr block we've received, hence all the dancing you'll find in
> cothread.dbr.

Thanks for the suggestion. I had been effectively using
   result = numpy.ctypeslib.as_array(raw_dbr.raw_value)

for converting to numpy arrays, which is simpler than using
ctypes.memmove, but appears to be about 2 to 3 times slower.   Of
course, the conversion to a list was orders of magnitude slower than that!
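
As a rough illustration of the conversion routes compared above, here is
a minimal sketch.  It is not the pyepics or cothread code: the helper
name array_from_buffer and the small 'source' buffer (a stand-in for the
temporary data handed to a monitor callback) are assumptions made for
the example.  Note that numpy.ctypeslib.as_array returns a view that
shares memory with the ctypes buffer, so the sketch adds an explicit
copy to it.

```python
import ctypes
import numpy as np


def array_from_buffer(address, count, dtype):
    """Copy `count` elements of `dtype` from a raw address into a new ndarray.

    Hypothetical helper: allocate uninitialised storage, then do a
    single memmove into it, as in the cothread snippet quoted above.
    """
    result = np.empty(count, dtype=dtype)  # storage allocated, not initialised
    ctypes.memmove(result.ctypes.data, address, result.nbytes)
    return result


# Stand-in for the temporary buffer owned by the CA library (assumption).
count = 16
source = (ctypes.c_ubyte * count)(*range(count))

# Route 1: one memmove into a preallocated array.
fast = array_from_buffer(ctypes.addressof(source), count, np.uint8)

# Route 2: as_array gives a view on the same memory, so copy it explicitly.
via_view = np.ctypeslib.as_array(source).copy()

# Route 3: slicing a c_ubyte Array builds a Python list of ints (the slow path).
as_list = source[:]
```

Both numpy routes end up with an independent array; the list route
creates one Python object per element, which is where the cost goes for
multi-megabyte images.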

With pyepics, numpy is optional.  A few people have actually
requested this, and I'm OK with keeping it that way.  If numpy isn't
available, a simple copy() does work to prevent data corruption, and
is reasonably fast, though not as fast as converting to a numpy array.
That leaves the data as a ctypes Array that the non-numpy user then has
to deal with.  I had been reluctant to ever expose ctypes Arrays to the
end user, but I think I'm willing to say that if you're using
waveforms and NOT using numpy, you're going to have to deal with
the ctypes Arrays.   They turn out to be not too bad to deal with, and
the alternative of converting waveforms to lists is just way too slow.
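
For readers without numpy, a minimal sketch of handling a ctypes Array
directly follows.  The 'frame' array and its 4x3 size are made up for
the example; only standard-library calls are used, and nothing here is
specific to pyepics.

```python
import ctypes

# Made-up 4x3 single-byte "image" standing in for a CHAR waveform.
width, height = 4, 3
frame = (ctypes.c_ubyte * (width * height))(*range(width * height))

npixels = len(frame)   # ctypes Arrays support len()
corner = frame[0]      # indexing one element returns a Python int

# bytes() copies the whole buffer in one step, without building a list.
snapshot = bytes(frame)
second_row = snapshot[width:2 * width]  # slicing bytes stays cheap

# bytearray() gives a mutable copy if the data needs to be edited.
editable = bytearray(frame)
```

The bytes and bytearray copies are independent of the original buffer,
so they stay valid if the ctypes Array is later overwritten.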

A git branch at github (pvget_improvements) has this and related
changes.  It appears to be passing all tests for Python 2.7 and 3.1 on
Windows32 and linux32 for me.  I'll need to do more testing, and then
expect to merge these changes into version 3.2 soon.

As part of the testing for this, I also created a simple AreaDetector
Display app (in wxPython), modeled on Mark's AD_Display examples in
IDL and ImageJ.   In tests with a 1360x1024 Prosilica color camera,
this does a reasonable job of keeping up with the ImageJ plugin.  I
wouldn't call it complete, but it keeps pace with the detector well
enough that it might be a useful start for the sort of application
Anton had in mind.   This is at
https://github.com/pyepics/epicsapps/tree/master/AreaDetector_Display

>> I'm not sure how cothread handles such a scenario, but I guess it
>> runs callbacks in a separate thread.
>
> In effect each callback is processed twice, once immediately as an
> event_handler in the Channel Access context, and then a second time
> in the user's own context.

OK.  In pyepics, the user-level callbacks are called from the
internally defined C monitor callback function, so that the user-level
callback is still "inside" that ca event thread.
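
One way to keep slow work out of the event thread, as speculated
earlier in this thread, is to have the callback make a single cheap
copy and hand it to a worker thread through a queue.  The sketch below
is untested against a real camera and does not use the pyepics API:
on_monitor, worker, and the small 'buffer' array are hypothetical names
that simulate the hand-off using only the standard library.

```python
import ctypes
import queue
import threading

frames = queue.Queue(maxsize=8)  # bounded, so a slow consumer cannot exhaust memory
processed = []


def on_monitor(raw):
    """Hypothetical callback body: copy once, enqueue, return quickly."""
    try:
        frames.put_nowait(bytes(raw))  # one bulk copy of the temporary buffer
    except queue.Full:
        pass  # drop this frame rather than block the event thread


def worker():
    """Runs outside the event thread and does the slow processing."""
    while True:
        item = frames.get()
        if item is None:  # sentinel: no more frames
            break
        processed.append(sum(item))  # stand-in for real image processing


thread = threading.Thread(target=worker)
thread.start()

# Simulate three monitor events that reuse the same temporary buffer.
buffer = (ctypes.c_ubyte * 4)()
for value in range(3):
    for i in range(len(buffer)):
        buffer[i] = value
    on_monitor(buffer)

frames.put(None)
thread.join()
```

Because each frame is copied before it is queued, the worker sees the
data as it was at callback time even though the buffer is reused.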

>> I'm also not sure whether the above change is the best idea for
>> default use, but having a 'DONT UNPACK' option for each PV/ChannelID
>> might be a useful option.  In this case, for example, it might allow a
>> faster image processing and display program, at least in the sense
>> that fewer frames would be dropped.
>>
>> Any suggestions on whether that would be a useful addition?
>
> Personally I'd recommend making numpy a mandatory part of your library api.

I don't really disagree, but a few users want to be able to not rely
on numpy.  I am willing to say that if you're aiming to deal with
waveform data, numpy is so strongly encouraged that you'd have to
deal with ctypes arrays if numpy is not installed.

Anyway, thanks!

--Matt


