Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Strategies for working with fast-changing large arrays
From: <michael.abbott@diamond.ac.uk>
To: <newville@cars.uchicago.edu>
Cc: tech-talk@aps.anl.gov
Date: Mon, 16 Jan 2012 08:49:00 +0000
matt.newville@gmail.com wrote:
> Thanks for the report.   FWIW, pyepics uses preemptive callbacks by
> default, but I still see dropped frames, independent of the context
> model.   After some further testing, I believe this is due to the
> conversion of data from C to Python, which is not too surprising.  The
> culprit is, as earlier in this conversation, with the '_unpack'
> function which does the C-Python data conversion.
> 
> For CHAR waveforms, such as images from a Prosilica camera, the
> current code's test for converting a CHAR waveform is (ca.py,
> line 815):
>         # waveform data:
>         if ntype == dbr.CHAR:
>             if use_numpy:
>                 data = numpy.array(data)
>             return copy(data[:])
> 
> where 'data' is the still-ctypes data (an unsigned byte Array:
> c_ubyte_Array).  Ignoring the 'use_numpy' option, the 'return
> copy(data[:])' does two things:
>    a) makes a copy to avoid memory overwrites -- I've seen this on some
> systems.
>    b) takes a slice of the whole array 'data[:]' which converts the
> ctypes ubyte Array to a list.  This appears to be the slow part.
> 
> Replacing that with
>         # waveform data:
>         if ntype == dbr.CHAR:
>             if use_numpy:
>                 data = numpy.array(data)
>             return copy(data)
> 
> allows color images from a 1.4 Mpixel camera (so 4177920 bytes per
> frame) to be received at full speed.  The downside is that the data
> is still a ctypes array and has to be converted *somewhere* to
> something useful to Python, though this is simply 'data = data[:]'.
> Doing this inside the user-level callback still causes frames to be
> dropped, as that is still run "inside" the event handler.  Running
> the user callback in a separate thread, or expecting the user to
> store the data in the callback and convert it in a separate
> thread/process, might work, but I haven't tried that yet.

This data conversion path seems a bit painful to me.  The trick is to ensure that the underlying data is copied exactly once (this is necessary, of course, as the data buffer handed over by the subscription callback is temporary) and that all metadata handling is done before Python actually gets its hands on the underlying bytes.  As soon as you let Python convert your data to a Python list, you're paying a huge cost!
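The cost being described here is easy to see outside of pyepics with a small standalone benchmark (the buffer size matches the camera example above; the exact timings will vary by machine).  Slicing a ctypes array builds one Python object per element, while a single bulk copy into numpy does not:

```python
import ctypes
import time

import numpy as np

N = 4177920                   # one 1.4 Mpixel colour frame, as in the example above
buf = (ctypes.c_ubyte * N)()  # stand-in for the CA-owned data buffer

t0 = time.perf_counter()
as_list = buf[:]              # element-by-element conversion to a Python list
t_list = time.perf_counter() - t0

t0 = time.perf_counter()
arr = np.frombuffer(buf, dtype=np.uint8).copy()  # one bulk copy into numpy
t_numpy = time.perf_counter() - t0

print(f"list slice: {t_list:.4f}s  numpy copy: {t_numpy:.4f}s")
```

The `.copy()` matters: `np.frombuffer` alone only wraps the existing memory, which would be unsafe here since the CA buffer is temporary.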

The corresponding code in cothread (cothread.dbr.dbr_to_value) is (somewhat trimmed):

	# Array size is count, corresponding numpy datatype is dtype,
	# raw_dbr is the corresponding event_handler args field of type c_void_p,
	# ca_array is a convenience subclass of numpy.ndarray
	result = ca_array(shape = (count,), dtype = dtype)
	ctypes.memmove(result.ctypes.data, raw_dbr.raw_value, result.nbytes)

Note here that ndarray(shape = ..., dtype = ...) allocates the necessary storage without initialising it, but creates all the necessary metadata, and then we can just memcpy() the dbr data directly into place.  Of course this only works when count and dtype are a precise match for the dbr block we've received, hence all the dancing you'll find in cothread.dbr.
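The same pattern can be exercised in isolation.  Here is a minimal sketch, with a small ctypes array standing in for the temporary DBR buffer (the names are illustrative, not cothread's actual API):

```python
import ctypes

import numpy as np

count = 8
src = (ctypes.c_ubyte * count)(*range(count))  # stand-in for the temporary DBR buffer
raw_value = ctypes.cast(src, ctypes.c_void_p)  # what the event handler receives

# Allocate uninitialised storage plus metadata, then one bulk copy into place.
result = np.ndarray(shape=(count,), dtype=np.uint8)
ctypes.memmove(result.ctypes.data, raw_value, result.nbytes)
```

After the memmove, `result` owns its data outright, so the source buffer can be freed by the CA library without consequence.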

> I'm not sure how cothread handles such a scenario, but I guess it
> runs callbacks in a separate thread.

In effect each callback is processed twice, once immediately as an event_handler in the Channel Access context, and then a second time in the user's own context.
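One way to illustrate that two-stage pattern in plain Python (a sketch only, not cothread's implementation, which uses its own cooperative threading) is to copy once in the event handler and hand the array off to the user's context via a queue:

```python
import queue
import threading

import numpy as np

work = queue.Queue()

def event_handler(raw_bytes):
    # Runs in the CA callback context: copy exactly once, do nothing else here.
    work.put(np.frombuffer(raw_bytes, dtype=np.uint8).copy())

def user_loop(n_frames, results):
    # Runs in the user's own thread: free to do arbitrarily slow Python work.
    for _ in range(n_frames):
        results.append(int(work.get().sum()))

results = []
worker = threading.Thread(target=user_loop, args=(3, results))
worker.start()
for i in range(3):
    event_handler(bytes([i]) * 4)  # simulated incoming frames
worker.join()
```

The event handler stays cheap and never blocks on user code, so no frames are dropped in the fast path.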

> I'm also not sure whether the above change is the best idea for
> default use, but having a 'DONT UNPACK' option for each PV/ChannelID
> might be a useful option.  In this case, for example, it might allow a
> faster image processing and display program, at least in the sense
> that fewer frames would be dropped.
> 
> Any suggestions on whether that would be a useful addition?

Personally, I'd recommend making numpy a mandatory part of your library API.

Replies:
Re: Strategies for working with fast-changing large arrays Matt Newville
References:
Re: pyepics: Strategies for working with fast-changing large arrays Anton Derbenev
Re: pyepics: Strategies for working with fast-changing large arrays Andrew Johnson
RE: Strategies for working with fast-changing large arrays michael.abbott
Re: Strategies for working with fast-changing large arrays Matt Newville
