Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: asynPortDriver
From: Mark Rivers <rivers@cars.uchicago.edu>
To: "'Szalata, Zenon M.'" <zms@slac.stanford.edu>
Cc: "tech-talk@aps.anl.gov" <tech-talk@aps.anl.gov>
Date: Fri, 20 Jul 2012 00:12:26 +0000
Hi Zen,

Do I understand correctly that you have a waveform record with SCAN=I/O Intr using the standard asyn device support?

If so, then this is what happens:

- Your driver calls doCallbacksInt32Array.
- This results in a call to interruptCallbackInput in asyn/devEpics/devAsynXXXArray.h
- interruptCallbackInput copies the data to the waveform record VAL field and then calls scanIoRequest
- scanIoRequest processes the record

I suspect the problem is that sometimes scanIoRequest has not processed the record before your driver gets a second interrupt and repeats the above sequence. When the record finally does process, the first set of data has been overwritten by the second set.

For all records except waveform records, standard asyn device support guards against this by using an EPICS ring buffer to store the callback values.  Then, if the callbacks momentarily happen faster than scanIoRequest can handle, no data is lost.  Of course, if it happens too many times in a row you get a ring buffer overflow.

For the waveform records we don't create a ring buffer, so data will be overwritten if scanIoRequest cannot keep up.
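
As a rough illustration of the difference, here is a standalone model (not asyn code). The names SingleSlot, Ring and burstThenDrain are invented for this sketch, and dropping the newest value on overflow is an assumption; the real device support may handle a full ring differently.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical model of device support that keeps one value (the waveform case).
struct SingleSlot {
    int value = 0;
    bool pending = false;

    void post(int v) {          // driver callback: overwrites any unprocessed value
        value = v;
        pending = true;
    }
    bool process(int &out) {    // record processing: consumes the value, if any
        if (!pending) return false;
        out = value;
        pending = false;
        return true;
    }
};

// Hypothetical model of device support that queues values in a bounded ring.
struct Ring {
    std::size_t capacity;
    std::deque<int> queue;
    unsigned overflows = 0;

    explicit Ring(std::size_t cap) : capacity(cap) { assert(cap > 0); }

    void post(int v) {
        if (queue.size() == capacity) {  // assumption: drop the new value when full
            ++overflows;
            return;
        }
        queue.push_back(v);
    }
    bool process(int &out) {
        if (queue.empty()) return false;
        out = queue.front();
        queue.pop_front();
        return true;
    }
};

// Post `count` consecutive values starting at `first` before any processing
// happens, then let record processing catch up and collect what it sees.
template <class Buffer>
std::vector<int> burstThenDrain(Buffer &buf, int first, int count) {
    for (int i = 0; i < count; ++i) buf.post(first + i);
    std::vector<int> seen;
    int v = 0;
    while (buf.process(v)) seen.push_back(v);
    return seen;
}
```

With two callbacks arriving before the record processes, the single slot delivers only the second value, while the ring delivers both in order and counts an overflow only once the burst exceeds its capacity.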

Here is a way to test whether this hypothesis is correct.  In interruptCallbackInput the pPvt->gotValue flag is set to 1, and it is reset to 0 when scanIoRequest processes the record.  Thus, normally pPvt->gotValue should be 0 when interruptCallbackInput is entered.  If it is 1, a second callback has happened before the record was processed from the previous callback.  You can change interruptCallbackInput to print an error message if pPvt->gotValue is non-zero on entry, and see whether this coincides with the increment-by-2 problem you are seeing.
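
The shape of that check might look like the following standalone model. It is only a sketch and not the actual asyn source: DevPvtModel, interruptCallbackInputModel and processRecordModel are invented names, and the real private structure and function signature in devAsynXXXArray.h are different.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the device-support private structure.
struct DevPvtModel {
    int gotValue = 0;       // 1 while a callback value is waiting to be processed
    unsigned overruns = 0;  // callbacks that arrived while gotValue was still 1
    int latest = 0;         // most recent counter value copied into the record
};

// Model of the driver callback: report an overrun if the previous value
// has not been processed yet, then store the new value.
void interruptCallbackInputModel(DevPvtModel *pPvt, int counter) {
    assert(pPvt != nullptr);
    if (pPvt->gotValue != 0) {
        ++pPvt->overruns;
        std::fprintf(stderr,
                     "callback overrun: counter %d overwritten by %d\n",
                     pPvt->latest, counter);
    }
    pPvt->latest = counter;
    pPvt->gotValue = 1;
}

// Model of record processing: consume the value and clear the flag.
int processRecordModel(DevPvtModel *pPvt) {
    assert(pPvt != nullptr);
    pPvt->gotValue = 0;
    return pPvt->latest;
}
```

If each message printed this way lines up with a counter jump of 2 seen by the subroutine record, the overwrite explanation is supported.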

Mark


-----Original Message-----
From: Szalata, Zenon M. [mailto:zms@slac.stanford.edu] 
Sent: Thursday, July 19, 2012 5:23 PM
To: Mark Rivers
Cc: tech-talk@aps.anl.gov; Williams Jr., Ernest L.
Subject: RE: asynPortDriver

Hi Mark,
I modified my version 2 asynPortDriver-based device driver so that it now:
1. In the ISR (at interrupt level), reads data from all 16 channels and sends the data through a message queue.
2. In a high-priority thread, receives the data from the message queue, unpacks it, and posts it as an array.
    The unpacked data are put into an array where the first element is a counter, which is incremented in the
    data unpacking thread, and elements 1 through 16 are the ADC values.
In the db layer I still see missing data.  Once in a while the counter value, which is the first element in the data array, is larger by 2 than the previous counter value.  The test is done at 100 Hz.
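
A counter check of this kind can also distinguish lost data from repeated or out-of-order data. The sketch below is a standalone illustration, not the code of the actual subroutine record: GateStatus, classifyGate and missedGates are invented names, and counter wraparound is ignored for simplicity.

```cpp
#include <cassert>

// Possible outcomes when comparing a new gate counter with the previous one.
enum class GateStatus { InOrder, Missed, Repeated, OutOfOrder };

// Classify the step from `previous` to `current` (both assumed non-negative).
GateStatus classifyGate(long previous, long current) {
    assert(previous >= 0 && current >= 0);
    const long delta = current - previous;
    if (delta == 1) return GateStatus::InOrder;     // expected increment
    if (delta > 1)  return GateStatus::Missed;      // one or more arrays skipped
    if (delta == 0) return GateStatus::Repeated;    // same array seen twice
    return GateStatus::OutOfOrder;                  // counter went backwards
}

// Number of arrays skipped between two consecutive observations.
long missedGates(long previous, long current) {
    return classifyGate(previous, current) == GateStatus::Missed
               ? current - previous - 1
               : 0;
}
```

A step from 10 to 12 is reported as Missed with one skipped array, whereas a step from 12 to 11 is reported as OutOfOrder instead of passing silently.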

Version 3 of the device driver, which does not use asyn, does not exhibit this behavior.  In fact, I can run the system with a trigger rate of up to 400 Hz before it starts falling apart.  Since I have a device driver that works (version 3), I could just use it and move on.  But at this point I am still reluctant to abandon the asynPortDriver version.  Can you think of some tests that could help me understand what is causing this problem?
Thanks,
Zen

> -----Original Message-----
> From: Mark Rivers [mailto:rivers@cars.uchicago.edu]
> Sent: Sunday, July 15, 2012 6:59 PM
> To: Szalata, Zenon M.
> Cc: tech-talk@aps.anl.gov; Williams Jr., Ernest L.
> Subject: RE: asynPortDriver
> 
> Hi Zen,
> 
> > In the first version of the device driver, I saw the effect that Kevin
> > Tsubota described in his reply to my email. This is what he said:
> 
> > "At Keck we ran into a similar problem where we weren't seeing all the
> > events we expected with a different driver and it turned out to be that ASYN
> > only posts on change."
> 
> OK, there are 3 parts to my reply about that:
> 
> 1) This is not an issue with ASYN in general, it is specific to asynPortDriver.
> 
> 2) A future version of asynPortDriver should have a variant of the setXXXParam
> that takes a "force" flag, i.e.
> setIntegerParam(addr, myParam, value, forceCallback) which will force a
> callback the next time callParamCallbacks(addr) is called, even if the value has
> not changed.
> 
> 3) Meanwhile it is trivial to force such behavior in the current version by doing
> setIntegerParam(addr, myParam, value+1); setIntegerParam(addr, myParam,
> value);
> 
> This will fool the parameter library into thinking that the value has changed so it
> will do the callback on "value".  The overhead of doing the additional call to
> setIntegerParam is insignificant.
> 
> > In all three versions of the device driver, the data is read out at
> > interrupt level, which is needed to clear the interrupt.  Then I
> > schedule a callback to continue processing the data.  This consists of
> > unpacking the data words into various components.
> 
> I would have to see your code, but what you are doing sounds dangerous.
> What if a second interrupt occurs before the callback occurs to process the
> first?  If your data is not put in a queue, then you risk missing data.  In my Ip330
> and ipUnidig drivers I send the data from the interrupt service routine to the
> thread that will do callbacks via an epicsMessageQueue.  Then even if the
> thread that does callbacks momentarily falls behind I will not lose data unless
> the message queue overflows, and I have a flag to signal if that occurs.
> 
> Mark
> 
> ________________________________________
> From: Szalata, Zenon M. [zms@slac.stanford.edu]
> Sent: Sunday, July 15, 2012 8:20 PM
> To: Mark Rivers
> Cc: tech-talk@aps.anl.gov; Williams Jr., Ernest L.
> Subject: RE: asynPortDriver
> 
> Hi Mark,
> In the first version of the device driver, I saw the effect that Kevin Tsubota
> described in his reply to my email.  This is what he said:
> 
> "At Keck we ran into a similar problem where we weren't seeing all the events
> we expected with a different driver and it turned out to be that ASYN only
> posts on change."
> 
> In all three versions of the device driver, the data is read out at interrupt level,
> which is needed to clear the interrupt.  Then I schedule a callback to continue
> processing the data.  This consists of unpacking the data words into various
> components.
> In the case of the first version of the device driver, the callback requests to
> process the records are done in a for loop.  Perhaps I have abandoned the first
> version too quickly.
> Anyway it was clear to me that it would be better to bundle the trigger number
> and data from all ADC channels and pass that to the db layer.
> 
> I will modify the C subroutine of the subroutine record to also print a message
> when the current gate number is less than the previous value.
> 
> Thanks,
> Zen
> 
> > -----Original Message-----
> > From: Mark Rivers [mailto:rivers@cars.uchicago.edu]
> > Sent: Sunday, July 15, 2012 3:19 PM
> > To: Szalata, Zenon M.
> > Cc: tech-talk@aps.anl.gov; Williams Jr., Ernest L.
> > Subject: RE: asynPortDriver
> >
> > How are you passing the data from your interrupt service routine to
> > the thread that is calling callParamCallbacks()?  Are you using a
> > mechanism with a queue, like an epicsMessageQueue?
> >
> > I'd like to understand why "the order of record processing is somewhat
> > random" in method 1.  I would think that it would be deterministic
> > because there is only 1 callback thread that is processing all the
> > records in the order in which the callback requests were issued, which
> > I assume your driver does in a loop like:
> >
> > for (i=0; i<16; i++) {
> >   callParamCallbacks(i);
> > }
> >
> > Mark
> >
> > ________________________________
> > From: Szalata, Zenon M. [zms@slac.stanford.edu]
> > Sent: Sunday, July 15, 2012 12:15 PM
> > To: Mark Rivers
> > Cc: tech-talk@aps.anl.gov; Williams Jr., Ernest L.
> > Subject: asynPortDriver
> >
> > Hi Mark,
> > I am trying to understand why my device driver for a CAEN VME gated
> > ADC module, which is based on the asynPortDriver class, works incorrectly.
> > The module has 16 ADC channels and is set up to generate an interrupt
> > for each gate signal.  The interrupt triggers data readout and record
> > processing.  For testing, the trigger/gate rate is 100 Hz.  The device
> > driver maintains a gate counter, which gets incremented each time the data
> > is processed.
> > So far I have three versions of the device driver.
> >
> > 1. In this version I have 16 I/O Intr scanned ai records.  The gate
> >    counter is pushed to a longin record.  This version has the problem
> >    that the order of record processing is somewhat random, and the IOC
> >    using this VME module requires the processing to be deterministic.
> >    To overcome this difficulty, I wrote the second version.
> >
> > 2. In this version I pack the gate counter and all 16 ADC channel
> >    values into an array, and this array gets pushed to a waveform
> >    record.  Then the data gets distributed to ai records using
> >    subArray records.  This part works fine.  I have a subroutine record
> >    which is processed each time the data waveform record is processed.
> >    It checks for missing triggers by expecting that the gate counter
> >    value for this data array is larger by 1 than the previous one.
> >    This is where the device driver code seems to fail.
> >    I let the IOC run for a bit over 35 hours.  More precisely, the time
> >    corresponds to 12700898 triggers.  During this time the subroutine
> >    record reported 42 missing triggers.  They occurred spread out
> >    throughout the run, nearly but not equally spaced in time.  I have a
> >    simple print statement in the C routine which prints a few numbers.
> >    These numbers tell me that the trigger incremented by 2.
> >    I just looked at the logic in the C routine and I see that I am not
> >    printing messages when the new gate counter is less than the
> >    previous one.  So it is possible that from time to time the waveform
> >    record gets the data out of order.  It is also possible, but I think
> >    unlikely, that from time to time the data is lost.
> >
> > 3. This version of the device driver is non-asyn.  It consists of two
> >    parts, device support and record support.  I wrote it by taking the
> >    second version of the device driver and replacing the asynPortDriver
> >    part with device support.  Also, I created a new version of the IOC
> >    by taking the IOC which used the second version of the device driver
> >    and modifying the db files and the st.cmd file as needed.  I ran
> >    this IOC version overnight; so far it has processed four and a half
> >    million triggers without a single missed trigger.
> >
> > I wonder if the behavior observed using the second version of the
> > device driver is to be expected?
> > Does it mean that for some applications the asynPortDriver approach
> > to device driver implementation might not be appropriate?
> > Zen


