EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Get new, improved Matlab CA support, and crash your IOC with put callbacks
From: Kay-Uwe Kasemir <[email protected]>
To: tech talk <[email protected]>
Date: Wed, 01 Nov 2006 10:34:39 -0500
Hi:

For those of you who use Matlab with the 'MCA' extension
for ChannelAccess connectivity, there's a new version.
Most important are improvements to the timeout handling
in 'mcaput', which is a put-callback in most cases,
and fewer (actually none so far) crashes on exiting Matlab.
More in the release notes copied below.

The Extensions/Clients link from the APS web page will direct you to
http://ics-web1.sns.ornl.gov/mca.html
which hasn't been updated, so you'll have to go directly to
http://ics-web1.sns.ornl.gov/~kasemir/mca/index.html
to get the latest.

I'm sorry to have missed Halloween by a day,
but here is the frightening news:
When using ca-put-callback, then waiting for the callback,
[Matlab:  mcaput(pv_handle, value);   ]
and then closing the channel,
[Matlab:  mcaclose(pv_handle);   ]
I managed to crash IOCs.
You should be able to do something similar with other clients
that support put-callback.

What can happen:
1) On the IOC, the 'put' executes.
2) write_notify_reply() starts, informs the client
   that the put-callback completed.
3) The client closes the connection, which invokes destroy_client()
   on the IOC, which deletes the client's "blockSem".
4) write_notify_reply() crashes in its final
    epicsEventSignal ( pClient->blockSem );

You have to get the timing 'just right', and it helps to have an IOC
which uses task priorities (vxWorks, Mac OSX, but by default not Linux),
since otherwise the thread which runs 'destroy_client' isn't as likely
to jump in just before the write_notify_reply() task runs epicsEventSignal().


A possible fix, although I can't be 100% sure that it doesn't cause new problems,
is to change the end of write_notify_reply() in base/src/rsrv/camessage.c like this:


    /* wait with this unlock...
    SEND_UNLOCK ( pClient );
    */

    /*
     * wakeup the TCP thread if it is waiting for a cb to complete
     */
    epicsEventSignal ( pClient->blockSem );

    /* ... until we're really done: */
    SEND_UNLOCK ( pClient );
}

I saw this with R3.14.8.2, but I think at least back to R3.13.7 all
versions of rsrv/camessage.c have a write_notify_reply() routine
which releases the pClient lock, and then uses pClient->blockSem.
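
To make the ordering argument concrete, here is a minimal single-threaded model of the race, not the actual rsrv code. It replays the worst-case interleaving by letting the teardown step try to run at every point where the notifier could be preempted. All names (demo_client, demo_try_destroy, demo_signal, demo_notify) are hypothetical, and the sketch assumes that the teardown path has to own the same lock before it may destroy the semaphore; the message above does not confirm that this holds for destroy_client().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the rsrv client; not the real struct. */
struct demo_client {
    int send_locked;   /* models SEND_LOCK / SEND_UNLOCK */
    int *block_sem;    /* models pClient->blockSem; NULL once destroyed */
};

/* Teardown step. ASSUMPTION: it must own the send lock before it may
 * destroy the semaphore. Returns 1 if it ran, 0 if it had to wait. */
static int demo_try_destroy(struct demo_client *c)
{
    if (c->send_locked)
        return 0;
    free(c->block_sem);
    c->block_sem = NULL;
    return 1;
}

/* Signal step. Returns 1 if the semaphore was still valid, 0 if the
 * real code would have touched a destroyed semaphore at this point. */
static int demo_signal(struct demo_client *c)
{
    if (c->block_sem == NULL)
        return 0;
    ++*c->block_sem;
    return 1;
}

/* Replays the worst-case interleaving for one ordering.
 * signal_before_unlock = 0: original order, 1: proposed order.
 * Returns the result of the signal step, or -1 if allocation failed. */
static int demo_notify(int signal_before_unlock)
{
    struct demo_client c;
    int ok;

    c.send_locked = 1;                 /* the lock is held on entry */
    c.block_sem = calloc(1, sizeof(int));
    if (c.block_sem == NULL)
        return -1;

    if (signal_before_unlock) {
        demo_try_destroy(&c);          /* blocked: lock is still held */
        ok = demo_signal(&c);
        c.send_locked = 0;             /* unlock moved to the very end */
    } else {
        c.send_locked = 0;             /* unlock first, as in the original */
        demo_try_destroy(&c);          /* teardown jumps in right here */
        ok = demo_signal(&c);
    }
    demo_try_destroy(&c);              /* teardown can always finish later */
    return ok;
}
```

In this model, demo_notify(0) reports that the signal hit a destroyed semaphore, while demo_notify(1) signals a valid one. Whether the real fix is equally safe depends on the locking assumption stated above.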

-Kay



From the MCA RELEASE_NOTES file:

- Found one reason for the occasional crash-on-exit:
  In mca_cleanup, the 'iterator' used the node that
  was just deleted to get the 'next' node.
- MCAPUT changes:
  It mostly uses a CA put-callback, which is good, because in Matlab
  scripts you often want to be assured that the 'put' finished before
  you move on.
  Issue 1:
  When used with arrays of PVs and scalar values to put like this:
     mcaput([pv1 pv2 pv3 ...], [scalar1 scalar2 scalar3 ...])
  ... it would silently _not_ use put callback.
  Now the online help tells you so.

  Issue 2:
  MCAPUT always simply waited for the 'put' timeout.
  With the default 'put' timeout of 0.01 secs, it would often
  return 0 = "put didn't succeed" because it didn't wait long enough.
  Worse: It would actually return the status of the last callback,
  which might be the 'OK' from a previous caput, not the yet-to-arrive
  'not OK' from the ongoing caput.

  With a 'put' timeout of 10 secs, it would always wait 10 seconds,
  even if the put callback already arrived 'OK' after e.g. 0.02 secs.

  Now the 'put' timeout is the max. wait time, but mcaput() will
  otherwise return as soon as the 'put' callback arrives.
  Also changed the default 'put' timeout to 10 seconds, in fact
  all timeouts now default to 10 seconds.
- MCAPUT didn't give any error when writing strings to non-string PVs
and vice versa.
- Made the "No channel exists for this handle." error messages more
specific. The main reason for getting this error is timing.
Due to irreproducible circumstances, MCAOPEN will sometimes
fail to connect within the default timeout.
When MCAOPEN is called interactively, it will print a warning
message and return a 0 handle.
When invoked from inside a script, however, Matlab does not show
this warning message, and subsequent use of the 0 handle will then
result in this type of error.
- The default search timeout was increased from 1 to 10 seconds to
make this error less likely.
(You can still change it via MCATIMEOUT).
- Scripts that re-use the same PV, but don't want to bother
with elaborate bookkeeping of open connections, might
consider omitting the MCACLOSE calls, and use MCACHECKOPEN
in place of every MCAOPEN call.
This keeps all connections open and thus avoids re-connect timeouts.
Final cleanup can be done via MCAEXIT.
- Changed PV hash to template.
- Most messages are now more specific, displaying the invoked method
and the affected PV name.
- Use PVNAME_SZ from base instead of yet another #define.
- No more compiler warnings with g++ 4.0 and '-Wall'.
- monitor for 'STRING' PVs didn't update the alarm status/severity.
- added a mutex for the monitor cache, since both the monitor callback
and matlab access it concurrently.
- Arranged files in subdirectories.
- Added unit tests.
- Replaced ChannelAccess.cpp by inlines, no more IsInitialized() checks needed
because that's automatic.
- Reviewed all the online help, fixed some inconsistencies.
Example: MCAINFO and MCASTATE were wrong.
Since closed channels are simply gone, they don't show up
in the mcastate or mcainfo lists at all,
so they also don't show up as 'closed'.
- MCACHECK is now the same as MCASTATE.
- MCATIME used to return a Matlab datenum for the UTC time zone,
and since Matlab lacks timezone support, it was very hard to convert
that to 'local' time. Now MCATIME returns 'local' time.
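
The MCAPUT "Issue 2" item above describes two changes: the timeout becomes a maximum wait rather than a fixed one, and a leftover status from an earlier put must not be returned. The following is a rough sketch of that waiting logic, not MCA's actual implementation. The struct, the function names and the simulated clock are all hypothetical; demo_poll() merely stands in for whatever call lets the Channel Access library process events for a short slice of time.

```c
#include <assert.h>

/* Hypothetical per-put bookkeeping; not MCA's actual data structure. */
struct put_wait {
    int done;        /* set to 1 by the put callback */
    int status;      /* 1 = OK, 0 = not OK; only meaningful once done */
    int arrives_ms;  /* demo only: simulated callback arrival, <0 = never */
    int now_ms;      /* demo only: simulated clock */
};

/* Demo stand-in for "let Channel Access process events for slice_ms".
 * A real client would pump the CA library here instead. */
static void demo_poll(struct put_wait *w, int slice_ms)
{
    w->now_ms += slice_ms;
    if (!w->done && w->arrives_ms >= 0 && w->now_ms >= w->arrives_ms) {
        w->status = 1;   /* the simulated callback reports success */
        w->done = 1;
    }
}

/* Wait at most timeout_ms, but return as soon as the callback arrived.
 * Returns 1 only if this put's own callback reported OK. */
static int demo_wait_for_put(struct put_wait *w, int timeout_ms, int slice_ms)
{
    int waited_ms = 0;

    /* Clear leftovers so an 'OK' from an earlier put cannot be returned. */
    w->done = 0;
    w->status = 0;

    while (!w->done && waited_ms < timeout_ms) {
        demo_poll(w, slice_ms);
        waited_ms += slice_ms;
    }
    return w->done ? w->status : 0;
}
```

With a 10000 ms timeout and a callback that arrives after 20 ms, the wait ends after 20 simulated milliseconds. With no callback at all, it gives up once the timeout has elapsed and reports failure, even if the structure still held an 'OK' from a previous put.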
