EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: database race condition?
From: Till Straumann <[email protected]>
To: [email protected]
Date: Thu, 18 Mar 1999 15:07:16 +0100
One of our programmers consulted me because his device support module
deadlocked the database (i.e. one or two of the scanning tasks). This happened
when his module called dbGetField() to read a field of the very record that had
triggered the processing of the module's own record:

                      FLNK
       Record 1   ---------> Record 2
         ^                      |
         |                      |
         ------------------------
               dbGetField()

Of course, the deadlock occurred because Record1 had not yet finished
processing when its FLNK was processed. Record1 was therefore still
locked when dbGetField() tried to lock it again.
(The two records are not necessarily scanned by the same task.)

Nevertheless, I remembered that a very similar configuration works well:

                   FLNK
       Record1  -------->  Record2
          ^                   |
          |                   |
          ---------------------
                INP NPP

Here, Record2 obtains the value by dbGetLink() and no deadlock occurs.

Studying the source, I was surprised to learn that dbGetLink() (which calls dbGet)
not only does no record locking, but that EPICS also does not seem to implement
anything like a `field read/write access mutex'. A task reading a database
link (using dbGetLink()) may therefore read a partially written value from a
record, according to the following race condition:

  • Record A processing (i.e. in the context of a low-priority scanning task)
    starts writing a field.
  • Record B processing starts (in the context of a high-priority task),
    preempting the processing of record A. Record B processing calls
    dbGetLink(), trying to read exactly the field A is currently writing.
    Hence B will get only the partially written value!
  • Record A completes writing the field and finishes processing.

It is indeed very simple to observe this race condition. I created two stringin
records, A and B. B is scanned at `.1 second', has its INP field set to
"A NPP" and uses the devSiSoft device support. A is scanned less frequently and
has a device support module which (artificially slowly) modifies its value field.
Observing B shows that the described race condition does occasionally occur.
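
The interleaving described above can be modeled deterministically without any
real tasks. The following is only a toy sketch, not EPICS code: toy_record,
toy_get_link() and toy_slow_write() are hypothetical names, FIELD_SIZE is an
assumed field size, and the "preemption" is simulated by calling the reader
from the middle of the write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIELD_SIZE 40   /* assumed size of a string value field */

/* Hypothetical stand-in for a record with a string value field. */
typedef struct {
    char val[FIELD_SIZE];
} toy_record;

/* Stands in for the high-priority task that preempts the writer. */
typedef void (*preempt_hook)(const toy_record *src, char *dst);

/* Unlocked reader: copies the field as it is right now, without any lock. */
static void toy_get_link(const toy_record *src, char *dst)
{
    memcpy(dst, src->val, FIELD_SIZE);
}

/* Slow writer: stores the first half of the new value, is "preempted"
 * by the hook, then stores the second half plus the terminating NUL. */
static void toy_slow_write(toy_record *rec, const char *new_val,
                           preempt_hook hook, char *snapshot)
{
    size_t len = strlen(new_val);
    size_t half = len / 2;

    assert(len < FIELD_SIZE);
    memcpy(rec->val, new_val, half);            /* first part written */
    if (hook)
        hook(rec, snapshot);                    /* reader runs mid-write */
    memcpy(rec->val + half, new_val + half, len - half + 1);
}
```

If the field holds "oldoldold" and toy_slow_write() is asked to store
"NEWNEWNEW" with toy_get_link() as the hook, the snapshot taken by the reader
is "NEWNldold", a value that record A never intended to publish.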

Did I miss something? Wouldn't locking at a finer granularity than a whole
scanLock set make sense to prevent this kind of race condition?
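
To make the question concrete, a per-field lock could look roughly like the
sketch below. This is an illustration only and does not reflect how EPICS is
implemented: guarded_record, guarded_init(), guarded_put() and guarded_get()
are hypothetical names, FIELD_SIZE is an assumed field size, and a POSIX mutex
is used merely as a stand-in for whatever primitive the target system provides.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define FIELD_SIZE 40   /* assumed size of a string value field */

/* Hypothetical record whose value field carries its own mutex. */
typedef struct {
    pthread_mutex_t val_lock;
    char val[FIELD_SIZE];
} guarded_record;

static void guarded_init(guarded_record *rec)
{
    pthread_mutex_init(&rec->val_lock, NULL);
    memset(rec->val, 0, FIELD_SIZE);
}

/* Writer: the whole update happens while the field lock is held. */
static void guarded_put(guarded_record *rec, const char *new_val)
{
    assert(strlen(new_val) < FIELD_SIZE);
    pthread_mutex_lock(&rec->val_lock);
    strcpy(rec->val, new_val);
    pthread_mutex_unlock(&rec->val_lock);
}

/* Reader: copies the field only while no writer holds the lock. */
static void guarded_get(guarded_record *rec, char *dst)
{
    pthread_mutex_lock(&rec->val_lock);
    memcpy(dst, rec->val, FIELD_SIZE);
    pthread_mutex_unlock(&rec->val_lock);
}
```

Because the lock covers only a single field copy, it is held very briefly and
never while other records are processed, so it should not reintroduce the
FLNK deadlock shown earlier. With prioritized tasks, such a mutex would
probably need some protection against priority inversion.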

Best regards.

Till Straumann (PTB/Bessy II, Berlin)
 


Replies:
Re: database race condition? Marty Kraimer
Re: database race condition? Marty Kraimer
