Hi all -
While sitting at our test stand again, trying to reproduce the IOC's
crashes, we came up with a scenario in which the onceQ gets filled with
crap without any faulty driver or device support doing weird things.
[First of all: During the last week I have learned *a lot* about CA and the
database by reading the sources, and I found a number of the ideas in the
concept and design really amazing.]
We found that the crap in the onceQ looked as if a rngBufPut() call had
been interrupted by another rngBufPut() call.
This is an excerpt from the onceQ buffer:
00e3e010: 00f7941c 00f7941c 00f7941c 00f7941c *................*
00e3e020: 00f7941c 00f7941c 00f7941c 00f7941c *................*
00e3e030: f7941c00 f7941c00 f7941c00 f7941c00 *................*
00e3e040: 941c00f7 941c00f7 941c00f7 941c00f7 *................*
00e3e050: 941c00f7 1c00f794 1c00f794 1c00f794 *................*
00e3e060: 1c00f794 1c00f794 1c00f794 00f7941c *................*
00f7941c is a valid record pointer. Do you see how bytes get swallowed from
time to time?
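
To make the "swallowed bytes" easier to see, here is a minimal sketch that checks each word of the excerpt against the byte rotations of the record pointer and counts how many bytes must have gone missing. It assumes the dump is a big-endian word view, so that one lost byte in the stream shows up as a left rotation by one byte, and that each change of alignment is caused by the smallest possible number of lost bytes. The helper names (rotl_bytes, byte_shift, swallowed_bytes) are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 24 words of the onceQ excerpt quoted above, in address order. */
static const uint32_t dump[] = {
    0x00f7941c, 0x00f7941c, 0x00f7941c, 0x00f7941c,
    0x00f7941c, 0x00f7941c, 0x00f7941c, 0x00f7941c,
    0xf7941c00, 0xf7941c00, 0xf7941c00, 0xf7941c00,
    0x941c00f7, 0x941c00f7, 0x941c00f7, 0x941c00f7,
    0x941c00f7, 0x1c00f794, 0x1c00f794, 0x1c00f794,
    0x1c00f794, 0x1c00f794, 0x1c00f794, 0x00f7941c,
};

/* Rotate a 32-bit word left by k bytes (0 <= k <= 3). */
static uint32_t rotl_bytes(uint32_t x, int k)
{
    assert(k >= 0 && k < 4);
    return k == 0 ? x : (x << (8 * k)) | (x >> (32 - 8 * k));
}

/* Return k such that word == rotl_bytes(ptr, k), or -1 if none matches. */
static int byte_shift(uint32_t word, uint32_t ptr)
{
    for (int k = 0; k < 4; k++)
        if (rotl_bytes(ptr, k) == word)
            return k;
    return -1;
}

/* Count the bytes missing from the stream, assuming every change of
 * alignment is caused by the smallest possible number of lost bytes. */
static int swallowed_bytes(const uint32_t *words, size_t n, uint32_t ptr)
{
    int lost = 0, prev = 0;
    for (size_t i = 0; i < n; i++) {
        int k = byte_shift(words[i], ptr);
        if (k < 0)
            return -1;              /* word is not a rotation of ptr */
        lost += (k - prev + 4) % 4; /* alignment moved by this many bytes */
        prev = k;
    }
    return lost;
}
```

Calling swallowed_bytes(dump, sizeof dump / sizeof dump[0], 0x00f7941c) walks the excerpt in address order; under the assumptions above, every word is a rotation of the pointer and the alignment changes four times, by one byte each time.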
This is what we think is happening: [I'm giving more details than necessary
to give you the chance to find exactly the point where we are wrong ...;-]
scanOnce() is the only function that writes into the onceQ.
The onceQ is not guarded by a mutex semaphore, so the design must rely on
one scanOnce() call never being interrupted by another.
scanOnce() may be called from two different function contexts:
- recGblFwdLink() calls scanOnce() if the record is to be reprocessed. This
is part of record processing, so possible task contexts are all cb and
scan tasks (once and periodic) running at priorities between 51 and 65.
- the event_task() function calls scanOnce() indirectly by executing
event_read() on each event it takes out of its event queue. event_read()
calls the event->user_sub() which depends on the event type. The
user_sub() may be (e.g.) eventCallback(), accessRightsCallback() or
connectionCallback() from dbCa.c which call scanOnce() if the link's type
is CP or CPP. The possible task contexts for this are CA event tasks (at
priority 181) and the EV dbCaLink task (at priority 99) which handles the
event queue for the "user" database.
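
The task contexts listed above matter because of how the scheduler orders them. As a rough sketch, assuming strict priority scheduling in which the lower priority number wins (which is how the priorities quoted here are used), picking the next task to run could look like the following. The struct and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task descriptor, for illustration only. */
struct task {
    const char *name;
    int priority;   /* lower number = higher priority (assumption) */
    int ready;      /* nonzero if the task wants the CPU */
};

/* Return the ready task with the numerically lowest priority, or NULL
 * if no task is ready. */
static const struct task *next_to_run(const struct task *tasks, size_t n)
{
    const struct task *best = NULL;

    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].ready &&
            (best == NULL || tasks[i].priority < best->priority))
            best = &tasks[i];
    }
    return best;
}
```

With a CA event task (181) and the EV dbCaLink task (99) both ready, this picks the EV dbCaLink task; as soon as a periodic scan task (around 56) becomes ready, it wins over both.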
Imagine the following:
A CA event task (prio 181) is peacefully processing its queue and calls
scanOnce() for a CPP link. While scanOnce() uses rngBufPut() to put the
record pointer into the onceQ, a periodic scan task (prio ~56) gets ready
and interrupts. During record processing the monitor for a CPP link is
fired: db_post_events() puts an event into the database's event queue and
wakes up the owner, i.e. the EV dbCaLink task (prio 99) gets ready. After
the periodic scan task has finished, the EV dbCaLink task runs next, not
the CA event task, because its priority number (99) is lower than the CA
event task's (181) and lower numbers take precedence. The EV dbCaLink
task processes its event queue, calls the event->user_sub() of the CPP
monitor event and ... scanOnce() is called again, so one call to
rngBufPut() is interrupted by another rngBufPut(), which ruins the onceQ.
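
The interleaving described above can be replayed deterministically. The sketch below uses a hypothetical, much simplified ring buffer (it is not the real rngLib code, whose internals are not shown here) in which a put first reads the write index and copies the data, and only afterwards publishes the new index. Splitting the put into those two halves lets us place the preemption exactly between them. In this simplified model a whole entry is lost; it does not reproduce the byte-wise shifts seen in the dump, it only shows the class of failure that an unguarded, preempted put can cause.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 64

/* Hypothetical, simplified ring buffer: NOT the real rngLib code. */
struct ring {
    unsigned char buf[RING_SIZE];
    size_t to;                      /* index of the next byte to write */
};

/* What a put keeps in local variables while it is running. */
struct put_state {
    size_t to_seen;                 /* write index read at the start */
};

/* First half of a put: read the write index and copy the data. */
static void put_begin(struct ring *r, struct put_state *s,
                      const void *data, size_t n)
{
    assert(r->to + n <= RING_SIZE); /* no wraparound in this sketch */
    s->to_seen = r->to;
    memcpy(r->buf + s->to_seen, data, n);
}

/* Second half of a put: publish the new write index. */
static void put_commit(struct ring *r, const struct put_state *s, size_t n)
{
    r->to = s->to_seen + n;
}

/* Replay the scenario: put A is preempted between its two halves by a
 * complete put B. Returns the number of bytes the ring claims to hold. */
static size_t replay_preempted_put(struct ring *r,
                                   uint32_t rec_a, uint32_t rec_b)
{
    struct put_state a, b;

    memset(r, 0, sizeof *r);
    put_begin(r, &a, &rec_a, sizeof rec_a);   /* CA event task starts   */
    put_begin(r, &b, &rec_b, sizeof rec_b);   /* EV dbCaLink preempts   */
    put_commit(r, &b, sizeof rec_b);          /* ... and finishes       */
    put_commit(r, &a, sizeof rec_a);          /* CA event task resumes  */
    return r->to;
}
```

Two record pointers (eight bytes) go in, but the ring ends up reporting only four bytes, and the slot holds B's value because A's copy was overwritten.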
Solution: guard the onceQ with a binary semaphore.
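
As an illustration of what such a guard looks like, here is a minimal, portable sketch. It swaps in a POSIX pthread mutex for the binary semaphore proposed above, purely so the example compiles outside the IOC; on the target one would use the operating system's own semaphore calls instead. The ring structure, the once_q variable and guarded_put() are hypothetical names for this sketch, and the buffer has no consumer and no wraparound.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 64

/* Hypothetical simplified queue with a lock around every update. */
struct ring {
    unsigned char buf[RING_SIZE];
    size_t to;                      /* index of the next byte to write */
    pthread_mutex_t lock;           /* stand-in for the semaphore */
};

static struct ring once_q = { .to = 0, .lock = PTHREAD_MUTEX_INITIALIZER };

/* Append one record pointer value. The index read, the copy and the
 * index update all happen while the lock is held, so a second caller
 * cannot slip in between them. Returns 0 on success, -1 if full. */
static int guarded_put(struct ring *r, uint32_t rec)
{
    int status = -1;

    assert(r != NULL);
    pthread_mutex_lock(&r->lock);
    if (r->to + sizeof rec <= RING_SIZE) {
        memcpy(r->buf + r->to, &rec, sizeof rec);
        r->to += sizeof rec;
        status = 0;
    }
    pthread_mutex_unlock(&r->lock);
    return status;
}
```

A task that calls guarded_put(&once_q, ...) while another task is inside it blocks at the lock until the first put has finished, instead of interleaving with it.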
What are we missing?
Ralph
--
__ Ralph Lange Email: [email protected]
/\ \ WWW: http://www.bessy.de/~lange
/ \ \ BESSY II
/ /\ \ \ Berliner Elektronenspeicherring- Snail: BESSY II
/ / /\ \ \ Gesellschaft fuer Synchrotron- Rudower Chaussee 5
/ / /__\_\ \ strahlung m.b.H. D-12489 Berlin, Germany
/ / /________\ Phone: +49 30 6392-4862
\/___________/ Control System Group Fax: ... -4859