EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: IOC hangs (still)
From: Marty Kraimer <[email protected]>
To: [email protected]
Date: Mon, 30 Jun 1997 08:33:42 -0500
Ralph Lange wrote:
> 
> Hi all -
> 
> While we're sitting at our test stand again trying to reproduce the IOC's
> crashes, we think we have thought of a scenario where the onceQ is getting
> filled with crap without some faulty driver or device support doing weird
> things.
> 
> [First of all: During the last week I have learned *a lot* about CA and the
> database by reading the sources and found a couple of things and thoughts
> in the concept and design really amazing.]
> 
> We found that the crap in the onceQ looked as if a rngBufPut() call
> was interrupted by itself.
> 
> This is an excerpt from the onceQ buffer:
> 
> 00e3e010:  00f7941c 00f7941c 00f7941c 00f7941c   *................*
> 00e3e020:  00f7941c 00f7941c 00f7941c 00f7941c   *................*
> 00e3e030:  f7941c00 f7941c00 f7941c00 f7941c00   *................*
> 00e3e040:  941c00f7 941c00f7 941c00f7 941c00f7   *................*
> 00e3e050:  941c00f7 1c00f794 1c00f794 1c00f794   *................*
> 00e3e060:  1c00f794 1c00f794 1c00f794 00f7941c   *................*
> 
> 00f7941c is a valid record pointer. Do you see how bytes get swallowed from
> time to time?
> 
> This is what we think is happening: [I'm giving more details than necessary
> to give you the chance to find exactly the point where we are wrong ...;-]
> 
> scanOnce() is the only function that writes into the onceQ.
> 
> The onceQ is not guarded by a mutex semaphore, so interruption of
> scanOnce() must be impossible by means of design.


It looks like you have found a bug. I will hang my head in shame!!
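
To make the hazard concrete, here is a toy illustration of how an unguarded put can lose an entry when it is interrupted by another put. This is only a sketch: toy_ring, toy_put and the hook that stands in for the interrupt are hypothetical names, this is not the rngLib implementation, and it does not claim to reproduce the exact byte-shifted pattern in the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SIZE 64

/* Toy ring buffer (hypothetical, NOT the VxWorks rngLib code). */
typedef struct {
    unsigned char buf[TOY_SIZE];
    size_t put_index;               /* next free byte; wraparound not modeled */
} toy_ring;

/* Hypothetical hook standing in for an interrupt that fires mid-put. */
typedef void (*toy_hook)(toy_ring *ring);

static int rec_a, rec_b;            /* stand-ins for two records */

/* Unguarded put: snapshot the index, copy, then publish the new index. */
static void toy_put(toy_ring *ring, const void *data, size_t n, toy_hook hook)
{
    size_t start = ring->put_index;         /* 1. snapshot */
    assert(start + n <= TOY_SIZE);
    if (hook) hook(ring);                   /* the "interrupt" arrives here */
    memcpy(&ring->buf[start], data, n);     /* 2. copy, using the stale snapshot */
    ring->put_index = start + n;            /* 3. publish, undoing the nested put */
}

/* The "interrupt routine": queues a second record pointer. */
static void nested_put(toy_ring *ring)
{
    void *pb = &rec_b;
    toy_put(ring, &pb, sizeof pb, NULL);
}

/* Queue two record pointers and return the number of bytes queued. */
static size_t toy_demo(int interrupted)
{
    toy_ring ring;
    void *pa = &rec_a;
    void *pb = &rec_b;

    memset(&ring, 0, sizeof ring);
    if (interrupted) {
        toy_put(&ring, &pa, sizeof pa, nested_put);   /* second put nested in first */
    } else {
        toy_put(&ring, &pa, sizeof pa, NULL);         /* two puts, one after another */
        toy_put(&ring, &pb, sizeof pb, NULL);
    }
    return ring.put_index;
}
```

With the two puts in sequence, toy_demo(0) reports two pointers' worth of bytes. With the second put nested inside the first, toy_demo(1) reports only one, because the outer put overwrites the nested entry and then publishes its stale index.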

> Solution: Guarding the onceQ with a binary semaphore.


I would rather keep the option of calling scanOnce from interrupt level.
I think it is better to use intLock/intUnlock.
This is what callback does so that callbackRequest
can be called from interrupt routines.

Thus scanOnce should be:

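void scanOnce(void *precord)
{
    static int newOverflow=TRUE;  /* report each overflow episode only once */
    int lockKey;
    int nput;

    /* lock out interrupts so the put cannot be interrupted by another put */
    lockKey = intLock();
    nput = rngBufPut(onceQ,(void *)&precord,sizeof(precord));
    intUnlock(lockKey);
    if(nput!=sizeof(precord)) {
        /* the whole pointer did not fit: the queue has overflowed */
        if(newOverflow)errMessage(0,"rngBufPut overflow in scanOnce");
        newOverflow = FALSE;
    }else {
        newOverflow = TRUE;
    }
    /* wake up the task that processes the onceQ */
    semGive(onceSem);
}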


This will be in the next beta release.
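
For anyone who wants to try the lock-around-put pattern on a host machine without VxWorks, a minimal sketch follows. Everything in it is an assumption made for illustration: sketch_int_lock/sketch_int_unlock merely stand in for intLock/intUnlock, sketch_put is an all-or-nothing stand-in for rngBufPut, and an overflow counter replaces errMessage. It only shows the shape of the critical section and the report-once overflow logic, not real interrupt masking.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side stand-ins for the VxWorks calls (all names are hypothetical). */
static int sketch_lock_depth;                 /* 0 means "interrupts enabled" */
static int  sketch_int_lock(void)      { return sketch_lock_depth++; }
static void sketch_int_unlock(int key) { sketch_lock_depth = key; }

#define SKETCH_SLOTS 4
static void  *sketch_queue[SKETCH_SLOTS];     /* stand-in for the onceQ */
static size_t sketch_count;
static int    sketch_overflow_reports;        /* stand-in for errMessage calls */

/* All-or-nothing put; returns the number of bytes stored. */
static size_t sketch_put(void *precord)
{
    if (sketch_count == SKETCH_SLOTS) return 0;
    sketch_queue[sketch_count++] = precord;
    return sizeof precord;
}

/* Same shape as scanOnce: guard only the put, report each overflow once. */
static void sketch_scan_once(void *precord)
{
    static int new_overflow = 1;
    int key;
    size_t nput;

    key = sketch_int_lock();
    assert(sketch_lock_depth == key + 1);     /* we are inside the critical section */
    nput = sketch_put(precord);
    sketch_int_unlock(key);

    if (nput != sizeof precord) {
        if (new_overflow) sketch_overflow_reports++;
        new_overflow = 0;
    } else {
        new_overflow = 1;
    }
}

/* Fill the queue, push three more requests, return how many reports were made. */
static int sketch_fill_and_overflow(void)
{
    static int rec;
    int i;

    for (i = 0; i < SKETCH_SLOTS + 3; i++)
        sketch_scan_once(&rec);
    return sketch_overflow_reports;
}
```

The critical section is kept as short as possible (just the put), so the time spent with interrupts locked stays small, and the overflow message is produced outside the locked region.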


> What are we missing?

Nothing!!!

THANKS.

By the way, let us know if your problem goes away.

Also keep looking at code. We really really appreciate your
finding problems.

Marty Kraimer

Replies:
Re: IOC hangs (no more) Ralph Lange
References:
Re: IOC hangs (still) Ralph Lange
