EPICS Home

Experimental Physics and Industrial Control System


 

Subject: Solution to a different version of the S_errno_ENOBUFS problem
From: "Redman, Russell O." <[email protected]>
To: "'[email protected]'" <[email protected]>
Date: Fri, 14 Jan 2005 08:17:14 -0500

I encountered a somewhat different version of the ENOBUFS problem, but seem to have cured it.  I am by no means an expert in these issues, but enough people have encountered ENOBUFS that there may be some interest in my cure.

I am running EPICS R3.13.8 on an MVME2402-3.  I build vxWorks using Tornado 2.0.2.

The IOC would refuse to start the Channel Access repeater during iocInit.  I am writing this message on a different machine and did not save screen dumps of the boot sequence, so I cannot reproduce the exact error messages (which were different in each test that I made anyway), but the upshot was always that the system could not allocate a socket to start the repeater.  There were also occasional complaints about a bind error.  The sequence always ended with S_errno_ENOBUFS.  I was able to isolate the system from the rest of the network, so I am quite sure that network traffic was not an issue.  Only iocInit was running (and any subtasks that it spawned), so higher-priority tasks should not have been in the way.  This looked very much like the resource starvation discussed for an MVME162 by "Zoltan Kakucs" on 20 Oct 2003 (see "Re: CA block sem corrupted error and S_errno_ENOBUFS" in the tech-talk archives), but his detailed solution does not apply to my BSP, which uses a very different set of #define's.  Jeff Hill also contributed a useful discussion of a related problem on 29 Jan 2003.

casr verified that there were no channels connected.  Similarly, inetstatShow verified that there were no active network connections, and netStackDataPoolShow revealed no problems.  However, netStackSysPoolShow claimed that the system pool had been drained 3 times, and the number of sockets was 16, the maximum number of MUX bindings allowed.  I do open a lot of serial ports using tnetdev to access a pair of networked terminal servers.  In Tornado, I therefore went to

- network components
-- basic network initialization
--- network buffer initialization
and arbitrarily doubled the number of system buffers from 64 to 128:
NUM_SYS_64  = 128
NUM_SYS_128 = 128
NUM_SYS_256 = 128
NUM_SYS_512 = 128

Because of the complaints about bindings, and the suspicious equality between the number of sockets in use and MUX_MAX_BINDS, I also doubled MUX_MAX_BINDS from 16 to 32.

- network components
-- basic network initialization
--- network buffer initialization
MUX_MAX_BINDS = 32
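
For readers who build vxWorks from the BSP directory rather than through the Tornado project facility, the same settings might be written as macro overrides.  This is only a sketch: the changes described above were made in the Tornado configuration tool, and placing the overrides in the BSP's config.h after its include of configAll.h is an assumption that should be checked against your own BSP.

```c
/*
 * Sketch of the overrides as they might appear in a BSP config.h,
 * placed after the line that includes configAll.h (assumed location).
 * The values are the ones reported in this message.
 */

/* System pool: buffer count per size class, doubled from 64 to 128 */
#undef  NUM_SYS_64
#define NUM_SYS_64      128
#undef  NUM_SYS_128
#define NUM_SYS_128     128
#undef  NUM_SYS_256
#define NUM_SYS_256     128
#undef  NUM_SYS_512
#define NUM_SYS_512     128

/* Maximum number of MUX bindings, doubled from 16 to 32 */
#undef  MUX_MAX_BINDS
#define MUX_MAX_BINDS   32
```

The #undef before each #define avoids a redefinition warning when a default has already been set by an earlier header.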

Rebuilding vxWorks and rebooting the IOC, I now find that the repeater starts properly, and that netStackSysPoolShow reports the pool was never drained.  I also have 19 sockets open - no wonder that MUX_MAX_BINDS=16 gave trouble.

Hope this is helpful for someone else.
Russell O. Redman

