EPICS Controls Argonne National Laboratory

Experimental Physics and Industrial Control System


Subject: Re: vxWorks network problem on MVME2700
From: "Martin L. Smith" <[email protected]>
To: Mark Rivers <[email protected]>
Cc: TechTalk EPICS <[email protected]>
Date: Mon, 15 Oct 2007 06:15:24 -0500

Hi Mark,

You might run the inetstatShow command and see whether any sockets have
a large Send-Q.  If any do, convert the given IP address into a machine
name, log into that machine, and see what clients running on it might
be causing the problem.
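
For reference, inetstatShow is part of the standard vxWorks netShow
library, which is clearly present here since ifShow works.  A minimal
sketch of that check on the affected IOC, with tcpstatShow from the
same library added as an extra data point, would be:

ioc13ida> inetstatShow
ioc13ida> tcpstatShow

inetstatShow lists each open socket with its Recv-Q and Send-Q byte
counts; tcpstatShow prints the aggregate TCP counters, where a steadily
climbing retransmit count would be another sign that the send side is
wedged.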

You could also try casr to see how many CA client connections this IOC
has.  I have seen cases where too many connections to an IOC cause
problems of this sort, and can even prevent CA clients from connecting
to a PV.
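
casr takes an optional interest level argument, so a quick check from
the target shell is just (a sketch; level 1 should list the connected
clients and level 2 is considerably more verbose, though the exact
output depends on the Base version):

ioc13ida> casr(1)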

If there are quite a few connections to this IOC, you might also try
iosFdShow to list the open file descriptors; I think that with
asd9-nodns the maximum number is about 255.
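
iosFdShow takes no arguments and simply dumps the fd table, so seeing
how close the IOC is to that ceiling is just (a sketch; the roughly-255
limit is presumably set in the vxWorks/BSP configuration, not by the
command itself):

ioc13ida> iosFdShow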

Hope this helps out,
Marty

Mark Rivers wrote:
Folks,

We have been getting network lockups on our MVME2700 boards.  This is
running vxWorks 5.4.2, EPICS 3.14.8.2, and Andrew Johnson's latest BSP
asd9-nodns.

When it happens, the network appears to still be receiving packets but
not sending any, as seen by two successive ifShow commands:

ioc13ida> ifShow("dc")
dc (unit number 0):
Flags: (0x8063) UP BROADCAST MULTICAST ARP RUNNING Type: ETHERNET_CSMACD
Internet address: 164.54.160.75
Broadcast address: 164.54.160.255
Netmask 0xffff0000 Subnetmask 0xffffff00
Ethernet address is 08:00:3e:2f:39:46
Metric is 0
Maximum Transfer Unit size is 1500
0 octets received
0 octets sent
83065850 packets received
122169709 packets sent
83065850 unicast packets received
122118974 unicast packets sent
0 non-unicast packets received
50735 non-unicast packets sent
0 input discards
0 input unknown protocols
1407 input errors
2834 output errors
0 collisions; 0 dropped
ioc13ida> ifShow("dc")
dc (unit number 0):
Flags: (0x8063) UP BROADCAST MULTICAST ARP RUNNING Type: ETHERNET_CSMACD
Internet address: 164.54.160.75
Broadcast address: 164.54.160.255
Netmask 0xffff0000 Subnetmask 0xffffff00
Ethernet address is 08:00:3e:2f:39:46
Metric is 0
Maximum Transfer Unit size is 1500
0 octets received
0 octets sent
83065862 packets received
122169709 packets sent
83065862 unicast packets received
122118974 unicast packets sent
0 non-unicast packets received
50735 non-unicast packets sent
0 input discards
0 input unknown protocols
1419 input errors
2858 output errors
0 collisions; 0 dropped


The above shows that the number of packets received is increasing, but
the number of packets sent is not.  It also shows that the number of
input and output errors is increasing.

mbufShow shows no free buffers of size 128 and above.

ioc13ida> mbufShow
type number
--------- ------
FREE : 150
DATA : 581
HEADER : 69
SOCKET : 0
PCB : 0
RTABLE : 0
HTABLE : 0
ATABLE : 0
SONAME : 0
ZOMBIE : 0
SOOPTS : 0
FTABLE : 0
RIGHTS : 0
IFADDR : 0
CONTROL : 0
OOBDATA : 0
IPMOPTS : 0
IPMADDR : 0
IFMADDR : 0
MRTABLE : 0
TOTAL : 800
number of mbufs: 800
number of times failed to find space: 22579
number of times waited for space: 0
number of times drained protocols for space: 22526
__________________
CLUSTER POOL TABLE
________________________________________________________________________________
size     clusters  free      usage
--------------------------------------------------------------------------------
64       125       50        68411923
128      400       0         115759801
256      50        0         45096817
512      25        0         11828199
1024     25        0         8333
2048     25        0         1558940
--------------------------------------------------------------------------------


netStackSysPoolShow shows no problems:

ioc13ida> netStackSysPoolShow
type number
--------- ------
FREE : 732
DATA : 0
HEADER : 0
SOCKET : 95
PCB : 116
RTABLE : 75
HTABLE : 0
ATABLE : 0
SONAME : 0
ZOMBIE : 0
SOOPTS : 0
FTABLE : 0
RIGHTS : 0
IFADDR : 4
CONTROL : 0
OOBDATA : 0
IPMOPTS : 0
IPMADDR : 2
IFMADDR : 0
MRTABLE : 0
TOTAL : 1024
number of mbufs: 1024
number of times failed to find space: 0
number of times waited for space: 0
number of times drained protocols for space: 0
__________________
CLUSTER POOL TABLE
________________________________________________________________________________
size     clusters  free      usage
--------------------------------------------------------------------------------
64       256       206       65
128      256       154       63893
256      256       211       4103
512      256       161       63886
--------------------------------------------------------------------------------
value = 80 = 0x50 = 'P'



Finally, a command Andrew added to show the Ethernet driver pool shows that it has no free clusters:


ioc13ida> endPoolShow
Device name needed, e.g. "ei" or "dc"
value = -1 = 0xffffffff = ipAddrToAsciiEnginePrivate type_info node + 0xfe2f46af
ioc13ida> endPoolShow("dc")
type number
--------- ------
FREE : 432
DATA : 80
HEADER : 0
SOCKET : 0
PCB : 0
RTABLE : 0
HTABLE : 0
ATABLE : 0
SONAME : 0
ZOMBIE : 0
SOOPTS : 0
FTABLE : 0
RIGHTS : 0
IFADDR : 0
CONTROL : 0
OOBDATA : 0
IPMOPTS : 0
IPMADDR : 0
IFMADDR : 0
MRTABLE : 0
TOTAL : 512
number of mbufs: 512
number of times failed to find space: 0
number of times waited for space: 0
number of times drained protocols for space: 0
__________________
CLUSTER POOL TABLE
________________________________________________________________________________
size     clusters  free      usage
--------------------------------------------------------------------------------
1520     208       0         205183548
--------------------------------------------------------------------------------
ioc13ida>


Another command, netQueueShow, shows that the dc0 transmit queue is
full and that there have been 249 dropped transmissions:

ioc13ida> netQueueShow
IP Rx queue: len = 0, max = 50, drops = 0
ARP Rx queue: len = 0, max = 50, drops = 0
dc0 Tx queue: len = 50, max = 50, drops = 249
value = 0 = 0x0
ioc13ida>


Has anyone else been seeing such problems?  This is happening on at
least 3 IOCs, with a frequency of about once per week per IOC.  That is
too rare for easy debugging, but too frequent to live with.  There does
not seem to be any correlation with anything unusual happening on the
IOC.

Mark


