EPICS Controls - Argonne National Laboratory
Experimental Physics and Industrial Control System


Subject: RE: vxWorks network problem on MVME2700
From: "Thompson, David H." <[email protected]>
To: Maren Purves <[email protected]>, Mark Rivers <[email protected]>
Cc: TechTalk EPICS <[email protected]>
Date: Tue, 16 Oct 2007 05:45:44 -0400
The dec driver has a couple of different problems.  One: the driver allocates loan buffers out of the TX pool, so if the TX pool isn't large, a starved CA client or CA repeater on the IOC can cause ENOBUFS.  The other: when the Ethernet MAC interface is not able to get rid of packets because the physical layer is down, UDP packets back up in the buffer pool and eventually break the driver.
 
WRS has a fix for the latter problem that I have not tested yet.
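
For what it's worth, the state of the driver's pool can be watched from
code as well as from the shell.  A minimal sketch, assuming a vxWorks 5.4
MUX/END build (endFindByName() from muxLib, netClPoolIdGet() from
netBufLib, and the END_OBJ pNetPool member); the function name and the
1520-byte cluster size are illustrative, not from this thread:

#include <vxWorks.h>
#include <end.h>        /* END_OBJ, pNetPool */
#include <muxLib.h>     /* endFindByName() */
#include <netBufLib.h>  /* netClPoolIdGet(), CL_POOL_ID */
#include <logLib.h>

/* Warn when the END driver's cluster pool (shared by TX buffers
 * and loaned RX buffers) has no free clusters left. */
void endPoolCheck (char *devName, int unit)
    {
    END_OBJ    *pEnd;
    CL_POOL_ID  pClPool;

    pEnd = endFindByName (devName, unit);
    if (pEnd == NULL || pEnd->pNetPool == NULL)
        {
        logMsg ("endPoolCheck: no END device %s%d\n",
                (int) devName, unit, 0, 0, 0, 0);
        return;
        }

    /* bestFit = TRUE: locate the pool holding 1520-byte frame clusters */
    pClPool = netClPoolIdGet (pEnd->pNetPool, 1520, TRUE);
    if (pClPool != NULL && pClPool->clNumFree == 0)
        logMsg ("endPoolCheck: %s%d pool exhausted (%d clusters, 0 free)\n",
                (int) devName, unit, pClPool->clNum, 0, 0, 0);
    }

Called periodically from the shell or a task, e.g. endPoolCheck ("dc", 0),
this would flag starvation before the interface wedges completely.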
 

________________________________

From: [email protected] on behalf of Maren Purves
Sent: Fri 10/12/2007 5:10 PM
To: Mark Rivers
Cc: TechTalk EPICS
Subject: Re: vxWorks network problem on MVME2700



Mark,

this reminds me of a problem Ian Smith had at the ATC (Edinburgh) last
year. The solution isn't on the exploder (at least not if I search by
"Smith"), but I found something in my email:

-------------- quote -----------
I think I've eliminated almost everything and still have the problem.
I've set up another board, used the vxWorks that you sent me and the
problem persists.

I got one reply from the epics group:

"The folks at the SNS had to patch there vxWorks IP kernel against
defects related to mbuf starvation IP deadlocks (both in the kernel
itself and also in the NIC driver)."
------------- unquote ----------------

the "you" refers to is Craig Walther.

Hope this helps,
Maren
(Ian has since taken redundancy and may be hard to get hold of)


Mark Rivers wrote:
> Folks,
>
> We have been getting network lockups on our MVME2700 boards.  These are
> running vxWorks 5.4.2, EPICS 3.14.8.2, and Andrew Johnson's latest BSP
> asd9-nodns.
>
> When it happens, the network appears to still be receiving packets but
> not sending any, as seen in two successive ifShow commands:
>
> ioc13ida> ifShow("dc")
> dc (unit number 0):
>      Flags: (0x8063) UP BROADCAST MULTICAST ARP RUNNING
>      Type: ETHERNET_CSMACD
>      Internet address: 164.54.160.75
>      Broadcast address: 164.54.160.255
>      Netmask 0xffff0000 Subnetmask 0xffffff00
>      Ethernet address is 08:00:3e:2f:39:46
>      Metric is 0
>      Maximum Transfer Unit size is 1500
>      0 octets received
>      0 octets sent
>      83065850 packets received
>      122169709 packets sent
>      83065850 unicast packets received
>      122118974 unicast packets sent
>      0 non-unicast packets received
>      50735 non-unicast packets sent
>      0 input discards
>      0 input unknown protocols
>      1407 input errors
>      2834 output errors
>      0 collisions; 0 dropped
> ioc13ida>
> ioc13ida> ifShow("dc")
> dc (unit number 0):
>      Flags: (0x8063) UP BROADCAST MULTICAST ARP RUNNING
>      Type: ETHERNET_CSMACD
>      Internet address: 164.54.160.75
>      Broadcast address: 164.54.160.255
>      Netmask 0xffff0000 Subnetmask 0xffffff00
>      Ethernet address is 08:00:3e:2f:39:46
>      Metric is 0
>      Maximum Transfer Unit size is 1500
>      0 octets received
>      0 octets sent
>      83065862 packets received
>      122169709 packets sent
>      83065862 unicast packets received
>      122118974 unicast packets sent
>      0 non-unicast packets received
>      50735 non-unicast packets sent
>      0 input discards
>      0 input unknown protocols
>      1419 input errors
>      2858 output errors
>      0 collisions; 0 dropped
>
> The above shows that the number of packets received is increasing, but
> the number of packets sent is not.  It also shows that the number of
> input and output errors is increasing.
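
One way to catch this state automatically, rather than with back-to-back
ifShow calls, is to sample the interface's MIB-II counters from a
low-priority task.  A minimal sketch, assuming an END driver that keeps
its counters in the END_OBJ mib2Tbl member; the task body, names, and
the 10-second window are illustrative:

#include <vxWorks.h>
#include <end.h>        /* END_OBJ, mib2Tbl */
#include <muxLib.h>     /* endFindByName() */
#include <taskLib.h>
#include <sysLib.h>
#include <logLib.h>

/* Compare RX and TX unicast counters across an interval; if RX keeps
 * advancing while TX is frozen, the interface is likely wedged. */
void netTxStallWatch (char *devName, int unit)
    {
    END_OBJ *pEnd = endFindByName (devName, unit);
    ULONG    rxPrev, txPrev;

    if (pEnd == NULL)
        return;

    rxPrev = pEnd->mib2Tbl.ifInUcastPkts;
    txPrev = pEnd->mib2Tbl.ifOutUcastPkts;

    FOREVER
        {
        taskDelay (10 * sysClkRateGet ());   /* 10-second window */

        if (pEnd->mib2Tbl.ifInUcastPkts  != rxPrev &&
            pEnd->mib2Tbl.ifOutUcastPkts == txPrev)
            logMsg ("netTxStallWatch: %s%d receiving but not sending\n",
                    (int) devName, unit, 0, 0, 0, 0);

        rxPrev = pEnd->mib2Tbl.ifInUcastPkts;
        txPrev = pEnd->mib2Tbl.ifOutUcastPkts;
        }
    }

Spawned once at boot, e.g. taskSpawn ("tNetWatch", 200, 0, 4096,
(FUNCPTR) netTxStallWatch, (int) "dc", 0, 0, 0, 0, 0, 0, 0, 0, 0), it
would log the wedged state that the two ifShow snapshots above reveal.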
>
> mbufShow shows no free buffers of size 128 and above.
>
> ioc13ida> mbufShow
> type        number
> ---------   ------
> FREE    :    150
> DATA    :    581
> HEADER  :     69
> SOCKET  :      0
> PCB     :      0
> RTABLE  :      0
> HTABLE  :      0
> ATABLE  :      0
> SONAME  :      0
> ZOMBIE  :      0
> SOOPTS  :      0
> FTABLE  :      0
> RIGHTS  :      0
> IFADDR  :      0
> CONTROL :      0
> OOBDATA :      0
> IPMOPTS :      0
> IPMADDR :      0
> IFMADDR :      0
> MRTABLE :      0
> TOTAL   :    800
> number of mbufs: 800
> number of times failed to find space: 22579
> number of times waited for space: 0
> number of times drained protocols for space: 22526
> __________________
> CLUSTER POOL TABLE
> _______________________________________________________________________________
> size     clusters  free      usage
> -------------------------------------------------------------------------------
> 64       125       50        68411923
> 128      400       0         115759801
> 256      50        0         45096817
> 512      25        0         11828199
> 1024     25        0         8333
> 2048     25        0         1558940
> -------------------------------------------------------------------------------
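
For reference, the sizes of these data-pool clusters are fixed at kernel
build time; in vxWorks 5.x the counts come from the NUM_64 ... NUM_2048
macros (defaults in netBufLib.h, normally overridden in the BSP's
config.h before configAll.h is included).  The counts below are purely
illustrative, not a tested recommendation:

/* config.h: enlarge the network stack data pool (illustrative values) */
#undef  NUM_64
#define NUM_64    200
#undef  NUM_128
#define NUM_128   800     /* the pool that is exhausted above */
#undef  NUM_256
#define NUM_256   100
#undef  NUM_512
#define NUM_512   50
#undef  NUM_1024
#define NUM_1024  50
#undef  NUM_2048
#define NUM_2048  50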
>
> netStackSysPoolShow shows no problems:
>
> ioc13ida> netStackSysPoolShow
> type        number
> ---------   ------
> FREE    :    732
> DATA    :      0
> HEADER  :      0
> SOCKET  :     95
> PCB     :    116
> RTABLE  :     75
> HTABLE  :      0
> ATABLE  :      0
> SONAME  :      0
> ZOMBIE  :      0
> SOOPTS  :      0
> FTABLE  :      0
> RIGHTS  :      0
> IFADDR  :      4
> CONTROL :      0
> OOBDATA :      0
> IPMOPTS :      0
> IPMADDR :      2
> IFMADDR :      0
> MRTABLE :      0
> TOTAL   :    1024
> number of mbufs: 1024
> number of times failed to find space: 0
> number of times waited for space: 0
> number of times drained protocols for space: 0
> __________________
> CLUSTER POOL TABLE
> _______________________________________________________________________________
> size     clusters  free      usage
> -------------------------------------------------------------------------------
> 64       256       206       65
> 128      256       154       63893
> 256      256       211       4103
> 512      256       161       63886
> -------------------------------------------------------------------------------
> value = 80 = 0x50 = 'P'
>
>
> Finally, a command Andrew added to show the Ethernet driver pool shows
> no free clusters:
>
>
> ioc13ida> endPoolShow
> Device name needed, e.g. "ei" or "dc"
> value = -1 = 0xffffffff = ipAddrToAsciiEnginePrivate type_info node + 0xfe2f46af
> ioc13ida> endPoolShow("dc")
> type        number
> ---------   ------
> FREE    :    432
> DATA    :     80
> HEADER  :      0
> SOCKET  :      0
> PCB     :      0
> RTABLE  :      0
> HTABLE  :      0
> ATABLE  :      0
> SONAME  :      0
> ZOMBIE  :      0
> SOOPTS  :      0
> FTABLE  :      0
> RIGHTS  :      0
> IFADDR  :      0
> CONTROL :      0
> OOBDATA :      0
> IPMOPTS :      0
> IPMADDR :      0
> IFMADDR :      0
> MRTABLE :      0
> TOTAL   :    512
> number of mbufs: 512
> number of times failed to find space: 0
> number of times waited for space: 0
> number of times drained protocols for space: 0
> __________________
> CLUSTER POOL TABLE
> _______________________________________________________________________________
> size     clusters  free      usage
> -------------------------------------------------------------------------------
> 1520     208       0         205183548
> -------------------------------------------------------------------------------
> ioc13ida>
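
endPoolShow is not part of stock vxWorks, but something equivalent can
be built from the standard show routines.  A sketch, assuming vxWorks
5.4's netPoolShow() and the END_OBJ pNetPool member (the function name
endPoolDump is made up here):

#include <vxWorks.h>
#include <end.h>        /* END_OBJ */
#include <muxLib.h>     /* endFindByName() */
#include <netShow.h>    /* netPoolShow(), assumed declared here */
#include <stdio.h>

/* Print the given END driver's private buffer pool, much as
 * mbufShow does for the network stack's data pool. */
void endPoolDump (char *devName, int unit)
    {
    END_OBJ *pEnd = endFindByName (devName, unit);

    if (pEnd == NULL || pEnd->pNetPool == NULL)
        printf ("endPoolDump: no END device %s%d\n", devName, unit);
    else
        netPoolShow (pEnd->pNetPool);
    }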
>
> Another command, netQueueShow, shows that there have been 249 dropped
> transmissions:
>
> ioc13ida> netQueueShow
>     IP Rx queue: len =    0, max =   50, drops =    0
>    ARP Rx queue: len =    0, max =   50, drops =    0
>    dc0 Tx queue: len =   50, max =   50, drops =  249
> value = 0 = 0x0
> ioc13ida>
>
> Has anyone else been seeing such problems?  This is happening on at
> least 3 IOCs with a frequency of about once per week per IOC.  Too rare
> for easy debugging, but too frequent to live with.  There does not seem
> to be any correlation with anything unusual happening on the IOC.
>
> Mark
>





References:
vxWorks network problem on MVME2700 Mark Rivers
Re: vxWorks network problem on MVME2700 Maren Purves
