EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: UDP to CA_UDP hangs network?
From: "Jeff Hill" <[email protected]>
To: "'Steven Hartman'" <[email protected]>, "'EPICS tech-talk'" <[email protected]>
Date: Mon, 17 Oct 2005 11:07:11 -0600
Steve,

The UDP receive thread in the CA server opens a socket and, by default, binds
it to port 5064. It listens for, and consumes, any UDP messages arriving at
this socket.
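As a rough illustration of that pattern (not the CA server's actual code),
here is a minimal sketch in Python, whose socket module wraps the same BSD
calls. The function names, the 0.0.0.0 bind address, and the timeout are
assumptions made for the example; only the default port 5064 comes from the
description above.

```python
import socket

CA_SERVER_PORT = 5064  # default port named in the message above


def open_receive_socket(port=CA_SERVER_PORT, host="0.0.0.0"):
    """Open a UDP socket and bind it to a fixed, well-known port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def consume(sock, max_messages, timeout=1.0):
    """Read and discard datagrams so the receive queue keeps draining.

    Returns the number of datagrams consumed. The timeout and the
    max_messages bound are illustrative; a real server thread would
    loop forever and parse each message.
    """
    sock.settimeout(timeout)
    count = 0
    while count < max_messages:
        try:
            sock.recvfrom(65535)  # large enough for any UDP payload
        except socket.timeout:
            break
        count += 1
    return count
```

Because something is always reading from this socket, datagrams sent to the
bound port do not pile up in its receive queue.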

> I don't have a full understanding of what is happening, or whether the
> problem is from VxWorks or EPICS, but here is what I have. CA_UDP opens a
> UDP port on the IOC using the next available port (typically 1027-1029 on
> my IOCs) for sending UDP beacons to listeners on UDP 5065 every 15
> seconds. 

The CA server's beacon generating thread also opens a socket and allows it
to be bound to an ephemeral (dynamically assigned) port. This socket is used
only for sending beacon messages. That code does not listen for, or consume,
any incoming messages destined for its socket.
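A send-only socket of this kind can be sketched as follows, again in Python
and again only as an illustration. The function names and the b"beacon"
payload are placeholders (this is not the real CA beacon format); the
destination port 5065 is the one quoted above.

```python
import socket

BEACON_DEST_PORT = 5065  # beacon destination port quoted in the message


def open_beacon_socket():
    """Open a UDP socket on an ephemeral port.

    Binding to port 0 asks the IP stack to pick any free port, which is
    how the socket can end up on something like 1027-1029.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", 0))
    return sock


def send_beacon(sock, dest_host, dest_port=BEACON_DEST_PORT,
                payload=b"beacon"):
    """Send one datagram; nothing here ever reads from the socket."""
    return sock.sendto(payload, (dest_host, dest_port))
```

Calling sock.getsockname()[1] after open_beacon_socket() shows which port
the stack assigned. Since no code ever calls recvfrom() on this socket, any
datagram that a third party sends to that port stays in the receive queue.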

> When a UDP packet is directed at CA_UDP's server port, however,
> something goes wrong. inetstatShow() shows a positive value for Recv-Q
> which never goes down, but increases as additional UDP traffic is directed
> at it. The size of this buffer is set in netLib.h (UDP_RCV_SIZE_DFLT) to
> 41600, but it does not seem to be the buffer filling, but the frequency of
> the traffic which locks up the network interface.

One would certainly expect that, if messages were sent to this UDP port, the
high water mark would be reached very quickly and that there would be no
negative impact on the IP kernel other than the network stack's data pool
being reduced in size by UDP_RCV_SIZE_DFLT bytes.
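That expectation can be checked on a typical desktop IP stack with a small
experiment like the sketch below. It says nothing about how the VxWorks
5.4.2 kernel behaves; the function name, the buffer size, and the datagram
counts are all assumptions chosen for the example.

```python
import socket


def flood_unread_socket(sent=2000, payload_size=1024, rcvbuf=8192):
    """Send many datagrams to a socket that nobody reads from.

    Returns how many datagrams were actually queued. On a stack that
    enforces the receive-buffer limit, this is far fewer than `sent`
    because the excess is silently dropped.
    """
    victim = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Request a small receive buffer so the limit is reached quickly.
    victim.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    victim.bind(("127.0.0.1", 0))

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"x" * payload_size
    for _ in range(sent):
        try:
            sender.sendto(payload, victim.getsockname())
        except OSError:
            pass  # some stacks report an error once the peer is full

    # Drain whatever was queued, without blocking, and count it.
    victim.setblocking(False)
    queued = 0
    while True:
        try:
            victim.recvfrom(65535)
        except BlockingIOError:
            break
        queued += 1

    sender.close()
    victim.close()
    return queued
```

If the returned count is well below the number sent, the stack capped the
queue and dropped the rest, which is the behaviour one would expect.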

With sockets we can shut down one side of their full-duplex capability
when we do not need it. I'm not fully certain what internal impact that
might have on the IP kernel, but it does appear to be a sensible thing to
do in this situation. However, since you report that the problem appears to
be related to the frequency of the rogue traffic, such a change may not
have any functional impact on robustness.
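The idea of shutting down the receive side can be sketched with the
standard shutdown() call, shown here in Python. The function name is
hypothetical, and it is an assumption, not something verified on VxWorks,
that the stack does anything useful with the request: some stacks return
an error for an unconnected UDP socket, and some keep queueing inbound
datagrams regardless.

```python
import socket


def open_send_only_socket():
    """Open an ephemeral-port UDP socket and disable its receive side."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", 0))
    try:
        # Tell the stack we will never read from this socket.
        sock.shutdown(socket.SHUT_RD)
    except OSError:
        # Some stacks report an error (such as ENOTCONN) for an
        # unconnected UDP socket; treat the call as best effort.
        pass
    return sock
```

Only the receive direction is shut down, so the socket is still meant to be
used with sendto() for outgoing beacons.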

> Interestingly, on the affected mv167 VxWorks 5.4.2 IOCs, nmap reports this
> port as in an "open" state, but on the unaffected MVME5110 VxWorks
> 5.5.1, the CA_UDP port is reported as "closed". I don't have any other
> targets available to try this out on.

That's interesting, but it's hard to comment further w/o knowing more about
what criteria nmap uses to decide between a "closed" and "open" status
report.

> Any insight or suggestions or other tests to run?

Sounds like an IP kernel issue, but if we were to shut down the receive side
of that UDP socket it might be more robust.

I created a Mantis issue against R3.15. I selected that release, and
assigned a low priority to the fix, because this sounds primarily like a
problem with a particular VxWorks IP kernel and it's not clear that a fix
will produce any visible benefit.

PS: Was the rogue traffic in fact caused by programs such as nmap running
scans on all ports (and sending potentially invalid protocol messages)?

Jeff

> -----Original Message-----
> From: Steven Hartman [mailto:[email protected]]
> Sent: Friday, October 14, 2005 12:09 PM
> To: EPICS tech-talk
> Subject: UDP to CA_UDP hangs network?
> 
> Over the last few days I have had the network of a few VxWorks IOCs hang.
> PVs are white-boxed, the IOC does not respond to ping or any other network
> traffic. The IOC cannot send or receive any network traffic.  Except for
> the network, the IOC appears to be running fine. The only way I have found
> to restore network is to reboot. The IOCs are all VxWorks 5.4.2 on mv167
> with EPICS 3.13.10. Other IOCs which are MVME5110, VxWorks 5.5.1, EPICS
> 3.14 do not seem to be affected.
> 
> I was able to correlate these events to some rogue UDP traffic on the
> network (which has been eliminated). This traffic was Windows Messenger
> spam targeting the Windows RPC messenger service which typically listens
> on UDP ports 1025-1030.
> 
> I don't have a full understanding of what is happening, or whether the
> problem is from VxWorks or EPICS, but here is what I have. CA_UDP opens a
> UDP port on the IOC using the next available port (typically 1027-1029 on
> my IOCs) for sending UDP beacons to listeners on UDP 5065 every 15
> seconds. When a UDP packet is directed at CA_UDP's server port, however,
> something goes wrong. inetstatShow() shows a positive value for Recv-Q
> which never goes down, but increases as additional UDP traffic is directed
> at it. The size of this buffer is set in netLib.h (UDP_RCV_SIZE_DFLT) to
> 41600, but it does not seem to be the buffer filling, but the frequency of
> the traffic which locks up the network interface.
> 
> VxWorks utilities mbufShow, netStackDataPoolShow, netStackSysPoolShow,
> ifShow, etc. don't show any abnormalities. (I have seen input errors on
> ifShow of a hung IOC, but I think these are occurring after the fact.)
> tNetTask is still running.
> 
> I have been able to reproduce this by using nmap
> (http://www.insecure.org/nmap/) to send UDP scans at the CA_UDP port. With
> CA_UDP using port 1028 on the IOC, looping this nmap scan (as root) will
> cause the network to hang:
> 
> 	./nmap -sU -p 1028 testioc
> 
> inetstatShow will show:
> 
> Active Internet connections (including servers)
> PCB      Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)
> . . .
> 727480   UDP     1694      0  0.0.0.0.1028          0.0.0.0.0
> 
> with the Recv-Q increasing until at some point the IOC stops responding to
> all network traffic.
> 
> Interestingly, on the affected mv167 VxWorks 5.4.2 IOCs, nmap reports this
> port as in an "open" state, but on the unaffected MVME5110 VxWorks
> 5.5.1, the CA_UDP port is reported as "closed". I don't have any other
> targets available to try this out on.
> 
> Any insight or suggestions or other tests to run?
> 
> Thanks,
> --
> Steve Hartman
> [email protected] || 919-660-2650
> Duke Free Electron Laser Laboratory


