EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: vxWorks 6.7 tNet0 task crashed with fage fault: help needed from vxWorks guru
From: "Jeff Hill" <[email protected]>
To: "'Matthias Clausen'" <[email protected]>
Cc: "'Siegfried Rettig'" <[email protected]>, [email protected]
Date: Thu, 30 Sep 2010 14:19:04 -0600
> > negotiated full / half duplex correctly (consistently).
> ok - next time it crashes
> or should we have a look at the running system?

In my experience, Ethernet can run in a degraded mode
when vxWorks and the switch fail to auto-negotiate
into the same full / half duplex mode. Also, if there
are any link level errors due to cabling loss, collisions,
or other issues, you will see the error counts going up
across several ifShow commands (provided the vxWorks driver
takes the trouble to increment the error counters). While
these types of link level issues are certainly not the
root cause of software failures, they might be outside
influencing factors which could be manipulated as a
workaround. The same is true for network buffer
starvation situations.
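
As a rough illustration of comparing error counts across several ifShow
commands, the sketch below reports which counters increased between two
snapshots. It is only a minimal example: the counter names and values are
hypothetical, assumed to be copied by hand from the console, and
`rising_counters` is an illustrative helper, not part of vxWorks or EPICS.

```python
def rising_counters(before, after):
    """Return {name: increase} for every counter that grew between snapshots."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] > before[name]
    }


# Hypothetical values, assumed to be transcribed from two ifShow runs
first = {"input errors": 3, "output errors": 0, "collisions": 12}
second = {"input errors": 9, "output errors": 0, "collisions": 12}

for name, increase in rising_counters(first, second).items():
    print(f"{name}: +{increase}")
```

A counter that keeps rising between runs would suggest a link level
problem that is worth checking on the switch side as well.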

> Anything we could retrieve from the tt and memory dumps?

The following commands are also available, but they are usually
less interesting when there are driver level issues:

inetstatShow
tcpstatShow
udpstatShow

With a debug build of the network driver, and after attaching the
remote Workbench debugger (which in theory has access to the
source code), much better stack traces with symbols, stack
variable values, accurate argument values, and local variable
values are possible.

I understand that vxWorks 6 also has post-mortem, i.e. crash dump,
debugging capabilities.

Jeff
______________________________________________________
Jeffrey O. Hill           Email        [email protected]
LANL MS H820              Voice        505 665 1831
Los Alamos NM 87545 USA   FAX          505 665 5107

Message content: TSPA


> -----Original Message-----
> From: Matthias Clausen [mailto:[email protected]]
> Sent: Thursday, September 30, 2010 9:34 AM
> To: Jeff Hill
> Cc: [email protected]; Siegfried Rettig
> Subject: Re: vxWorks 6.7 tNet0 task crashed with fage fault: help
> needed from vxWorks guru
> 
>   Hi Jeff,
> 
> On 30.09.2010 17:17, Jeff Hill wrote:
> > Hi Matthias,
> >
> > Some possible ideas.
> >
> > 1) First and foremost, make certain that the latest vxWorks driver
> > patches for your particular network interface chip are installed.
> Good idea - we got a hint from the hardware vendor that there might be
> a vxWorks 6.7 "SMP Performance Patch" which should solve 'similar'
> problems. - We'll check with WRS.
> Strange enough that they did not mention this to us...
> > 2) Also look for vxWorks 6 patches which could be installed.
> >
> > 3) Run ifShow (and switch diagnostics) to see if there are unusual
> > link level errors or problems where the switch and network interface
> > haven't auto negotiated full / half duplex correctly (consistently).
> ok - next time it crashes
> or should we have a look at the running system?
> > 4) Run netstackSysPoolShow and netstackDataPoolShow to see if your
> > system is low on network buffers. The network stack shouldn't crash
> > when it runs low on buffers, but nevertheless adding more buffers
> > might be a temporary workaround. I understand that there is also a
> > driver buffer pool that can be adjusted on vxWorks.
> Unfortunately all of the network diagnostics do not work after a crash
> of the net task.
> The shell will be blocked and no more diagnostics are possible.
> 
> Luckily I did not use these this time - so I got some more
> diagnostics using tt etc.
> Anything we could retrieve from the tt and memory dumps?
> 
> -Matthias
> >
> > Jeff
> > ______________________________________________________
> > Jeffrey O. Hill           Email        [email protected]
> > LANL MS H820              Voice        505 665 1831
> > Los Alamos NM 87545 USA   FAX          505 665 5107
> >
> > Message content: TSPA
> >
> >
> >> -----Original Message-----
> >> From: [email protected] [mailto:core-talk-
> >> [email protected]] On Behalf Of Matthias Clausen
> >> Sent: Thursday, September 30, 2010 2:39 AM
> >> To: EPICS Core Talk
> >> Cc: Siegfried Rettig
> >> Subject: vxWorks 6.7 tNet0 task crashed with fage fault: help needed
> >> from vxWorks guru
> >>
> >>    Hi all.
> >>
> >> Since the beginning of this year we have had random crashes of the
> >> tNet task in one of our new IOCs.
> >> We do not get any help from Wind River.
> >> We are going to get a WRS specialist to DESY to help us set up the
> >> IOC to enable us to trace back the problem when the next crash
> >> occurs. I have my doubts that this will happen soon and that we can
> >> set it up with our IOCs where the shell is still running locally ...
> >> we'll see.
> >>
> >> Meanwhile I hope to get some ideas from you.
> >>
> >> Since the Net task is down we can only use the local console with
> >> VGA output.
> >> Therefore I have real screen shots taken with my camera.
> >>
> >> You can find the page fault info and the tt output in the two jpg
> >> files attached.
> >> (also the lkAddr and the checkStack output)
> >> There's more information about the individual memory locations but I
> >> do not want to fill your mailbox with my garbage. In case you have
> >> an idea and more debug information would help - let me know; there's
> >> more information available.
> >> For now we have no clue what could be the root cause.
> >> We have two IOCs running with the same vxWorks image; one of them is
> >> more heavily loaded than the other - where heavy means 20% CPU
> >> instead of 15%.
> >> The two CPUs are compact PCI CPUs from Kontron in Germany
> >> vxWorks image based on version 6.7
> >>
> >> Crashes happen at intervals between a week and two months.
> >>
> >> Our plans:
> >> - Get a WRS specialist to help us set up the IOC to catch the root
> >> cause and trace it back
> >> - Install a PC with wireshark and monitor the traffic to disk
> >> - analyze the Ethernet traffic and filter out any -non IP- traffic
> >> to the IOC by setting up filters in the Cisco router before the IOC
> >>
> >> Any other idea?
> >>
> >> Of course any help is highly appreciated.
> >>
> >> Cheers
> >> Matthias


