----- Original Message -----
> From: "Jeff Hill" <[email protected]>
> To: "Benjamin Franksen" <[email protected]>, "EPICS tech-talk" <[email protected]>
> Sent: Tuesday, September 25, 2012 2:29:08 PM
> Subject: RE: CAC problem between RTEMS and vxWorks
>
> > Jeff:
> >
> > Dynamic re-assign means in CA terms that the sequencer will issue a
> > ca_clear_channel (for some connected channel), immediately followed by a
> > ca_create_channel (to some other PV).
>
> Be careful about race conditions where a thread continues to use a chid
> after the channel that the chid references has been destroyed.
>
> > Hmmm, what happens if there is a race between the ca_clear_channel call
> > and the disconnect event caused by the IOC reboot? (Just thinking out
> > loud here.)
> >
>
> Reading Wesley's later post I think we can assume that it's the client
> (i.e. sequencer) side that is being rebooted.
>
> > It should be possible to devise a simple test client that issues a long
> > sequence of such create/clear calls to check whether this is a CA problem.
> > Unfortunately my automatic test setup does not yet support RTEMS, so this
> > will take some time for me to do.
>
> I had a look at the CA client side regression tests and I don't see a test
> of this nature. Presumably such a test would create a few thousand channels
> and then begin destroying them while at least some of them are in the
> process of connecting. This would perhaps be an unusual test in that we
> aren't testing any metric other than that the test program can survive a
> somewhat unusual sequence of events, and because successful testing of
> anything at all might be highly dependent on the various ratios of
> server/client CPU speeds and the speed of the network.
>
> Considering this particular situation further: all we know is that we have
> an RTEMS IOC with a sequencer CA client that is not reconnecting reliably
> when the RTEMS IOC undergoes a rapid (close together in time) series of
> reboots. One has to guess that the vxWorks IOC is running low on some
> resource, perhaps its network buffers, file descriptors, ...
>
> Wesley:
> I think that we see evidence that the vxWorks IOC is taking some time
> to determine that the previous lifetimes of the RTEMS IOC have passed out
> of existence (since the RTEMS IOC didn't clean up its CA circuits when it
> was shut down). Have you seen the situation improve if you leave it for a
> sufficiently long interval? This would be long enough for the TCP watchdog
> timers to fire; that delay varies between different OSes and different OS
> versions but is typically, as I recall, on the order of 20 minutes.
>
> Jeff
Jeff:
I've been working with Ben offline a bit. Let me see if I can sum up the findings (Ben, fill in as needed). I believe that the issues caused by repeated reboots were masking the true problem. When the sequencer assigns one PV per variable, even dynamically at runtime, there are no connection/reconnection issues. When a variable is re-assigned, CA communication with the vxWorks server becomes intermittent (without any rebooting). Even re-assigning a variable to the same PV is a problem.
Here is some debug output from the sequencer, printed every 5 seconds. On each pass the variable is assigned and then a pvPut is issued. In this case I'm simply re-assigning the variable to the same output PV. Sometimes the pvPut succeeds (status 0) and sometimes it doesn't (status -2).
# sequencer output
pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: 0

pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: -2

pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: -2

pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: 0

pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: 0

pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: -2
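To put a number on "sometimes", the output above can be tallied with a few lines of Python. This is just a sketch for summarizing the log: the parser and its names (parse_records, LOG) are my own and not part of the sequencer, and it assumes every line has the "key: value" shape shown above.

```python
from collections import Counter

# Debug output copied verbatim from the sequencer log above.
LOG = """\
pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: 0
pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: -2
pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: -2
pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: 0
pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: 0
pvName: ITVFL02op1_status
pvAssign Status: 0
pvPut Status: -2
"""


def parse_records(text):
    """Group 'key: value' lines into one dict per pvName block."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# sequencer output' header
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "pvName":
            records.append({"pvName": value})  # a new pass starts here
        elif records:
            records[-1][key] = int(value)  # status codes are integers
    return records


records = parse_records(LOG)
assign_counts = Counter(r.get("pvAssign Status") for r in records)
put_counts = Counter(r.get("pvPut Status") for r in records)
print("passes:", len(records))
print("pvAssign statuses:", dict(assign_counts))
print("pvPut statuses:", dict(put_counts))
```

For this sample every pvAssign reports 0, while the pvPut status is split evenly between 0 and -2, with no obvious pattern in the ordering.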
from Ben:
"I am fairly certain that this is a CA problem and not one of the sequencer. The usage pattern (clear channel, then create new ones) is typical for some client applications, but not for code running on an IOC; which may be the reason it hasn't come up before."
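The clear-then-create pattern Ben describes could be exercised outside the sequencer with a small harness along the following lines. This is only a sketch of the loop structure, under loud assumptions: create, clear and put are hypothetical callables that the caller would bind to a real CA client (standing in for ca_create_channel, ca_clear_channel and a put), FakeBackend is an in-memory stand-in that does no Channel Access at all, and the PV names are made up.

```python
import random


def churn_channels(create, clear, put, pv_names, rounds, seed=0):
    """Clear a channel, immediately re-create it, then attempt a put.

    create(name) -> handle, clear(handle) -> None and
    put(handle, value) -> bool are hypothetical callables supplied by
    the caller. Returns a list of (round, pv_name, put_ok) tuples.
    """
    rng = random.Random(seed)
    handles = {name: create(name) for name in pv_names}
    results = []
    for i in range(rounds):
        name = rng.choice(pv_names)
        clear(handles[name])            # drop the existing channel ...
        handles[name] = create(name)    # ... and re-create it right away
        # Only ever use the fresh handle; the old one is gone.
        results.append((i, name, put(handles[name], i)))
    for handle in handles.values():
        clear(handle)                   # tidy up at the end of the run
    return results


class FakeBackend:
    """In-memory stand-in (NOT Channel Access) that tracks live handles."""

    def __init__(self):
        self.live = set()
        self.created = 0

    def create(self, name):
        self.created += 1
        handle = (name, self.created)
        self.live.add(handle)
        return handle

    def clear(self, handle):
        self.live.remove(handle)  # a KeyError here would mean a double clear

    def put(self, handle, value):
        return handle in self.live  # a put through a cleared handle fails


backend = FakeBackend()
results = churn_channels(backend.create, backend.clear, backend.put,
                         ["pv:a", "pv:b", "pv:c"], rounds=1000)
failures = [r for r in results if not r[2]]
print(len(results), "puts,", len(failures), "failed")
```

With the fake backend nothing fails, which only shows that the harness itself never touches a stale handle. Against a real client library the interesting part would be how often put reports a failure and whether that depends on the delay between create and put.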
Wesley