EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: CA client on unix bug
From: Till Straumann <[email protected]>
To: Andrew Johnson <[email protected]>
Cc: TECHTALK tech-talk <[email protected]>
Date: Mon, 11 Jun 2007 12:02:19 -0700
Andrew Johnson wrote:
Hi Till,

Till Straumann wrote:
I found the following problem (tested on
linux, solaris, base-3.14.9, base-3.14.8.2)
with the CA client on unix:

Partly confirmed on linux-x86 (Fedora-5) against R3.14.9 as follows:


  uranus% cau
    cau:  get no_PV
  **** The executable "caRepeater" couldn't be located
  **** because of errno = "No such file or directory".
  **** You may need to modify your PATH environment variable.
  **** Unable to start "CA Repeater" process.
  error on search for no_PV
  couldn't open no_PV
    cau:

While cau is still running, a ps from another terminal gives this:

  uranus% ps -ef | grep cau
  anj       1440  2430  0 09:45 pts/7    00:00:00 cau
  anj       1444  1440  0 09:45 pts/7    00:00:00 [cau] <defunct>
  anj       1446  2428  0 09:45 pts/6    00:00:00 grep cau

In my case though, the <defunct> process's parent is still the original cau process, and when the parent exits so does the defunct process.

I'm not familiar with the internals of cau. However, note that the bug I reported bites only if
a thread with an associated exit handler that synchronizes with the thread's termination
(the 'errlog' thread is an example of such a thread) is already running prior to
the attempt to spawn the caRepeater. Therefore the sequence


errlogInit(0);          /* creates errlog thread *prior* to attempt to spawn caRepeater */
epicsThreadSleep(0.5);  /* gives errlog thread time to register its epicsAtExit handler */


prior to the first CA activity is essential.

The problem occurs because
1. the forked process doesn't inherit threads
2. execle("caRepeater") fails, and the spawning wrapper then calls 'exit()'
3. the epicsAtExit handlers are executed
4. the errlog thread's exit handler blocks waiting for the errlog thread
   to terminate and *** hangs forever *** because there is no errlog
   thread in the forked process (see 1)

Any long-running CA client application should be usable to prove this.

In libCom/osi/os/posix we can fix the problem using Till's fix of having osiSpawnDetachedProcess() call _exit(), which is a POSIX.1 routine. Both vxWorks and RTEMS just return osiSpawnDetachedProcessNoSupport, so there's no issue there; the other implementations of osiSpawnDetachedProcess() are for VMS and WIN32, neither of which calls exit().
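
For illustration only, the fork/exec/_exit() pattern discussed above might look roughly like the following. This is a minimal sketch, not the code in libCom: the function name spawn_and_report() is hypothetical, and unlike a detached spawn it waits for the child so that the exit status can be inspected and no <defunct> entry is left behind.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, NOT the EPICS implementation: fork a child, try to
 * exec `name`, and return the child's exit status as seen by the parent
 * (or -1 if fork/waitpid fails or the child did not exit normally). */
int spawn_and_report(const char *name)
{
    pid_t pid;
    int status = 0;

    fflush(NULL);               /* don't duplicate buffered stdio output */
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only the thread that called fork() exists here. */
        execlp(name, name, (char *)NULL);
        /* exec failed. _exit() terminates immediately without running
         * atexit() handlers, so a handler that waits for a thread which
         * does not exist in the child is never entered. */
        _exit(127);
    }

    /* Parent: reap the child so it does not linger as a zombie. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_report() with a name that cannot be executed should return 127 promptly. If the child called exit() instead, any handlers registered with atexit() in the parent before the fork would also run in the child.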


Another problem though appears to be that at no point does the CA client library actually fork off a caRepeater itself, even after this message appears:


  CA client library is unable to contact CA repeater after 50 tries.
  Silence this message by starting a CA repeater daemon
  or by calling ca_pend_event() and or ca_poll() more often.

I'm afraid I don't understand. With caRepeater on the PATH everything works
as expected for me (3.14.9). E.g., the test program I submitted with my
last post succeeds in spawning a repeater and I see it running and listening
on port 5065.

Also, if spawning the caRepeater is successful then I do not get
the 'CA client library is unable to contact CA repeater after 50 tries' message.
(This I tested with a different CA client - obviously ca_zombie_test.c is not
suitable for this purpose)


-- T.

I proved that as follows: With caRepeater running, I can find the following output from netstat (NB: ca-2 is the IANA official name for our port on recent OS releases; on older releases you may have to use 5065 instead - look in your /etc/services file to see whether you have ca-1 and ca-2 defined):


  uranus% netstat -a | grep 'ca-2'
  udp        0      0 *:ca-2                      *:*

With cau running as above there is no output from that command. Was this functionality removed intentionally Jeff?
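
The netstat check above can also be approximated from a program by trying to bind the UDP port and looking for EADDRINUSE. The following is a rough sketch under stated assumptions: udp_port_in_use() is a hypothetical helper (not part of EPICS), and the port number to pass (5065 by default for the repeater, per the note above) is left to the caller.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the UDP port is already bound by some
 * socket, 0 if it could be bound (i.e. it was free), -1 on any other error. */
int udp_port_in_use(unsigned short port)
{
    struct sockaddr_in addr;
    int result;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    if (sock < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* SO_REUSEADDR is deliberately not set, so an existing listener
     * makes bind() fail with EADDRINUSE. */
    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) == 0)
        result = 0;
    else if (errno == EADDRINUSE)
        result = 1;
    else
        result = -1;

    close(sock);                /* releases the port again if we bound it */
    return result;
}
```

A result of 1 only says that something holds the port, not that it is a caRepeater, so this is a weaker check than inspecting the process list.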

- Andrew


