Experimental Physics and Industrial Control System
Jeff's reply is correct, but doesn't really deal with the problem
discovered by Dennis, since it is stdout which is causing the problem
and stdout does not have 'close-on-exec' set.
I propose that on startup caRepeater check stdin/stdout/stderr and, if
fstat(fileno(fp)) reveals that a stream is a socket, close that stream
and reopen it to /dev/null. This precludes running caRepeater in a
pipeline, but I'm not sure that's really that much of a problem.
On Jan 8, 2007, at 12:53 PM, Jeff Hill wrote:
On UNIX systems the caRepeater process is spawned off using a call to the
fork function to create the new process followed by a call to the exec
function to force the new process to run the caRepeater executable.
The fork function does duplicate any open file descriptors into the new
process. To avoid problems EPICS base does the following.
o In R3.13 the CA client library closes all open files except
  stdin/out/err.
o In almost all versions of R3.14, instead of closing open files, the
  "close on exec flag" is set for all sockets created by a special
  socket creation function in EPICS base. This is a less intrusive
  approach.
Jeff
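
For readers unfamiliar with the "close on exec" flag mentioned above, a minimal sketch of a socket-creation wrapper that sets it is shown here. The name create_cloexec_socket is hypothetical and this is not the EPICS base function; it only illustrates the standard fcntl() FD_CLOEXEC mechanism.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical wrapper (not the EPICS base routine): create a socket and
 * mark it close-on-exec so that it is not inherited across exec().
 * Returns the descriptor, or -1 on failure. */
int create_cloexec_socket(int domain, int type, int protocol)
{
    int fd = socket(domain, type, protocol);
    int flags;

    if (fd < 0)
        return -1;

    /* Read the descriptor flags, then add FD_CLOEXEC to them. */
    flags = fcntl(fd, F_GETFD);
    if (flags < 0 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The flag only affects descriptors it has been set on, so descriptors inherited from the parent, such as stdin, stdout and stderr, still pass through fork and exec unchanged.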
-----Original Message-----
From: Dennis Nicklaus [mailto:[email protected]]
Sent: Wednesday, January 03, 2007 4:10 PM
To: [email protected]
Subject: caRepeater must run before casr
We recently ran into a very puzzling problem here using the EPICS casr
(channel access save restore) tool. The problem showed up in one of two
ways after you push the casrSave or casrRestore buttons.
Sometimes the Tcl/Tk casr interface would give an error dialog saying,
"error waiting for process to exit: child process lost (is SIGCHLD
ignored or trapped?)"
and other times it would just hang forever after you push
casrSave/casrRestore without the error dialog (though the save/restore
would be processed).
The short solution is that you must have caRepeater running before
running casr.
A brief summary of the gory details: when one presses the Tk casrSave
button, that causes tcl to exec the casave program. casave in turn starts
carepeater if carepeater isn't already there. carepeater, in trying to be
a nice forked process, closes all its file descriptors except stdin,
stdout, and stderr. This is part of where the problem starts, because the
pipe open between the top-level wish (tcl) shell and the casave program
gets dup-ed to stdout of casave; then when casave clones/forks off
carepeater, the same stdout remains open in carepeater. Then when casave
finishes, it's dead, but the higher level tcl is still trying to read()
on the pipe, which is being held open by carepeater.
This wouldn't be a problem if the high level tcl shell were getting a
SIGCHLD from the casave process, but by sifting through trace output, we
saw that the casave process was being started with the clone() system
call without specifying SIGCHLD in the flags, and, as the clone() man
page says, "If no signal is specified, then the parent process is not
signaled when the child terminates." We don't know if this is a mistake
in the version of tcl we have or something with the version of linux and
TLS we happen to be running, though it happens on multiple linux kernel
versions we have. YMMV widely depending on your versions of unix and tcl.
I'm not suggesting anything necessarily needs to change in casr or
caRepeater, just trying to point out a bizarre problem someone else may
bump into along the way.
Many thanks to Ron Rechenmacher who spent many hours puzzling over this
one.
Dennis
--
Eric Norum <[email protected]>
Advanced Photon Source
Argonne National Laboratory
(630) 252-4793