Andrew,
It's been a while since I wrote that code, but I seem to recall that our original intent, when installing a SIGALRM handler, was to avoid process termination due to an unhandled signal while also calling any signal handlers that were already installed, so that our use of the signal is transparent. That’s the only neighborly way to use signals on UNIX, of course. If we were successful with that, then wouldn’t our existing code be compatible with third-party libraries? Originally the POSIX standards for the signal library were pretty weak, so maybe we just haven’t kept up with the newer interfaces that have come out. As I recall, there was once considerable variability in the implementation of the signal-related calls across the UNIX implementations, so one had to be very careful to stick to plain-Jane stuff. Maybe this is no longer an issue, but if it is, then portability must always come first, and we have to look for a solution that ports.
So I don’t fully understand what went wrong with our current approach. Was the code changed so it doesn’t call the signal handlers that are already installed?
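A minimal sketch of that intent, for illustration only: it is not the code in base, and the names install_chained_alarm_handler, chained_alarm_handler and prev_alarm_action are hypothetical. It registers the 3-argument form and forwards to whatever was installed before, using the saved SA_SIGINFO flag to decide how to call it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose sigaction() and siginfo_t */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical names; this is not the EPICS base implementation. */
static struct sigaction prev_alarm_action;  /* disposition found at install time */
static int alarm_handler_installed;

/* Does nothing itself, so SIGALRM no longer terminates the process,
 * then forwards the signal to the previously installed handler. */
static void chained_alarm_handler(int signo, siginfo_t *info, void *context)
{
    if (prev_alarm_action.sa_flags & SA_SIGINFO) {
        /* The previous owner registered the 3-argument form. */
        if (prev_alarm_action.sa_sigaction != NULL)
            prev_alarm_action.sa_sigaction(signo, info, context);
    } else if (prev_alarm_action.sa_handler != SIG_DFL &&
               prev_alarm_action.sa_handler != SIG_IGN) {
        /* The previous owner registered the 1-argument form. */
        prev_alarm_action.sa_handler(signo);
    }
}

/* Returns 0 on success, -1 on failure (errno set by sigaction). */
int install_chained_alarm_handler(void)
{
    struct sigaction sa;

    if (alarm_handler_installed)
        return 0;                 /* installing twice would chain to ourselves */

    sa.sa_sigaction = chained_alarm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;     /* no SA_RESTART: blocking calls still see EINTR */

    if (sigaction(SIGALRM, &sa, &prev_alarm_action) != 0)
        return -1;

    /* Sanity check: we must never forward the signal to our own handler. */
    assert(prev_alarm_action.sa_sigaction != chained_alarm_handler);
    alarm_handler_installed = 1;
    return 0;
}
```

Note that this only chains to a handler that was installed before ours. A library that calls sigaction() afterwards replaces our handler, and it is then up to that library whether it forwards the signal.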
1) To see what CA and other socket codes in base do with EINTR, search in base for the string SOCK_EINTR. For socket-interfaced codes we used a surrogate because the error codes returned from the socket library are implemented in slightly different ways depending on the OS and/or the socket library vendor's implementation. I will go out on a limb and say that most of my stuff is tolerant of its blocking system calls being interrupted by a signal (hopefully there will not be glaring evidence to the contrary :-).
2) I do use a POSIX signal to gracefully shut down threads that might be blocking in a socket system call, on certain OSes. There is no question that there is a lot of variation surrounding this issue across the socket library and OS implementations. This has always been quite messy. If someone wants to suggest a cleaner way to do this, I am very open to suggestions.
3) The problem with jumping onto SIGUSR1 is that some third-party codes are just as likely to be using it for a specific purpose of their own. I think it's probably better to transparently prevent the process from croaking if we get the signal, but otherwise not interfere in any way with use of the signal for other purposes. That was certainly our original intent. I do not yet fully understand what went wrong.
4) We also have to avoid process termination via SIGPIPE.
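Points 1, 2 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in base: read_retry_eintr and ignore_sigpipe_if_default are hypothetical helpers, plain errno/EINTR is used instead of the SOCK_EINTR surrogate, and the stop flag stands in for whatever a thread uses to learn that the interruption was a shutdown request.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose sigaction() */
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: tolerate a blocking read() being interrupted by a
 * signal. If stop_requested is non-NULL and set, the interruption is
 * treated as a shutdown request and -1 is returned with errno == EINTR. */
ssize_t read_retry_eintr(int fd, void *buf, size_t len,
                         const volatile sig_atomic_t *stop_requested)
{
    assert(buf != NULL);
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0 || errno != EINTR)
            return n;             /* data, end of file, or a real error */
        if (stop_requested != NULL && *stop_requested)
            return -1;            /* interrupted on purpose; errno stays EINTR */
        /* Otherwise some unrelated signal arrived: just try again. */
    }
}

/* Hypothetical helper: keep SIGPIPE from killing the process, but only if
 * nobody else has claimed the signal. Writes to a closed peer then fail
 * with EPIPE instead. Returns 0 on success, -1 on failure. */
int ignore_sigpipe_if_default(void)
{
    struct sigaction cur;

    if (sigaction(SIGPIPE, NULL, &cur) != 0)
        return -1;
    if ((cur.sa_flags & SA_SIGINFO) || cur.sa_handler != SIG_DFL)
        return 0;                 /* already handled or ignored; leave it alone */
    cur.sa_handler = SIG_IGN;
    return sigaction(SIGPIPE, &cur, NULL);
}
```

Leaving an existing SIGPIPE disposition untouched is a design choice made here to match the "interfere in no way" intent of point 3.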
> This was originally your code, do you want me to fix it?
I will take care of it if needed, but I am also happy for someone else to work on it if they want to.
I think there may be codes outside of base that use this stuff now, so watch out for that when making changes that are not backwards compatible in a point release. One way to handle that might be to have the code set a #define to enable a new interface.
Jeff
> -----Original Message-----
> From: Andrew Johnson [mailto:[email protected]]
> Sent: Wednesday, March 11, 2009 4:43 PM
> To: Jeff Hill
> Cc: Core-Talk
> Subject: SIGALRM => SIGUSR1?
>
> Hi Jeff,
>
> Mark Rivers is having problems integrating a third-party library into his
> IOCs because it uses a Posix 1.b interval timer API which is implemented
> using SIGALRM. It unfortunately also results in various system calls
> terminating prematurely with EINTR, but it seems that there were only a
> couple of places that we have already fixed up for that.
>
> We currently have code that registers an sa_handler(int) for SIGALRM,
> which appears to be needed on hpux but nowhere else. Unfortunately our
> wrapper code doesn't work with the posix timer library on Linux because
> that uses the 3-argument form, registering a
> sa_sigaction(int, siginfo_t *, void *) routine and when we call it with
> just one argument it dies.
>
> Mark has already proved that when he removes our code that registers the
> SIGALRM handler, the vendor library works quite happily. I'm guessing
> that the only reason we catch SIGALRM is in case we need to use it for
> the system call interrupt mechanism.
>
> There are several possible solutions that I can see
>
> 1. Change our code to register a 3-argument sa_sigaction routine instead,
> using the SA_SIGINFO bit in sa_flags to determine which type of routine we
> wrapped and hence how to call it. Mark has just proved that this fixes
> the problem as well, although his implementation is incomplete.
> 2. Switch from SIGALRM to using another signal, such as SIGUSR1. I don't
> know whether this will interrupt a blocking system call, although I don't
> see why it shouldn't.
> 3. Drop the SigAlarmIgnore code from osi/os/posix/osdSignal.cpp but
> provide a copy of the version with SIGALRM support in osi/os/hpux.
> 4. Make the calls to epicsSignalInstallSigAlarmIgnore() conditional on the
> return value from epicsSocketSystemCallInterruptMechanismQuery() being
> esscimqi_socketSigAlarmRequired.
>
> #3 is probably the least work. #2 might seem simplest but our enum is
> named esscimqi_socketSigAlarmRequired and using that name with a
> different signal would be misleading. Mark is probably going to prove
> that #1 works, but I'm not sure that we should be registering a signal
> handler that we don't need, so whatever we pick, I'd also like to do #4
> as well.
>
> This was originally your code, do you want me to fix it?
>
> - Andrew
> --
> The best FOSS code is written to be read by other humans -- Harold Welte