EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Problem with WIN32
From: "Mark Rivers" <[email protected]>
To: "Jeff Hill" <[email protected]>, "Andrew Johnson" <[email protected]>, "Eric Norum" <[email protected]>
Cc: [email protected]
Date: Mon, 11 May 2009 21:19:02 -0500
Jeff,
 
Thanks for the rapid response.
 
> It's also an interesting question why general time was unable to provide any source of time at this point during the boot?
 
I strongly suspect that this is related to a problem I discovered in 3.14.10, which Andrew and I thought we had fixed.  The problem was a race condition in the creation and use of the WIN32 general time server.  It showed up on WIN32 IOCs on multi-core systems.  Our fix cured the problem on the system where we saw it, but the new failure looks similar.  That fix was made in version 1.38.2.8 of os/WIN32/osdTime.cpp.
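For readers unfamiliar with this kind of race, here is a minimal sketch of one way to make creation and first use of a shared service safe. It is not the fix that went into osdTime.cpp. The names TimeService and timeService() are hypothetical, and the sketch assumes a C++11 or later compiler, where initialization of a function-local static is guaranteed to be thread-safe (that guarantee did not exist in the compilers of 2009).

```cpp
#include <chrono>

// Hypothetical stand-in for a time provider that needs one-time setup
// before first use. None of these names come from EPICS base.
class TimeService {
public:
    TimeService() { ++constructionCount(); }

    // Seconds since the system_clock epoch.
    long long nowSeconds() const {
        using namespace std::chrono;
        return duration_cast<seconds>(
            system_clock::now().time_since_epoch()).count();
    }

    // How many TimeService objects have been constructed so far.
    static int& constructionCount() {
        static int count = 0;
        return count;
    }
};

// Creation and use go through the same function. Since C++11 the
// initialization of a function-local static runs exactly once, and any
// thread that arrives while it is in progress waits for it to finish,
// so no caller can see the service before it exists.
TimeService& timeService() {
    static TimeService service;
    return service;
}
```

Every caller uses timeService().nowSeconds() and never touches a global pointer directly, so there is no window in which the service is visible but not yet constructed.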
 

> That change causes the code to return a time stamp based on an uninitialized auto (stack stored) structure. That would introduce undefined behavior which is probably not a good idea IMHO.

 

Right, that was just intended to be a quick and dirty fix to get past the initial crash, and demonstrate that it was just the very first call to epicsGetTimeCurrent that was failing and throwing the exception.

 

Mark

 


________________________________

From: Jeff Hill [mailto:[email protected]]
Sent: Mon 5/11/2009 4:29 PM
To: Mark Rivers; 'Andrew Johnson'; 'Eric Norum'
Cc: [email protected]
Subject: RE: Problem with WIN32



Mark,

 

That change causes the code to return a time stamp based on an uninitialized auto (stack stored) structure. That would introduce undefined behavior which is probably not a good idea IMHO.
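As an illustration of the concern, the sketch below shows a status-returning fetch whose failure path stays well defined because the structure is value-initialized before the call. Stamp, fetchStamp() and currentOrZero() are hypothetical names chosen for this example and are not EPICS APIs. Returning a zero time stamp on failure is only one possible policy, not necessarily the right one for base.

```cpp
// Hypothetical time stamp structure (not the EPICS epicsTimeStamp).
struct Stamp {
    unsigned sec;
    unsigned nsec;
};

// Hypothetical status-returning fetch: 0 on success, nonzero on failure.
// On failure it leaves *out untouched, which is the dangerous case when
// the caller passed in an uninitialized structure.
int fetchStamp(Stamp* out, bool sourceAvailable) {
    if (!sourceAvailable) {
        return -1;
    }
    out->sec = 1000u;
    out->nsec = 500u;
    return 0;
}

// Value-initializing the structure means the failure path returns a
// known (all zero) stamp instead of reading indeterminate stack memory.
Stamp currentOrZero(bool sourceAvailable) {
    Stamp current = {};
    if (fetchStamp(&current, sourceAvailable) != 0) {
        // current is still {0, 0}; a caller can treat that as "no time".
    }
    return current;
}
```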

 

It's quite normal for C++ codes to report serious failures such as this one by throwing an exception. There are of course two ways you can get into trouble.

 

1)      If there isn't a try/catch block when crossing a membrane between C code and C++ code, trouble is almost guaranteed. I have been very careful about these membranes in R3.14.

2)      A C++ code isn't properly introducing try/catch blocks to deal with exceptions. It's starting to look like this one is the cause.
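The first point can be sketched as follows. This is a generic pattern, not code from base: an entry point with C linkage catches everything and converts it to a status code so that no exception can propagate into C callers. The names BoundaryStatus, doWork() and boundaryEntry() are hypothetical.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical status codes handed back across the C boundary.
enum BoundaryStatus {
    BOUNDARY_OK = 0,
    BOUNDARY_STD_EXCEPTION = 1,
    BOUNDARY_UNKNOWN_EXCEPTION = 2
};

// Hypothetical C++ work function that may throw.
void doWork(int input) {
    if (input < 0) {
        throw std::invalid_argument("negative input");
    }
    if (input == 0) {
        throw 42;  // an exception not derived from std::exception
    }
}

// Entry point with C linkage, callable from C code. Every exception is
// caught here and turned into a status code, so none crosses the
// membrane into the C caller.
extern "C" int boundaryEntry(int input) {
    try {
        doWork(input);
        return BOUNDARY_OK;
    } catch (const std::exception&) {
        return BOUNDARY_STD_EXCEPTION;
    } catch (...) {
        return BOUNDARY_UNKNOWN_EXCEPTION;
    }
}
```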

 

I had a look at the stack trace, which was very helpful BTW, and I can see that we are fetching the time in the last chance exception handler for a C++ based thread when a 2nd exception is thrown. I don't know which thread received the precipitating exception, but it is a pretty good guess that the precipitating exception might also have been caused by fetching the current time.

 

I think that two fixes need to occur.

 

1)      We need to determine which thread caused the precipitating exception and upgrade its error handling - adding a try catch block at a strategic location. I am going to guess that the precipitating exception might have been thrown in the timer queue library.

2)      The code below in the exception handler that fetches the time in order to print a diagnostic message needs a try catch block.

 

It's also an interesting question why general time was unable to provide any source of time at this point during the boot? On Windows the original R3.14 had an excellent source of time, created using high-precision performance counter based time synchronized to the real time clock - a similar approach to that used by the Perl community BTW. So it's less than clear why general time wouldn't fall back to that time source (essentially one that is available from the OS)? FWIW, NTP uses cascaded PLLs to produce discontinuity-proof transitions between different time sources. That might be a better approach than the abrupt changes I seem to recall occurring when switching time sources in general time.
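The idea of a counter-based clock anchored to the real time clock can be sketched in portable C++ as below. This is only an analogue of the approach described, not the R3.14 WIN32 implementation: AnchoredClock is a hypothetical name, std::chrono::steady_clock stands in for the Windows performance counter, and the sketch reads the wall clock once and never re-synchronizes, which a real implementation would have to do.

```cpp
#include <chrono>

// Hypothetical fallback clock: wall-clock time is sampled once, and all
// later readings are derived from a monotonic counter, so reported time
// never jumps backwards even if the system clock is adjusted.
class AnchoredClock {
public:
    AnchoredClock()
        : wallAtStart_(std::chrono::system_clock::now()),
          tickAtStart_(std::chrono::steady_clock::now()) {}

    // Wall-clock anchor plus the monotonic time elapsed since the anchor.
    std::chrono::system_clock::time_point now() const {
        auto elapsed = std::chrono::steady_clock::now() - tickAtStart_;
        return wallAtStart_ +
               std::chrono::duration_cast<
                   std::chrono::system_clock::duration>(elapsed);
    }

private:
    std::chrono::system_clock::time_point wallAtStart_;
    std::chrono::steady_clock::time_point tickAtStart_;
};
```

Because such a clock needs nothing beyond what the OS provides, it is always available as a last resort, which is the property the paragraph above asks for.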

 

    catch ( ... ) {
        if ( ! waitRelease ) {
            epicsTime cur = epicsTime::getCurrent ();  // <================= here
            char date[64];
            cur.strftime ( date, sizeof ( date ), "%a %b %d %Y %H:%M:%S.%f");
            char name [128];
            epicsThreadGetName ( pThread->id, name, sizeof ( name ) );
            errlogPrintf (
                "epicsThread: Unknown C++ exception in thread \"%s\" at %s\n",
                name, date );
            errlogFlush ();
            // this should behave as the C++ implementation intends when an
            // exception isn't handled. If users don't like this behavior, they
            // can install an application specific unexpected handler.
            std::unexpected ();
        }
    }
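A minimal sketch of the second fix, applied to a handler shaped like the one above: the time fetch gets its own try/catch, so a failure only degrades the diagnostic message instead of throwing a second exception from inside the handler. This is not the change that was committed to base. fetchTimeText() is a hypothetical stand-in for epicsTime::getCurrent() plus strftime(), and buildDiagnostic() is a hypothetical helper.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a time fetch that throws on failure, in the
// way epicsTime::getCurrent() does in the stack trace in this thread.
std::string fetchTimeText(bool timeAvailable) {
    if (!timeAvailable) {
        throw std::runtime_error("unable to fetch current time");
    }
    return "Mon May 11 2009 21:19:02";
}

// Builds the diagnostic line for the last chance handler. The time fetch
// is guarded separately, so if it fails the message is still produced,
// with a placeholder where the date would have been.
std::string buildDiagnostic(const std::string& threadName,
                            bool timeAvailable) {
    std::string when;
    try {
        when = fetchTimeText(timeAvailable);
    } catch (...) {
        when = "<time unavailable>";
    }
    return "Unknown C++ exception in thread \"" + threadName +
           "\" at " + when;
}
```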

 

Jeff

 

From: [email protected] [mailto:[email protected]] On Behalf Of Mark Rivers
Sent: Monday, May 11, 2009 2:39 PM
To: Andrew Johnson; Eric Norum
Cc: [email protected]
Subject: RE: Problem with WIN32

 

Folks,

 

I am quite sure this is a bug in base.  I modified libCom/osi/epicsTime.cpp so it just prints an error, rather than throwing an exception, if epicsTimeGetCurrent() returns an error:

 

corvette:src/libCom/osi>cvs diff epicsTime.cpp
Index: epicsTime.cpp
===================================================================
RCS file: /net/phoebus/epicsmgr/cvsroot/epics/base/src/libCom/osi/epicsTime.cpp,v
retrieving revision 1.25.2.20
diff -u -r1.25.2.20 epicsTime.cpp
--- epicsTime.cpp       18 Apr 2008 18:39:19 -0000      1.25.2.20
+++ epicsTime.cpp       11 May 2009 20:30:01 -0000
@@ -192,7 +192,8 @@
     epicsTimeStamp current;
     int status = epicsTimeGetCurrent (&current);
     if (status) {
-        throwWithLocation ( unableToFetchCurrentTime () );
+printf("epicsTime::getCurrent, unable to fetch current time\n");
+        //throwWithLocation ( unableToFetchCurrentTime () );
     }
     return epicsTime ( current );
 }

 

 

With this change, when the IOC starts up I see one of those error messages, and no more.  I think we still have a timing problem where the generalTime system is not up and running on Windows before it is first called.  I have been using this patched version of 3.14.10 for a while with no problems, but my application recently got more complex, with more DLLs being loaded at startup.  I believe that slows startup down enough that we are now seeing another problem.  The version of osi/os/WIN32/osdTime.cpp that I am using is effectively 1.38.2.8, which Andrew and I worked on to fix similar problems in 3.14.10.

 

$ ../../bin/win32-x86/prosilicaApp.exe st.cmd.win32
epicsTime::getCurrent, unable to fetch current time
< envPaths.win32
epicsEnvSet("ARCH","win32-x86")
...

 

Mark

 

 

________________________________

From: Mark Rivers 
Sent: Monday, May 11, 2009 2:51 PM
To: Andrew Johnson; 'Eric Norum'
Subject: Problem with WIN32

 

Folks,

 

I am getting a crash when I start the areaDetector IOC on win32-x86 and win32-x86-debug.  It looks like it might be a problem in base, perhaps related to the bug Andrew and I previously fixed with the timer not being created before it was being used.  Here is the trace.  The problem is in the 7th line from the bottom, copied here:

 

simDetectorApp.exe!throwExceptionWithLocation<epicsTime::unableToFetchCurrentTime>(const epicsTime::unableToFetchCurrentTime & parm={...}, const char * pFileName=0x00616620, unsigned int lineNo=195)  Line 74            C++

 

This is happening right when the IOC starts up, even without an st.cmd file.

 

Mark

 

 

 

            ntdll.dll!7c90eb94()        

            [Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll] 

            user32.dll!7e419418()     

            user32.dll!7e42dba8()    

            user32.dll!7e42593f()     

            user32.dll!7e43a91e()     

>          simDetectorApp.exe!_output_s_l(_iobuf * stream=0x00000000, const char * format=0x001449b0, localeinfo_struct * plocinfo=0x00144690, char * argptr=0x00012012)  Line 1164 + 0x17 bytes           C++

            user32.dll!7e466278()     

            user32.dll!7e450617()     

            user32.dll!7e4505cf()     

            simDetectorApp.exe!__crtMessageBoxA(const char * lpText=0x003b8620, const char * lpCaption=0x0061fe50, unsigned int uType=73746)  Line 145      C

            simDetectorApp.exe!__crtMessageWindowA(int nRptType=1, const char * szFile=0x00000000, const char * szLine=0x00000000, const char * szModule=0x00000000, const char * szUserMessage=0x003b9694)  Line 420 + 0x16 bytes            C

            simDetectorApp.exe!_VCrtDbgReportA(int nRptType=1, const char * szFile=0x00000000, int nLine=0, const char * szModule=0x00000000, const char * szFormat=0x0061e2c0, char * arglist=0x003be728)  Line 417 + 0x28 bytes     C

            simDetectorApp.exe!_CrtDbgReportV(int nRptType=1, const char * szFile=0x00000000, int nLine=0, const char * szModule=0x00000000, const char * szFormat=0x0061e2c0, char * arglist=0x003be728)  Line 300 + 0x1d bytes    C

            simDetectorApp.exe!_CrtDbgReport(int nRptType=1, const char * szFile=0x00000000, int nLine=0, const char * szModule=0x00000000, const char * szFormat=0x0061e2c0, ...)  Line 317 + 0x1d bytes    C

            simDetectorApp.exe!_NMSG_WRITE(int rterrnum=10)  Line 197 + 0x18 bytes       C

            simDetectorApp.exe!abort()  Line 59 + 0x7 bytes            C

            simDetectorApp.exe!terminate()  Line 136           C++

            simDetectorApp.exe!__CxxUnhandledExceptionFilter(_EXCEPTION_POINTERS * pPtrs=0x003bf1cc)  Line 72      C++

            kernel32.dll!7c863016()   

            simDetectorApp.exe!_XcptFilter(unsigned long xcptnum=3765269347, _EXCEPTION_POINTERS * pxcptinfoptrs=0x003bf1cc)  Line 237 + 0xa bytes           C

            simDetectorApp.exe!_callthreadstartex()  Line 350 + 0x17 bytes  C

            simDetectorApp.exe!@_EH4_CallFilterFunc@8()  + 0x12 bytes   Asm

            simDetectorApp.exe!_except_handler4(_EXCEPTION_RECORD * ExceptionRecord=0x003bf2c0, _EXCEPTION_REGISTRATION_RECORD * EstablisherFrame=0x003bff98, _CONTEXT * ContextRecord=0x003bf2e0, void * DispatcherContext=0x003bf294)  + 0xb7 bytes   C

            ntdll.dll!7c9037bf()         

            ntdll.dll!7c90378b()        

            ntdll.dll!7c937860()        

            ntdll.dll!7c90eafa()         

            kernel32.dll!7c812a5b() 

            ntdll.dll!7c9106eb()        

            ntdll.dll!7c911538()        

            kernel32.dll!7c812a5b() 

            ntdll.dll!7c911596()        

            ntdll.dll!7c9106eb()        

            ntdll.dll!7c9106eb()        

            ntdll.dll!7c911538()        

            ntdll.dll!7c919a9c()        

            ntdll.dll!7c919b3f()         

            ntdll.dll!7c919aeb()        

            ntdll.dll!7c911538()        

            ntdll.dll!7c919aeb()        

            ntdll.dll!7c919d27()        

            ntdll.dll!7c919a9c()        

            ntdll.dll!7c919b3f()         

            ntdll.dll!7c919aeb()        

            kernel32.dll!7c812a5b() 

            simDetectorApp.exe!fetchWin32ThreadGlobal()  Line 183 + 0x11 bytes    C

            simDetectorApp.exe!_CxxThrowException(void * pExceptionObject=0x003bf650, const _s__ThrowInfo * pThrowInfo=0x0063e0e0)  Line 166         C++

            simDetectorApp.exe!throwExceptionWithLocation<epicsTime::unableToFetchCurrentTime>(const epicsTime::unableToFetchCurrentTime & parm={...}, const char * pFileName=0x00616620, unsigned int lineNo=195)  Line 74            C++

            simDetectorApp.exe!epicsTime::getCurrent()  Line 195 + 0x18 bytes        C++

            simDetectorApp.exe!epicsThreadCallEntryPoint(void * pPvt=0x003a96f4)  Line 68 + 0xc bytes     C++

            simDetectorApp.exe!epicsWin32ThreadEntry(void * lpParameter=0x003a9a68)  Line 498 + 0x11 bytes      C

            simDetectorApp.exe!_callthreadstartex()  Line 348 + 0xf bytes    C

            simDetectorApp.exe!_threadstartex(void * ptd=0x003a9ac8)  Line 331     C

            kernel32.dll!7c80b683() 


