EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: Base R3.15.2-rc1 release
From: Ralph Lange <[email protected]>
To: EPICS Core-Talk <[email protected]>
Date: Tue, 05 May 2015 17:23:57 +0200
Hi Mark,   (moving the discussion to core-talk)

I can confirm your observations.

On my laptop
 - Windows 7 64bit SP1
 - 2 cores / 4 threads ([email protected]), 8GB RAM
 - VS 2013 compiler
 - EPICS_HOST_ARCH=win32-x86-static
 - Local installations on C: (SSD)

Using MinGW32-make (3.82.90) I get the same spontaneous hang-ups at different places when building in parallel with '-j', with one make process using 100% of a "core". The sequential build takes 15:04.

Using Andrew's Make 4.1 things get a bit faster. The sequential build takes 13:52.
The parallel '-j' builds always hang with make taking a single "core". I have not been able to run a single build beginning to end. However, running 'make -j' again always goes through.
The compile times look good. The dynamic build takes 04:08 + 00:49, the static build 06:16 + 01:33 (consistent over three iterations each).

(Reference: 3.14 tip sequential build takes 08:45 in the static case and 14:10 for the dynamic build, a parallel '-j' build using the same setup takes 04:32 for the static case and consistently fails for the dynamic build.)

I can confirm that this is where all Make 4.1 '-j' builds fail (showing the static case, dynamic looks the same except for the '-static' suffix in the target name):
link -nologo -LTCG -incremental:no -opt:ref -release -version:3.15 -out:excas.exe main.obj exServer.obj exPV.obj exVectorPV.obj exScalarPV.obj exAsyncPV.obj exChannel.obj ../../../../../../lib/win32-x86-static/cas.lib ../../../../../../lib/win32-x86-static/gdd.lib ../../../../../../lib/win32-x86-static/ca.lib ../../../../../../lib/win32-x86-static/Com.lib ws2_32.lib advapi32.lib user32.lib ws2_32.lib advapi32.lib user32.lib kernel32.lib winmm.lib

Generating code

c:\users\langer\documents\work\epics\v3\3.15\tip\src\libcom\misc\epicsstdlib.c(387) : warning C4756: overflow in constant arithmetic

Finished generating code

"Installing created executable ../../../../../../bin/win32-x86-static/excas.exe"

make[3]: Leaving directory 'C:/Users/langer/Documents/Work/EPICS/V3/3.15/tip/src/ca/legacy/pcas/ex/O.win32-x86-static'

make[2]: Leaving directory 'C:/Users/langer/Documents/Work/EPICS/V3/3.15/tip/src/ca/legacy/pcas/ex'

[ ... here the build hangs ...]
../../configure/RULES_ARCHS:61: recipe for target 'install.win32-x86-static' failed

make[2]: *** [install.win32-x86-static] Error 130

../configure/RULES_DIRS:88: recipe for target 'std.install' failed

make[1]: *** [std.install] Error 130

configure/RULES_DIRS:88: recipe for target 'src.install' failed

make: *** [src.install] Error 130

16:21:34: The process "C:\GNU\make.exe" exited with code -1.

16:21:34: Canceled build/deployment.


I would say Make is still broken, and the compiler version seems to have a massive influence.


Cheers,
~Ralph



On 05/05/2015 14:24, Mark Rivers wrote:

Hi Andrew,

 

Thanks for the suggestions of using -jN and trying make 4.1.

 

I have done a series of tests with make 3.81 (GNUWin32 version) and make 4.1 (version you built).  The results are quite interesting.

 

Test conditions:

 

-    Windows 7 64-bit computer

-    8 cores

-    Visual Studio 2010 compiler

-    EPICS_HOST_ARCH=win32-x86-static

-    Local installations of base-3.14.12.5 and base-3.15.2-rc1 on C: drive.

-    Remote installation of base-3.15.2-rc1 on a Linux file server

 

I first tested building base-3.14.12.5 with both versions of make, and 1 (-s), 2 (-sj2), 4 (-sj4), 8 (-sj8) and unlimited (-sj) numbers of make threads.

 

The following table shows the seconds to build as a function of the version of make and the make options.

                      make options
make version      -s    -sj2    -sj4    -sj8    -sj
3.81             394     391     383     383     94
4.1              369     199     119      85     91

 

 

These results show a serious problem with make 3.81.  There is no significant decrease in the execution time as the number of threads is increased, unless there is no limit on the number of threads.  Using the Windows Task Manager I could easily see why: as the number of threads was increased, using -sj8 for example, there were multiple instances of “make” running.  However, there was at most one instance of a CPU-bound task such as the compiler (cl.exe), linker (link.exe), or perl (perl.exe) running at once.  Only when an unlimited number of threads was allowed (-sj) were there multiple instances of cl.exe, link.exe, and perl.exe.

 

Make 4.1, on the other hand appears to work well.  The execution time dropped smoothly with an increased number of threads.  It was a minimum when the number of threads matched the number of cores (8), and increased slightly from that when an unlimited number of threads was allowed.
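As a rough check on the scaling described above, the following sketch computes the speedup and parallel efficiency for each job limit from the table. The dictionary layout and the function name `scaling` are illustrative assumptions; only the timings come from the message, and the unlimited -sj column is left out because it has no fixed job count.

```python
# Seconds to build base-3.14.12.5, copied from the table in this message.
# Keys are the job limit given to make; the sequential -s build counts as 1 job.
BUILD_SECONDS = {
    "3.81": {1: 394, 2: 391, 4: 383, 8: 383},
    "4.1": {1: 369, 2: 199, 4: 119, 8: 85},
}


def scaling(times):
    """Return {jobs: (speedup, efficiency)} relative to the 1-job build."""
    base = times[1]
    return {jobs: (base / t, base / (t * jobs)) for jobs, t in times.items()}


if __name__ == "__main__":
    for version, times in BUILD_SECONDS.items():
        for jobs, (speedup, efficiency) in sorted(scaling(times).items()):
            print(f"make {version} -j{jobs}: {speedup:.2f}x speedup, "
                  f"{efficiency:.0%} efficiency")
```

With these numbers, make 4.1 at -sj8 comes out at roughly a 4.3x speedup over the sequential build, while make 3.81 stays close to 1x for every limited job count.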

 

I then tested building base-3.15.2-rc1.  I only tested with -s, -sj8 (4.1 only because of the above results) and -sj.  With -sj and -sj8 I tested several times with each version of make.  The table gives the seconds to build, or “hung” if the build hung up.

                  make options
make version      -s    -sj8    -sj
3.81             644            157
                                150
                                149
4.1              623    147    hung
                        157    hung
                        150    hung
                        155    hung
                        154     167

(Previous testing of 3.81 with a local installation of 3.15.2-rc1 did result in occasional hangs.)

 

When make 4.1 hung this was the final output:

main.cc

exServer.cc

exPV.cc

exVectorPV.cc

exScalarPV.cc

exAsyncPV.cc

exChannel.cc

Generating code

Finished generating code

 

When it was hung, make was using 100% of a single CPU core.

 

 

Finally I tested building both base-3.14.12.5 and base-3.15.2-rc1 on a remote Linux file server.

                                  make options
make version  base version      -s    -sj8    -sj
3.81          3.14.12.5       1583            479
3.81          3.15.2-rc1      2549            hung (2/2 times)
4.1           3.14.12.5       1693     568    547
4.1           3.15.2-rc1      2798    1178    hung (2/2 times)

 

When make 3.81 or 4.1 hung building on the remote Linux file server this was the final output:

nfa.c

misc.c

gen.c

ecs.c

dfa.c

ccl.c

epicsTempFile.cpp

 

When it hung, make was using 100% of a single CPU core.

 

My observations and conclusions:

 

-    make 3.81 does not work properly with -jN, but it does work properly with -j (unlimited threads).

-    make 3.81 and 4.1 both work fine building 3.14.12.5 either locally or remotely on a Linux file server with -sj (unlimited threads).

-    make 3.81 and 4.1 both frequently hang up when building 3.15.2-rc1 locally with -sj.  4.1 appears to work OK with -sj8.

-    make 3.81 and 4.1 both fail 100% of the time when building 3.15.2-rc1 remotely with -sj.  This does NOT happen with 3.14.12.5. Note that this cannot be because all 8 cores are 100% busy, because they are never more than about 40% busy.  The build is completely I/O bound when building remotely, but -sj still hangs.

-    3.15.2-rc1 takes 1.5 to 2 times longer to build than 3.14.12.5.
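The last point can be checked against the sequential (-s) timings reported in the tables of this message. The sketch below is only an illustration: the dictionary keys and the function name `slowdown_ratios` are assumptions, and parallel builds are left out because several of them hung.

```python
# Sequential (-s) build times in seconds from the tables in this message:
# (make version, build location) -> (base-3.14.12.5, base-3.15.2-rc1)
SEQUENTIAL_SECONDS = {
    ("3.81", "local"): (394, 644),
    ("4.1", "local"): (369, 623),
    ("3.81", "remote"): (1583, 2549),
    ("4.1", "remote"): (1693, 2798),
}


def slowdown_ratios(table):
    """Return how many times longer 3.15.2-rc1 takes than 3.14.12.5."""
    return {key: new / old for key, (old, new) in table.items()}


if __name__ == "__main__":
    for (version, location), ratio in slowdown_ratios(SEQUENTIAL_SECONDS).items():
        print(f"make {version}, {location}: {ratio:.2f}x longer")
```

All four ratios land between about 1.6 and 1.7, which is consistent with the stated range of 1.5 to 2.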

 

 

Mark

 

 

From: Johnson, Andrew N. [mailto:[email protected]]
Sent: Thursday, April 30, 2015 7:27 PM
To: Mark Rivers
Cc: Ralph Lange; EPICS Tech-Talk
Subject: Re: Base R3.15.2-rc1 release

 

Hi Mark,

 

Could you try using my Make 4.1 executable instead, which you can find linked from the Base - Windows page under Other Build Tools. I do remember a comment in the release notes about some improvement to parallel builds on Windows in version 4, although I don't know if this will help or not.

 

I usually use 'make -sj4' to limit the parallelism to 4 which I think might prevent some problems; obviously you can adjust the number of processes to see what works best on your machine. If you don't specify a limit make will keep adding processes until the CPU load gets too high, and I think from experience that's not a very good strategy on Windows.
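One possible way to apply an explicit job limit without hard-coding it is to derive it from the number of logical CPUs. This is a sketch under assumptions: the helper name `make_command` is hypothetical, and using the CPU count as the default limit is a guess at a sensible starting point, not a recommendation from this thread.

```python
import os


def make_command(jobs=None, silent=True):
    """Build an argument list for GNU make with an explicit job limit.

    jobs defaults to the number of logical CPUs reported by the OS
    (an assumed starting point; adjust to what works on your machine).
    """
    if jobs is None:
        jobs = os.cpu_count() or 1
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    flags = "-" + ("s" if silent else "") + f"j{jobs}"
    return ["make", flags]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(make_command()))
```

For example, `make_command(4)` yields the same `make -sj4` invocation mentioned above; the resulting list can be handed to `subprocess.run()`.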

- Andrew

 

-- 

Sent from my iPad


On Apr 30, 2015, at 5:23 PM, Mark Rivers <[email protected]> wrote:

I tested with a local installation of base-3.15.2-rc1 on my Windows system.

 

I still saw the problem with “make -j” hanging, with make using 100% of a core.  But the problem was intermittent.  I don’t have a lot of statistics, but it seemed to happen perhaps 50% of the time when using “make -sj” (very little terminal output), and less frequently with “make -j” where there is a lot of terminal output.

 

If it hung up I was able to kill it and just start it again, so it picked up where the previous hung run had left off.

 

I was able to successfully build all 4 WIN32 architectures with parallel make, though sometimes having to restart it once to get it to complete.
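The kill-and-restart workaround described above could be automated along the following lines. This is a minimal sketch, not something used in this thread: the function name `run_with_restart`, the timeout, and the attempt limit are all hypothetical, and a hang is only approximated by an overall time limit per attempt.

```python
import subprocess
import sys


def run_with_restart(cmd, timeout_s, max_attempts=3):
    """Run cmd; if it does not finish within timeout_s, kill it and rerun.

    Returns the 1-based number of the attempt that exited with code 0.
    A non-zero exit raises subprocess.CalledProcessError immediately,
    since a genuine build error would not be fixed by restarting.
    Raises RuntimeError if every attempt timed out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the direct child before raising.
            # Processes started by that child (compiler, perl) may survive
            # and need separate cleanup.
            print(f"attempt {attempt} exceeded {timeout_s}s, restarting",
                  file=sys.stderr)
            continue
        result.check_returncode()
        return attempt
    raise RuntimeError(f"{cmd!r} timed out {max_attempts} times")
```

A call such as `run_with_restart(["make", "-sj"], timeout_s=1800)` would then rerun make after a presumed hang, relying on make to pick up where the previous run left off; the 1800-second limit is an arbitrary placeholder.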

 

When it hung with “make -sj” this was the last output before I killed it.  Several times it hung in this same place.

 

exPV.cc

exServer.cc

exVectorPV.cc

exScalarPV.cc

exChannel.cc

exAsyncPV.cc

Generating code

Finished generating code

 

The local build is much faster than the remote non-parallel builds, which took 46 minutes, or the remote parallel builds on 3.14.12.5, which took 6 minutes.

 

win32-x86:        2:12

win32-x86-static  2:52

 

So it was 2-3 times faster than doing a remote parallel make on 3.14.12.5.

 

This is GNUWin32 make 3.81.

 

Mark

 

 

From: Mark Rivers
Sent: Thursday, April 30, 2015 3:56 PM
To: 'Ralph Lange'; EPICS Tech-Talk
Subject: RE: Base R3.15.2-rc1 release

 

Hi Ralph,

 

I just tested 3.15.2-rc1 on WIN32, and found some problems.

 

My setup is using a Linux file server that holds a single tree, in which I build for all architectures (Linux, WIN32, Cygwin, vxWorks, Darwin).  I don’t know if that configuration has any bearing on my results.

 

I found that parallel make does not work on any WIN32 architectures (win32-x86, win32-x86-static, windows-x64, windows-x64-static).  The symptom is that “make” goes to using 100% of a core at some point during the build of libCom, and no further output occurs.  This was the output of “make -j” at the point where it hung building win32-x86-static:

 

perl -CSD H:/epics/base-3.15.2-rc1/bin/win32-x86-static/mkmf.pl  -m epicsUnitTest.d -I. -I../O.Common -I. -I../../../src/libCom/osi/compiler/msvc -I../../../src/libCom/osi/compiler/default -I. -I../../../src/libCom/osi/os/WIN32 -I../../../src/libCom/osi/os/default -I.. -I../../../src/libCom/as -I../../../src/libCom/bucketLib -I../../../src/libCom/calc -I../../../src/libCom/cvtFast -I../../../src/libCom/cppStd -I../../../src/libCom/cxxTemplates -I../../../src/libCom/dbmf -I../../../src/libCom/ellLib -I../../../src/libCom/env -I../../../src/libCom/error -I../../../src/libCom/fdmgr -I../../../src/libCom/flex -I../../../src/libCom/freeList -I../../../src/libCom/gpHash -I../../../src/libCom/iocsh -I../../../src/libCom/log -I../../../src/libCom/macLib -I../../../src/libCom/misc -I../../../src/libCom/osi -I../../../src/libCom/pool -I../../../src/libCom/ring -I../../../src/libCom/taskwd -I../../../src/libCom/timer -I../../../src/libCom/yacc -I../../../src/libCom/yacc -I../../../src/libCom/yajl -I../../../include/compiler/msvc -I../../../include/os/WIN32 -I../../../include epicsUnitTest.d ../../../src/libCom/misc/epicsUnitTest.c

 

"make" was using 100% of one core at this point, and nothing further happened.  I had to kill it with ^C.

 

On base 3.14.12.5 win32-x86-static and windows-x64-static work fine with parallel make, while win32-x86 and windows-x64 do not. The failure for the archs that don’t work is an error reporting a missing .lib file I believe.

 

The performance is also significantly worse, even for non-parallel make.  These are times I measured today for building the win32-x86-static architecture:

 

3.14.12.5   parallel make       6 minutes
3.14.12.5   non-parallel make  22 minutes
3.15.2-rc1  non-parallel make  46 minutes

 

So it is taking about 7.5 times longer to build 3.15.2-rc1 than 3.14.12.5.  This is mostly because I can use parallel make on 3.14.12.5, but even non-parallel make is over twice as fast on the older version.

 

Thanks,

Mark

 

 

 

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Ralph Lange
Sent: Thursday, April 30, 2015 2:16 AM
To: EPICS Tech-Talk
Subject: Base R3.15.2-rc1 release

 

The (first) release candidate for the next release of the EPICS Base 3.15 series 3.15.2-rc1 is now available for download and testing. Please read the Release Notes to see what's new in this release.

         http://www.aps.anl.gov/epics/base/R3-15/2.php

Please test - especially if you have less widespread OS/architecture setups that you can test on - and report feedback and any problems you encounter here on the tech-talk list. Note that some support modules may require changes to build properly with this release, so check with a module author to see if they have a 3.15 version available if you encounter problems.

If no major issues arise with this code, the final 3.15.2 release will be created in the second week of May. Thanks to everyone who has contributed towards this release, and especially to the other core developers for all their work.

 

~Ralph

 


