EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: FW: Gige performance increasing.
From: Mark Rivers <[email protected]>
To: "[email protected]" <[email protected]>
Date: Sat, 20 Apr 2013 14:16:34 +0000
Folks,

I am forwarding a message reporting some interesting problems running multiple Prosilica cameras on a single machine.

I have removed the complete report attachment for tech-talk, but the important table should be in the message.

Suggestions are most appreciated!

Mark

________________________________
From: Mark Rivers
Sent: Saturday, April 20, 2013 9:07 AM
To: Slava Isaev
Cc: Matjaz Kobal; Spencer J. Gessner; Matthew Boyes; Luciano Piccoli; Williams Jr., Ernest L.; Andrew Johnson
Subject: RE: Gige performance increasing.

Folks,

Yesterday I set up a system in my lab to try to reproduce the results that Matjaz presented in the report he sent me on Nov. 7, 2012.   I have attached the complete report.  The following is Table 1 from the report.

#cameras  Fps   maxScpu [%]  totalCpu [%]  Dropped [%]  TIFF  JPEG  Proc  Ana
   7      30        40           220           5
   7      10        20            80           0.5
   7      10        50           240           1                 *
   7      10        20           120           1.5         *
   7      10        60           270           3                        *
   7      10        80           600          30                        *     *
   7      10        60           300          10           *            *
   7      10        60           400          15                 *      *
   7      10        90           700          25                 *      *     *
   7       5        60           400          20                 *      *     *
   7       2        40           190           3                 *      *     *
   7       1        20           100           1                 *      *     *
   7       0.2      10            25           2                 *      *     *
   6      10        20            60           0.05
   6      10        40           220          10                 *
   6      10        80           600          30                 *      *     *
   6       5        60           360          10                 *      *     *
   6       2        30           160           5                 *      *     *
   6       1        20            90           0.5               *      *     *
   5      10        20            60           0.2
   5      10        40           180           2                 *
   5      10        80           500          20                 *      *     *
   5       5        60           300          10                 *      *     *
   5       1        15            70          <0.5               *      *     *
   4      10        20            60           0.1
   4      10        40           140           1                 *
   4      10        70           400          20                 *      *     *
   4       5        60           250           5                 *      *     *
   4       1        20            60           0.5               *      *     *
   3      10        30            40           0.5
   3      10        50           100           0.7               *
   3      10        80           340          15                 *      *     *
   3       5        50           190           5                 *      *     *
   3       1        10            40          <0.05              *      *     *
   2      10        20            30          <0.01
   2      10        30            80           0.1               *
   2      10        80           240          10                 *      *     *
   2       5        40           140           1                 *      *     *
   2       1        10            30          <0.05              *      *     *
   1      10        10            20          <0.01
   1      10        30            40          <0.01              *
   1      10        80           140           1                 *      *     *
   1       5        40            80          <0.02              *      *     *

(A * in the TIFF, JPEG, Proc, or Ana column means that plugin was enabled for that test.)


He was testing on a Linux system with 8 network cards (dedicated card per camera) and dual 6-core CPUs (12 cores total).  Each camera ran in its own IOC, so in its own process on Linux.

Note that with 3 cameras running at 10 Hz and all 3 plugins (JPEG, Process, and Analyze) running he observed 80% usage of a single CPU, and 340% usage of the total CPUs.  There was 15% frame loss between the cameras and the computer under these conditions.

The system I have available for testing has a single GigE network card and only 4 cores.  It is a dual-boot system with Fedora Core 14 and Windows 7 64-bit.

I first tested on Fedora.  I ran tests for each camera to collect 1000 frames, which took 100 seconds at 10 frames/sec.  Running 3 cameras at 10 Hz I observed results similar to what Matjaz did.  It was dropping less than 0.1% of the frames when I had no plugins enabled, and more than 10% of the frames when I enabled the JPEG and Statistics plugins on each camera, which was putting more than 80% load on each CPU.  I will get some more precise numbers next week before I come, but I believe I am effectively seeing results similar to Matjaz's.
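The expected-frame arithmetic behind these numbers is simple; a small helper (the function names are mine, just for illustration, not part of areaDetector) makes it explicit:

```python
def expected_frames(rate_hz, seconds):
    """Frames a camera should deliver at a given rate over a test run."""
    return int(rate_hz * seconds)

def dropped_pct(received, expected):
    """Percentage of frames lost between the camera and the computer."""
    return 100.0 * (expected - received) / expected

# 1000 frames at 10 frames/sec takes 100 seconds per camera:
assert expected_frames(10, 100) == 1000

# e.g. 999 of 1000 frames received means 0.1% dropped:
print(dropped_pct(999, 1000))
```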

However, I then booted the system with Windows 7 and conducted the identical tests.  With all 3 cameras running at 10 Hz, and the JPEG and Statistics plugins running, Windows was reporting over 90% CPU utilization.  Windows task manager reports the %CPU utilization such that 100% means that all cores are saturated, unlike Linux, which reports NCores*100% when all cores are saturated.  Thus the system was very close to fully CPU saturated.  Under these conditions the cameras did not drop a SINGLE frame! This was 0 dropped frames out of 3000 total, so less than 0.03%, compared to 15% dropped frames that Matjaz measured under almost identical conditions on a much more powerful Linux server with dedicated Ethernet port per camera and 3 times more cores.  The plugins also did not drop any frames, although I was monitoring the free queue size in each plugin, and they occasionally came close to depleting the queue and dropping frames.  That is exactly what I expect as the CPUs approach saturation.
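Since the two operating systems report CPU utilization on different scales, it can help to normalize them explicitly; a one-line sketch (the function name is mine):

```python
def linux_to_windows_pct(linux_pct, ncores):
    """Convert Linux-style CPU usage (100% per saturated core, so up to
    ncores*100) to the Windows task-manager convention (100% = all cores
    saturated)."""
    return linux_pct / ncores

# Matjaz's 340% total on a 12-core server is modest on the Windows scale:
print(linux_to_windows_pct(340, 12))

# while loading Mark's 4-core test box to 360% would read as 90%:
print(linux_to_windows_pct(360, 4))
```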

So my conclusion is that whatever is causing the dropped frames is not really a problem with the areaDetector driver or architecture, but is something specific to the Linux Ethernet driver or perhaps the Linux AVT driver library.

IMPORTANT NOTE:

I tried to automate my testing by writing an IDL script to turn the cameras on, wait for them to get done, and then read the statistics on the dropped frames.  This was using the normal IDL channel access library.  I observed VERY WEIRD behavior which I do not understand at all.  Here is what I observe:

- If I start each camera acquiring by using medm to set the Acquire PV to 1 then it almost always works fine.  I press Acquire on camera 1, then quickly press Acquire on camera 2, and then camera 3.  If I do that with all plugins disabled, then each camera starts acquiring at 10 frames/sec with essentially no dropped frames.

- However, if I do the "identical" operation using IDL to set the Acquire PV to 1 on each camera in succession, here is what I see:
  - If I only start 2 cameras rather than 3, it works fine.  Both cameras acquire at 10 Hz with no dropped frames.
  - If I start 3 cameras with, say, a 2 second delay between starting each one (to simulate my delay when using medm to do it) then the first 2 cameras begin acquiring at 10 Hz.  But as soon as the third camera is started all 3 cameras begin dropping MORE THAN 90% of their frames!
  - This behavior of dropping 90% of frames when camera 3 starts happens no matter what delay (0.1 to 5 seconds) I put between starting the cameras.
  - I see identical behavior on Linux and Windows.

IDL and medm are both running on another Linux machine, not the machine running the camera IOCs, so these are channel access put operations from the same remote machine.

I am totally baffled by this.  Why does it make a difference if I start the cameras with medm or IDL?  They should both result in similar channel access put operations.  Furthermore, what can the IDL put operation be doing that causes the cameras to suddenly begin to drop 90% of their frames?
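For what it's worth, the start-up sequence is easy to factor out so the CA client (medm, IDL, caput, or anything else) is interchangeable; a minimal Python sketch, with hypothetical PV prefixes modeled on the names in this message and a pluggable put function:

```python
import time

def start_cameras(ca_put, prefixes, delay_s=2.0):
    """Set Acquire=1 on each camera IOC in succession, mimicking pressing
    Acquire in medm with a pause between cameras."""
    for prefix in prefixes:
        ca_put(prefix + "cam1:Acquire", 1)
        time.sleep(delay_s)

# With a real client you would pass an actual channel-access put function.
# Here a recording stub stands in so the sequencing itself can be checked:
log = []
start_cameras(lambda pv, value: log.append((pv, value)),
              ["13PS1:", "13PS2:", "13PS3:"], delay_s=0.0)
print(log)
```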

I see one other behavior that I don't understand.  When I use the "caput" program from EPICS base (3.14.12.3) to write to any PV in the camera IOC I see about a 2 second delay before the caput completes:

corvette:~>date ; /usr/local/epics/base-3.14.12.3/bin/linux-x86/caput 13PS1:cam1:Gain.DESC "Test" ; date
Sat Apr 20 08:58:54 CDT 2013
Old : 13PS1:cam1:Gain.DESC           Test
New : 13PS1:cam1:Gain.DESC           Test
Sat Apr 20 08:58:56 CDT 2013

Note that "date" is reporting that this operation took about 2 seconds.  There is a very noticeable delay between when the "New" value of the PV is printed and when the Linux shell prompt returns.  Why?  This happens when all 3 cameras are not acquiring, so it cannot be a problem with Ethernet loading.  It happens whether the camera IOCs are running on Windows or Linux.
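Wrapping the command in a timer gives a cleaner measurement than bracketing it with "date"; a generic sketch (the caput command line shown in the comment is the one from this message, and is assumed to be on the PATH):

```python
import subprocess
import time

def timed_run(cmd):
    """Run a command and return (elapsed seconds, exit status)."""
    t0 = time.monotonic()
    status = subprocess.call(cmd)
    return time.monotonic() - t0, status

# For the caput case:
#   timed_run(["caput", "13PS1:cam1:Gain.DESC", "Test"])
# A well-behaved put should complete in well under a second; a consistent
# ~2 s points at a client- or server-side delay rather than network load.
# Demonstrated here with a harmless stand-in command:
elapsed, status = timed_run(["sleep", "0.2"])
print(round(elapsed, 2), status)
```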

If I write to a PV in a vxWorks IOC I do not see this delay; the Linux prompt returns "immediately" with no perceptible delay.

corvette:~>date ; /usr/local/epics/base-3.14.12.3/bin/linux-x86/caput 13LAB:m1.DESC "Test" ; date
Sat Apr 20 08:59:09 CDT 2013
Old : 13LAB:m1.DESC                  test
New : 13LAB:m1.DESC                  Test
Sat Apr 20 08:59:09 CDT 2013

I wonder if this could be related to the problem I am seeing with IDL starting the cameras?

Cheers,
Mark




_____________________________________
From: Slava Isaev [[email protected]]
Sent: Wednesday, April 10, 2013 7:53 AM
To: Mark Rivers
Cc: Matjaz Kobal; Spencer J. Gessner; Matthew Boyes; Luciano Piccoli; Williams Jr., Ernest L.
Subject: Re: Gige performance increasing.

Hi Mark,

This is a continuation of the same problem. As you suggested to Matjaz, I have built the AVT samples and run them.

Now one camera is running the AVT streaming sample and the other 6 are running areaDetector with plugins.
When the plugins are enabled, even the AVT sample loses frames.
When the plugins are disabled there are no lost frames.

As you can imagine, areaDetector behaves the same way with plugins disabled.
When plugins are enabled on the other cameras, it loses frames.

Best regards,
Slava Isaev

----- Original Message -----
From: "Mark Rivers" <[email protected]>
To: "Williams Jr., Ernest L." <[email protected]>, "Slava Isaev" <[email protected]>
Cc: "Matjaz Kobal" <[email protected]>, "Spencer J. Gessner" <[email protected]>, "Matthew Boyes" <[email protected]>, "Luciano Piccoli" <[email protected]>
Sent: Tuesday, April 9, 2013 5:25:31 PM
Subject: RE: Gige performance increasing.

How do these measurements relate to the ones that Matjaz reported earlier?

I have attached the message that Matjaz sent with his report, and my markup of his report.

I cannot reproduce this configuration here, because I don't have a Linux machine with multiple GigE Ethernet interfaces. But Matjaz was reporting frame loss even when the aggregate load was much less than GigE saturation, so I can test it with a single GigE interface and multiple IOCs.

Mark


-----Original Message-----
From: Williams Jr., Ernest L. [mailto:[email protected]]
Sent: Tuesday, April 09, 2013 10:08 AM
To: Slava Isaev
Cc: Matjaz Kobal; Gessner, Spencer J.; Boyes, Matthew; Mark Rivers; Williams Jr., Ernest L.; Piccoli, Luciano
Subject: RE: Gige performance increasing.

Hi Slava,

I am CCing Mark Rivers.
Let's bring Mark into the loop as well.

Please describe the complete setup and test scenario for us.
Also, quantify the actual frame loss in frames/sec.

Mark, have you been experiencing frame loss?
________________________________________
From: Slava Isaev [[email protected]]
Sent: Tuesday, April 09, 2013 7:56 AM
To: Williams Jr., Ernest L.
Cc: Matjaz Kobal; Gessner, Spencer J.; Boyes, Matthew
Subject: Re: Gige performance increasing.

Hi Ernest,

Sorry to say, I have to confirm that it is still losing frames.

The Prosilica driver is based on the AVT library, and it loses frames even with the AVT samples.

I am running 6 cameras through areaDetector with plugins enabled, and the 7th camera is read by the AVT sample,
so the experiment is close to real life. The percentage of lost frames is about 2.5%.
That percentage is lower than with 7 cameras using areaDetector, but it could be because of the 15% (1/7) reduced payload.

I would say this indicates that the frame loss itself is independent of the application (we are losing frames with and without areaDetector).
At the same time, we can't reproduce it with plain CPU loading and AVT, which means that the areaDetector configuration we used generates
another kind of load that affects frame acquisition. It could be I/O load, kernel locking, etc.


I am going to test whether the problem is related to image saving (I/O load).
It would be good if it were possible to install this tool:
http://www.iozone.org/


Best regards,
Slava Isaev

----- Original Message -----
From: "Williams Jr., Ernest L." <[email protected]>
To: "Slava Isaev" <[email protected]>
Cc: "Matjaz Kobal" <[email protected]>, "Spencer J. Gessner" <[email protected]>, "Williams Jr., Ernest L." <[email protected]>, "Matthew Boyes" <[email protected]>
Sent: Tuesday, April 9, 2013 12:28:24 AM
Subject: RE: Gige performance increasing.

Hi Slava,

How are you?

Please provide an update on the GigE camera vetting.
Matjaz told me last week that you may have some good news.



Cheers,
Ernest
________________________________________
From: Slava Isaev [[email protected]]
Sent: Monday, March 04, 2013 8:29 AM
To: Williams Jr., Ernest L.
Cc: Matjaz Kobal; Gessner, Spencer J.
Subject: Gige performance increasing.

Hi Ernest,

Here is my suggestion for how we can improve the performance of the GigE cameras, i.e. eliminate lost frames.

In short, I would try to use a dedicated frame grabber that has on-board GigE interfaces and a processor.
That would make the GigE cameras independent of the main CPU.


BR,
Slava
Slava Isaev, Cosylab
Senior Software Developer    http://www.cosylab.com
Email: [email protected]      Teslova ulica 30
Phone: +1 (386)14 776-676    SI-1000 Ljubljana
Cell:  +1 (386)30 323-999    Slovenia



