
Subject: caEventRate
From: "Jeff Hill" <[email protected]>
To: "Ken Evans" <[email protected]>
Cc: <[email protected]>, "Bob Dalesio" <[email protected]>, "Marty Kraimer" <[email protected]>, "Eric Norum" <[email protected]>, "'Andrew Johnson'" <[email protected]>, "Ralph Lange" <[email protected]>, <[email protected]>
Date: Wed, 6 Nov 2002 17:57:57 -0700
Ken,

I am seeing a situation somewhat like what you reported for the PC. I
modified the program (from base) called caEventRate so that you can
vary the number of subscriptions. I started it on Linux specifying 10000
subscriptions and connecting to a gateway on my PC. The Linux system is
94% idle. My PC is running both the gateway and the IOC and is about
7% loaded; the gateway is using 7% of the CPU and the IOC is using
essentially none. The scan rate in your test database is 10 Hz, so with
10000 subscriptions the event rate should be 100 kHz, but this is not
what I am seeing from caEventRate, so we know that we are saturating
something. I can make a crude estimate of the load on the network.

Requested:
100 kHz * ( 16 byte CA header plus 8 bytes data ) = 2.4 MB/s = 19.2 Mb/s

Observed:
34 kHz * ( 16 byte CA header plus 8 bytes data ) = 816 KB/s = 6.5 Mb/s

This does not appear to be network limited.
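
For concreteness, the heart of the modified test is nothing more than a
loop creating N subscriptions on one channel and counting the callbacks.
The sketch below is my paraphrase against the R3.14 CA client API, not
the actual caEventRate source; the hard-coded subscription count and the
once-a-second report are simplifications.

    /* Minimal sketch of the subscription loop (not the actual
     * caEventRate source).  Non-preemptive callbacks are used, so
     * eventCallBack only runs inside ca_pend_event() and the counter
     * needs no locking. */
    #include <stdio.h>
    #include "cadef.h"

    static unsigned long eventCount = 0u;

    static void eventCallBack ( struct event_handler_args args )
    {
        eventCount++;
    }

    int main ( int argc, char ** argv )
    {
        const unsigned nSubscriptions = 10000u; /* from argv in the real program */
        chid chan;
        evid id;
        unsigned i;

        SEVCHK ( ca_context_create ( ca_disable_preemptive_callback ),
            "ca_context_create() failed" );
        SEVCHK ( ca_create_channel ( argv[1], 0, 0, CA_PRIORITY_DEFAULT,
            & chan ), "ca_create_channel() failed" );
        SEVCHK ( ca_pend_io ( 10.0 ), "channel connect timed out" );

        for ( i = 0u; i < nSubscriptions; i++ ) {
            /* the evid is never used because we never cancel */
            SEVCHK ( ca_create_subscription ( DBR_DOUBLE, 1, chan,
                DBE_VALUE, eventCallBack, 0, & id ),
                "ca_create_subscription() failed" );
        }
        SEVCHK ( ca_flush_io (), "ca_flush_io() failed" );

        while ( 1 ) {
            unsigned long before = eventCount;
            ca_pend_event ( 1.0 ); /* callbacks are dispatched here */
            printf ( "CA Event Rate (Hz): current %lu\n",
                eventCount - before );
        }
        return 0;
    }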

At this event rate the gateway is using 7% of the PC's CPUs. If I don't
compile base optimized then the PC's CPU consumption jumps to 40%, which
is close to saturation since the portable server isn't yet multithreaded
and therefore can't use both CPUs. The saturated event rate is 25 kHz if
the gateway isn't optimized and 34 kHz if it is.

~/epicsDvl/epics/base/bin/linux-x86$ caEventRate evans:perf:c0a0 10000
Connecting to CA Channel "evans:perf:c0a0". done.
Subscribing 10000 times. done.
Waiting for initial value events. done.
CA Event Rate (Hz): current 22285.3 mean 22285.3 std dev 0.000186959
CA Event Rate (Hz): current 26468.9 mean 24377.1 std dev 2091.8
CA Event Rate (Hz): current 24323 mean 24359.1 std dev 1708.14
CA Event Rate (Hz): current 24949.8 mean 24506.8 std dev 1501.24
CA Event Rate (Hz): current 25518.6 mean 24709.1 std dev 1402.42
CA Event Rate (Hz): current 25569 mean 24852.4 std dev 1319.72
CA Event Rate (Hz): current 25583.6 mean 24956.9 std dev 1248.33

~/epicsDvl/epics/base/bin/linux-x86$ caEventRate evans:perf:c0a0 10000
Connecting to CA Channel "evans:perf:c0a0". done.
Subscribing 10000 times. done.
Waiting for initial value events. done.
CA Event Rate (Hz): current 34444.7 mean 34444.7 std dev 0.000533621
CA Event Rate (Hz): current 33595.5 mean 34020.1 std dev 424.58
CA Event Rate (Hz): current 33678.9 mean 33906.4 std dev 382.167
CA Event Rate (Hz): current 34060.6 mean 33944.9 std dev 337.635
CA Event Rate (Hz): current 34038.8 mean 33963.7 std dev 304.317
CA Event Rate (Hz): current 34092.1 mean 33985.1 std dev 281.893
CA Event Rate (Hz): current 34097.3 mean 34001.1 std dev 263.917

If I bypass the gateway and connect to the IOC directly I see that the
maximum event rate approaches 213 kHz, with caEventRate using 70% of the
CPU on Linux and the IOC using about 22% of two CPUs on my PC.

213 kHz * ( 16 byte CA header plus 8 bytes data ) = 5.1 MB/s = 41 Mb/s

This is approaching the maximum network utilization that I have seen
with Channel Access over the network between these two machines.
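
For the record, all three crude load estimates above are just
rate * ( 16 byte header + 8 byte double ), as in this small check
program (the rates are the ones measured above):

    /* reproduces the crude network load estimates used above */
    #include <stdio.h>

    int main ( void )
    {
        const double eventRatesHz[] = { 100e3, 34e3, 213e3 };
        const double bytesPerEvent = 16.0 + 8.0; /* CA header + double */
        unsigned i;
        for ( i = 0u; i < 3u; i++ ) {
            double MBps = eventRatesHz[i] * bytesPerEvent / 1e6;
            printf ( "%8.0f Hz -> %5.1f MB/s = %5.1f Mb/s\n",
                eventRatesHz[i], MBps, 8.0 * MBps );
        }
        return 0;
    }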

~/epicsDvl/epics/base/bin/linux-x86$ caEventRate evans:perf:c0a0 20000
Connecting to CA Channel "evans:perf:c0a0".EPICS Environment
"EPICS_CA_SERVER_PORT" integer fetch failed
setting "EPICS_CA_SERVER_PORT" = 5064
 done.
Subscribing 20000 times. done.
Waiting for initial value events. done.
CA Event Rate (Hz): current 211345 mean 211345 std dev 0.00211696
CA Event Rate (Hz): current 213123 mean 212234 std dev 888.875
CA Event Rate (Hz): current 213293 mean 212587 std dev 881.032
CA Event Rate (Hz): current 213237 mean 212749 std dev 813.274
CA Event Rate (Hz): current 213407 mean 212881 std dev 773.456

The low 7% utilization at saturation by the gateway seems to point to
scheduling problems in either CA or in Windows.

I remember that the R3.13 CA client library had particularly poor
high-throughput put performance on Windows compared to its
high-throughput get performance. This disparity did not exist on other
operating systems. Here is an example.

D:\users\hill\R3.13\epics\base\src\ca\O.WIN32>catime joh:bill
Testing with 10000 channels named joh:bill
search test
Elapsed Per Item =   0.00006360 sec,    15723.3 Items per sec, 2.6 Mbps
Search tries per chan - mean=1.092800 std dev=0.290152 min=1.000000 max=2.00000

channel name=joh:bill, native type=6, native count=1
        pend event test
Elapsed Per Item =   0.00001000 sec,   100000.0 Items per sec
float test
        async put test
Elapsed Per Item =   0.00002943 sec,    33978.9 Items per sec, 4.6 Mbps
        async get test
Elapsed Per Item =   0.00000532 sec,   187969.9 Items per sec, 25.6 Mbps
        synch get test
Elapsed Per Item =   0.00030000 sec,     3333.3 Items per sec, 0.5 Mbps
string test
        async put test
Elapsed Per Item =   0.00001636 sec,    61124.7 Items per sec, 10.3 Mbps
        async get test
Elapsed Per Item =   0.00001262 sec,    79239.3 Items per sec, 13.3 Mbps
        synch get test
Elapsed Per Item =   0.00033000 sec,     3030.3 Items per sec, 0.5 Mbps
integer test
        async put test
Elapsed Per Item =   0.00002922 sec,    34223.1 Items per sec, 4.7 Mbps
        async get test
Elapsed Per Item =   0.00000613 sec,   163132.1 Items per sec, 22.2 Mbps
        synch get test
Elapsed Per Item =   0.00029000 sec,     3448.3 Items per sec, 0.5 Mbps
free test
Elapsed Per Item =   0.00000800 sec,   125000.0 Items per sec, 17.0 Mbps

In contrast, R3.14 catime runs much faster and does not show the large
disparity between puts and gets that R3.13 does.

D:\users\hill\epicsDvl\epics\base>catime joh:bill
Testing with 10000 channels named joh:bill
channel connect test
Elapsed Per Item =   0.00005709 sec,    17515.9 Items per sec, 9.2 Mbps
Search tries per chan - mean=1.356500 std dev=0.629927 min=1.000000 max=4.000000

channel name=joh:bill, native type=6, native count=1
        pend event test
Elapsed Per Item =   0.00002579 sec,    38775.6 Items per sec
float test
        async put test
Elapsed Per Item =   0.00000379 sec,   263913.4 Items per sec, 35.9 Mbps
        async get test
Elapsed Per Item =   0.00000529 sec,   189056.3 Items per sec, 49.9 Mbps
        synch get test
Elapsed Per Item =   0.00031608 sec,     3163.8 Items per sec, 0.8 Mbps
double test
        async put test
Elapsed Per Item =   0.00000418 sec,   238985.4 Items per sec, 32.5 Mbps
        async get test
Elapsed Per Item =   0.00000704 sec,   141972.2 Items per sec, 37.5 Mbps
        synch get test
Elapsed Per Item =   0.00030221 sec,     3308.9 Items per sec, 0.9 Mbps
string test
        async put test
Elapsed Per Item =   0.00000838 sec,   119318.4 Items per sec, 20.0 Mbps
        async get test
Elapsed Per Item =   0.00001277 sec,    78298.4 Items per sec, 23.2 Mbps
        synch get test
Elapsed Per Item =   0.00032599 sec,     3067.5 Items per sec, 0.9 Mbps
integer test
        async put test
Elapsed Per Item =   0.00000366 sec,   273575.3 Items per sec, 37.2 Mbps
        async get test
Elapsed Per Item =   0.00000513 sec,   194862.3 Items per sec, 51.4 Mbps
        synch get test
Elapsed Per Item =   0.00032621 sec,     3065.5 Items per sec, 0.8 Mbps
round trip jitter test
Round trip get delays - mean=0.000315 sec, std dev=0.000892 sec,
min=0.000273 sec, max=0.088771 sec
free test
Elapsed Per Item =   0.00000147 sec,   681094.6 Items per sec, 0.0 Mbps

As far as I can tell this appears to be related to two possible causes:

1) A Winsock specific problem when using select() to schedule outgoing
send data on Windows.
2) R3.13 does not use multiple processors as R3.14 does, because it is
select() scheduled.

I strongly suspect that both are involved, because Windows was the only
OS where R3.13 puts were slower than gets.

Since the portable server still uses select() scheduling, as the CA
client library did in R3.13, and since a monitor stream approaching
maximum throughput is a very similar situation to a put stream
approaching maximum throughput, one is left with a strong suspicion
that this particular performance anomaly is related to the use of
select() for scheduling in the portable CA server on Windows.
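
To make the distinction concrete, a select() scheduled loop has the
shape sketched below. The helper routines are made-up names, not the
portable server's actual code; the point is only that input servicing
and outgoing event flushing are serialized in one thread behind a
single select() call, so a second CPU cannot help. (Shown with the
BSD/POSIX header; Winsock's select() has the same shape.)

    /* Schematic select() scheduled event loop with hypothetical
     * helper names.  All I/O -- incoming requests and queued outgoing
     * monitor updates -- is dispatched serially from one thread. */
    #include <sys/select.h>

    extern int nfds;            /* highest numbered socket plus one */
    extern void buildFdSets ( fd_set * pRead, fd_set * pWrite );
    extern void serviceInput ( const fd_set * pRead );
    extern void flushOutput ( const fd_set * pWrite );

    void eventLoop ( void )
    {
        fd_set readSet, writeSet;
        for ( ;; ) {
            buildFdSets ( & readSet, & writeSet );
            /* the one and only thread blocks here */
            if ( select ( nfds, & readSet, & writeSet, 0, 0 ) <= 0 ) {
                continue;
            }
            serviceInput ( & readSet );  /* read and process requests */
            flushOutput ( & writeSet );  /* send queued event data */
        }
    }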

To further test this theory I ran caEventRate against the example
server, and I saw very nearly the same event rate limit and the same
low 7% CPU utilization.

~/epicsDvl/epics/base/bin/linux-x86$ caEventRate joh:jane 10000
Connecting to CA Channel "joh:jane".EPICS Environment
"EPICS_CA_SERVER_PORT" integer fetch failed
setting "EPICS_CA_SERVER_PORT" = 5064
 done.
Subscribing 10000 times. done.
Waiting for initial value events. done.
CA Event Rate (Hz): current 33096.7 mean 33096.7 std dev 0.000507684
CA Event Rate (Hz): current 33936.6 mean 33516.7 std dev 419.933
CA Event Rate (Hz): current 33508.1 mean 33513.8 std dev 342.897
CA Event Rate (Hz): current 33890.5 mean 33608 std dev 338.816

So the bottom line is that I think I understand this problem, and I
will move on to looking at your efficiency issues on Solaris.

Are you certain that all parts of the code {gateway, cas, ca, libCom,
gdd} were compiled optimized when you ran your tests?

Jeff
__________________________________________________________
Jeff Hill                     Internet     [email protected]
LANL MS H820                  Voice        505 665 1831
Los Alamos NM 87545 USA       FAX          505 665 5107

