Hi Matt,
In the ca.create_channel method, after libca.ca_create_channel it also does a poll. It would be more efficient to poll once after creating all the channels.
I only have ~3000 channels to test with, but the connection time already shows a clear difference:
Poll after each channel creation: 4.5 sec
Poll once after all channels are created: 0.35 sec
So the Python program would in the end match the speed of sddscasr, whew...
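For reference, the effect can be sketched without any IOC at all; poll() below is only a stand-in for the libca poll call (modeled as a fixed per-call overhead), and all names here are hypothetical:

```python
import time

def poll(cost=0.001):
    # Stand-in for the libca poll: a fixed per-call overhead.
    time.sleep(cost)

names = ["PV:%d" % i for i in range(200)]

# Pattern 1: poll after each channel creation (what ca.create_channel does)
start = time.time()
channels = []
for name in names:
    channels.append(name)  # stand-in for libca.ca_create_channel
    poll()
per_call = time.time() - start

# Pattern 2: create all channels first, then poll once
start = time.time()
channels = list(names)     # stand-in for the creation loop
poll()
batched = time.time() - start

print(batched < per_call)  # True: one poll amortized over all channels
```

With 200 channels the batched pattern pays the poll overhead once instead of 200 times, which is the same shape as the 4.5 sec vs 0.35 sec numbers above.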
BTW: I originally did this to see how CaChannel performs in comparison.
Best,
Xiaoqiang
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Matt Newville
Sent: Saturday, September 17, 2011 5:40 PM
To: Matt Newville; EPICS tech-talk
Subject: Re: How to write a python-based backup tool?
Hi Emmanuel,
OK, I also improved my earlier numbers with a more careful test.
With 20K PVs, all connected and all on a single subnet (but distributed over several IOCs, most of them VME/vxWorks IOCs), I get about 13 to 14 sec total, or ~0.7 msec / PV, with it basically breaking down as
20K x ( pv = epics.PV(pname) ) : 10 to 11 sec
20K x ( val = pv.get() ) : 2 to 3 sec
write to disk : 0.1 sec
I'm pretty sure most of that 10 to 11 sec for the PV "creation" is actually in the automated connection and event callbacks.
Using the lower-level epics.ca calls didn't change the total time at all. But that does still use a lot of python code, including implicit connection callback and unpacking of the values from ca_array_get(), etc. I haven't tried using the simplest, most direct calls to the CA library. I'm curious if there would be any improvement.
You say that sddscasr and burtrb are 5 times faster. To be honest, I'd be surprised if a pyepics solution could be improved by a factor of 5, but it's worth looking into.
Anyway, my earlier estimates were definitely pessimistic (and my times probably dominated by unconnected PVs).
--Matt Newville <newville at cars.uchicago.edu> 630-252-0431
On Fri, Sep 16, 2011 at 5:38 PM, <[email protected]> wrote:
>
>
> With the same request file (~11K PVs), I have:
>   sddscasr: 2 sec
>   burtrb:   2 sec
>   python:   10 sec
> (with a gigabit network)
>
> I am wondering what the reason for the difference is.
> As you can see below, just the connection to the PVs is comparatively 'slow'.
>
> debug output:
> [Fri Sep 16 16:11:15 2011] import epics start
> [Fri Sep 16 16:11:15 2011] import epics end
> [Fri Sep 16 16:11:15 2011] request read
> [Fri Sep 16 16:11:21 2011] connected to pv
> [Fri Sep 16 16:11:23 2011] fetched PV values
> [Fri Sep 16 16:11:23 2011] write on disk
> [Fri Sep 16 16:11:23 2011] write completed
>
> with
>
> #!/usr/bin/env python
>
> WRITE_ON_DISK = True
> USE_SQLITE = False
>
> import time
>
> print "[%s] import epics start" % time.asctime( time.localtime(time.time()) )
> import epics
> print "[%s] import epics end" % time.asctime( time.localtime(time.time()) )
>
> pvNames = []
> pvs = []
> vals = []
>
> with open("burt.req") as fp:
>     for line in iter(fp.readline, ''):
>         pvNames.append(line[:-1])
>
> print "[%s] request read" % time.asctime( time.localtime(time.time()) )
>
> for pvName in pvNames:
>     pvs.append(epics.PV(pvName))
>
> print "[%s] connected to pv" % time.asctime( time.localtime(time.time()) )
>
> for pv in pvs:
>     vals.append(pv.get())
>
> print "[%s] fetched PV values" % time.asctime( time.localtime(time.time()) )
>
> if not WRITE_ON_DISK:
>     exit()
>
> print "[%s] write on disk" % time.asctime( time.localtime(time.time()) )
>
> if USE_SQLITE:
>     from pysqlite2 import dbapi2 as sqlite
>
>     connection = sqlite.connect('test.db')
>     cursor = connection.cursor()
>     cursor.execute('CREATE TABLE backup (id INTEGER PRIMARY KEY, pvname VARCHAR(50), value VARCHAR(50))')
>
>     for index, pvName in enumerate(pvNames):
>         sqlquery = "INSERT INTO backup VALUES (null, '%s', '%s')" % (pvName, vals[index])
>         #cursor.execute('INSERT INTO names VALUES (null, "%s" % pvName, "%s" % vals[index])')
>         cursor.execute(sqlquery)
>
>     connection.commit()
>     cursor.close()
>     connection.close()
> else:
>     with open("test.txt", 'w') as fp:
>         for index, pvName in enumerate(pvNames):
>             print >>fp, "%s %s" % (pvName, vals[index])
>
> print "[%s] write completed" % time.asctime( time.localtime(time.time()) )
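(An editorial aside: the string-formatted INSERT in the script above breaks as soon as a value contains a quote character. The DB-API '?' placeholder form avoids that. A minimal sketch with the stdlib sqlite3 module, which pysqlite2 also implements; the table layout matches the script, the PV names and values are hypothetical:)

```python
import sqlite3

connection = sqlite3.connect(':memory:')  # the script uses 'test.db'
cursor = connection.cursor()
cursor.execute('CREATE TABLE backup (id INTEGER PRIMARY KEY, '
               'pvname VARCHAR(50), value VARCHAR(50))')

# hypothetical name/value pairs; note the embedded quote in the second value
rows = [("S1:pressure", "1.2e-7"), ("S1:status", "it's fine")]

# '?' placeholders let the driver do the quoting
cursor.executemany("INSERT INTO backup VALUES (null, ?, ?)", rows)
connection.commit()

count = cursor.execute("SELECT count(*) FROM backup").fetchone()[0]
print(count)  # 2
```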
>
>
>
>
>
>
> On 16:30 Fri 16 Sep , Matt Newville wrote:
>> Hi Emmanuel,
>>
>> On Fri, Sep 16, 2011 at 3:09 PM, <[email protected]> wrote:
>> >> pvlist = []
>> >> for pvName in pvNames:
>> >>     pvlist.append( epics.PV(pvName) )
>> >>
>> >> for pv in pvlist:
>> >>     val = pv.get()
>> >>     <store value>
>> >>
>> >> With this sort of approach, I typically see on the order of 10ms
>> >> per PV connection on startup. That is, if I create and connect
>> >> to 5K PVs, it takes ~50 seconds (meaning 40 to 80 seconds) to get
>> >> initial values. I believe that is mostly the CA library, not the
>> >> python part, and I believe it would scale, suggesting that any
>> >> save/restore process that runs once and then quits would take 10 minutes for 50K PVs.
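(The 10-minute figure is just the per-connection estimate scaled up; spelling out the arithmetic:)

```python
n_pvs = 50000
sec_per_pv = 0.010  # ~10 ms to create and connect one PV, as estimated above
total_minutes = n_pvs * sec_per_pv / 60.0
print(round(total_minutes, 1))  # 8.3, i.e. roughly the quoted 10 minutes
```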
>> >
>> > Are you saying that 10 min for 50K PVs is also what is observed with C utilities?
>> > 10 min for 50K is rather long ...
>>
>> I'd be happy to be shown wrong, and I'm certainly not the right
>> person here to answer for sure, but I think this might be the case.
>>
>> > If that is the case, then one would be careful to back up only PVs
>> > which are of interest and not the 'entire' machine.
>> > How is this being managed at large installations?
>>
>> I think they break up the set of variables across
>> processes/machines/NICs and/or have a long-running process that
>> repeatedly saves variables instead of
>> start process; connect to 50K PVs; write values; end process.
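(The long-running pattern can be sketched with a stand-in PV class, since the point is only where the connection cost is paid; FakePV and its 1 ms cost are hypothetical, not pyepics API:)

```python
import time

class FakePV(object):
    """Stand-in for epics.PV: connection cost is paid once, at creation."""
    def __init__(self, name):
        self.name = name
        time.sleep(0.001)  # simulated ~1 ms connect cost
    def get(self):
        return 0.0         # reads on a connected channel are cheap

names = ["PV:%03d" % i for i in range(50)]
pvs = [FakePV(n) for n in names]      # connect once, at startup

snapshots = []
for _ in range(3):                    # save repeatedly, no reconnects
    snapshots.append(dict((pv.name, pv.get()) for pv in pvs))

print(len(snapshots))  # 3 snapshots for a single round of connections
```

Three full saves here cost one round of connection delays, instead of three, which is the whole advantage over the start/connect/write/end pattern.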
>>
>> > I understand your points, but somehow I am having a hard time believing this.
>> > Did anyone benchmark their backup platform?
>> > (I remember seeing a few powerpoints on this issue a while back...)
>>
>> For what it's worth, and also gladly proven wrong, etc: I recall
>> that with non-preemptive-callback context, creating and connecting
>> PVs took exactly 30ms, to very high precision and reproducibility. I
>> never fully understood that.... From that view, the 10 ms average
>> with the preemptive-context is an improvement.
>>
>> It's probably easy to tell whether it is closer to 1, 10, or 30 ms
>> even with a few hundred PVs. But even at 1 ms / PV connection, you
>> would probably not want to start a new connection per PV when saving
>> 50K PVs.
>>
>> > PS:
>> > Is there a difference between a connection to 50K PVs versus 50K connections to the same PVs?
>>
>> Yes, these are definitely different.
>>
>> --Matt
>>
>
> --
> Emmanuel
>
>