Experimental Physics and Industrial Control System


 

Subject: Re: Archiver: Problems with disconnected PVs
From: "Shankar, Murali" <[email protected]>
To: Gabriel de Souza Fedel <[email protected]>, "[email protected]" <[email protected]>
Date: Tue, 5 Sep 2017 17:33:37 +0000
>> When I restart the archiver, the "broken" PVs go back to normal.
>> I will keep monitoring them.
This is always a good idea. I have a script that calls the getCurrentlyDisconnectedPVs BPL and validates the PVs it reports on a path independent of whatever the archiver is using. That is, if the archiver goes through a gateway, this monitoring script checks the liveness of the PVs directly on the VLAN.
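
A minimal sketch of such a monitoring script is shown here; it is an illustration, not the actual script mentioned above. It assumes the management BPL is reachable at http://localhost:17665/mgmt/bpl and that the report is a JSON list of objects carrying a "pvName" key. The `is_alive` callable is hypothetical: it stands for whatever independent check you run directly on the VLAN.

```python
import json
from urllib.request import urlopen

# Assumed management URL; adjust for your deployment.
MGMT_BPL = "http://localhost:17665/mgmt/bpl"


def parse_disconnected_report(report_json):
    """Extract PV names, assuming a JSON list of objects with a "pvName" key."""
    return [entry["pvName"] for entry in json.loads(report_json) if "pvName" in entry]


def fetch_disconnected_pvs(base_url=MGMT_BPL, timeout=10.0):
    """Ask the archiver which PVs it currently considers disconnected."""
    with urlopen(f"{base_url}/getCurrentlyDisconnectedPVs", timeout=timeout) as resp:
        return parse_disconnected_report(resp.read().decode("utf-8"))


def live_but_disconnected(pv_names, is_alive):
    """Keep the PVs the archiver reports as disconnected but the probe can reach.

    is_alive is a hypothetical callable (pv_name -> bool) that checks the PV
    on a path that does not go through the archiver's gateway.
    """
    return [pv for pv in pv_names if is_alive(pv)]
```

In practice `is_alive` could wrap any Channel Access client run from a host on the IOC's VLAN; PVs returned by `live_but_disconnected` are the ones worth investigating on the archiver side.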

Please let me know if you find any more details.

Regards,
Murali

________________________________________
From: Gabriel de Souza Fedel <[email protected]>
Sent: Monday, September 4, 2017 5:28 AM
To: Shankar, Murali; [email protected]
Cc: ([email protected])
Subject: Re: Archiver: Problems with disconnected PVs

Hi,

When I restart the archiver, the "broken" PVs go back to normal. I will keep
monitoring them.

On 01-09-2017 16:43, Shankar, Murali wrote:
> Also, which version is this? Apologies if you mentioned it before but I naturally assume you have a reasonably recent version.
Our version is from October 2016.

>
> Regards,
> Murali
Regards, and thank you again for the help.


>
> ________________________________________
> From: Shankar, Murali
> Sent: Friday, September 1, 2017 12:23 PM
> To: Gabriel de Souza Fedel; [email protected]
> Cc: ([email protected])
> Subject: Re: Archiver: Problems with disconnected PVs
>
>>> One thing looks strange: max_array_bytes.
> This would probably affect your waveforms mostly...
>
>>> and a few have problems (they eventually disconnect).
> Are all these PVs from one or two IOCs? In that case I would look at the IOC.
>
>>>>> Your PV Details page for this PV could have information
> There should be lines here for "When did we request CA to make a connection to this PV?" and "Time elapsed since search request (s)". Do these look OK? Since you paused/resumed the PV, this should be the time you resumed the PV.
>
> The "Currently Disconnected PVs" report also has some additional information; this is not always helpful, but if all your live but disconnected PVs are in the same CAJ context ID then a restart may be required. It might also be useful to get a stack trace of all the threads; there might be some clue there about where the search thread for that context is stuck.
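
The "same CAJ context ID" check can be sketched as a simple grouping. This is only an illustration: it assumes each report entry is a dict with a "pvName" key and that the context ID is exposed under a key named "commandThreadID"; both key names are assumptions, so pass a different `context_key` if your report differs.

```python
from collections import defaultdict


def group_by_context(entries, context_key="commandThreadID"):
    """Group PV names by CAJ context ID (the key name is an assumption)."""
    groups = defaultdict(list)
    for entry in entries:
        context_id = str(entry.get(context_key, "unknown"))
        groups[context_id].append(entry["pvName"])
    return dict(groups)


def single_shared_context(groups):
    """Return the context ID if every PV sits in the same one, else None."""
    return next(iter(groups)) if len(groups) == 1 else None
```

Feed it only the entries for PVs that are live but disconnected; a non-None result from `single_shared_context` is the situation in which a restart may be required.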
>
>>> I took a look, but I can't see anything strange.
> I think after this we get into wireshark territory (aka I'm out of ideas). You'll need to see if we are issuing search requests properly and getting proper responses etc.
>
> Regards,
> Murali
> ________________________________________
> From: Gabriel de Souza Fedel <[email protected]>
> Sent: Friday, September 1, 2017 11:59 AM
> To: Shankar, Murali; [email protected]
> Cc: ([email protected])
> Subject: Re: Archiver: Problems with disconnected PVs
>
> On 01-09-2017 15:10, Shankar, Murali wrote:
>>>> is there another location
>> Depends on your setup of course but I see something like this in my arch.log right at the beginning.
>>
>> <context class="com.cosylab.epics.caj.CAJContext">
>>   <preemptive_callback>true</preemptive_callback>
>>   <addr_list>gateway:5076 gateway:5077 gateway:5078 other-gateway:5064</addr_list>
>>   <auto_addr_list>false</auto_addr_list>
>>   <connection_timeout>30.0</connection_timeout>
>>   <beacon_period>15.0</beacon_period>
>>   <repeater_port>5069</repeater_port>
>>   <server_port>5076</server_port>
>>   <max_array_bytes>80000000</max_array_bytes>
>>   <event_dispatcher class="org.epics.archiverappliance.engine.epics.JCAEventDispatcherBasedOnPVName"/>
>> </context>
>>
> I found it:
>
>
> <context class="com.cosylab.epics.caj.CAJContext">
>   <preemptive_callback>true</preemptive_callback>
>   <addr_list></addr_list>
>   <auto_addr_list>true</auto_addr_list>
>   <connection_timeout>30.0</connection_timeout>
>   <beacon_period>30.0</beacon_period>
>   <repeater_port>5065</repeater_port>
>   <server_port>5064</server_port>
>   <max_array_bytes>30.0</max_array_bytes>
>   <event_dispatcher class="org.epics.archiverappliance.engine.epics.JCAEventDispatcherBasedOnPVName"/>
> </context>
>
> One thing looks strange: max_array_bytes. It looks a bit low; could that be the
> problem?
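
To compare a logged context dump against what your waveforms need, the XML can be parsed directly. The sketch below assumes the `<context>` element has been copied out of the log as a standalone string; the 1000-element, 8-byte-per-element waveform is a hypothetical example, and the size check is a rough estimate that ignores any protocol overhead.

```python
import xml.etree.ElementTree as ET


def parse_caj_context(xml_text):
    """Map each child tag of <context> to its (stripped) text value."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def max_array_bytes(settings):
    """The log prints values such as "30.0", so go through float first."""
    return int(float(settings["max_array_bytes"]))


def waveform_fits(limit_bytes, element_count, bytes_per_element):
    """Rough check: does the raw payload fit within max_array_bytes?"""
    return element_count * bytes_per_element <= limit_bytes


SAMPLE = """<context class="com.cosylab.epics.caj.CAJContext">
  <auto_addr_list>true</auto_addr_list>
  <max_array_bytes>30.0</max_array_bytes>
</context>"""

settings = parse_caj_context(SAMPLE)
limit = max_array_bytes(settings)
# Hypothetical waveform: 1000 elements of 8 bytes each.
print(limit, waveform_fits(limit, 1000, 8))  # prints "30 False"
```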
>
>>>> pause/resume I tried
>> Pausing/resuming tears down and recreates the CAJ channel for the PV so if you are unable to connect even after this you probably have some misconfiguration; that is, you can rule out most transient errors.
>>
>>>> IOC's log on ioc machine right
>> Yes; sometimes there could be stuck tasks on the IOC side; you can check for that.
>>
>> Your PV Details page for this PV could have information there that could help. You can get to this using something like so - http://localhost:17665/mgmt/bpl/getPVDetails?pv=Your_PV
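
The same page can be fetched from a script. This sketch assumes the management BPL base URL shown above and that the response is a JSON list of objects with "name" and "value" keys (an assumption about the response shape); it URL-encodes the PV name and keeps only the rows that mention the connection or the search request.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def pv_details_url(pv_name, base_url="http://localhost:17665/mgmt/bpl"):
    """Build the getPVDetails URL, URL-encoding the PV name."""
    return f"{base_url}/getPVDetails?{urlencode({'pv': pv_name})}"


def connection_rows(details, keywords=("connection", "search request")):
    """Keep rows whose name mentions one of the keywords.

    Assumes details is a list of {"name": ..., "value": ...} objects.
    """
    return {
        row["name"]: row["value"]
        for row in details
        if any(keyword in row["name"].lower() for keyword in keywords)
    }


def fetch_pv_details(pv_name, timeout=10.0):
    """Fetch and decode the PV details for one PV."""
    with urlopen(pv_details_url(pv_name), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `connection_rows(fetch_pv_details("Your_PV"))` would return only the connection-related entries, which is usually enough to see when the last search request was issued.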
>>
> I took a look, but I can't see anything strange.
>> Finally, would you be in a position to attempt a restart?
> I will try it, but next week.
>
> The strangest thing is that a lot of PVs work well, and a few have
> problems (they eventually disconnect).
>
> Thank you again
>
> Regards
>
>>
>> Regards,
>> Murali
>>
>>
>>
>>
>> ________________________________________
>> From: Gabriel de Souza Fedel <[email protected]>
>> Sent: Friday, September 1, 2017 10:52 AM
>> To: Shankar, Murali; [email protected]
>> Cc: ([email protected])
>> Subject: Re: Archiver: Problems with disconnected PVs
>>
>> On 01-09-2017 13:39, Shankar, Murali wrote:
>>>>> This seems like it might be a CA client configuration issue
>>>
>>> This is the most likely case. The engine prints out its CAJ
>>> configuration on startup and you should be able to see this in your logs.
>>>
>> I didn't find it. I found engine/logs/catalina.err; is there another
>> location?
>>
>>>
>>> You can try a couple of things. You can pause/resume the PVs in
>>> question and see if they reconnect. You can also look in the IOC's
>>> logs and see if there is anything interesting going on there.
>>>
>> I tried pause/resume. The IOC's log is on the IOC machine, right? Apparently there
>> are no errors (on the EPICS console).
>>
>>
>>>
>>> Regards,
>>
>> Regards and thank you for the answer
>>>
>>> Murali
>>>
>>>
>>
>> --
>> Gabriel Fedel
>> Software de Operação das Linhas de Luz
>> Laboratório Nacional de Luz Síncrotron – (LNLS)
>> Centro Nacional de Pesquisa em Energia e Materiais (CNPEM)
>> [email protected] | +55 (19) 3512 1226
>> www.lnls.cnpem.br
>>
>
>


