In a send-only-on-change subscription update system one always needs some
per-subscription state that remembers that there was no queue space and that
we were forced to discard a particular update. Later, when queue space becomes
available, we need to send an update for every subscription that is not
current. In the CA server's event queue I handle this by replacing the last
update on the queue for a particular subscription when the queue saturates.
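The bookkeeping described above can be sketched roughly as follows. This is an illustrative toy, not the CA server's actual code: the names (event_queue, sub_state, post_update, pop_update) and the tiny fixed capacity are assumptions made for the example, and it is single threaded with no locking. It shows a saturated queue overwriting a subscription's newest queued update, and a per-subscription "stale" flag for the case where the subscription has nothing queued to overwrite.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 4      /* deliberately tiny so saturation is easy to provoke */
#define NO_ENTRY  (-1)

struct update { int sub_id; int value; };

struct sub_state {       /* hypothetical per-subscription state */
    int last_slot;       /* ring index of newest queued update, or NO_ENTRY */
    int stale;           /* 1 if an update was withheld because the queue was full */
    int latest;          /* most recent value, used to catch up later */
};

struct event_queue {
    struct update ring[QUEUE_CAP];
    size_t head;         /* index of the oldest entry */
    size_t count;        /* number of entries in use */
};

void queue_init(struct event_queue *q, struct sub_state *subs, int nsubs)
{
    int i;
    q->head = 0;
    q->count = 0;
    for (i = 0; i < nsubs; i++) {
        subs[i].last_slot = NO_ENTRY;
        subs[i].stale = 0;
        subs[i].latest = 0;
    }
}

/* Append the subscription's latest value at the tail; caller ensures space. */
static void append(struct event_queue *q, struct sub_state *s, int sub_id)
{
    size_t tail = (q->head + q->count) % QUEUE_CAP;
    q->ring[tail].sub_id = sub_id;
    q->ring[tail].value = s->latest;
    s->last_slot = (int)tail;
    s->stale = 0;
    q->count++;
}

void post_update(struct event_queue *q, struct sub_state *subs,
                 int sub_id, int value)
{
    struct sub_state *s;
    assert(sub_id >= 0);
    s = &subs[sub_id];
    s->latest = value;
    if (q->count < QUEUE_CAP)
        append(q, s, sub_id);                 /* normal case: space available */
    else if (s->last_slot != NO_ENTRY)
        q->ring[s->last_slot].value = value;  /* saturated: replace last update */
    else
        s->stale = 1;                         /* nothing to replace: remember it */
}

/* Returns 1 and fills *out if an update was available, 0 if the queue is empty. */
int pop_update(struct event_queue *q, struct sub_state *subs, int nsubs,
               struct update *out)
{
    int i;
    if (q->count == 0)
        return 0;
    *out = q->ring[q->head];
    if (subs[out->sub_id].last_slot == (int)q->head)
        subs[out->sub_id].last_slot = NO_ENTRY;
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    /* One slot was just freed: bring one stale subscription up to date. */
    for (i = 0; i < nsubs; i++) {
        if (subs[i].stale) {
            append(q, &subs[i], i);
            break;
        }
    }
    return 1;
}
```

Intermediate values may be skipped under load, but each subscription still ends up receiving its most recent state, which is the property at stake in this thread.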
Based on private discussions with Geoff and Andrei I fear that they are
currently implementing an unreliable send-only-on-change subscription update
event stream where alarm state change edges can be lost forever (if the
alarm queue saturates during an alarm burst).
The original Fermilab alarm system is, I suspect, not a send-only-on-change
system but a send-periodically system, and therefore might be more likely
to keep clients in sync.
Anyway, that is my perception of recent alarm hook developments. I
am fearful that the EPICS community could be straying away from reliable
event transport, but feel free to correct my reading of the situation if
I am uninformed.
Jeff
> -----Original Message-----
> From: Dalesio, Leo `Bob` [mailto:[email protected]]
> Sent: Thursday, June 22, 2006 5:47 AM
> To: Ralph Lange
> Cc: Geoff Savage; Matthias Clausen; Andrew Johnson; EPICS core-talk; Fritz
> Bartlett; Liyu, Andrei
> Subject: RE: alarm hook
>
> I'm an optimist about the general reuse of code. It's a habit. I assume
> that we could make a minimum set of alarm consumers that would be useful
> for all of us.
>
> I would also like the dbPostEvent hook for archiving. I think that a plant
> archiver that only wants data on occasion would prefer to have it pushed
> out - rather than connect to 60000 channels. But there needs to be a
> discussion on that.
>
>
> -----Original Message-----
> From: Ralph Lange [mailto:[email protected]]
> Sent: Thursday, June 22, 2006 4:44 AM
> To: Dalesio, Leo `Bob`
> Cc: Geoff Savage; Matthias Clausen; Andrew Johnson; EPICS core-talk; Fritz
> Bartlett; Liyu, Andrei
> Subject: Re: alarm hook
>
> I think the answers to all your questions highly depend on what A and B
> are, which is the part not covered by the recent discussions.
> Depending on the implementation of createMessage() A and B might be Oracle
> servers, parts of the D0 event system, CMLOG servers .... with all these
> different systems, I doubt there even will be a default implementation for
> createMessage().
> The alarm viewers are clients to these A and B servers, so even the specs
> might differ depending on the system or installation.
>
> Ralph
>
>
> Dalesio, Leo `Bob` wrote:
> > Will A&B have a way to synchronize?
> > Are there thoughts of being able to serve the alarm information from
> multiple sources to many clients?
> > Any specs on the alarm viewers?
> > Thanks,
> > Bob
> >
> > -----Original Message-----
> > From: Geoff Savage [mailto:[email protected]]
> > Sent: Wednesday, June 21, 2006 10:36 PM
> > To: Matthias Clausen
> > Cc: Geoff Savage; Andrew Johnson; EPICS core-talk; Fritz Bartlett;
> > Liyu, Andrei
> > Subject: Re: alarm hook
> >
> > Hi,
> >
> > I am willing to accept the hook as proposed by Andrew. This is not the
> exact interface that I am currently using but it does provide the
> information that I require. I will modify the FNAL interface to match the
> proposed interface once the hook is in base.
> >
> > It is not clear to me how an alarm hook function is registered in a
> startup script. Can someone please provide an example that is operating
> system independent?
> >
> > From our (Matthias Clausen, Fritz Bartlett, Vladimir Sirotenko, Geoff
> Savage) discussions on Tuesday we developed a "generic" (but not matching
> Andrei's interface) interface. I'll address Andrei's email next. We
> propose to include the following "tools" for pushing alarms from the
> server in a package separate from base. Here is a simple chain showing
> all the pieces.
> >
> > hook -> logAlarm -> queue -> sendToNetwork -> createMessage -> send to
> > A -> send to B (if send to A fails)
> >
> > Here is a summary of our discussions on each piece with some of my
> experiences included (and indicated).
> >
> > 1. We start with the generic hook as proposed by Andrew.
> >
> > 2. The hook function which a user will register is logAlarm(struct
> dbCommon *prec, unsigned short sevr, unsigned short stat). Once this
> function has determined that an alarm needs to be sent it gathers the
> volatile data and inserts it into the alarm data queue. It should also do
> something reasonable if the data can't be inserted in the queue. It might
> simply keep track of the number of insertion failures and report this
> number in a message once the queue has space.
> > Geoff - It should send messages on bad and good transitions. Care
> should be taken when a record is successfully processed for the first time
> as it transitions from an undefined (bad) state to a good state.
> >
> > 3. Users should be able to adjust the size of this queue to accommodate
> different numbers of records in the IOCs.
> > Geoff - There are more alarms during maintenance periods than during run
> times.
> >
> > 4. The sendToNetwork function runs as a separate thread (VxWorks
> task) at a priority lower than the scan tasks but higher than channel
> access. It waits for data to arrive in the alarm data queue. When data
> arrives it passes the data to the createMessage routine.
> >
> > 5. The createMessage routine constructs a string message to be sent
> across the network. To be more generic users should be able to define
> their own createMessage routine. This allows users to use a different
> message protocol within the push out framework.
> > Geoff - Using a string for messages removes byte ordering issues.
> >
> > 6. Send the message to server A. If sending the message to server A
> fails then send the message to server B.
> >
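The chain in items 2 to 6 above can be sketched as follows. This is a minimal, single-threaded illustration under stated assumptions, not the FNAL or DESY implementation: all names (alarm_event, alarm_queue, alarm_queue_put, send_pending, send_fail, send_record) and the message format are hypothetical, the network transports are replaced by stand-in functions, and a real version would need locking between the hook and the sendToNetwork task.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ALARM_QUEUE_CAP 8
#define NAME_LEN 32
#define MSG_LEN 128

/* Snapshot of the volatile data, taken inside the hook (item 2). */
struct alarm_event {
    char name[NAME_LEN];
    unsigned short old_sevr, old_stat, new_sevr, new_stat;
};

struct alarm_queue {
    struct alarm_event ring[ALARM_QUEUE_CAP];
    unsigned head, count;
    unsigned long dropped;   /* insertion failures not yet reported */
};

/* Returns 1 if queued; 0 if the queue was full and the event was counted. */
int alarm_queue_put(struct alarm_queue *q, const struct alarm_event *ev)
{
    assert(q && ev);
    if (q->count == ALARM_QUEUE_CAP) {
        q->dropped++;
        return 0;
    }
    q->ring[(q->head + q->count) % ALARM_QUEUE_CAP] = *ev;
    q->count++;
    return 1;
}

int alarm_queue_get(struct alarm_queue *q, struct alarm_event *ev)
{
    if (q->count == 0)
        return 0;
    *ev = q->ring[q->head];
    q->head = (q->head + 1) % ALARM_QUEUE_CAP;
    q->count--;
    return 1;
}

/* User-replaceable message builder (item 5) and transport (item 6). */
typedef int (*create_message_fn)(const struct alarm_event *ev,
                                 char *buf, size_t len);
typedef int (*send_fn)(const char *msg);   /* returns 0 on success */

int default_create_message(const struct alarm_event *ev, char *buf, size_t len)
{
    return snprintf(buf, len, "%s sevr %hu->%hu stat %hu->%hu", ev->name,
                    ev->old_sevr, ev->new_sevr, ev->old_stat, ev->new_stat);
}

/* One pass of a sendToNetwork loop: report losses, drain the queue,
 * try server A first and fall back to server B. Returns messages delivered. */
int send_pending(struct alarm_queue *q, create_message_fn create,
                 send_fn send_a, send_fn send_b)
{
    struct alarm_event ev;
    char msg[MSG_LEN];
    int delivered = 0;

    if (q->dropped) {
        snprintf(msg, sizeof msg, "alarm queue overflow: %lu alarms lost",
                 q->dropped);
        if (send_a(msg) == 0 || send_b(msg) == 0)
            q->dropped = 0;
    }
    while (alarm_queue_get(q, &ev)) {
        create(&ev, msg, sizeof msg);
        if (send_a(msg) == 0 || send_b(msg) == 0)
            delivered++;
    }
    return delivered;
}

/* Stand-in transports for demonstration only. */
char last_sent[MSG_LEN];
int send_fail(const char *msg) { (void)msg; return -1; }
int send_record(const char *msg)
{
    strncpy(last_sent, msg, MSG_LEN - 1);
    last_sent[MSG_LEN - 1] = '\0';
    return 0;
}
```

Note that a drop counter only tells the receiver that something was lost, not which records are out of date. That is the gap raised in issue (a) below.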
> > Not all the requirements are included here. Hopefully DESY will provide
> more details as the project progresses.
> >
> > Some other issues to consider -
> > a. From Jeff Hill - can we detect on a record-by-record basis when an
> alarm is not pushed out? I think this requires some study of the common
> fields available to all records. This is on my to do list.
> > b. What will be in the data inserted into the alarm data queue? This
> should be all the volatile data needed in the network message to decrease
> the time spent collecting the data.
> > c. What are the contents of the network message?
> > d. A generic server also needs to be provided.
> >
> > Geoff
> > P.S. I need to sleep and will reply to Andrei's email in the morning.
> >
> >
> >
> > On Jun 19, 2006, at 10:02 PM, Matthias Clausen wrote:
> >
> >
> >> Hi Andrew,
> >>
> >> I had a meeting today with Geoff, Fritz and Vladimir at Fermilab.
> >> We discussed the implementation based on the proposed function call
> >> and agreed on an implementation which should be as generic as
> >> possible.
> >> After Geoff's and Bernd Schoeneburg's vacation (next two weeks) we
> >> will work on an implementation.
> >> If Geoff does not see any unforeseen problems I would like to give
> >> you a 'go' for the change in base.
> >>
> >> Thanks for your help!
> >> And - by the way - thanks for your clarification regarding Andrei's
> >> mail.
> >>
> >> Matthias
> >>
> >>
> >> Andrew Johnson wrote:
> >>
> >>> Hi Matthias,
> >>>
> >>> Matthias Clausen wrote:
> >>>
> >>>> in preparation for my meeting with Geoff at Fermilab I want to send
> >>>> you the proposed hook into base which Bernd Schoeneburg and Bob
> >>>> already 'somehow' agreed on:
> >>>>
> >>>> Here's Bernd's mail to Bob:
> >>>>
> >>>>
> >>>>> recGblResetAlarms is called in monitor() which is called in the
> >>>>> end of record processing just before recGblFwdLink and after
> >>>>> recGblGetTimeStamp. After calling recGblResetAlarms in monitor()
> >>>>> the value changes are checked (not interesting for us).
> >>>>> recGblResetAlarms checks for alarm changes and returns the first
> >>>>> approach of the monitor mask, which is later used for postEvents.
> >>>>> postEvents can be called from anywhere like device support, snl,
> >>>>> subroutines, 'homebrew' records etc. I think recGblResetAlarms is
> >>>>> called in the monitor function of records only. So I think it is
> >>>>> the perfect place. Please check it and correct me if I am wrong.
> >>>>>
> >>>>> The code could look like this (the end of recGblResetAlarms):
> >>>>>
> >>>>>     if(sevr!=nsev || stat!=nsta) {
> >>>>>
> >>>>> ++:     logAlarm (pdbc, sevr, stat);
> >>>>> ++:     /* nsev and nsta are in pdbc->sevr and pdbc->stat */
> >>>>>
> >>>>>         ackt = pdbc->ackt; acks = pdbc->acks;
> >>>>>         if(!ackt || nsev>=acks){
> >>>>>             pdbc->acks=nsev;
> >>>>>             db_post_events(pdbc,&pdbc->acks,DBE_VALUE);
> >>>>>         }
> >>>>>     }
> >>>>>     return(mask);
> >>>>> }
> >>>>>
> >>>> My question:
> >>>> Do you also agree with this approach?
> >>>>
> >>> I agree with the location and arguments of the call, which I believe
> >>> are the same as Fermilab have been using.
> >>>
> >>>
> >>>> And - what would be implemented in base for logAlarm ()?
> >>>> This could be an empty function which just returns - or it could be
> >>>> the 'real' thing where you'll have to check whether alarm logging
> >>>> should be used at all.
> >>>> The empty function could be replaced/ overloaded by the 'real'
> >>>> function if you want to use putAlarm.
> >>>>
> >>> I think we just need a global pointer for the routine which will
> >>> be called if it's not NULL, so your code just sets it to hook in.
> >>> Here's my proposed patch:
> >>>
> >>>
> >>> Index: recGbl.h
> >>> ===================================================================
> >>> RCS file: /net/phoebus/epicsmgr/cvsroot/epics/base/src/db/recGbl.h,v
> >>> retrieving revision 1.9
> >>> diff -u -b -r1.9 recGbl.h
> >>> --- recGbl.h 12 Feb 2003 21:22:23 -0000 1.9
> >>> +++ recGbl.h 19 Jun 2006 15:25:16 -0000
> >>> @@ -30,13 +30,23 @@
> >>> : FALSE\
> >>> )
> >>>
> >>> +/* Structures needed for args */
> >>>
> >>> -/* Global Record Support Routines*/
> >>>  struct link;
> >>>  struct dbAddr;
> >>>  struct dbr_alDouble;
> >>>  struct dbr_ctrlDouble;
> >>>  struct dbr_grDouble;
> >>> +struct dbCommon;
> >>> +
> >>> +/* Hook Routine */
> >>> +
> >>> +typedef void (*RECGBL_ALARM_HOOK_ROUTINE)(struct dbCommon *prec,
> >>> +    unsigned short sevr, unsigned short stat);
> >>> +extern RECGBL_ALARM_HOOK_ROUTINE recGblAlarmHook;
> >>> +
> >>> +/* Global Record Support Routines */
> >>> +
> >>> epicsShareFunc void epicsShareAPI recGblDbaddrError(
> >>> long status, struct dbAddr *paddr, char *pcaller_name);
> >>> epicsShareFunc void epicsShareAPI recGblRecordError(
> >>> Index: recGbl.c
> >>> ===================================================================
> >>> RCS file: /net/phoebus/epicsmgr/cvsroot/epics/base/src/db/recGbl.c,v
> >>> retrieving revision 1.60.2.2
> >>> diff -u -b -r1.60.2.2 recGbl.c
> >>> --- recGbl.c 4 Nov 2004 19:21:08 -0000 1.60.2.2
> >>> +++ recGbl.c 19 Jun 2006 15:25:16 -0000
> >>> @@ -42,6 +42,10 @@
> >>> #include "recGbl.h"
> >>>
> >>>
> >>> +/* Hook Routines */
> >>> +
> >>> +RECGBL_ALARM_HOOK_ROUTINE recGblAlarmHook = NULL;
> >>> +
> >>> /* local routines */
> >>> static void getMaxRangeValues();
> >>>
> >>> @@ -239,6 +243,7 @@
> >>> if(stat_mask)
> >>> db_post_events(pdbc,&pdbc->stat,stat_mask);
> >>> if(sevr!=nsev || stat!=nsta) {
> >>> + if (recGblAlarmHook) (*recGblAlarmHook)(pdbc, sevr, stat);
> >>> ackt = pdbc->ackt; acks = pdbc->acks;
> >>> if(!ackt || nsev>=acks){
> >>> pdbc->acks=nsev;
> >>>
> >>>
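To show how the global-pointer hook in the patch above behaves, here is a self-contained sketch that builds without EPICS headers. The dbCommon struct is a cut-down stand-in with only the fields the example touches, resetAlarms is a simplified stand-in for recGblResetAlarms (the monitor mask and acknowledge handling are omitted), and countingHook is a hypothetical consumer. Only the typedef, the recGblAlarmHook pointer, and the guarded call are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real dbCommon: only the fields this sketch needs. */
struct dbCommon {
    const char *name;
    unsigned short sevr, stat;   /* current alarm severity and status */
    unsigned short nsev, nsta;   /* new values computed during processing */
};

/* From the proposed patch. */
typedef void (*RECGBL_ALARM_HOOK_ROUTINE)(struct dbCommon *prec,
    unsigned short sevr, unsigned short stat);

RECGBL_ALARM_HOOK_ROUTINE recGblAlarmHook = NULL;

/* Simplified stand-in for recGblResetAlarms. */
void resetAlarms(struct dbCommon *pdbc)
{
    unsigned short sevr, stat, nsev, nsta;

    assert(pdbc);
    sevr = pdbc->sevr;           /* previous severity */
    stat = pdbc->stat;           /* previous status */
    nsev = pdbc->nsev;
    nsta = pdbc->nsta;

    pdbc->sevr = nsev;           /* new values become current */
    pdbc->stat = nsta;
    pdbc->nsev = 0;
    pdbc->nsta = 0;

    if (sevr != nsev || stat != nsta) {
        /* The hook receives the previous values; the new ones are
         * already in pdbc->sevr and pdbc->stat. */
        if (recGblAlarmHook) (*recGblAlarmHook)(pdbc, sevr, stat);
    }
}

/* Hypothetical consumer: counts changes and remembers the previous severity. */
int alarmChanges = 0;
unsigned short lastPrevSevr = 0;

void countingHook(struct dbCommon *prec, unsigned short sevr,
                  unsigned short stat)
{
    (void)prec;
    (void)stat;
    alarmChanges++;
    lastPrevSevr = sevr;
}
```

With this design, installing a consumer is a plain assignment (recGblAlarmHook = countingHook;), and a NULL pointer costs a single test per alarm change. How that assignment is triggered from a startup script is left open here, as it is in the thread.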
> >>> If there is general agreement between DESY and FNAL about this, I'll
> >>> commit the change which will then appear in R3-14-9.
> >>>
> >>> - Andrew
> >>>
> >> --
> >> ------------------------------------------------------------------------
> >> Matthias Clausen                 Cryogenic Controls Group (MKS-2)
> >> phone:  +49-40-8998-3256         Deutsches Elektronen Synchrotron
> >> fax:    +49-40-8994-3256         Notkestr. 85
> >> e-mail: [email protected]       22607 Hamburg
> >> WWW-MKS2.desy.de                 Germany
> >> ------------------------------------------------------------------------
> >>
> >>
> >
> >