EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: EPICS status codes proposal
From: [email protected] (William Lupton)
To: [email protected]
Date: Tue, 24 Oct 95 17:47:10 HST

Dear all,

  I have a few comments (well, not so few...) on the status codes
discussion which has taken place since I posted my proposal last week.

  William


1. Broadening the scope
-----------------------

> From [email protected] Thu Oct 19 15:49:12 1995
> Subject: RE: EPICS status codes proposal

> We are like Keck in that we have a mix of EPICS and non-EPICS systems,
> both of which log error messages.  We want to get the messages from both
> systems to the same error logging service.

> From [email protected] Fri Oct 20 04:57:58 1995
> Subject: RE: EPICS status codes proposal

> If this is being revisited, then I think the scope should be bigger than
> just IOC code, and should include high level applications, cdev, and other
> non-EPICS systems.  An IOC based solution would be too limiting.  I hope
> that we can discuss this at the SOSH workshop Nov 4th, with a view towards
> portable applications.

My proposals were supposed to result in a portable status code and error
reporting system which could be used both within and without the EPICS
environment. Unfortunately I removed a remark to that effect before
sending off the proposals. These comments seem to add weight to the
importance of portability between environments.

> From [email protected] Fri Oct 20 04:57:59 1995
> Subject: RE: EPICS status codes proposal

> Has anyone looked at the MURMUR system from FNAL?
> Doc is available via their WWW. It is a rather extensive
> error reporting system. We may be getting into a big project
> so it is a good idea to see if there is a free system that already
> meets our needs.

All I know of it is that it will be used for the Sloan Digital Sky
Survey.  I will find out more. I agree that if it becomes a big project
then we should look wider.

There are other alternative systems. Another from within the astronomy
community is the DRAMA error reporting system from the Anglo-Australian
Observatory. This is described as "a portable (VMS compatible) message
code system". A brief excerpt from its documentation follows (I can
provide a URL for those interested in more details):

"The MESSGEN utility and the Mess routines provide a portable technique
 for generating unique error codes and then associating text with the
 error codes at run time.

 MESSGEN will generate, if required, include files defining constants for
 each error code, in the C, Fortran, (Vax) Pascal and TCL languages.

 The Mess routines provide the ability to fetch the text associated with
 each message at program run time.

 The grammar used to specify the message codes and associated text is a
 subset of that accepted by the VMS message utility and generates codes
 compatible with those generated by VMS message. This allows existing VMS
 code to be ported to other machines and allows existing user interfaces
 running on VMS machines to translate MESSGEN error codes into text."


2. Specific comments on my proposals
------------------------------------

> From [email protected] Thu Oct 19 15:49:12 1995
> Subject: RE: EPICS status codes proposal
 
> You mentioned that you need the ability to make a message log-only and
> not visible to the operator.  We have the same requirement.  We also have
> the requirement that some fatal messages generate a particular noise
> in the control room (like a bomb drop - I kid you not!).

I think this is where the logPrintf() call comes in. See later.

> On our non-EPICS (VMS) system, we use a different kind of severity
> (success, informational, warning, error, fatal).  I suppose we can
> do some sort of translation from the SEVR values if necessary.  The 
> non-EPICS system also uses "facility" which is like your group/subsys.

I proposed using the SEVR encoding because I thought that would go down
well in the EPICS community.

I proposed using the top bits for severity because

  (a) it allows existing codes to remain unchanged (but I wasn't aware
      that originally the lsb was an error bit), and

  (b) it allows error codes to be negative, which happens to be our
      convention for Keck I

...but this could easily be changed, e.g. to use VMS severities. Are
there other opinions here?
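
To make the top-bits idea concrete, here is a minimal sketch in C. The layout is an assumption for illustration only: a 32-bit status whose top 2 bits hold a severity mirroring the EPICS alarm severities and whose lower 30 bits hold the code. The names exStatus, exMakeStatus(), exGetSevr() and exGetCode() are hypothetical and do not come from the proposal.

```c
#include <stdint.h>

/* Hypothetical layout (an assumption, not taken from the proposal):
 * bits 31..30 hold the severity, bits 29..0 hold the status code. */
typedef int32_t exStatus;

/* Values chosen to mirror the EPICS alarm severities. */
enum exSevr { EX_NO_ALARM = 0, EX_MINOR = 1, EX_MAJOR = 2, EX_INVALID = 3 };

#define EX_SEVR_SHIFT 30
#define EX_CODE_MASK  0x3FFFFFFFu

/* Combine a severity and a code. With severity 0 the result equals the
 * bare code, so existing codes stay unchanged. The cast back to a signed
 * type assumes the usual two's-complement behaviour. */
static exStatus exMakeStatus(enum exSevr sevr, uint32_t code)
{
    uint32_t bits = ((uint32_t)sevr << EX_SEVR_SHIFT) | (code & EX_CODE_MASK);
    return (exStatus)bits;
}

/* Extract the severity from the top two bits. */
static enum exSevr exGetSevr(exStatus status)
{
    return (enum exSevr)(((uint32_t)status >> EX_SEVR_SHIFT) & 3u);
}

/* Extract the code from the lower 30 bits. */
static uint32_t exGetCode(exStatus status)
{
    return (uint32_t)status & EX_CODE_MASK;
}
```

Under this assumed layout the two higher severities set the sign bit, so such codes come out negative, while a code with no severity bits set is numerically identical to the old value.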

> From [email protected] Fri Oct 20 06:23:12 1995
> Subject: error handling

> COMMENT 1:
> Perhaps "errSubsysRegister()" should have two arguments:
> 1) a symbolic name for the subsystem (instead of a pointer 
> 	to the error code table) and
> 2) the sub system identification code (an unsigned integer)
> 
> We could use naming conventions and the sub system symbolic name
> to find the error text table in the symbol table...

fine

> COMMENT 2:
> Perhaps we should start using an architecture independent type
> to hold status instead of the C type "long"...

fine

> COMMENT 3:
> Is a group code appropriate? I am worried that there may be more than 64
> distinct sites and there may be more than 64 subsystems at a site. I am 
> less worried that there will be more than 4096 sub systems...

fine; let's forget group codes and assume some central authority [assuming
that we stick with this scheme at all]

> From [email protected] Mon Oct 23 10:54:12 1995
> Subject: Re: EPICS status codes proposal

> The following discusses the use and also the limitations of
> EPICS status codes. The following are the result of discussions between
> Jim Kowalkowski, John Winans, and myself that we had as a result of the
> proposal submitted by William Lupton.

> 		Discussion of error status
> 
> Now let us state some things for which error status should NOT be used.
> 
> 1)A replacement or addition to an alarm system...

agree 

> 2)A way to communicate error status codes across many layers of software...

I don't fully agree; I think that the example given is overly influenced
by the way that existing EPICS database processing works.

With the advent of the portable channel access server there are going to
be many different types of EPICS databases. One of the simplest (but
nonetheless useful) will simply provide synchronous reads and writes
to named variables. Why, in such a case, forbid the application code
which implements the reads and writes from returning one of its own
error codes (by all means associated with a text string)? It is quite
possible that the client will know what to do with this error code (at
the very least it will be able to make use of the severity information)
and it doesn't make sense to me that the layers in between should
discard this potentially useful information.

I think that similar remarks apply to some (albeit simple) applications
using the current database, where a single record processes as a result
of a direct CA put to one of its fields. Why not pass back the status
from driver -> device -> record -> scan-task -> client? I fully accept
that channel access failures and such should "win", in that a CA error
will always bump an application-specific status. Perhaps you only bump
status which is not of a greater severity, but that's another
discussion.
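
One way to read the "bump" rule is sketched below in C, assuming each layer hands its status to the next as a (severity, code) pair. The struct exResult and the function exBump() are hypothetical names, and the tie-breaking choice (the later status wins when severities are equal) is my assumption rather than anything agreed in this thread.

```c
/* Hypothetical status carried upward from driver to client. */
struct exResult {
    int  sevr;   /* larger value means more severe */
    long code;   /* 0 means success */
};

/* Decide which status survives when a higher layer (e.g. channel access)
 * reports 'incoming' on top of the 'current' status from a lower layer.
 * A success never displaces anything. An error displaces the current
 * status unless the current one is strictly more severe. */
static struct exResult exBump(struct exResult current, struct exResult incoming)
{
    if (incoming.code == 0)
        return current;
    if (current.code != 0 && current.sevr > incoming.sevr)
        return current;
    return incoming;
}
```

With this rule a major driver error would survive a minor CA error, whereas a CA error of equal or greater severity would replace it. The stricter "CA always wins" policy would simply drop the severity comparison.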

Also, why not support ERNO and ERST fields to hold current record status
values and error strings? This provides a consistent convention on how
clients can pick up information when things have gone wrong.

> We don't think error status values should have an embedded severity
> level. For EPICS S_ definitions the convention originally used was that
> even numbers meant informational messages and odd numbers errors.
> After several sets of status values were defined it was noticed that all
> values were odd. I think that the reason is as follows: When defining
> the S_ values the circumstances under which the value is used are not known...

I think that this is a good point. I certainly think that having one of
the severity levels be "no severity information" is important, because
generic code can then OR SEVR information from the current record into
the severity bits where there is no a priori severity information.

Sometimes though I think it is definitely useful to be able to associate
a static severity with a given code, perhaps mostly for fatal errors.

> What IS often desired is a verbosity level. vxWorks programmers often create
> a global variable such as xxxDEBUG, and then have statements such as
> 
>    if(xxxDEBUG >= 3) epicsPrintf(...
> 
> Where 3 would be one of the verbosity levels

I definitely agree with this and this is where I would add support for
the "log just to log file and not to user" sort of facility that I
mentioned in my proposal and which Stephanie mentioned as well. One could
also handle the "please provide verbal alarm" in a similar way.

This is obvious perhaps, but there are three verbosity levels involved
here:

1. the value in the call to logPrintf() or whatever, which is a fixed
   attribute of this particular message.

2. a global server-wide (or several for different bits of server code)
   value which controls what the server forwards to the client

3. a client-side value which controls what the client does with which
   messages
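
The interaction of the three levels can be sketched as a single routing decision. This is only an illustration in C: the function exRoute() and the enum exDisposition are hypothetical, and it assumes the same convention as the xxxDEBUG example, namely that a message passes a filter when the filter's level is at least the message's level.

```c
/* What finally happens to one message (hypothetical categories). */
enum exDisposition {
    EX_DROPPED,    /* the server does not forward it */
    EX_LOG_ONLY,   /* forwarded and logged, but not shown to the user */
    EX_SHOWN       /* forwarded, logged and shown to the user */
};

/* messageLevel: fixed attribute of the call that produced the message.
 * serverLevel:  server-wide setting controlling what is forwarded.
 * clientLevel:  client-side setting controlling what is displayed. */
static enum exDisposition exRoute(int messageLevel, int serverLevel,
                                  int clientLevel)
{
    if (messageLevel > serverLevel)
        return EX_DROPPED;
    if (messageLevel > clientLevel)
        return EX_LOG_ONLY;
    return EX_SHOWN;
}
```

In this reading, a message at level 3 with the server set to 3 and the client set to 1 would reach the log file but not the operator, which is one possible way to get the "log only" behaviour.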

> 		Possible new log system interface
> 
> Let libraries define their own status values as small integers.
> The problem is to be able to generate error messages
> that contain both a meaningful message and the context. In order to solve
> this problem any library could make a call to the following routine:
>
>  logPrintf(int verbosityLevel, char *system, char *subsystem,
>	     char *format,...)

I don't really understand this. Is this an additional facility to that
which has already been discussed (i.e. do we now have errPrintf() which
reports messages, translates status etc., _and_  logPrintf()), or is it
an alternative?

- If the former, then how can errPrintf() translate the small codes? It
  can't call xxxGetSystem() because it doesn't know "xxx" (and a static
  "current system" variable doesn't work either).

- If the latter, then I don't see how the status values are being used
  or how they or the severities can possibly be propagated.

I can see the advantages of small codes from the point of view of
simplicity but I rather hoped that this one had been laid to rest in
favor of unique codes including facility numbers some time last week.

I can also see the advantage of providing a generic logging layer which
sits below the errPrintf() layer and which knows nothing about how to
translate error codes and everything about how to communicate logging
information to a logging server.
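
The layering I have in mind might look roughly like the following C sketch. Everything here is hypothetical: exLogPrintf() stands in for the generic lower layer (here it only formats a line into a buffer and writes it to stderr, where a real one would talk to a logging server), exErrPrintf() stands in for the upper layer that knows how to translate status codes, and the table contents are invented for the example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Last line handed to the "transport"; kept so callers can inspect it. */
static char exLastLine[256];

/* Lower layer: knows nothing about status codes, only about formatting
 * a message with its context and passing it on. */
static void exLogPrintf(int verbosityLevel, const char *system,
                        const char *subsystem, const char *format, ...)
{
    char body[192];
    va_list args;

    va_start(args, format);
    vsnprintf(body, sizeof body, format, args);
    va_end(args);

    snprintf(exLastLine, sizeof exLastLine, "%d %s/%s: %s",
             verbosityLevel, system, subsystem, body);
    fprintf(stderr, "%s\n", exLastLine);
}

/* Upper layer: owns the (invented) status-to-text table. */
struct exErrText { long status; const char *text; };

static const struct exErrText exTable[] = {
    { 1001, "device not responding" },
    { 1002, "value out of range" },
};

/* Translate a status code into text, with a fallback for unknown codes. */
static const char *exStatusText(long status)
{
    size_t i;
    for (i = 0; i < sizeof exTable / sizeof exTable[0]; i++)
        if (exTable[i].status == status)
            return exTable[i].text;
    return "unknown status";
}

/* Translate the status, then delegate delivery to the lower layer. */
static void exErrPrintf(long status, const char *context)
{
    exLogPrintf(1, "example", "driver", "%s: %s (status %ld)",
                context, exStatusText(status), status);
}
```

The point of the split is that only the upper layer needs to change if the status code scheme changes, and only the lower layer needs to change if the logging transport changes.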

> It is assumed that logPrintf would add taskid and hostid to messages
> sent to a system wide error handler.
> The above scheme would allow a system wide error handler that lets a
> user select messages from particular hosts, particular subsystems,
> and particular verbosity levels.

errPrintf() could do the same at present if it wanted to, couldn't it?


3. Concluding remarks
---------------------

If possible I think that we should agree on a set of changes which have
minimal impact on existing applications but achieve the major aim, which
is to come up with an alternative to errMdef.h. I think that the search
for a better final solution should be regarded as a separate project
since such a solution will almost certainly impact existing coding
techniques and will therefore take longer to be adopted.

It now seems that we first have to resolve the issue of whether status
codes are to continue to be globally unique (within a given processor),
along with the related severity issue, before we can move on.

William Lupton


Replies:
Re: EPICS status codes proposal watson
