Subject: Re: dataAccess V4 primitive types
From: Benjamin Franksen <[email protected]>
To: [email protected]
Date: Fri, 1 Jul 2005 02:08:54 +0200
On Thursday 30 June 2005 23:05, Andrew Johnson wrote:
> Ralph Lange wrote:
> > Our idea is to /not/ make that type info a simple fixed enumeration
> > of fixed types, but use a dynamic structure to describe a type.
>
> Interesting idea, although I worry slightly about what it will do to
> performance.  See below though...
>
> > We came up with the following:
> >
> > type         =   string                      1)
> >
> >               | timestamp                   2)
> >               | signed (bitwidth)           3)
> >               | unsigned (bitwidth)         3)
> >               | float (float_format)
> >               | struct                      4)
> >               | array (base_type, size)     5)
>
> Can we add a generic 8-bit byte type, which allows specialized data
> types to be stored in an array of bytes.  In RDB terms this gives us
> BLOBs.
>
> > float_format = ieee32 | ieee64 | ieee80
>
> I don't particularly like the "ieee32" and "ieee64" encoding, why not
> just use a bitwidth?

Well, the idea was that a bitwidth does not completely identify the 
format. With integers, bitwidth and signedness take care of all the 
differences between numeric types; with floating point numbers you have 
to specify the width of the mantissa and exponent, the zero 
representations, NaN, +/-Inf, etc. -- see IEEE 754 for details.
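To make that concrete (just a sketch for this mail, the struct and its 
names are made up): the two standard formats fix the exponent and 
mantissa widths, which a bare bitwidth would leave open.

    /* Field widths of the two standard IEEE 754 binary formats:
     *   single: 1 sign bit +  8 exponent bits + 23 mantissa bits = 32
     *   double: 1 sign bit + 11 exponent bits + 52 mantissa bits = 64
     * A bare "bitwidth = 64" would tell you none of this.            */
    struct FloatFieldWidths {
        unsigned totalBits;
        unsigned exponentBits;
        unsigned mantissaBits;   /* not counting the implicit leading 1 */
    };
    static const FloatFieldWidths ieee754_single_widths = { 32,  8, 23 };
    static const FloatFieldWidths ieee754_double_widths = { 64, 11, 52 };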

> IEEE 754 defines single precision (32-bit) and double precision
> (64-bit) formats explicitly, but beyond that it defines a *class* of
> double extended precision formats which are not portable between CPU
> families. SPARC uses a quadruple precision format which is 128 bits
> wide, while 68k and x86 have 80-bit formats which actually take 96
> bits to store (16 of these are unused, and there are also endian
> differences which affect the layout).  I don't think we should be
> adding long double (ieee80) because of this portability issue, and
> also because I've not heard anyone suggesting they might need more
> precision.

Those are many good reasons not to include ieee80, but the idea was to 
make it easy to incorporate additional floating point types as 
standards evolve. So what about:

float_format = ieee754_single | ieee754_double

with the option to add more formats as need arises?
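In code that would be nothing more than (sketch, the enum name is made 
up):

    /* Closed enumeration of floating point formats; new enumerators are
     * simply appended as standards evolve, existing code keeps working. */
    enum FloatFormat {
        ieee754_single,   /* the 32-bit binary format of IEEE 754 */
        ieee754_double    /* the 64-bit binary format of IEEE 754 */
        /* ieee754_quad, ... can be added later if anyone needs it */
    };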

> > Annotations:
> >
> > 1) String should be a basic atomic type - regardless of its
> > implementation.
>
> <pedantic>
> I think you mean "fundamental" rather than "atomic", to which I
> agree. To me the term "atomic type" means one which can be
> read/written in a single CPU read/write cycle so that it will always
> self-consistent even in the presence of interrupts, DMA or multiple
> threads or processors that could cause the value to be changed while
> you're accessing it. </pedantic>

Different perspective. We meant atomic with respect to the type notation 
language given above, not with respect to machine instructions.

BTW, your understanding of what is 'atomic' is quite architecture 
dependent: a 64-bit integer might be atomic on one machine, but 
non-atomic on another.

However, I'm fine using the term 'fundamental' if that sits better with 
you machine-level thinkers... (no offense meant).

> > 3) A server or client app can choose any locally available data
> > type that is appropriate and wide enough to store the data. Future
> > wider integers are supported. As for signedness, alternatively
> > there could be
> >
> > one integer type with another parameter giving the integer format:
> >               | integer (int_format, bitwidth)
> >
> > with
> > int_format   =   unsigned | signed
>
> I still question the need to support unsigned integer types at all.
> If we want to transfer bit-array data (the most common use for
> unsigned types if you exclude Jeff's code), they should be sent as an
> array of bytes.

Forget sending and receiving for a moment. CA can send an array of 
bytes if it likes, or use unsigneds, or whatnot.

The question is: do you really want to forbid unsigned integral types 
for record fields? I doubt this is a good idea. I agree that they are 
rarely needed, but when they are needed it is a pain to have to use an 
array of bytes or the next larger signed integer type. I am talking 
about representing raw 'hardware' values in record fields. By 
'hardware' I do not mean only VME registers, but also values that 
arrive via CAN bus or other field buses.

To give a concrete example: CANopen specifies that "objects are 
accessible by a 16-bit index and in the case of arrays and records 
there is an additionally 8-bit sub-index" (Quoted from 
http://www.can-cia.org/canopen/protocol/index.html). How am I supposed 
to represent these indices and sub-indices if I want to create a 
generic CANopen record?
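
Just to sketch what I mean (the struct and its field names are 
invented; epicsUInt8/epicsUInt16 are the types from epicsTypes.h):

    #include <epicsTypes.h>

    /* Raw address of a CANopen object dictionary entry, exactly as it
     * appears on the wire: a 16-bit index plus an 8-bit sub-index.
     * With unsigned integral types in the type system this is direct;
     * without them each field has to be widened to the next larger
     * signed type or packed into a byte array by hand.               */
    struct CanOpenObjectAddress {
        epicsUInt16 index;      /* 0x0000 .. 0xFFFF */
        epicsUInt8  subIndex;   /* 0x00 .. 0xFF     */
    };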

Side note:

When I discussed this with Ralph, I had the idea of using a static 
range (low, high) instead of bit-width and signed/unsigned. This would 
have nicely avoided any reference to signedness, while still allowing 
representation in a native unsigned int format. However, as Ralph 
pointed out, we would need some integral type big enough to represent 
the bounds, which brings us all the way around to the start again... :(
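
To spell the problem out (sketch only, the struct is invented):

    /* A range-based integer descriptor avoids any mention of signedness,
     * but the bounds themselves have to live in *some* integral type --
     * and that type's own range then limits what can be described.     */
    struct IntegerRangeDescriptor {
        long long low;    /* cannot hold a bound above LLONG_MAX, e.g.  */
        long long high;   /* the upper bound of a 64-bit unsigned type  */
    };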

The only way out would be to have unbounded integers as a fundamental 
type. Ralph didn't mention it because we agreed that there are probably 
too few applications for them in control systems, so including them 
solely to represent type information for bounded integer types probably 
isn't worth it. I'll send a question to techtalk to find out if there 
are any people who would dearly like to have them but just didn't dare 
to ask...;)

> > 4) This basic type is needed for properties that actually are a
> > contained list of other properties.
> >
> > 5) base_type is another type_info. If we need a type where arrays
> > do not start with element zero, size might be replaced by a (min,
> > max) pair. Multi-dimensional arrays are represented as arrays of
> > arrays.
>
> Our basic array types should only support a zero-based index;
> anything else can easily be derived.

Yes.

> > This type_info has a dynamic size (it's a union - which really
> > seems to be a natural way to describe lots of things...). Making
> > itself a property catalog seems a slim and convenient way to
> > interface it.
>
> It's not a union (which is not type-safe); it's a pointer to an
> abstract base class, and a set of derived classes with a single
> instance of each type.

Since we are being pedantic: it is an algebraic data type 
(http://en.wikipedia.org/wiki/Algebraic_data_type). In an OO language 
you can simulate one via a base class and sub-classes. In C you can 
simulate one with a struct containing an enum as a discriminator tag 
and a C union.
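
For instance, the C-style simulation could look like this (sketch only, 
all names invented):

    /* Algebraic data type simulated in C: an enum as the discriminator
     * tag plus a union holding the data for each case.  The tag says
     * which union member is currently the valid one.                  */
    typedef enum { TYPE_STRING, TYPE_TIMESTAMP, TYPE_INTEGER,
                   TYPE_FLOAT, TYPE_STRUCT, TYPE_ARRAY } TypeTag;

    typedef struct TypeInfo TypeInfo;
    struct TypeInfo {
        TypeTag tag;
        union {
            struct { int isSigned; unsigned bitWidth; }  integerType;
            struct { int floatFormat; }                  floatType;
            struct { const TypeInfo *baseType;
                     unsigned size; }                    arrayType;
            /* string, timestamp and struct carry no extra data here */
        } u;
    };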

The term 'union' has been gravely misused by the C language. 'Union' 
originally referred to the type-theoretic concept of a sum type, i.e. 
given types X and Y, each viewed as sets of values, union(X,Y) is the 
**disjoint** union of these sets. This is completely type safe!

In the past I used the term 'tagged union' whenever I meant 'algebraic 
data type', in an attempt to capitalize on familiarity with the concept 
of a union in C. This was apparently misguided, because it associates 
the concept with an unsafe C feature that is more similar to a type 
cast than to algebraic data types.

> Inside the IOC we'd be passing around and comparing these pointers.

Yes, a possible C++ implementation could use a base class 
TypeDescriptor and derived classes for the concrete cases 
(IntegerDescriptor, ArrayDescriptor, ...), using subtyping on pointers 
for up-casts plus dynamic_cast for down-casts. <sarcasm>When (if) we 
have completed the general framework (including recursive types), we 
should probably try and make this an entry into C++ Boost 
(http://www.boost.org/).</sarcasm>
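
Roughly like this (a sketch using the class names from the previous 
paragraph; none of this exists anywhere yet):

    /* Abstract base class; one shared instance per concrete type, so that
     * inside the IOC a type is identified by comparing pointers.        */
    class TypeDescriptor {
    public:
        virtual ~TypeDescriptor() {}
    };

    class IntegerDescriptor : public TypeDescriptor {
    public:
        IntegerDescriptor(bool isSigned, unsigned bitWidth)
            : isSigned(isSigned), bitWidth(bitWidth) {}
        const bool isSigned;
        const unsigned bitWidth;
    };

    class ArrayDescriptor : public TypeDescriptor {
    public:
        ArrayDescriptor(const TypeDescriptor *base, unsigned size)
            : baseType(base), size(size) {}
        const TypeDescriptor *const baseType;   /* recursion: element type */
        const unsigned size;
    };

    /* Up-casts are implicit; down-casts go through dynamic_cast. */
    void describe(const TypeDescriptor *t)
    {
        if (const IntegerDescriptor *i =
                dynamic_cast<const IntegerDescriptor *>(t)) {
            /* use i->bitWidth, i->isSigned, ... */
            (void)i;
        }
    }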

> Hmm, those objects could provide methods to allow you to manipulate
> and string-convert values.  We could even make use of the C++
> std::type_info class and the std::typeid() routine somehow...

Hmmm. Do you mean that a type-describing object contains appropriate 
converter functions as members? But wouldn't these rather be separate 
functions that dispatch on /both/ target and source types? And isn't 
this somewhat similar to, if not identical with, what the DA support 
libraries are about?
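
To make that alternative concrete (sketch only, invented names; it 
reuses the TypeDescriptor pointers from the sketch above):

    #include <map>
    #include <utility>

    class TypeDescriptor;   /* the descriptor base class sketched above */

    /* Conversion depends on *two* types, so it does not sit naturally as
     * a member of either descriptor.  One option is a registry keyed by
     * the (source, target) descriptor pointer pair.                     */
    typedef void (*ConvertFunc)(const void *src, void *dst);
    typedef std::pair<const TypeDescriptor *,
                      const TypeDescriptor *> TypePair;

    class ConversionRegistry {
    public:
        void add(const TypeDescriptor *from, const TypeDescriptor *to,
                 ConvertFunc f) { table[TypePair(from, to)] = f; }
        ConvertFunc find(const TypeDescriptor *from,
                         const TypeDescriptor *to) const {
            std::map<TypePair, ConvertFunc>::const_iterator it =
                table.find(TypePair(from, to));
            return it == table.end() ? 0 : it->second;
        }
    private:
        std::map<TypePair, ConvertFunc> table;
    };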

> Am I heading off in completely the wrong direction here?

Maybe, maybe not. At least, this was not our original idea. Instead we 
imagined using DA property catalogs: you have an enumeration on top 
and -- depending on its value -- several sub-properties like bit-width 
or signedness or array-element-type are revealed. We would need a small 
number of predefined DA properties to describe all possible types.
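
Roughly like this (pseudo-notation only, not actual DA syntax):

    typeInfo
        kind        = integer        # the enumeration on top
        signed      = false          # revealed because kind == integer
        bitWidth    = 16

    typeInfo
        kind        = array
        size        = 100
        elementType = <a nested typeInfo catalog>   # recursion by nesting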

Ben

References:
  dataAccess V4 Ca client propertyId questions (Marty Kraimer)
  Re: dataAccess V4 primitive types (Ralph Lange)
  Re: dataAccess V4 primitive types (Andrew Johnson)
