Experimental Physics and
Industrial Control System

Jeff Hill <[email protected]> · Wed, 02 Mar 2005 12:30:21 -0700

As mentioned before, my first idea was to interface to strings using
the streambuf interface in the standard library. Had I not discovered
that most implementations call malloc when creating a streambuf, I
would probably still be using, and be very happy with, that design,
because the standard streambuf interface allows for many different
string implementations.

My inclination was to interpose an interface that did not enforce an
early decision. The stringSegment interface should be able to
interface efficiently with standard streambuf based strings should we
decide to use them, and also directly with the existing fixed length
string buffers in the database should we end up using them.

There isn't anything fundamentally wrong with the streambuf interface
other than its complexity. IMHO, this complexity is based on a lack
of clear division between the responsibilities on either side of the
interface (there is public data). Nevertheless, that complaint is
easily rectified by reading the doc and looking at some examples.

The real problem is the call to malloc for the locale stuff in the
streambuf base class' constructor (in many implementations). It has
been almost a year since I made that evaluation, and it *is* a shame
to discard this approach based purely on use of malloc. I seem to
recall that the issue was that an implementation is required to make
a copy of the locale. It might be possible to sidestep this
"streambuf constructor calls malloc" problem by carefully choosing
our standard library implementation (possibly GNU), and I would be
happy to revisit my original evaluation if there was some interest in
that.

> > A better
> > alternative would be a function that copies the string to a user
> > supplied contiguous buffer.
>
> I agree that it would be clearer then whose responsibility it is to
> allocate/free the memory (the user's), instead of spreading it
> between std::string (allocation) and user (de-allocation). OTOH,
> such an interface would be more vulnerable to buffer overflow
> errors.
> 
> The main point, however, is that you cannot solve the underlying
> problem by refusing to provide method c_str. Either the
> functionality is not really needed. In this case it would be
> sufficient to (strongly) discourage use of this method and to point
> out its inefficiency.
> 
> Or, as you seem to suggest, users really /need/ this functionality
> i.e. there /is/ lots of code that expects a classic contiguous C
> string (for instance, because many old-style C string routines are
> used). In that case /any/ implementation based on non-contiguous
> storage has the same problem, regardless of the interface, and
> regardless of who (user or string class) does the work of
> allocating additional storage and copying the data.

I am not convinced that the following doesn't provide capability
equivalent to c_str() while avoiding all of the major weaknesses of
c_str().

void copyOutString ( char * pMyBuf, size_t bufLen );
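
As a rough illustration of how such a function might behave on top of
non-contiguous storage, here is a minimal sketch. The segmentedString
class, its block size, and the returned character count are all
assumptions made for this example. std::vector is used only to keep
the sketch short; a real implementation would presumably draw its
blocks from a fixed size block allocator.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical non-contiguous string: text lives in fixed size blocks.
class segmentedString {
public:
    static const std::size_t blockSize = 8;
    void append ( const char * pStr )
    {
        for ( ; *pStr; pStr++ ) {
            if ( m_len % blockSize == 0 ) {
                m_blocks.push_back ( block () ); // start a new block
            }
            m_blocks.back ().c[ m_len % blockSize ] = *pStr;
            m_len++;
        }
    }
    std::size_t length () const { return m_len; }
    // Copy into a caller supplied buffer. The copy is truncated to
    // fit and is always NUL terminated when bufLen > 0. Returns the
    // number of characters copied, not counting the NUL.
    std::size_t copyOutString ( char * pMyBuf, std::size_t bufLen ) const
    {
        if ( bufLen == 0 ) {
            return 0;
        }
        std::size_t nCopy = m_len < bufLen - 1 ? m_len : bufLen - 1;
        std::size_t done = 0;
        for ( std::size_t i = 0; done < nCopy; i++ ) {
            std::size_t n = nCopy - done;
            if ( n > blockSize ) {
                n = blockSize;
            }
            std::memcpy ( pMyBuf + done, m_blocks[i].c, n );
            done += n;
        }
        pMyBuf[nCopy] = '\0';
        return nCopy;
    }
private:
    struct block { char c[blockSize]; };
    std::vector < block > m_blocks;
    std::size_t m_len = 0;
};
```

Because the caller passes bufLen, the function can bound the copy
itself, and the caller owns the buffer for as long as it likes.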

> 
> An advantage of providing c_str is that allocation and 
> copying can be avoided 
> in cases where the string is short enough to fit into a 
> single (fixed-size) 
> block.

Among crushing, cataclysmic negatives there is this one 
advantage :-)

> 
> > BTW: Functions like c_str are also a real problem from a thread
> > safe interface perspective.
> 
> Why?

Andrew has already given a good answer here. He has mentioned that
locking is usually provided at a higher level. I agree, but should
mention that when maintaining large multi-threaded programs it is
easy to lose track of where in the function call hierarchy the
locking must eventually be implemented. Snippets of code get reused
in many different situations. Entry points might be called without
holding the proper lock. I have recently started using mutex guard
classes to enforce the locking requirements of interfaces at compile
time. This approach requires that the interface of each class be
perfectly clear in terms of thread safety, and that this be enforced
through the mutex guard that must be passed to member functions. All
of those warnings in the standard about not using the pointer
returned by c_str after the next call to a basic_string member
function sound like an invitation for race condition nightmares when
trying to maintain the locking in a large multithreaded code base.
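
The guard passing idea can be sketched roughly as follows. This is an
illustration only: std::mutex stands in for whatever mutex class is
actually used, and the guard and sharedName classes are hypothetical
names invented for the example. A member function that takes a guard
reference cannot be called by code that has not taken the lock; the
assert additionally checks at run time that it is the right lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>

// Hypothetical guard: holding one shows that the mutex is locked.
class guard {
public:
    explicit guard ( std::mutex & m ) : m_mutex ( m ), m_lock ( m ) {}
    guard ( const guard & ) = delete;
    guard & operator = ( const guard & ) = delete;
    bool owns ( const std::mutex & m ) const { return & m == & m_mutex; }
private:
    std::mutex & m_mutex;
    std::lock_guard < std::mutex > m_lock;
};

// Hypothetical shared object: every accessor demands a guard, so the
// locking requirement is part of each member function's signature.
class sharedName {
public:
    std::mutex & mutex () { return m_mutex; }
    void set ( guard & g, const std::string & name )
    {
        assert ( g.owns ( m_mutex ) );
        m_name = name;
    }
    // Copies out under the lock; no pointer into internal storage
    // escapes, in contrast to c_str.
    std::size_t copyOut ( guard & g, char * pBuf, std::size_t bufLen ) const
    {
        assert ( g.owns ( m_mutex ) );
        if ( bufLen == 0 ) {
            return 0;
        }
        std::size_t n = m_name.copy ( pBuf, bufLen - 1 );
        pBuf[n] = '\0';
        return n;
    }
private:
    mutable std::mutex m_mutex;
    std::string m_name;
};
```

A caller creates a guard on sharedName::mutex () in a local scope and
passes it to each call made while the lock is held.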

> 
> > [...]
> > Note however that the pure virtual string interface in data
> > access exists to provide us options. We may use almost any string
> > implementation we would like. This includes standard library
> > strings and standard library streams should they be found to be
> > suitable for a particular application.
> 
> Of course, the smaller your interface, the larger the set of
> possible implementations that can be fit unto it. Or so it would
> seem.
> 
> Unfortunately, however versatile your string interface may 
> be, it imposes an 
> imperative style on the implementation: it completely 
> precludes functional 
> style (immutable) strings. Such strings are *so* much easier 
> to handle, than 
> the traditional mutable ones. Take concatenation as an 
> example. Functional 
> style:
> 
> 	res = concat(s1,s2);
> 
> Imperative style:
> 
> 	res = new string( s1.length() + s2.length() ); // or was it -1 or +1 ???
> 	res.copy( s1 );
> 	res.append( s2 );

Take a second look at the stringSegment interface (see below). It
does not preclude user defined operators for your "imperative
style", and in fact includes interfaces directly supporting it. For
example, the write interface that takes a stringSegment can be used
to append a stringSegment to a stringSegment.

class stringSegment : 
    public streamPosition, 
    public streamRead,
    public streamWrite {
public:
    virtual bool getChar ( unsigned & inChar ) const  = 0;
    virtual bool putChar ( unsigned outChar )  = 0;
    virtual stringDiff compare ( const stringSegment & ) const = 0;
};

class streamPosition {
public:
    // returns the number of elements in the stream
    virtual size_t length () const = 0;
    // get current position
    virtual size_t position () const = 0;
    // set the current stream position
    // (returns false if request can't be satisfied)
    virtual bool movePosition ( size_t newPosition ) = 0;
    // returns the number of immediately viewable 
    // elements after the current position.
    virtual size_t viewable () = 0;
    // remove all elements from current position to the 
    // end of the stream
    virtual bool prune () = 0;
    // flush cached output entries 
    virtual void flush () = 0;
};

epicsShareExtern class propertyCatalog & voidCatalog;

enum streamWriteStatus { 
    swsSuccess = 0, 
    swsUnableToExtend = 1
};

class streamWrite {
public:
    virtual streamWriteStatus write (
        const double &, const propertyCatalog & = voidCatalog ) = 0;
    virtual streamWriteStatus write (
        const int &, const propertyCatalog & = voidCatalog ) = 0;
    virtual streamWriteStatus write (
        const long &, const propertyCatalog & = voidCatalog ) = 0;
    virtual streamWriteStatus write (
        const unsigned &, const propertyCatalog & = voidCatalog ) = 0;
    virtual streamWriteStatus write (
        const unsigned long &,
        const propertyCatalog & = voidCatalog ) = 0;
    virtual streamWriteStatus write (
        const epicsTime &, const propertyCatalog & = voidCatalog ) = 0;
    virtual streamWriteStatus write (
        const class stringSegment &,
        const propertyCatalog & = voidCatalog ) = 0;
};

enum streamReadStatus { 
    srsSuccess = 0, 
    srsOutOfRangeLow = 1, 
    srsOutOfRangeHigh = 2, 
    srsIncompatible = 3, 
    srsIncomplete = 4
};

class streamRead {
public:
    virtual streamReadStatus read (
        double &, const propertyCatalog & = voidCatalog ) const = 0;
    virtual streamReadStatus read (
        int &, const propertyCatalog & = voidCatalog ) const = 0;
    virtual streamReadStatus read (
        long &, const propertyCatalog & = voidCatalog ) const = 0;
    virtual streamReadStatus read (
        unsigned &, const propertyCatalog & = voidCatalog ) const = 0;
    virtual streamReadStatus read (
        unsigned long &,
        const propertyCatalog & = voidCatalog ) const = 0;
    virtual streamReadStatus read (
        epicsTime &, const propertyCatalog & = voidCatalog ) const = 0;
    virtual streamReadStatus read (
        class stringSegment &,
        const propertyCatalog & = voidCatalog ) const = 0;
};

> An implementation based on non-contiguous storage, could take 
> advantage of its 
> storage model, and almost completely avoid copying (at the 
> cost of slightly 
> increasing the overall memory footprint). 
> For instance, functional 
> concatenation can be done in constant time (avoiding all 
> allocation and 
> copying). As long as strings are immutable and references are 
> properly 
> tracked, an implementation can easily share the storage 
> between different 
> strings (except the meta data). I would bet that such an 
> implementation is in 
> the end a lot more efficient than any implementation based on 
> mutability, 
> such as imposed by the dataAccess string interface.
> 

Among other good reasons for encouraging implementations 
based on non-contiguous fixed sized memory management!
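
For readers unfamiliar with the technique, the constant time
concatenation described in the quoted text can be sketched roughly as
below. The ropeNode, rope, makeRope and concat names are hypothetical,
and std::shared_ptr stands in for whatever reference tracking would
really be used. Note that this sketch still allocates one small node
per concatenation; what it avoids is copying any character data.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical immutable string node: either a leaf holding text or
// an interior node referring to two existing strings. A node is never
// modified after construction, so storage can be shared freely.
class ropeNode {
public:
    ropeNode ( const std::string & text ) :
        m_text ( text ), m_len ( text.size () ) {}
    ropeNode ( std::shared_ptr < const ropeNode > l,
               std::shared_ptr < const ropeNode > r ) :
        m_left ( l ), m_right ( r ), m_len ( l->m_len + r->m_len ) {}
    std::size_t length () const { return m_len; }
    // Flatten into a contiguous string; only here are characters copied.
    void appendTo ( std::string & out ) const
    {
        if ( m_left ) {
            m_left->appendTo ( out );
            m_right->appendTo ( out );
        }
        else {
            out += m_text;
        }
    }
private:
    std::shared_ptr < const ropeNode > m_left;
    std::shared_ptr < const ropeNode > m_right;
    std::string m_text;
    std::size_t m_len;
};

typedef std::shared_ptr < const ropeNode > rope;

rope makeRope ( const std::string & text )
{
    return std::make_shared < ropeNode > ( text );
}

// Cost is independent of the lengths of s1 and s2: both operands are
// shared by reference, never copied.
rope concat ( const rope & s1, const rope & s2 )
{
    return std::make_shared < ropeNode > ( s1, s2 );
}
```

With this shape, res = concat ( s1, s2 ) leaves s1 and s2 untouched
and usable, which is the property the functional style relies on.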

Jeff
