EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: string implementations (was: memory management)
From: Benjamin Franksen <[email protected]>
To: [email protected]
Date: Tue, 5 Jul 2005 11:58:15 +0200
On Wednesday 02 March 2005 02:45, Andrew Johnson wrote:
> There *are* implementations of the C++ std::string API that use
> non-contiguous blocks; the original SGI template library contains an
> extension called rope<T, Alloc> which implements such a thing - see
> http://www.sgi.com/tech/stl/Rope.html for details.

Because this question intersects with epicsTypes, I have been doing some 
research on the string implementation matter.

The above-mentioned SGI rope implementation is horrible. It is
completely unreadable: the whole implementation is in the header file,
and it's impossible to tell which part is interface and which is
implementation. A lot of the complication is apparently due to
compatibility with std::string and the whole stupid iterator stuff from
the standard library. These are bad, bloated interfaces and we should
build our own instead.

> I also found this comparison of string libraries which might be
> interesting - there are lots of possibilities out there already...
> http://www.and.org/vstr/comparison.html

This comparison site is indeed quite interesting, although of course 
somewhat biased toward the Vstr library. A main point with Vstr seems 
to be the netstrings, which are not so interesting for EPICS (we 
already have our own network protocol and don't want to use character 
encodings anyway). The Vstr string operations are also not thread safe. 
Otherwise it has a lot in common with the library I'll discuss in the
next paragraph.

The most interesting library I found is the Cord library (cord=lighter 
than ropes, heavier than strings) which is part of the Boehm garbage 
collector (http://www.hpl.hp.com/personal/Hans_Boehm/gc/). Highlights 
of this library are:

o cords are immutable => automatically thread-safe
o fast concatenation and substring operations (O(1))
o most of the memory is shared between operations
o any kind of character source (function) with pre-determined size can
  be represented directly as a cord; in particular, an entire file can
  be represented as a cord without reading everything into memory
o direct (read-only) access via index is possible but somewhat slow
  (O(log(size))); systematic traversal is faster
o no performance hit (speed, memory) for very large strings
o large cords are usually represented as concatenations of small buffers
  (using a tree) so danger of fragmentation is low
o good integration with standard C strings (constants, e.g. literals, 
  need not be copied and can be used as they are)
o conversion functions (printf family); probably need to improve these

o implemented in C in /very/ readable style => amenable to changes and 
  additions (and fixes, if necessary)
o not a monster (~900 LOC for the base, another 600 for extended
  functionality), even though non-trivial algorithms are used
o uses the conservative Boehm GC (very mature & performant, ~26000 LOC)
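
To make the points above concrete, here is a minimal sketch in C of the
underlying idea: an immutable tree whose leaves are either borrowed C
strings or character-source functions, where concatenation allocates one
new node and shares everything else. This is not the cord.h API. All
names (mcord, mcord_cat, mcord_fetch, ...) are hypothetical, nodes are
simply leaked where the real library relies on the garbage collector,
and the tree is never rebalanced, so indexed access costs O(depth) here
rather than a guaranteed O(log(size)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical character source: returns the character at index i. */
typedef char (*mcord_fn)(size_t i, void *client_data);

typedef enum { MCORD_LEAF, MCORD_CONCAT, MCORD_FN } mcord_kind;

/* Nodes are never modified after construction, so they can be shared. */
typedef struct mcord {
    mcord_kind kind;
    size_t len;                 /* number of characters below this node */
    const char *text;           /* MCORD_LEAF: borrowed, not copied */
    const struct mcord *left;   /* MCORD_CONCAT */
    const struct mcord *right;  /* MCORD_CONCAT */
    mcord_fn fn;                /* MCORD_FN */
    void *client_data;          /* MCORD_FN */
} mcord;

static mcord *mcord_new(mcord_kind kind, size_t len) {
    mcord *n = calloc(1, sizeof *n);  /* leaked; a GC would reclaim it */
    if (n == NULL) abort();
    n->kind = kind;
    n->len = len;
    return n;
}

/* Wrap a C string (e.g. a literal) without copying its characters. */
const mcord *mcord_from_literal(const char *s) {
    mcord *n = mcord_new(MCORD_LEAF, strlen(s));
    n->text = s;
    return n;
}

/* Represent a character source of known size directly as a leaf. */
const mcord *mcord_from_fn(mcord_fn fn, void *client_data, size_t len) {
    mcord *n = mcord_new(MCORD_FN, len);
    n->fn = fn;
    n->client_data = client_data;
    return n;
}

/* Concatenation allocates one node; both operands are shared. */
const mcord *mcord_cat(const mcord *a, const mcord *b) {
    mcord *n = mcord_new(MCORD_CONCAT, a->len + b->len);
    n->left = a;
    n->right = b;
    return n;
}

/* Indexed read: walk down the tree, adjusting the index on the way. */
char mcord_fetch(const mcord *c, size_t i) {
    assert(i < c->len);
    while (c->kind == MCORD_CONCAT) {
        if (i < c->left->len) {
            c = c->left;
        } else {
            i -= c->left->len;
            c = c->right;
        }
    }
    return c->kind == MCORD_LEAF ? c->text[i] : c->fn(i, c->client_data);
}

/* Systematic traversal: copy all characters into out, left to right. */
static void mcord_copy_out(const mcord *c, char *out) {
    switch (c->kind) {
    case MCORD_LEAF:
        memcpy(out, c->text, c->len);
        break;
    case MCORD_FN:
        for (size_t i = 0; i < c->len; i++) out[i] = c->fn(i, c->client_data);
        break;
    case MCORD_CONCAT:
        mcord_copy_out(c->left, out);
        mcord_copy_out(c->right, out + c->left->len);
        break;
    }
}

/* Flatten into a freshly malloc'ed C string; the caller frees it. */
char *mcord_to_cstring(const mcord *c) {
    char *buf = malloc(c->len + 1);
    if (buf == NULL) abort();
    mcord_copy_out(c, buf);
    buf[c->len] = '\0';
    return buf;
}

/* Example character source: an endless run of 'x'. */
char mcord_repeat_x(size_t i, void *unused) {
    (void)i; (void)unused;
    return 'x';
}

/* Small usage example; returns 1 if everything behaves as expected. */
int mcord_demo(void) {
    const mcord *both = mcord_cat(mcord_from_literal("Hello, "),
                                  mcord_from_literal("world"));
    const mcord *padded = mcord_cat(both, mcord_from_fn(mcord_repeat_x, NULL, 3));
    char *flat = mcord_to_cstring(padded);
    int ok = strcmp(flat, "Hello, worldxxx") == 0
          && mcord_fetch(padded, 7) == 'w' && both->len == 12;
    free(flat);
    return ok;
}
```

Note that `both` stays valid and unchanged after it has been used to
build `padded`; that sharing is what makes immutability cheap here, and
why no locking is needed for readers.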

I think this would be a fine implementation to use on the IOC. I 
experimented with writing a (very light-weight) C++ wrapper around 
cords and this looks very promising. Stay tuned...

Ben

