On Wednesday 02 March 2005 02:45, Andrew Johnson wrote:
> There *are* implementations of the C++ std::string API that use
> non-contiguous blocks; the original SGI template library contains an
> extension called rope<T, Alloc> which implements such a thing - see
> http://www.sgi.com/tech/stl/Rope.html for details.
Because this question intersects with epicsTypes, I have been doing some
research on the string implementation matter.
The above-mentioned SGI rope implementation is horrible. It is
completely unreadable: the whole implementation is in the header file,
and it's impossible to tell which part is interface and which is
implementation. A lot of the complication is apparently due to
compatibility with std::string and the whole stupid iterator stuff from
the standard library. These are bad, bloated interfaces, and we should
build our own instead.
> I also found this comparison of string libraries which might be
> interesting - there are lots of possibilities out there already...
> http://www.and.org/vstr/comparison.html
This comparison site is indeed quite interesting, although of course
somewhat biased toward the Vstr library. A main point with Vstr seems
to be the netstrings, which are not so interesting for EPICS (we
already have our own network protocol and don't want to use character
encodings anyway). The Vstr string operations are also not thread safe.
Otherwise it has a lot in common with the library I'll discuss in the
next paragraph.
The most interesting library I found is the Cord library (cord=lighter
than ropes, heavier than strings) which is part of the Boehm garbage
collector (http://www.hpl.hp.com/personal/Hans_Boehm/gc/). Highlights
of this library are:
o cords are immutable => automatically thread-safe
o fast concatenation and substring operations (O(1))
o most of the memory is shared between operations
o any kind of character source (function) with pre-determined size can
  be represented directly as a cord, particularly:
  - can represent an entire file as a cord without reading everything
    into memory
o direct access (read-only) via index possible but somewhat slow
  (O(log(size))); but systematic traversal is better
o no performance hit (speed, memory) for very large strings
o large cords are usually represented as concatenations of small buffers
  (using a tree) so danger of fragmentation is low
o good integration with standard C strings (constants, e.g. literals,
  need not be copied and can be used as they are)
o conversion functions (printf family); probably need to improve these
o implemented in C in /very/ readable style => amenable to changes and
  additions (and fixes, if necessary)
o not a monster (~900 LOC for the base, another 600 for extended
  functionality), even though non-trivial algorithms are used
o uses the conservative Boehm GC (very mature & performant, ~26000 LOC)
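
To make the first few points concrete, here is a minimal sketch in C of
the general idea (immutable nodes, O(1) concatenation by sharing, indexed
reads that walk a tree). This is NOT the Cord library's code or API: the
names node, rope_leaf, rope_cat, rope_fetch and rope_to_cstr are made up
for illustration, the tree is never rebalanced, and nothing is ever freed
(the real library leaves reclamation to the garbage collector).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for illustration; not the Boehm CORD layout.
 * Nodes are never modified after construction, so they can be shared
 * freely between strings (and between threads). */
typedef struct node {
    size_t len;                /* number of characters below this node */
    const char *leaf;          /* leaf: borrowed C string, never copied */
    const struct node *left;   /* concatenation: both children non-NULL */
    const struct node *right;
} node;

/* Wrap an existing C string (e.g. a literal) without copying it. */
const node *rope_leaf(const char *s)
{
    node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->len = strlen(s);
    n->leaf = s;
    n->left = NULL;
    n->right = NULL;
    return n;
}

/* O(1) concatenation: one new node, both operands are shared as-is. */
const node *rope_cat(const node *a, const node *b)
{
    node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->len = a->len + b->len;
    n->leaf = NULL;
    n->left = a;
    n->right = b;
    return n;
}

/* Indexed read: cost is proportional to the depth of the tree, which is
 * logarithmic only if the tree is kept balanced (not attempted here). */
char rope_fetch(const node *n, size_t i)
{
    assert(i < n->len);
    while (n->leaf == NULL) {
        if (i < n->left->len) {
            n = n->left;
        } else {
            i -= n->left->len;
            n = n->right;
        }
    }
    return n->leaf[i];
}

/* Helper for the traversal: copy all characters, return the end pointer. */
static char *copy_out(const node *n, char *buf)
{
    if (n->leaf != NULL) {
        memcpy(buf, n->leaf, n->len);
        return buf + n->len;
    }
    buf = copy_out(n->left, buf);
    return copy_out(n->right, buf);
}

/* Systematic traversal: flatten into a freshly allocated C string. */
char *rope_to_cstr(const node *n)
{
    char *buf = malloc(n->len + 1);
    assert(buf != NULL);
    *copy_out(n, buf) = '\0';
    return buf;
}
```

Note that rope_cat(x, x) costs a single node no matter how long x is,
which is the kind of sharing the list above refers to. Reading every
character through rope_fetch would repeat the tree walk per index, which
is why a single traversal like rope_to_cstr is the better way to visit
the whole string.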
I think this would be a fine implementation to use on the IOC. I
experimented with writing a (very light-weight) C++ wrapper around
cords and this looks very promising. Stay tuned...
Ben