EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Re: CA V4 Protocol Specification
From: Benjamin Franksen <[email protected]>
To: [email protected]
Date: Thu, 27 Oct 2005 00:11:13 +0200
On Wednesday 26 October 2005 16:42, Andrew Johnson wrote:
> Benjamin Franksen wrote:
> ...
> Marty Kraimer replied:
> > Java 5 uses 16 bits for char, which is not sufficient to encode all
> > Unicode characters.
> > It uses 2 consecutive chars to hold a Unicode character that does
> > not fit in 16 bits.
> >
> > At least some C/C++ implementations use 32 bits for wchar_t, which is
> > sufficient for all Unicode characters.
> > But what if an implementation uses 16 bits?
> >
> > Thus how will the number of characters in a UTF-8 string be used?
>
> Unicode/UTF-8 (which is what we really mean when we say UTF-8) is
> well-defined in that if a routine understands the multi-byte encoding
> rules it can scan a UTF-8 string and count the number of Unicode
> 'code points' contained in it, which is probably what Benjamin means
> when he talks about a character count.

Yes.
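
For reference, the scan Andrew describes can be sketched in a few lines.
This is only an illustration in Python, not anything taken from the CA V4
proposal. The function name count_code_points is made up here, and the
sketch assumes the input is already well-formed UTF-8 (it does no
validation).

```python
def count_code_points(data: bytes) -> int:
    """Count Unicode code points in well-formed UTF-8 (hypothetical helper)."""
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    # Every other byte starts a new code point, so counting the
    # non-continuation bytes gives the code point count.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# "na\u00efve \u20ac" is 7 code points: the i-diaeresis takes 2 bytes
# and the euro sign takes 3 bytes in UTF-8, so the byte length is 10.
sample = "na\u00efve \u20ac".encode("utf-8")
print(len(sample), count_code_points(sample))
```

Note that this is a full pass over the string, so the count is not free to
compute on the receiving side.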

> However like Marty I would strongly question the usefulness of this
> information to anything other than the final GUI display widget that
> is going to put the thing on a screen; even if it were using a
> monospaced font, some Unicode code points actually encode 'combining'
> characters like accents so the number of code points wouldn't always
> match the width of the final output.

Ok. It was just a thought. It seems you have put much more thought into
this than I ever did, so you (both) are probably right. I was just
thinking that conversion to other encodings or formats might be faster
if a character (or code point) count were readily supplied. I agree that
this is probably largely a client-side matter and thus not so critical.
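
A rough sketch of why a single count does not serve every conversion
target, again in Python and purely illustrative. The helper name
unit_counts and the two sample strings are assumptions for this example.
It relies only on the standard codecs and on unicodedata.combining, which
returns a non-zero class for combining marks.

```python
import unicodedata


def unit_counts(text: str) -> dict:
    """Return the length of text in several units (hypothetical helper)."""
    return {
        # len() of a Python 3 str counts code points.
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # Each UTF-16 code unit is 2 bytes, each UTF-32 unit is 4 bytes.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        "utf32_units": len(text.encode("utf-32-le")) // 4,
        # Code points that are not combining marks.
        "non_combining": sum(
            1 for ch in text if unicodedata.combining(ch) == 0
        ),
    }


clef = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, outside the 16-bit range
accented = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

print(unit_counts(clef))
print(unit_counts(accented))
```

The clef is one code point but two 16-bit units, which is the Java char
case Marty mentions. The accented 'e' is two code points but one
non-combining code point, which is Andrew's point about display width. So
a code point count would size a UTF-32 buffer exactly, but not a UTF-16
one.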

Ben

References:
CA V4 Protocol Specification Jeff Hill
Re: CA V4 Protocol Specification Marty Kraimer
Re: CA V4 Protocol Specification Andrew Johnson
