Last call for ISO 15924-based updates

Phillips, Addison addison at amazon.com
Tue Mar 17 16:19:37 CET 2009


The code ‘Zinh’ in ISO 15924 is used in the Unicode Character Database to indicate the ‘script’ property of certain characters. As such, it is unlikely to be used in any type of document to tag anything. Its main use is in libraries for processing text (which sometimes need to know about Unicode properties).

For example, a program that implements Unicode Technical Standard #18 (Unicode Regular Expressions) needs to know about script properties because a caller might ask for a sequence of characters that matches the expression \p{script=latin}. Combining marks can appear within such a sequence. Knowing that these “inherit” their script from the surrounding base characters allows them to match the expression as well.
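
To make the inheritance idea concrete, here is a minimal sketch in Python. It is not how any particular regular expression engine works. The function names (raw_script, resolved_scripts, is_all_script) are made up for illustration, and the script lookup is a crude stand-in for the real Unicode Character Database: it treats every character with a non-zero combining class as 'Zinh' and guesses Latin or Greek from the character name, which the real Scripts.txt data does not do in every case.

```python
import unicodedata


def raw_script(ch):
    """Toy stand-in for the UCD Script property (an assumption, not real UCD data)."""
    # Assumption: any character with a non-zero combining class is "Inherited".
    if unicodedata.combining(ch) != 0:
        return "Zinh"
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "Latn"
    if name.startswith("GREEK"):
        return "Grek"
    return "Zyyy"  # treat everything else as Common


def resolved_scripts(text):
    """Replace each 'Zinh' entry with the script of the preceding base character."""
    result = []
    last_base = "Zyyy"  # used if a combining mark has no base before it
    for ch in text:
        script = raw_script(ch)
        if script == "Zinh":
            script = last_base
        else:
            last_base = script
        result.append(script)
    return result


def is_all_script(text, script):
    """Rough analogue of matching a whole string against \\p{script=...}+."""
    return bool(text) and all(s == script for s in resolved_scripts(text))


# "e" followed by U+0301 COMBINING ACUTE ACCENT: the mark inherits Latin.
print(resolved_scripts("e\u0301"))
print(is_all_script("re\u0301sume\u0301", "Latn"))
```

Without the inheritance step, the combining accent would be reported as 'Zinh' and the decomposed word would fail a Latin-only match even though a reader sees only Latin text.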

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of CE Whitehead
Sent: Tuesday, March 17, 2009 8:10 AM
To: ietf-languages at iana.org
Cc: cewcathar at hotmail.com
Subject: RE: Last call for ISO 15924-based updates

Hi!
I still favor a comment for [zinh] . . . in spite of misgivings here . . .

> Michael Everson everson at evertype.com
> Mon Mar 16 10:58:01 CET 2009
>
> I disagree. The statement of what Zinh is in the registry for is
> already in Doug's draft. There is no reason to add an imperative
> statement telling users of the registry Not To Use It.

I did not find it there, only Michael's citation of it . . . (I went through the draft online at
http://tools.ietf.org/html/draft-ietf-ltru-4645bis-10 )



CHANGE COMMENT TO (the following is based on information from John, which I extracted from his emails):

> Code
. . .
> used to label Unicode combining marks, which "inherit"
> their script property from the . . . character they are combined with. 'Dummy' script { Not used to tag documents ??}

However, I do question "Not used to tag documents"; I am still totally lost (in spite of Peter's great explanations, below).  What exactly does "Not used to tag documents" mean?  Does it mean not used in the language tag indicating the overall document language, but possibly used somewhere in the document to indicate a diacritic mark on a character (where the display of the diacritic mark depends on the script/character)?

(Sorry to ask a dumb question, and I know this is long, but I like lucid explanations that make sense to the uninitiated.)

* * *

--C. E. Whitehead
cewcathar at hotmail.com



> From: petercon at microsoft.com
> To: cewcathar at hotmail.com; ietf-languages at iana.org
> Date: Fri, 13 Mar 2009 19:50:11 -0700
> Subject: RE: Last call for ISO 15924-based updates
>
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of CE Whitehead
>
> > My question is:  does this particular subtag 'zinh' help any applications display characters properly?
>
> The ISO 15924 script identifier, as it would be used in the Unicode Character Database, most definitely is used in software implementations to display text properly.
>
> As a subtag in a BCP47 language tag, it would most certainly NOT help any applications display characters properly. In a language tag, it would have no useful purpose.
>
>
>
> Peter
>

