A proposed solution for descriptions (was: Re: ISO 639 - New item approved - N'Ko)

Sun Jun 11 12:49:56 CEST 2006

+1 

> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no 
> [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Doug Ewell
> Sent: 11 June 2006 05:13
> To: ietf-languages at iana.org
> Subject: A proposed solution for descriptions (was: Re: ISO 
> 639 - New item approved - N'Ko)
> 
> Mark Crispin <mrc at CAC dot Washington dot EDU> wrote:
> 
> > The problem is that you guys are trying to resolve conflicting
> > desires into a single name.  Long experience tells me that this
> > doesn't work, and ultimately forces the registry into wretched
> > compromises that displease everybody.
> 
> Richard Ishida <ishida at w3 dot org> wrote:
> 
> > In the case of the actual registry, there currently is no N'Ko
> > ASCII text, and one would have to type N&#x2019;Ko to get a match,
> > knowing the right code point to use, and how to represent that as
> > an NCR. You cannot google that by typing in N'Ko. I don't think
> > that situation is very helpful to the average user.
> 
> Originally I was opposed to adding new Description values to 
> solve this problem, but Mark's and Richard's arguments have 
> thoroughly convinced me that this is necessary, and isn't a 
> slippery slope that would lead to dozens of Description 
> strings for every subtag.  I stand corrected, and no, I don't 
> mind being called a flip-flopper.
> 
> I hereby propose some changes to the Description fields of 28 
> existing records, based on the following issues that 
> presented themselves more or less in this order.
> 
> 1.  With the addition of N'Ko the language, the Registry now 
> has 14 subtag records with Description fields that include a 
> non-ASCII character (and therefore a hex NCR).  I propose 
> that for each of these, a corresponding ASCII-only 
> Description be added.  Example: "N&#x2019;Ko" 
> will be joined by "N'Ko".  This applies not only to 
> apostrophes, but to all non-ASCII characters such as accented 
> letters: "Volapük" will be joined by "Volapuk".  This solves 
> most of the problem described by Richard.
> 
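
For illustration only, item 1 could be approximated mechanically along the following lines. This is a minimal Python sketch, not the procedure used for the Registry: the function name ascii_description and the APOSTROPHES table are hypothetical, and it assumes that NCRs are decoded first, apostrophe-like characters are mapped to the ASCII apostrophe, and accents are dropped after Unicode decomposition. Any real proposal would still need each result reviewed by hand.

```python
import html
import unicodedata

# Assumed (hypothetical) table of apostrophe-like characters that should
# become the ASCII apostrophe in the ASCII-only Description.
APOSTROPHES = {
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u02bc": "'",  # MODIFIER LETTER APOSTROPHE
    "\u00b4": "'",  # ACUTE ACCENT (seen as a wrong apostrophe)
}


def ascii_description(description: str) -> str:
    """Return an ASCII-only approximation of a Description value."""
    # Decode hex NCRs such as &#x2019; into the characters they stand for.
    text = html.unescape(description)
    # Map apostrophe-like characters before decomposing, because U+00B4
    # would otherwise decompose into a space plus a combining accent.
    text = "".join(APOSTROPHES.get(ch, ch) for ch in text)
    # Decompose accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only ASCII. Caveat: letters with no decomposition are dropped
    # entirely, so the output must be checked by a person.
    return "".join(ch for ch in decomposed if ord(ch) < 128)


print(ascii_description("N&#x2019;Ko"))
print(ascii_description("Volap\u00fck"))
```

The ASCII form would be added next to the existing Description, not substituted for it.
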
> 2.  Conversely, those subtags that have a Description with an
> ASCII apostrophe should have a corresponding Description
> added with the appropriate non-ASCII directional apostrophe
> or modifier letter.  Example: "Mi'kmaq" will be joined by
> "Mi&#x2BC;kmaq".  This should answer the concerns of Michael
> and others that a Description in "the correct characters" be
> available for all subtags.
> 
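
The reverse direction in item 2 can be sketched the same way. Again this is only an illustration in Python: the function name with_typographic_apostrophe is hypothetical, and the default of U+02BC is an assumption taken from the "Mi'kmaq" example. Which character is appropriate (U+02BC, U+2019, or something else) depends on the language and has to be decided per subtag.

```python
def with_typographic_apostrophe(description: str,
                                apostrophe: str = "\u02bc") -> str:
    """Replace ASCII apostrophes and return the result with hex NCRs.

    The default replacement (MODIFIER LETTER APOSTROPHE) is an assumption;
    pass a different character where the language calls for one.
    """
    replaced = description.replace("'", apostrophe)
    # Encode every non-ASCII character as a hex NCR, e.g. &#x2BC;
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in replaced
    )


print(with_typographic_apostrophe("Mi'kmaq"))
print(with_typographic_apostrophe("N'Ko", apostrophe="\u2019"))
```
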
> 3.  A few names (Gwich'in, Ge'ez) currently have the *wrong*
> non-ASCII apostrophe.  I propose that these be changed to a
> more appropriate character, as well as adding the pure-ASCII
> equivalent.  Example: "Gwich´in" will be deleted and two new
> Description fields, "Gwich&#x2BC;in" and "Gwich'in", will be
> added.  This also answers a concern raised by Michael.
> 
> 4.  Some subtags were found to have a Description with a 
> second name in parentheses, which is really an alternate name 
> rather than a qualifier of the first name.  In the case of 
> script subtag "Hano", the Description "Hanunoo 
> (Hanun&#xF3;o)" already does what we are trying to achieve: 
> it provides ASCII and non-ASCII equivalents for the same 
> name.  This should be replaced by two new Description fields, 
> "Hanunoo" and "Hanun&#xF3;o".
> 
> 5.  Likewise for a Description like "Lepcha (R&#xF3;ng)", it 
> doesn't make sense to repeat the "Lepcha" part simply to 
> provide an ASCII and non-ASCII version of "Róng".  What would 
> make sense would be to split this into three Descriptions: 
> "Lepcha", "R&#xF3;ng", and "Rong".
> 
> 6.  For that matter, any Description fields with an alternate
> name in parentheses (not a qualifier) should really be split
> into multiple Descriptions, regardless of whether non-ASCII
> characters are present.  Example: "Falkland Islands
> (Malvinas)" should be split into "Falkland Islands" and
> "Malvinas".  This is what we did with language subtags, which
> are separated by semicolons in ISO 639: we converted them to
> multiple Description fields.  What I propose is that we do
> this consistently with scripts and regions as well.
> 
> Note that items 4 through 6 have no effect on Description 
> fields where the parenthesized portion acts as a qualifier to 
> the unparenthesized portion.  For example, "Cyrillic (Old 
> Church Slavonic variant)" would NOT be split into "Cyrillic" 
> and "Old Church Slavonic variant" since this would make no 
> sense, and would give "Cyrl" and "Cyrs" the same Description.
> 
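
The split in items 4 through 6, including the qualifier exception, might look roughly like this in Python. It is a sketch under stated assumptions: split_description and QUALIFIERS are hypothetical names, and the set of parenthesized strings that count as qualifiers is assumed to be maintained by hand, since telling an alternate name from a qualifier is a human judgement that the code cannot make.

```python
import re

# Matches "Main name (Parenthesized part)" with a single trailing group.
PAREN = re.compile(r"^(?P<main>.+?)\s*\((?P<inner>[^()]+)\)$")

# Hypothetical, hand-maintained set of parenthesized parts that qualify
# the main name and therefore must NOT be split off.
QUALIFIERS = frozenset({"Old Church Slavonic variant"})


def split_description(description, qualifiers=QUALIFIERS):
    """Split an alternate name in parentheses into its own Description."""
    match = PAREN.match(description)
    if match is None or match.group("inner") in qualifiers:
        # No parentheses, or the parenthesized part is a qualifier.
        return [description]
    return [match.group("main"), match.group("inner")]


print(split_description("Falkland Islands (Malvinas)"))
print(split_description("Cyrillic (Old Church Slavonic variant)"))
```

For "Lepcha (R&#xF3;ng)" this yields "Lepcha" and "R&#xF3;ng"; the ASCII form "Rong" from item 5 would still have to be added as a third Description.
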
> 7.  Finally, getting back to the apostrophe issue, it appears 
> that the language Amis, represented by the grandfathered tag 
> "i-ami", should not have an apostrophe at all.  This was 
> listed as 'Amis in the RFC 1766 registration form dating back 
> to 1999, and so it was copied that way to the initial RFC 
> 3066bis Registry, but apparently this was a typo or editing 
> error.  I propose changing this to "Amis".
> 
> In a separate mail I will present proposed registration forms 
> for all 28 subtags that are affected in one way or another by 
> these issues.  They are severable; each should be considered 
> and discussed by the group on its own merits.  We aren't 
> really constrained by time on this, but we should keep the 
> discussion moving so that the appropriate changes (as agreed 
> by the list) can be made to the Registry.
> 
> --
> Doug Ewell
> Fullerton, California, USA
> http://users.adelphia.net/~dewell/