dewell at adelphia.net
Thu May 3 08:09:17 CEST 2007
Reshat Sabiq (Reşat) <tatar dot iqtelif dot i18n at gmail dot com> wrote:
> I don't find the Comments entry that was submitted to be well worded.
> I can agree w/ the 2nd sentence, for the most part, for brevity, but
> the first one doesn't look right.
I apologize if Reşat or others are dissatisfied with the Comments field
for this subtag. I was largely responsible for trying to reduce the
length and 2-dimensional structure from Reşat's proposed:
> Comments: Denotes alphabet/orthography used in Turkic
> republics/regions of the former USSR in late 1920s, and throughout
> 1930s, which aimed at representing equivalent phonemes in a unified
> fashion. Other names:
> a. New Turkic Alphabet
> b. Birlәşdirilmiş Jeni Türk Әlifbasь (in Azeri)
> c. Jaŋalif (in Qazan Tatar: abbreviation of "New Alphabet").
I had hoped the shorter version Michael and I agreed on was mostly
equivalent to this. I can see I'm fighting a losing battle in trying to
keep the Comments fields down to a clause or two, as had been the
tradition for registered subtags in the past. (See, for instance, the
Comments field attached to the 'boont' subtag, which makes no attempt to
document the history and circumstances of the invention of Boontling.)
Now that we are sending the registration form as well as the new record
to IANA and to the list (did anyone notice we did that for 'tarask'?),
there will be greater public review and I probably won't be able to do
anything about the length of comments, but I also hope the archiving of
the forms will encourage proposers to put lengthy explanations and
bibliographic references there and not in the Registry.
> 1) It mentions only 1930s, which could mislead or confuse some people.
I had not thought the difference between "in late 1920s, and throughout
1930s" and "in the 1930s" would be substantial enough to cause genuine
confusion.
> 2) It has a semicolon after Jeni which breaks a single correct name
> into two incorrect names, if i understand ;'s role as a delimiter in
> this thing.
I'm pretty sure the extra semicolon was a typo.
> 3) I also suggest changing ';' as a punctuation sign in names list to
> ','.
> I'd appreciate feedback on possibility of changing the Comments from:
> Latin orthography used in the Soviet Union in the 1930s for writing
> Turkic languages. Also called New Turkic Alphabet;
> Birlәşdirilmiş Jeni; Türk Әlifbasь;
> or Jaŋalif.
> to:
> Denotes alphabet used in Turkic republics/regions of the former USSR
> in late 1920s, and throughout 1930s, which aspired to represent
> equivalent phonemes in a unified fashion. Also known as: New Turkic
> Alphabet, Birlәşdirilmiş Jeni Türk
> Әlifbasь, Jaŋalif.
(Reşat later changed the spelling of 'Türk' to 'Tyrk' for reasons
that are not clear to me.)
I am not opposed to these changes, especially the one involving the
semicolon which incorrectly breaks up one of the names of the
orthography. Reşat's proposed change is not dramatically longer than
the one currently registered. I would suggest that we review this one
carefully, register the Right Thing, and then put it to rest, and not
let ourselves get into a pattern of micro-analyzing every Comments field
that appears in future registrations.
> 4) Lastly, I believe there is no dispute about the following being
> true for this subtag, and yet it is not so indicated, as i suggested
> Suppress-Script: Latn
Section 3.1 of RFC 4646 states clearly:
"The field 'Suppress-Script' MUST only appear in records whose 'Type'
field-value is 'language'."
Suppress-Script values are not added to variant subtags. If you feel
this should be changed, please join the LTRU Working Group list (link
available at bottom of this message) and discuss it there. This group
is not empowered to change RFC 4646.
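To make the rule concrete, here is a minimal Python sketch of the check it implies. The record layout is simplified (one "Name: value" pair per line, no folded or repeated fields), the subtag values are placeholders rather than real registrations, and the function names are invented for illustration; none of this is taken from RFC 4646 or from any IANA tooling.

```python
def parse_record(text: str) -> dict:
    """Parse a simplified registry record: one 'Name: value' field per line."""
    fields = {}
    for line in text.strip().splitlines():
        name, _, body = line.partition(":")
        fields[name.strip()] = body.strip()
    return fields


def suppress_script_allowed(record: dict) -> bool:
    """Return False only when Suppress-Script appears in a non-language record."""
    return "Suppress-Script" not in record or record.get("Type") == "language"


# Hypothetical records; the subtag values are placeholders.
variant_record = parse_record("""
Type: variant
Subtag: example
Suppress-Script: Latn
""")

language_record = parse_record("""
Type: language
Subtag: xx
Suppress-Script: Latn
""")

print(suppress_script_allowed(variant_record))   # False
print(suppress_script_allowed(language_record))  # True
```

A record with no Suppress-Script field passes regardless of its Type, which matches the quoted sentence: the restriction is on where the field may appear, not a requirement that it appear.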
Michael Everson <everson at evertype dot com> replied:
> I am not really very happy about tinkering so soon after registration.
> But if we do change it I would like to get rid of the illegible &xxxx;
> notation. If the registry entries are to be in HTML, they should be so
> normatively, with charset tagging so that they display properly. If
> they are not tagged, then ASCII fallback should be used so the strings
> are legible. As it is I can only guess, or drag out the Unicode book
> and look them up. That's not legibility.
RFC 4646 specifies that non-ASCII characters be represented using these
ugly hex NCRs. There are pros and cons to using UTF-8 for the Registry,
and even though I am a huge fan of UTF-8, there are valid points to be
made on both sides (e.g. many e-mail systems, even today, mangle UTF-8).
Again, this list is not empowered to disregard or overturn what RFC 4646
says. We have had the debate in LTRU probably three or four times now,
and the hex NCRs appear likely to stay.
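For anyone who wants to see what the notation does to the names discussed above, here is a small Python sketch. It assumes the form "&#x" + hexadecimal code point + ";" for every non-ASCII character, and it also escapes the ampersand itself so the result stays unambiguous; the function name is made up for this example, and the exact escaping rules should be checked against Section 3.1 of RFC 4646 rather than taken from this code.

```python
def to_hex_ncr(text: str) -> str:
    """Replace each non-ASCII character, and '&', with a hex NCR like &#x14B;."""
    pieces = []
    for ch in text:
        if ord(ch) > 0x7F or ch == "&":
            # Uppercase hex digits, no zero padding.
            pieces.append(f"&#x{ord(ch):X};")
        else:
            pieces.append(ch)
    return "".join(pieces)


print(to_hex_ncr("Jaŋalif"))  # Ja&#x14B;alif
print(to_hex_ncr("Türk"))     # T&#xFC;rk
```

The output is pure ASCII, so it survives mail systems that mangle UTF-8, at the cost of the legibility Michael objects to.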
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14