Request: Add retired tag "eml" to the IANA registry

Lang Gérard gerard.lang at insee.fr
Fri Dec 11 17:36:27 CET 2009


The assertion that "no deprecated code element representing the name of a linguistic entity can be reassigned inside ISO 639" is not written in the normative text of any of the five valid parts (1, 2, 3, 5 and 6) of ISO 639, nor in the generic methodological part (4), which will soon be valid.
Gérard LANG, member of ISO 639/RA-JAC

-----Original Message-----
From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Phillips, Addison
Sent: Friday, December 11, 2009 17:14
To: Michael(tm) Smith
Cc: ietf-languages at alvestrand.no
Subject: RE: Request: Add retired tag "eml" to the IANA registry

I think there are two potential problems here.

1. Although I sincerely hope that retired codes are reserved forever, I'm not sure what happens with them. If ISO 639 can recycle the retired ones, that would create problems and we should not register the retired items so that they can eventually be used for something useful.

If they have a policy of not reusing the retired codes, then we probably should register them as deprecated (this would solve your problem).

2. If someone is using an unregistered subtag in their language tags (and there is no reason whatsoever for that to happen---there are ample private use mechanisms), then the meaning of that subtag cannot be entirely clear without greater context. You happen to know that 'eml' is meant to be Emiliano-Romagnolo in the page you are testing. But generically in a tool such as yours or Richard's, it could just be a typo... because it was *never* valid as a subtag. And my feeling is that tools should NOT encourage any other source for subtags than the IANA registry or its extensions.

The retired codes appear to be "retired for a reason". Future retirements will result in deprecations in the registry. I found the same list John Cowan did and, if ISO 639's policy is never to reuse these items, then I think we should register the deprecated items in the interest of being complete.
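
As a rough illustration of the three cases discussed above (registered, registered but deprecated, never registered), here is a minimal Python sketch of how a checking tool might classify a subtag using only registry records. The function names (parse_records, classify, preferred_value_ok) and the two-record sample are hypothetical, chosen for this example; the YU record is the one quoted later in this thread, and the sample is not the full registry. The Preferred-Value check only encodes the rule as stated in this thread (at most one field, holding a single value).

```python
from typing import Dict, List

Record = Dict[str, List[str]]

# Illustrative excerpt in the registry's "%%"-separated layout.
# This is an assumption for the example, not the full registry.
SAMPLE_REGISTRY = """\
%%
Type: language
Subtag: it
Description: Italian
Added: 2005-10-16
%%
Type: region
Subtag: YU
Description: Yugoslavia
Added: 2005-10-16
Deprecated: 2003-07-23
Comments: see BA, HR, ME, MK, RS, or SI
"""


def parse_records(text: str) -> List[Record]:
    """Split registry text into records mapping field name -> list of values."""
    records: List[Record] = []
    current: Record = {}
    last_key = None
    for line in text.splitlines():
        if line.strip() == "%%":
            # Record separator: store the record collected so far, if any.
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line[:1].isspace() and last_key:
            # Indented line: continuation of the previous field's value.
            current[last_key][-1] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            current.setdefault(last_key, []).append(value.strip())
    if current:
        records.append(current)
    return records


def classify(subtag: str, records: List[Record],
             record_type: str = "language") -> str:
    """Return 'valid', 'deprecated', or 'unregistered' for a subtag."""
    wanted = subtag.lower()
    for rec in records:
        if rec.get("Type", [""])[0] != record_type:
            continue
        names = rec.get("Subtag", []) + rec.get("Tag", [])
        if wanted in (name.lower() for name in names):
            return "deprecated" if "Deprecated" in rec else "valid"
    return "unregistered"


def preferred_value_ok(rec: Record) -> bool:
    """Accept no Preferred-Value field, or exactly one holding one token."""
    values = rec.get("Preferred-Value", [])
    if not values:
        return True
    return len(values) == 1 and len(values[0].split()) == 1
```

With this sample, a tool would treat "it" as valid and "YU" as deprecated, while "eml" falls through as unregistered, which is exactly the case where a registry-only tool cannot tell a retired code from a typo.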

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: Michael(tm) Smith [mailto:mike at w3.org]
> Sent: Friday, December 11, 2009 12:40 AM
> To: Phillips, Addison
> Cc: Richard Ishida; ietf-languages at alvestrand.no
> Subject: Re: Request: Add retired tag "eml" to the IANA registry
> 
> Hi Addison,
> 
> [Cc'ing Richard since I see this now possibly being more of an 
> application issue that also relates to tools he's built.]
> 
> So I think I'm beginning to realize that there may not be much point 
> in my trying to pursue a resolution for the specific case of eml, and 
> I'm wondering instead if there might be some way to address the general 
> issue. (I'm kind of thinking out loud here, so I hope you'll indulge 
> me a bit.)
> 
> So I now notice the following:
> 
>   Retired ISO 639-3 codes
>   http://www.sil.org/ISO639-3/codes_retired.asp
> 
>   Retired Code Element Mappings
>   http://www.sil.org/ISO639-3/iso-639-3_Retirements_20090126.tab
> 
> There are around 159 codes listed on there. So it's a large list, but 
> not a huge one. I would think it might be worth integrating into the 
> IANA registry at some point but if that's not likely to happen, then 
> it seems like tool developers who want to have their apps report 
> something helpful to users about these codes/tags can have their apps 
> parse and use that data in that .tab file.
> 
> The use case that I think needs to be considered is the case of what 
> apps like markup validators or Richard Ishida's Language Subtag Lookup 
> tool should report for codes/tags that are in that ISO 639-3 list, 
> e.g. "eml". I think it would be preferable to have some level of 
> interoperability among those kinds of tools -- so that Validator.nu 
> reports something similar to what Richard's tool reports for any given 
> case.
> 
> Right now, because both tools are currently relying solely on the IANA 
> subtag registry for information (at least I know that's all 
> Validator.nu is using -- though not completely sure if that's the case 
> for Richard's tool) what we report is that it's a completely unknown 
> subtag. Which doesn't seem like the right thing to be doing. Because 
> it's actually a known subtag, but just retired. So we should report it 
> as such, along with the details about whether there's any replacement.
> 
> Since we can't do that based on the information in the IANA subtag 
> registry alone, it seems like we will both need to refine the tools 
> to also use the information from the ISO 639-3 .tab file -- if we 
> want to report something helpful to users for these cases.
> 
> So lacking any likelihood of those retired tags actually getting into 
> the IANA subtag registry, I'd suggest it'd at least be nice to have -- 
> in whatever IETF or IANA document where it would be most appropriate 
> -- some statement of guidance to developers saying, "Tools that check 
> and report on instances of usage of language subtags SHOULD use both 
> the data in the IANA subtag registry and the available data about 
> ISO 639-3 retired codes."
> 
>   --Mike
> 
> "Phillips, Addison" <addison at amazon.com>, 2009-12-11 00:45 -0500:
> 
> > Hello Michael,
> >
> > There are two problems with your request:
> >
> > 1. You cannot request a grandfathered subtag. That category exists
> > for subtags already extant in the old RFC 3066 registry at the time
> > RFC 4646 came into being. The list is fixed in a number of
> > difficult-to-undo ways. You can request a primary language subtag
> > based on ISO 639-3 RA action (this isn't one) or a
> > longer-than-three-letter code to replace 'eml' (this isn't what
> > you're looking for).
> >
> > My reasoning is that 2009-01-16 predates RFC 5646, which
> > incorporated ISO 639-3 into the registry. 'eml' could not have been
> > registered under RFC 4646. Since the code should not have been used
> > at any time as a valid primary language subtag and since ISO 639-3
> > has a reasonably large number of "retired" codes, it probably would
> > be a bad idea to take this particular one in. There doesn't seem to
> > be a basis in the RFC for doing so.
> >
> > 2. I don't believe that your requested Preferred-Value is valid.
> > Each Preferred-Value must contain exactly one subtag and there can
> > be only one P-V field per record. It is permissible to omit the
> > Preferred-Value. You could include a Comments field explaining
> > preferred values, as deprecation of a record is permitted without
> > having a single preferred mapping. For example, see region subtag
> > YU:
> >
> > %%
> > Type: region
> > Subtag: YU
> > Description: Yugoslavia
> > Added: 2005-10-16
> > Deprecated: 2003-07-23
> > Comments: see BA, HR, ME, MK, RS, or SI
> > %%
> >
> > Addison
> >
> > Addison Phillips
> > Globalization Architect -- Lab126
> >
> > Internationalization is not a feature.
> > It is an architecture.
> >
> > > -----Original Message-----
> > > From: ietf-languages-bounces at alvestrand.no
> > > [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of
> > > Michael(tm) Smith
> > > Sent: Thursday, December 10, 2009 8:09 PM
> > > To: ietf-languages at alvestrand.no
> > > Subject: Request: Add retired tag "eml" to the IANA registry
> > >
> > > This is a request to add the retired tag "eml" to the IANA
> > > language-subtag registry as a grandfathered tag. I realize this
> > > is an odd request; for the rationale, see "6. Any other relevant
> > > information" below.
> > >
> > > 1. Name of requester: Michael(tm) Smith
> > > 2. E-mail address of requester: mike at w3.org
> > > 3. Record Requested:
> > > [[
> > >    Type: grandfathered
> > >    Tag: eml
> > >    Description: Emiliano-Romagnolo
> > >    Added: 2010-XX-XX
> > >    Deprecated: 2009-01-16
> > >    Preferred-Value: egl or rgn
> > > ]]
> > > 4. Intended meaning of the tag: Emiliano-Romagnolo
> > > 5. Reference to published description of the language:
> > >    http://www.sil.org/ISO639-3/documentation.asp?id=eml
> > > 6. Any other relevant information:
> > > [[
> > > My understanding about "eml" is:
> > >
> > >   - it has never been in the registry nor was it in the set of
> > >     tags that were grandfathered into the registry
> > >
> > >   - its status is "retired", a state that doesn't exactly
> > >     correspond to any existing field values in the registry but
> > >     that, based on what I have read[1], means that it remains
> > >     valid but deprecated
> > >
> > >     [1] message from Peter Constable on LTRU list, stating
> > >         '"Retired" means it's no longer recommended --
> > >         basically the same as deprecated.'
> > >         http://www.ietf.org/mail-archive/web/ltru/current/msg08352.html
> > >
> > > The fact that it is not in the registry makes it impossible,
> > > using the registry alone, to distinguish a use of "eml" from
> > > an instance of an invalid tag. If it is in fact still valid but
> > > deprecated, it seems that it should be included in the registry
> > > and marked as such, so that its actual status is clear.
> > >
> > > Problems:
> > >
> > >   - Grandfathered vs. retired. I recognize that the semantics
> > >     of the "grandfathered" type are different from the semantics
> > >     of "retired", but the only other solution would seem to be to
> > >     add "retired" as a new documented value for the type field,
> > >     and it would seem like there would not be enough benefit to
> > >     justify doing that.
> > >
> > >   - Syntax of the Preferred-Value field. I don't know what
> > >     documented constraints there are on the syntax of the
> > >     Preferred-Value field, nor what expectations/assumptions
> > >     any current parsers of the registry have about the value of
> > >     the field. But for this case, if "eml" is added, it would
> > >     seem to require that the field be able to contain multiple
> > >     values. If/when it does, I don't know what the best way to
> > >     separate the values would be -- just space-separated or
> > >     comma-separated, or what -- but it seems like just putting
> > >     "or" between them might be good as far as trying to keep
> > >     backward compatibility with existing tools (which I would
> > >     guess are just reading in the whole value as a string).
> > >
> > >   - "Added" date. Not sure what the Added date would best be
> > >     for this case. Though I can see it being odd to have a record
> > >     with an Added date that is after its Deprecated date, it
> > >     seems like it would best be the date on which this actually
> > >     does get added to the registry, if it does.
> > >
> > > That's it in a nutshell. The rest of the info below is just
> > > about the particular context/use-case underlying my making this
> > > request.
> > >
> > > -----------------------------------------------------------------
> > > More details about the context for this request
> > > -----------------------------------------------------------------
> > > The context for this request is that I contribute to development
> > > of a markup validation tool, Validator.nu, that includes a
> > > feature for checking the conformance of the values of HTML lang
> > > and XML xml:lang attributes. The feature is enabled through a
> > > backend parser that reads and parses the IANA language-subtag
> > > registry.
> > >
> > > We recently got a report from an admin at Wikipedia about some
> > > of the error messages that tool emits. The context for the report
> > > is that the http://wikipedia.org home page includes links to all
> > > Wikipedias available in any language that one has been created
> > > for.
> > >
> > > One of those existing Wikipedias is http://eml.wikipedia.org
> > >
> > > (As far as why Wikipedia has an eml.wikipedia.org site instead
> > > of having separate egl.wikipedia.org and rgn.wikipedia.org sites,
> > > I dunno. But they do, and it would seem that as long as it exists,
> > > it is reasonable for eml to be the tag to uniquely identify it.)
> > >
> > > When I run Validator.nu on the Wikipedia.org home page, I get:
> > >
> > >   http://qa-dev.w3.org:8008/?doc=http%3A%2F%2Fwikipedia.org
> > >
> > > Notice the error "Bad value eml for attribute lang on element a:
> > > Bad ISO language part in language tag".
> > >
> > > What it seems should be reported for this case is a warning:
> > > "Bad value eml for attribute lang on element a: The language
> > > tag eml is deprecated. Use egl or rgn instead."
> > >
> > > But because "eml" is not in the registry, I currently have no
> > > way of having the application report that problem correctly --
> > > except to special-case "eml" in the application code (which I
> > > can do easily enough but would prefer first to try getting it
> > > into the registry so that other developers don't also end up
> > > having to special-case it in their code too).
> > > ]]
> > >
> 
> --
> Michael(tm) Smith
> http://people.w3.org/mike/
_______________________________________________
Ietf-languages mailing list
Ietf-languages at alvestrand.no
http://www.alvestrand.no/mailman/listinfo/ietf-languages
