Review period; Nepali and Oriya

Mark Davis ☕ mark at macchiato.com
Thu Aug 23 12:06:40 CEST 2012


I think part of the problem is where A is not clearly defined to either
include or exclude C. Because there have been no formal definitions of what
A encompasses in the ISO standards, it was always unclear whether "Arabic"
meant "Modern Standard Arabic" or was more freeform (for example).

For stability, it would be better to interpret each code when defined as
the predominant form (e.g., Arabic = MSA), and then add additional language
codes for mutually incomprehensible forms whenever they can be clearly
identified. So I see it as:

There will be situations in which an existing entry, A, is determined to
have been used not only for A, but also for a distinct language C. The
strategy that can be taken is:

iv) define a new code C, and clarify that A does not encompass C.

That's the 'German' approach:

   - Add 'gsw'
   - Leave 'de' alone: don't define 'de' as a macrolanguage with
   encompassed languages 'dex' (Standard High German) and 'gsw' (Swiss German).

The mere fact that some Swiss German data in 2000 was tagged as 'de'
shouldn't be taken as evidence that 'de' encompasses it; it was probably
just the 'best available choice'.
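
As a rough illustration of the difference between the two approaches (a
sketch only: the tables below are made up for the example and are not the
real ISO 639-3 or registry data, 'dex' is the invented code from the list
above, and is_covered_by is a hypothetical helper):

```python
# Hypothetical, simplified registries; not the real ISO 639-3 or IANA data.
# Each code maps to the macrolanguage that encompasses it (None if standalone).

# Approach (iv): add 'gsw', leave 'de' alone.
SEPARATE_CODES = {"de": None, "gsw": None}

# Alternative: 'de' becomes a macrolanguage over 'dex' and 'gsw'.
# 'dex' is a made-up code used only for discussion, not an assigned code.
MACROLANGUAGE = {"de": None, "dex": "de", "gsw": "de"}


def is_covered_by(code, requested, registry):
    """Return True if `code` is `requested` or is encompassed by it."""
    while code is not None:
        if code == requested:
            return True
        # Walk up to the encompassing macrolanguage, if any.
        code = registry.get(code)
    return False


# Under (iv), a request for 'de' does not pull in Swiss German content.
print(is_covered_by("gsw", "de", SEPARATE_CODES))  # False
# Under the macrolanguage alternative, it does.
print(is_covered_by("gsw", "de", MACROLANGUAGE))   # True
```

With separate codes, nothing about 'de' changes for existing software; with
the macrolanguage table, every consumer of 'de' has to decide what to do
with the newly encompassed codes.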

Mark <https://plus.google.com/114199149796022210033>
*— Il meglio è l’inimico del bene —*



On Mon, Aug 20, 2012 at 10:46 PM, Peter Constable <petercon at microsoft.com> wrote:

> The 639-3 RA faces a bit of a conundrum in certain cases. (It's not clear
> to me if this really applies to the Oriya and Nepali cases or not; I'm
> talking in general terms.) There will be situations in which an existing
> entry, A, is determined to encompass multiple distinct languages, B and C.
> (For sake of discussion, I'll assume that B is the more well-known /
> developed of the two languages.) Now in principle, there are three
> strategies that can be taken:
>
> i) deprecate the identifier for A and create new identifiers for B and C
> ii) create new identifiers for B and C, change the scope of A to
> macrolanguage, and have A encompass B and C
> iii) redefine A to denote B but not C, and create a new identifier for C
>
> There's a general problem with option (iii) that existing records tagged
> with A may actually be in C, and so those records have suddenly become
> incorrectly tagged. That problem doesn't occur for either (i) or (ii). To
> avoid this problem, the text of 639-3 stipulates that, in maintaining the
> code table, strategy (iii) could not be employed.
>
> Now, it's become clear since the approval of 639-3 that, while the above
> is strictly speaking true, that hypothetical impact on existing records is
> not the only type of negative business impact that may arise from the need
> to accommodate the B/C distinction, and that strategies (i) and (ii) can
> also have negative business impacts, and potentially more so than strategy
> (iii). But strictly speaking, the text of 639-3 doesn't permit that. (That
> hasn't stopped the RA and JAC from doing something in the vein of (iii) in
> some cases, but with a hand-wave of assuming that A was really never meant
> to denote C.)
>
> Even if the text of 639-3 did permit strategy (iii), it won't always be
> clear how the strategies compare in terms of their real-world business
> impacts.
>
>
>
> Peter
>
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:
> ietf-languages-bounces at alvestrand.no] On Behalf Of Doug Ewell
> Sent: Monday, August 20, 2012 11:25 AM
> To: ietf-languages at iana.org
> Subject: Re: Review period; Nepali and Oriya
>
> Mark Davis 🍶 <mark at macchiato dot com> wrote:
>
> > Now, that being said, if this group wants to have Nepali and Oriya be
> > macro languages, it is not really a problem for CLDR; simply more
> > entries in the tables. It will cause migration hassles for other
> > implementations that use BCP47, but that is not an issue with CLDR.
> > The more common the language, the worse the hassles. For example,
> > consider what would happen were ISO to decide that 'en' really was a
> > macrolanguage with 'ens' being Standard English, and 'enz' being New
> > Zealand English—how much software would hiccough when it hit
> > 'enz-GB'...
>
> I don't think this is an argument for or against creating extlangs. It's
> more an argument that ISO 639-3/RA should stop converting individual
> language code elements into macrolanguages. This group didn't decide to
> have Nepali and Oriya be macrolanguages, of course; that was the RA's
> decision.
>
> If the RA did what you posit with English, and ietf-languages followed
> this by creating extlangs, then the theoretical "New Zealand English as
> used in the United Kingdom" could be tagged, using extlang form, as
> "en-enz-GB". Existing software might have an easier time with this than
> with "enz-GB". Of course, in order to assign extlangs under English, this
> group would have to buy the notion that there is a "specific dominant
> variety" of English (§4.1.2), and that seems improbable.
>
> --
> Doug Ewell | Thornton, Colorado, USA
> http://www.ewellic.org | @DougEwell
>
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>