<font face="times new roman,serif">I think part of the problem is that A is not clearly defined to either include or exclude C. Because the ISO standards have never formally defined what A encompasses, it was always unclear whether, for example, "Arabic" meant "Modern Standard Arabic" or something more freeform. </font><div>

<font face="times new roman, serif"><br></font></div><div><font face="times new roman, serif">For stability, it would be better to interpret each code, when defined, as the predominant form (e.g., Arabic = MSA), and then add additional language codes for mutually incomprehensible forms whenever they can be clearly identified. So I see it as:</font></div>

<div><font face="times new roman, serif"><br></font></div><div><span style="font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">There will be situations in which an existing entry, A, is determined to have been used not only for A, but also for a distinct language C. The strategy that can be taken is:</span></div>

<div><span style="font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br></span></div><div><span style="font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">iv) define a new code C, and clarify that A does not encompass C.</span></div>

<div><span style="font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br></span></div><div><span style="font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">That's the 'German' approach: </span></div>

<div><ul><li><span style="background-color:rgb(255,255,255);font-family:arial,sans-serif;font-size:13px">Add 'gsw'</span></li><li><span style="background-color:rgb(255,255,255);font-family:arial,sans-serif;font-size:13px">Leave 'de' alone: don't define 'de' as a macrolanguage with encompassed languages 'dex' (Standard High German) and 'gsw' (Swiss German).</span></li>

</ul><div><font face="arial, sans-serif">The mere fact that some Swiss German data in 2000 was tagged as 'de' shouldn't be taken as evidence that 'de' encompasses it; 'de' was probably just the 'best available choice' at the time.</font></div>

</div><div><span style="font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br></span></div><div><div><div><font face="'times new roman', serif"><div style="background-color:transparent;margin-top:0px;margin-left:0px;margin-bottom:0px;margin-right:0px">

<a href="https://plus.google.com/114199149796022210033" target="_blank">Mark</a></div><div style="background-color:transparent;margin-top:0px;margin-left:0px;margin-bottom:0px;margin-right:0px"><i><br></i></div><div style="background-color:transparent;margin-top:0px;margin-left:0px;margin-bottom:0px;margin-right:0px">

<i>— Il meglio è l’inimico del bene —</i></div></font><div><div><font face="'times new roman', serif"><i><span style="font-style:normal"><i></i></span><i></i></i></font></div></div><br>

<br><br><div class="gmail_quote">On Mon, Aug 20, 2012 at 10:46 PM, Peter Constable <span dir="ltr">&lt;<a href="mailto:petercon@microsoft.com" target="_blank">petercon@microsoft.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The 639-3 RA faces a bit of a conundrum in certain cases. (It's not clear to me if this really applies to the Oriya and Nepali cases or not; I'm talking in general terms.) There will be situations in which an existing entry, A, is determined to encompass multiple distinct languages, B and C. (For the sake of discussion, I'll assume that B is the more well-known / developed of the two languages.) Now in principle, there are three strategies that can be taken:<br>


<br>

i) deprecate the identifier for A and create new identifiers for B and C<br>

ii) create new identifiers for B and C, change the scope of A to macrolanguage, and have A encompass B and C<br>

iii) redefine A to denote B but not C, and create a new identifier for C<br>

<br>

The general problem with option (iii) is that existing records tagged with A may actually be in C, so those records have suddenly become incorrectly tagged. That problem doesn't occur for either (i) or (ii). To avoid this problem, the text of 639-3 stipulates that, in maintaining the code table, strategy (iii) could not be employed.<br>


<br>

Now, it's become clear since the approval of 639-3 that, while the above is strictly speaking true, the hypothetical impact on existing records is not the only type of negative business impact that may arise from the need to accommodate the B/C distinction: strategies (i) and (ii) can also have negative business impacts, potentially more so than strategy (iii). But strictly speaking, the text of 639-3 doesn't permit strategy (iii). (That hasn't stopped the RA and JAC from doing something in the vein of (iii) in some cases, but with a hand-wave of assuming that A was really never meant to denote C.)<br>


<br>

Even if the text of 639-3 did permit strategy (iii), it won't always be clear how the strategies compare in terms of their real-world business impacts.<br>

<span class="HOEnZb"><font color="#888888"><br>

<br>

<br>

Peter<br>

</font></span><div class="im HOEnZb"><br>

-----Original Message-----<br>

From: <a href="mailto:ietf-languages-bounces@alvestrand.no">ietf-languages-bounces@alvestrand.no</a> [mailto:<a href="mailto:ietf-languages-bounces@alvestrand.no">ietf-languages-bounces@alvestrand.no</a>] On Behalf Of Doug Ewell<br>


Sent: Monday, August 20, 2012 11:25 AM<br>

To: <a href="mailto:ietf-languages@iana.org">ietf-languages@iana.org</a><br>

Subject: Re: Review period; Nepali and Oriya<br>

<br>

</div><div class="HOEnZb"><div class="h5">Mark Davis 🍶 &lt;mark at macchiato dot com&gt; wrote:<br>

<br>

> Now, that being said, if this group wants to have Nepali and Oriya be<br>

> macro languages, it is not really a problem for CLDR; simply more<br>

> entries in the tables. It will cause migration hassles for other<br>

> implementations that use BCP47, but that is not an issue with CLDR.<br>

> The more common the language, the worse the hassles. For example,<br>

> consider what would happen were ISO to decide that 'en' really was a<br>

> macrolanguage with 'ens' being Standard English, and 'enz' being New<br>

> Zealand English—how much software would hiccough when it hit<br>

> 'enz-GB'...<br>

<br>

I don't think this is an argument for or against creating extlangs. It's more an argument that ISO 639-3/RA should stop converting individual language code elements into macrolanguages. This group didn't decide to have Nepali and Oriya be macrolanguages, of course; that was the RA's decision.<br>


<br>

If the RA did what you posit with English, and ietf-languages followed this by creating extlangs, then the theoretical "New Zealand English as used in the United Kingdom" could be tagged, using extlang form, as "en-enz-GB". Existing software might have an easier time with this than with "enz-GB". Of course, in order to assign extlangs under English, this group would have to buy the notion that there is a "specific dominant variety" of English (§4.1.2), and that seems improbable.<br>


<br>

--<br>

Doug Ewell | Thornton, Colorado, USA<br>

<a href="http://www.ewellic.org" target="_blank">http://www.ewellic.org</a> | @DougEwell <br>

<br>

_______________________________________________<br>

Ietf-languages mailing list<br>

<a href="mailto:Ietf-languages@alvestrand.no">Ietf-languages@alvestrand.no</a><br>

<a href="http://www.alvestrand.no/mailman/listinfo/ietf-languages" target="_blank">http://www.alvestrand.no/mailman/listinfo/ietf-languages</a><br>


</div></div></blockquote></div><br></div></div></div>