registration requests re Portuguese

Doug Ewell doug at ewellic.org
Mon Apr 13 21:56:36 CEST 2015


Yury <yury dot tarasievich at gmail dot com> wrote:

>> In other words, in order to tag content, one must know the proper way
>> to tag the content.
>
> Not quite. In order to tag, one has to have a reference point. Once
> there is a reference point (book etc.), the country becomes redundant.
> Or does it?

I'm not sure what you mean, but I suppose it depends on what sort of
"reference point" you are talking about.

Here are two English sentences:

1. He says Otis "has modernized its elevators."
2. He says Otis "have modernised their lifts".

The differences between the conventions used here are (a) the use of
singular vs. plural verbs and pronouns to refer to a company with a
singular name, (b) the choice of "-ized" vs. "-ised" spelling, (c) the
vocabulary choice of "elevator" vs. "lift," and (d) the placement of the
period (full stop) inside vs. outside the quotation marks for a partial
quote.

I guess the "reference point" here is the knowledge one has about
different English conventions, regardless of whether one consults a
dictionary or style guide or government decree, or just knows. (Some
might argue the fine points of these examples; The Economist says "Otis"
is singular but "Manchester United" is plural.)

But in any event, by far the most common way to characterize these
differences is to say that the first sentence is "American English" and
the second is "British English." And the way to tag that is to use
"en-US" and "en-GB" respectively.

A similar exercise could be performed for other languages, though of
course not all.

>> For some languages, there are noticeable differences in usage
>> (spelling, pronunciation, lexicon, grammar, etc.) that are best
>> described as being characteristic of one region or another. Any
>> English speaker will agree there are differences between "American
>> English" and "British English,"
>
> So you have fuzzily defined categories, like those you mention,
> generalising the localised practice, for which and for which only
> lang_REGION are of use, and you have precisely defined categories,
> which do not need REGION, and which are trans-border, indeed.

I don't think "fuzzy" vs. "precise" has much to do with this. Some
distinctions are typical of regional usage, and some are not. Indeed,
that's exactly the argument that applies to the Portuguese variants:

There are non-orthographic distinctions between "Portuguese of Portugal"
and "Portuguese of Brazil" which are best tagged as "pt-PT" and "pt-BR".
Then there is the question of standard orthographies, best tagged (if
these variants are approved) as 'abl1943' vs. 'ao1990' vs. 'colb1945'.
Any combination of region subtag plus variant subtag, or just one, or
neither, might make sense depending on the context.
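
As a rough illustration of those combinations, here is a minimal Python
sketch. The helper name make_tag is hypothetical, and the variant
subtags are simply the three proposed in this thread. It only joins
subtags in language-region-variant order and does not validate anything
against the registry.

```python
from itertools import product


def make_tag(language, region=None, variant=None):
    """Join the subtags that are present, in language-region-variant order."""
    return "-".join(part for part in (language, region, variant) if part)


# None means "leave this subtag out".
REGIONS = (None, "PT", "BR")
# Variant subtags as proposed in this thread (assumed to be approved).
VARIANTS = (None, "abl1943", "ao1990", "colb1945")

# Every combination: region plus variant, just one of them, or neither.
tags = [make_tag("pt", region, variant)
        for region, variant in product(REGIONS, VARIANTS)]

for tag in tags:
    print(tag)
```

This enumerates twelve tags, from plain "pt" through "pt-BR-colb1945".
Which of them is useful for a given piece of content still depends on
the context, as noted above.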

>> None of this has anything to do with currency signs, thousands
>> separators, date formats, and the like.
>
> Yes. That's why I drew the line between culture and strict language
> relatedness.

But the premise that region identifiers are not appropriate for language
identification still doesn't hold. They are entirely appropriate for
identifying some types of language varieties, and not at all appropriate
for others.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸


