English as spoken in Northern Ireland (long)

Doug Ewell dewell at adelphia.net
Sun Jun 1 14:12:40 CEST 2003

Only 24 hours ago, but it feels like much longer, Marion Gunn <mgunn at
egt dot ie> wrote:

> What, then, is the code for the English of 'Northern Ireland'?
> (GB+NI=UK.)

Let's see if we can bring a little order to a discussion that has become
very chaotic and personal.

First, let us agree that this thread shall not be cross-posted to
Unicode or any other inappropriate list or individual.  Doing so does
not help solve any problems or increase understanding.

Language tags, as described in RFC 3066, are meant to identify languages
spoken, written, and otherwise used by humans.  In the simple case,
there are separate codes for English, French, Spanish, Chinese, etc.

However, when you deal with human languages, you often find you must
venture beyond the simple case.  For example, there are different
dialects of each of the four languages mentioned above.  These dialects
can loosely be interpreted as corresponding to geographical areas, which
in turn are often (but not always) most easily described as country
names.  Sometimes the distinction between languages and dialects is
itself unclear.

For example, one can talk about "English as spoken in the United States"
vs. "English as spoken in the United Kingdom," or "French as spoken in
France" vs. "French as spoken in Canada."  These distinctions might
relate to spoken accent, vocabulary, choice of formal vs. informal
pronouns, or choice of prepositions ("different from" vs. "different to").

There are other aspects that make language identification complicated.
Most languages are not only spoken, but also written.  Sometimes it is
necessary or desirable to identify language variants based on written
aspects, such as spelling ("color" vs. "colour") or writing system
(Latin vs. Cyrillic) or writing sub-system (Han simplified vs. Han
traditional) or even the use of punctuation (whether full stops go
inside or outside quotation marks).

Importantly, it is *by no means* obvious, to all people at all times,
which language distinctions need to be captured in tags and which do
not.  Everyone would agree that if different languages are to be tagged,
English and Chinese should be tagged differently.  Not everyone would
agree, though, that the differences between U.S. English and Canadian
English need to be tagged.  Fewer still would agree to tagging the
differences between California English and New York City English, though
such differences exist and have been highlighted in books, movies,
popular music, and jokes.

In the early 1980s there was a song by Frank Zappa called "Valley Girl,"
which featured running commentary by Zappa's teenaged daughter in a
self-conscious dialect of English that featured exaggerated tone ranges,
specialized vocabulary, and frequent interjections of "like...
y'know...fer surrre" and such.  The dialect, sometimes called
"Valspeak," was supposed to be identified with the San Fernando Valley
(north of Los Angeles) but was instantly recognized by many Southern
Californians who live outside "the Valley" as well.  (When I heard the
song, it reminded me instantly of a girl I knew who lived in Orange,
about 60 miles (100 km) southeast of the Valley geographically and much
farther than that culturally.)

There are other examples of "Southern California English" that have
drawn popular attention; see, for example, Sean Penn's performance as
Jeff Spicoli in the movie "Fast Times at Ridgemont High," from about

Of course, not all Southern Californians talk like this, but there is
certainly a speech pattern or accent or dialect or WHATEVER that can be,
and has been, associated with (parts of) Southern California.

How would you tag this?  "English as spoken in the United States, and
specifically California" could be coded as en-US-CA using ISO 3166-2
subdivision tags.  But clearly that's not focused tightly enough; people
in San Francisco might well take offense.  So, y'know, should a tag
totally be registered to capture this, like, unique dialect or what?
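
As a rough illustration of how a tag like en-US-CA breaks down, here is a
minimal Python sketch.  It checks only the RFC 3066 syntax (a primary
subtag of 1 to 8 letters, followed by hyphen-separated subtags of 1 to 8
letters or digits) and treats a two-letter second subtag as an ISO 3166
country code.  The function names split_tag and describe are made up for
this example, and nothing here checks whether a code is actually assigned
or a tag actually registered.

```python
import re

# RFC 3066 syntax only: a primary subtag of 1-8 letters, then zero or
# more hyphen-separated subtags of 1-8 letters or digits.
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def split_tag(tag):
    """Return the list of subtags, or raise ValueError if malformed."""
    if not _TAG_RE.fullmatch(tag):
        raise ValueError(f"not a well-formed RFC 3066 tag: {tag!r}")
    return tag.split("-")


def describe(tag):
    """Split a tag into language, optional country, and remaining subtags.

    Hypothetical helper: it does not consult ISO 639, ISO 3166, or the
    IANA registry, so it cannot tell whether a tag is meaningful.
    """
    subtags = split_tag(tag)
    info = {"language": subtags[0].lower(), "country": None, "rest": []}
    if len(subtags) > 1:
        second = subtags[1]
        if len(second) == 2 and second.isalpha():
            # A two-letter second subtag is read as a country code.
            info["country"] = second.upper()
            info["rest"] = subtags[2:]
        else:
            info["rest"] = subtags[1:]
    return info


if __name__ == "__main__":
    for example in ("en", "en-GB", "en-IE", "en-US-CA"):
        print(example, describe(example))
```

With this, describe("en-US-CA") separates "en", "US", and the trailing
"CA", which the syntax allows but does not interpret.  Whether that
trailing subtag identifies any real language distinction is exactly the
question the code cannot answer.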

Now let's look at "English as spoken in Northern Ireland."

The first thing we must do, if we want to talk about language tagging
and not politics, is to recognize that the use of ISO 3166-1 country
codes to denote "regions" where languages are spoken is a convenience,
not a political statement.  People do not speak, read, and write
fundamentally different versions of English in Detroit, Michigan, and
across the border in Windsor, Ontario.  Any distinctions drawn along
national or sub-national boundaries are simply made for convenience.

Marion asked (roughly) how one should indicate "English as spoken in
Northern Ireland."

First, this assumes that there is a language distinction worth capturing
between (1) English as spoken in Northern Ireland and (2) English as
spoken in Ireland, the independent country (Eire), or for that matter
between (1) above and (3) English as spoken on the island of Great
Britain.  This is all about language, not national boundaries or
political self-determination.

If there is a LANGUAGE distinction worth making, then it is worth
considering a new tag to denote this special variation of English.  If
there is not, then en-GB or en-IE should be used.  That's pretty much
all there is to it.

Michael's suggestion, to incorporate the term "Ulster" in such a new
tag, was almost certainly meant to imply that if such a language
distinction does exist, it should be identified not with the political
entity "Northern Ireland" (note capital N) but with the geographical
area "northern Ireland" (note small n).

Note that the ISO 3166-1 country code "GB" refers to the United Kingdom
of Great Britain and Northern Ireland.  It is not intended to favor the
British-island portion of that political entity over the Irish-island
portion.  The choice of "GB" rather than "UK" as a country code was a
decision made by the ISO 3166 Maintenance Agency, and any disagreement
with that choice should be communicated to the ISO 3166/MA.

I apologize for the length of this post.  Hopefully it will add some
light to a debate that has so far been mostly heat.

-Doug Ewell
 Fullerton, California

More information about the Ietf-languages mailing list