Criteria for languages?

CE Whitehead cewcathar at
Fri Dec 4 01:03:47 CET 2009


I agree that, apart from Standard German, most Germanic varieties have little written
literature; my understanding is that some of these varieties may be used in email.
I found no Walliserdeutsch content online, though perhaps I did not search far enough.


Peter Constable petercon at 
Wed Dec 2 00:43:06 CET 2009

> Consider it in terms of putative character disunifications. If someone asked to have a character “Blort” encoded and we had no reason to suspect a connection to any already encoded character, then we’d probably treat that differently than if we knew that lots of content was already representing this as an already-encoded character “Flort”. When the decision is taken, we use the information available; if we had no knowledge of a connection with an already encoded character, shipped a new version of Unicode including “Blort” and then someone came along with additional info about the connection with “Flort”, that wouldn’t substantively change anything.

> So, I guess your question, as it pertains to Walliserdeutsch in relation to “de” and “gsw”, is whether it was reasonable to expect anyone had used “de” or “gsw” to tag Walliserdeutsch content.

> This rationale comes to mind: by a vast, vast margin, content tagged “de” is in Standard “High” German, and with a few exceptions most other Germanic varieties are not particularly developed in terms of literature. So, it isn’t unreasonable to assume that there is no significant use of “de” for those varieties _unless given evidence otherwise_.

This seems to be so, as far as I can tell.

> A further principle supporting this rationale is that there can be definite _dis_advantages to using macrolanguage entities: they are useful so long as a certain distinction is not particularly interesting, but once that distinction becomes interesting then the macrolanguage becomes a burden. Consider, for instance, the inconvenience of having “lav” for the Latvian macrolanguage and also “lvs” for Standard Latvian / “ltg” for Latgalian. So, from this perspective, it seems to me like macrolanguage entities are things we would rather avoid whenever possible, implying that we create them only when effectively forced to. In the case of “lav”, established and documented usage in MARC may force us to change “lav” into a macrolanguage; but I don’t know of anything compelling us to do so for “de” or “gsw” as a result of coding Wallisertitsch.

I have no final opinion on this matter.  Best,
--C. E. Whitehead
cewcathar at
> Peter

More information about the Ietf-languages mailing list