Registration of el-Latn language tag
lucp at skopos.be
Wed Sep 28 11:23:16 CEST 2005
> "it's Greek, and the script is Latin; for all other properties - guess".
That sums it up quite nicely, I think.
In the case of contemporary Greek, it could have been
transliterated/transcribed (TL/TS'd) according to ISO 843 or ELOT 743 or
anything else (just Google for "Greeklish" and you'll see what I mean).
Never mind that international standards tend to rely heavily on the
English language's sound system. If you're TL/TS-ing for a non-English
audience, some twists and tweaks may be needed, especially if your
purpose is educational.
Many of the TL/TS mappings are not fully reversible, especially not
for a computer. I can see only one practical way to spell-check an
el-Latn document and that is a) to agree with the author on a set of
TL/TS rules (the hardest part ;-) and b) to check against a dictionary that
is obtained by applying those same rules to the words from a "real"
Greek dictionary. A spell checker manufacturer could provide several
such el-Latn dictionaries, each one made with a different TL/TS
"standard". In my case, I would look for "Greek in Latin script,
transliterated for a Dutch-speaking audience" in the drop-down list,
rather than "Greek transliterated with ISO843".
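
The dictionary approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: TOY_RULES is a made-up, simplified mapping (it is not ISO 843 or ELOT 743), and the names transliterate, build_latin_dictionary and check are hypothetical helpers, not part of any existing spell checker. The sketch also shows why the mapping is not reversible: two different Greek words can collapse into the same Latin spelling.

```python
import unicodedata

# Toy ruleset for illustration only; NOT ISO 843 or ELOT 743.
TOY_RULES = {
    "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "i", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "f", "χ": "ch", "ψ": "ps",
    "ω": "o",
}


def strip_accents(word):
    """Drop combining marks (e.g. the Greek tonos) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterate(word, rules=TOY_RULES):
    """Apply the agreed ruleset letter by letter; unknown characters pass through."""
    base = strip_accents(word.lower())
    return "".join(rules.get(ch, ch) for ch in base)


def build_latin_dictionary(greek_words, rules=TOY_RULES):
    """Map each Latin spelling to the set of Greek words that produce it."""
    latin = {}
    for word in greek_words:
        latin.setdefault(transliterate(word, rules), set()).add(word)
    return latin


def check(text, latin_dictionary):
    """Return the whitespace-separated tokens that are not in the derived dictionary."""
    return [tok for tok in text.lower().split() if tok not in latin_dictionary]


if __name__ == "__main__":
    greek = ["κλίση", "κλήση", "καλημέρα"]
    el_latn = build_latin_dictionary(greek)
    # Both "κλίση" and "κλήση" end up under the same Latin key,
    # so the Greek original cannot be recovered from the Latin form alone.
    print(el_latn)
    print(check("kalimera klisi kalimmera", el_latn))
```

A vendor could ship one such derived dictionary per ruleset; swapping TOY_RULES for another table would give the dictionary for a different audience.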
Of course this applies not just to Greek. I have been thinking that
it's a pity that RFC3066bis doesn't address this issue explicitly. A
"transliteration ruleset used" subtag, underneath the script subtag,
would have solved the problem - in theory. Not that I see how that
would be practical or possible, given such an open-ended set of TL/TS
methods. Maybe it could be handled with a registry that requires the
requester to provide a public-domain computer algorithm that
describes the mapping and/or working, open-source computer code. Easier
said than done. And I suppose it's off-topic for this list anyway.
While in philosophy mode, I'll allow myself to note that I don't think
this issue is limited to script subtags only. It applies to the
tagging system as a whole. Intended as it is to tag human-to-human
communication, it does not - and cannot - eliminate 100% of the
guesswork. There will always be some ambiguity.
More information about the Ietf-languages