Registration of el-Latn language tag
mark.davis at icu-project.org
Wed Sep 28 16:41:17 CEST 2005
In CLDR (http://unicode.org/cldr/) we have recently agreed to add
transliterations (both roundtripping and non-roundtripping, aka
transcriptions) to the registry of locale data, so they should be
available in the next release. Of course, the data will initially be
limited, but that is the purpose of the registry, to provide a common
location for collecting such data.
Luc Pardon wrote:
>>"it's Greek, and the script is Latin; for all other properties - guess".
> That sums it up quite nicely, I think.
> In the case of contemporary Greek, it could have been
>transliterated/transcribed (TL/TS'd) according to ISO 843 or ELOT 743 or
>anything else (just Google for "Greeklish" and you'll see what I mean).
>Never mind that international standards tend to rely heavily on the
>English language's sound system. If you're TL/TS-ing for a non-English
>audience, some twists and tweaks may be needed, especially if your
>purpose is educational.
> Many of the TL/TS mappings are not fully reversible, especially not
>for a computer. I can see only one practical way to spell-check an
>el-Latn document and that is a) to agree with the author on a set of
>TL/TS rules (the hardest part ;-) and b) check against a dictionary that
>is obtained by applying those same rules to the words from a "real"
>Greek dictionary. A spell checker manufacturer could provide several
>such el-Latn dictionaries, each one made with a different TL/TS
>"standard". In my case, I would look for "Greek in Latin script,
>transliterated for a Dutch-speaking audience" in the drop-down list,
>rather than "Greek transliterated with ISO843".
> Of course this applies not just to Greek. I have been thinking that
>it's a pity that RFC 3066bis doesn't address this issue explicitly. A
>"transliteration ruleset used" subtag, underneath the script subtag,
>would have solved the problem - in theory. Not that I see how that
>would be practical or possible, given such an open-ended set of TL/TS
>methods. Maybe it could be handled with a registry that requires the
>requester to provide a "public domained" computer algorithm that
>describes the mapping and/or working, open-sourced computer code. Easier
>said than done. And I suppose it's off-topic for this list anyway.
> While in philosophy mode, I'll allow myself to note that I don't think
>this issue is limited to script subtags only. It applies to the entire
>tagging system as a whole. Intended as it is to tag human-to-human
>communication, it does not - and cannot - eliminate 100% of the
>guesswork. There will always be some ambiguity.
> Luc Pardon
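
The spell-checking approach described in the quoted message (apply an
agreed TL/TS ruleset to the words of a "real" Greek dictionary, then
check the el-Latn text against the result) could be sketched roughly as
follows. This is only an illustration under stated assumptions: the two
rulesets "demo-a" and "demo-b" are hypothetical, heavily simplified
letter-by-letter mappings and are NOT ISO 843, ELOT 743, or CLDR data,
and all function names are invented for the example.

```python
import unicodedata

# Hypothetical base mapping (one Latin string per Greek letter).
# Assumption for illustration only; this is not ISO 843 or ELOT 743.
BASE = {
    "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "i", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "f", "χ": "ch", "ψ": "ps",
    "ω": "o",
}

# Two made-up rulesets that differ in a couple of letters, standing in
# for "several such el-Latn dictionaries, each one made with a
# different TL/TS standard".
RULESETS = {
    "demo-a": BASE,
    "demo-b": {**BASE, "χ": "h", "υ": "i"},
}


def strip_accents(word):
    """Drop combining marks (e.g. the tonos) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def romanize(word, ruleset):
    """Map a Greek word to Latin letters using the named ruleset."""
    table = RULESETS[ruleset]
    return "".join(table.get(ch, ch) for ch in strip_accents(word.lower()))


def build_dictionary(greek_words, ruleset):
    """Derive an el-Latn word set from a Greek word list."""
    return {romanize(word, ruleset) for word in greek_words}


def misspelled(text, dictionary):
    """Return the words of an el-Latn text not found in the dictionary."""
    return [word for word in text.lower().split() if word not in dictionary]


if __name__ == "__main__":
    greek_words = ["χαρά", "λόγος", "καλός"]
    for name in RULESETS:
        latin_dict = build_dictionary(greek_words, name)
        print(name, sorted(latin_dict), misspelled("hara logos", latin_dict))
```

Note that the same el-Latn text passes or fails depending on which
ruleset produced the dictionary, which is why author and checker have
to agree on the rules first. The mappings above are also lossy (eta and
iota both become "i"), so they illustrate the non-reversible case.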