[Fwd: Response to Mark's message]

Martin Duerst duerst at w3.org
Thu Apr 10 16:35:47 CEST 2003


At 12:26 03/04/10 +0100, Jon Hanna wrote:

>Currently the only method for deducing scripts is either heuristically (look
>at the characters used and then deduce that the script used is whatever
>script uses those characters)

This is not heuristic. This is highly deterministic. Things can
go wrong only for cases where there is no actual text (e.g. only
numbers or only punctuation). There is absolutely nothing wrong
with inferring the script this way.
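
As a rough illustration of deducing the script from the characters
themselves, here is a minimal Python sketch. It rests on an assumption:
Python's standard unicodedata module does not expose the Unicode Script
property, so the first word of each letter's character name (LATIN,
CYRILLIC, GREEK, ...) is used as a stand-in. The function name
guess_script is hypothetical, and a real implementation would more
likely read the Script property from the Unicode Character Database.

```python
import unicodedata
from collections import Counter


def guess_script(text):
    """Return the most common script-like name prefix among the letters
    in text, or None when the text contains no letters at all."""
    counts = Counter()
    for ch in text:
        # Only letters carry script information here; digits, spaces and
        # punctuation are shared across scripts and are skipped.
        if not unicodedata.category(ch).startswith("L"):
            continue
        name = unicodedata.name(ch, "")
        if name:
            # Assumption: the first word of the character name
            # approximates the script (e.g. "LATIN SMALL LETTER A").
            counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(guess_script("Hello, world"))  # LATIN
print(guess_script("Привет"))        # CYRILLIC
print(guess_script("12345 !?"))      # None
```

The last call shows the one failure case mentioned above: with only
numbers or punctuation there is nothing to deduce from, so the sketch
returns None rather than guessing.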

>or guessing from the language as in the second
>point above. While we all agree that this is not ideal, we have to recognise
>that software doing so will continue to exist for some time after a better
>solution is available.

I highly doubt that active markup is a better solution for
indicating the script than deduction from the actual
characters. Looking at the current state of language markup
on the Web, in most cases, such markup is missing, and in
some cases, it is wrong. Automatic deduction from the
actual characters is the most accurate thing to do.
There is no 'better solution'.

[That doesn't mean that script variants in language tags can't
be handy for negotiation, and for libraries, where it's more
difficult to inspect the actual characters.]


Regards,    Martin.

